# Caching strategies
> ref: https://codeahoy.com/2017/08/11/caching-strategies-and-how-to-choose-the-right-one/

Caching is one of the easiest ways to increase system performance.

### When?
Which strategy fits depends on how the data is accessed:
- The system is write-heavy and reads are infrequent (e.g. time-based logs).
- Data is written once and read many times (e.g. a user profile).
- The data returned is always unique (e.g. search queries).


### Cache-Aside
The cache sits on the side and the application directly talks to both the cache and the database.

1. The application first checks the cache.
2. If the data is found in the cache, we have a cache hit. The data is read and returned to the client.
3. If the data is not found in the cache, we have a cache miss and the application has to do some extra work. It queries the database, returns the data to the client and stores it in the cache so that subsequent reads for the same data result in a cache hit.
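
The three steps above can be sketched as follows. This is a minimal illustration, not a production implementation: two plain dictionaries stand in for the cache and the database, and the names `query_database`, `get` and `db_reads` are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical stand-ins: one dict plays the database, another the cache.
database: Dict[str, str] = {"user:1": "Alice"}
cache: Dict[str, str] = {}
db_reads = 0  # counts how often the database is actually queried


def query_database(key: str) -> Optional[str]:
    global db_reads
    db_reads += 1
    return database.get(key)


def get(key: str) -> Optional[str]:
    # Step 1: the application checks the cache first.
    if key in cache:
        return cache[key]  # Step 2: cache hit
    # Step 3: cache miss, so the application reads from the database...
    value = query_database(key)
    if value is not None:
        cache[key] = value  # ...and stores the result for later reads
    return value


first = get("user:1")   # miss: goes to the database
second = get("user:1")  # hit: served from the cache
```

After both calls `db_reads` is 1, because only the first read reached the database.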

![Image](../images/cache-aside.png)

Tech: Memcached and Redis

*Pros:* 
- Works best for read-heavy workloads.
- Resilient to cache failures: if the cache goes down, the application can still read from the database directly.
- The data model in the cache can be different from the data model in the database.
  
*Cons:*
- The most common write strategy is to write data to the database directly. When this happens, the cache may become inconsistent with the database.
  - To deal with this, set a time to live (TTL) on cached entries and accept serving stale data until the TTL expires.
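
The TTL approach can be sketched as below. This is an assumption-laden illustration: `TTLCache` is a hypothetical class, and the injectable `clock` parameter exists only so the expiry can be demonstrated without sleeping. Stores such as Redis and Memcached provide expiry natively.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Hypothetical cache whose entries expire ttl_seconds after being stored."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock
        self._store: Dict[str, Tuple[str, float]] = {}  # key -> (value, expiry)

    def set(self, key: str, value: str) -> None:
        self._store[key] = (value, self.clock() + self.ttl)

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._store[key]  # expired: treat it as a cache miss
            return None
        return value  # may be stale relative to the database until it expires


# A fake clock makes the expiry easy to demonstrate.
now = [0.0]
ttl_cache = TTLCache(ttl_seconds=30, clock=lambda: now[0])
ttl_cache.set("user:1", "Alice")
fresh = ttl_cache.get("user:1")    # still within the TTL
now[0] = 31.0
expired = ttl_cache.get("user:1")  # TTL elapsed, so the entry is gone
```

A shorter TTL bounds how long stale data can be served, at the cost of more cache misses.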

### Read-Through
A read-through cache sits in-line with the database. When there is a cache miss, the cache itself loads the missing data from the database, populates itself and returns the data to the application.
- Both cache-aside and read-through strategies load data lazily, that is, only when it is first read.
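
The difference from cache-aside is who does the loading: here the cache does it, not the application. A minimal sketch, assuming a hypothetical `ReadThroughCache` class that is given a `loader` function for reaching the database:

```python
from typing import Callable, Dict, List, Optional


class ReadThroughCache:
    """Hypothetical read-through cache: the cache, not the caller, loads misses."""

    def __init__(self, loader: Callable[[str], Optional[str]]) -> None:
        self._loader = loader  # how the cache reaches the database
        self._store: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        if key not in self._store:
            value = self._loader(key)  # miss: load lazily from the database
            if value is None:
                return None
            self._store[key] = value
        return self._store[key]


database: Dict[str, str] = {"user:1": "Alice"}  # stand-in for a real database
loads: List[str] = []  # records which keys reached the database


def load_from_db(key: str) -> Optional[str]:
    loads.append(key)
    return database.get(key)


rt_cache = ReadThroughCache(load_from_db)
a = rt_cache.get("user:1")  # miss: the cache calls the loader
b = rt_cache.get("user:1")  # hit: the loader is not called again
```

The application only ever calls `rt_cache.get`; it never talks to the database for reads.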

Tech: DynamoDB Accelerator (DAX)

![Image](../images/read-through.png)

*Pros:*
- Works best for read-heavy workloads when the same data is requested many times.

*Cons:*
- Unlike cache-aside, the data model in a read-through cache cannot be different from that of the database.
- When data is requested for the first time, it always results in a cache miss and incurs the extra penalty of loading the data into the cache.
  - Deal with this by ‘warming’ or ‘pre-heating’ the cache, i.e. loading the data before it is requested.
- It is also possible for data to become inconsistent between the cache and the database.
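
Warming can be as simple as loading a known list of hot keys before traffic arrives. The sketch below assumes dictionaries as stand-ins for the cache and database. The `warm_cache` helper and the choice of keys are hypothetical.

```python
from typing import Dict, Iterable

# Stand-ins for a real database and cache.
database: Dict[str, str] = {"user:1": "Alice", "user:2": "Bob", "user:3": "Carol"}
cache: Dict[str, str] = {}


def warm_cache(hot_keys: Iterable[str]) -> int:
    """Pre-load the given keys into the cache; return how many were loaded."""
    loaded = 0
    for key in hot_keys:
        value = database.get(key)
        if value is not None:  # skip keys the database does not have
            cache[key] = value
            loaded += 1
    return loaded


# Run once at startup or deploy time, before serving requests.
warmed = warm_cache(["user:1", "user:2", "user:404"])
```

Here two of the three keys exist, so `warmed` is 2 and the first real requests for those keys are hits.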

### Write-Through
Data is first written to the cache and then to the database. The cache sits in-line with the database and writes always go through the cache to the main database.
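
A minimal sketch of that write path, assuming a hypothetical `WriteThroughCache` class and a dictionary standing in for the database. It follows the order described above (cache first, then database) and leaves out failure handling, which a real implementation would need.

```python
from typing import Dict, Optional


class WriteThroughCache:
    """Hypothetical write-through cache in front of a dict-backed 'database'."""

    def __init__(self, database: Dict[str, str]) -> None:
        self._db = database
        self._store: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._store[key] = value  # first write to the cache...
        self._db[key] = value     # ...then, synchronously, to the database

    def get(self, key: str) -> Optional[str]:
        if key not in self._store and key in self._db:
            self._store[key] = self._db[key]  # read-through on a miss
        return self._store.get(key)


database: Dict[str, str] = {}
wt_cache = WriteThroughCache(database)
wt_cache.put("user:1", "Alice")
cached_value = wt_cache.get("user:1")
```

Because `put` returns only after both writes, the cache and the database hold the same value, which is also where the extra write latency comes from.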

Tech: DynamoDB Accelerator (DAX)

![Image](../images/write-through.png)

*Pros:*
- Can be paired with a read-through cache, which gives all the benefits of read-through plus a data consistency guarantee.

*Cons:* 
- Extra write latency, because every write goes to the cache and then to the database.

### Write-Around
Data is written directly to the database and only the data that is read makes its way into the cache.
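
A minimal sketch of write-around paired with cache-aside reads, using dictionaries as stand-ins. The `write` and `read` functions are hypothetical, and dropping the cached copy on write is an added assumption to avoid serving a stale value. The description above does not require it.

```python
from typing import Dict, Optional

# Stand-ins for a real database and cache.
database: Dict[str, str] = {}
cache: Dict[str, str] = {}


def write(key: str, value: str) -> None:
    database[key] = value  # writes go straight to the database
    cache.pop(key, None)   # assumption: drop any stale cached copy


def read(key: str) -> Optional[str]:
    if key in cache:
        return cache[key]
    value = database.get(key)
    if value is not None:
        cache[key] = value  # only data that is read gets cached
    return value


write("log:1", "started")
cached_after_write = "log:1" in cache  # the write did not touch the cache
value = read("log:1")
cached_after_read = "log:1" in cache   # the read populated it
```

Data that is written but never read never occupies cache memory.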

*Pros:*
- Write-around can be combined with read-through and provides good performance in situations where data is written once and read less frequently or never, for example real-time logs or chatroom messages.
- This pattern can be combined with cache-aside as well.
 

### Write-Back
The application writes data to the cache, which acknowledges immediately and writes the data back to the database after some delay.
This strategy is sometimes called write-behind.
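
A minimal sketch of the idea, assuming a hypothetical `WriteBackCache` class with an explicit `flush` method. A real system would usually flush from a timer or background worker. The dictionary again stands in for the database.

```python
from typing import Dict, Optional, Set


class WriteBackCache:
    """Hypothetical write-back cache: writes are acknowledged before reaching the DB."""

    def __init__(self, database: Dict[str, str]) -> None:
        self._db = database
        self._store: Dict[str, str] = {}
        self._dirty: Set[str] = set()  # keys written to the cache but not the DB

    def put(self, key: str, value: str) -> None:
        self._store[key] = value
        self._dirty.add(key)  # returns immediately; the database is untouched

    def get(self, key: str) -> Optional[str]:
        return self._store.get(key, self._db.get(key))

    def flush(self) -> int:
        """Write all dirty entries to the database; return the number of writes."""
        writes = len(self._dirty)
        for key in self._dirty:
            self._db[key] = self._store[key]
        self._dirty.clear()
        return writes


database: Dict[str, str] = {}
wb_cache = WriteBackCache(database)
for count in ("1", "2", "3"):
    wb_cache.put("counter", count)  # three writes to the same key

db_before_flush = dict(database)   # nothing has reached the database yet
db_writes = wb_cache.flush()       # the three updates coalesce into one write
```

The three updates cost a single database write. Anything still marked dirty is lost if the cache fails before `flush` runs.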

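Tech: Redis (with the application or a background worker flushing to the database). Note that AWS DAX is a write-through cache, not write-back.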

![Image](../images/write-back.png)

*Pros:*
- Good for write-heavy workloads.
- Works well when combined with read-through.
- Resilient to database failures and can tolerate some database downtime.
- If batching or coalescing is supported, it can reduce overall writes to the database, which decreases the load and reduces costs.
  
*Cons:*
- If there’s a cache failure, data that has not yet been written to the database may be permanently lost.

> Most relational database storage engines (e.g. InnoDB) have a write-back cache enabled by default in their internals. Writes are first applied in memory and eventually flushed to disk.