# Caching Fundamentals

## What is Caching?

Caching is a technique used to store frequently accessed data in a fast storage layer (such as memory or a distributed cache) so that future requests for the same data can be served quickly, without repeatedly fetching it from a slower source like a database, file system, or external API.

**Key points:**
- Caching saves data closer to the application to improve speed and reduce load.
- It acts as a buffer between the application and the slower data source.
- Common cache types: in-memory (e.g., Redis, Memcached), browser cache, CDN edge cache.
- Used in web apps, databases, operating systems, and more.


**Caching minimizes:**
- Latency
- Load on primary data source
- Network calls
- CPU and I/O usage

## Why is Caching Needed?
Caching is essential for building scalable, high-performance systems. It helps reduce latency, offload backend resources, and improve user experience.

**Without caching:**
- Every request hits the database or backend service
- Higher response time for users
- Increased CPU, memory, and database load
- Poor scalability and higher infrastructure costs
- Risk of bottlenecks and service degradation during peak loads, with the database usually the first component to saturate
- Handling a larger load without a cache means scaling the database itself, often horizontally
- All of this ultimately increases cloud costs


**With caching:**
- Faster responses (data served from memory or local storage)
- Reduced database and backend calls
- Better scalability and cost efficiency
- Improved user experience and perceived performance
- Ability to handle higher traffic with the same resources

**Example:**
A news website caches its homepage articles. Instead of querying the database for every visitor, it serves the cached version, which reduces database load and delivers the page faster.

## How Does Cache Work?

Applications typically store data in a database, and reading it requires network calls and I/O operations. These are time-consuming and can slow down response times.

A cache acts as a high-speed data layer between the application and the database. When a request is made:
1. The application first checks the cache for the required data.
2. If the data is found (cache hit), it is returned immediately, avoiding a database call.
3. If the data is not found (cache miss), the application fetches it from the database, stores it in the cache for future requests, and then returns it to the user.
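
The three steps above can be sketched in plain Java. This is a minimal illustration, not a production cache: `CacheAsideReader` is a hypothetical name, and the `database` function is a stand-in for a real query.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper illustrating the hit/miss flow (often called cache-aside).
class CacheAsideReader<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> database; // assumption: stand-in for a real database query

    CacheAsideReader(Function<K, V> database) {
        this.database = database;
    }

    V get(K key) {
        // Step 1: check the cache first.
        V cached = cache.get(key);
        if (cached != null) {
            return cached; // Step 2: cache hit, no database call.
        }
        // Step 3: cache miss, so load from the database and remember the result.
        V loaded = database.apply(key);
        if (loaded != null) {
            cache.put(key, loaded);
        }
        return loaded;
    }
}
```

Calling `get("a")` twice invokes the `database` function only once; the second call is served from the map. The sketch never evicts or expires entries, which a real cache would need to do.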

**Benefits:**
- Reduces network and database load
- Improves API and application performance
- Enables systems to scale efficiently under heavy traffic

**Illustration:**
- Without cache: Every request → Database → Slower response
- With cache: Most requests → Cache → Fast response; only occasional requests go to the database
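
The effect of the hit ratio on average response time can be estimated with a simple weighted average. The sketch below is a simplification (it ignores the cost of the failed cache lookup on a miss), and `LatencyEstimate` and the figures in the usage note are illustrative assumptions, not measurements.

```java
// Hypothetical helper: rough average latency for a given cache hit ratio.
class LatencyEstimate {
    // expected = hitRatio * cacheLatency + (1 - hitRatio) * databaseLatency
    static double expectedMillis(double hitRatio, double cacheMillis, double dbMillis) {
        return hitRatio * cacheMillis + (1.0 - hitRatio) * dbMillis;
    }
}
```

For example, with an assumed 90% hit ratio, 1 ms cache reads, and 20 ms database reads, `LatencyEstimate.expectedMillis(0.9, 1.0, 20.0)` comes out at roughly 2.9 ms, compared with 20 ms when every request goes to the database.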

## Types of Caching

### 1. In-Memory Cache
- **Local (per-JVM):** Fastest access, data stored in the application's memory (e.g., Caffeine, Guava Cache). No network overhead, but not shared across nodes. Suitable for small datasets and stateless services.
- **Distributed:** Data is stored in a cluster of cache servers (e.g., Redis, Memcached, Hazelcast, Apache Ignite). Enables horizontal scaling and high availability. Used for large-scale, multi-node Java applications.
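
As a rough idea of what a local, per-JVM cache does internally, here is a minimal bounded LRU cache built on the JDK's `LinkedHashMap`. `LruCache` is a hypothetical name; libraries such as Caffeine use more sophisticated eviction policies and are thread-safe, which this sketch is not.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded local cache; evicts the least recently used entry.
// Not thread-safe: wrap with Collections.synchronizedMap for concurrent use.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true: ordering follows reads and writes
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insert; returning true drops the least recently used entry.
        return size() > maxEntries;
    }
}
```

With a capacity of 2, inserting `a` and `b`, reading `a`, and then inserting `c` evicts `b`, because `a` was touched more recently.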

### 2. Persistent/Hybrid Cache
- Combines memory and disk storage (e.g., Ehcache, Ignite). Retains data across restarts, useful for large datasets or when cache warm-up is expensive.

### 3. CDN (Content Delivery Network) Cache
- Caches static assets (images, JS, CSS) at edge locations close to users. Reduces latency and offloads backend servers. Not Java-specific, but critical for web-scale systems.

### 4. Application-Level Cache
- Caching at the service or repository layer (e.g., using Spring Cache abstraction). Can cache method results, database queries, or expensive computations. Supports annotations and AOP for declarative caching.

### 5. Database Cache
- **Query cache:** Caches results of frequent DB queries (e.g., the Hibernate query cache). Note that MySQL's built-in Query Cache was deprecated in 5.7.20 and removed in MySQL 8.0.
- **Row/Entity cache:** Caches individual rows/entities (e.g., JPA/Hibernate L2 cache, Redis as a backing store).
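- **Write-through/read-through:** Integrates the cache with DB operations for consistency. With write-through, every write updates the DB and the cache in the same operation; with read-through, the cache itself loads missing entries from the DB.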
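
A minimal sketch of the write-through/read-through idea, assuming a plain `Map` as a stand-in for the database. `WriteThroughCache` is a hypothetical class, and a real implementation would also need to handle failures between the two writes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: callers talk only to the cache, which owns both paths.
class WriteThroughCache<K, V> {
    private final Map<K, V> cache = new HashMap<>();
    private final Map<K, V> store; // assumption: stand-in for the database

    WriteThroughCache(Map<K, V> store) {
        this.store = store;
    }

    // Read-through: on a miss, the cache itself loads the value from the store.
    synchronized V get(K key) {
        V value = cache.get(key);
        if (value == null) {
            value = store.get(key);
            if (value != null) {
                cache.put(key, value);
            }
        }
        return value;
    }

    // Write-through: update the store first, then the cache, in the same call.
    synchronized void put(K key, V value) {
        store.put(key, value);
        cache.put(key, value);
    }
}
```

Because every write goes through `put`, the cached value and the stored value stay aligned as long as nothing writes to the store directly.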

### 6. HTTP Cache
- Caches HTTP responses at the client, proxy, or gateway (e.g., using Cache-Control headers, reverse proxies like NGINX, or API gateways).
- Useful for REST APIs and microservices to reduce repeated processing.
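
To make the header-based mechanism concrete, the sketch below shows how a server might pick a `Cache-Control` policy and answer a conditional request with `304 Not Modified` when the client's `If-None-Match` value matches the current `ETag`. `ConditionalGet`, the 60-second policy, and the CRC32-based tag are illustrative assumptions; a real service would normally rely on its web framework for this.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical helper: decides between 200 and 304 for a conditional GET.
class ConditionalGet {
    // Example policy: any cache may reuse the response for 60 seconds.
    static final String CACHE_CONTROL = "public, max-age=60";

    // Derives a validator from the response body (CRC32 chosen only for brevity).
    static String etagFor(String body) {
        CRC32 crc = new CRC32();
        crc.update(body.getBytes(StandardCharsets.UTF_8));
        return "\"" + Long.toHexString(crc.getValue()) + "\"";
    }

    // Returns 304 when the client's If-None-Match value matches the current ETag.
    static int statusFor(String ifNoneMatch, String body) {
        return etagFor(body).equals(ifNoneMatch) ? 304 : 200;
    }
}
```

A client that already holds the current representation sends its stored tag back and receives a body-less 304, so the response does not have to be regenerated or retransmitted.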

### 7. OS/Page Cache
- Operating system caches disk blocks in memory. Impacts DB and file I/O performance. Not directly controlled by Java, but important for backend performance tuning.

## Caching in Distributed Systems

### 1. Cache Consistency Models
- **Strong Consistency:** Guarantees that all nodes see the same data at the same time. Typically achieved with distributed locks or a coordination service built on a consensus protocol (e.g., ZooKeeper, etcd), at the cost of added latency.
- **Eventual Consistency:** Updates propagate asynchronously; temporary staleness is tolerated. Many distributed cache deployments accept this model for scalability (e.g., Redis replicates to its replicas asynchronously by default).
- **Read-Your-Writes Consistency:** Guarantees that a client always sees its own updates, even if other clients see stale data.

### 2. Cache Invalidation in Distributed Environments
- **Explicit Invalidation:** Application sends a message/event to all cache nodes to remove or update a key (e.g., using pub/sub in Redis).
- **Time-Based Expiry (TTL):** Each node independently expires data after a set time. Simple, but may serve stale data briefly.
- **Versioning:** Store a version/timestamp with each cache entry; clients ignore stale versions.
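
Time-based expiry and explicit invalidation can be sketched together on a single node. In this hypothetical `TtlCache`, the clock is injected as a `LongSupplier` purely so expiry can be exercised without sleeping; in a distributed setup, `invalidate` would be called by whatever handler receives the invalidation message. Versioning is not shown.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Hypothetical single-node cache with per-entry TTL and explicit invalidation.
class TtlCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;

        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // e.g. System::currentTimeMillis

    TtlCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    void put(K key, V value) {
        entries.put(key, new Entry<>(value, clock.getAsLong() + ttlMillis));
    }

    // Returns null when the key is absent or its TTL has elapsed.
    V get(K key) {
        Entry<V> entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        if (clock.getAsLong() >= entry.expiresAtMillis) {
            entries.remove(key, entry); // lazy expiry: drop only this exact entry
            return null;
        }
        return entry.value;
    }

    // Explicit invalidation, e.g. triggered by an invalidation event.
    void invalidate(K key) {
        entries.remove(key);
    }
}
```

Entries here are expired lazily on read, so keys that are never read again stay in memory; real caches usually pair TTLs with a size bound or a background sweep.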

### 3. Preventing Cache Stampede and Avalanche
- **Locking/Request Coalescing:** Only one thread fetches data from the DB on a cache miss; others wait for the result.
- **Randomized TTLs:** Prevents many keys from expiring simultaneously, avoiding spikes in DB load.
- **Background Refresh:** Proactively refresh hot keys before they expire.
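
The first two techniques can be sketched with JDK classes only. `CoalescingLoader` is a hypothetical name, the `database` function stands in for the real query, and the 20% jitter range is an arbitrary choice for illustration. The loader only deduplicates concurrent loads; storing the result in a cache is left to the caller.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Function;

// Hypothetical single-flight loader: concurrent misses on a key share one load.
class CoalescingLoader<K, V> {
    private final ConcurrentHashMap<K, CompletableFuture<V>> inFlight = new ConcurrentHashMap<>();
    private final Function<K, V> database; // assumption: stand-in for the real query

    CoalescingLoader(Function<K, V> database) {
        this.database = database;
    }

    V load(K key) {
        CompletableFuture<V> mine = new CompletableFuture<>();
        CompletableFuture<V> existing = inFlight.putIfAbsent(key, mine);
        if (existing != null) {
            return existing.join(); // another thread is already loading this key
        }
        try {
            V value = database.apply(key);
            mine.complete(value);
            return value;
        } catch (RuntimeException | Error e) {
            mine.completeExceptionally(e); // waiting threads fail as well
            throw e;
        } finally {
            inFlight.remove(key, mine);
        }
    }

    // Randomized TTL: the base plus up to 20% jitter, so keys do not expire together.
    static long jitteredTtlMillis(long baseMillis) {
        return baseMillis + ThreadLocalRandom.current().nextLong(baseMillis / 5 + 1);
    }
}
```

When several threads miss on the same key at once, only the first runs the `database` function; the others block on the shared future and receive the same value.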

### 4. Monitoring and Observability
- Track cache hit/miss rates, evictions, and latency across all nodes.
- Use distributed tracing to correlate cache performance with end-to-end latency.
- Integrate with monitoring tools (Prometheus, Grafana, ELK stack).
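
As a starting point for the hit/miss tracking mentioned above, a cache wrapper could update counters like the ones below on every lookup and expose the ratio to a metrics exporter. `CacheStats` is a hypothetical class; mature cache libraries typically ship their own statistics support.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical counters a cache wrapper could update on every lookup.
class CacheStats {
    private final LongAdder hits = new LongAdder();
    private final LongAdder misses = new LongAdder();

    void recordHit() {
        hits.increment();
    }

    void recordMiss() {
        misses.increment();
    }

    // Fraction of lookups served from the cache; 0.0 before any lookups.
    double hitRatio() {
        long h = hits.sum();
        long total = h + misses.sum();
        return total == 0 ? 0.0 : (double) h / total;
    }
}
```

`LongAdder` is used instead of a single `AtomicLong` because it scales better when many threads increment the same counter.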

### 5. Security and Data Protection
- Avoid caching sensitive data unless encrypted and access-controlled.
- Use network-level security (TLS, firewalls) for distributed cache clusters.

---

## Why Consider Database Caching?

Database caching is a critical technique for modern data architectures, especially in high-concurrency, low-latency environments. By storing frequently accessed data in memory, systems can dramatically reduce response times and backend load.

### Key Benefits
- **Reduced Latency:** Serving data from cache avoids slow disk I/O and round-trips to the database. Lookups in a local in-memory cache can complete in well under a millisecond, while a distributed cache still adds one network hop.
- **Scalability:** Caching enables systems to handle thousands or millions of concurrent users without overwhelming the database.
- **Cost Efficiency:** Reduces the need for expensive database scaling (vertical or horizontal), lowering infrastructure costs.
- **Improved Throughput:** Batch jobs and analytics queries run significantly faster when intermediate results are cached.
- **Resilience:** Caching can help absorb traffic spikes and protect the database from overload during peak usage.

### Real-World Impact
- In favorable cases, jobs that previously took hours can be reduced to minutes, and jobs that took minutes can be cut to seconds.
- E-commerce sites cache product catalogs and inventory to deliver instant search results.
- Social networks cache user profiles and feeds to support real-time interactions.