## Executor Memory in Spark
Executor memory in Spark is the memory allocated to each executor process. Executors run tasks and store intermediate data.

Spark uses a unified memory management model that handles both execution and storage memory.

**Potential Interview Questions:**

- Why do we get OOM errors when data can be spilled to disk?
- How does Spark manage storage inside an executor internally?
- How are tasks split within an executor?
- Why do we need overhead memory?
- When do we get executor OOM errors?
- What types of memory manager does Spark have?

### Components of Executor Memory
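The total memory requested for an executor is made up of `heap memory` (`spark.executor.memory`), `overhead memory` (`spark.executor.memoryOverhead`), and, if enabled, off-heap memory.

- **JVM Heap Memory:** The main memory for data processing and storage (divided into execution and storage memory).
- **Off-Heap Memory:** Memory outside the JVM heap that Spark manages directly (e.g., for Tungsten optimizations). It is disabled by default, enabled with `spark.memory.offHeap.enabled`, and sized with `spark.memory.offHeap.size`.
- **Overhead Memory:** Additional non-heap memory for JVM and native overheads, such as thread stacks, interned strings, and network buffers used during shuffles.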

<div style="text-align: center; line-height: 0; padding-top: 9px;">
  <img src="https://raw.githubusercontent.com/rohish-zade/PySpark/refs/heads/main/materials/spark-executor-memory.png" alt="spark-executor-memory" style="width: 600px">
</div>


### Overhead Memory
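- Memory allocated outside the JVM heap for each executor's container.
- Controlled by the `spark.executor.memoryOverhead` parameter.
- Covers JVM overheads that live outside the heap (e.g., thread stacks, class metadata, and garbage collector bookkeeping).
- Network buffers used for shuffle operations.
- Native memory for off-heap operations.
- Other non-JVM processes in the same container (e.g., Python workers in PySpark when `spark.executor.pyspark.memory` is not set).
- Default value: max(10% of `spark.executor.memory`, 384 MB).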
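
The default rule above can be worked through with plain arithmetic. The following is a minimal sketch, assuming the 10% factor and the 384 MB floor stated above; the function names are illustrative and not part of any Spark API, and the container total ignores off-heap and PySpark memory settings.

```python
MIN_OVERHEAD_MIB = 384   # floor stated above
OVERHEAD_PERCENT = 10    # 10% of spark.executor.memory, as stated above


def default_overhead_mib(executor_memory_mib: int) -> int:
    """Default overhead for a given heap size, in MiB (integer arithmetic)."""
    return max(executor_memory_mib * OVERHEAD_PERCENT // 100, MIN_OVERHEAD_MIB)


def container_request_mib(executor_memory_mib: int) -> int:
    """Heap plus default overhead: roughly what is requested per executor."""
    return executor_memory_mib + default_overhead_mib(executor_memory_mib)


for heap_gib in (2, 4, 10):
    heap_mib = heap_gib * 1024
    print(heap_gib, default_overhead_mib(heap_mib), container_request_mib(heap_mib))
```

For small heaps the 384 MB floor dominates; only above roughly 3.75 GB of heap does the 10% rule take over.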

### JVM Heap Memory
This is the main part of the executor memory, controlled by the `spark.executor.memory` parameter.

It is further divided into three areas:
- **Reserved Memory**
- **User Memory**
- **Spark Memory**


#### Reserved Memory
- Ensures that Spark's internal operations (e.g., monitoring, bookkeeping, and maintaining metadata) have enough memory to function even under heavy workloads.
- Stores Spark's internal objects and is used by the Spark engine itself.
- It is not available for either execution or storage tasks.
- `Default Value:` Spark reserves 300 MB for this purpose. The value is hardcoded and is not meant to be reconfigured.


#### User Memory
The heap that remains once reserved memory and Spark memory are set aside. With the default `spark.memory.fraction` of 0.6, this is 40% of (heap minus 300 MB). It is used for:
- Heap memory outside Spark's managed memory pool.
- User-defined data structures and objects created by user-defined functions.
- Metadata and data structures related to RDD partitions, Spark SQL structures, and third-party libraries.

#### Spark Memory

Spark uses a unified memory model, which means the execution and storage memory share the same pool.
Unused storage memory can be borrowed by execution tasks and vice versa.

Spark memory is `spark.memory.fraction` of the heap left after reserved memory (default 0.6 since Spark 2.0; 0.75 in Spark 1.6). It is further divided into:
- **Execution Memory:** 
  - Used for shuffle, join, sort, and aggregation operations. It spills to disk if memory limits are exceeded.
  - Holds the temporary data structures and intermediate data built during task execution, such as hash tables for aggregations and sort buffers for shuffles.
  - If Storage Memory isn't fully utilized, the unused part can be borrowed for Execution Memory.

- **Storage Memory:** 
  - Used for caching RDDs, DataFrames, and Datasets, and for holding broadcast variables.
  - It's managed by the Memory Manager and takes up a portion of the heap and, when off-heap memory is enabled, a portion of the off-heap memory.
  - When it is full, cached blocks are evicted using the LRU (Least Recently Used) policy. Evicted blocks are written to disk if their storage level allows it (e.g., `MEMORY_AND_DISK`); otherwise they are dropped and recomputed when needed.
  - Purpose:
    - To avoid recomputation of frequently accessed data.
    - To reduce I/O overhead by caching data in memory instead of reading it repeatedly from disk.
  - Configuration:
    - The size of storage memory is dynamically adjusted within the Spark Memory pool based on workload needs.
    - `spark.memory.storageFraction` (default 0.5) sets the share of Spark Memory within which cached blocks cannot be evicted by execution.
    - Example: If execution memory requires more space, it can reclaim what storage borrowed beyond that share, evicting cached blocks if needed.

The boundary between execution and storage memory is flexible, allowing Spark to allocate memory dynamically based on current needs.
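
The split of the heap into reserved, user, and Spark memory can be estimated with a short calculation. This is a rough sketch, assuming the 300 MB reserved memory described above, the `spark.memory.fraction` default of 0.6 used since Spark 2.0 (pass 0.75 to model Spark 1.6), and a `spark.memory.storageFraction` of 0.5. The function name is illustrative, and real figures run slightly lower because the JVM reports a bit less usable heap than `spark.executor.memory`.

```python
RESERVED_MIB = 300  # fixed reserved memory described above


def heap_regions_mib(executor_memory_mib, memory_fraction=0.6, storage_fraction=0.5):
    """Approximate sizes of the heap regions, in MiB."""
    usable = executor_memory_mib - RESERVED_MIB
    spark_memory = usable * memory_fraction       # execution + storage pool
    storage = spark_memory * storage_fraction     # initial storage share
    return {
        "reserved": RESERVED_MIB,
        "user": usable - spark_memory,
        "spark": spark_memory,
        "storage_initial": storage,
        "execution_initial": spark_memory - storage,
    }


regions = heap_regions_mib(4096)  # a 4 GB executor heap
for name, size in regions.items():
    print(f"{name:18s} {size:8.1f} MiB")
```

The storage and execution figures are only starting points, since the boundary between them moves at runtime.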

#### Why do we get executor OOM (Out of Memory) when data can be spilled to the disk?

- Spilling to disk still requires memory: Spark buffers and organizes data in memory (e.g., for shuffle or sort operations) before the spill happens.
- If the memory needed for this preparation exceeds the available memory, OOM occurs even before spilling starts.
- Not everything can be spilled. Objects in user memory, or a very large record or partition caused by improper partitioning or skew, must still fit in memory, so the executor can run out of memory. Additionally, insufficient overhead memory and slow disk I/O can exacerbate the issue.

##### Common reasons for executor OOM (Out of Memory)
- **Insufficient Memory for Spill Preparation:** Before spilling data to disk, Spark organizes and buffers it in memory. If memory is too constrained, even temporary buffering fails.
- **Large Shuffle or Aggregation Operations:** When shuffle or aggregation results exceed the allocated memory and the spill process cannot keep up, OOM can occur.
- **Improper Memory Allocation:**
  - Insufficient memory for the task (executor memory too small).
  - Poor tuning of memory fractions (e.g., `spark.memory.fraction`).
- **Large Broadcast Variables:** If broadcast variables or DataFrames are too large to fit into available memory.
- **Memory Leaks:** Accumulation of references in RDDs or DataFrames can prevent garbage collection.
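
Several of the causes above correspond to configuration properties. The sketch below collects them in a plain dictionary and renders them as `spark-submit` arguments. The property names are standard Spark settings, but every value is purely illustrative, and `to_submit_args` is a hypothetical helper rather than a Spark function.

```python
# Illustrative values only; suitable numbers depend on the workload and cluster.
oom_tuning_conf = {
    "spark.executor.memory": "8g",            # larger heap per executor
    "spark.executor.memoryOverhead": "1g",    # more room outside the heap
    "spark.memory.fraction": "0.6",           # share of (heap - reserved) for execution + storage
    "spark.memory.storageFraction": "0.5",    # part of that share protected for cached blocks
    "spark.sql.shuffle.partitions": "400",    # more, smaller shuffle partitions
    "spark.sql.autoBroadcastJoinThreshold": str(10 * 1024 * 1024),  # auto-broadcast cap in bytes
}


def to_submit_args(conf):
    """Render settings as `--conf key=value` arguments for spark-submit."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


print(" ".join(to_submit_args(oom_tuning_conf)))
```

The same key/value pairs could instead be passed to `SparkSession.builder.config(key, value)` before the session is created.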


#### How does Spark manage storage inside an executor internally?
Inside each executor, Spark divides memory into distinct areas:

**a) Unified Memory Management (default in Spark 1.6+):**
- Execution Memory: For computations like shuffles, joins, sorts, and aggregations.
- Storage Memory: For caching RDDs and DataFrames.
- Both share the same memory pool, which can dynamically adjust:
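  - If execution memory demands increase, it can take free storage memory and can evict cached blocks that storage borrowed beyond its protected share.
  - Similarly, storage memory can expand into unused execution memory, but it cannot evict memory that execution is using.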
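
The borrowing rules can be illustrated with a toy model. This is a simplified sketch, not Spark's actual memory manager: the class and method names are hypothetical, sizes are arbitrary units, and it assumes that storage never evicts execution while execution may evict cached data held above a protected storage region. It also ignores that storage can evict its own older blocks to make room.

```python
class UnifiedPool:
    """Toy model of a shared execution/storage pool."""

    def __init__(self, total, storage_region):
        self.total = total
        self.storage_region = storage_region  # cached data below this is safe from eviction
        self.execution_used = 0
        self.storage_used = 0

    def free(self):
        return self.total - self.execution_used - self.storage_used

    def acquire_storage(self, amount):
        """Storage only takes free memory; it never evicts execution."""
        if amount > self.free():
            return False
        self.storage_used += amount
        return True

    def acquire_execution(self, amount):
        """Execution takes free memory, then evicts storage above its protected region."""
        evictable = max(self.storage_used - self.storage_region, 0)
        granted = min(amount, self.free() + evictable)
        to_evict = max(granted - self.free(), 0)
        self.storage_used -= to_evict     # cached blocks evicted to make room
        self.execution_used += granted
        return granted                    # may be less than requested: the task must spill


pool = UnifiedPool(total=1000, storage_region=500)
pool.acquire_storage(800)                # storage borrows 300 beyond its region
print(pool.acquire_execution(400))       # uses 200 free, evicts 200 of cached data
print(pool.acquire_execution(300))       # only 100 more is evictable
print(pool.acquire_storage(100))         # no free memory left for new cached data
```

In the last two calls the pool is exhausted, which is the point at which execution would spill and new cache requests would fail in this model.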

**b) Heap Memory Breakdown:**
- Executor Memory: Divided into:
  - User Memory: Available for user-defined objects.
  - Reserved Memory: Fixed memory Spark reserves for internal needs.
  - Spark Memory: 
    - Execution (task-specific computations).
    - Storage (caching/persistence).

**c) Disk Spill Mechanism:**
- When memory is insufficient, Spark spills intermediate data to disk.
- Spill occurs during memory-intensive operations (e.g., shuffle, sort, aggregation, or join).

#### Why do we need overhead memory?
Overhead memory is reserved for non-heap usage, such as:
- **Serialization/Deserialization:** Off-heap buffers used when transferring serialized data between executors.
- **JVM Overheads:** Space required for JVM metadata, native memory, etc.
- **Task Overheads:** Buffering, intermediate data structures, and OS-level operations.
- **Configuration:** Controlled by `spark.executor.memoryOverhead`. On YARN, the older name `spark.yarn.executor.memoryOverhead` is deprecated.

Without sufficient overhead memory, the container can exceed its memory limit and be killed by the cluster manager (e.g., YARN or Kubernetes), which shows up as executor failures.

#### When do we get executor OOM?
Executor OOM errors typically occur under the following conditions:
- Insufficient Memory Allocation:
  - Executor memory is smaller than required for task execution or caching.
- Large Shuffle Data:
  - Large shuffle blocks, and the buffers needed to read, sort, and spill them, can exceed memory limits.
- Improper Partitioning:
  - Too few partitions lead to large data chunks in a single task, overwhelming memory.
- Caching Large Data:
  - When attempting to persist or cache large RDDs/DataFrames with insufficient storage memory.
- Broadcast Variables:
  - Overly large broadcast variables can exhaust executor memory.
- Insufficient Overhead Memory:
  - Overhead memory isn’t enough for OS and JVM operations.
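
The partitioning condition above lends itself to a quick estimate. This is a back-of-the-envelope sketch, assuming a target of about 128 MB per partition, which is a common rule of thumb rather than a value taken from this document; both function names are illustrative.

```python
import math


def avg_partition_mib(input_size_mib, num_partitions):
    """Average amount of data a single task has to process, in MiB."""
    return input_size_mib / num_partitions


def partitions_for(input_size_mib, target_partition_mib=128):
    """Number of partitions needed to keep each one near the target size."""
    return max(1, math.ceil(input_size_mib / target_partition_mib))


input_mib = 100 * 1024                      # 100 GB of input
print(avg_partition_mib(input_mib, 200))    # average per task with 200 partitions
print(partitions_for(input_mib))            # partitions needed for ~128 MB each
```

These are averages: with skewed keys the largest partition can be far bigger than the mean, so a sensible partition count alone does not rule out OOM.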

#### Types of Memory Manager in Spark
Spark has two types of memory management systems:

##### a) Static Memory Manager (Spark 1.5 and earlier)
- Fixed allocation for execution and storage memory.
- Poor utilization, because memory that is idle in one region cannot be used by the other.

##### b) Unified Memory Manager (Default since Spark 1.6)
- Dynamic Sharing:
  - Execution and storage memory share a unified pool.
  - Unused memory in one area can be used by the other.