# Memory Management in Spark

Spark has shipped two memory managers:

- Static Memory Manager (*Deprecated* since Spark 1.6).
- Unified Memory Manager (*default*).

## Unified Memory Manager (UMM)

### Architecture

![image.png](./images/Cd_Unified_Memory_Manager_5GB.jpg)

[Source](https://community.cloudera.com/t5/Community-Articles/Spark-Memory-Management/ta-p/317794)

Advantages of UMM:
- The boundary between storage memory and execution memory is not static: under memory pressure it can move, so one region grows by borrowing free space from the other.
- When the application caches nothing, execution can use all of the shared memory, which avoids unnecessary spills to disk.
- When the application does cache data, a minimum amount of storage memory is reserved so that cached blocks inside it are not evicted by execution.
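
The borrowing rules above can be sketched with a toy model. This is not Spark's implementation: the class `UnifiedPoolModel` and its method names are hypothetical, sizes are plain megabytes, and the model ignores details such as storage evicting its own older blocks. It only assumes the behavior described above: execution may evict cached data down to a protected minimum, while storage may only use memory that is currently free.

```python
from dataclasses import dataclass


@dataclass
class UnifiedPoolModel:
    """Toy model of the shared execution/storage pool (hypothetical, not Spark code)."""
    total: int               # size of the shared region, in MB
    storage_protected: int   # cached data below this size is never evicted, in MB
    storage_used: int = 0
    execution_used: int = 0

    def free(self) -> int:
        return self.total - self.storage_used - self.execution_used

    def acquire_storage(self, mb: int) -> bool:
        """Cache a block only if free memory covers it; execution is never evicted."""
        if mb > self.free():
            return False
        self.storage_used += mb
        return True

    def acquire_execution(self, mb: int) -> int:
        """Grant execution memory, evicting cached data above the protected minimum if needed."""
        shortfall = mb - self.free()
        if shortfall > 0:
            evictable = max(0, self.storage_used - self.storage_protected)
            self.storage_used -= min(shortfall, evictable)
        granted = min(mb, self.free())
        self.execution_used += granted
        return granted


pool = UnifiedPoolModel(total=1000, storage_protected=500)
pool.acquire_storage(800)     # storage borrows beyond its half while memory is free
pool.acquire_execution(400)   # execution evicts 200 MB of cached data to get 400 MB
pool.acquire_execution(300)   # only 100 MB more can be evicted; storage stops at 500 MB
print(pool.storage_used, pool.execution_used)
```

In this model the second execution request is only partially granted, which is the point at which a real task would have to spill to disk.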

A Spark executor can use two types of memory:

- On-Heap Memory.
- Off-Heap Memory.

Spark also uses one more segment of memory: external process memory, which resides outside the JVM and is mainly used by PySpark and SparkR applications.

### On-Heap Memory (*default*)

The size of the on-heap memory is configured by the --executor-memory or spark.executor.memory parameter when the Spark application starts.

Two main configurations control executor memory allocation:

| Parameter | Description |
| -- | -- |
|spark.memory.fraction (default 0.6) | Fraction of (heap space - 300MB) used for execution and storage. The lower this is, the more frequently spills and cached data evictions occur. The remaining space is set aside for internal metadata, user data structures, and imprecise size estimation in the case of sparse, unusually large records. |
|spark.memory.storageFraction (default 0.5) | Size of the storage region, expressed as a fraction of the space set aside by spark.memory.fraction. Cached blocks inside this region are immune to eviction by execution; cached data may only be evicted if total storage exceeds this region. |

The on-heap memory is divided into three regions:

- Reserved Memory (**hard-coded** 300MB):
    - Reserved for the system; it stores Spark's internal objects.
    - If executor memory is less than 1.5 times the reserved memory (450MB), Spark will raise an error.

    ![image.png](./images/low_exe_memory_error.png)

- User Memory: *(Java Heap - Reserved Memory) × (1.0 - spark.memory.fraction)*
    - Used to store user-defined data structures, Spark internal metadata, any UDFs created by the user, and the data needed for RDD transformations, such as RDD dependency information.
    - Is 40% of (Java Heap - Reserved Memory) by default.

- Spark Memory: *(Java Heap - Reserved Memory) × spark.memory.fraction*
    - Execution: used for shuffles, joins, sorts, and aggregations.
    - Storage: used to cache partitions of data.
    - Is 60% of (Java Heap - Reserved Memory) by default.
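
As a worked example, the region sizes can be computed directly from the formulas above. This is a minimal sketch: the function name `on_heap_regions` is hypothetical, sizes are in MB, and it assumes the Java heap equals the configured executor memory (the heap the JVM actually reports can be slightly smaller). The storage value is the initial share given by spark.memory.storageFraction; at runtime the boundary with execution can move.

```python
RESERVED_MB = 300                 # hard-coded reserved memory
MIN_HEAP_MB = 1.5 * RESERVED_MB   # 450 MB, below this Spark refuses to start


def on_heap_regions(heap_mb, memory_fraction=0.6, storage_fraction=0.5):
    """Split an executor heap (in MB) into the regions described above."""
    if heap_mb < MIN_HEAP_MB:
        raise ValueError(
            f"Executor memory {heap_mb} MB must be at least {MIN_HEAP_MB:.0f} MB"
        )
    usable = heap_mb - RESERVED_MB           # Java Heap - Reserved Memory
    spark = usable * memory_fraction         # shared by execution and storage
    storage = spark * storage_fraction       # initial storage share
    return {
        "reserved": RESERVED_MB,
        "user": usable - spark,
        "spark": spark,
        "storage": storage,
        "execution": spark - storage,
    }


# Breakdown for a 5 GB heap with the default fractions.
for name, size in on_heap_regions(5 * 1024).items():
    print(f"{name:>9}: {size:8.1f} MB")
```

With the defaults, a 5 GB heap leaves 4820 MB after the reserved region, of which about 2892 MB is Spark memory (split evenly between storage and execution) and about 1928 MB is user memory.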