# PySpark: Zero to Hero
## Module 33: Spark Memory Management and OOM Analysis

In this module, we dive deep into **Spark Memory Management**. Understanding how Spark manages memory inside the JVM is crucial for tuning applications and avoiding the dreaded **Out Of Memory (OOM)** errors.

We will simulate a constrained environment (small executors) to mathematically verify Spark's memory division and intentionally trigger OOM errors to understand their root causes.

### Agenda:
1.  **Memory Theory:** Understand On-Heap, Off-Heap, Reserved, User, and Unified Memory.
2.  **Environment Setup:** Create a Spark Session with constrained memory (512MB per executor).
3.  **Memory Calculation:** Manually calculate available Storage/Execution memory and verify against Spark UI.
4.  **OOM Simulation 1:** Data Explosion using `explode()` on strings.
5.  **OOM Simulation 2:** Handling large single-record files (Skew).

In [None]:
from pyspark.sql import SparkSession
from pyspark.sql.functions import split, explode, lower, lit, count

# We configure a specific amount of memory to validate calculations easily.
# Executor Memory: 512MB
# Executor Cores: 4
# Note: With master "local[*]" the executor runs inside the driver JVM, so the heap size
# is governed by the driver's memory settings rather than 'spark.executor.memory'.
# On a cluster (YARN/K8s) the setting sizes each executor JVM; here it only simulates the limits.

spark = SparkSession.builder \
    .appName("Spark_Memory_Management_OOM") \
    .master("local[*]") \
    .config("spark.executor.memory", "512MB") \
    .config("spark.executor.cores", "4") \
    .config("spark.memory.fraction", "0.6") \
    .config("spark.memory.storageFraction", "0.5") \
    .config("spark.sql.adaptive.enabled", "false") \
    .getOrCreate()

print("Spark Session Created with constrained memory.")
print(f"Spark UI: {spark.sparkContext.uiWebUrl}")

## 1. Spark Memory Architecture

When you request `512MB` for an executor, you do not get the full 512MB for data storage and execution. The JVM and Spark partition this memory as follows:

1.  **JVM Heap:** The total memory requested (`512MB`).
2.  **Reserved Memory:** Spark sets aside a fixed **300MB** for its own internal objects. This value is hardcoded and not meant to be tuned.
3.  **Usable Memory:** `(JVM Heap - Reserved Memory)`.
    *   *Note:* The heap size the JVM reports to Spark is usually somewhat smaller than the size you requested, often around 89-90% of it, and Spark subtracts the 300MB from that reported figure. The exact ratio depends on the JVM, the garbage collector, the Spark version, and the deploy mode.

### The Unified Region (60/40 Split)
The remaining usable memory is split based on `spark.memory.fraction` (default 0.6):
1.  **Spark Memory (60%):** Unified region for **Storage** (cached data) and **Execution** (shuffles, joins, sorts).
2.  **User Memory (40%):** For your UDFs, RDD metadata, and custom data structures.
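
The split described above can be written as a small helper. This is a minimal sketch, not Spark's own code: the function name `spark_memory_regions` and its parameters are hypothetical, and `jvm_reported_mb` stands for whatever heap size the JVM actually reports, which you have to supply yourself. The lower bound in the sketch reflects Spark's requirement that the heap be at least 1.5 times the reserved memory (450MB).

```python
RESERVED_MB = 300  # fixed reservation described above


def spark_memory_regions(jvm_reported_mb, memory_fraction=0.6, storage_fraction=0.5):
    """Return the on-heap regions (in MB) for a given JVM-reported heap size."""
    min_heap_mb = RESERVED_MB * 1.5
    if jvm_reported_mb < min_heap_mb:
        raise ValueError(
            f"heap of {jvm_reported_mb} MB is below the {min_heap_mb:.0f} MB minimum"
        )
    usable = jvm_reported_mb - RESERVED_MB
    unified = usable * memory_fraction
    return {
        "reserved": RESERVED_MB,
        "user": usable - unified,
        "unified": unified,
        "storage_protected": unified * storage_fraction,
        "execution_initial": unified * (1 - storage_fraction),
    }


# Illustrative input: a JVM that reports a 1000 MB heap.
regions = spark_memory_regions(1000)
for name, mb in regions.items():
    print(f"{name:>18}: {mb:.1f} MB")
```

Passing the reported heap in directly, rather than multiplying the requested size by a fixed ratio, keeps the helper honest about the one number that varies between JVMs.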

In [None]:
# Let's perform the calculation based on standard Spark Memory Management logic.

executor_memory_mb = 512

# 1. JVM Usable (the heap the JVM reports to Spark is typically ~89-90% of the requested
# size; the exact ratio depends on the JVM and garbage collector).
jvm_usable_fraction = 0.89  
jvm_usable_memory = executor_memory_mb * jvm_usable_fraction
print(f"JVM Usable Memory (~89%): {jvm_usable_memory:.2f} MB")

# 2. Subtract Reserved Memory
reserved_memory = 300
usable_memory = jvm_usable_memory - reserved_memory
print(f"Usable Memory after Reserve: {usable_memory:.2f} MB")

# 3. Spark Unified Memory (Default 60% of Usable)
# Config: spark.memory.fraction = 0.6
spark_memory_fraction = 0.6
unified_memory = usable_memory * spark_memory_fraction

print(f"Total Unified Spark Memory (Storage + Execution): {unified_memory:.2f} MB")

# 4. User Memory (Remaining 40%)
user_memory = usable_memory * (1 - spark_memory_fraction)
print(f"User Memory: {user_memory:.2f} MB")

# 5. Storage Memory (Default 50% of Unified; cached blocks within it are not evicted by execution)
storage_pool = unified_memory * 0.5
print(f"Protected Storage Memory: {storage_pool:.2f} MB")

# Note: Check the 'Executors' tab in Spark UI to see the actual 'Storage Memory' column.
# It reports the total unified memory, so on a real 512MB executor expect roughly ~93 MB.
# In local mode the value follows the driver JVM's heap and will likely be larger.

## 2. OOM Simulation: The `explode` Operation

One common cause of OOM is **Data Explosion**. A transformation like `explode` creates a new row for every element in an array.

If a single row contains a massive array (for example, one produced by string splitting; a `crossJoin` multiplies rows in a similar way), the resulting data volume can quickly exceed the **Execution Memory**. Spark spills to disk where it can, but it crashes with an OOM when the data it must hold in memory at once is too large.
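
To get a feel for the multiplication factor before running Spark, the effect of `split` followed by `explode` can be approximated in plain Python. The helper `estimate_explode` below is hypothetical and only counts rows; actual memory use in Spark depends on its internal row format, so treat the numbers as rough.

```python
def estimate_explode(line, sep=" "):
    """Approximate what split + explode does to one input row."""
    words = line.split(sep)
    return {
        "input_rows": 1,
        "output_rows": len(words),       # one row per array element
        "input_chars": len(line),
        "distinct_words": len(set(words)),
    }


sample = "This is a sample line of text for demonstration purposes"
stats = estimate_explode(sample)
print(stats)

# The same sentence repeated many times on one line mimics a "massive" single record.
big_line = " ".join([sample] * 100_000)
big_stats = estimate_explode(big_line)
print(big_stats["output_rows"], "rows from 1 input row")
```

The ratio of `output_rows` to `input_rows` is the figure to watch: every one of those rows is produced inside the single partition that holds the source row.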

In [None]:
# In a real scenario this data would come from a text file.
# For this demo we generate a small one-row DataFrame instead,
# so the cell runs locally without any input file.

data = [("This is a sample line of text for demonstration purposes",)]
df_text = spark.createDataFrame(data, ["value"])

# Let's verify schema
df_text.printSchema()

In [None]:
# Logic:
# 1. Split the sentence into words (Array).
# 2. EXPLODE the array (Rows multiply).
# 3. Group and Count.

# To simulate OOM, imagine the line is massive or we duplicate the string many times.
# Here is the logic flow that usually consumes Execution Memory.

df_split = df_text.withColumn("splitted_val", split("value", " "))
    
df_exploded = df_split.withColumn("exploded_val", explode("splitted_val")) \
    .drop("value", "splitted_val")

df_count = df_exploded.groupBy("exploded_val").count()

print("Plan created. Running action...")
df_count.show()

# Note: If the source string were huge (several MB) and explode created millions of rows
# inside a single partition, this could crash the tiny 512MB executor.

## 3. OOM Simulation: Large Single Records

Spark processes data partition by partition. However, to process a single row, that entire row must fit into memory. 

If you read a file format like JSON or Text where a **single record** (e.g., a single line of text without newlines) is larger than the available Execution Memory (e.g., > 100MB in our constrained setup), Spark cannot spill that single record to disk. It must exist in memory to be deserialized.

**Result:** Immediate `java.lang.OutOfMemoryError: Java heap space`.
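
One way to avoid this is to break the oversized record into smaller lines before Spark reads it. The sketch below streams a single-line text file in fixed-size chunks and inserts newlines at word boundaries, so it never holds the whole line in memory. The function name `split_long_line`, the chunk size, and the line-length target are assumptions chosen for illustration; it also assumes words are separated by single spaces.

```python
import os
import tempfile


def split_long_line(src_path, dst_path, max_line_chars=1000, chunk_chars=64 * 1024):
    """Rewrite src as dst, breaking at spaces so lines stay within max_line_chars."""
    line_len = 0  # length of the output line currently being written
    carry = ""    # partial word left over from the previous chunk
    with open(src_path, "r", encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:

        def emit(word):
            nonlocal line_len
            # Start a new line if this word would push the current one over the limit.
            if line_len and line_len + 1 + len(word) > max_line_chars:
                dst.write("\n")
                line_len = 0
            if line_len:
                dst.write(" ")
                line_len += 1
            dst.write(word)
            line_len += len(word)

        while True:
            chunk = src.read(chunk_chars)
            if not chunk:
                break
            words = (carry + chunk).split(" ")
            carry = words.pop()  # the last piece may be cut mid-word
            for word in words:
                emit(word)
        if carry:
            emit(carry)


# Demo on a temporary file: one long line of 5000 short words.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "single_line.txt")
    dst = os.path.join(tmp, "multi_line.txt")
    with open(src, "w", encoding="utf-8") as f:
        f.write(" ".join(["word"] * 5000))
    split_long_line(src, dst, max_line_chars=100, chunk_chars=333)
    with open(dst, encoding="utf-8") as f:
        lines = f.read().split("\n")
    longest = max(len(line) for line in lines)
    total_words = sum(len(line.split(" ")) for line in lines)
    print(len(lines), "lines, longest:", longest, "chars, words:", total_words)
```

A single word longer than `max_line_chars` still ends up on its own oversized line, so this only helps when the record is made of many small tokens.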

In [None]:
# Simulating the reading of a "Single Line" file vs "Multi Line" file.

# Case A: Reading a file where the entire content is on one line.
# If 'text_file_singleline_xs.txt' is larger than the memory one task can use
# (the ~93 MB unified pool is shared by up to 4 concurrent tasks), this operation can fail.

# Uncomment to run if file exists
# df_single = spark.read.text("path/to/text_file_singleline_xs.txt")
# df_single.show() 
# ^ This triggers OOM because the single row object > available memory.

print("To avoid OOM with large single records:")
print("1. Increase Executor Memory (Vertical Scaling).")
print("2. Pre-process file to split lines.")
print("3. Avoid the 'wholetext' option for large files (it reads each file as a single record).")

In [None]:
# Case B: Reading the same data but formatted with newlines (Multi-line).
# Each record is small. Spark can process them one by one or batch them safely.

# df_multi = spark.read.text("path/to/text_file_xs.txt")
# df_multi.select(split(lower("value"), " ").alias("words")) \
#     .select(explode("words").alias("word")) \
#     .groupBy("word").count() \
#     .write.format("noop").mode("overwrite").save()

# Using 'noop' format is great for benchmarking (No Operation write).
# It forces the execution without writing output files.

print("Benchmarking execution complete.")

## Summary

1.  **Memory Division:**
    *   **Reserved:** 300MB (fixed).
    *   **Spark Fraction:** 60% of (Heap - Reserved).
    *   **Storage/Execution:** Share this 60% unified pool. Execution can evict cached blocks until Storage shrinks to its protected share (`spark.memory.storageFraction`), but Storage cannot evict Execution.

2.  **OOM Causes:**
    *   **Data Explosion:** `explode`, `crossJoin` producing more data than fits in Execution memory.
    *   **Skew/Big Records:** A single record larger than the memory available to a single task/core.
    *   **GC Overhead:** If the JVM spends almost all of its time in garbage collection (about 98%) while recovering very little heap (under 2%), it throws `java.lang.OutOfMemoryError: GC overhead limit exceeded`.

3.  **Solutions:**
    *   Tune `spark.memory.fraction` (rarely needed).
    *   Increase Executor Memory.
    *   Fix Code Logic (avoid massive single records, handle skew).
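    *   Use **Off-Heap Memory** (allocated outside the JVM heap and not subject to garbage collection; enabled explicitly via `spark.memory.offHeap.enabled` and `spark.memory.offHeap.size`) to reduce GC pressure.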
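
For the off-heap option, the two settings involved are `spark.memory.offHeap.enabled` and `spark.memory.offHeap.size`. The sketch below only assembles the configuration as a plain dictionary; the `256m` size and the helper name `offheap_conf` are illustrative assumptions, not recommendations. The commented lines show how the pairs could be applied when building a session.

```python
def offheap_conf(size="256m"):
    """Build the config pairs needed to enable Spark's off-heap memory."""
    return {
        "spark.memory.offHeap.enabled": "true",
        "spark.memory.offHeap.size": size,  # must be positive when off-heap is enabled
    }


conf = offheap_conf()
for key, value in conf.items():
    print(f"{key}={value}")

# Applying it (requires pyspark; left as comments so the snippet runs without Spark):
# from pyspark.sql import SparkSession
# builder = SparkSession.builder.appName("offheap_demo")
# for key, value in conf.items():
#     builder = builder.config(key, value)
# spark = builder.getOrCreate()
```

Off-heap memory is allocated in addition to the executor heap, so the container or machine needs enough headroom for both.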