# Debug Malloc Stats Challenge

**Goal:** Interpret memory allocation patterns using `sys._debugmallocstats()`

**You will learn:**
- How Python's memory allocator works (arenas, pools, blocks)
- How to read and interpret `_debugmallocstats()` output
- Memory allocation patterns for different object types
- Why Python doesn't always return memory to the OS

## Part 1: Baseline - Initial Memory State

This shows Python's memory state after startup. Notice the size classes and how many blocks are already in use.

**After running the next cell, look at the output and note:**
- How many arenas are allocated?
- Which size classes have the most blocks in use?
- How many free PyDictObjects are there?

In [None]:
import sys
import gc

print("BASELINE MEMORY STATE")
sys._debugmallocstats()

## Part 2: Creating 1000 Dictionaries

Now we'll create 1000 dictionaries and see how memory changes.

**Before running the next cell, predict:**
- Which size class will increase the most?
- Will new arenas be allocated?
- How many bytes will be allocated?

Write your predictions here:
- Size class prediction: ___________
- New arenas? (yes/no): ___________
- Bytes allocated: ___________

In [None]:
# Create 1000 dictionaries with some data
dicts = []
for i in range(1000):
    d = {
        'id': i,
        'name': f'item_{i}',
        'value': i * 2.5,
        'active': True
    }
    dicts.append(d)

print(f"Created {len(dicts)} dictionaries.")
print("\nMEMORY STATE AFTER CREATING 1000 DICTS")

sys._debugmallocstats()

### Analysis Questions - Part 2

Compare the output above with the baseline:

1. How many arenas are allocated now? Did it increase?
2. Which size classes showed the biggest increase in "blocks in use"?
3. Look at "free PyDictObjects" - did this number change? Why or why not?
4. Look at "# bytes in allocated blocks" - how much did it increase?

Write your observations in this cell (double-click to edit):
- 
- 
- 

## Part 3: Deleting All Dictionaries

Now let's delete all dictionaries and see what happens.

**Predict:**
- Will arenas be reclaimed (returned to OS)?
- Will the "free PyDictObjects" count increase?
- What happens to the "blocks in use" in the size classes?

Write your predictions:
- Arenas reclaimed? (yes/no): ___________
- Free PyDictObjects will: (increase/decrease/stay same): ___________
- Blocks in use will: (increase/decrease/stay same): ___________

In [None]:
# Delete all dictionaries
dicts.clear()
del dicts

print("Deleted all dictionaries.")
print("\nMEMORY STATE AFTER DELETION (before GC)")
print("=" * 70)
sys._debugmallocstats()

### Analysis Questions - Part 3

1. Were any arenas reclaimed? (Look at "# arenas reclaimed")
2. Did "free PyDictObjects" increase? By how much?
3. Did "blocks in use" for the size classes decrease?
4. Where did the memory go?

Write your observations:
- 
- 
- 

## Part 4: After Garbage Collection

Let's force garbage collection and see if anything changes.

In [None]:
collected = gc.collect()
print(f"GC collected {collected} objects.")
print("\nMEMORY STATE AFTER GARBAGE COLLECTION")
print("=" * 70)
sys._debugmallocstats()

### Analysis Questions - Part 4

1. Did GC collect many objects? Why or why not?
2. Did the memory state change significantly after GC?
3. Why didn't the arenas get reclaimed?

Write your observations:
- 
- 
- 

## Student Exercise Questions

Based on all the output above, answer these questions:

### 1. ARENA ANALYSIS:
- How many arenas were allocated initially?
- Did creating 1000 dicts cause new arenas to be allocated?
- After deleting, were any arenas reclaimed?
- Why or why not?

### 2. SIZE CLASS ANALYSIS:
- Which size class had the biggest increase after creating dicts?
- What does this tell you about dict object size?
- Did the "blocks in use" decrease after deletion?

### 3. MEMORY EFFICIENCY:
- Look at "bytes lost to quantization" - what does this mean?
- Look at "free PyDictObjects" - what's Python doing here?
- Calculate: What % of allocated memory was actually in use?

### 4. POOL ANALYSIS:
- How many pools were used initially?
- How many after creating dicts?
- How many "unused pools" after deletion?

### 5. PRACTICAL QUESTIONS:
- If you ran this program and exited, would memory return to the OS?
- Would repeatedly creating/deleting dicts cause memory to grow forever?
- When WOULD Python return memory to the OS?

**Write your answers in the cells below! Then scroll down to see the answer key.**

### Your Answers Here

Double-click to edit and write your answers:

**Arena Analysis:**
- 
- 

**Size Class Analysis:**
- 
- 

**Memory Efficiency:**
- 
- 

**Pool Analysis:**
- 
- 

**Practical Questions:**
- 
- 

---

# ANSWER KEY

## Understanding the Output

## Key Concept 1: Python's Memory Hierarchy

```
Arenas (1MB each) - Requested from OS
   └─> Pools (16KB each) - Subdivisions of arenas
        └─> Blocks (varies) - Actual object storage
```

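These sizes apply to 64-bit builds of CPython 3.10 and later; older versions use 256 KB arenas and 4 KB pools.

Think of it like: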
- **Arenas** = Buildings
- **Pools** = Floors in the building
- **Blocks** = Apartments on each floor
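
The sizes in the diagram above fix how many pools fit in one arena. Here is a quick arithmetic sketch, assuming the 1MB and 16KB figures are binary units (1 MiB and 16 KiB), which matches the `16384 bytes` pool size printed in the report. The constant names are made up for this illustration.

```python
ARENA_SIZE = 1024 * 1024   # 1 MiB per arena, as in the diagram (assumed binary units)
POOL_SIZE = 16 * 1024      # 16 KiB per pool = 16384 bytes

pools_per_arena = ARENA_SIZE // POOL_SIZE
print(f"Pools per arena: {pools_per_arena}")

# Upper bound on blocks per pool for a few block sizes.
# This ignores the small per-pool header, so real counts are slightly lower.
for block_size in (16, 64, 512):
    upper_bound = POOL_SIZE // block_size
    print(f"{block_size:>3}-byte blocks: at most {upper_bound} per pool")
```

In the building analogy, every building has the same number of floors, but the number of apartments per floor depends on the apartment size.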

## Key Concept 2: Size Classes

Python groups objects by size: 16, 32, 48, 64, 80, ... 512 bytes

- Objects ≤512 bytes use the "small block allocator"
- Larger objects use `malloc` directly

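**Example:** A 50-byte object uses a 64-byte block (size class 3)
- The 14-byte difference is padding from rounding up to the size class
- The report's "bytes lost to quantization" line measures a related kind of waste: the leftover space at the end of each pool that is too small to hold one more block
- This is a tradeoff: some waste for much faster allocation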
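
The rounding rule can be sketched in a few lines. This is a simplified model, assuming the 16-byte spacing and 512-byte limit listed above. `size_class` is a hypothetical helper written for this notebook, not a CPython API.

```python
ALIGNMENT = 16            # spacing between size classes, per the list above
SMALL_REQUEST_MAX = 512   # largest request served by the small block allocator

def size_class(nbytes):
    """Return (class_index, block_size), or None if the request is not a small block."""
    if not 0 < nbytes <= SMALL_REQUEST_MAX:
        return None
    index = (nbytes - 1) // ALIGNMENT
    return index, (index + 1) * ALIGNMENT

for request in (16, 50, 64, 65, 512, 513):
    result = size_class(request)
    if result is None:
        print(f"{request:>3} bytes -> handled outside the small block allocator")
    else:
        index, block = result
        print(f"{request:>3} bytes -> class {index}, {block}-byte block, "
              f"{block - request} bytes of padding")
```

Try a few sizes of your own and compare them with the `size` column in the `_debugmallocstats()` output.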

## Key Concept 3: Why Arenas Aren't Reclaimed

**Python is GREEDY** - it holds onto memory for performance!

- Arenas are only freed if ALL their pools are empty
- Even one active object in an arena keeps the whole 1MB
- This is **intentional** - assumes you'll need memory again soon
- Memory **IS** returned when the Python process exits

**Why?** Requesting memory from the OS is expensive. Python avoids it by caching.

## Key Concept 4: Free Object Pools

Notice "free PyDictObjects", "free PyListObjects", etc.

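- When you delete a dict, Python often **CACHES** the dict object on a free list instead of freeing it
- Each free list has a fixed maximum length, so only a limited number of objects are cached; the rest are freed normally
- Next time you create a dict, it reuses a cached one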
- This is much faster than `malloc`/`free`
- But it means memory usage stays higher

**Example from our exercise:**
```
Before: 55 free PyDictObjects
After deleting 1000 dicts: 175 free PyDictObjects
```

Python cached ~120 of your deleted dicts!

## Key Concept 5: Reading the Size Class Table

```
class   size   num pools   blocks in use   avail blocks
    3     64         933          198604          39311
```

- **class 3**: Size class ID
- **size 64**: Each block holds objects ≤64 bytes
- **num pools 933**: Number of 16KB pools allocated for this size
- **blocks in use 198,604**: Blocks actively storing objects
- **avail blocks 39,311**: Free blocks ready for reuse

**Total capacity:** 198,604 + 39,311 = 237,915 blocks of 64 bytes each
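
The same bookkeeping can be done in code. The sketch below parses the sample row shown above (copied by hand into a string; the variable names are made up for this illustration) and derives the capacity figures from it.

```python
POOL_SIZE = 16384  # bytes per pool, as printed in the report

# The sample row from the table above: class, size, num pools, in use, avail.
row = "    3     64         933          198604          39311"
class_id, block_size, num_pools, in_use, avail = (int(field) for field in row.split())

total_blocks = in_use + avail
blocks_per_pool = total_blocks // num_pools
leftover_per_pool = POOL_SIZE - blocks_per_pool * block_size

print(f"Total capacity:  {total_blocks:,} blocks")
print(f"Blocks per pool: {blocks_per_pool}")
print(f"Bytes per pool not usable as blocks: {leftover_per_pool}")
print(f"Share of blocks in use: {in_use / total_blocks:.1%}")
```

The bytes that cannot be used as blocks cover the pool's header plus any tail that is too short to hold another block.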

## Key Concept 6: Memory Breakdown

```
# bytes in allocated blocks        =           44,948,912
# bytes in available blocks        =            3,466,832
225 unused pools * 16384 bytes     =            3,686,400
# bytes lost to pool headers       =              142,800
# bytes lost to quantization       =              183,856
```

**Translation:**
- **44.9 MB:** Blocks currently holding Python objects (including any rounding padding inside each block)
- **3.5 MB:** Empty blocks ready for reuse (still in active pools)
- **3.7 MB:** Completely empty pools (they stay allocated because memory is released only a whole arena at a time)
- **143 KB:** Metadata overhead for managing pools
- **184 KB:** Leftover space at the end of pools that is too small to hold another block (quantization)

**Efficiency calculation:**
```
Useful data: 44.9 MB
Total allocated: 44.9 + 3.5 + 3.7 + 0.14 + 0.18 = 52.4 MB
Efficiency: 44.9 / 52.4 = 85.7%
```
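
The rounded calculation above can be repeated with the exact byte counts from the sample report. This is a minimal sketch; the dictionary keys are shorthand labels chosen for this example, not names from the report.

```python
# Exact byte counts copied from the sample report above.
report = {
    "allocated blocks": 44_948_912,
    "available blocks": 3_466_832,
    "unused pools": 225 * 16384,
    "pool headers": 142_800,
    "quantization": 183_856,
}

total = sum(report.values())
efficiency = report["allocated blocks"] / total

for label, nbytes in report.items():
    print(f"{label:<18} {nbytes:>12,} bytes  ({nbytes / total:6.2%})")
print(f"{'total':<18} {total:>12,} bytes  ({total / 2**20:.0f} MiB)")
print(f"Efficiency: {efficiency:.1%}")
```

The five lines add up to a whole number of 1 MiB arenas, which is a useful sanity check when you copy numbers out of a report by hand.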

## Key Concept 7: What You Should See After Creating 1000 Dicts

**Before creation:**
- "free PyDictObjects" starts low (maybe 50-100)

**After creation:**
- "free PyDictObjects" still low (dicts are in use)
- "blocks in use" for dict-sized classes increases significantly

**After deletion:**
- "free PyDictObjects" JUMPS UP (Python cached them!)
- "blocks in use" decreases back down
- "avail blocks" increases (blocks now available)
- Arenas: probably NO new arenas (1000 dicts isn't that much)
- Arenas reclaimed: still 0 (Python keeps them)

## Key Concept 8: Common Misconceptions

**"Deleting objects frees memory immediately"**
- → Not to the OS. The block goes back to Python's allocator (or a free list), and the arena stays allocated

**"gc.collect() returns memory to OS"**
- → It only collects reference cycles; it does not force arenas back to the OS

**"My program has a memory leak!"**
- → Probably just Python being greedy. Check with tools like `tracemalloc` or `memory_profiler` for real leaks

**Reality:**
- Python optimizes for speed, not minimal memory usage
- It assumes you'll need memory again soon
- This is normal and expected behavior
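
The difference between reference counting and the cycle collector can be observed directly. This is a small demonstration that relies on CPython's behavior (other Python implementations may free objects at different times). `Node` is a hypothetical class used only here.

```python
import gc
import weakref

class Node:
    """Hypothetical helper class, used only for this demonstration."""
    def __init__(self):
        self.partner = None

# Case 1: no cycle. Reference counting frees the object as soon as `del` runs.
solo = Node()
solo_ref = weakref.ref(solo)
del solo
print("Non-cyclic object alive after del:", solo_ref() is not None)

# Case 2: a reference cycle. Only the cycle collector can reclaim it.
gc.collect()   # clear out any garbage left over from earlier cells
gc.disable()   # keep automatic collection from running mid-demo
a, b = Node(), Node()
a.partner, b.partner = b, a
a_ref = weakref.ref(a)
del a, b
print("Cyclic object alive after del:", a_ref() is not None)

found = gc.collect()
gc.enable()
print("Cyclic object alive after gc.collect():", a_ref() is not None)
print("Unreachable objects found by gc.collect():", found)
```

The dicts in this exercise contain no cycles, so they behave like Case 1: they are gone before `gc.collect()` ever runs.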

## Key Concept 9: When Does Python Return Memory?

1. **When the process exits** - All memory returned to OS
2. **When every pool in an arena becomes empty**, the allocator can release that arena
   - This is rare during program execution, because a single live object keeps the whole arena allocated
3. **You can force it with custom memory management**
   - But usually not worth the complexity

## Key Concept 10: Practical Tips

**Don't worry about** "arenas reclaimed = 0" - this is normal

**DO worry if** memory grows unbounded with each iteration

**For monitoring:**
- Use `tracemalloc.get_traced_memory()` for high-level tracking
- Use `_debugmallocstats()` to understand allocation patterns
- For production, use memory profilers like `memory_profiler` or `pympler`
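
To compare two reports programmatically, it helps to have the output as a string. `sys._debugmallocstats()` writes to the C-level stderr stream, so `contextlib.redirect_stderr` does not capture it. The sketch below temporarily points file descriptor 2 at a temporary file instead. `capture_mallocstats` is a hypothetical helper written for this notebook, and the report layout can differ between CPython versions and builds.

```python
import os
import sys
import tempfile

def capture_mallocstats():
    """Return the text that sys._debugmallocstats() writes to stderr (fd 2)."""
    sys.stderr.flush()
    saved_fd = os.dup(2)                  # remember the real stderr
    with tempfile.TemporaryFile() as tmp:
        try:
            os.dup2(tmp.fileno(), 2)      # point fd 2 at the temp file
            sys._debugmallocstats()
        finally:
            os.dup2(saved_fd, 2)          # always restore stderr
            os.close(saved_fd)
        tmp.seek(0)
        return tmp.read().decode("utf-8", errors="replace")

report = capture_mallocstats()

# Pull out just the arena summary lines (empty if this build prints none).
arena_lines = [line for line in report.splitlines() if "arenas" in line]
print("\n".join(arena_lines))
```

Capturing one report before and one after an experiment lets you diff them with ordinary string tools instead of comparing by eye.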

**Example with tracemalloc:**

In [None]:
import tracemalloc

tracemalloc.start()

# Create objects
test_dicts = [{f'key_{i}': i} for i in range(1000)]

current, peak = tracemalloc.get_traced_memory()
print(f"Current memory: {current / 1024 / 1024:.2f} MB")
print(f"Peak memory: {peak / 1024 / 1024:.2f} MB")

# Clean up
del test_dicts

current, peak = tracemalloc.get_traced_memory()
print("\nAfter deletion:")
print(f"Current memory: {current / 1024 / 1024:.2f} MB")
print(f"Peak memory: {peak / 1024 / 1024:.2f} MB")

tracemalloc.stop()

---

# Bonus Challenges

Try these experiments to deepen your understanding:

## Challenge 1: Scale Up

Create 10,000 dicts instead of 1,000 - do you get new arenas?

In [None]:
# Your code here


## Challenge 2: Different Dict Sizes

Create dicts with different sizes and see which size class each uses:
- Small: `{'a': 1}`
- Medium: `{f'key_{i}': i for i in range(10)}`
- Large: `{f'key_{i}': i for i in range(100)}`

In [None]:
# Your code here


## Challenge 3: Different Object Types

Compare memory allocation for:
- 1000 lists
- 1000 tuples
- 1000 sets
- 1000 custom class instances

Do they use different size classes?

In [None]:
# Your code here


## Challenge 4: Memory Leak Simulation

Create a REAL memory leak and watch it grow:
- Create a global list
- Keep appending dicts to it
- Watch memory grow without GC being able to help
- This is unbounded growth - a real problem!

In [None]:
# Your code here


---

# Conclusion

Now you better understand Python's memory allocator internals!

**This knowledge helps you:**
- Debug apparent 'memory leaks' that are just caching
- Understand performance characteristics
- Make informed decisions about object pooling
- Interpret memory profiler output
- Distinguish between Python being greedy vs actual memory problems