# Python Memory Usage Measurement

This notebook explores how to measure the memory usage of various Python objects and data structures, along with a few techniques for reducing it.

In [None]:
import sys
import numpy as np
import pandas as pd

## Basic Types Memory Usage

The `sys.getsizeof()` function reports the size of a single object in bytes. The exact numbers depend on the Python version and platform:

In [None]:
# Integer memory size
x = 1
print(f"Integer {x} memory size: {sys.getsizeof(x)} bytes")

# Float memory size
y = 1.0
print(f"Float {y} memory size: {sys.getsizeof(y)} bytes")

# String memory size
s1 = "a"
s2 = "ab"
s3 = "abc"
print(f"String '{s1}' memory size: {sys.getsizeof(s1)} bytes")
print(f"String '{s2}' memory size: {sys.getsizeof(s2)} bytes")
print(f"String '{s3}' memory size: {sys.getsizeof(s3)} bytes")
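
The sizes printed above are not fixed constants. As a minimal sketch, assuming CPython 3.3 or later (other interpreters may report different numbers), the following shows that an integer's size grows with its magnitude and that a string's size depends on the widest character it contains. The variable names are illustrative only.

```python
import sys

# Integers are arbitrary precision, so larger values need more storage
small_int = 1
big_int = 2 ** 100
print(f"small int: {sys.getsizeof(small_int)} bytes, big int: {sys.getsizeof(big_int)} bytes")

# For ASCII-only strings, CPython stores one byte per character
per_char = sys.getsizeof("ab") - sys.getsizeof("a")
print(f"Extra bytes per ASCII character: {per_char}")

# A character outside Latin-1 forces a wider per-character storage
ascii_text = "a" * 100
wide_text = "\u3042" * 100  # Hiragana letter A, repeated
print(f"ASCII text: {sys.getsizeof(ascii_text)} bytes, wide text: {sys.getsizeof(wide_text)} bytes")
```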

## Collection Types Memory Usage

Measuring the memory usage of collection types like lists and dictionaries:

In [None]:
# Empty list memory size
empty_list = []
print(f"Empty list memory size: {sys.getsizeof(empty_list)} bytes")

# List with elements memory size
list_with_items = [1, 2, 3, 4, 5]
print(f"List with 5 elements memory size: {sys.getsizeof(list_with_items)} bytes")

# Empty dictionary memory size
empty_dict = {}
print(f"Empty dictionary memory size: {sys.getsizeof(empty_dict)} bytes")

# Dictionary with elements memory size
dict_with_items = {"a": 1, "b": 2, "c": 3}
print(f"Dictionary with 3 elements memory size: {sys.getsizeof(dict_with_items)} bytes")
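
A list's reported size does not always grow by one slot per element. The sketch below assumes CPython, where lists reserve spare capacity when they grow through `append` (an implementation detail that may differ between versions), and records the reported size after each append.

```python
import sys

items = []
sizes = []
for i in range(20):
    items.append(i)
    # Record the container's own size after each append
    sizes.append(sys.getsizeof(items))

# The size stays flat for several appends, then jumps when capacity is extended
distinct_sizes = sorted(set(sizes))
print(f"Sizes after each append: {sizes}")
print(f"Number of distinct sizes across 20 appends: {len(distinct_sizes)}")
```

Because of this spare capacity, two lists with the same elements can report different sizes depending on how they were built.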

## Limitation: Nested Objects Memory Usage

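A limitation of `sys.getsizeof()` is that it reports only an object's own (shallow) size. For a container, that covers the container's internal array of references but not the objects those references point to: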

In [None]:
# Nested list elements are not included
nested_list = [[1, 2, 3], [4, 5, 6]]
print(f"Nested list memory size (excluding elements): {sys.getsizeof(nested_list)} bytes")

# Calculate size of each element in the list individually
print(f"First nested list memory size: {sys.getsizeof(nested_list[0])} bytes")
print(f"Second nested list memory size: {sys.getsizeof(nested_list[1])} bytes")

# Total (still incomplete: the integer objects themselves are not counted)
total_size = sys.getsizeof(nested_list) + sys.getsizeof(nested_list[0]) + sys.getsizeof(nested_list[1])
print(f"Calculated total size: {total_size} bytes")
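
To make the shallow-size behavior concrete, here is a small sketch (variable names are illustrative) comparing two one-element lists whose elements differ greatly in size. The list stores only a reference to its element, so both lists should report the same size.

```python
import sys

short_text = "x"
long_text = "x" * 10_000

small = [short_text]
large = [long_text]

# Both lists hold exactly one reference, so their own sizes match
print(f"List holding a short string: {sys.getsizeof(small)} bytes")
print(f"List holding a long string:  {sys.getsizeof(large)} bytes")

# The long string itself is far larger than the list that refers to it
print(f"Long string on its own: {sys.getsizeof(long_text)} bytes")
```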

## Recursive Function to Calculate Object Size

Implementation of a function that recursively calculates the size of an object:

In [None]:
def get_size(obj, seen=None):
    """Recursively calculate the memory size of an object, in bytes."""
    # Handle circular references
    if seen is None:
        seen = set()
    
    # If object ID already seen, return 0
    obj_id = id(obj)
    if obj_id in seen:
        return 0
    
    # Record this object as seen
    seen.add(obj_id)
    
    # Get basic size
    size = sys.getsizeof(obj)
    
    # For dictionaries, add size of keys and values
    if isinstance(obj, dict):
        size += sum(get_size(k, seen) + get_size(v, seen) for k, v in obj.items())
    # For lists, tuples, sets, and frozensets, add the size of each element.
    # Checking concrete types avoids consuming generators and other one-shot iterators.
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(get_size(i, seen) for i in obj)
    
    return size

# Test with dictionaries and lists
test_dict = {"a": [1, 2, 3], "b": {"c": 4}}
print(f"Dictionary memory size (standard function): {sys.getsizeof(test_dict)} bytes")
print(f"Dictionary memory size (recursive function): {get_size(test_dict)} bytes")
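
The function above follows dictionaries and sequences but not the attributes of custom class instances. As a sketch under that observation, the hypothetical `deep_size` variant below also descends into an instance's `__dict__`. Both `deep_size` and the `Point` class are illustrative names, and objects that use `__slots__` or C-level storage are still not fully covered.

```python
import sys

def deep_size(obj, seen=None):
    """Hypothetical variant of get_size that also follows instance attributes."""
    if seen is None:
        seen = set()
    if id(obj) in seen:
        return 0  # Already counted (shared or circular reference)
    seen.add(id(obj))

    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_size(k, seen) + deep_size(v, seen) for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_size(item, seen) for item in obj)

    # Instances of ordinary classes keep their attributes in __dict__
    if hasattr(obj, "__dict__"):
        size += deep_size(vars(obj), seen)
    return size

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        self.tags = ["a", "b"]

p = Point(1.5, 2.5)
print(f"Shallow size: {sys.getsizeof(p)} bytes")
print(f"Deep size:    {deep_size(p)} bytes")

# A list that contains itself is counted once thanks to the seen set
loop = []
loop.append(loop)
print(f"Self-referencing list: {deep_size(loop)} bytes")
```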

## NumPy and Pandas Memory Usage

Measuring memory usage of NumPy and Pandas objects:

In [None]:
# NumPy array memory usage (the default integer dtype is platform-dependent)
arr1 = np.array([1, 2, 3, 4, 5])
print(f"NumPy array ({arr1.dtype}) memory size: {arr1.nbytes} bytes")

# With different data type
arr2 = np.array([1, 2, 3, 4, 5], dtype=np.int32)
print(f"NumPy array (int32) memory size: {arr2.nbytes} bytes")

# Pandas DataFrame memory usage
df = pd.DataFrame({
    'A': np.random.rand(1000),
    'B': np.random.rand(1000),
    'C': np.random.rand(1000)
})
print(f"Pandas DataFrame memory usage: {df.memory_usage(deep=True).sum()} bytes")
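
It can help to see what `nbytes` does and does not count. The sketch below (array names are illustrative) checks that `nbytes` equals the element count times the per-element size, and compares it with `sys.getsizeof()` for an array that owns its data and for a view that borrows it. The `sys.getsizeof()` behavior shown reflects how current NumPy releases appear to report sizes and should be treated as an assumption.

```python
import sys
import numpy as np

arr = np.zeros(1000, dtype=np.float64)

# nbytes counts only the data buffer: elements times bytes per element
print(f"size * itemsize = {arr.size * arr.itemsize}, nbytes = {arr.nbytes}")

# For an array that owns its data, getsizeof adds the array object's header
print(f"getsizeof(arr) = {sys.getsizeof(arr)} bytes")

# A view shares the original buffer, so getsizeof reports little more than the header
view = arr[::2]
print(f"view.nbytes = {view.nbytes}, getsizeof(view) = {sys.getsizeof(view)} bytes")
print(f"view shares memory with arr: {np.shares_memory(arr, view)}")
```

Summing `nbytes` over several views of the same array would therefore count the shared buffer more than once.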

## Memory Usage Optimization Techniques

Some techniques to optimize memory usage in Python:

In [None]:
# 1. Use appropriate data types
arr_float64 = np.ones(1000, dtype=np.float64)
arr_float32 = np.ones(1000, dtype=np.float32)
print(f"float64 array memory usage: {arr_float64.nbytes} bytes")
print(f"float32 array memory usage: {arr_float32.nbytes} bytes")
print(f"Reduction rate: {100 * (1 - arr_float32.nbytes / arr_float64.nbytes):.1f}%")

# 2. Use categorical type in Pandas
df_original = pd.DataFrame({
    'text': ['cat', 'dog', 'cat', 'fish', 'dog', 'cat'] * 100
})
df_category = pd.DataFrame({
    'text': pd.Categorical(['cat', 'dog', 'cat', 'fish', 'dog', 'cat'] * 100)
})
print(f"Regular string column memory usage: {df_original.memory_usage(deep=True).sum()} bytes")
print(f"Categorical type memory usage: {df_category.memory_usage(deep=True).sum()} bytes")
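
The two techniques can also be applied to an existing DataFrame rather than at construction time. The following is a minimal sketch with made-up column names (`id`, `label`) and data: it downcasts an integer column with `pd.to_numeric` and converts a repetitive text column with `astype("category")`, then compares per-column memory before and after.

```python
import numpy as np
import pandas as pd

df_demo = pd.DataFrame({
    "id": np.arange(1000, dtype=np.int64),
    "label": pd.Series(["cat", "dog", "fish", "dog"] * 250, dtype=object),
})
before = df_demo.memory_usage(deep=True)

optimized = df_demo.assign(
    # Pick the smallest integer dtype that can hold the column's values
    id=pd.to_numeric(df_demo["id"], downcast="integer"),
    # Store each distinct label once and keep small integer codes per row
    label=df_demo["label"].astype("category"),
)
after = optimized.memory_usage(deep=True)

print("Before:")
print(before)
print("After:")
print(after)
print(f"Total reduction: {100 * (1 - after.sum() / before.sum()):.1f}%")
```

Categorical columns pay off when the number of distinct values is small relative to the number of rows. For mostly unique strings the conversion may save little or even cost memory.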

## Summary

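The main methods for measuring and optimizing memory usage in Python:

1. `sys.getsizeof()` reports the shallow size of a single object, which is enough for basic types
2. A recursive approach is needed for complex objects (nested structures), because containers only account for their references
3. NumPy and Pandas provide their own measurements: `ndarray.nbytes` and `DataFrame.memory_usage(deep=True)`
4. Choosing appropriate data types, such as `float32` instead of `float64` or a categorical column instead of repeated strings, is important for reducing memory usage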