
# Advanced Python Dictionaries for Data Analysis
This notebook explores **Python dictionaries** from intermediate to professional level, with a strong focus on **data analysis** workflows.

Topics covered:
- Core operations and key methods
- Nested dictionaries
- Advanced techniques and idioms
- Professional tips for real-world data projects


## 1. Basic Dictionary Operations

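Dictionaries map unique, hashable keys to values and, since Python 3.7, preserve insertion order. The cell below creates a dictionary, then reads, updates, and removes entries.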

In [None]:
# Creating dictionaries
sample_dict = {'name': 'Alice', 'age': 30, 'city': 'New York'}
print("Initial dictionary:", sample_dict)

# Accessing values: indexing raises KeyError for a missing key, while get() returns None
print("Name:", sample_dict['name'])
print("Age:", sample_dict.get('age'))

# Adding or updating entries
sample_dict['occupation'] = 'Data Scientist'
sample_dict['age'] = 31
print("Updated dictionary:", sample_dict)

# Removing entries
removed_value = sample_dict.pop('city')
print("Removed city:", removed_value)
print("After removal:", sample_dict)


Initial dictionary: {'name': 'Alice', 'age': 30, 'city': 'New York'}
Name: Alice
Age: 30
Updated dictionary: {'name': 'Alice', 'age': 31, 'city': 'New York', 'occupation': 'Data Scientist'}
Removed city: New York
After removal: {'name': 'Alice', 'age': 31, 'occupation': 'Data Scientist'}
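
The difference between indexing, `get()`, and `pop()` matters most when a key may be absent. The following is a minimal sketch of the failure-safe variants; the `profile` dictionary and the `'unknown'` fallback are illustrative assumptions, not part of the cell above.

```python
profile = {'name': 'Alice', 'age': 31}

# get() with a second argument returns that fallback instead of raising KeyError
city = profile.get('city', 'unknown')

# Membership tests with `in` check keys, not values
has_age = 'age' in profile

# pop() with a default does not raise when the key is missing
removed = profile.pop('city', None)

# Plain indexing of a missing key raises KeyError
try:
    profile['city']
except KeyError as exc:
    missing_key = exc.args[0]

print(city, has_age, removed, missing_key)
```

Preferring `get()` with an explicit fallback keeps missing-data handling visible at the call site, which tends to help when records come from inconsistent sources.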


## 2. Important Methods for Data Analysis

These built-in methods cover the most common tasks in analysis code: viewing keys and values, supplying defaults, merging, and removing entries.

In [1]:
data = {'a': 1, 'b': 2, 'c': 3}

# keys(), values(), items()
print("Keys:", list(data.keys()))
print("Values:", list(data.values()))
print("Items:", list(data.items()))

# setdefault() - insert the key with a default only if it is missing; returns the stored value
data.setdefault('d', 4)
print("With setdefault:", data)

# update() - merge another dictionary
new_data = {'b': 20, 'e': 5}
data.update(new_data)
print("After update:", data)

# popitem() - remove and return the most recently inserted item (LIFO order, Python 3.7+)
key, value = data.popitem()
print(f"Popitem removed ({key}: {value}) ->", data)


Keys: ['a', 'b', 'c']
Values: [1, 2, 3]
Items: [('a', 1), ('b', 2), ('c', 3)]
With setdefault: {'a': 1, 'b': 2, 'c': 3, 'd': 4}
After update: {'a': 1, 'b': 20, 'c': 3, 'd': 4, 'e': 5}
Popitem removed (e: 5) -> {'a': 1, 'b': 20, 'c': 3, 'd': 4}
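
`setdefault()` never overwrites an existing value, which makes it usable for grouping with a plain dictionary. A short sketch, assuming a hypothetical list of words grouped by first letter:

```python
data = {'a': 1}

# Key present: the existing value is returned and left unchanged
existing = data.setdefault('a', 99)

# Key absent: the default is stored and returned
inserted = data.setdefault('b', 2)

# Grouping idiom: create the list on first sight of a key, then append
groups = {}
for word in ['apple', 'avocado', 'banana']:
    groups.setdefault(word[0], []).append(word)

print(existing, inserted, data)
print(groups)
```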


## 3. Nested Dictionaries

Nested dictionaries model hierarchical data such as parsed JSON or grouped statistics.

In [None]:
# Example: dataset with nested structure
students = {
    'Alice': {'math': 90, 'science': 85},
    'Bob':   {'math': 78, 'science': 82}
}

# Access nested values
print("Alice math score:", students['Alice']['math'])

# Iterating through nested data
for name, scores in students.items():
    avg = sum(scores.values()) / len(scores)
    print(f"{name}'s average: {avg:.2f}")


Alice math score: 90
Alice's average: 87.50
Bob's average: 80.00
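
Nested lookups fail at the first missing level, and hierarchical data often needs to be regrouped along the inner keys. The sketch below shows one way to handle both, assuming the same `students` structure; the name `'Carol'` is a hypothetical missing entry.

```python
students = {
    'Alice': {'math': 90, 'science': 85},
    'Bob': {'math': 78, 'science': 82},
}

# Chained get() with an empty-dict fallback avoids KeyError at either level
carol_math = students.get('Carol', {}).get('math')

# Regroup scores by subject, then average each subject across students
subject_scores = {}
for scores in students.values():
    for subject, score in scores.items():
        subject_scores.setdefault(subject, []).append(score)

subject_avg = {s: sum(v) / len(v) for s, v in subject_scores.items()}

print(carol_math)
print(subject_avg)
```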


## 4. Dictionaries in Data Analysis

A dictionary of equal-length lists can stand in for a small column-oriented dataset.

In [None]:
# Example: column-based dataset
dataset = {
    'names': ['Alice', 'Bob', 'Carol'],
    'ages': [25, 30, 27],
    'scores': [88, 92, 79]
}

# Compute basic statistics
average_age = sum(dataset['ages']) / len(dataset['ages'])
max_score = max(dataset['scores'])
print(f"Average age: {average_age:.0f}")
print("Max score:", max_score)

# Convert to list of dicts for flexible processing
records = [
    dict(zip(dataset.keys(), values))
    for values in zip(*dataset.values())
]
print("Row-oriented records:", records)

Average age: 27
Max score: 92
Row-oriented records: [{'names': 'Alice', 'ages': 25, 'scores': 88}, {'names': 'Bob', 'ages': 30, 'scores': 92}, {'names': 'Carol', 'ages': 27, 'scores': 79}]
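
Row-oriented records are convenient for sorting and selecting, and they can be turned back into columns when needed. A minimal sketch, assuming every record has the same keys as the first one:

```python
records = [
    {'names': 'Alice', 'ages': 25, 'scores': 88},
    {'names': 'Bob', 'ages': 30, 'scores': 92},
    {'names': 'Carol', 'ages': 27, 'scores': 79},
]

# Sort rows by score, highest first
by_score = sorted(records, key=lambda row: row['scores'], reverse=True)

# Select the single row with the highest score
top = max(records, key=lambda row: row['scores'])

# Convert back to column orientation (assumes identical keys in every row)
columns = {key: [row[key] for row in records] for key in records[0]}

print([row['names'] for row in by_score])
print(top['names'])
print(columns)
```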


## 5. Advanced and Professional Techniques

This section covers dictionary comprehensions, merging strategies, and `collections.Counter` for quick aggregation.

In [None]:
# Dictionary comprehension
squares = {x: x**2 for x in range(5)}
print("Squares:", squares)

# Filtering a dictionary
filtered = {k: v for k, v in squares.items() if v % 2 == 0}
print("Filtered even squares:", filtered)

# Merging multiple dictionaries (Python 3.9+)
dict_a = {'x': 1, 'y': 2}
dict_b = {'y': 3, 'z': 4}
merged = dict_a | dict_b  # on duplicate keys the right-hand value wins ('y' becomes 3)
print("Merged with | :", merged)

# Using collections.Counter for quick aggregations
from collections import Counter
data_points = ['red', 'blue', 'red', 'green', 'blue', 'blue']
counts = Counter(data_points)
print("Counter result:", counts)
print("Most common:", counts.most_common(2))


Squares: {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
Filtered even squares: {0: 0, 2: 4, 4: 16}
Merged with | : {'x': 1, 'y': 3, 'z': 4}
Counter result: Counter({'blue': 3, 'red': 2, 'green': 1})
Most common: [('blue', 3), ('red', 2)]
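
The `|` operator requires Python 3.9 or later. On older interpreters the same merge can be written with unpacking, and `Counter` objects can be combined with `+`. A short sketch; the two batches are illustrative data:

```python
from collections import Counter

dict_a = {'x': 1, 'y': 2}
dict_b = {'y': 3, 'z': 4}

# Unpacking merge (Python 3.5+): later dictionaries win on duplicate keys
merged_unpack = {**dict_a, **dict_b}

# Counters from separate batches add their counts key by key
batch_1 = Counter(['red', 'blue', 'red'])
batch_2 = Counter(['blue', 'green', 'blue'])
total = batch_1 + batch_2

# A Counter returns 0 for a key it has never seen instead of raising KeyError
unseen = total['purple']

print(merged_unpack)
print(total.most_common(1), unseen)
```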


## 6. Professional Tips

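The patterns below are common in production code: grouping with `defaultdict`, exposing configuration through a read-only view, and using tuples as composite keys.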

In [None]:
# Using defaultdict to group values without first checking whether the key exists
from collections import defaultdict

grouped = defaultdict(list)
state_values = [('NY', 100), ('CA', 200), ('NY', 150)]  # distinct name, so `records` from section 4 is not overwritten
for state, value in state_values:
    grouped[state].append(value)

print("Grouped with defaultdict:", dict(grouped))

# Read-only view of a dictionary (types.MappingProxyType); the underlying dict can still change
from types import MappingProxyType
config = {'threshold': 0.8, 'method': 'z-score'}
readonly_config = MappingProxyType(config)
print("Readonly config:", readonly_config)
# readonly_config['threshold'] = 0.9  # Raises TypeError if uncommented

# Composite keys: tuples are hashable, so (row, column) pairs can index a grid
# directly without nesting one dictionary inside another
matrix_data = {}
for i in range(3):
    for j in range(3):
        matrix_data[(i, j)] = i + j
print("Matrix with tuple keys:", matrix_data)


Grouped with defaultdict: {'NY': [100, 150], 'CA': [200]}
Readonly config: {'threshold': 0.8, 'method': 'z-score'}
Matrix with tuple keys: {(0, 0): 0, (0, 1): 1, (0, 2): 2, (1, 0): 1, (1, 1): 2, (1, 2): 3, (2, 0): 2, (2, 1): 3, (2, 2): 4}
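
Two behaviors of these tools are easy to miss: a `MappingProxyType` is a live view rather than a frozen copy, and reading a missing key from a `defaultdict` inserts it. The sketch below demonstrates both; the keys `'TX'` and `'FL'` are hypothetical.

```python
from collections import defaultdict
from types import MappingProxyType

config = {'threshold': 0.8}
readonly_config = MappingProxyType(config)

# Writing through the view is rejected
try:
    readonly_config['threshold'] = 0.9
    write_blocked = False
except TypeError:
    write_blocked = True

# Changing the underlying dict is still visible through the view
config['threshold'] = 0.9
seen_through_view = readonly_config['threshold']

grouped = defaultdict(list)

# Indexing a missing key creates it with an empty list
_ = grouped['TX']
tx_created = 'TX' in grouped

# get() looks up without inserting
fl_value = grouped.get('FL')
fl_created = 'FL' in grouped

print(write_blocked, seen_through_view, tx_created, fl_value, fl_created)
```

If a truly frozen snapshot is needed, wrapping a copy, as in `MappingProxyType(dict(config))`, is one option, since nothing else then holds a reference to the underlying dictionary.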
