# Python Lists for Data Analysis

This notebook explores **lists** in Python with a focus on **data analysis**.

Topics covered:
- Creation and basic operations
- Iteration and alignment
- List comprehensions
- Advanced slicing and nesting
- Data cleaning and transformation
- Joining and concatenating lists
- Further list methods, advanced techniques, and practical tips

In [1]:
import numpy as np
import pandas as pd

# --------------------------------------------------------
# Configuration constants (UPPER_CASE)
# --------------------------------------------------------
RANDOM_SEED = 42
NUMBERS = [10, 20, 30, 40, 50]
NAMES = ["Alice", "Bob", "Charlie", "David"]
MATRIX = [[1, 2], [3, 4], [5, 6]]

np.random.seed(RANDOM_SEED)

## 1. Basic List Operations

In [2]:
# Length and indexing
print("Length of NUMBERS:", len(NUMBERS))
print("First element:", NUMBERS[0])
print("Last element:", NUMBERS[-1])

# Slicing
print("First three elements:", NUMBERS[:3])
print("Every second element:", NUMBERS[::2])

# Adding and removing elements
numbers_copy = NUMBERS.copy()
numbers_copy.append(60)
numbers_copy.insert(1, 15)
numbers_copy.remove(30)
print("Modified list:", numbers_copy)

Length of NUMBERS: 5
First element: 10
Last element: 50
First three elements: [10, 20, 30]
Every second element: [10, 30, 50]
Modified list: [10, 15, 20, 40, 50, 60]
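
The cell above calls `NUMBERS.copy()` before modifying the list. A minimal sketch of why that matters is given here, using a hypothetical `numbers` list that mirrors `NUMBERS`: plain assignment only creates a second name for the same list, while `copy()` creates a separate one.

```python
# Hypothetical sample data; mirrors NUMBERS from the notebook.
numbers = [10, 20, 30, 40, 50]

alias = numbers               # second name for the same list object
independent = numbers.copy()  # new list object (numbers[:] works too)

alias.append(60)        # visible through `numbers` as well
independent.append(70)  # leaves `numbers` untouched

print("numbers:", numbers)
print("independent:", independent)
print("alias is numbers:", alias is numbers)
```

Here `numbers` ends up with 60 appended but not 70, which is why the notebook works on a copy when it wants to keep the original constant intact.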


## 2. Iterating Over Lists

In [3]:
# Simple iteration
for name in NAMES:
    print("Name:", name)

# Enumerate for index-value pairs
for idx, value in enumerate(NUMBERS):
    print(f"Index {idx} -> Value {value}")

Name: Alice
Name: Bob
Name: Charlie
Name: David
Index 0 -> Value 10
Index 1 -> Value 20
Index 2 -> Value 30
Index 3 -> Value 40
Index 4 -> Value 50
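
One pitfall related to the loops above is changing a list while iterating over it. The sketch below uses hypothetical lists `unsafe` and `safe` (not part of the notebook) to show how removing items inside the loop can skip elements, and how iterating over a slice copy avoids that.

```python
# Hypothetical sample data with a repeated value.
unsafe = [1, 2, 2, 3]
for value in unsafe:          # mutating the list being iterated
    if value == 2:
        unsafe.remove(value)  # later items shift left, so one 2 is skipped

safe = [1, 2, 2, 3]
for value in safe[:]:         # iterate over a slice copy instead
    if value == 2:
        safe.remove(value)

print("Mutated while iterating:", unsafe)  # [1, 2, 3]
print("Iterated over a copy:", safe)       # [1, 3]
```

In practice a list comprehension with a condition is often the simpler way to build the filtered list.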


## 3. Alignment and Zipping Lists

In [4]:
# Align two lists by index using zip
# Note: zip stops at the shorter list, so the last value of NUMBERS (50) is dropped
for name, number in zip(NAMES, NUMBERS):
    print(f"{name} -> {number}")

Alice -> 10
Bob -> 20
Charlie -> 30
David -> 40
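
The output above has four pairs even though `NUMBERS` has five values, because `zip` stops at the shortest input. When silent truncation is not wanted, one option is `itertools.zip_longest`, sketched below with hypothetical copies of the two lists and `None` as an assumed placeholder for missing entries.

```python
from itertools import zip_longest

# Hypothetical copies of the notebook's lists (different lengths on purpose).
names = ["Alice", "Bob", "Charlie", "David"]
numbers = [10, 20, 30, 40, 50]

short_pairs = list(zip(names, numbers))                         # stops at 4 pairs
full_pairs = list(zip_longest(names, numbers, fillvalue=None))  # pads to 5 pairs

print("zip:", len(short_pairs), "pairs")
print("zip_longest:", len(full_pairs), "pairs, last =", full_pairs[-1])
```

On Python 3.10 or later, `zip(names, numbers, strict=True)` is another option: it raises `ValueError` when the lengths differ instead of padding.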


## 4. List Comprehensions

In [5]:
# Create a new list of squared numbers
squared = [x ** 2 for x in NUMBERS]
print("Squared numbers:", squared)

# Filter elements greater than 25
filtered = [x for x in NUMBERS if x > 25]
print("Numbers > 25:", filtered)

Squared numbers: [100, 400, 900, 1600, 2500]
Numbers > 25: [30, 40, 50]
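
The filter above uses a trailing `if` to drop elements. A comprehension can also keep every element and transform it conditionally, by placing an `if ... else` expression before the `for`. The sketch below assumes a hypothetical threshold of 25 and made-up labels.

```python
# Hypothetical sample data; mirrors NUMBERS from the notebook.
numbers = [10, 20, 30, 40, 50]

# Trailing `if` filters; a leading `if ... else` maps every element.
labels = ["high" if x > 25 else "low" for x in numbers]

print("Labels:", labels)  # ['low', 'low', 'high', 'high', 'high']
```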


## 5. Advanced Slicing and Nested Lists

In [6]:
# Accessing nested elements
print("Element in second row, second column:", MATRIX[1][1])

# Flatten a matrix using comprehension
flattened = [item for row in MATRIX for item in row]
print("Flattened matrix:", flattened)

Element in second row, second column: 4
Flattened matrix: [1, 2, 3, 4, 5, 6]
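
Besides row access and flattening, nested lists are often read column-wise. The following sketch, using a hypothetical `matrix` with the same shape as `MATRIX`, extracts one column with a comprehension and transposes the whole structure with `zip(*matrix)`.

```python
# Hypothetical sample data; same shape as MATRIX in the notebook.
matrix = [[1, 2], [3, 4], [5, 6]]

# Pick the first item of every row to get the first column.
first_column = [row[0] for row in matrix]

# zip(*matrix) groups items by position; list() turns each tuple into a list.
transposed = [list(column) for column in zip(*matrix)]

print("First column:", first_column)  # [1, 3, 5]
print("Transposed:", transposed)      # [[1, 3, 5], [2, 4, 6]]
```

This assumes every row has the same length; with ragged rows, `zip` would silently drop the extra items.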


## 6. Data Cleaning with Lists

In [7]:
# Simulate raw string data with inconsistent formatting
raw_names = ["   alice  ", "BOB", " charlie ", "DAVID"]

# Clean and normalize names
clean_names = [name.strip().title() for name in raw_names]
print("Clean names:", clean_names)

# Convert to a DataFrame for further analysis
df = pd.DataFrame({"Name": clean_names, "Score": np.random.randint(50, 100, size=len(clean_names))})
print(df)

Clean names: ['Alice', 'Bob', 'Charlie', 'David']
      Name  Score
0    Alice     88
1      Bob     78
2  Charlie     64
3    David     92
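
Cleaning often involves numeric fields stored as text, not only names. The sketch below is one possible approach under stated assumptions: `raw_scores` and the helper `to_float` are hypothetical, and unparseable entries are mapped to `None` and then filtered out.

```python
# Hypothetical raw input: numbers stored as text, some of them unusable.
raw_scores = ["88", " 78 ", "n/a", "", "64.5"]


def to_float(text):
    """Return the value as a float, or None when it cannot be parsed."""
    try:
        return float(text.strip())
    except ValueError:
        return None


parsed = [to_float(score) for score in raw_scores]
valid = [value for value in parsed if value is not None]

print("Parsed:", parsed)  # [88.0, 78.0, None, None, 64.5]
print("Valid:", valid)    # [88.0, 78.0, 64.5]
```

Keeping `parsed` aligned with the input makes it possible to see which positions failed before they are dropped.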


## 7. Joining and Concatenating Lists

In [10]:
# Using the + operator (creates a new list)
numbers_a = [1, 2, 3]
numbers_b = [4, 5, 6]
joined_list = numbers_a + numbers_b
print("Joined with + :", joined_list)

# Using extend() (in-place extension)
numbers_c = [10, 20]
numbers_d = [30, 40]
numbers_c.extend(numbers_d)
print("Extended list :", numbers_c)

# Using unpacking (*) for merging multiple lists
list_x = [100, 200]
list_y = [300, 400]
merged_list = [*list_x, *list_y, 500]
print("Merged with unpacking:", merged_list)

Joined with + : [1, 2, 3, 4, 5, 6]
Extended list : [10, 20, 30, 40]
Merged with unpacking: [100, 200, 300, 400, 500]
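
The practical difference between `+` and `extend()` is which object changes. The sketch below illustrates that with hypothetical lists, and also shows `itertools.chain.from_iterable` as one way to join an arbitrary number of lists held in a container.

```python
from itertools import chain

# Hypothetical sample data.
first = [1, 2]
same_object = first        # second name for the same list

first.extend([3])          # in place: visible through both names
new_object = first + [4]   # new list: `first` is left unchanged

# Joining many lists at once without writing a + b + c by hand.
batches = [[1, 2, 3], [4, 5, 6], [7, 8]]
combined = list(chain.from_iterable(batches))

print("same_object:", same_object)  # [1, 2, 3]
print("new_object:", new_object)    # [1, 2, 3, 4]
print("combined:", combined)        # [1, 2, 3, 4, 5, 6, 7, 8]
```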


## 8. Additional List Methods

In [9]:
sample = [10, 20, 30, 40, 50]

# insert(index, value) - Insert before the given index; later items shift right
sample.insert(2, 25)
print("After insert:", sample)

# pop(index) - Remove and return an item (default is last)
removed = sample.pop(3)
print("After pop:", sample, "Removed:", removed)

# count(value) - Count occurrences
count_20 = sample.count(20)
print("Occurrences of 20:", count_20)

# reverse() - Reverse in place
sample.reverse()
print("Reversed list:", sample)

# copy() - Shallow copy
copied_list = sample.copy()
print("Copied list:", copied_list)

After insert: [10, 20, 25, 30, 40, 50]
After pop: [10, 20, 25, 40, 50] Removed: 30
Occurrences of 20: 1
Reversed list: [50, 40, 25, 20, 10]
Copied list: [50, 40, 25, 20, 10]
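
The comment above describes `copy()` as a shallow copy. For a flat list of numbers that distinction does not matter, but with nested lists the inner lists are still shared. The sketch below uses a hypothetical `grid` and the standard-library `copy.deepcopy` to show the difference.

```python
import copy

# Hypothetical nested data.
grid = [[1, 2], [3, 4]]

shallow = grid.copy()      # new outer list, same inner lists
deep = copy.deepcopy(grid) # new outer list and new inner lists

grid[0][0] = 99            # change an inner list through the original
grid.append([5, 6])        # change the outer list of the original

print("shallow:", shallow)  # [[99, 2], [3, 4]]
print("deep:", deep)        # [[1, 2], [3, 4]]
```

The appended row does not appear in either copy, but the inner change shows up in `shallow` because it shares the inner lists with `grid`.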


## 9. Advanced Techniques

In [11]:
# List comprehensions with conditions
even_squares = [x**2 for x in range(20) if x % 2 == 0]
print("Even squares:", even_squares)

# Nested list comprehensions (matrix flattening)
matrix = [[1, 2, 3], [4, 5, 6]]
flattened = [item for row in matrix for item in row]
print("Flattened matrix:", flattened)

# Using zip() to combine elements
names = ["Alice", "Bob", "Carol"]
ages = [25, 30, 28]
paired = list(zip(names, ages))
print("Zipped pairs:", paired)

# Using enumerate() for indexed iteration
for idx, val in enumerate(names, start=1):
    print(f"Name {idx}: {val}")

Even squares: [0, 4, 16, 36, 64, 100, 144, 196, 256, 324]
Flattened matrix: [1, 2, 3, 4, 5, 6]
Zipped pairs: [('Alice', 25), ('Bob', 30), ('Carol', 28)]
Name 1: Alice
Name 2: Bob
Name 3: Carol
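
Zipped pairs like the ones above are often a stepping stone to other structures. As a small sketch, using the same sample values under hypothetical variable names, the pairs can be turned into a lookup dictionary or split back into separate sequences with `zip(*pairs)`.

```python
# Same sample values as the cell above.
names = ["Alice", "Bob", "Carol"]
ages = [25, 30, 28]
paired = list(zip(names, ages))

# A list of (key, value) tuples converts directly into a dict.
age_by_name = dict(paired)

# zip(*paired) reverses the pairing; the results are tuples, not lists.
unzipped_names, unzipped_ages = zip(*paired)

print("Bob's age:", age_by_name["Bob"])  # 30
print("Names again:", unzipped_names)    # ('Alice', 'Bob', 'Carol')
print("Ages again:", unzipped_ages)      # (25, 30, 28)
```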


## 10. Professional Tips

In [12]:
# Filtering with filter() and a lambda (equivalent to a list comprehension with a condition)
filtered_numbers = list(filter(lambda x: x > 15, [5, 10, 20, 30]))
print("Filtered numbers (>15):", filtered_numbers)

# Sorting with custom key (sort by string length)
words = ["data", "analysis", "AI", "python"]
words.sort(key=len)
print("Sorted by length:", words)

# Using sorted() to create a new sorted list
descending_numbers = sorted([5, 1, 9, 3], reverse=True)
print("Descending order:", descending_numbers)

# Remove duplicates using set, then convert back to list
# Note: a set is unordered, so the original order is not guaranteed to survive
with_duplicates = [1, 2, 2, 3, 3, 3, 4]
unique_values = list(set(with_duplicates))
print("Unique values:", unique_values)

# Generator expression: values are produced lazily, so the million squares are never stored at once
large_numbers = (x * x for x in range(10**6))
print("First squared value from generator:", next(large_numbers))

Filtered numbers (>15): [20, 30]
Sorted by length: ['AI', 'data', 'python', 'analysis']
Descending order: [9, 5, 3, 1]
Unique values: [1, 2, 3, 4]
First squared value from generator: 0
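
Two of the tips above have common variations. The sketch below, with hypothetical sample data, removes duplicates while keeping first-seen order via `dict.fromkeys` (dictionaries preserve insertion order in Python 3.7+), and passes a generator expression straight to `sum()` so that no intermediate list is built.

```python
# Hypothetical sample data, deliberately not sorted.
with_duplicates = [3, 1, 3, 2, 1, 4]

# dict keys are unique and keep insertion order, unlike a set.
unique_in_order = list(dict.fromkeys(with_duplicates))

# The generator is consumed one value at a time by sum().
total_of_squares = sum(x * x for x in range(1000))

print("Unique, order kept:", unique_in_order)  # [3, 1, 2, 4]
print("Sum of squares:", total_of_squares)     # 332833500
```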
