# Python Lists: The Backbone of Data

In AI and Machine Learning, **lists** are everywhere. Whether you're handling a sequence of stock prices, a batch of images, or a collection of text sentences, you are likely using a list (or a library like NumPy, whose arrays behave like super-charged lists).

## 1. Creating Lists
A list is an ordered collection of items. It can hold anything: numbers, strings, or even other lists.

In [None]:
# Empty list
empty_list = []

# List of numbers (e.g., model accuracy scores)
scores = [0.88, 0.92, 0.95, 0.89]

# List of strings (e.g., class labels)
classes = ['cat', 'dog', 'bird']

# Mixed types (Python allows this, but try to avoid it in ML data)
mixed = [1, 'cat', True]

print(f"Scores: {scores}")
print(f"Classes: {classes}")
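
Since a list can hold other lists, a small table or grid can be stored as a list of rows. The sketch below is illustrative only: the `image` name and its pixel values are invented for this example and do not come from a real dataset.

```python
# Hypothetical 2x3 grid of pixel intensities (values invented for illustration)
image = [
    [0, 128, 255],
    [64, 32, 16],
]

first_row = image[0]      # the inner list for row 0
pixel = image[1][2]       # row 1, then column 2 inside that row
n_rows = len(image)       # number of inner lists
n_cols = len(image[0])    # length of the first inner list

print(f"First row: {first_row}")
print(f"Pixel at (1, 2): {pixel}")
print(f"Shape: {n_rows} x {n_cols}")
```

Indexing twice (`image[1][2]`) works because the first index returns an inner list, which is then indexed like any other list.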

## 2. Accessing Elements (Indexing)
Python uses **0-based indexing**. This means the first item is at index 0.
- **Negative Indexing**: Count from the end with negative indices. `-1` is the last item, `-2` is the second to last, and so on.

> **Note for C# Developers**: Python's `-N` is equivalent to C#'s `^N`. So `filenames[-3]` is the same as `filenames[^3]`.

In [None]:
filenames = ['img1.jpg', 'img2.jpg', 'img3.jpg', 'img4.jpg']

print(f"First image: {filenames[0]}")
print(f"Third image: {filenames[2]}")

# Get the last image (useful when you don't know the length)
print(f"Last image: {filenames[-1]}")

# Get the 3rd from last (C# equivalent: filenames[^3])
print(f"3rd from last: {filenames[-3]}")
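
Two details of indexing are easy to check with a short sketch: a negative index `-n` refers to the same item as `len(list) - n`, and an index past either end raises `IndexError`. The variable names `same_item` and `out_of_range` are introduced here only for illustration.

```python
filenames = ['img1.jpg', 'img2.jpg', 'img3.jpg', 'img4.jpg']

# filenames[-n] refers to the same item as filenames[len(filenames) - n]
same_item = filenames[-3] == filenames[len(filenames) - 3]
print(f"Same item: {same_item}")

# Valid indices run from 0 to len(filenames) - 1; anything beyond raises IndexError
try:
    filenames[4]
    out_of_range = False
except IndexError:
    out_of_range = True
print(f"Index 4 raised IndexError: {out_of_range}")
```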

## 3. Slicing: The Art of Data Splitting
Slicing is one of the most important skills for a Data Scientist. You often need to split your data into **Training** (for the model to learn) and **Testing** (to evaluate the model).

**Syntax**: `list[start:stop:step]`
- `start`: Inclusive (default 0)
- `stop`: Exclusive (default end of list)
- `step`: How many items to jump (default 1)

In [None]:
data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# First 80% for training (indices 0 to 7; the stop index 8 is excluded)
train_set = data[:8]

# Last 20% for testing (index 8 to the end)
test_set = data[8:]

print(f"Train: {train_set}")
print(f"Test:  {test_set}")

# Reversing a list using step -1
reversed_data = data[::-1]
print(f"Reversed: {reversed_data}")
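
In practice the split point is usually computed from the list length rather than hard-coded. The following is a minimal sketch under that assumption; `train_ratio` and `split_index` are hypothetical names chosen for this example. It also shows that a slice is a new list and that slice bounds past the end are clamped instead of raising an error.

```python
data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

train_ratio = 0.8                            # hypothetical split ratio
split_index = int(len(data) * train_ratio)   # 8 for a 10-item list

train_set = data[:split_index]
test_set = data[split_index:]
print(f"Train: {train_set}")
print(f"Test:  {test_set}")

# A step of 2 takes every other item
every_other = data[::2]
print(f"Every other: {every_other}")

# A slice is a new list: changing it leaves the original untouched
train_set[0] = -1
print(f"Original first item: {data[0]}")

# Slice bounds beyond the end are clamped, unlike single-item indexing
tail = data[8:100]
print(f"Tail: {tail}")
```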

## 4. Modifying Lists
Lists are **mutable**, meaning you can change them after creation.

> **Note for C# Developers**:
> - `append(x)` is like `Add(x)`
> - `insert(i, x)` is like `Insert(i, x)`
> - `remove(x)` is like `Remove(x)`, but it raises `ValueError` when `x` is missing instead of returning `false`
> - `pop()` is similar to `RemoveAt(Count - 1)`, but it also returns the item.

In [None]:
models = ['LinearRegression', 'SVM']

# 1. Append: Add to the end (Most common)
models.append('RandomForest')
print(f"After append: {models}")

# 2. Insert: Add at a specific position
models.insert(0, 'LogisticRegression') # Add to front
print(f"After insert: {models}")

# 3. Remove: Delete by value
# Remove first occurrence of 'SVM'
models.remove('SVM')
print(f"After remove: {models}")

# 4. Pop: Remove by index (default last) and return it
# Remove and return last element
last_model = models.pop()
print(f"Popped: {last_model}")
print(f"Remaining: {models}")

# Remove and return element at index 1
popped = models.pop(1)
print(f"Popped: {popped}")
print(f"Remaining: {models}")

After append: ['LinearRegression', 'SVM', 'RandomForest']
After insert: ['LogisticRegression', 'LinearRegression', 'SVM', 'RandomForest']
After remove: ['LogisticRegression', 'LinearRegression', 'RandomForest']
Popped: RandomForest
Remaining: ['LogisticRegression', 'LinearRegression']
Popped: LinearRegression
Remaining: ['LogisticRegression']
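
Mutability has a side effect worth seeing once: assigning a list to another name does not copy it, so both names see every change. The sketch below uses the illustrative names `alias` and `snapshot`, and also guards `remove()` with `in`, because removing a missing value raises `ValueError`.

```python
models = ['LinearRegression', 'SVM']

alias = models            # a second name for the same list object
snapshot = models.copy()  # an independent (shallow) copy

models.append('RandomForest')

print(f"Alias:    {alias}")     # includes 'RandomForest'
print(f"Snapshot: {snapshot}")  # still the original two items

# remove() raises ValueError for a missing value, so check membership first
if 'KNN' in models:
    models.remove('KNN')
print(f"Models:   {models}")
```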


## 5. List Comprehensions: The Pythonic Way
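This is a concise way to create lists. A comprehension is usually more readable than the equivalent `for` loop, and in CPython it is often somewhat faster because it avoids a repeated `append` method call on every iteration. You will see this everywhere in ML code for **Feature Engineering**.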

- **Basic syntax**: `[expression for item in iterable]`
- **With conditional logic**: `[expression for item in iterable if condition]`
- **Nested list comprehension**: `[expression for item1 in iterable1 for item2 in iterable2]`
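
Speed differences between a loop and a comprehension can be measured with the standard `timeit` module. This is a rough sketch, not a benchmark: the function names, `n`, and `repeats` are arbitrary choices for this example, and the timings vary by machine and Python version, so no expected numbers are shown.

```python
import timeit

def with_loop(n):
    """Build a list of squares with an explicit for loop."""
    squares = []
    for i in range(n):
        squares.append(i * i)
    return squares

def with_comprehension(n):
    """Build the same list with a list comprehension."""
    return [i * i for i in range(n)]

n = 1_000       # arbitrary list size
repeats = 200   # arbitrary number of timed runs

loop_time = timeit.timeit(lambda: with_loop(n), number=repeats)
comp_time = timeit.timeit(lambda: with_comprehension(n), number=repeats)

print(f"For loop:      {loop_time:.4f} s")
print(f"Comprehension: {comp_time:.4f} s")
```
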

### C# LINQ vs Python List Comprehension
| Concept | C# LINQ | Python Equivalent |
| :--- | :--- | :--- |
| **Filter** | `.Where(x => x > 5)` | `[x for x in items if x > 5]` |
| **Map** | `.Select(x => x * 2)` | `[x * 2 for x in items]` |
| **First** | `.First(x => x > 5)` | `next(x for x in items if x > 5)` |
| **Any** | `.Any(x => x > 5)` | `any(x > 5 for x in items)` |
| **All** | `.All(x => x > 5)` | `all(x > 5 for x in items)` |
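
The rows of the table can be tried directly. The sketch below uses a made-up `values` list. The last `next(...)` call passes a default as the second argument, which plays a role similar to C#'s `FirstOrDefault`; without a default, `next` raises `StopIteration` when nothing matches.

```python
values = [3, 8, 1, 12]  # sample data invented for this example

filtered = [x for x in values if x > 5]      # Filter
doubled = [x * 2 for x in values]            # Map
first_big = next(x for x in values if x > 5) # First match
any_big = any(x > 5 for x in values)         # Any
all_big = all(x > 5 for x in values)         # All

# Supplying a default avoids StopIteration when nothing matches
first_huge = next((x for x in values if x > 100), None)

print(f"Filtered:   {filtered}")
print(f"Doubled:    {doubled}")
print(f"First > 5:  {first_big}")
print(f"Any > 5:    {any_big}")
print(f"All > 5:    {all_big}")
print(f"First > 100 (default None): {first_huge}")
```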

In [24]:
raw_prices = [100, 200, 300, 400]

# Goal: Apply a 10% discount to all prices

# Old way (For Loop)
discounted = []
for p in raw_prices:
    discounted.append(p * 0.9)

# Pythonic Way (List Comprehension)
# [expression for item in list]
discounted_fast = [p * 0.9 for p in raw_prices]

print(f"Discounted: {discounted_fast}")

# With a condition: Keep only prices > 250
expensive = [p for p in raw_prices if p > 250]
print(f"Expensive items: {expensive}")

# Old for loop way

lst = []
for i in range(11):
    lst.append(i)
print(lst)

# List Comprehension
lst = [i for i in range(11)]
print(lst)

# A list comprehension has two required parts, an expression and an iteration,
# plus an optional condition.
# Basic syntax: [expression for item in iterable]
# Square of numbers
square = [num**2 for num in lst]
print(f"Squared: {square}")

# List comprehension with a condition
# Syntax: [expression for item in iterable if condition]
even_num = [num for num in lst if num % 2 == 0]
print(f"Even numbers: {even_num}")

# List comprehension with multiple conditions
# Syntax: [expression for item in iterable if condition1 if condition2]
# Chained if clauses are combined with AND
divisible_by_2_and_3 = [num for num in lst if num % 2 == 0 if num % 3 == 0]
print(f"Divisible by 2 and 3: {divisible_by_2_and_3}")

# Nested list comprehension
# Syntax: [expression for item1 in iterable1 for item2 in iterable2]
lst1 = [1, 2, 3, 4, 5]
lst2 = [6, 7, 8, 9, 10]
lst3 = ['a', 'b', 'c', 'd', 'e']

# Builds a list of tuples covering every combination (the Cartesian product)
pair = [(i, j, k) for i in lst1 for j in lst2 for k in lst3]
print(pair)
print(type(pair))

Discounted: [90.0, 180.0, 270.0, 360.0]
Expensive items: [300, 400]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Squared: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
Even numbers: [0, 2, 4, 6, 8, 10]
Divisible by 2 and 3: [0, 6]
[(1, 6, 'a'), (1, 6, 'b'), (1, 6, 'c'), (1, 6, 'd'), (1, 6, 'e'), (1, 7, 'a'), (1, 7, 'b'), (1, 7, 'c'), (1, 7, 'd'), (1, 7, 'e'), (1, 8, 'a'), (1, 8, 'b'), (1, 8, 'c'), (1, 8, 'd'), (1, 8, 'e'), (1, 9, 'a'), (1, 9, 'b'), (1, 9, 'c'), (1, 9, 'd'), (1, 9, 'e'), (1, 10, 'a'), (1, 10, 'b'), (1, 10, 'c'), (1, 10, 'd'), (1, 10, 'e'), (2, 6, 'a'), (2, 6, 'b'), (2, 6, 'c'), (2, 6, 'd'), (2, 6, 'e'), (2, 7, 'a'), (2, 7, 'b'), (2, 7, 'c'), (2, 7, 'd'), (2, 7, 'e'), (2, 8, 'a'), (2, 8, 'b'), (2, 8, 'c'), (2, 8, 'd'), (2, 8, 'e'), (2, 9, 'a'), (2, 9, 'b'), (2, 9, 'c'), (2, 9, 'd'), (2, 9, 'e'), (2, 10, 'a'), (2, 10, 'b'), (2, 10, 'c'), (2, 10, 'd'), (2, 10, 'e'), (3, 6, 'a'), (3, 6, 'b'), (3, 6, 'c'), (3, 6, 'd'), (3, 6, 'e'), (3, 7, 'a'), (3, 7, 'b
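
The ordering of a nested comprehension is easier to see on a smaller input. In this sketch (with the invented lists `letters` and `digits`), the comprehension produces the same result as two nested `for` loops written in the same order, and the same result as `itertools.product` from the standard library.

```python
from itertools import product

letters = ['a', 'b']
digits = [1, 2, 3]

# The leftmost "for" is the outer loop; the rightmost varies fastest
pairs = [(letter, digit) for letter in letters for digit in digits]

# Equivalent nested loops, in the same order
pairs_loop = []
for letter in letters:
    for digit in digits:
        pairs_loop.append((letter, digit))

# Equivalent standard-library helper
pairs_product = list(product(letters, digits))

print(pairs)
print(f"All three match: {pairs == pairs_loop == pairs_product}")
print(f"Number of pairs: {len(pairs)}")  # len(letters) * len(digits)
```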

## 6. Common List Operations
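- `len()`: Get the number of items.
- `in`: Check whether an item exists (an operator, not a method).
- `sort()`: Sort the list in-place.
- `sorted()`: Return a new sorted list, leaving the original unchanged.
- `reverse()`: Reverse the list in-place.
- `index()`: Find the index of the first occurrence of an item.
- `count()`: Count occurrences of an item.
- `clear()`: Remove all items from the list.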

In [5]:
batch_sizes = [32, 16, 64, 8, 64]

print(f"Number of batches: {len(batch_sizes)}")

if 64 in batch_sizes:
    print("Batch size 64 is present.")

# append()
batch_sizes.append(128)
print(f"Append: {batch_sizes}") # Adds an element to the end of the list

# in (membership operator)
bool_value = 123 in batch_sizes
print(f"123 is in batch_sizes: {bool_value}") # Returns True if the element is present in the list, otherwise False

# Sorting
batch_sizes.sort()
print(f"Sort: {batch_sizes}")

# sorted()
print(f"Sorted: {sorted(batch_sizes)}") # Returns a new list containing all elements from the original list in sorted order

# index()
print(f"Index of 64: {batch_sizes.index(64)}") # Returns the first index at which a given element can be found in the list

# count()
print(f"Count of 64: {batch_sizes.count(64)}") # Returns the number of times a given element appears in the list

# reverse()
batch_sizes.reverse()
print(f"Reversed: {batch_sizes}") # Reverses the order of elements in the list

# clear()
batch_sizes.clear()
print(f"Clear: {batch_sizes}") # Removes all elements from the list




Number of batches: 5
Batch size 64 is present.
Append: [32, 16, 64, 8, 64, 128]
123 is in batch_sizes: False
Sort: [8, 16, 32, 64, 64, 128]
Sorted: [8, 16, 32, 64, 64, 128]
Index of 64: 3
Count of 64: 2
Reversed: [128, 64, 64, 32, 16, 8]
Clear: []
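
The difference between `sort()` and `sorted()` is a common source of bugs, so here is a small sketch. The `labels` list and the `position` fallback of `-1` are choices made for this example only. Note that `sort()` returns `None`, and that both functions accept `reverse=` and `key=` arguments.

```python
batch_sizes = [32, 16, 64, 8, 64]

ascending = sorted(batch_sizes)                 # new list, original untouched
descending = sorted(batch_sizes, reverse=True)  # largest first
print(f"Original:   {batch_sizes}")
print(f"Ascending:  {ascending}")
print(f"Descending: {descending}")

# sort() changes the list in-place and returns None
result = batch_sizes.sort()
print(f"sort() returned: {result}")
print(f"After sort(): {batch_sizes}")

# index() raises ValueError for a missing item, so check with "in" first
position = batch_sizes.index(128) if 128 in batch_sizes else -1
print(f"Position of 128: {position}")

# key= sorts by a computed value, here the length of each string
labels = ['bird', 'cat', 'elephant', 'ox']
by_length = sorted(labels, key=len)
print(f"By length: {by_length}")
```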


### Iterating over a list

In [8]:
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

for number in numbers:
    print(number)

# Iteration with index: enumerate() yields (index, value) pairs

for index, number in enumerate(numbers):
    print(index, number)  # print index and value

1
2
3
4
5
6
7
8
9
10
0 1
1 2
2 3
3 4
4 5
5 6
6 7
7 8
8 9
9 10
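
Two related iteration patterns come up often: starting the counter at 1 with `enumerate(..., start=1)`, and walking two lists in parallel with `zip()`. The sketch below pairs class labels with scores; the score values are made up for illustration.

```python
classes = ['cat', 'dog', 'bird']
scores = [0.88, 0.92, 0.95]  # hypothetical scores, one per class

# enumerate() can start counting from any number
numbered = [f"{position}. {name}" for position, name in enumerate(classes, start=1)]
print(numbered)

# zip() pairs up items at the same position in each list
for name, score in zip(classes, scores):
    print(f"{name}: {score}")

# Find the (name, score) pair with the highest score
best_name, best_score = max(zip(classes, scores), key=lambda pair: pair[1])
print(f"Best: {best_name} ({best_score})")
```

`zip()` stops at the end of the shorter list, so lists of unequal length are silently truncated.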


## Practice Exercises

### Exercise 1: Data Cleaner
You have a list of sensor readings: `[10, -5, 20, -1, 30]`.
1. Create a new list that contains only the positive numbers.
2. Use a list comprehension.

### Exercise 2: Train/Test Splitter
You have a list of 20 items (create it using `list(range(20))`).
1. Slice the first 15 items into `train_data`.
2. Slice the last 5 items into `test_data`.
3. Print the length of both lists to verify.

In [1]:
# Exercise 1: Data Cleaner
readings = [10, -5, 20, -1, 30]
positive_readings = [x for x in readings if x > 0]
print(f"Positive readings: {positive_readings}")

# Exercise 2: Train/Test Splitter
items = list(range(20))
train_data = items[:15]
test_data = items[15:]
print(f"Train length: {len(train_data)}")
print(f"Test length: {len(test_data)}")

Positive readings: [10, 20, 30]
Train length: 15
Test length: 5
