# 01-01: Python Basics Review (for ML Context)

In [None]:
# Environment setup (Colab-friendly)
# NOTE: In Colab, most of these are preinstalled

import math
import random


## Learning Objectives

By the end of this notebook, you should be able to:

- Read Python code with confidence
    
- Use variables, lists, dictionaries, loops, and conditionals
    
- Understand Python code in a **data analysis / ML context**
    
- Recognize patterns you will see later in pandas and PyTorch

---

## Why This Matters

You are **not** learning Python to become a software engineer.

You are learning Python so that:

- you can prepare data
    
- you can express models
    
- you can understand what ML libraries are doing for you
    

If you can _read_ Python comfortably, you can learn ML quickly.

---

## Variables and Types (Quick Review)

In [1]:
income = 75_000
loan_accepted = True
interest_rate = 0.12

Python infers types automatically, but **you must still reason about them**.
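To see why this matters, consider a number that arrives as text, which is common when data is read from a file or a form. The sketch below uses a hypothetical variable, `raw_income`, that is not part of the notebook's data. Python repeats the string instead of doubling the number, and it does so without raising an error.

```python
# Hypothetical value that arrived as text, e.g. from a CSV file or web form
raw_income = "75000"

# Multiplying a string repeats it; Python raises no error here
doubled_text = raw_income * 2
print(doubled_text)        # 7500075000

# Converting to int first gives the arithmetic we actually wanted
doubled_number = int(raw_income) * 2
print(doubled_number)      # 150000
```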

### Exercise 1

In [None]:
# What is the type of each variable?
# Use the type() function to check.

### Some Basic Calculations 

Take the old and new salary as user input and compute the percentage change.

In [None]:
# float() converts the typed text to a number; eval() would run arbitrary code
s1 = float(input("What is your current salary? "))
s2 = float(input("What is your new salary? "))

pchange = (s2 - s1) / s1

print(f"Percentage change in salary is: {pchange:.2%}")
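As a quick check of the formula without typing any input, here is the same calculation with fixed salaries. The values 50,000 and 55,000 are made up for illustration.

```python
# Illustrative values, not real data
old_salary = 50_000
new_salary = 55_000

# Percentage change = (new - old) / old
pchange = (new_salary - old_salary) / old_salary

# The .2% format multiplies by 100 and appends a percent sign
print(f"Percentage change in salary is: {pchange:.2%}")   # 10.00%
```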

Later on, we will look at functions in Python, similar to what we did in R.

---

## Lists, Tuples, and Dictionaries

These are the three collection types you will use most often. Here is a quick overview before each one gets its own section.

1. Lists: The Versatile "To-Do" Items 
Lists are the most common collection. They are ordered and mutable (changeable).

Key Syntax: `my_list = [1, 2, 3]`

Concept: Like a shopping list where you can add, remove, or swap items.

2. Tuples: The "Locked" Sequences 
Tuples are ordered but immutable (cannot be changed after creation).

Key Syntax: `my_tuple = (1, 2, 3)`

Concept: Like a GPS coordinate (Latitude, Longitude); once it's set, you don't want someone accidentally changing just one part of it.

3. Dictionaries: The "Key-Value" Map 
Dictionaries store data in pairs.

Key Syntax: `my_dict = {"name": "Alice", "age": 25}`

Concept: Like a real-life dictionary where you look up a "word" (key) to find its "definition" (value).

## 1. Lists: The Versatile To-Do List 
Lists are **ordered** and **mutable** (you can change them).

* **Syntax:** Uses square brackets `[]` 
* **Indexing:** Starts at `0`

In [None]:
# Creating a list
languages = ["Python", "Java", "R"]

# Accessing items
print(f"The first item is: {languages[0]}")

# Changing an item
languages[1] = "Rust"

# Adding an item
languages.append("JavaScript")

print(f"Updated list: {languages}")
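Because indexing starts at `0`, the last item sits at position `len(list) - 1`. Here is a small sketch of negative indexing and slicing. It uses a separate illustrative list, `scores`, with made-up numbers, so the `languages` list above is left untouched.

```python
# Illustrative list of model accuracy scores (made-up numbers)
scores = [0.71, 0.78, 0.82, 0.85, 0.91]

print(len(scores))     # 5 items, valid indexes are 0 through 4
print(scores[-1])      # 0.91, negative indexes count from the end
print(scores[:2])      # [0.71, 0.78], the stop index is excluded
print(scores[2:])      # [0.82, 0.85, 0.91], everything from index 2 onward
```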

## 2. Tuples: The Locked Sequence 
Tuples are **ordered** but **immutable** (they cannot be changed).

* **Syntax:** Uses parentheses `()`
* **Why use them?** For data that should never change, like GPS coordinates or dates.

In [None]:
# Creating a tuple
coordinates = (40.7128, -74.0060)

print(f"Latitude: {coordinates[0]}")

# Try to change a tuple (This will throw an error!)
try:
    coordinates[0] = 34.0522
except TypeError as e:
    print(f"Caught expected error: {e}")

Many ML functions expect tuples as arguments. A very powerful related concept is __tuple unpacking__, which lets you "extract" the values inside a tuple directly into separate variables in a single line.

In [None]:
# A tuple representing (Revenue, Expenses, Tax_Rate)
quarterly_report = (150000, 95000, 0.21)

# Unpacking into descriptive variables
rev, exp, tax = quarterly_report

rev

In [None]:
# using the above in a calculation
# Now we can perform business logic easily
profit = rev - exp
net_income = profit * (1 - tax)

print(f"Net Income: ${net_income:,.2f}")
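Tuple unpacking is also how Python functions hand back several results at once, a pattern that shows up often in ML code when data is split into parts. The sketch below uses a hypothetical helper, `split_data`, written only for illustration. It is not a library function, and the income values are made up.

```python
def split_data(rows, train_fraction=0.8):
    """Hypothetical helper: split a list into a training part and a test part."""
    cutoff = int(len(rows) * train_fraction)
    # Returning two values separated by a comma actually returns one tuple
    return rows[:cutoff], rows[cutoff:]

incomes = [52_000, 61_000, 75_000, 48_000, 90_000]

# Unpack the returned tuple into two descriptive variables
train_rows, test_rows = split_data(incomes)

print(train_rows)   # [52000, 61000, 75000, 48000]
print(test_rows)    # [90000]
```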

We can also use this concept to extract the relevant fields from a tuple and store the rest as metadata using the `*` operator:

In [7]:
# Transaction: (ID, Amount, Time, IP, Region, Device)
transaction = ("TXN-992", 450.00, "2026-01-24", "192.168.1.1", "Midwest", "Mobile")

# Extract core metrics and bundle the audit trail
txn_id, amount, *rest = transaction

print(f"Processing {txn_id} for ${amount}")
print(f"Audit Metadata: {rest}") # This becomes a list

Processing TXN-992 for $450.0
Audit Metadata: ['2026-01-24', '192.168.1.1', 'Midwest', 'Mobile']


Note that there is nothing special about _rest_ in `*rest`. Sometimes, even an `_` is used to capture the extra information: `txn_id, amount, *_ = transaction`
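The starred name does not have to come last. As a small sketch on an illustrative tuple with made-up values, the star can also collect the middle of a sequence, and `*_` signals that the extra values are intentionally ignored:

```python
# Illustrative daily closing prices for one week
prices = (101.5, 102.0, 99.8, 100.4, 103.1)

# Keep the first and last values; the star collects everything in between
first, *middle, last = prices
print(first, last)   # 101.5 103.1
print(middle)        # [102.0, 99.8, 100.4]

# Keep only the first value and discard the rest by convention
opening, *_ = prices
print(opening)       # 101.5
```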

## 3. Dictionaries: The Backbone of Data & Models in Python

A **dictionary** is a data structure that stores information as **key–value pairs**.

- Lists answer: _“What is at position i?”_
    
- Dictionaries answer: _“What is the value associated with this name?”_
    

This makes dictionaries ideal for:

- Structured data
    
- Labeled information
    
- Configuration settings
    
- Model inputs and outputs

In [7]:
# Creating a dictionary
student = {
    "name": "Alex",
    "major": "Business Analytics",
    "gpa": 3.72,
    "graduating": True
}

print(student)

{'name': 'Alex', 'major': 'Business Analytics', 'gpa': 3.72, 'graduating': True}


Here:
- Keys: `"name"`, `"major"`, `"gpa"`, `"graduating"`
- Values: strings, numbers, booleans
    
---

### Accessing and Updating Values

In [8]:
student["gpa"]

3.72

In [9]:
student["gpa"] = 3.80
student["internship"] = "Marketing Analytics"
student

{'name': 'Alex',
 'major': 'Business Analytics',
 'gpa': 3.8,
 'graduating': True,
 'internship': 'Marketing Analytics'}
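Square brackets raise a `KeyError` when the key is missing. Here is a sketch of two common ways to guard against that. It uses a small made-up dictionary, `applicant`, that is separate from `student`.

```python
applicant = {"name": "Jordan", "income": 62_000}

# The in operator checks whether a key exists
print("income" in applicant)        # True
print("credit_score" in applicant)  # False

# .get() returns a default instead of raising KeyError
score = applicant.get("credit_score", "unknown")
print(score)                        # unknown

# Square brackets on a missing key raise KeyError
try:
    applicant["credit_score"]
except KeyError as e:
    print(f"Missing key: {e}")      # Missing key: 'credit_score'
```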

**Important rules**:

- Keys must be **unique**
    
- Values can repeat
    
- Keys are usually strings, but can be numbers or tuples
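These rules can be seen in a short sketch. The names and numbers below are illustrative only. Assigning to an existing key overwrites its value rather than creating a second entry, and a tuple can serve as a key because it is immutable.

```python
# Tuple keys: (region, quarter) -> sales, with made-up figures
sales = {
    ("Midwest", "Q1"): 120_000,
    ("Midwest", "Q2"): 135_000,
}
print(sales[("Midwest", "Q2")])   # 135000

# Keys are unique: assigning to an existing key replaces the old value
sales[("Midwest", "Q1")] = 125_000
print(len(sales))                 # 2, still only two entries

# A list cannot be a key because lists are mutable
try:
    sales[["Midwest", "Q3"]] = 140_000
except TypeError as e:
    print(f"Caught expected error: {e}")
```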

---

### Why Dictionaries Matter More Than Lists

Compare this list-based representation:

`student_list = ["Alex", "Business Analytics", 3.8, True]`

❌ Problems:

- Hard to remember order
    
- Error-prone
    
- Not self-documenting
    

### Dictionary version:

In [10]:
student_dict = {
    "name": "Alex",
    "major": "Business Analytics",
    "gpa": 3.8,
    "graduating": True
}

This is:
- clear
- robust
- extensible

---

### Looping through Dictionaries

In [11]:
for key, value in student.items():
    print(f"{key}: {value}")

name: Alex
major: Business Analytics
gpa: 3.8
graduating: True
internship: Marketing Analytics


### Nested Dictionaries

A dictionary can contain another dictionary as a value. In the following example, the value stored under the key `credit` is itself a dictionary.

In [12]:
loan_applicant = {
    "income": 85000,
    "credit": {
        "score": 720,
        "delinquencies": 0
    },
    "approved": True
}

loan_applicant["credit"]["score"]

720

A lot of data is stored as nested dictionaries.

We can loop through the above dictionary:

In [13]:
for key, value in loan_applicant.items():
    print(f"{key}: The corresponding value is {value}")

income: The corresponding value is 85000
credit: The corresponding value is {'score': 720, 'delinquencies': 0}
approved: The corresponding value is True
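The loop above prints the inner dictionary as a single value. One possible way to reach the inner keys as well is to check whether a value is itself a dictionary and loop again. This sketch handles only one level of nesting, and it redefines `loan_applicant` so the block runs on its own. The dotted key names such as `credit.score` are a naming choice made here for illustration.

```python
loan_applicant = {
    "income": 85000,
    "credit": {"score": 720, "delinquencies": 0},
    "approved": True,
}

flat = {}
for key, value in loan_applicant.items():
    if isinstance(value, dict):
        # Combine outer and inner key names, e.g. "credit.score"
        for inner_key, inner_value in value.items():
            flat[f"{key}.{inner_key}"] = inner_value
    else:
        flat[key] = value

print(flat)
# {'income': 85000, 'credit.score': 720, 'credit.delinquencies': 0, 'approved': True}
```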


We will apply functions to dictionaries later. As a preview, dictionaries also show up throughout pandas and PyTorch.

### Dictionaries in Pandas (Preview)

A row in a DataFrame behaves much like a dictionary keyed by column name.

In [14]:
import pandas as pd

df = pd.DataFrame([
    {"income": 50000, "loan": "No"},
    {"income": 90000, "loan": "Yes"}
])

df.iloc[1]["income"]

np.int64(90000)
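The connection runs in both directions. As a sketch using the same two-row data, `Series.to_dict()` turns one row back into a dictionary, and `DataFrame.to_dict("records")` returns the whole table as a list of dictionaries:

```python
import pandas as pd

df = pd.DataFrame([
    {"income": 50000, "loan": "No"},
    {"income": 90000, "loan": "Yes"}
])

# One row as a plain dictionary (keys are the column names)
row = df.iloc[1].to_dict()
print(row["loan"])      # Yes

# The whole DataFrame as a list of dictionaries, one per row
records = df.to_dict("records")
print(len(records))     # 2
print(records[0]["income"])
```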

### Dictionaries in PyTorch (Very Important Later)

PyTorch datasets and training loops often pass batches around as dictionaries.

In [15]:
# Placeholder data so the cell runs; in practice X and y would be PyTorch tensors
X = [[0.5, 1.2], [0.9, 0.3]]
y = [0, 1]

batch = {
    "inputs": X,
    "labels": y
}

Model configuration (similar in spirit to the `trainControl` function in R's caret package) is often stored as a dictionary in PyTorch projects.

In [None]:
config = {
    "learning_rate": 0.001,
    "epochs": 20,
    "batch_size": 32
}
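One reason a configuration dictionary is convenient is that `**` unpacking passes its entries to a function as keyword arguments. The `train` function below is a hypothetical stand-in written for this sketch. It is not a PyTorch API and only reports the settings it receives.

```python
def train(learning_rate, epochs, batch_size):
    """Hypothetical stand-in for a training routine; it only summarizes settings."""
    return f"lr={learning_rate}, epochs={epochs}, batch_size={batch_size}"

config = {
    "learning_rate": 0.001,
    "epochs": 20,
    "batch_size": 32
}

# **config expands to learning_rate=0.001, epochs=20, batch_size=32
summary = train(**config)
print(summary)   # lr=0.001, epochs=20, batch_size=32

# Override one setting without editing the original dictionary
quick_config = {**config, "epochs": 2}
print(train(**quick_config))   # lr=0.001, epochs=2, batch_size=32
```

Because the keys must match the function's parameter names, a misspelled key fails immediately with a `TypeError` instead of being silently ignored.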

### Key Takeaways

- Dictionaries are named containers for data

- They enable:

    - Clean code

    - Safer function design

    - ML pipelines

- If lists are about position, dictionaries are about meaning

- Mastering dictionaries makes:

    - Pandas easier

    - PyTorch natural

    - APIs understandable