# Notebook 01: Python Fundamentals via Choice Modeling

**Objective:** Introduce core Python programming fundamentals (variables, data types, data structures, functions, and control flow) and illustrate them with simple examples. We then apply these basics to a toy choice modeling scenario: a softmax-based utility model for choosing among travel modes. By the end of this notebook, participants will be comfortable with basic Python syntax and see how these concepts translate to discrete choice contexts.

## 01.1 Variables and Data Types

In Python, you can store data in variables. A variable is essentially a name that refers to a value. You create a variable with the assignment operator `=`. Python is dynamically typed, meaning you don't declare types explicitly; the type is inferred from the value assigned.

Examples of basic data types:

In [None]:
# Assigning variables of different types
traveler_name = "Alice"        # str (string)
age = 30                      # int (integer)
ticket_price = 15.50          # float (floating-point number)
is_student = True             # bool (Boolean)


In [None]:
traveler_name

Here, `traveler_name` is a string (text) containing `"Alice"`, `age` is an integer, `ticket_price` is a floating-point number, and `is_student` is a Boolean value (True/False). We can check their types using the built-in `type()` function:

In [None]:
# check the type of a variable
type(traveler_name)

Variables allow us to label data and reuse it. We will use variables to store things like travel times, costs, choices, etc., in choice modelling examples.

> **Note:** Python variable names must start with a letter or an underscore and can contain letters, digits, and underscores. They are case-sensitive (`Mode` and `mode` are different variables).
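
A minimal sketch of two points made above, dynamic typing and case sensitivity. The names `trip_value`, `Fare`, and `fare` are illustrative only and are not used elsewhere in this notebook.

```python
# Dynamic typing: the same name can be rebound to values of different types.
trip_value = 15                          # an int
type_before = type(trip_value).__name__
trip_value = 15.50                       # the same name now refers to a float
type_after = type(trip_value).__name__
print(type_before, "->", type_after)

# Case sensitivity: `Fare` and `fare` are two separate variables.
Fare = 2.50
fare = 3.00
print(Fare, fare)
```

Rebinding a name to a different type is legal, but in modelling code it is usually clearer to keep one type per variable.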

## 01.2 Data Structures: Lists and Dictionaries

**Lists:** A list in Python is an ordered, mutable sequence of items. Lists are created with square brackets `[]`. They can contain elements of any type (even mixed types, though usually we keep them homogeneous). Use lists to store collections of related items, like a list of mode names or a list of travel times.

In [None]:
# List of available travel modes
modes = ["Car", "Bus", "Train"]

In [None]:
# print the list of modes
print("Modes:", modes)

In [None]:
# or just type the variable name
modes

In [None]:
# Indexing (0-based: 0 is first item)
print("First mode:", modes[0])

In [None]:
# or access individual items by index
modes[0]

In [None]:
# Add an element to the list
modes.append("Air")              
modes

Lists preserve the order of insertion and allow duplicates. You can modify elements (`modes[1] = "Coach"` would change "Bus" to "Coach"), iterate over them, and use built-in functions like `len(modes)` (number of items).

In [None]:
modes[1] = "Coach"
print("Updated modes:", modes)

In [None]:
# check length of list
len(modes)

In [None]:
# or print length
print("Number of modes:", len(modes))

**Dictionaries:** A dictionary is a collection of key-value pairs enclosed in curly braces `{}`. Each entry maps a key to a value, like a real dictionary maps words to definitions. Since Python 3.7, dictionaries preserve insertion order, but entries are looked up by key rather than by position. Use dictionaries to structure data by named attributes.

Example:

In [None]:
# Dictionary of travel times for each mode (in minutes)
travel_time = {"Car": 30, "Coach": 45, "Train": 40}


In [None]:
# check the dictionary
travel_time

In [None]:
# access travel time for a specific mode
travel_time["Car"]

In [None]:
print("Travel time by Car:", travel_time["Car"], "minutes")

In [None]:
# Add a new key-value pair for Air
travel_time["Air"] = 60
print("Modes and times:", travel_time)

Here, `"Car"`, `"Coach"`, etc. are keys (must be unique and immutable, typically strings or numbers), and the numbers are values. We accessed the Car time with `travel_time["Car"]`. We then added `"Air": 60`. Dictionaries are great for structured data – e.g., storing attributes of an alternative (mode) by name.

In [None]:
# List all modes
print("All modes:", list(travel_time.keys()))

# List all travel times
print("All travel times:", list(travel_time.values()))  
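
Beyond `.keys()` and `.values()`, a few other dictionary idioms come up often. The sketch below uses a stand-in dictionary, `travel_time_demo` (a hypothetical copy, so the cell runs on its own), to show iteration over key-value pairs, membership tests, and safe lookup.

```python
# Hypothetical stand-in for the `travel_time` dictionary built above.
travel_time_demo = {"Car": 30, "Coach": 45, "Train": 40, "Air": 60}

# .items() yields (key, value) pairs in insertion order.
for mode_name, minutes in travel_time_demo.items():
    print(f"{mode_name}: {minutes} minutes")

# `in` tests for a key; .get() returns None (or a given default) instead of raising KeyError.
has_ferry = "Ferry" in travel_time_demo
ferry_time = travel_time_demo.get("Ferry")

# min() with key=... returns the key whose value is smallest.
fastest_mode = min(travel_time_demo, key=travel_time_demo.get)
print(has_ferry, ferry_time, fastest_mode)
```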

In [None]:
# storing attributes of an alternative (mode) by name
mode_attributes = {
    "Car": {"cost": 10.0, "comfort": 7},
    "Coach": {"cost": 5.0, "comfort": 5},
    "Train": {"cost": 8.0, "comfort": 6},
    "Air": {"cost": 50.0, "comfort": 9}
}

In [None]:
mode_attributes

In [None]:
print("Train attributes:", mode_attributes["Train"])

In [None]:
print("Air cost:", mode_attributes["Air"]["cost"])

In [None]:
# mode_attributes could be used in a choice model to evaluate alternatives
# based on cost, comfort, and travel time stored in the dictionaries.

# for example, calculate a utility function for Car
V_Car = -0.1 * mode_attributes["Car"]["cost"] + 0.5 * mode_attributes["Car"]["comfort"] - 0.05 * travel_time["Car"]
V_Car

In [None]:
# Now let's calculate a utility score for each mode
for mode in modes:
    attrs = mode_attributes[mode]
    time = travel_time.get(mode, 999)  # Default to 999 if mode not found
    V = -0.1 * attrs["cost"] + 0.5 * attrs["comfort"] - 0.05 * time
    print(f"Utility for {mode}: {V:.2f}")

We will often use dictionaries to hold parameters or results in modeling (for example, a dictionary of utility coefficients by variable name, or a record of outputs).
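
As a rough sketch of the "coefficients by variable name" idea, the block below stores coefficients and attributes in two dictionaries with matching keys and sums their products. The names `betas_demo` and `car_demo` are illustrative; the numbers are the Car values and weights used in the utility expression earlier in this section.

```python
# Coefficients keyed by attribute name (same weights as the utility expression above).
betas_demo = {"cost": -0.1, "comfort": 0.5, "time": -0.05}

# Attributes of one alternative, keyed by the same names (Car values from above).
car_demo = {"cost": 10.0, "comfort": 7, "time": 30}

# Utility is the sum of coefficient * attribute over all named attributes.
V_car_demo = sum(betas_demo[name] * car_demo[name] for name in betas_demo)
print(f"Utility for Car: {V_car_demo:.2f}")
```

Keeping the coefficients in one dictionary means a new attribute can be added to the utility by adding a key, without rewriting the formula.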

**List of dictionaries:** Sometimes you'll have a list of records, where each record is a dictionary. This could represent dataset-like structures (each dict is an observation). For instance, a list of individuals each with their attributes, or a list of alternatives each with its characteristics. Python's flexibility with these structures is useful for simple simulations.

In [None]:
# Example: List of individuals with attributes
individuals = [
    {"name": "Alice", "age": 30, "income": 70000},
    {"name": "Bob", "age": 25, "income": 50000},
    {"name": "Charlie", "age": 35, "income": 100000}
]

# Example: List of alternatives with characteristics
transport_modes = [
    {"mode": "car", "cost": 0.5, "comfort": 0.8, "time": 30},
    {"mode": "bus", "cost": 0.2, "comfort": 0.6, "time": 45},
    {"mode": "bike", "cost": 0.1, "comfort": 0.7, "time": 60}
]
print("Individuals:", individuals)
print("Transport modes:", transport_modes)

# Accessing data from the list of dictionaries
for person in individuals:
    print(f"{person['name']} is {person['age']} years old with an income of ${person['income']}.")

for mode in transport_modes:
    print(f"Transport mode: {mode['mode']}, Cost: {mode['cost']}, Comfort: {mode['comfort']}, Time: {mode['time']} minutes")


## 01.3 Introducing NumPy Arrays

While Python lists are very flexible, for numeric data we often use **NumPy arrays** for efficiency. NumPy (Numerical Python) provides a multi-dimensional array object and operations to process arrays quickly.

A NumPy array is like a grid of values (all of the same type) indexed by tuple(s) of nonnegative integers. They are optimized for numeric computations, enabling vectorized operations (operating on whole arrays at once).

First, import NumPy:

In [None]:
import numpy as np


Create a NumPy array from a Python list:

In [None]:
times_list = [30, 45, 40, 60]               # regular Python list
times_array = np.array(times_list)          # NumPy array

In [None]:
times_array

In [None]:
2 * times_array

In [None]:
# difference between list and array multiplication
print("List * 2:", times_list * 2)          # List * 2 concatenates the list with itself
print("Array * 2:", times_array * 2)        # Array * 2 multiplies each element by 2

>  Notice the difference: multiplying a Python list by 2 repeats it (because for lists, `*` is defined as repetition), whereas multiplying a NumPy array by 2 performs element-wise numerical doubling. This vectorization is powerful for mathematical operations and is much faster than using loops for large arrays.

In [None]:
print("Array mean:", np.mean(times_array))    # Mean of the array
print("Array sum:", np.sum(times_array))      # Sum of the array
print("Array sqrt:", np.sqrt(times_array))    # Square root of each element

Creating a 2D NumPy array (matrix) for travel times of different modes over different distances:

In [None]:
travel_times = np.array([[30, 45, 40], [60, 50, 55], [70, 80, 75]])  # 3 modes, 3 distances

In [None]:
travel_times

In [None]:
# access first row
travel_times[0]

In [None]:
# access first column
travel_times[:, 0]

In [None]:
# access a specific element (second row, third column; indices are 0-based)
travel_times[1, 2]

In [None]:
# calculate the mean travel time across all modes and distances
np.mean(travel_times)

In [None]:
# calculate the mean travel time per column
np.mean(travel_times, axis=0)

In [None]:
# calculate the mean travel time per row
np.mean(travel_times, axis=1)

In [None]:
travel_times = np.array([[30, 45, 40], [60, 50, 55], [70, 80, 75]])  # 3 modes, 3 distances
print("Travel times (2D array):")
print(travel_times)
print("First row (mode 1):", travel_times[0])        # First row
print("Element at (2,1):", travel_times[2, 1])
print("Mean travel time:", np.mean(travel_times))      # Mean of all elements
print("Sum of travel times:", np.sum(travel_times))      # Sum of all elements
print("Travel times * 1.1 (10% increase):")
print(travel_times * 1.1)  # Increase all times by 10%

Example: Calculate average travel time for each mode (row-wise mean)

In [None]:
average_times = np.mean(travel_times, axis=1)
print("Average travel time per mode:", average_times)

>  NumPy arrays will be heavily used when we deal with large datasets or model computations (e.g., calculating utility for many observations at once). We will explore NumPy further in Notebook 02, but remember: when you see `np.array` and similar syntax, we are leveraging NumPy for speed and convenience in numeric calculations.
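
To make "calculating utility for many observations at once" concrete, here is a minimal sketch that evaluates a linear utility for three alternatives in one vectorized expression. The arrays and coefficient values are illustrative assumptions (they mirror the toy numbers used later in this notebook), and all names ending in `_demo` are hypothetical.

```python
import numpy as np

# Illustrative attributes for three alternatives (Car, Bus, Train).
times_demo = np.array([30, 45, 40])      # minutes
costs_demo = np.array([5.0, 2.0, 4.0])   # £

# Illustrative coefficients for time and cost.
beta_time_demo = -0.1
beta_cost_demo = -0.5

# One expression evaluates the utility of every alternative at once, with no loop.
utilities_demo = beta_time_demo * times_demo + beta_cost_demo * costs_demo
print("Utilities:", utilities_demo)
```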

## 01.4 Functions and Control Flow

**Functions:** Functions are reusable blocks of code that perform a specific task. We define a function with the `def` keyword, specifying parameters, and use `return` to output a result. Functions help organize code and avoid repetition.

For example, let's define a simple function to compute a linear utility given some attributes:

In [None]:
def compute_utility(time, cost):
    """Compute utility as a weighted sum of time and cost (toy example)."""
    beta_time = -0.1   # coefficient for travel time (per minute)
    beta_cost = -0.5   # coefficient for travel cost (per currency unit)
    utility = beta_time * time + beta_cost * cost
    return utility

In [None]:
# Test the function
u_car = compute_utility(time=30, cost=5)   # e.g., 30 mins, £5
print("Utility for car:", u_car)

u_bus = compute_utility(time=45, cost=2)   # e.g., 45 mins, £2
print("Utility for bus:", u_bus)

u_bike = compute_utility(time=25, cost=1)   # e.g., 25 mins, £1
print("Utility for bike:", u_bike)

So, for a 30-minute trip costing £5, the utility is -5.5 (the negative sign indicating disutility from time and cost, as expected).

Another example:

In [None]:
def calculate_travel_cost(distance, mode):
    cost_per_km = {"car": 0.5, "bus": 0.2, "bike": 0.1}  # cost per km for each mode
    return distance * cost_per_km.get(mode, 0)

# Test the function
print("Travel cost (car, 100 km):", calculate_travel_cost(100, "car"))
print("Travel cost (bus, 100 km):", calculate_travel_cost(100, "bus"))
print("Travel cost (bike, 100 km):", calculate_travel_cost(100, "bike"))

Functions make code more readable and maintainable, especially when the logic might be used multiple times. We will use functions to encapsulate tasks like computing probabilities or evaluating log-likelihoods in later notebooks.
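
One possible refinement, sketched here under the assumption that you want to vary the coefficients without editing the function body: pass them as keyword arguments with default values. `compute_utility_with_betas` is a hypothetical name, not a function used elsewhere in the notebook.

```python
def compute_utility_with_betas(time, cost, beta_time=-0.1, beta_cost=-0.5):
    """Linear utility with coefficients passed as optional keyword arguments."""
    return beta_time * time + beta_cost * cost

# Using the defaults reproduces the toy coefficients from `compute_utility`.
u_default = compute_utility_with_betas(time=30, cost=5)

# Overriding one coefficient, e.g. a traveler who is twice as sensitive to cost.
u_cost_sensitive = compute_utility_with_betas(time=30, cost=5, beta_cost=-1.0)
print(u_default, u_cost_sensitive)
```

Default arguments keep simple calls short while still allowing the same function to be reused when the coefficients change, for example inside an estimation routine.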

**Control Flow:** Control flow statements like **if-else** and **loops** allow us to execute code based on conditions and to repeat tasks.

* *Conditional statements*: `if` checks a condition and executes a block if true, optionally followed by `elif` (else-if) and `else` for additional cases. For example:

In [None]:
mode = "Bus"
if mode == "Car":
    print("Driving a car")
elif mode == "Bus":
    print("Taking a bus")
else:
    print("Other mode")


In [None]:
def travel_advice(mode, weather):
    if mode == "bike":
        if weather == "rainy":
            return "It's rainy, consider taking public transport instead of biking."
        else:
            return "Great weather for biking!"
    elif mode == "car":
        return "Driving a car is comfortable."
    elif mode == "bus":
        return "Taking a bus is economical."
    else:
        return "Consider walking or other modes."   

In [None]:
# Test the function
print(travel_advice("bike", "sunny"))

In [None]:
print(travel_advice("bike", "rainy"))

In [None]:
print(travel_advice("car", "cloudy"))

* *Loops*: Python has *for* loops to iterate over items in a sequence, and *while* loops to repeat until a condition is false. For instance, to iterate over our modes list:

In [None]:
for m in modes:
    print("Mode option:", m)
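
The `while` loop mentioned above repeats as long as its condition holds. A minimal sketch with a made-up scenario (a hypothetical bus fare that rises 10% per year) follows; the variable names are illustrative.

```python
# Hypothetical fare that rises 10% per year; count the years until it exceeds £3.
bus_fare = 2.00
years = 0
while bus_fare <= 3.00:
    bus_fare *= 1.10      # apply the yearly increase
    years += 1
print(f"Fare exceeds £3 after {years} years (fare = £{bus_fare:.2f})")
```

Use `while` when the number of iterations is not known in advance; when iterating over a known collection, a `for` loop is usually clearer.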

In [None]:
def travel_advice(mode, weather):
    advice = {
        ("bike", "sunny"): "Great weather for biking!",
        ("bike", "rainy"): "It's rainy, consider taking public transport instead of biking.",
        ("car", "cloudy"): "Driving a car is comfortable.",
        ("bus", "sunny"): "Taking a bus is economical.",
    }
    return advice.get((mode, weather), "Consider walking or other modes.")

# Test the function
print(travel_advice("bike", "sunny"))
print(travel_advice("bike", "rainy"))
print(travel_advice("car", "cloudy"))
print(travel_advice("bus", "sunny"))



In [None]:
def calculate_total_cost(travel_data):
    total_cost = 0
    for data in travel_data:
        mode = data.get("mode")
        distance = data.get("distance", 0)
        cost = calculate_travel_cost(distance, mode)
        total_cost += cost
    return total_cost

def calculate_travel_cost(distance, mode):
    cost_per_km = {"car": 0.5, "bus": 0.2, "bike": 0.1}  # cost per km for each mode
    return distance * cost_per_km.get(mode, 0)

# Test the function
travel_data = [
    {"mode": "car", "distance": 100},
    {"mode": "bus", "distance": 50},
    {"mode": "bike", "distance": 20},
]

print("Total travel cost:", calculate_total_cost(travel_data))

We will use loops to iterate over records (like going through each individual or each alternative) and if-statements to apply logic (like availability checks: e.g., if a mode is not available, skip it).
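
The availability check described above could be sketched as follows. The availability flags are invented for illustration (Train is assumed unavailable), and the coefficients are the toy values from `compute_utility`.

```python
# Hypothetical availability flags; Train is assumed unavailable for this traveler.
available_demo = {"Car": True, "Bus": True, "Train": False}
times_by_mode = {"Car": 30, "Bus": 45, "Train": 40}   # minutes
costs_by_mode = {"Car": 5, "Bus": 2, "Train": 3}      # £

utilities_available = {}
for mode_name, is_available in available_demo.items():
    if not is_available:
        continue   # skip alternatives that are not in the choice set
    utilities_available[mode_name] = (
        -0.1 * times_by_mode[mode_name] - 0.5 * costs_by_mode[mode_name]
    )
print(utilities_available)
```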

In data analysis with Pandas or NumPy, explicit loops are often unnecessary because we can operate on whole vectors at once. It is still important to know how to write loops, especially for clarity in simple cases or when writing simulation logic.

## 01.5 How to Import and Use an External Python Script/Function

Let’s say you have a Python script (not a package) that you’d like to reuse without pasting all its code into your notebook. Using an external script keeps your notebook clean and organized for several reasons:

- **Readability**: notebooks stay focused on analysis or explanation, not cluttered with long function definitions.

- **Reusability**: the same code can be imported into multiple notebooks or projects without duplication.

- **Maintainability**: if you fix a bug or improve a function, you only have to do it once—in the script—rather than hunting through multiple notebooks.

- **Version control**: scripts are easier to track in Git or other version-control systems; notebooks mix code, output, and metadata, which makes diffing messy.

- **Testing**: standalone scripts or modules can be tested independently before you call them in your notebook.

Let's save the following code as `utility_fct.py` in the same folder as your notebook:

```python
def compute_utility_fct(time, cost):
    """Compute utility as a weighted sum of time and cost (toy example)."""
    beta_time = -0.1   # coefficient for travel time (per minute)
    beta_cost = -0.5   # coefficient for travel cost (per currency unit)
    utility = beta_time * time + beta_cost * cost
    return utility
```

In [None]:
# Now you can import and use it:

import utility_fct
u_car = utility_fct.compute_utility_fct(time=30, cost=5)
u_car

In [None]:
# Or, for a shorter syntax, import only the function(s) you need from the script:
from utility_fct import compute_utility_fct
u_bus = compute_utility_fct(time=45, cost=5)
u_bus
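
The testing benefit listed earlier can be illustrated with a few plain `assert` checks. This is only a sketch: the function is copied inline so the cell runs on its own (in practice you would import it from `utility_fct.py`), and `test_compute_utility_fct` is a hypothetical helper name.

```python
def compute_utility_fct(time, cost):
    """Inline copy of the function from utility_fct.py, so this cell runs on its own."""
    beta_time = -0.1
    beta_cost = -0.5
    return beta_time * time + beta_cost * cost

def test_compute_utility_fct():
    # Zero time and zero cost should give zero utility.
    assert compute_utility_fct(time=0, cost=0) == 0
    # 30 minutes and £5: -0.1 * 30 + -0.5 * 5 = -5.5 (compared with a tolerance).
    assert abs(compute_utility_fct(time=30, cost=5) - (-5.5)) < 1e-9
    # Utility should fall when either time or cost rises.
    assert compute_utility_fct(time=31, cost=5) < compute_utility_fct(time=30, cost=5)
    assert compute_utility_fct(time=30, cost=6) < compute_utility_fct(time=30, cost=5)

test_compute_utility_fct()
print("All checks passed")
```

Floating-point results are compared with a small tolerance rather than `==`, because sums of decimal fractions are not always represented exactly.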

In [None]:
#### Stop here ###

## 01.6 Example: A Toy Softmax Utility Model (Three Travel Modes)

Now that we've covered basics, let's apply them to a simple choice modeling scenario. We will create a toy example of a traveler choosing among three travel modes (Car, Bus, Train) for a trip, using a softmax function to model choice probabilities. This mimics a Multinomial Logit model where utilities are computed and then converted to choice probabilities via the softmax (logit) formula.

**Step 1: Define mode attributes and parameters.** For simplicity, assume:

- Travel times for Car, Bus, Train are 30, 45, 40 minutes respectively.

- Travel costs for Car, Bus, Train are £5, £2, £3 respectively.

- We will use fixed utility coefficients: $\beta_{time}= -0.1$ (per minute), $\beta_{cost} = -0.5$ (per £).

Let's set this up:

In [None]:
# Modes and their attributes
modes = ["Car", "Bus", "Train"]
time = {"Car": 30, "Bus": 45, "Train": 40}   # in minutes
cost = {"Car": 5, "Bus": 2, "Train": 3}      # in £

beta_time = -0.1   # coefficient for time
beta_cost = -0.5   # coefficient for cost

# Compute utilities for each mode
utility = {}
for m in modes:
    utility[m] = beta_time * time[m] + beta_cost * cost[m]
print("Utilities:", utility)


Interestingly, with these specific values, all three utilities came out equal (-5.5) in this toy example! Let's adjust one to see differences—suppose Train cost is £4 (instead of 3):

In [None]:
cost["Train"] = 4
for m in modes:
    utility[m] = beta_time * time[m] + beta_cost * cost[m]
print("Adjusted utilities:", utility)


So Train is slightly less preferred (utility -6.0) due to the higher cost, while Car and Bus remain at -5.5.

**Step 2: Convert utilities to choice probabilities using softmax.** The softmax formula for choice probability of mode $i$ is:
$$
P(i) = \frac{\exp{(U_i)}}{\sum_{j \in modes}\exp{(U_j)}}
$$

This formula ensures probabilities are positive and sum to 1. We'll implement this.

In [None]:
import math

# Compute softmax probabilities for each mode
exp_utilities = {m: math.exp(utility[m]) for m in modes}
sum_exp = sum(exp_utilities.values())
prob = {m: exp_utilities[m] / sum_exp for m in modes}
print("Choice probabilities:", prob)
print("Sum of probabilities:", sum(prob.values()))



Because the Car and Bus utilities are equal and Train's is lower, Car and Bus end up with the higher probabilities: about 38.4% each, versus about 23.3% for Train. If all three utilities were equal, each mode would get exactly one third. The softmax (logit) model captures the idea that modes with higher utility (higher attractiveness) have higher choice probability.
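
The same calculation can be wrapped in a reusable function. The sketch below is one possible version, with `softmax_dict` a hypothetical name. It subtracts the largest utility before exponentiating, a common numerical-stability trick that leaves the probabilities unchanged because softmax depends only on utility differences.

```python
import math

def softmax_dict(utilities):
    """Softmax over a dict of utilities, shifted by the maximum for numerical stability."""
    u_max = max(utilities.values())
    exp_shifted = {alt: math.exp(u - u_max) for alt, u in utilities.items()}
    total = sum(exp_shifted.values())
    return {alt: e / total for alt, e in exp_shifted.items()}

# Adjusted utilities from above (Train cost raised to £4).
probs_demo = softmax_dict({"Car": -5.5, "Bus": -5.5, "Train": -6.0})
print({alt: round(p, 3) for alt, p in probs_demo.items()})

# Adding the same constant (here 100) to every utility gives the same probabilities.
probs_shifted = softmax_dict({"Car": 94.5, "Bus": 94.5, "Train": 94.0})
print({alt: round(p, 3) for alt, p in probs_shifted.items()})
```

This shift invariance is why only differences in utility matter in a logit model: adding a constant to every alternative changes nothing.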

**Step 3: Interpret the results.** In our toy scenario, Car and Bus have the same utility, so a traveler is indifferent between them and they receive the same probability (about 38.4% each), while Train, with its lower utility, receives about 23.3%. If we made Car more attractive (say, by reducing Car's time or cost), its probability would increase. You can try modifying the inputs:

In [None]:
time["Car"] = 20  # Car becomes much faster
# Recompute utility and probabilities...
for m in modes:
    utility[m] = beta_time * time[m] + beta_cost * cost[m]
exp_utilities = {m: math.exp(utility[m]) for m in modes}
sum_exp = sum(exp_utilities.values())
prob = {m: exp_utilities[m] / sum_exp for m in modes}
print("Updated choice probabilities:", prob)
print("Sum of probabilities:", sum(prob.values()))

Compute the joint likelihood as the product of individual choice probabilities:

In [None]:
def likelihood_function(data, params):
    # Compute the joint likelihood as the product of individual choice probabilities
    likelihood = 1.0
    for entry in data:
        # Compute probability of observed choice given params
        prob = compute_choice_probability(entry, params)
        likelihood *= prob
    return likelihood

def compute_choice_probability(entry, params):
    # Utility and softmax helpers
    def utility_function(attributes, params):
        # attributes is expected to be a dict with keys "time" and "cost"
        return params["beta_time"] * attributes["time"] + params["beta_cost"] * attributes["cost"]

    def softmax(utilities):
        exp_utilities = {k: math.exp(v) for k, v in utilities.items()}
        sum_exp = sum(exp_utilities.values())
        return {k: v / sum_exp for k, v in exp_utilities.items()}

    attributes = entry["attributes"]

    # Expected format: attributes is a dict mapping each alternative to its own
    # attribute dict, e.g. {"Car": {"time":.., "cost":..}, "Bus": {...}, ...}.
    # A single attribute dict for the chosen alternative alone is rejected below,
    # because choice probabilities need the attributes of every alternative.
    if not isinstance(attributes, dict) or not attributes:
        raise ValueError("Entry 'attributes' must be a non-empty dict mapping alternatives to attribute dicts.")

    sample_value = next(iter(attributes.values()))
    if not isinstance(sample_value, dict):
        # If values are not dicts, the provided structure is not the expected per-alternative format.
        raise ValueError(
            "Each value in entry['attributes'] must be a dict with keys 'time' and 'cost'. "
            "Provide attributes as: {'Car': {'time':..,'cost':..}, 'Bus': {...}, ...}."
        )

    # Compute utilities for all alternatives (format 1)
    utilities = {alt: utility_function(attrs, params) for alt, attrs in attributes.items()}

    # Compute choice probabilities using softmax
    probabilities = softmax(utilities)

    # Return probability of the chosen alternative
    chosen = entry["choice"]
    return probabilities.get(chosen, 0.0)

# Example data and parameters (attributes provided per alternative)
data = [
    {
        "choice": "Car",
        "attributes": {
            "Car":   {"time": 30, "cost": 5},
            "Bus":   {"time": 45, "cost": 2},
            "Train": {"time": 40, "cost": 3}
        }
    },
    {
        "choice": "Bus",
        "attributes": {
            "Car":   {"time": 30, "cost": 5},
            "Bus":   {"time": 45, "cost": 2},
            "Train": {"time": 40, "cost": 3}
        }
    }
]

params = {"beta_time": -0.1, "beta_cost": -0.5}

# Compute likelihood
likelihood = likelihood_function(data, params)
print("Likelihood:", likelihood)




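### Estimate parameters by MLE

Run a maximum-likelihood optimizer that minimizes the negative log-likelihood of the observed choices. For numerical stability, the code below recomputes each observation's utilities directly and applies the log-sum-exp trick rather than calling `compute_choice_probability`; both approaches implement the same logit probabilities.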

What it does
- Converts per-observation choice probabilities to a joint negative log-likelihood.
- Uses SciPy's `minimize` (BFGS) to find the parameters that minimize the negative log-likelihood.
- Returns `np.inf` if an observed choice is not among that observation's alternatives, and uses log-sum-exp so that very small probabilities do not underflow to zero.

Checklist before running
- Ensure NumPy and SciPy are installed and that `data` (defined above) is in scope.
- Check `result.success` and `result.message` after optimization.
- For stability and scale, prefer the log-likelihood for larger datasets.

Quick diagnostics to run after optimization
- Print `result.x`, `result.fun`, and `result.success`.
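- Try different starting values to check robustness.
- Interpret the estimates from the two-observation toy `data` with caution: the three alternatives' (time, cost) pairs lie on a straight line, so the likelihood depends only on the combination $5\beta_{time} - \beta_{cost}$. The two coefficients are not separately identified, and the reported estimates and standard errors are illustrative only.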
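
To see why the log-likelihood is preferred for larger datasets, consider this small sketch. The setup is hypothetical: 2000 observations, each assumed to have a choice probability of 0.3.

```python
import math

# Hypothetical case: 2000 observations, each with choice probability 0.3.
probs_obs = [0.3] * 2000

# The raw product underflows to exactly 0.0 in double precision.
likelihood_product = math.prod(probs_obs)

# The sum of logs stays finite and is what an optimizer can work with.
log_likelihood = sum(math.log(p) for p in probs_obs)
print("Product of probabilities:", likelihood_product)
print("Log-likelihood:", log_likelihood)
```

Once the product has underflowed to zero, every parameter value looks equally bad to an optimizer, whereas the log-likelihood still distinguishes between them.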

In [None]:
import numpy as np
from scipy.optimize import minimize

def logsumexp(u):
    m = np.max(u)
    return m + np.log(np.sum(np.exp(u - m)))

def neg_log_likelihood(theta, data):
    beta_time, beta_cost = theta
    nll = 0.0
    for entry in data:
        attrs = entry["attributes"]
        alts = list(attrs.keys())
        utils = np.array([beta_time * attrs[a]["time"] + beta_cost * attrs[a]["cost"] for a in alts])
        try:
            chosen_idx = alts.index(entry["choice"])
        except ValueError:
            return np.inf
        nll += (logsumexp(utils) - utils[chosen_idx])
    return nll

# example optimization
init = np.array([0.0, 0.0])
res = minimize(neg_log_likelihood, x0=init, args=(data,), method="BFGS", options={"disp": True})
print("success:", res.success, res.message)
print("estimates:", res.x, "nll:", res.fun)

# approx SEs from inverse Hessian (when available)
if res.success and hasattr(res, "hess_inv"):
    cov = np.array(res.hess_inv)            # BFGS returns an object convertible to array
    se = np.sqrt(np.diag(cov))
    print("approx SEs:", se)

### Simulation-recovery test (validate your estimation)

Simulate synthetic choice data using specified true betas, then re-estimate the parameters on the simulated data to check recovery.

Purpose
- Confirms the estimator and likelihood are implemented correctly.
- Reveals issues with identification, numerical stability, or optimization.

Usage notes
- Choose a reasonably large N (e.g., 500–2000) to get stable recovery.
- Fix the RNG seed for reproducibility (`np.random.default_rng(seed)`).
- Compare recovered estimates to the true values and inspect standard errors / CI.
- If recovery fails, try more observations, different seeds, or multiple optimizer starts.

In [None]:
def simulate_data(N, modes, times, costs, beta_time, beta_cost, rng=None):
    rng = rng or np.random.default_rng(0)
    data = []
    for _ in range(N):
        utils = np.array([beta_time*times[m] + beta_cost*costs[m] for m in modes])
        probs = np.exp(utils - logsumexp(utils))
        choice = rng.choice(modes, p=probs)
        attrs = {m: {"time": times[m], "cost": costs[m]} for m in modes}
        data.append({"choice": choice, "attributes": attrs})
    return data

# simulate and recover
true = (-0.1, -0.5)
sim_data = simulate_data(1000, modes, time, cost, *true)
res_sim = minimize(neg_log_likelihood, x0=np.array([0.0,0.0]), args=(sim_data,), method="BFGS")
print("recovered:", res_sim.x, "true:", true)
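
Before estimating anything, a cheaper sanity check on a simulator is to compare simulated choice shares with the model probabilities. The following is a NumPy-only sketch under assumed utilities; all names ending in `_demo` and the seed 42 are illustrative choices, not part of the estimation code above.

```python
import numpy as np

rng_demo = np.random.default_rng(42)    # fixed seed for reproducibility

# Illustrative utilities for Car, Bus, Train (assumed values, not estimates).
modes_demo = ["Car", "Bus", "Train"]
utils_demo = np.array([-4.5, -5.5, -6.0])

# Softmax probabilities, shifted by the maximum for numerical stability.
exp_u = np.exp(utils_demo - utils_demo.max())
probs_model = exp_u / exp_u.sum()

# Draw 5000 choices (as indices) and compare observed shares with the model probabilities.
draws = rng_demo.choice(len(modes_demo), size=5000, p=probs_model)
shares = np.bincount(draws, minlength=len(modes_demo)) / draws.size
for name, p, s in zip(modes_demo, probs_model, shares):
    print(f"{name}: model {p:.3f}, simulated {s:.3f}")
```

If the simulated shares are far from the model probabilities, the problem lies in the simulator rather than in the optimizer.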

Make sure you understand each code snippet and run them yourself. Modify values or add print statements to explore (e.g., what happens if cost has a positive coefficient?). In choice modelling, getting the logic right (like correctly computing utility or probabilities) is crucial, and these basic Python skills are the foundation for more advanced analysis.

In the next notebook, we will introduce **NumPy** and **Pandas** in depth and start working with a real dataset: the Apollo mode choice data. Keep this fundamentals notebook in mind, as you will see lists, dicts, loops, and functions being used frequently in the context of data analysis and modeling.