# Python Course - Tutorial 10

### Exercise 1: Generator for a Simulated Price Path (Geometric Brownian Motion)

In empirical finance, asset prices are often modeled as stochastic processes. In this exercise, you will implement a generator that produces a simulated price path step by step.

Assume the price process follows **Geometric Brownian Motion (GBM)**. In discrete time with step size `dt`, one common simulation scheme is:

$$
S_{t+dt} = S_t \cdot \exp\left(\left(\mu - \frac{1}{2}\sigma^2\right)dt + \sigma \sqrt{dt}\, Z_t\right),
\quad Z_t \sim \mathcal{N}(0,1).
$$
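
To make the update rule concrete, here is a minimal sketch of a single step of the scheme, with the shock `z` passed in explicitly. The helper name `gbm_step` is an assumption made for illustration and is not part of the exercise interface. Two sanity checks follow directly from the formula: with `sigma = 0` the price grows deterministically at rate `mu`, and with `z = 0` only the drift term applies.

```python
import math


def gbm_step(S: float, mu: float, sigma: float, dt: float, z: float) -> float:
    """Apply one discrete GBM step to price S for a given standard-normal draw z.

    Hypothetical helper for illustration; the exercise itself asks for a generator.
    """
    drift = (mu - 0.5 * sigma**2) * dt      # deterministic part of the log-return
    diffusion = sigma * math.sqrt(dt) * z   # random part, scaled by sqrt(dt)
    return S * math.exp(drift + diffusion)


# With sigma = 0 the shock has no effect: S grows by exp(mu * dt).
no_vol = gbm_step(100.0, mu=0.05, sigma=0.0, dt=1.0, z=1.7)

# With z = 0 only the drift (including the -0.5 * sigma**2 correction) applies.
no_shock = gbm_step(100.0, mu=0.05, sigma=0.2, dt=1.0, z=0.0)

print(no_vol, no_shock)
```

Because the shock enters through an exponential, the simulated price stays positive for any value of `z`.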

Implement a generator function `gbm_prices(S0, mu, sigma, dt, n_steps)` that yields a stream of prices.

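1. Use `numpy.random.normal(0, 1)` to generate the normal shocks $Z_t$ (a seeded `numpy.random.default_rng(seed)` generator and its `.normal(0, 1)` method work equally well and make the path reproducible).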
2. The generator should start at `S0` and yield prices one-by-one using `yield`.
3. Yield tuples `(t, S_t)` where `t` is the current time (starting at `0.0`) and `S_t` is the current simulated price.
4. Demonstrate that your generator is lazy by:
   - creating the generator object,
   - calling `next()` a few times manually,
   - and then iterating over the remaining values in a `for` loop.
5. Using the stream of generated prices, compute the **running mean** of the simulated price (online, without storing all prices in a list). Print the final running mean after the simulation ends.

**Do not store the full simulated path in memory.**
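
Laziness and the online mean can both be tried out on a toy generator before touching GBM. The sketch below uses a hypothetical generator `squares` (unrelated to the exercise) that prints a message when its body starts running, and it updates the mean with the recurrence $\bar{x}_n = \bar{x}_{n-1} + (x_n - \bar{x}_{n-1}) / n$, so no list of values is ever built.

```python
def squares(n: int):
    """Toy generator (hypothetical, unrelated to GBM) that logs when its body runs."""
    print("generator body started")
    for i in range(1, n + 1):
        yield float(i * i)


gen = squares(5)          # nothing is printed yet: the body has not started
first = next(gen)         # the message is printed now, and first is 1.0

count = 0
running_mean = 0.0
for x in gen:             # consumes only the remaining values 4.0, 9.0, 16.0, 25.0
    count += 1
    running_mean += (x - running_mean) / count  # online update, O(1) memory

print(first, count, running_mean)
```

Note that a `for` loop started after manual `next()` calls only sees the values that have not been consumed yet, so any running statistic computed in that loop excludes the manually pulled values.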


In [None]:
import numpy as np


def gbm_prices(S0: float, mu: float, sigma: float, dt: float, n_steps: int, seed: int | None = None):
    # Yield (t, S_t) from a Geometric Brownian Motion simulated step-by-step.
    if n_steps < 0:
        raise ValueError("n_steps must be non-negative.")
    if dt <= 0:
        raise ValueError("dt must be positive.")
    if sigma < 0:
        raise ValueError("sigma must be non-negative.")

    rng = np.random.default_rng(seed)

    t = 0.0
    S = float(S0)

    yield t, S  # include initial state

    drift = (mu - 0.5 * sigma**2) * dt
    vol = sigma * np.sqrt(dt)

    for _ in range(n_steps):
        z = rng.normal(0.0, 1.0)
        S *= np.exp(drift + vol * z)
        t += dt
        yield t, S


# Create the generator (nothing is simulated until you iterate).
# Note: the argument checks inside gbm_prices also run only on the first next() call.
gen = gbm_prices(S0=100.0, mu=0.05, sigma=0.2, dt=1 / 252, n_steps=252, seed=42)

# Pull a few values manually (proves streaming / laziness).
print(next(gen))
print(next(gen))
print(next(gen))

# Continue consuming the remaining stream and compute a running mean online.
count = 0
running_mean = 0.0

for _, price in gen:
    count += 1
    running_mean += (price - running_mean) / count  # stable online update

print("Final running mean (excluding the first 3 printed values):", running_mean)


### Exercise 2: Type Hinting 

Your task is to **add type hints** to the code in the next two cells (one basic, one slightly harder).  
Do not change the logic. Only add annotations and any necessary imports from `typing`.

Useful references (Official Python Documentation):
- [typing — Support for type hints](https://docs.python.org/3/library/typing.html)
- [Built-in generic types (`list[int]`, `dict[str, float]`, ...)](https://docs.python.org/3/library/stdtypes.html)

What you are expected to use:
- `Union` types via `A | B` (or `Union[A, B]`)
- `Any` for values where you cannot be precise
- Container types like `list[...]`, `dict[..., ...]`, `tuple[...]`
- `Callable[...]` for function arguments
- `None` types via `T | None`

**Exercise:** Add type hints (and required imports) to the functions in the two code cells below.

In [None]:
# Exercise 2.1

def format_value(x, decimals=2, missing="NA"):
    # Solution (type hints):
    # def format_value(x: int | float | None, decimals: int = 2, missing: str = "NA") -> str:
    # (`decimals` must be an int: it is used as the precision in the format spec.)

    if x is None:
        return missing
    if isinstance(x, (int, float)):
        return f"{x:.{decimals}f}"
    return str(x)


def mean(values):
    # Solution (type hints):
    # def mean(values: list[int | float]) -> float:

    if not values:
        raise ValueError("values must not be empty")
    return sum(values) / len(values)

In [None]:
# Exercise 2.2

def parse_observation(row):
    # Solution (type hints):
    # from typing import Any
    # ObsRow = dict[str, Any] | tuple[str, float]
    # Obs = dict[str, float | str]
    # def parse_observation(row: ObsRow) -> Obs:

    if isinstance(row, dict):
        date = row["date"]
        value = row["value"]
    else:
        date, value = row

    return {"date": str(date), "value": float(value)}


def filter_and_transform(rows, predicate, transform=None):
    # Solution (type hints):
    # from typing import Any, Callable, Iterable
    # ObsRow = dict[str, Any] | tuple[str, float]
    # Obs = dict[str, float | str]
    # def filter_and_transform(
    #     rows: Iterable[ObsRow],
    #     predicate: Callable[[Obs], bool],
    #     transform: Callable[[Obs], float] | None = None,
    # ) -> list[float] | list[Obs]:

    out = []
    for r in rows:
        obs = parse_observation(r)
        if predicate(obs):
            out.append(transform(obs) if transform is not None else obs)
    return out


### Exercise 3: Baseball Analytics (Sabermetrics)

You have just started an internship in a baseball analytics group. The team you support works in the "Moneyball spirit": using data-driven methods to evaluate players and make decisions under uncertainty and limited budgets.   
To get you onboarded, your supervisor asks you to build a small **object-oriented** analysis system that stores season totals and computes a few standard sabermetrics-style indicators.

Teams like the **Los Angeles Dodgers** are often highlighted as modern, analytics-driven organizations. Your job here is not to "predict the World Series", but to build clean tooling that makes analysis reproducible and easy to extend.

**Further reading (optional):**
- Sabermetrics overview and metric definitions: [FanGraphs Sabermetrics Library](https://library.fangraphs.com/)
- OPS definition (used for ranking): [FanGraphs: OPS](https://library.fangraphs.com/offense/ops/)
- Moneyball background: [Michael Lewis (official site)](https://www.michaellewiswrites.com/) and [Moneyball (book)](https://en.wikipedia.org/wiki/Moneyball:_The_Art_of_Winning_an_Unfair_Game)


#### Background: batting totals and simple metrics (needed for this exercise)

You will work with **season totals** (aggregated counts). The following abbreviations are used:

- `AB` (At-Bats): number of official batting attempts  
- `H` (Hits): number of times the player gets a hit  
- `2B` (Doubles): hits where the batter reaches second base  
- `3B` (Triples): hits where the batter reaches third base  
- `HR` (Home Runs): hits where the batter scores directly  
- `BB` (Walks): batter reaches first base due to four balls  
- `HBP` (Hit By Pitch): batter is awarded first base after being hit  
- `SF` (Sacrifice Flies): fly-ball outs on which a runner scores (counted in the OBP denominator)

From these totals, compute the following **performance metrics**:

- **Batting Average (BA)**  
  $$
  BA = \frac{H}{AB}
  $$

- **On-Base Percentage (OBP)**  
  $$
  OBP = \frac{H + BB + HBP}{AB + BB + HBP + SF}
  $$

- **Slugging Percentage (SLG)**  
  First compute singles:  
  $$
  1B = H - 2B - 3B - HR
  $$  
  Then total bases:  
  $$
  TB = 1B + 2\cdot 2B + 3\cdot 3B + 4\cdot HR
  $$  
  And:
  $$
  SLG = \frac{TB}{AB}
  $$

- **OPS (On-base Plus Slugging)**  
  $$
  OPS = OBP + SLG
  $$

If a denominator is zero, return `0.0` for that metric.
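
The formulas above can be checked by hand on one set of totals. The following sketch uses plain functions rather than classes; the names `safe_div` and `batting_metrics` and the totals themselves are assumptions chosen for illustration.

```python
def safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den != 0 else 0.0


def batting_metrics(t: dict[str, int]) -> dict[str, float]:
    """Compute BA, OBP, SLG and OPS from a dict of season totals."""
    singles = t["H"] - t["2B"] - t["3B"] - t["HR"]
    total_bases = singles + 2 * t["2B"] + 3 * t["3B"] + 4 * t["HR"]
    ba = safe_div(t["H"], t["AB"])
    obp = safe_div(t["H"] + t["BB"] + t["HBP"], t["AB"] + t["BB"] + t["HBP"] + t["SF"])
    slg = safe_div(total_bases, t["AB"])
    return {"BA": ba, "OBP": obp, "SLG": slg, "OPS": obp + slg}


# Illustrative totals only.
totals = {"AB": 400, "H": 120, "2B": 25, "3B": 4, "HR": 20, "BB": 55, "HBP": 2, "SF": 5}
metrics = batting_metrics(totals)

# By hand: singles = 120 - 25 - 4 - 20 = 71, total bases = 71 + 50 + 12 + 80 = 213,
# so BA = 120/400, OBP = 177/462 and SLG = 213/400.
print({name: round(value, 3) for name, value in metrics.items()})
```

A player with no recorded at-bats gets `0.0` for every metric, which matches the zero-denominator rule.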

#### Tasks

##### 1. Create a `Person` class
- Attributes: `name`, `age`
- Method to display basic information
- Implement `__str__` for readable output

##### 2. Create a `Player` class (inherits from `Person`)
- Attributes: `player_id`, `position`
- Maintain a **private** stats container holding batting totals (counts such as `AB`, `H`, `BB`, `HR`, ...)
- Provide access to the stats through a `@property` (read-only; do not expose the private container directly)
- Implement a method that updates season totals by adding new game totals (counts)
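
One possible way to combine a private container with a read-only property is sketched below on a hypothetical `Inventory` class (not part of the exercise). The double-underscore attribute is name-mangled, the property hands out a copy, and no setter is defined.

```python
class Inventory:
    """Hypothetical class showing a private dict exposed through a read-only property."""

    def __init__(self) -> None:
        self.__counts: dict[str, int] = {"apples": 0, "pears": 0}  # name-mangled attribute

    @property
    def counts(self) -> dict[str, int]:
        # Hand out a copy; changes to it never reach the private dict.
        return dict(self.__counts)

    def add(self, **increments: int) -> None:
        for key, value in increments.items():
            if key not in self.__counts:
                raise KeyError(f"Unknown item '{key}'")
            self.__counts[key] += value


inv = Inventory()
inv.add(apples=3)

snapshot = inv.counts
snapshot["apples"] = 999          # modifies the copy only

try:
    inv.counts = {}               # no setter defined, so assignment fails
except AttributeError:
    assignment_blocked = True
else:
    assignment_blocked = False

print(inv.counts, assignment_blocked)
```

Returning a copy costs a small allocation per access, which is usually acceptable for a handful of counters.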

##### 3. Implement player performance metrics (computed properties)
Implement computed properties for:
- Batting Average (BA)
- On-Base Percentage (OBP)
- Slugging Percentage (SLG)
- OPS = OBP + SLG

##### 4. Compare players by performance
- Implement comparison so players can be sorted by performance (use OPS as the default ranking)
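
`sorted()` only needs `__lt__`, so defining that one method is enough for ranking. The sketch below, on a hypothetical `Scored` class, additionally uses `functools.total_ordering` to derive `<=`, `>`, and `>=`; this is one option, not a requirement of the exercise, and here equality is assumed to use the same key as the ordering.

```python
from functools import total_ordering


@total_ordering
class Scored:
    """Hypothetical class ordered by a numeric score."""

    def __init__(self, label: str, score: float) -> None:
        self.label = label
        self.score = score

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Scored):
            return NotImplemented
        return self.score == other.score

    def __lt__(self, other: "Scored") -> bool:
        if not isinstance(other, Scored):
            return NotImplemented
        return self.score < other.score


items = [Scored("a", 0.71), Scored("b", 0.93), Scored("c", 0.85)]
ranked = [s.label for s in sorted(items, reverse=True)]   # best score first
print(ranked, items[0] <= items[1])
```

Returning `NotImplemented` for foreign types lets Python raise a `TypeError` on comparisons such as `Scored("a", 1.0) < 3` instead of silently returning a wrong answer.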

##### 5. Create a `Team` class
- Attributes: `name` and a container of `players`
- Methods:
  - add players
  - list players
  - return top-n players by OPS
- Implement a team-level metric (e.g. average OPS across players)

##### 6. Create an `Analyst` class (inherits from `Person`)
- Attributes: `analyst_id`, `department`
- Method: produce a ranking report for a given team (top players + team summary)

##### 7. Test scenario
Create:
- one team
- one analyst
- multiple players with different season totals

Then:
- add players to the team
- update at least one player with additional stats
- print a readable report and show player sorting works


In [None]:
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Person:
    name: str
    age: int

    def display(self) -> None:
        print(f"Name: {self.name}, Age: {self.age}")

    def __str__(self) -> str:
        return f"{self.name} ({self.age})"


class Player(Person):
    """
    Stores batting totals and computes simple sabermetric-style metrics.

    Stats used (counts):
      AB  = at-bats
      H   = hits
      2B  = doubles
      3B  = triples
      HR  = home runs
      BB  = walks
      HBP = hit by pitch
      SF  = sac flies
    """

    def __init__(self, name: str, age: int, player_id: str, position: str):
        super().__init__(name, age)
        self.player_id = player_id
        self.position = position
        self.__stats: Dict[str, int] = {
            "AB": 0,
            "H": 0,
            "2B": 0,
            "3B": 0,
            "HR": 0,
            "BB": 0,
            "HBP": 0,
            "SF": 0,
        }

    @property
    def stats(self) -> Dict[str, int]:
        # Return a copy so callers cannot mutate the private container.
        return dict(self.__stats)

    def add_totals(self, **game_totals: int) -> None:
        # Adds counts (e.g., AB=4, H=2, BB=1). Missing keys are treated as 0.
        # Work on a copy so that a failed check leaves the stored totals unchanged.
        updated = dict(self.__stats)
        for k, v in game_totals.items():
            if k not in updated:
                raise KeyError(f"Unknown stat '{k}'. Allowed: {sorted(updated)}")
            if v < 0:
                raise ValueError("Stat increments must be non-negative.")
            updated[k] += int(v)

        # Basic integrity checks (run before committing the new totals)
        if updated["H"] > updated["AB"]:
            raise ValueError("Invalid totals: hits cannot exceed at-bats.")
        if updated["2B"] + updated["3B"] + updated["HR"] > updated["H"]:
            raise ValueError("Invalid totals: extra-base hits cannot exceed hits.")

        self.__stats = updated

    def __str__(self) -> str:
        return f"{self.name} [{self.player_id}] - {self.position} | OPS={self.ops:.3f}"

    def _safe_div(self, num: float, den: float) -> float:
        return num / den if den != 0 else 0.0

    @property
    def ba(self) -> float:
        s = self.__stats
        return self._safe_div(s["H"], s["AB"])

    @property
    def obp(self) -> float:
        s = self.__stats
        # OBP = (H + BB + HBP) / (AB + BB + HBP + SF)
        return self._safe_div(s["H"] + s["BB"] + s["HBP"], s["AB"] + s["BB"] + s["HBP"] + s["SF"])

    @property
    def slg(self) -> float:
        s = self.__stats
        singles = s["H"] - s["2B"] - s["3B"] - s["HR"]
        total_bases = singles + 2 * s["2B"] + 3 * s["3B"] + 4 * s["HR"]
        return self._safe_div(total_bases, s["AB"])

    @property
    def ops(self) -> float:
        return self.obp + self.slg

    # Compare players by OPS by default
    def __lt__(self, other: "Player") -> bool:
        if not isinstance(other, Player):
            return NotImplemented
        return self.ops < other.ops

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Player):
            return False
        return self.player_id == other.player_id


class Team:
    def __init__(self, name: str):
        self.name = name
        self._players: List[Player] = []

    @property
    def players(self) -> Tuple["Player", ...]:
        # Expose an immutable view
        return tuple(self._players)

    def add_player(self, player: Player) -> None:
        if any(p.player_id == player.player_id for p in self._players):
            raise ValueError(f"Player with id '{player.player_id}' already on team.")
        self._players.append(player)

    def top_players(self, n: int = 3) -> List[Player]:
        return sorted(self._players, reverse=True)[:n]

    def average_ops(self) -> float:
        if not self._players:
            return 0.0
        return sum(p.ops for p in self._players) / len(self._players)

    def __str__(self) -> str:
        return f"Team {self.name} (players={len(self._players)})"


class Analyst(Person):
    def __init__(self, name: str, age: int, analyst_id: str, department: str):
        super().__init__(name, age)
        self.analyst_id = analyst_id
        self.department = department

    def team_report(self, team: Team, top_n: int = 3) -> str:
        lines = []
        lines.append(f"Analyst: {self.name} [{self.analyst_id}] ({self.department})")
        lines.append(f"{team}")
        lines.append(f"Team average OPS: {team.average_ops():.3f}")
        lines.append("")
        lines.append(f"Top {top_n} players by OPS:")
        for i, p in enumerate(team.top_players(top_n), start=1):
            s = p.stats
            lines.append(
                f"{i}. {p.name:18s} OPS={p.ops:.3f}  "
                f"(BA={p.ba:.3f}, OBP={p.obp:.3f}, SLG={p.slg:.3f})  "
                f"AB={s['AB']}, H={s['H']}, HR={s['HR']}, BB={s['BB']}"
            )
        return "\n".join(lines)


# --- Test scenario ---
team = Team("Internship Dodgers Project")
analyst = Analyst("Felix", 26, analyst_id="A-019", department="Baseball Analytics")

p1 = Player("Player One", 28, player_id="P001", position="OF")
p2 = Player("Player Two", 30, player_id="P002", position="1B")
p3 = Player("Player Three", 24, player_id="P003", position="SS")

# Season totals (illustrative)
p1.add_totals(AB=400, H=120, **{"2B": 25, "3B": 4, "HR": 20}, BB=55, HBP=2, SF=5)
p2.add_totals(AB=380, H=95,  **{"2B": 18, "3B": 1, "HR": 28}, BB=70, HBP=1, SF=6)
p3.add_totals(AB=420, H=140, **{"2B": 30, "3B": 6, "HR": 12}, BB=40, HBP=3, SF=4)

team.add_player(p1)
team.add_player(p2)
team.add_player(p3)

# Update one player with another "game"
p2.add_totals(AB=4, H=2, **{"2B": 1, "3B": 0, "HR": 0}, BB=1, HBP=0, SF=0)

print(analyst.team_report(team, top_n=3))
print("\nSorted players (descending OPS):")
for p in sorted(team.players, reverse=True):
    print(" ", p)


### Exercise 4: Formula 1 Data Science

You have been offered a short internship in the data science unit of a Formula 1 team. Your group supports race-weekend decisions by combining telemetry, lap-time data, and simulation-based strategy evaluation. Your job is to build a small **object-oriented** system that stores lap times for multiple drivers and produces simple performance summaries.

This is inspired by how modern teams operate at scale: data engineering + modeling + fast reporting for engineers and strategists. A high-profile example is **Oracle Red Bull Racing**, where analytics and simulation play a central role in race preparation and strategy.

**Further reading (optional):**
- “Turning data into decisions” (Oracle Red Bull Racing): [Oracle blog](https://blogs.oracle.com/connect/oracle-red-bull-racing-turns-data-into-decisions)
- Hands-on (optional): [Oracle LiveLabs workshop](https://livelabs.oracle.com/ords/r/dbpm/livelabs/view-workshop?wid=909)

One core task in race analysis is turning raw lap time sequences into decision-ready metrics (pace and consistency). Your goal here is not realism, but clean, reusable code structure that matches how analysis tooling is organized.

#### Tasks

##### 1. Create a `Person` class
- Attributes: `name`, `age`
- Implement `__str__`

##### 2. Create a `Driver` class (inherits from `Person`)
- Attributes: `driver_id`, `team_name`
- Maintain a **private** container of lap times (seconds)
- Provide access through a `@property` (do not expose the private container directly)
- Method to add lap times

##### 3. Implement driver metrics (computed properties)
Implement computed properties for:
- Best lap time
- Average lap time
- A simple consistency metric (e.g. standard deviation of lap times)

##### 4. Compare drivers by performance
- Implement comparisons so drivers can be sorted by performance (lower average lap time is better)

##### 5. Create a `Session` class
- Attributes: `name` (e.g. `"FP1"`), `track`, and a container of drivers
- Methods:
  - register drivers
  - record laps for a driver
  - compute a leaderboard

##### 6. Create an `Engineer` class (inherits from `Person`)
- Attributes: `employee_id`, `area` (e.g. `"Performance"`)
- Method: produce a session report (leaderboard + summary stats)

##### 7. Test scenario
Create:
- one session with a track name
- one engineer
- multiple drivers with lap times

Then:
- record laps (for each driver)


In [None]:
from dataclasses import dataclass
from math import sqrt
from typing import Dict, List, Tuple


@dataclass
class Person:
    name: str
    age: int

    def __str__(self) -> str:
        return f"{self.name} ({self.age})"


class Driver(Person):
    def __init__(self, name: str, age: int, driver_id: str, team_name: str):
        super().__init__(name, age)
        self.driver_id = driver_id
        self.team_name = team_name
        self.__laps: List[float] = []

    @property
    def lap_times(self) -> Tuple[float, ...]:
        return tuple(self.__laps)

    def add_lap(self, lap_time_seconds: float) -> None:
        if lap_time_seconds <= 0:
            raise ValueError("Lap time must be positive.")
        self.__laps.append(float(lap_time_seconds))

    @property
    def best_lap(self) -> float:
        return min(self.__laps) if self.__laps else float("inf")

    @property
    def avg_lap(self) -> float:
        return sum(self.__laps) / len(self.__laps) if self.__laps else float("inf")

    @property
    def consistency(self) -> float:
        # Population standard deviation (simple and sufficient for this exercise)
        if not self.__laps:
            return float("inf")
        m = self.avg_lap
        var = sum((x - m) ** 2 for x in self.__laps) / len(self.__laps)
        return sqrt(var)

    def __str__(self) -> str:
        return f"{self.name} [{self.driver_id}] {self.team_name} | avg={self.avg_lap:.3f}s"

    # Compare drivers by avg lap time (lower is better)
    def __lt__(self, other: "Driver") -> bool:
        if not isinstance(other, Driver):
            return NotImplemented
        return self.avg_lap < other.avg_lap

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Driver):
            return False
        return self.driver_id == other.driver_id


class Session:
    def __init__(self, name: str, track: str):
        self.name = name
        self.track = track
        self._drivers: Dict[str, Driver] = {}

    @property
    def drivers(self) -> Tuple[Driver, ...]:
        return tuple(self._drivers.values())

    def register_driver(self, driver: Driver) -> None:
        if driver.driver_id in self._drivers:
            raise ValueError(f"Driver '{driver.driver_id}' already registered.")
        self._drivers[driver.driver_id] = driver

    def record_lap(self, driver_id: str, lap_time_seconds: float) -> None:
        if driver_id not in self._drivers:
            raise KeyError(f"Unknown driver_id '{driver_id}'.")
        self._drivers[driver_id].add_lap(lap_time_seconds)

    def leaderboard(self) -> List[Driver]:
        # Sort by avg lap time (lower is better)
        return sorted(self._drivers.values())

    def __str__(self) -> str:
        return f"Session {self.name} @ {self.track} (drivers={len(self._drivers)})"


class Engineer(Person):
    def __init__(self, name: str, age: int, employee_id: str, area: str):
        super().__init__(name, age)
        self.employee_id = employee_id
        self.area = area

    def session_report(self, session: Session) -> str:
        lines = []
        lines.append(f"Engineer: {self.name} [{self.employee_id}] ({self.area})")
        lines.append(str(session))
        lines.append("")
        lines.append("Leaderboard (by average lap time):")
        for i, d in enumerate(session.leaderboard(), start=1):
            laps = d.lap_times
            lines.append(
                f"{i}. {d.name:16s} team={d.team_name:10s} "
                f"avg={d.avg_lap:.3f}s best={d.best_lap:.3f}s cons={d.consistency:.3f}s "
                f"(laps={len(laps)})"
            )
        return "\n".join(lines)


# --- Test scenario ---
session = Session("FP1", "Silverstone")
engineer = Engineer("Felix", 26, employee_id="E-044", area="Performance")

d1 = Driver("Driver A", 27, driver_id="44", team_name="Team X")
d2 = Driver("Driver B", 25, driver_id="16", team_name="Team Y")
d3 = Driver("Driver C", 29, driver_id="01", team_name="Team X")

for d in (d1, d2, d3):
    session.register_driver(d)

# Record some lap times (seconds)
for t in (88.120, 87.950, 88.010, 87.990):
    session.record_lap("44", t)

for t in (88.400, 88.220, 88.310, 88.260):
    session.record_lap("16", t)

for t in (87.880, 87.920, 87.910, 87.860):
    session.record_lap("01", t)

print(engineer.session_report(session))

print("\nSorted drivers (by avg lap time):")
for d in sorted(session.drivers):
    print(" ", d)


### Optional Exercise: OOP Practice with `turtle` 

In the lecture, we introduced the core OOP ideas you will repeatedly use in applied work: **classes**, **objects**, **state**, and **methods**. In empirical projects, these ideas help you structure code so that it is readable, reusable, and easier to test.

For an optional, hands-on way to practice, Python’s built-in `turtle` library provides immediate visual feedback: when an object’s state changes (position, heading, pen state), you can see the result instantly.   
This makes it a useful sandbox for building intuition about how methods operate on object state.

**Official references:**
- [turtle — Turtle graphics (Python docs)](https://docs.python.org/3/library/turtle.html)
- [Turtle methods (`forward`, `left`, `penup`, `pendown`, ...)](https://docs.python.org/3/library/turtle.html#turtle-methods)

What to focus on (application of the OOP basics):
- Define a class that *encapsulates* a turtle object and exposes higher-level methods.
- Use instance attributes to represent **state** (e.g., current position, step counter).
- Separate responsibilities: one object draws, another controls logic (optional).

Possible small extensions:
- Add a `steps` counter and a method that moves the turtle and updates the counter.
- Add a method `draw_square(size)` that draws a square starting at the turtle’s current position.
- Add a method `random_walk(n_steps, step_length)` that performs a random walk (left/right turns), while tracking turns.
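
The `draw_square` extension can be sketched without opening a graphics window by letting the robot wrap any pen-like object that offers `forward()` and `left()`. In the sketch below, `RecordingPen` is a hypothetical stand-in that only records calls, and `SquareRobot` is an illustrative name; passing a real `turtle.Turtle()` instead would draw on screen.

```python
class RecordingPen:
    """Stand-in for a turtle (assumption for this sketch): records calls instead of drawing."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, float]] = []

    def forward(self, distance: float) -> None:
        self.calls.append(("forward", distance))

    def left(self, angle: float) -> None:
        self.calls.append(("left", angle))


class SquareRobot:
    """Wraps any pen-like object that offers forward() and left()."""

    def __init__(self, pen) -> None:
        self.pen = pen
        self.steps = 0
        self.turns = 0

    def draw_square(self, size: float) -> None:
        # Four sides, each followed by a 90-degree left turn, so the pen
        # ends at its starting position and heading.
        for _ in range(4):
            self.pen.forward(size)
            self.steps += 1
            self.pen.left(90)
            self.turns += 1


robot = SquareRobot(RecordingPen())   # pass turtle.Turtle() instead to draw on screen
robot.draw_square(50)
print(robot.steps, robot.turns, robot.pen.calls[:2])
```

Separating the drawing object from the control logic in this way also makes the counters easy to test, since no window is needed to check them.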

In [None]:
# Starting Point
import turtle
import random

class TurtleRobot:
    def __init__(self, step_length=20):
        self.t = turtle.Turtle()
        self.t.shape("turtle")
        self.t.speed(0)
        self.step_length = step_length
        self.steps = 0
        self.turns = 0

    def step(self):
        self.t.forward(self.step_length)
        self.steps += 1

    def turn_left(self):
        self.t.left(90)
        self.turns += 1

    def turn_right(self):
        self.t.right(90)
        self.turns += 1

    def random_walk(self, n_steps=50):
        for _ in range(n_steps):
            if random.random() < 0.5:
                self.turn_left()
            else:
                self.turn_right()
            self.step()

screen = turtle.Screen()
screen.title("TurtleRobot Demo")

robot = TurtleRobot(step_length=25)
robot.random_walk(n_steps=60)

print("Steps:", robot.steps)
print("Turns:", robot.turns)

screen.mainloop()