# Session 2: OOP


## When Should You Use OOP?

Use classes when you want:
- **data + behavior** together
- constraints/invariants (e.g., "age must be between 0 and 200")
- a reusable abstraction with a clear interface

Don't force OOP when:
- a dict is enough
- you have only one function using the data
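
As a rough illustration of that trade-off, the sketch below uses a plain dict for passive data and a small class once an invariant has to hold. The names `point` and `Percentage` are hypothetical and are not used elsewhere in this session.

```python
# A plain dict is enough when the data is passive and used in one place.
point = {"x": 3, "y": 4}
distance = (point["x"] ** 2 + point["y"] ** 2) ** 0.5


# A class earns its keep once an invariant must hold for every instance.
class Percentage:
    """Hypothetical example: a value that must stay within 0 to 100."""

    def __init__(self, value: float) -> None:
        if not 0 <= value <= 100:
            raise ValueError("value must be between 0 and 100")
        self.value = value

    def as_fraction(self) -> float:
        # Behavior lives next to the data it depends on.
        return self.value / 100


print(distance)
print(Percentage(25).as_fraction())
```

With the dict, nothing stops another part of the program from storing an invalid value; the class rejects it at construction time.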


## A Minimal Class

Key idea:
- `__init__` runs when you create the object
- `self` is the object being created/used


In [None]:
class Person:
    def __init__(self, name: str, age: int) -> None:
        self.name = name
        self.age = age

    def greet(self) -> str:
        return f"Hi, I'm {self.name}"

# Using the class
p = Person("Sara Ahmed", 23)
print(p.name)
print(p.greet())
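
To make the role of `self` more concrete, here is a minimal sketch with a hypothetical `Counter` class (not part of the exercises). It shows that each object keeps its own state and that calling a method on an instance is the same as calling it on the class with the instance passed explicitly.

```python
class Counter:
    """Hypothetical example class, used only to illustrate `self`."""

    def __init__(self, start: int = 0) -> None:
        # `self` is the new object; each instance gets its own `count`.
        self.count = start

    def increment(self) -> int:
        self.count += 1
        return self.count


a = Counter()
b = Counter(10)

a.increment()         # Python passes `a` as `self`
Counter.increment(b)  # the same kind of call, with `self` passed explicitly

print(a.count, b.count)
```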


## Exercise 1: Complete the Person Class

**Task:** Add the following to the `Person` class:
1. A `__repr__` method for nice printing
2. A `first_name` property (computed from `name`)
3. A `last_name` property (computed from `name`)
4. Age validation: age must be between 0 and 200

**Hint:** Use `@property` for computed attributes and `@age.setter` for validation.
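
If the decorator syntax is unfamiliar, the sketch below shows the same two patterns on a hypothetical `Temperature` class, so it does not give away the exercise: one property with a validating setter and one read-only computed property.

```python
class Temperature:
    """Hypothetical example class; only the pattern carries over to Person."""

    def __init__(self, celsius: float) -> None:
        self.celsius = celsius  # goes through the setter below

    @property
    def celsius(self) -> float:
        return self._celsius

    @celsius.setter
    def celsius(self, value: float) -> None:
        # Validation runs on every assignment, including the one in __init__.
        if value < -273.15:
            raise ValueError("temperature cannot be below absolute zero")
        self._celsius = value

    @property
    def fahrenheit(self) -> float:
        # Computed on access, never stored.
        return self._celsius * 9 / 5 + 32


t = Temperature(100.0)
print(t.fahrenheit)
```

Because `fahrenheit` has no setter, assigning to it raises `AttributeError`, which is usually what you want for a derived value.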


In [3]:
### CODE START HERE ###
class Person:
    def __init__(self, name: str, age: int) -> None:
        self.name = name
        self.age = age  # This should call the setter
    
    # Add __repr__ method here
    ...
    
    # Add first_name property here
    ...
    
    # Add last_name property here
    ...
    
    # Add age property with validation here
    ...
### CODE END HERE ###

# Test it
p = Person("Sara Ahmed", 23)
print(p)
print(f"First: {p.first_name}, Last: {p.last_name}")


<__main__.Person object at 0x79fc8d203ed0>


AttributeError: 'Person' object has no attribute 'first_name'

In [4]:
from tests import test_person_class
test_person_class(Person)

❌ Some tests failed. Here's what went wrong:

Test 1: __repr__ method
  Input: Person("Sara Ahmed", 23)
  Expected output: repr string containing "Person", "Sara Ahmed", and "23"
  Actual output: <__main__.Person object at 0x79fc8d3e6bd0>

Test 2: first_name property
  Input: Person("Sara Ahmed", 23).first_name
  Expected output: Sara
  Actual output: None
  Error: AttributeError: first_name property not implemented

Test 3: last_name property
  Input: Person("Sara Ahmed", 23).last_name
  Expected output: Ahmed
  Actual output: None
  Error: AttributeError: last_name property not implemented

Test 4: Age validation (too high)
  Input: Person("Test", 23); p.age = 300
  Expected output: ValueError: age must be between 0 and 200
  Actual output: No error raised (validation failed)

Test 5: Age validation (negative)
  Input: Person("Test", 23); p.age = -5
  Expected output: ValueError: age must be between 0 and 200
  Actual output: No error raised (validation failed)



In [None]:
# Solution
class Person:
    def __init__(self, name: str, age: int) -> None:
        self.name = name
        # Assigning to self.age goes through the age setter, which validates the value
        self.age = age

    def __repr__(self) -> str:
        # Use !r to get the repr of the string (adds quotes automatically)
        return f"Person(name={self.name!r}, age={self.age})"

    @property
    def first_name(self) -> str:
        # Split name and take the first part
        parts = self.name.split()
        if not parts:
            return ""
        return parts[0]

    @property
    def last_name(self) -> str:
        # Split name and take the last part (handles multiple middle names)
        parts = self.name.split()
        if not parts:
            return ""
        return parts[-1]

    @property
    def age(self) -> int:
        # Getter returns the underscore-prefixed _age attribute (private by convention)
        return self._age

    @age.setter
    def age(self, value: int) -> None:
        # Validator: age must be between 0 and 200 (inclusive)
        if not 0 <= value <= 200:
            raise ValueError("age must be between 0 and 200")
        # Store in _age, not age: assigning self.age here would call this setter again and recurse
        self._age = value

## Exercise 2: Build ColumnProfile Class

**Task:** Create a `ColumnProfile` class for our CSV profiler.

**Requirements:**
- `__init__`: takes `name`, `inferred_type`, `total`, `missing`, `unique`
- `missing_pct` property: returns percentage (0-100) of missing values
- `to_dict()` method: returns a dictionary with all fields
- `__repr__` method: for nice printing

**Checkpoint:** `missing_pct` returns a number between `0` and `100`.


In [5]:
### CODE START HERE ###
class ColumnProfile:
    def __init__(self, name: str, inferred_type: str, total: int, missing: int, unique: int) -> None:
        # Your code here
        ...
    
    @property
    def missing_pct(self) -> float:
        # Your code here
        ...
    
    def to_dict(self) -> dict[str, str | int | float]:
        # Your code here
        ...
    
    def __repr__(self) -> str:
        # Your code here
        ...
### CODE END HERE ###


In [None]:
from tests import test_column_profile_class
test_column_profile_class(ColumnProfile)

❌ Some tests failed. Here's what went wrong:

Test 1: Basic initialization
  Input: ColumnProfile("age", "number", 100, 5, 95)
  Expected output: name="age", total=100
  Actual output: None
  Error: AttributeError: 'ColumnProfile' object has no attribute 'name'

Test 2: missing_pct calculation
  Input: ColumnProfile("test", "text", 100, 10, 90).missing_pct
  Expected output: 10.0
  Actual output: None
  Error: TypeError: unsupported operand type(s) for -: 'NoneType' and 'float'

Test 3: missing_pct with zero total
  Input: ColumnProfile("test", "text", 0, 0, 0).missing_pct
  Expected output: 0.0
  Actual output: None

Test 4: to_dict method
  Input: ColumnProfile("age", "number", 100, 5, 95).to_dict()
  Expected output: dict type
  Actual output: NoneType type

Test 5: __repr__ method
  Input: repr(ColumnProfile("age", "number", 100, 5, 95))
  Expected output: repr string containing "ColumnProfile" and "age"
  Actual output: None
  Error: TypeError: __repr__ retu

In [None]:
# Solution
class ColumnProfile:
    def __init__(self, name: str, inferred_type: str, total: int, missing: int, unique: int) -> None:
        # Store all the basic column statistics
        self.name = name
        self.inferred_type = inferred_type
        self.total = total
        self.missing = missing
        self.unique = unique

    @property
    def missing_pct(self) -> float:
        # Avoid division by zero: if total is 0, return 0.0
        # Otherwise calculate percentage: (missing / total) * 100
        return 0.0 if self.total == 0 else 100.0 * self.missing / self.total

    def to_dict(self) -> dict[str, str | int | float]:
        # Convert the column profile to a dictionary for JSON serialization
        # Note: "type" key maps to inferred_type attribute
        return {
            "name": self.name,
            "type": self.inferred_type,
            "total": self.total,
            "missing": self.missing,
            "missing_pct": self.missing_pct,  # Property is computed, not stored
            "unique": self.unique,
        }

    def __repr__(self) -> str:
        # Adjacent f-strings are joined into one single-line string; !r adds quotes around text fields
        return (
            f"ColumnProfile(name={self.name!r}, type={self.inferred_type!r}, "
            f"missing={self.missing}, total={self.total}, unique={self.unique})"
        )
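
The `to_dict()` comment mentions JSON serialization. A minimal sketch of that step follows, assuming dicts in the shape returned by the solution's `to_dict()`. The sample values and the top-level `"columns"` key are made up for illustration.

```python
import json

# Dicts in the shape that to_dict() returns in the solution above.
# The numbers are made-up sample values, not real profiling results.
profiles = [
    {"name": "age", "type": "number", "total": 100,
     "missing": 5, "missing_pct": 5.0, "unique": 95},
    {"name": "city", "type": "text", "total": 100,
     "missing": 10, "missing_pct": 10.0, "unique": 12},
]

# Hypothetical report layout: a single "columns" key holding all profiles.
report = json.dumps({"columns": profiles}, indent=2)
print(report)
```

In the notebook you would build the list with something like `[p.to_dict() for p in column_profiles]`, where `column_profiles` is whatever list of `ColumnProfile` objects your profiler produces.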

## Inheritance: Reuse Behavior

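Inheritance lets a new class build on an existing one: the subclass gets the parent's attributes and methods and only adds or overrides what differs. Below, `Employee` and `Student` reuse `Person.__init__` through `super().__init__` and inherit `greet()` unchanged.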


In [None]:
class Employee(Person):
    def __init__(self, name: str, age: int, salary: float) -> None:
        super().__init__(name, age)
        self.salary = salary

class Student(Person):
    def __init__(self, name: str, age: int, grades: list[float]) -> None:
        super().__init__(name, age)
        self.grades = grades

    @property
    def average(self) -> float:
        if not self.grades:
            return 0.0
        return sum(self.grades) / len(self.grades)

# Both inherit from Person
emp = Employee("Ali", 30, 50000.0)
stu = Student("Fatima", 20, [85.0, 90.0, 88.0])

print(emp.greet())
print(stu.greet())
print(f"Average: {stu.average:.1f}")
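
Subclasses can also override an inherited method, which is what makes polymorphism useful: calling code uses one interface and each type supplies its own behavior. The sketch below uses hypothetical `Column`, `NumberColumn`, and `TextColumn` classes (not part of the exercises) and assumes every column holds at least one value.

```python
class Column:
    """Hypothetical base class for this sketch."""

    def __init__(self, name: str, values: list[str]) -> None:
        self.name = name
        self.values = values

    def summary(self) -> str:
        return f"{self.name}: {len(self.values)} values"


class NumberColumn(Column):
    def summary(self) -> str:
        # Override: extend the base summary with the maximum value.
        numbers = [float(v) for v in self.values]
        return f"{super().summary()}, max={max(numbers)}"


class TextColumn(Column):
    def summary(self) -> str:
        # Override: extend the base summary with the longest string.
        longest = max(self.values, key=len)
        return f"{super().summary()}, longest={longest!r}"


columns = [
    NumberColumn("age", ["23", "30"]),
    TextColumn("city", ["Riyadh", "Abha"]),
]

# Same call on every item; each class decides what summary() means.
for col in columns:
    print(col.summary())
```

The loop never checks the type of `col`, so adding another column type later would not require changing it.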


## Recap

- A class groups **data + behavior** (encapsulation)
- Properties can **compute** values (`first_name`) or **validate** updates (`age`)
- Inheritance reuses behavior; polymorphism is "same interface, different types"
- A small model class such as `ColumnProfile` can make your report code easier to reason about
