# Class 8: OOP in Python

- Data engineers build reusable tools such as readers, transformers, and loggers.
- OOP helps you modularize, scale, and reorganize pipeline steps cleanly.
- Instead of repeating code for each file format or API, define a class once and reuse it everywhere.

## Classes & Objects (Core Foundation)

In [0]:
class Customer: 
    def __init__(self, name):
        self.name = name 

c1 = Customer("Sumit")
print(c1.name)

In [0]:
# Build a PipelineStep Class 
class PipelineStep:

    def __init__(self, step_name):
        self.step_name = step_name 

    def execute(self):
        print(f"🚀 Executing Step: {self.step_name}")


ingest = PipelineStep("Ingestion")
transform = PipelineStep("Transformation")
load = PipelineStep("Loading")

ingest.execute()
transform.execute()
load.execute()

## Encapsulation - Protect Internal Logic

In [0]:
class DatabaseConnection:
    def __init__(self):
        self._credentials = "user:pass@123"  # leading underscore: internal by convention, not enforced by Python

    def connect(self):
        print("🛢️ Connecting using credentials...")
        return "DB Connection Established"
    
db = DatabaseConnection()
print(db.connect())
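
The class above stores the credentials but never reads them back. As a minimal sketch of how an internal value can be used inside the class while only a masked view is exposed, the cell below assumes a hypothetical `SafeDatabaseConnection` class and `masked_user` property (neither is part of the original notebook, and no real database is contacted):

```python
class SafeDatabaseConnection:
    """Hypothetical variant for illustration; not a real database client."""

    def __init__(self, credentials):
        # A double leading underscore triggers name mangling, which
        # discourages (but does not prevent) access from outside the class.
        self.__credentials = credentials

    @property
    def masked_user(self):
        # Expose only the user name; the password is never returned.
        user, _, _ = self.__credentials.partition(":")
        return f"{user}:****"

    def connect(self):
        # The internal value is only read here, inside the class.
        user = self.__credentials.split(":", 1)[0]
        return f"DB Connection Established for {user}"


safe_db = SafeDatabaseConnection("user:pass@123")
print(safe_db.masked_user)
print(safe_db.connect())
print(hasattr(safe_db, "__credentials"))  # the unmangled name is not visible from outside
```

Name mangling is a convention-level safeguard, not a security boundary: the value is still reachable as `safe_db._SafeDatabaseConnection__credentials`.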

## Inheritance - Reuse Code with Variants

In [0]:
class Reader:
    def read(self):
        return "Reading from Base Reader..."

class CSVReader(Reader):
    def read(self):
        return "𝄜 Reading from CSV"   
    
class APIReader(Reader):
    def read(self):
        return "🌎 Fetching from API"
    
print(CSVReader().read())
print(APIReader().read())
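
The subclasses above replace `read` entirely. Inheritance is also useful when a subclass wants to keep part of the parent's behaviour. The sketch below assumes two hypothetical classes, `SourceReader` and `DelimitedReader`, and shows `super().__init__` reusing the parent's setup while the child adds its own attribute:

```python
class SourceReader:
    """Hypothetical base class that remembers where the data comes from."""

    def __init__(self, source):
        self.source = source

    def describe(self):
        return f"source={self.source}"

    def read(self):
        return f"Reading from base reader ({self.describe()})"


class DelimitedReader(SourceReader):
    """Hypothetical child class: reuses the parent's setup, adds a delimiter."""

    def __init__(self, source, delimiter=","):
        super().__init__(source)  # let the parent store `source`
        self.delimiter = delimiter

    def read(self):
        # describe() is inherited unchanged from SourceReader.
        return f"Reading delimited file ({self.describe()}, delimiter={self.delimiter!r})"


delimited = DelimitedReader("orders.csv", delimiter=";")
print(delimited.read())
print(isinstance(delimited, SourceReader))
```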

## Polymorphism - Same Interface, Different Behaviour

In [0]:
def run_reader(reader):
    print(reader.read())

run_reader(CSVReader())
run_reader(APIReader())
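
Polymorphism in Python does not require a shared parent class: any object with a `read` method can be passed to the same function. The sketch below illustrates this with two hypothetical, unrelated classes (`ParquetReader` and `InMemoryReader`) and a hypothetical helper `run_readers` that treats them uniformly:

```python
class ParquetReader:
    """Hypothetical reader; returns a message instead of touching a real file."""

    def read(self):
        return "Reading from Parquet"


class InMemoryReader:
    """Hypothetical reader that shares no base class with ParquetReader."""

    def __init__(self, rows):
        self.rows = rows

    def read(self):
        return f"Reading {len(self.rows)} rows from memory"


def run_readers(readers):
    # The same call works on every object that provides read().
    return [reader.read() for reader in readers]


results = run_readers([ParquetReader(), InMemoryReader([1, 2, 3])])
for line in results:
    print(line)
```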

## Abstraction - Enforce Structure Across a Team

In [0]:
from abc import ABC, abstractmethod 

class Ingester(ABC):
    @abstractmethod 
    def read(self):
        pass 

class CSVIngester(Ingester):
    def read(self):
        return 'Reading CSV...'

class APIIngester(Ingester):
    def read(self):
        return "Calling API..." 

def run(ingester):
    print(ingester.read())

run(CSVIngester())
run(APIIngester())
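
The enforcement happens at instantiation time: a class that still has an unimplemented abstract method cannot be created. The sketch below demonstrates this with hypothetical names (`BaseIngester`, `BrokenIngester`, `JSONIngester`, `try_create`). The exact `TypeError` message varies between Python versions, so it is printed rather than assumed:

```python
from abc import ABC, abstractmethod


class BaseIngester(ABC):
    @abstractmethod
    def read(self):
        """Return the ingested data."""


class BrokenIngester(BaseIngester):
    # read() is deliberately missing, so this class stays abstract.
    def fetch(self):
        return "Fetching..."


class JSONIngester(BaseIngester):
    def read(self):
        return "Reading JSON..."


def try_create(cls):
    # Attempt to instantiate and report what happened.
    try:
        cls()
        return "created"
    except TypeError as exc:
        return f"TypeError: {exc}"


print(try_create(BaseIngester))
print(try_create(BrokenIngester))
print(try_create(JSONIngester))
```

A teammate who forgets to implement `read` therefore gets an error as soon as the object is built, not later in the middle of a pipeline run.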

## Real Data Pipeline - OOP Version

In [0]:
class FileIngester:
    def __init__(self, path):
        self.path = path

    def read(self):
        print(f"Reading data from {self.path}")
        return f"Raw data from {self.path}"
    
class Cleaner:
    def clean(self, data):
        print("Cleaning data...")
        return f"Cleaned version of: {data}"
    
class Writer:
    def write(self, data):
        print(f"Writing data: {data}")
        return "Write Success"
    
# Execute Pipeline 
reader = FileIngester("data.csv")
raw_data = reader.read()

cleaned = Cleaner().clean(raw_data)
Writer().write(cleaned)
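
The three steps above are wired together by hand. One possible next step, sketched here under the assumption that every step exposes a common `run(data)` method, is a small `Pipeline` class that chains steps in order. `Pipeline`, `StripStep`, and `UppercaseStep` are hypothetical names introduced only for illustration:

```python
class StripStep:
    """Hypothetical step: removes surrounding whitespace."""

    def run(self, data):
        return data.strip()


class UppercaseStep:
    """Hypothetical step: upper-cases the text."""

    def run(self, data):
        return data.upper()


class Pipeline:
    """Runs each step in order, feeding one step's output to the next."""

    def __init__(self, steps):
        self.steps = list(steps)

    def run(self, data):
        for step in self.steps:
            data = step.run(data)
        return data


pipeline = Pipeline([StripStep(), UppercaseStep()])
result = pipeline.run("  raw data from data.csv  ")
print(result)
```

Adding, removing, or reordering steps then only changes the list passed to `Pipeline`, not the code that runs it.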