**Class 8: OOP in Python**


- Data Engineers build reusable tools: readers, transformers, loggers.
- OOP helps you modularize, scale, and reorganize pipeline steps cleanly.
- Instead of repeating code for each file format or API, define a class once and use it everywhere.

**Classes & Objects (Core Foundation)**


In [0]:

class Customer:
    def __init__(self, name):
        self.name = name

c1 = Customer('Radharani')
print(c1.name)

In [0]:
class Customer:
    def __init__(self,name):
        self.name = name
c1 = Customer("Atul")
print(c1.name)

In [0]:
# Build a PipelineStep class

class PipelineStep:

    def __init__(self, step_name):
        self.step_name = step_name
    
    def execute(self):
        print(f"🚀 Executing step: {self.step_name}")

ingest = PipelineStep("Ingestion")
transform = PipelineStep("Transformation")

ingest.execute()
transform.execute()

## **Encapsulation – Protect Internal Logic**


In [0]:

class DatabaseConnector:
    def __init__(self):
        self._credentials = "user:pass@123"   # "protected" by convention: the leading underscore signals internal use, Python does not enforce it

    
    def connect(self):
        print("🔐 Connecting using credentials...")
        return "DB connection established"
    
db=DatabaseConnector()
print(db.connect())
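The cell above stores the credential but does not control how it is read. As a minimal sketch of one way to tighten that, the hypothetical `SecureConnector` class and its `masked_credentials` property below (both names are assumptions, not part of the original notebook) combine a name-mangled attribute with a read-only property:

```python
class SecureConnector:
    """Hypothetical connector used only to illustrate encapsulation."""

    def __init__(self, user, password):
        self._user = user            # single underscore: internal by convention
        self.__password = password   # double underscore: stored as _SecureConnector__password

    @property
    def masked_credentials(self):
        # Read-only view: there is no setter, and the password is never exposed
        return f"{self._user}:{'*' * len(self.__password)}"

    def connect(self):
        return f"DB connection established for {self._user}"


db = SecureConnector("user", "pass@123")
print(db.masked_credentials)
print(db.connect())
```

Name mangling only makes accidental access harder; it is not a security boundary, so real secrets would normally come from an environment variable or a secrets manager rather than from source code.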

### **Inheritance – Reuse Code with Variants**


In [0]:

class Reader:
    def read(self):
        return "Reading from Base Reader...."


class CSVReader(Reader):
    def read(self):
        return "📄 Reading from CSV"


class APIReader(Reader):
    def read(self):
        return "🌐 Fetching from API"


print(CSVReader().read())
print(APIReader().read())
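The readers above override `read()` completely, so nothing from the parent is actually reused. A rough sketch of reuse through `super()`, using the hypothetical classes `BaseReader` and `DelimitedReader` (illustrative names, not from the original notebook), might look like this:

```python
class BaseReader:
    """Hypothetical parent class holding shared state and behavior."""

    def __init__(self, source):
        self.source = source

    def describe(self):
        # Inherited unchanged by every subclass
        return f"{type(self).__name__} -> {self.source}"

    def read(self):
        return f"Reading from {self.source}"


class DelimitedReader(BaseReader):
    def __init__(self, source, delimiter=","):
        super().__init__(source)       # reuse the parent's setup
        self.delimiter = delimiter

    def read(self):
        base = super().read()          # extend the parent's behavior instead of replacing it
        return f"{base} (delimiter={self.delimiter!r})"


reader = DelimitedReader("data.csv")
print(reader.describe())
print(reader.read())
```

Here `describe()` is written once in the parent and works for any subclass, while `read()` builds on the parent's result.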

**Polymorphism – Same Interface, Different Behavior**


In [0]:
def run_reader(reader):
    print(reader.read())

run_reader(CSVReader())
run_reader(APIReader())
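Polymorphism in Python does not require a shared base class; any object with a compatible `read()` method will do (duck typing). The sketch below assumes two hypothetical classes, `JSONReader` and `ParquetReader`, and a helper `run_all`, none of which appear in the original notebook:

```python
class JSONReader:
    # Deliberately does NOT inherit from any Reader base class
    def read(self):
        return "Reading from JSON"


class ParquetReader:
    def read(self):
        return "Reading from Parquet"


def run_all(readers):
    # The caller only relies on each object having a read() method
    return [reader.read() for reader in readers]


results = run_all([JSONReader(), ParquetReader()])
for line in results:
    print(line)
```

The calling code stays the same no matter how many reader types are added later.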

**Abstraction – Enforce Structure Across Team**


In [0]:
from abc import ABC, abstractmethod

class Ingestor(ABC):
    @abstractmethod
    def read(self):
        pass

class CSVIngestor(Ingestor):
    def read(self):
        return "Reading CSV..."

class APIIngestor(Ingestor):
    def read(self):
        return "Calling API..."

def run(ingestor):
    print(ingestor.read())

run(CSVIngestor())
run(APIIngestor())
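What makes the abstract base class enforce structure is that Python refuses to instantiate a subclass that has not implemented every abstract method. A small sketch, assuming a hypothetical `BrokenIngestor` that forgets to define `read()`:

```python
from abc import ABC, abstractmethod


class Ingestor(ABC):
    @abstractmethod
    def read(self):
        ...


class BrokenIngestor(Ingestor):
    pass  # read() is intentionally missing


error = None
try:
    BrokenIngestor()           # fails at instantiation, not at definition
except TypeError as exc:
    error = str(exc)
    print(f"Rejected: {error}")
```

The failure happens as soon as the object is created, so a missing method surfaces early rather than halfway through a pipeline run. The exact wording of the error message varies between Python versions.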

## **Real Data Pipeline: OOP Version**

In [0]:
class FileIngestor:
    def __init__(self, path):
        self.path = path

    def read(self):
        print(f"Reading data from {self.path}")
        return f"Raw data from {self.path}"

class Cleaner:
    def clean(self, data):
        print("Cleaning data...")
        return f"Cleaned version of: {data}"

class Writer:
    def write(self, data):
        print(f"Writing data: {data}")
        return "Write success"

# Execute pipeline
reader = FileIngestor("data.csv")
raw_data = reader.read()

cleaned = Cleaner().clean(raw_data)
Writer().write(cleaned)
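The three steps above are wired together by hand. One possible next step, sketched here with a hypothetical `Pipeline` class (an assumption, not part of the original notebook), is to pass the step objects in and let a single `run()` method orchestrate them. The step classes are repeated in compact form so the cell runs on its own:

```python
class FileIngestor:
    def __init__(self, path):
        self.path = path

    def read(self):
        return f"Raw data from {self.path}"


class Cleaner:
    def clean(self, data):
        return f"Cleaned version of: {data}"


class Writer:
    def write(self, data):
        print(f"Writing data: {data}")
        return "Write success"


class Pipeline:
    """Hypothetical orchestrator: receives its steps instead of creating them."""

    def __init__(self, ingestor, cleaner, writer):
        self.ingestor = ingestor
        self.cleaner = cleaner
        self.writer = writer

    def run(self):
        raw = self.ingestor.read()
        cleaned = self.cleaner.clean(raw)
        return self.writer.write(cleaned)


status = Pipeline(FileIngestor("data.csv"), Cleaner(), Writer()).run()
print(status)
```

Because the steps are injected, any of them can be swapped for another object with the same method (for example, a different ingestor with a `read()` method) without changing `Pipeline` itself.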