
# Write Maintainable ML Code with the Open-Closed Principle in Python

## Extend your machine learning workflows without rewriting a single line.

| ![space-1.jpg](https://github.com/Tanu-N-Prabhu/Python/blob/master/Img/christina-wocintechchat-com-SqmaKDvcIso-unsplash.jpg?raw=true) |
|:--:|
|Photo by <a href="https://unsplash.com/@wocintechchat?utm_content=creditCopyText&utm_medium=referral&utm_source=unsplash">Christina @ wocintechchat.com</a> on <a href="https://unsplash.com/photos/shallow-focus-photo-of-python-book-SqmaKDvcIso?utm_content=creditCopyText&utm_medium=referral&utm_source=unsplash">Unsplash</a>|


### Introduction
In fast-moving AI teams, model iterations and data changes are inevitable. But if you constantly modify the same files, bugs creep in and your pipeline becomes fragile. That's where the Open-Closed Principle helps: it is a fundamental idea in software design for building robust, extensible machine learning systems.

---

### Design Principle: Open-Closed Principle (OCP)
The Open-Closed Principle is one of the five SOLID principles. It states:

> *Software entities (classes, modules, functions) should be open for extension, but closed for modification.*

This means you should be able to **add new behavior** without altering existing, tested code.

---

### Problem
Imagine your ML pipeline supports only one model, say Logistic Regression. What if tomorrow you need to try Random Forest, XGBoost, or even a Neural Network? If your logic is hardcoded, every new change risks breaking what already works.
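
The hardcoded situation described above might look like the following minimal sketch. It is an illustration of the anti-pattern, not code from this project: `run_hardcoded_pipeline` and the two model names are hypothetical, and simple stand-in rules replace real estimators so the snippet runs without extra dependencies.

```python
from collections import Counter


def run_hardcoded_pipeline(model_name, X, y):
    """Anti-pattern: every supported model is a branch inside this function."""
    if model_name == "majority":
        # Stand-in "model": always predict the most common training label.
        label = Counter(y).most_common(1)[0][0]
        return [label for _ in X]
    elif model_name == "first_label":
        # A second stand-in, added later by editing this same function.
        return [y[0] for _ in X]
    else:
        # Supporting a third model means modifying this tested function again.
        raise ValueError(f"Unsupported model: {model_name!r}")


print(run_hardcoded_pipeline("majority", [[0], [1], [2]], [1, 1, 0]))  # [1, 1, 1]
```

Each new model forces another edit to the same function, which is exactly the kind of modification the Open-Closed Principle tries to avoid.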

---

### Code Implementation (OCP for ML Models)



In [None]:
# base_model.py
from abc import ABC, abstractmethod

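class BaseModel(ABC):
    """Contract that every model in the pipeline must implement."""

    @abstractmethod
    def train(self, X, y):
        """Fit the model on features X and labels y."""

    @abstractmethod
    def predict(self, X):
        """Return predictions for features X."""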

In [None]:
# logistic_model.py
from sklearn.linear_model import LogisticRegression
from base_model import BaseModel

class LogisticModel(BaseModel):
    """Wraps scikit-learn's LogisticRegression behind the BaseModel contract."""

    def __init__(self):
        self.model = LogisticRegression()

    def train(self, X, y):
        self.model.fit(X, y)

    def predict(self, X):
        return self.model.predict(X)

In [None]:
# random_forest_model.py
from sklearn.ensemble import RandomForestClassifier
from base_model import BaseModel

class RandomForestModel(BaseModel):
    """Wraps scikit-learn's RandomForestClassifier behind the BaseModel contract."""

    def __init__(self):
        self.model = RandomForestClassifier()

    def train(self, X, y):
        self.model.fit(X, y)

    def predict(self, X):
        return self.model.predict(X)

In [None]:
# pipeline.py
from sklearn.datasets import load_iris
from random_forest_model import RandomForestModel

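def run_pipeline(model):
    """Train `model` on the Iris dataset and print its first five predictions.

    `model` can be any object that implements the BaseModel contract.
    """
    data = load_iris()
    X, y = data.data, data.target
    model.train(X, y)
    # Predictions are made on the training data, purely to demonstrate the interface.
    preds = model.predict(X)
    print("First 5 Predictions:", preds[:5])

if __name__ == "__main__":
    # The entry point is the only place that names a concrete model class.
    model = RandomForestModel()
    run_pipeline(model)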

### Output

First 5 Predictions: [0 0 0 0 0]

---

### Code Explanation

* `BaseModel`: Abstract base class that defines the `train`/`predict` contract for all models.

* `LogisticModel`, `RandomForestModel`: Implement `BaseModel` without modifying it.

* `run_pipeline` (in `pipeline.py`): Accepts any current or future model that implements `BaseModel`.

* Adding a new model means writing a new class; existing, tested code stays untouched.
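
Adding a model without editing existing code can be sketched as follows. `MajorityClassModel` is a hypothetical baseline introduced only for illustration; it is plain Python rather than a scikit-learn wrapper. `BaseModel` is repeated and `run_pipeline` is simplified to take the data as arguments, so the snippet runs on its own.

```python
from abc import ABC, abstractmethod
from collections import Counter


class BaseModel(ABC):
    """Same contract as base_model.py, repeated so this snippet is standalone."""

    @abstractmethod
    def train(self, X, y):
        ...

    @abstractmethod
    def predict(self, X):
        ...


class MajorityClassModel(BaseModel):
    """Hypothetical baseline: always predicts the most frequent training label."""

    def __init__(self):
        self.majority_label = None

    def train(self, X, y):
        # Features are ignored; only the label distribution matters.
        self.majority_label = Counter(y).most_common(1)[0][0]

    def predict(self, X):
        if self.majority_label is None:
            raise RuntimeError("Call train() before predict().")
        return [self.majority_label for _ in X]


def run_pipeline(model, X, y):
    """Simplified stand-in for pipeline.py that takes the data as arguments."""
    model.train(X, y)
    return model.predict(X)


preds = run_pipeline(MajorityClassModel(), [[1.0], [2.0], [3.0]], ["a", "b", "b"])
print(preds)  # ['b', 'b', 'b']
```

Neither `BaseModel` nor `run_pipeline` changed to support the new class, and the contract does not require the model to come from scikit-learn.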


---

### UML Class Diagram

| ![UML class diagram for the Open-Closed Principle example](https://github.com/Tanu-N-Prabhu/Python/blob/master/Img/umlopenclose.png?raw=true) |
|:--:|
|Designed by Author|

#### UML Class Diagram Explanation: Open-Closed Principle

1. `BaseModel` (Abstract Class / Interface)
    * Defines a generic interface for all ML models.
    * Declares two abstract methods: `train(X, y)` and `predict(X)`.
    * This class does not contain implementation, only a contract.
    * Closed for modification: You don’t need to edit this when adding new models.
    * Open for extension: You can extend this to implement new model classes.

2. `LogisticModel` (Concrete Class)
    * Inherits from `BaseModel`.
    * Implements `train()` and `predict()` using `LogisticRegression` from Scikit-learn.
    * Follows the contract defined by `BaseModel`.

3. `RandomForestModel` (Concrete Class)
    * Also inherits from `BaseModel`.
    * Uses `RandomForestClassifier` from Scikit-learn.
    * Another extension that does not require modifying any base logic.

4. `Pipeline` (Client / Runner)
    * This is the high-level client that uses `BaseModel`; in the code above it corresponds to the `run_pipeline` function in `pipeline.py`.
    * It depends only on the abstraction (`BaseModel`), not on any specific model.
    * Takes any object that follows `BaseModel` and runs the ML pipeline.
    * Decoupled from concrete models: You can pass `LogisticModel`, `RandomForestModel`, or future models to `run_pipeline` without changing it. Only the entry point that constructs the model names a concrete class.
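
The contract described for `BaseModel` is enforced by Python's `abc` module at instantiation time. The short sketch below shows this behavior; `IncompleteModel` and `can_instantiate` are hypothetical names used only for the demonstration.

```python
from abc import ABC, abstractmethod


class BaseModel(ABC):
    """Same contract as base_model.py, repeated so this snippet is standalone."""

    @abstractmethod
    def train(self, X, y):
        ...

    @abstractmethod
    def predict(self, X):
        ...


class IncompleteModel(BaseModel):
    """Hypothetical subclass that forgets to implement predict()."""

    def train(self, X, y):
        pass


def can_instantiate(cls):
    """Return True if cls() succeeds, False if it raises TypeError."""
    try:
        cls()
    except TypeError:
        return False
    return True


print(can_instantiate(BaseModel))        # False
print(can_instantiate(IncompleteModel))  # False
```

A model that skips part of the contract fails as soon as it is constructed, rather than later inside the pipeline.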

#### How It Reflects the Open-Closed Principle
* You can add new models (extensions) like `SVMModel`, `XGBoostModel`, etc.

* You never need to change existing, stable code in `BaseModel` or `Pipeline`.

* This minimizes the chance of breaking existing functionality.

#### Real-World Value
* Supports safe experimentation in research.

* Enables clean architecture in production ML systems.

* Great for unit testing, as each model can be tested in isolation.

* Makes team collaboration easier; contributors add models independently.
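
Testing each model in isolation can be as simple as one reusable contract check that every new model must pass. The following is a minimal sketch under stated assumptions: `check_model_contract` and `ThresholdModel` are hypothetical names, and the toy model is duck-typed (it does not inherit from `BaseModel`) to keep the snippet short.

```python
class ThresholdModel:
    """Hypothetical one-feature model: predicts 1 when x >= the training mean."""

    def __init__(self):
        self.threshold = None

    def train(self, X, y):
        values = [row[0] for row in X]
        self.threshold = sum(values) / len(values)

    def predict(self, X):
        return [int(row[0] >= self.threshold) for row in X]


def check_model_contract(model):
    """Reusable check: train on tiny data, then validate the predictions."""
    X = [[0.0], [1.0], [2.0], [3.0]]
    y = [0, 0, 1, 1]
    model.train(X, y)
    preds = model.predict(X)
    # One prediction per input row.
    assert len(preds) == len(X)
    # Predictions only use labels seen during training.
    assert set(preds) <= set(y)


check_model_contract(ThresholdModel())
print("ThresholdModel satisfies the contract")
```

The same check can be run against any other class that implements `train` and `predict`, without involving the pipeline or the other models.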

---

### Why it’s so important

* Enhances modularity: Add new models or techniques without touching legacy code.

* Enables scalability: More contributors can work in parallel on new models.

* Increases reliability: Tested modules stay untouched, reducing risk of bugs.

* Improves team collaboration: Clear contracts and plug-and-play components.

---

### Applications
* Experimentation in AutoML frameworks.

* Deployment pipelines that support model switching.

* MLOps workflows needing multiple backend models.

* Plugins for training logic in custom AI platforms.
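
For the model-switching case, one common approach is a small registry that maps a config string to a model class. The sketch below is an assumption rather than part of the original code: `MODEL_REGISTRY`, `register_model`, `create_model`, and the two toy models are hypothetical names, and the toy models stand in for real estimators so the snippet runs without extra dependencies.

```python
MODEL_REGISTRY = {}


def register_model(name):
    """Class decorator that records a model class under a string key."""
    def decorator(cls):
        if name in MODEL_REGISTRY:
            raise ValueError(f"Model name already registered: {name!r}")
        MODEL_REGISTRY[name] = cls
        return cls
    return decorator


def create_model(name):
    """Instantiate a registered model, e.g. from a config value or CLI flag."""
    if name not in MODEL_REGISTRY:
        available = sorted(MODEL_REGISTRY)
        raise ValueError(f"Unknown model {name!r}; available: {available}")
    return MODEL_REGISTRY[name]()


@register_model("always_zero")
class AlwaysZeroModel:
    """Toy model that predicts 0 for every row."""

    def train(self, X, y):
        pass

    def predict(self, X):
        return [0 for _ in X]


@register_model("always_one")
class AlwaysOneModel:
    """Toy model that predicts 1 for every row."""

    def train(self, X, y):
        pass

    def predict(self, X):
        return [1 for _ in X]


# Switching models is now a change to a string, not to pipeline code.
model = create_model("always_one")
model.train([[1], [2]], [1, 1])
print(model.predict([[3], [4]]))  # [1, 1]
```

With this design, a new model registers itself in its own file, and the code that selects a model by name does not need to change.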

---

### Conclusion
The Open-Closed Principle pays off quickly in ML systems: the core pipeline stays untouched while new models arrive as extensions. Adopt this pattern early, and your ML projects will be easier to scale. Thanks for reading! If you have suggestions or similar implementations, let me know in the comment section. Until then, see you next time. Happy coding!

---

### Before you go
* If you enjoyed this article, like it and connect with me.
* Follow me: [Medium](https://medium.com/@tanunprabhu95) | [GitHub](https://github.com/Tanu-N-Prabhu) | [LinkedIn](https://ca.linkedin.com/in/tanu-nanda-prabhu-a15a091b5) | [Python Hub](https://github.com/Tanu-N-Prabhu/Python)
