# OOP Analysis of AI Software Frameworks
Code examples for all four frameworks. Each framework section illustrates the OOP concepts discussed in the paper:
* Inheritance
* Polymorphism
* Encapsulation
* Abstraction

Run the cell below before running the code cells; it installs the necessary packages into your conda environment.
To test the full functionality of some examples, you will need access to a hosted service or an LLM.
If you don't want to pay for tokens, consider Ollama: it interoperates well with LangChain/LangGraph and LlamaIndex and runs smaller models such as Llama 8B. The examples use the OpenAI GPT interface, but they also work with models hosted on AWS/Azure and with self-hosted models.


In [1]:
%pip install -U langchain langchain_community langgraph transformers llama-index elasticsearch torch

Collecting langgraph
  Downloading langgraph-0.2.56-py3-none-any.whl.metadata (15 kB)
Downloading langgraph-0.2.56-py3-none-any.whl (126 kB)
Installing collected packages: langgraph
  Attempting uninstall: langgraph
    Found existing installation: langgraph 0.2.55
    Uninstalling langgraph-0.2.55:
      Successfully uninstalled langgraph-0.2.55
Successfully installed langgraph-0.2.56
Note: you may need to restart the kernel to use updated packages.


# HuggingFace Transformers

In [None]:
"""Imports Needed for code to run"""
from transformers import (
    AutoModel,
    AutoTokenizer,
    GPT2Model,
    PreTrainedModel,
    PreTrainedTokenizer,
    pipeline,
)

## Inheritance
By inheriting from the base GPT2Model, we can define a custom forward pass while retaining the rest of the model's functionality

In [None]:
class CustomGPT2(GPT2Model):
    def __init__(self, config):
        super().__init__(config)
    
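    def forward(self, input_ids, attention_mask=None, labels=None):
        # Override forward to add custom behavior, here a scaling of the hidden states.
        # Arguments are passed by keyword because attention_mask is not the
        # second positional parameter of GPT2Model.forward.
        outputs = super().forward(input_ids=input_ids, attention_mask=attention_mask)
        # GPT2Model has no language-modeling head, so it returns hidden states, not logits
        hidden_states = outputs.last_hidden_state
        return hidden_states * 0.5  # Custom scaling of the hidden states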

## Polymorphism through AutoModel

AutoModel can load most models hosted on the Hugging Face Hub and provides the same simple interface for all of them

In [None]:
# Using the same interface for different model types
models = ["bert-base-uncased", "gpt2", "distilbert-base-uncased"]
for model_name in models:
    model = AutoModel.from_pretrained(model_name)  # Polymorphism here 
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    print(f"Loaded model: {model_name} with {model.config.hidden_size} hidden units.")

Loaded model: bert-base-uncased with 768 hidden units.
Loaded model: gpt2 with 768 hidden units.
Loaded model: distilbert-base-uncased with 768 hidden units.
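
One way to picture how a single entry point can hand back different classes is a registry keyed by model type. The sketch below is a simplified illustration in plain Python, not the Transformers implementation; `MODEL_REGISTRY`, `register`, `ToyModel`, `ToyBert`, `ToyGPT2`, and `load_model` are all hypothetical names invented for this example.

```python
from typing import Callable, Dict

# Hypothetical registry mapping a model-type string to a class.
MODEL_REGISTRY: Dict[str, type] = {}


def register(model_type: str) -> Callable[[type], type]:
    """Class decorator that records a model class under a type name."""
    def decorator(cls: type) -> type:
        MODEL_REGISTRY[model_type] = cls
        return cls
    return decorator


class ToyModel:
    """Shared interface: every model exposes hidden_size and describe()."""
    hidden_size = 0

    def describe(self) -> str:
        return f"{type(self).__name__} with {self.hidden_size} hidden units"


@register("bert")
class ToyBert(ToyModel):
    hidden_size = 768


@register("gpt2")
class ToyGPT2(ToyModel):
    hidden_size = 768


def load_model(model_type: str) -> ToyModel:
    """Factory: callers use one entry point regardless of the concrete class."""
    return MODEL_REGISTRY[model_type]()


for name in ["bert", "gpt2"]:
    print(load_model(name).describe())
```

The calling loop never mentions a concrete class, which is the same property the AutoModel loop above relies on.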


## Abstraction
The pipeline abstracts away the complex steps of sentiment analysis, such as tokenization, so a single interface is enough for quick sentiment analysis

In [14]:
# Hiding complex logic of a model behind a simple interface
classifier = pipeline("sentiment-analysis")
results = classifier("Hugging Face makes machine learning fun!")
print(results)

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


[{'label': 'POSITIVE', 'score': 0.9998284578323364}]
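
To make the hidden stages concrete, here is a toy sketch of the preprocess, forward, and postprocess steps that a single pipeline call wraps. It is an assumption-laden stand-in written in plain Python: `WORD_SCORES`, `preprocess`, `forward`, `postprocess`, and `toy_pipeline` are hypothetical names, and the word scores are made up rather than taken from any trained model.

```python
import math
from typing import Dict, List

# Made-up word scores standing in for a trained model (hypothetical values).
WORD_SCORES: Dict[str, float] = {"fun": 2.0, "great": 2.0, "boring": -2.0, "bad": -2.0}


def preprocess(text: str) -> List[str]:
    """Stand-in for tokenization: lowercase, split on whitespace, trim punctuation."""
    return [token.strip(".,!?") for token in text.lower().split()]


def forward(tokens: List[str]) -> List[float]:
    """Stand-in for the model: return [negative, positive] logits."""
    score = sum(WORD_SCORES.get(token, 0.0) for token in tokens)
    return [-score, score]


def postprocess(logits: List[float]) -> Dict[str, object]:
    """Softmax over the logits, then pick the most likely label."""
    exps = [math.exp(x) for x in logits]
    probs = [e / sum(exps) for e in exps]
    labels = ["NEGATIVE", "POSITIVE"]
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": labels[best], "score": probs[best]}


def toy_pipeline(text: str) -> List[Dict[str, object]]:
    """Single entry point that hides the three stages from the caller."""
    return [postprocess(forward(preprocess(text)))]


print(toy_pipeline("Hugging Face makes machine learning fun!"))
```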


# Elasticsearch

In [24]:
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

## Custom Analyzers from Inheritance
Below is a custom analyzer that inherits the behavior of the built-in standard tokenizer and customizes which token filters are applied on top of it

In [19]:
custom_analyzer_with_inheritance = {
    "settings": {
        "analysis": {
            "analyzer": {
                "custom_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "asciifolding"]
                }
            }
        }
    }
}
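
To get a feel for what this analyzer does to text, the sketch below approximates the tokenizer and the two filters in plain Python. It is only a rough approximation under stated assumptions: the real standard tokenizer follows Unicode text segmentation rules and the real asciifolding filter covers more characters. `standard_tokenize`, `lowercase`, `ascii_fold`, and `analyze` are hypothetical helper names.

```python
import re
import unicodedata
from typing import List


def standard_tokenize(text: str) -> List[str]:
    """Rough stand-in for the standard tokenizer: split on non-word characters."""
    return re.findall(r"\w+", text)


def lowercase(tokens: List[str]) -> List[str]:
    """Stand-in for the lowercase token filter."""
    return [token.lower() for token in tokens]


def ascii_fold(tokens: List[str]) -> List[str]:
    """Stand-in for asciifolding: strip combining accents so an accented e becomes e."""
    folded = []
    for token in tokens:
        decomposed = unicodedata.normalize("NFKD", token)
        folded.append("".join(ch for ch in decomposed if not unicodedata.combining(ch)))
    return folded


def analyze(text: str) -> List[str]:
    """Apply the tokenizer, then each filter in the configured order."""
    tokens = standard_tokenize(text)
    for token_filter in (lowercase, ascii_fold):
        tokens = token_filter(tokens)
    return tokens


# "Caf\u00e9 Cr\u00e8me" is "Café Crème" written with escape sequences.
print(analyze("Caf\u00e9 Cr\u00e8me"))
```

The filters run in list order, mirroring the `"filter"` array in the settings above.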

## Polymorphism through interfaces with query logic
Below is an example of the Query DSL, Elasticsearch's JSON-based query language, where different query types such as match and range are combined under bool/must within the same query object

In [None]:
query_logic_with_polymorphism = {
    "query": {
        "bool": {
            "must": [
                {"match": {"title": "Elasticsearch"}},
                {"range": {"date": {"gte": "2022-01-01"}}}
            ]
        }
    }
}
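
The same query can also be assembled from small builder functions, which makes it easier to see that `must` accepts clauses of any type. This is a minimal sketch in plain Python; `match`, `date_range`, and `bool_must` are hypothetical helpers written for this example, not part of the Elasticsearch client.

```python
from typing import Any, Dict, List

Clause = Dict[str, Any]


def match(field: str, text: str) -> Clause:
    """Build a full-text match clause."""
    return {"match": {field: text}}


def date_range(field: str, gte: str) -> Clause:
    """Build a range clause with a lower bound."""
    return {"range": {field: {"gte": gte}}}


def bool_must(clauses: List[Clause]) -> Clause:
    """Combine clauses of any type; every one of them must match."""
    return {"query": {"bool": {"must": clauses}}}


query = bool_must([
    match("title", "Elasticsearch"),
    date_range("date", "2022-01-01"),
])
print(query)
```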

## Encapsulation through hiding query logic and formatting
Fully running the code below requires access to an Elasticsearch deployment (in the cloud or on localhost), so only the basic class structure is shown

In [23]:
class SearchService:
    def __init__(self, es_client):
        self.es = es_client

    def search_by_title(self, index, title):
        query = {
            "query": {
                "match": {"title": title}
            }
        }
        return self.es.search(index=index, body=query)

"""Usage below if a hosting service is used"""
# es = Elasticsearch(cloud_id="", api_key="")  # or Elasticsearch("http://localhost:9200")
# search_service = SearchService(es)
# response = search_service.search_by_title("my_index", "Elasticsearch")
# print(response)


'Usage below if a hosting service is used'
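
Without a running cluster, the encapsulation can still be exercised by passing in a stand-in client. The sketch below repeats the class from the cell above and adds `FakeClient`, a hypothetical in-memory object that only records what it was asked to search for; it is not part of the Elasticsearch library.

```python
from typing import Any, Dict, List


class SearchService:
    """Same shape as the class above: callers never see the query body."""

    def __init__(self, es_client: Any) -> None:
        self.es = es_client

    def search_by_title(self, index: str, title: str) -> Dict[str, Any]:
        query = {"query": {"match": {"title": title}}}
        return self.es.search(index=index, body=query)


class FakeClient:
    """Hypothetical in-memory stand-in that records each search call."""

    def __init__(self) -> None:
        self.calls: List[Dict[str, Any]] = []

    def search(self, index: str, body: Dict[str, Any]) -> Dict[str, Any]:
        self.calls.append({"index": index, "body": body})
        return {"hits": {"hits": []}}


client = FakeClient()
service = SearchService(client)
response = service.search_by_title("my_index", "Elasticsearch")
print(client.calls[0]["body"])
```

The caller supplies only an index name and a title; the query structure stays private to `SearchService`.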

## Abstraction through hiding complex bulk indexing or distributed queries

In [25]:
class BulkIndexer:
    def __init__(self, es_client, index):
        self.es = es_client
        self.index = index

    def index_documents(self, docs):
        actions = [
            {
                "_index": self.index,
                "_id": doc["id"],
                "_source": doc
            }
            for doc in docs
        ]
        bulk(self.es, actions)

"""Usage below if a hosting service is used"""
# es = Elasticsearch(cloud_id="", api_key="")  # or Elasticsearch("http://localhost:9200")
# indexer = BulkIndexer(es, "my_index")

# documents = [
#     {"id": 1, "title": "Elasticsearch Basics"},
#     {"id": 2, "title": "Advanced Elasticsearch"}
# ]

# indexer.index_documents(documents)


'Usage below if a hosting service is used'
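
The bulk actions themselves are plain dictionaries, so their shape can be checked without a cluster. The sketch below builds them lazily with a generator, which may help with large document sets since the bulk helper accepts any iterable of actions; `build_actions` is a hypothetical helper name.

```python
from typing import Any, Dict, Iterable, Iterator


def build_actions(index: str, docs: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield one bulk action per document, in the same shape used above."""
    for doc in docs:
        yield {"_index": index, "_id": doc["id"], "_source": doc}


documents = [
    {"id": 1, "title": "Elasticsearch Basics"},
    {"id": 2, "title": "Advanced Elasticsearch"},
]
actions = list(build_actions("my_index", documents))
print(actions[0])
```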

# LangChain/LangGraph

In [35]:
from langchain.chains.llm import LLMChain
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.tools import BaseTool
from langchain_core.tools import tool
from typing import Any, List
import requests

## Inheritance
This class inherits from the base LLMChain to keep the basic chaining functionality while letting developers extend the logic, as in the logging example below

In [5]:
class LoggingLLMChain(LLMChain):
    def __init__(self, llm, prompt):
        super().__init__(llm=llm, prompt=prompt)
    
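    def _call(self, inputs, run_manager=None):
        # Normalize the input text before it reaches the prompt
        inputs['text'] = inputs['text'].strip().lower()
        print(f"Input received: {inputs['text']}")

        # Delegate to LLMChain, forwarding run_manager so callbacks keep working
        result = super()._call(inputs, run_manager=run_manager)

        print(f"Model output: {result}")
        return result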

"""Need to connect to an LLM for this full thing to compile"""
# llm = OpenAI(temperature=0)
# prompt = PromptTemplate(template="Summarize this: {text}", input_variables=["text"])
# logging_chain = LoggingLLMChain(llm=llm, prompt=prompt)

# response = logging_chain.run({"text": " Hello "})
# print(f"last response: {response}")

'Need to connect to an LLM for this full thing to compile'

## Polymorphism through tools and their shared interfaces

In [10]:
class AdditionTool(BaseTool):
    name: str = "addition_tool"
    description: str = "Adds two numbers provided as input."

    def _run(self, query: str):
        try:
            numbers = list(map(float, query.split()))
            return f"Sum: {sum(numbers)}"
        except ValueError:
            return "Invalid input. Provide two numbers separated by a space."


class ReverseStringTool(BaseTool):
    name: str = "reverse_tool"
    description: str = "Reverses the input string."

    def _run(self, query: str):
        return f"Reversed: {query[::-1]}"
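
The reason both tools can be used interchangeably is that callers go through one shared public method, which delegates to the subclass hook. The sketch below shows that pattern with a simplified stand-in base class in plain Python; `ToyTool`, `ToyAdditionTool`, and `ToyReverseTool` are hypothetical names and deliberately not LangChain's `BaseTool`, so it runs without any dependencies.

```python
from typing import List


class ToyTool:
    """Simplified stand-in for a tool base class (not LangChain's BaseTool)."""
    name = "tool"

    def run(self, query: str) -> str:
        # Public entry point shared by all tools; delegates to the subclass hook.
        return self._run(query)

    def _run(self, query: str) -> str:
        raise NotImplementedError


class ToyAdditionTool(ToyTool):
    name = "addition_tool"

    def _run(self, query: str) -> str:
        try:
            return f"Sum: {sum(map(float, query.split()))}"
        except ValueError:
            return "Invalid input. Provide numbers separated by spaces."


class ToyReverseTool(ToyTool):
    name = "reverse_tool"

    def _run(self, query: str) -> str:
        return f"Reversed: {query[::-1]}"


toy_tools: List[ToyTool] = [ToyAdditionTool(), ToyReverseTool()]
for toy in toy_tools:
    # The caller only knows the shared run() method.
    print(toy.name, "->", toy.run("3 4"))
```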



## Encapsulation
Here's an example of encapsulation: prompting an LLM without having to create tool nodes, manage memory, or even define a base agent. Running it requires access to an LLM.

In [11]:
# prompt = PromptTemplate(template="Translate to French: {text}", input_variables=["text"])
# chain = LLMChain(llm=OpenAI(temperature=0), prompt=prompt)
# response = chain.run(text="Hello world!")
# print(response)

## Encapsulation
Here we define a custom tool and encapsulate its logic in _run, so that an LLM or agent needs to know nothing about the tool beyond the interface it inherits from BaseTool

In [12]:
class WeatherTool(BaseTool):
    name: str = "weather_tool"
    description: str = "Provides current weather information for a given city."

    def _run(self, query: str) -> str:
        """Fetches the weather for a city."""
        city = query.strip()
        try:
            # Replace with your actual weather API endpoint and key
            api_key = "YOUR_API_KEY"
            url = f"http://api.weatherapi.com/v1/current.json?key={api_key}&q={city}"
            response = requests.get(url).json()
            weather = response.get("current", {}).get("condition", {}).get("text", "Unknown")
            temp = response.get("current", {}).get("temp_c", "Unknown")
            return f"The weather in {city} is {weather} with a temperature of {temp}°C."
        except Exception as e:
            return f"Failed to fetch weather data: {str(e)}"

## Abstraction
In the polymorphism section above, we defined custom tools that share an interface inherited from the base class. Here the tool decorator abstracts that extra boilerplate away, so a tool can be created from a plain function

In [14]:
@tool
def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b
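
As a rough picture of what such a decorator can infer from a plain function, the sketch below reads the name, docstring, and type hints with the standard library. It is an illustration under assumptions, not LangChain's implementation; `FunctionTool`, `toy_tool`, and `toy_multiply` are hypothetical names.

```python
import inspect
from typing import Any, Callable, Dict


class FunctionTool:
    """Hypothetical wrapper holding the metadata a decorator can infer."""

    def __init__(self, func: Callable[..., Any]) -> None:
        self.func = func
        self.name = func.__name__
        self.description = inspect.getdoc(func) or ""
        # Map each parameter name to the name of its annotated type.
        self.args: Dict[str, str] = {
            param.name: getattr(param.annotation, "__name__", str(param.annotation))
            for param in inspect.signature(func).parameters.values()
        }

    def invoke(self, arguments: Dict[str, Any]) -> Any:
        """Call the wrapped function with keyword arguments."""
        return self.func(**arguments)


def toy_tool(func: Callable[..., Any]) -> FunctionTool:
    """Decorator that turns a plain function into a FunctionTool."""
    return FunctionTool(func)


@toy_tool
def toy_multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b


print(toy_multiply.name, "|", toy_multiply.description, "|", toy_multiply.args)
print(toy_multiply.invoke({"a": 6, "b": 7}))
```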

# LlamaIndex

In [None]:
from llama_index.core.indices.base import BaseIndex
from llama_index.core.node_parser.node_utils import BaseNode
from llama_index.core.indices.keyword_table import KeywordTableIndex
from llama_index.core import SimpleDirectoryReader

## Inheritance

In [30]:
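# Illustrative only: BaseIndex is abstract, so instantiating this class would also
# require implementing its abstract methods and passing nodes or an index struct
# to super().__init__().
class CustomIndex(BaseIndex):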
    def __init__(self, documents):
        super().__init__()
        self.documents = documents
        self.index_data = self.build_index(documents)
    
    def build_index(self, documents):
        """Custom method to create a basic index from documents."""
        index = {}
        for doc_id, doc in enumerate(documents):
            index[doc_id] = doc.text
        return index
    
    def query(self, query_text):
        """Override the query method to perform a simple keyword search."""
        results = []
        for doc_id, content in self.index_data.items():
            if query_text.lower() in content.lower():
                results.append((doc_id, content))
        return results or "No matches found."

## Polymorphism with indexes

In [31]:
class KeywordIndex(BaseIndex):
    def query(self, query_text: str) -> str:
        return f"KeywordIndex found results for '{query_text}'."

class PrefixIndex(BaseIndex):
    def query(self, query_text: str) -> str:
        return f"PrefixIndex found results starting with '{query_text}'."

def execute_query(index: BaseIndex, query_text: str) -> str:
    """Polymorphism in action: Query any index type."""
    return index.query(query_text)

## Encapsulation
Hides the query logic behind the interface

In [33]:
class SimpleIndex(BaseIndex):
    def __init__(self, documents):
        self.documents = documents  

    def query(self, query_text: str) -> str:
        # Encapsulate the search logic and override query()
        results = [doc for doc in self.documents if query_text.lower() in doc.lower()]
        return results if results else "No matches found."


## Abstraction
Abstraction: the query logic is hidden behind a simple interface that each subclass implements

In [36]:
class AbstractIndex:
    def __init__(self, documents: List[str]):
        self.documents = documents

    def query(self, query_text: str) -> str:
        """Abstracts the logic of querying."""
        raise NotImplementedError("Subclasses must implement query() method.")

class KeywordIndex(AbstractIndex):
    def query(self, query_text: str) -> str:
        results = [doc for doc in self.documents if query_text.lower() in doc.lower()]
        return results if results else "No matches found."

class PrefixIndex(AbstractIndex):
    def query(self, query_text: str) -> str:
        results = [doc for doc in self.documents if doc.lower().startswith(query_text.lower())]
        return results if results else "No matches found."
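
A variation on the same idea uses the standard library's `abc` module, so that a subclass which forgets to implement `query()` fails at instantiation rather than at call time. This is a sketch with hypothetical class names (`DocumentIndex`, `SubstringIndex`, `StartsWithIndex`) chosen so it does not overwrite the classes above; it also returns an empty list instead of a message string, which is a design choice of this sketch.

```python
from abc import ABC, abstractmethod
from typing import List


class DocumentIndex(ABC):
    """Interface only: subclasses decide how a query is matched."""

    def __init__(self, documents: List[str]) -> None:
        self.documents = documents

    @abstractmethod
    def query(self, query_text: str) -> List[str]:
        """Return the documents that match query_text."""


class SubstringIndex(DocumentIndex):
    def query(self, query_text: str) -> List[str]:
        needle = query_text.lower()
        return [doc for doc in self.documents if needle in doc.lower()]


class StartsWithIndex(DocumentIndex):
    def query(self, query_text: str) -> List[str]:
        needle = query_text.lower()
        return [doc for doc in self.documents if doc.lower().startswith(needle)]


docs = ["Elasticsearch Basics", "Advanced Elasticsearch"]
for index in (SubstringIndex(docs), StartsWithIndex(docs)):
    # The caller relies only on the abstract query() interface.
    print(type(index).__name__, index.query("elastic"))

try:
    DocumentIndex(docs)  # abstract classes cannot be instantiated
except TypeError as error:
    print("TypeError:", error)
```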