# **Table of Contents**

1. **Step 1: Introduction**
2. **Step 2: Required Libraries**
3. **Step 3: Function Registry**
4. **Step 4: LLM + RAG for Function Retrieval**
5. **Step 5: Dynamic Code Generation for Function Invocation**
6. **Step 6: API Service Implementation**
7. **Bonus Enhancements (Optional)**

---

# **Step 1: Introduction**

In this notebook, we will build a Python-based API service that dynamically retrieves automation functions and generates code to execute them. The core functionality uses a Large Language Model (LLM) with Retrieval-Augmented Generation (RAG) to map user queries to predefined functions and then generate executable Python code that invokes them. The notebook walks through creating a function registry, integrating LLM + RAG for function retrieval, dynamically generating code, and implementing an API service.

---


# **Step 2: Required Libraries**

In [1]:
import warnings
# Suppress all warnings to keep the notebook output readable
warnings.filterwarnings('ignore')

In [2]:
!python -m spacy download en_core_web_md



In [4]:
# Importing necessary libraries
import os
import webbrowser
import faiss
import torch
from transformers import BertTokenizer, BertModel
import spacy
import numpy as np
from fastapi import FastAPI
from pydantic import BaseModel
import json

# Load pre-trained BERT model and tokenizer from Hugging Face Transformers.
# The names carry a "bert_" prefix so that the SentenceTransformer loaded
# later as `model` does not overwrite them.
bert_tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
bert_model = BertModel.from_pretrained('bert-base-uncased')

# Load pre-trained spaCy model
nlp = spacy.load('en_core_web_md')

# Function to get embeddings using Hugging Face (mean of the token embeddings)
def get_embedding_huggingface(text):
    # truncation=True keeps long inputs within BERT's maximum sequence length
    inputs = bert_tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = bert_model(**inputs)
    return outputs.last_hidden_state.mean(dim=1).squeeze().numpy()

# Function to get embeddings using spaCy (average of the token vectors)
def get_embedding_spacy(text):
    doc = nlp(text)
    return doc.vector





# **Step 3: Function Registry**

The first part of the task involves creating a registry of common automation functions. These functions can open applications, retrieve system information, or execute commands.

We will create functions for tasks such as:

- Opening Google Chrome
- Opening Calculator
- Opening Notepad
- Retrieving CPU usage
- Retrieving RAM usage
- Executing shell commands

These functions will be stored in a dictionary called `function_registry` to allow easy and dynamic access. By using this registry, we can map user requests to these predefined functions and execute them based on the query.

### **Theory**:

1. **Function Definition**: We define Python functions to carry out tasks like opening apps or monitoring system performance.
2. **Registry Setup**: Functions are mapped to string keys (e.g., "open_chrome"), and stored in a dictionary. This enables quick retrieval of functions based on user input.
3. **Function Invocation**: The system can dynamically invoke functions by referencing the dictionary.

Now, let's implement the function registry in Python.

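In [5]:
# Function Registry
# Note: "calc", "notepad", and "wmic" are Windows commands, so these functions assume Windows.
import os
import subprocess
import webbrowser

def open_chrome():
    # Opens google.com in the system's default browser, which may not be Chrome
    webbrowser.open("https://www.google.com")

def open_calculator():
    # Popen returns immediately instead of blocking until the app is closed
    subprocess.Popen(["calc"])

def open_notepad():
    subprocess.Popen(["notepad"])

def get_cpu_usage():
    # Returns the raw text output of wmic, including the "LoadPercentage" header
    return os.popen("wmic cpu get loadpercentage").read()

def get_ram_usage():
    # Returns free and total physical memory (in kilobytes) as raw wmic output
    return os.popen("wmic OS get FreePhysicalMemory,TotalVisibleMemorySize /Value").read()

def execute_command(command):
    # WARNING: runs an arbitrary shell command; never pass untrusted input
    return os.popen(command).read()

# Dictionary to store functions by name for easy reference
function_registry = {
    "open_chrome": open_chrome,
    "open_calculator": open_calculator,
    "open_notepad": open_notepad,
    "get_cpu_usage": get_cpu_usage,
    "get_ram_usage": get_ram_usage,
    "execute_command": execute_command
}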
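
To illustrate how a registry of this shape supports dynamic invocation, here is a minimal sketch. The stand-in functions and the `invoke` helper are hypothetical and not part of the notebook; they mirror the structure of `function_registry` so the example runs on any platform.

```python
# Stand-in functions (hypothetical) so the sketch runs without Windows tools
def say_hello():
    return "hello"

def echo(text):
    return text

# Same shape as function_registry: string keys mapped to callables
demo_registry = {"say_hello": say_hello, "echo": echo}

def invoke(registry, name, *args):
    """Look up a function by name and call it, failing clearly on unknown names."""
    func = registry.get(name)
    if func is None:
        raise KeyError(f"No function registered under {name!r}")
    return func(*args)

print(invoke(demo_registry, "say_hello"))
print(invoke(demo_registry, "echo", "ping"))
```

Checking the name before calling keeps a mistyped or unexpected key from surfacing as an obscure error deeper in the service.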

# **Step 4: LLM + RAG for Function Retrieval**

In this step, we will integrate a **Large Language Model (LLM)** and **Retrieval-Augmented Generation (RAG)** to retrieve the best matching function from the registry based on user input.

### **Theory**:

1. **Vector Embeddings**: We use a pre-trained sentence-embedding model to convert user queries into vector embeddings. The model encodes the query into a dense vector, which is then compared with the vectors of the predefined function descriptions.
   
2. **Vector Search**: Function metadata (function names and descriptions) is stored as vectors. With both user queries and function metadata converted into embeddings, we can search for the most relevant function by cosine similarity. For a registry this small, the implementation below compares the query against every embedding held in an in-memory dictionary; a vector index such as FAISS or ChromaDB becomes useful when the number of functions grows large.

3. **Query Handling**: The system retrieves the most relevant function from the function registry; the executable Python code that invokes it is generated in the next step.
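
The vector search described above can be sketched with plain NumPy. This is a brute-force version under stated assumptions: the `top_k_cosine` helper is hypothetical, and the three-dimensional vectors are made-up numbers standing in for real model embeddings.

```python
import numpy as np

def top_k_cosine(query_vec, matrix, k=1):
    """Return (indices, scores) of the k rows of `matrix` most similar to `query_vec`."""
    # Normalize rows and the query so a dot product equals cosine similarity
    matrix_norm = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    query_norm = query_vec / np.linalg.norm(query_vec)
    scores = matrix_norm @ query_norm
    # argsort is ascending, so reverse it and keep the first k
    order = np.argsort(scores)[::-1][:k]
    return order, scores[order]

# Toy "embeddings" (made-up numbers, not real model output)
names = ["open_chrome", "get_cpu_usage", "open_notepad"]
vectors = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
query = np.array([0.1, 0.9, 0.2])

indices, scores = top_k_cosine(query, vectors, k=2)
print([names[i] for i in indices])
```

Stacking the embeddings into one matrix replaces the per-function loop with a single matrix-vector product, which is the same kind of exhaustive comparison a flat vector index performs.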

Now, let's implement the LLM + RAG for function retrieval using a pre-trained sentence transformer model to encode the user query and function metadata.


In [7]:
# Import the necessary library
from sentence_transformers import SentenceTransformer
import numpy as np

# Load Sentence Transformer model
model = SentenceTransformer('all-MiniLM-L6-v2')

# Function metadata: a short description of each function, keyed by function name
function_metadata = {
    "open_chrome": "Opens Google Chrome browser",
    "open_calculator": "Opens the Calculator application",
    "open_notepad": "Opens the Notepad application",
    "get_cpu_usage": "Retrieves the current CPU usage percentage",
    "get_ram_usage": "Retrieves the current RAM usage",
    "execute_command": "Executes a shell command"
}

# Create embeddings for the function metadata
function_embeddings = {func: model.encode(description) for func, description in function_metadata.items()}

# Function to encode user query and find the best matching function
def find_best_matching_function(query):
    query_embedding = model.encode(query)
    
    # Calculate cosine similarity with function embeddings
    similarities = {func: np.dot(query_embedding, embedding) / (np.linalg.norm(query_embedding) * np.linalg.norm(embedding))
                    for func, embedding in function_embeddings.items()}
    
    # Find the function with the highest similarity score
    best_function = max(similarities, key=similarities.get)
    return best_function, function_registry[best_function]



### **Explanation of the Code**:

1. **Importing the SentenceTransformer**: 
   We import the `SentenceTransformer` class from the `sentence_transformers` library to load a pre-trained model that can encode sentences into vector representations (embeddings).

2. **Loading the Pre-trained Model**: 
   We use the `SentenceTransformer('all-MiniLM-L6-v2')` model, which is a lightweight transformer model trained to generate high-quality sentence embeddings.

3. **Function Metadata**: 
   A dictionary, `function_metadata`, is created where each function name (e.g., `open_chrome`) is mapped to a short description of the function's task (e.g., "Opens Google Chrome browser").

4. **Generating Embeddings**:
   For each function description in the `function_metadata`, we generate an embedding (vector representation) using the pre-trained model. This is stored in the `function_embeddings` dictionary.

5. **Function to Find Best Matching Function**:
   - We encode the user query into an embedding using the same model.
   - We compute the cosine similarity between the user query's embedding and each function's embedding.
   - The function with the highest similarity score is selected as the best match.
   - The corresponding function from the `function_registry` is returned for execution.

This process allows the system to dynamically retrieve the most relevant function based on the user's input.
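
One consequence of always taking the highest score is that some function is returned even when the query matches nothing well. A possible refinement, sketched here with toy two-dimensional vectors, is to reject matches below a minimum similarity. The `best_match_or_none` helper and the `min_score` value of 0.5 are assumptions for illustration, not values from the notebook.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length sequences of numbers."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def best_match_or_none(query_vec, embeddings, min_score=0.5):
    """Return (name, score) for the best match, or (None, score) if below min_score."""
    scores = {name: cosine(query_vec, vec) for name, vec in embeddings.items()}
    best = max(scores, key=scores.get)
    if scores[best] < min_score:
        return None, scores[best]
    return best, scores[best]

# Toy embeddings (made-up numbers, not real model output)
toy_embeddings = {"open_chrome": [1.0, 0.0], "get_cpu_usage": [0.0, 1.0]}

print(best_match_or_none([0.9, 0.1], toy_embeddings))    # points toward open_chrome
print(best_match_or_none([-1.0, -1.0], toy_embeddings))  # similar to neither function
```

A suitable threshold depends on the embedding model and the function descriptions, so it would need tuning against real queries.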


# **Step 5: Dynamic Code Generation for Function Invocation**

In this step, we will dynamically generate executable Python code based on the function retrieved from the LLM + RAG system.

### **Theory**:

1. **Code Generation**: 
   Once we have the relevant function from the function registry, we will generate a Python script to invoke that function. The generated script will include the necessary imports, the function call, and proper error handling.

2. **Dynamic Script Creation**: 
   We will use Python's string formatting or f-strings to construct a Python script that includes the required imports, executes the chosen function, and handles potential exceptions.

3. **Error Handling**: 
   We will ensure that the generated code includes a `try-except` block to catch and handle any errors that might occur during the function execution.

Now, let's implement the dynamic code generation for the function invocation.

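In [10]:
# Function to generate Python code for function invocation
def generate_code_for_function(function_name):
    # Only generate code for registered functions, so arbitrary text never reaches the template
    if function_name not in function_registry:
        raise ValueError(f"Unknown function: {function_name}")

    # The generated script assumes the registry functions are saved in an
    # importable module named automation_functions
    import_statement = "from automation_functions import " + function_name

    # The function is called without arguments, so functions that require one
    # (e.g. execute_command) raise a TypeError that the except block reports
    code = f"""
{import_statement}

def main():
    try:
        result = {function_name}()
        if result is not None:
            print(result)
        print("{function_name} executed successfully.")
    except Exception as e:
        print(f"Error executing {function_name}: {{e}}")

if __name__ == "__main__":
    main()
"""
    return code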


### **Explanation of the Code**:

1. **Import Statement Generation**: 
   We dynamically create the import statement for the given function by appending the function name to the string `"from automation_functions import "`. This lets the generated script import the function that matches the query. It assumes the registry functions are saved in an importable module named `automation_functions`.

2. **Dynamic Code Creation**: 
   Using Python's f-string formatting, we construct a Python script that:
   - **Imports the Function**: The generated script begins by importing the required function using the import statement created earlier.
   - **Defines the `main()` Function**: Inside the `main()` function, we call the function dynamically using its name. We wrap this in a `try-except` block to ensure that any errors during function execution are handled gracefully.
   - **Error Handling**: If an error occurs while invoking the function, it is caught in the `except` block, and a message is printed indicating which function failed and the nature of the error.

3. **Structure of the Generated Code**: 
   The script is structured as a standalone Python program. When executed, it will import the function, attempt to run it, and print either a success or error message.

This dynamic code generation allows us to automate the process of creating executable scripts for any function retrieved by the system.
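
The same templating idea can be extended to functions that take arguments, with a syntax check on the result. The sketch below is one possible approach, not the notebook's implementation: `generate_call_script` is a hypothetical helper, and rendering arguments with `repr()` is only suitable for simple literals such as strings and numbers.

```python
import ast

def generate_call_script(function_name, args=()):
    """Build a script that calls function_name(*args); hypothetical helper."""
    if not function_name.isidentifier():
        raise ValueError(f"Not a valid function name: {function_name!r}")
    # repr() turns each argument into a Python literal, so quotes are escaped
    rendered_args = ", ".join(repr(a) for a in args)
    script = (
        f"from automation_functions import {function_name}\n"
        "\n"
        "def main():\n"
        "    try:\n"
        f"        result = {function_name}({rendered_args})\n"
        "        print(result)\n"
        "    except Exception as e:\n"
        f"        print(f\"Error executing {function_name}: {{e}}\")\n"
        "\n"
        "if __name__ == \"__main__\":\n"
        "    main()\n"
    )
    # Raises SyntaxError if the template produced invalid Python
    ast.parse(script)
    return script

print(generate_call_script("execute_command", ["echo hello"]))
```

Parsing the script with `ast.parse` checks only that it is syntactically valid; it does not run the code or confirm that the import will succeed.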

# **Step 6: API Service Implementation**

In this step, we will implement the API service that exposes an endpoint to handle user requests. The service will receive a user query via a POST request, retrieve the relevant function, generate the Python code, and return it as a response.

### **Theory**:

1. **API Framework**: 
   We will use **FastAPI**, a modern Python web framework for building APIs that validates requests from type hints and generates OpenAPI documentation automatically.

2. **Request Handling**: 
   The API will have an endpoint (`/execute`) that accepts POST requests. The body of the request will contain a JSON object with a `prompt` key, which will hold the user query.

3. **Response**: 
   After retrieving the relevant function and generating the code, the API will return a JSON response containing the function name and the generated code.

4. **API Service Structure**: 
   - The request will trigger the process of encoding the query, retrieving the relevant function, and generating the corresponding code.
   - The response will contain the function name and the Python code for invocation.

Now, let's implement the FastAPI service.


In [11]:
# Import FastAPI and other necessary modules
from fastapi import FastAPI
from pydantic import BaseModel
import json

# Initialize FastAPI app
app = FastAPI()

# Create a Pydantic model for request data
class QueryRequest(BaseModel):
    prompt: str

# Endpoint to execute user queries and generate code
@app.post("/execute")
async def execute_query(query_request: QueryRequest):
    # Extract the user query from the request
    user_query = query_request.prompt
    
    # Find the best matching function based on the query
    best_function, function = find_best_matching_function(user_query)
    
    # Generate the Python code for function invocation
    generated_code = generate_code_for_function(best_function)
    
    # Return the response with the function name and generated code
    return {"function": best_function, "code": generated_code}


### **Explanation of the Code**:

1. **FastAPI Setup**:
   - We begin by importing the `FastAPI` class and initializing the app (`app = FastAPI()`). The app object routes HTTP requests to our endpoint functions; an ASGI server such as Uvicorn is what actually runs it.
   
2. **Pydantic Model**:
   - We define a `QueryRequest` class using `pydantic.BaseModel`. This class specifies the structure of the expected JSON request body. It expects a key called `prompt`, which will hold the user’s query.
   - **Why Pydantic?**: Pydantic models help validate the incoming request data and automatically convert it to the correct format, making our code cleaner and more reliable.

3. **API Endpoint**:
   - The `@app.post("/execute")` decorator defines a POST endpoint `/execute` that listens for requests.
   - **`execute_query` function**: This function is triggered when a POST request is made to `/execute`. It accepts a `QueryRequest` object (which contains the user query in the `prompt` field).
     - First, we extract the user query (`user_query = query_request.prompt`).
     - We then pass the query to the `find_best_matching_function` function, which returns the most relevant function and the corresponding implementation from the registry.
     - Next, we generate the Python code for invoking the selected function using `generate_code_for_function(best_function)`.
     - Finally, we return a JSON response that contains the function name and the dynamically generated Python code.

4. **Asynchronous Handling**:
   - We declare `execute_query` with the `async` keyword, so FastAPI runs it on its event loop.
   - **Why Async?**: Asynchronous endpoints let the server handle other requests while one request is waiting on I/O. Note that `model.encode` is a synchronous, CPU-bound call, so it blocks the event loop while it runs; if this becomes a bottleneck, declaring the endpoint with plain `def` (which FastAPI runs in a thread pool) is an alternative.

This FastAPI service allows users to send a query, retrieve the appropriate function, and get the corresponding Python code for execution—all in a structured and efficient way.
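
A client call to the `/execute` endpoint can be sketched with the standard library alone. The base URL `http://localhost:8000` and the `build_execute_request` helper are assumptions for illustration; they presume the app is being served locally, for example by Uvicorn. The sketch builds the request without sending it, so it runs even when the service is down.

```python
import json
import urllib.request

def build_execute_request(prompt, base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for the /execute endpoint."""
    # The body matches the QueryRequest model: a JSON object with a "prompt" key
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/execute",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_execute_request("Open the calculator")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To actually send it (requires the service to be running):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

The response is expected to be a JSON object with the `function` and `code` keys returned by `execute_query`.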


# **Bonus Enhancements (Optional)**

In this step, we will implement additional features to enhance the system. These include logging, monitoring, and supporting custom user-defined functions.

### **Theory**:

1. **Logging**: 
   We will add logging to the system to track the function execution process, including any errors or issues that arise. This will help in debugging and maintaining the system.
   
2. **Monitoring**: 
   Monitoring would track function usage, execution times, and performance metrics to confirm that the system keeps running efficiently as it scales. The code cell in this section does not implement it.

3. **Support for Custom User-Defined Functions**: 
   We will extend the system to allow users to define their own functions through the API. Users can submit new functions, which will be stored in the function registry and made available for execution.

Now, let's implement logging, followed by support for custom user-defined functions.


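In [12]:
import logging
from typing import Optional

# Set up basic logging
logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")

# Function to log function execution
def log_function_execution(function_name: str, status: str, error_message: Optional[str] = None):
    if status == "success":
        logging.info(f"Function {function_name} executed successfully.")
    else:
        logging.error(f"Function {function_name} failed with error: {error_message}")

# Request body for registering a custom function
class AddFunctionRequest(BaseModel):
    function_name: str
    function_code: str

# Endpoint to add custom user-defined functions
# WARNING: exec() runs arbitrary code; only expose this endpoint to trusted users.
@app.post("/add_function")
async def add_user_function(request: AddFunctionRequest):
    function_name = request.function_name
    try:
        if not function_name.isidentifier():
            raise ValueError("function_name must be a valid Python identifier")
        # Run the submitted code in its own namespace rather than the server's scope
        namespace = {}
        exec(request.function_code, namespace)
        if not callable(namespace.get(function_name)):
            raise ValueError(f"function_code does not define a callable named {function_name}")
        # Note: to make the function retrievable by queries, a description must also be
        # added to function_metadata and its embedding to function_embeddings
        function_registry[function_name] = namespace[function_name]
        logging.info(f"Function {function_name} added successfully.")
        return {"message": f"Function {function_name} added successfully."}
    except Exception as e:
        logging.error(f"Error adding function {function_name}: {e}")
        return {"message": f"Error adding function {function_name}: {str(e)}"}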


### **Explanation of the Code**:

1. **Logging**:
   - We set up logging using `logging.basicConfig()` to capture events with timestamps and log levels (INFO for success, ERROR for failure).
   - The `log_function_execution` function logs the status of a function execution, recording whether it was successful or failed. If failed, the error message is logged.

2. **Custom User-Defined Functions**:
   - We added an endpoint `/add_function` that accepts a `function_name` and `function_code` (Python code) as input.
   - The submitted code is run with `exec()`, and the resulting function is added to the `function_registry`, which makes it available for future execution. Because `exec()` runs arbitrary Python code, this endpoint should only be exposed to trusted users.
   - If the function is added successfully, a success message is logged; if there’s an error, it’s logged as an ERROR.

3. **Why These Enhancements**:
   - **Logging** helps track function execution and error handling, which is useful for debugging and maintaining the system.
   - **Custom Functions** enable users to extend the system by defining their own automation tasks through the API.

These enhancements improve the system's flexibility, monitoring, and troubleshooting capabilities.
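
The monitoring goal described in this section (usage counts and execution times) could be sketched as a decorator. This is a minimal in-memory version under stated assumptions: `monitored`, `call_stats`, and `sample_task` are hypothetical names, and the metrics are lost when the process restarts.

```python
import functools
import logging
import time
from collections import defaultdict

# In-memory metrics per function name (hypothetical store)
call_stats = defaultdict(lambda: {"calls": 0, "errors": 0, "total_seconds": 0.0})

def monitored(func):
    """Wrap func so each call is timed, counted, and logged."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        stats = call_stats[func.__name__]
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        except Exception:
            stats["errors"] += 1
            logging.exception("Function %s failed", func.__name__)
            raise
        finally:
            # Runs on both success and failure, so every call is counted
            elapsed = time.perf_counter() - start
            stats["calls"] += 1
            stats["total_seconds"] += elapsed
            logging.info("Function %s took %.4f s", func.__name__, elapsed)
    return wrapper

@monitored
def sample_task():
    return sum(range(1000))

sample_task()
sample_task()
print(call_stats["sample_task"]["calls"])
```

Wrapping the registry entries with such a decorator would record metrics without changing the functions themselves.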
