# Embeddings and tools tests and conclusions

This document tests different ways of using embeddings and tools to give an LLM better context about a project's code structure and existing functionality, with the goal of improving how it develops new code and features.

In [None]:
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_core.documents import Document


embeddings_model = OpenAIEmbeddings()
chroma_db = Chroma(
    collection_name="codebase",
    embedding_function=embeddings_model,
    persist_directory="./chroma_db",
)

## Pure embeddings

Although simple embeddings can improve the LLM's context and give it "awareness" of existing code snippets that match the request, on their own they add little value: they are mainly useful for identifying repeated or identical functionality in the code.

In cases where it is necessary to add a new feature to an existing code snippet (such as a new method in a class), it has been observed that relying solely on embeddings as a code source results in the LLM having great difficulty integrating the feature while keeping the rest of the code in the file intact, as the file often needs to be entirely rewritten. This leads to incorrect code and/or the loss of other existing functionalities.

Two approaches were tested here:

   * Creating embeddings for each existing function in the code.
   * Creating embeddings by splitting the document into parts (e.g., 100 characters each).
 

**NOTE**:

For better results, the embeddings should be cleaned, going through a process of filtering and removing unnecessary elements, such as extraneous comments or commented-out code. A solution to explore further is the removal of unused functions or unnecessary imports, if they exist. However, this kind of approach requires static code analysis tools, or other tools, that can identify these types of issues.
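
As a minimal sketch of the cleaning step described above, `#` comments (including commented-out code) can be removed with the standard `tokenize` module before a snippet is embedded. The helper name `strip_comments` is an assumption made for this example and is not part of the project. Docstrings are kept, and blank lines are dropped as a side effect.

```python
import io
import tokenize


def strip_comments(source: str) -> str:
    """Return source with '#' comments removed (docstrings are kept)."""
    tokens = [
        tok
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type != tokenize.COMMENT
    ]
    cleaned = tokenize.untokenize(tokens)
    # Removed comments leave trailing whitespace or empty lines behind;
    # trim the former and drop the latter (this also drops blank lines)
    return "\n".join(line.rstrip() for line in cleaned.splitlines() if line.strip())
```

Removing unused functions or imports is a different problem and would still need a static analysis tool, as noted above.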


### Creating embeddings for each existing function in code

To create embeddings for each function, we first need to go through each Python file and extract every function along with the code that makes it up. 

The following class is used to parse Python code and extract various aspects of it, such as imports, classes, functions, etc., into a JSON structure:

In [None]:
import ast


class PythonFileParser(ast.NodeVisitor):
    """Parse a Python file and return its structure."""

    def __init__(self, file_name):
        self.file_name = file_name
        self.imports = []
        self.classes = []
        self.functions = []
        self.constants = []
        self.main_block = False
        self.global_statements = []
        self.comments = []

    def _create_func_dict(self, func_node):
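        # Pre-process the function's return annotation
        if not func_node.returns:
            returns = None

        elif isinstance(func_node.returns, ast.Constant):
            returns = func_node.returns.value

        elif isinstance(func_node.returns, ast.BinOp):
            # Union annotation such as `str | dict`: keep both sides as source text
            returns = [ast.unparse(func_node.returns.left), ast.unparse(func_node.returns.right)]

        else:
            # ast.unparse also copes with annotations that have no `.id`,
            # such as `list[str]` or `typing.Any`
            returns = ast.unparse(func_node.returns)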

        return {
            "name": func_node.name,
            "start_line": func_node.lineno - 1,
            "end_line": func_node.end_lineno,
            "parameters": [arg.arg for arg in func_node.args.args],
            "returns": returns,
        }

    def visit_Import(self, node):
        self.imports.append(
            {
                "module": node.names[0].name,
                "alias": None,
                "start_line": node.lineno - 1,
                "end_line": node.end_lineno,
            }
        )
        self.generic_visit(node)

    def visit_ImportFrom(self, node):
        self.imports.append(
            {
                "module": node.module,
                "alias": node.names[0].name,
                "start_line": node.lineno - 1,
                "end_line": node.end_lineno,
            }
        )
        self.generic_visit(node)

    def visit_ClassDef(self, node):
        tmp_class = {
            "name": node.name,
            "start_line": node.lineno - 1,
            "end_line": node.end_lineno,
            "methods": [],
        }
        for func in node.body:
            if isinstance(func, ast.FunctionDef):
                method_dict = self._create_func_dict(func)
                tmp_class["methods"].append(method_dict)
                func.is_method = True
        self.classes.append(tmp_class)
        self.generic_visit(node)

    def visit_FunctionDef(self, node):
        if not hasattr(node, "is_method"):
            self.functions.append(self._create_func_dict(node))
        self.generic_visit(node)

    def visit_Assign(self, node):
        if isinstance(node.targets[0], ast.Name):
            value = self._simplify_value(node.value)
            const_dict = {
                "name": node.targets[0].id,
                "start_line": node.lineno - 1,
                "end_line": node.end_lineno,
            }
            if value is not None:
                const_dict["value"] = value
            self.constants.append(const_dict)
        self.generic_visit(node)

    def visit_If(self, node):
        if (
            isinstance(node.test, ast.Compare)
            and isinstance(node.test.left, ast.Name)
            and node.test.left.id == "__name__"
            and any(isinstance(op, ast.Eq) for op in node.test.ops)
            # String literals are ast.Constant nodes (ast.Str was removed in Python 3.12)
            and any(isinstance(cmp, ast.Constant) and cmp.value == "__main__" for cmp in node.test.comparators)
        ):
            self.main_block = True
        self.generic_visit(node)

    def visit_Expr(self, node):
        if isinstance(node.value, ast.Constant) and isinstance(node.value.value, str):
            self.comments.append(
                {
                    "type": "docstring",
                    "content": node.value.value,
                    "start_line": node.lineno - 1,
                    "end_line": node.end_lineno,
                }
            )
        self.generic_visit(node)

    def visit(self, node):
        # Bare string expressions are recorded as docstrings; everything else is dispatched normally
        if isinstance(node, ast.Expr) and isinstance(node.value, ast.Constant) and isinstance(node.value.value, str):
            self.comments.append(
                {
                    "type": "docstring",
                    "content": node.value.value,
                    "start_line": node.lineno - 1,
                    "end_line": node.end_lineno,
                }
            )
        else:
            super().visit(node)

    def visit_Module(self, node):
        for n in node.body:
            if isinstance(n, ast.Expr) and isinstance(n.value, ast.Constant) and isinstance(n.value.value, str):
                self.comments.append(
                    {
                        "type": "docstring",
                        "content": n.value.value,
                        "start_line": n.lineno - 1,
                        "end_line": n.end_lineno,
                    }
                )
            else:
                self.visit(n)

    def visit_Global(self, node):
        self.global_statements.append(
            {
                "type": "global",
                "identifiers": node.names,
                "start_line": node.lineno - 1,
                "end_line": node.end_lineno,
            }
        )
        self.generic_visit(node)

    def get_structure(self):
        return {
            "file_name": self.file_name,
            "imports": self.imports,
            "classes": self.classes,
            "functions": self.functions,
            "constants": self.constants,
            "main_block": self.main_block,
            "global_statements": self.global_statements,
            "comments": self.comments,
        }

    def _simplify_value(self, value):
        # ast.Constant covers strings, numbers, booleans, and None (Python 3.8+);
        # ast.Str, ast.Num, and ast.NameConstant were removed in Python 3.12
        if isinstance(value, ast.Constant):
            return value.value
        return None

The parser can be invoked through the parse_python_file function:

In [None]:
import os


def parse_python_file(file_path: str) -> str | dict:
    """Parses a Python file and returns its structure."""
    if os.path.isdir(file_path):
        return "IS A DIRECTORY"

    with open(file_path, "r") as source:
        file_content = source.read()
    parser = PythonFileParser(file_name=file_path)
    tree = ast.parse(file_content, file_path)
    parser.visit(tree)
    return parser.get_structure()

#### parse_python_file call result

Below is the (simplified) result of executing the parse_python_file function on a Python file with imports, classes, and other functions.

```json
{
  "file_name": "./utilities.py",
  "imports": [
    {
      "module": "ast",
      "alias": null,
      "start_line": 1,
      "end_line": 1
    },
    {
      "module": "cProfile",
      "alias": null,
      "start_line": 2,
      "end_line": 2
    }
  ],
  "classes": [
    {
      "name": "PythonFileParser",
      "start_line": 57,
      "end_line": 230,
      "methods": [
        {
          "name": "__init__",
          "start_line": 60,
          "end_line": 68,
          "parameters": ["self", "file_name"],
          "returns": null
        },
        {
          "name": "_create_func_dict",
          "start_line": 70,
          "end_line": 90,
          "parameters": ["self", "func_node"],
          "returns": null
        },
        ...
      ]
    }
  ],
  "functions": [
    {
      "name": "get_project_files",
      "start_line": 15,
      "end_line": 29,
      "parameters": ["directory_path"],
      "returns": "str"
    },
    {
      "name": "get_lines_code",
      "start_line": 32,
      "end_line": 42,
      "parameters": ["file_path", "start_line", "end_line"],
      "returns": "str"
    }
  ],
  "constants": [
    {
      "name": "result",
      "start_line": 17,
      "end_line": 17,
      "value": ""
    },
    ...
  ],
  "comments": [
    {
      "type": "docstring",
      "content": "List all Python (.py) files and directories that may contain Python files.",
      "start_line": 16,
      "end_line": 16
    },
    ...
  ]
}
```

The following function uses the start and end line numbers from the previous structure to retrieve the corresponding code from the Python file.

In [None]:
def get_lines_code(file_path: str, start_line: int, end_line: int) -> str:
    """Get specific lines of code from a file."""
    if not os.path.exists(file_path):
        return f"{file_path} FILE NOT FOUND"

    elif os.path.isdir(file_path):
        return "IS A DIRECTORY"

    with open(file_path, "r") as f:
        lines = f.readlines()
    # Keep each line as written: stripping leading whitespace would destroy Python indentation
    return "".join(lines[start_line:end_line])
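
To make the indexing convention explicit: `ast` reports 1-based, inclusive line numbers, which is why the parser stores `lineno - 1` as start_line and `end_lineno` as end_line. The sketch below uses a made-up source string (an assumption for illustration only) to show that slicing a list of lines with `[start_line:end_line]` then covers exactly the function.

```python
import ast

source = '''import os

def greet(name):
    """Return a greeting."""
    return f"Hello, {name}"

VALUE = 1
'''

func = next(n for n in ast.parse(source).body if isinstance(n, ast.FunctionDef))

# ast line numbers are 1-based and inclusive, so the start is shifted by one
start_line, end_line = func.lineno - 1, func.end_lineno

# The half-open slice [start_line:end_line] selects the whole function and nothing else
snippet = source.splitlines()[start_line:end_line]
print("\n".join(snippet))
```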

To generate the embeddings for each file in the project, we still need a function that allows us to list files in a directory:

In [None]:
def get_project_files(directory_path: str = ".") -> str:
    """List all Python (.py) files and directories that may contain Python files."""
    result = ""
    try:
        for file in os.listdir(directory_path):
            full_path = os.path.join(directory_path, file)
            if os.path.isdir(full_path):
                result += f"{file}/\n"
            elif file.endswith(".py"):
                result += f"{file}\n"
    except NotADirectoryError:
        result = "NOT A DIRECTORY"
    except FileNotFoundError:
        result = "FILE NOT FOUND"
    return result
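
Note that get_project_files lists a single directory level, so a loop over its output only reaches the files in the project root. A hedged sketch of a recursive variant, built on `os.walk`, is shown below. The name `iter_python_files` and the choice of skipped directories are assumptions for this example, not part of the tested code.

```python
import os


def iter_python_files(root: str = "."):
    """Yield the path (prefixed with root) of every .py file below root."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune hidden and cache directories in place so os.walk does not enter them
        dirnames[:] = [d for d in dirnames if not d.startswith(".") and d != "__pycache__"]
        for name in sorted(filenames):
            if name.endswith(".py"):
                yield os.path.join(dirpath, name)
```

Because it yields full paths, each result could be passed directly to parse_python_file or get_lines_code.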

Now that we have the necessary functions, we can create embeddings for each existing function in each Python file:

In [None]:
def store_embeddings(content: str, metadata: dict | list[dict] | None = None) -> None:
    """Store embeddings of content with associated metadata."""
    # Callers pass a single dict per snippet; Chroma expects one dict per text
    if metadata and not isinstance(metadata, list):
        metadata = [metadata]

    # Chroma embeds the text itself through the embedding_function it was created with
    chroma_db.add_texts([content], metadatas=metadata)


def analyze_and_store_code(file_path: str):
    """Analyze a Python file and store its embeddings, including functions and class methods."""
    structure = parse_python_file(file_path)

    # Store embeddings for standalone functions
    for func in structure["functions"]:
        code_snippet = get_lines_code(file_path, func["start_line"], func["end_line"])
        metadata = {"name": func["name"], "type": "function", "file": file_path}
        store_embeddings(code_snippet, metadata)

    # Store embeddings for classes and their methods
    for cls in structure.get("classes", []):
        class_snippet = get_lines_code(file_path, cls["start_line"], cls["end_line"])
        class_metadata = {"name": cls["name"], "type": "class", "file": file_path}

        # Store the class code snippet
        store_embeddings(class_snippet, class_metadata)

        # Store methods within the class
        for method in cls.get("methods", []):
            method_snippet = get_lines_code(file_path, method["start_line"], method["end_line"])
            method_metadata = {
                "name": method["name"],
                "type": "method",
                "class": cls["name"],
                "file": file_path,
            }
            store_embeddings(method_snippet, method_metadata)


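# Preprocess: store embeddings for all Python files in the project root
for file in get_project_files().strip().split("\n"):
    # Directories are listed with a trailing "/" and cannot be parsed, so skip them
    if file.endswith(".py"):
        analyze_and_store_code(file)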

#### Results:

Although the results are useful for understanding repeated functions or those with similar functionality, this method needs to be further developed. Moreover, when used alone, it does not address the issues raised by the exclusive use of embeddings. Two proposed strategies to enhance the value of the embeddings are:

* Adding docstrings and typing for greater context for the LLM and improved results in the embeddings.
* Including metadata such as the file and line numbers where it exists, a brief explanation of the code, objectives, and examples of output.

It is believed that adding this data to the embeddings improves the results since the LLM does not always search for embeddings by the specific name of the function or by code snippets, but also by expressions or phrases.
Although this may improve the results of the embeddings and their usefulness, they need to be complemented with other 'tools' that the LLM can use when it deems appropriate.

Note that both docstrings and metadata can (or should) be generated by the LLM itself during an initial process of interpreting and exploring the project.

**Another issue that arises from this method is removing old embeddings and updating them with the most recent ones as the project's code is developed or altered.**
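
The two strategies, and the update problem, can be sketched with two small helpers. Both names (`snippet_id`, `build_embedding_text`) and the header layout are assumptions made for illustration. The first derives a stable ID from where a snippet lives, so that re-indexing the same function can overwrite its previous entry instead of adding a duplicate. The second prepends a natural-language header to the code, so that searches by phrase have something to match.

```python
import hashlib


def snippet_id(file_path: str, kind: str, name: str) -> str:
    """Build a stable ID so re-indexing a snippet can replace its old entry."""
    return hashlib.sha1(f"{file_path}::{kind}::{name}".encode()).hexdigest()


def build_embedding_text(code: str, description: str, file_path: str, start_line: int, end_line: int) -> str:
    """Prefix the code with a short natural-language header before embedding it."""
    header = f"File: {file_path} (lines {start_line}-{end_line})\nPurpose: {description}"
    return f"{header}\n\n{code}"
```

The ID could then be passed through the `ids` argument that Chroma's add_texts accepts. Whether an existing entry is overwritten depends on the vector store version in use, so this should be verified before relying on it. For methods, the class name would need to be part of `name` (for example `PythonFileParser.visit`) to avoid collisions.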

### Creating embeddings by splitting the document into parts

This second method is quite similar to the previous one, except that the Python code is no longer interpreted. Instead, the file is split into parts of a specified size, and the respective embeddings are calculated and stored.

In [None]:
def load_file(python_file_path):
    with open(python_file_path, "r") as python_file:
        source = python_file.read()

    # Split documents text
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=100,
        chunk_overlap=100,
        length_function=len,
        add_start_index=True,
    )
    chunks = text_splitter.split_documents([Document(source, metadata={"source": python_file_path})])

    # Save to the existing "codebase" collection; from_documents is a classmethod
    # that would build a separate store instead of adding to chroma_db
    chroma_db.add_documents(chunks)

    print(f"{len(chunks)} chunks persisted")

Now that we have the necessary functions, we can create embeddings for each existing Python file:

In [None]:
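for file in get_project_files().strip().split("\n"):
    # Skip directory entries (listed with a trailing "/"), which cannot be opened as files
    if file.endswith(".py"):
        load_file(file)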

#### Results:

Although this method seems to solve the problem of losing code when the LLM is asked to add new functionality and has to rewrite a file, it is not reliable because:

* It depends on the chunk size used for the split, which may fit an entire file in a single embedding or spread it over several.
* It depends on the size of the code file: a search may not return every chunk of the file, so the problem of code loss remains.
* It depends on the terms the LLM uses to search the embeddings, so chunks of the file being altered may not appear in the results at all.
* Large chunks, combined with a high number of results to retrieve, can easily exceed the token limit that the LLM accepts as input in a prompt.

Additionally, there is the difficulty of improving the context of the embeddings by adding docstrings or code descriptions, as the file is split into pieces, resulting in a loss of flow and logic in the code snippets.

In my analysis of the results, to maximize the effectiveness of the embeddings, methods that preserve the context and logic of the code should be prioritized, avoiding excessive fragmentation. Therefore,
**this should be a method to avoid if we want to make the most out of the embeddings.**
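
The fragmentation problem is easy to reproduce. The sketch below uses a naive fixed-size splitter as a simplified stand-in for the real text splitter (it is not RecursiveCharacterTextSplitter, and the sample function is made up) and checks whether any single chunk still contains both the signature and the return statement.

```python
def fixed_size_chunks(text: str, size: int) -> list[str]:
    """Cut text into consecutive pieces of at most `size` characters."""
    return [text[i : i + size] for i in range(0, len(text), size)]


source = (
    "def area(width, height):\n"
    '    """Return the area of a rectangle."""\n'
    "    return width * height\n"
)

chunks = fixed_size_chunks(source, 40)

# Chunks that hold both the signature and the return statement
complete = [c for c in chunks if "def area" in c and "return width" in c]
print(len(chunks), "chunks,", len(complete), "of them self-contained")
```

Even for a three-line function, no chunk carries the whole definition, which is the loss of flow and logic described above.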

## Pure tools


In line with the previous approaches and knowing that some LLMs can use 'tools,' I decided to provide the LLM with some functions that I developed and found relevant. 'Tools' are code functions that are provided to the LLM, which can invoke them if it deems appropriate. They function like a normal Python function, with the exception that the LLM decides when to invoke them and also which arguments to use.


### Providing Tools to the LLM


The tools should be provided to the LLM following the structure presented here. The schema function (in the code below) creates this structure. It's important to note that the tools must include typing for both input parameters and return values, and they must also have a docstring that concisely describes what the function does (this helps the LLM understand when to use it). The function's return value must always be of type string, and if it is not, it must be converted to that type.
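
As an illustration, the structure for get_lines_code would look roughly like the hand-written dictionary below. It follows the function-tool layout used by the OpenAI chat API. The dictionary generated through pydantic carries some additional fields (such as titles), so this is a simplified sketch rather than the exact output of schema.

```python
import json

# Hand-written, simplified tool definition for get_lines_code
get_lines_code_tool = {
    "type": "function",
    "function": {
        "name": "get_lines_code",
        # Taken from the function's docstring
        "description": "Get specific lines of code from a file.",
        # JSON Schema derived from the typed parameters
        "parameters": {
            "type": "object",
            "properties": {
                "file_path": {"type": "string"},
                "start_line": {"type": "integer"},
                "end_line": {"type": "integer"},
            },
            "required": ["file_path", "start_line", "end_line"],
        },
    },
}

print(json.dumps(get_lines_code_tool, indent=2))
```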

> **SIDE NOTE:**

> Not all LLMs support 'tools,' but some tests I conducted show that if we include something like the following in the system message or in the context of the prompt:

> 'If you need to execute functions, use EXEC \<function name\> \<arguments in JSON\>'

> and then write code to handle these cases, we can achieve almost the same results. Of course, we need to inform the LLM about which functions we have, their purpose, and input arguments, similar to what is done with 'tools' schemas.

Following the reasoning of 'What do I, as a human, need to access and obtain code and content from a project,' I arrived at three basic needs:

* View the content of a project (list directories)
* See what imports, classes, and functions (and other code snippets) a file contains (code structure)
* Read parts or the entire code file

Based on this, I used the tools I had already developed for the embeddings. Now, it remains to allow the LLM to use them in the code.

In [None]:
import inspect
import json
import sys

from langchain_core.prompts import ChatPromptTemplate
from openai import OpenAI
from pydantic import create_model

SYSTEM_MESSAGE = """
You are an expert Python programmer. You are tasked with enhancing an existing codebase by adding features, 
writing tests, optimizing performance, fixing bugs, or other typical software development tasks. 
You have access to specific tools to gather information about the codebase, such as fetching file 
contents, analyzing file structure, running tests, checking syntax, profiling performance, and more. Use these tools 
to gather the necessary context and data before making any modifications to the codebase. Do not make assumptions 
beyond the provided tools and retrieved information.
"""

PROMPT_TEMPLATE = """
Complete the request. You can use the provided tools to get more context of project and files code to complete the request.
{question}
"""


def schema(f):
    kw = {
        n: (o.annotation, ... if o.default == inspect.Parameter.empty else o.default)
        for n, o in inspect.signature(f).parameters.items()
    }
    s = create_model(f"Input for {f.__name__}", **kw).schema()
    return {
        "type": "function",
        "function": dict(name=f.__name__, description=f.__doc__, parameters=s),
    }


if __name__ == "__main__":
    client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

    prompt_template = ChatPromptTemplate.from_template(PROMPT_TEMPLATE)

    messages = [{"role": "system", "content": SYSTEM_MESSAGE}]

    llm_tools = [schema(get_lines_code), schema(get_project_files), schema(parse_python_file)]

    while True:
        user_input = input("? ")
        if not user_input:
            continue

        elif user_input == "exit":
            break

        prompt = prompt_template.format(question=user_input)
        messages.append({"role": "user", "content": prompt})

        # Request OpenAI
        try:
            while True:
                response = client.chat.completions.create(
                    model="gpt-4o",
                    messages=messages,
                    tools=llm_tools,
                )

                messages.append(response.choices[0].message)

                if hasattr(response.choices[0].message, "tool_calls") and getattr(
                    response.choices[0].message, "tool_calls"
                ):
                    # Get the function arguments from model response and call it
                    for tool_call in response.choices[0].message.tool_calls:
                        function_name = tool_call.function.name
                        arguments = json.loads(tool_call.function.arguments)

                        print("USING TOOLS:")
                        print("Using:", tool_call.function.name, arguments)

                        # Call the function with specified arguments
                        result = globals()[function_name](**arguments)

                        # Ensure result is converted to a string
                        str_context_results = json.dumps(result) if isinstance(result, (dict, list)) else str(result)

                        print("Results:")
                        print(str_context_results)
                        print("--------------------------------------------------\n\n")

                        # Provide the results to the model again
                        messages.append(
                            {
                                "role": "tool",
                                "name": tool_call.function.name,
                                "content": str_context_results,
                                "tool_call_id": tool_call.id,
                            }
                        )

                else:
                    print(response.choices[0].message.content)
                    break

        except Exception as e:
            print(f"Error: {e}", file=sys.stderr)

This approach produced significantly better results than the previous ones that relied on embeddings, particularly in solving the problem of adding a new feature without losing the parts of the file that the embeddings failed to return.

With the use of these three tools, when asking the LLM to add a specific functionality to some Python file in the project, the approach it generally takes is:

* List the files in the project's root directory using get_project_files
* Retrieve information about the Python code structure using parse_python_file
* Get the entire content of the file, from line 0 to the last line, using get_lines_code
* Implement the requested functionality
* Return the entire content of the file with the new functionality implemented

As observed throughout several experiments, the procedure is always approximately the same, highlighting the need to add a function to read the complete content of a file.

In [None]:
def get_file_content(file_path: str) -> str:
    """Get the full content of a file."""
    if not os.path.exists(file_path):
        return f"{file_path} FILE NOT FOUND"

    elif os.path.isdir(file_path):
        return "IS A DIRECTORY"

    with open(file_path, "r") as file:
        return file.read()

It should be noted that these functions raise the same problem previously mentioned for embeddings: when reading the complete content of a file, the LLM can quickly exhaust the tokens it accepts as input. Therefore, we will need to test an approach to instruct the model not to read large files at once, or to limit it, because even when the get_file_content tool is removed, it uses get_lines_code to achieve the same end, as mentioned earlier.
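
One possible way to limit reading, not yet tested, is to offer a paginated reader instead of an unrestricted one. The sketch below is an assumption: the name `get_file_page`, the default page size, and the header format are all made up for illustration.

```python
import os


def get_file_page(file_path: str, page: int = 0, lines_per_page: int = 200) -> str:
    """Get one page of a file, prefixed with the page position."""
    if not os.path.isfile(file_path):
        return f"{file_path} FILE NOT FOUND"

    with open(file_path, "r") as f:
        lines = f.readlines()

    # Ceiling division; an empty file still counts as one (empty) page
    total_pages = max(1, -(-len(lines) // lines_per_page))
    if not 0 <= page < total_pages:
        return f"PAGE OUT OF RANGE (valid: 0-{total_pages - 1})"

    start = page * lines_per_page
    body = "".join(lines[start : start + lines_per_page])
    # The header tells the model how much of the file is still unread
    return f"[page {page + 1} of {total_pages}]\n{body}"
```

The header lets the model decide whether it needs further pages, but nothing stops it from requesting all of them, so the token problem is reduced rather than solved.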

Another issue that arises is asking the LLM to add a functionality without specifying which file it should do so in. Although in the tested project there is a file called utilities.py, and most of the time new functionalities have been added there correctly, it is uncertain how the model will behave in larger projects. This behavior should be tested in those projects, assessing what action will be taken with and without the context of the project's code.

It is likely that new functionalities may end up in unexpected places or may not make sense in the context of the project if the model is not properly instructed on where they should be placed.

#### Providing a function to retrieve results from the embeddings

One idea that occurred was, instead of providing the embeddings in the LLM's prompt, to give it a function to search for embeddings when deemed appropriate. Thus, the following function was provided as a tool to the LLM, accepting search terms and a value for k (how many matches to retrieve):

In [None]:
def query_embeddings_for_llm(query_text: str, k: int = 5, return_metadata: bool = False) -> list:
    """Fetch similar content from embeddings for LLM use."""
    results = chroma_db.search(query=query_text, search_type="similarity", k=k)
    if return_metadata:
        return [(res.metadata, res.page_content) for res in results]
    return [res.page_content for res in results]

In the various tests conducted, however, the LLM always preferred to use the other tools instead of this one. When questioned about this preference, it responded that, lacking prior knowledge of what the embeddings contain, it prefers the other functions.

After this response, the system message was changed to indicate that all the code is preloaded and available in the embeddings. Still, after this indication, the model exhibited the same behavior, completely ignoring this component.

The code below presents the new system prompt, along with other functions that were developed to create and retrieve embeddings, as well as how this functionality was made available to the LLM. The rest of the code remains unchanged compared to what has been presented so far.

In [None]:
SYSTEM_MESSAGE = """
You are an expert Python programmer. You are tasked with enhancing an existing codebase by adding features, 
writing tests, optimizing performance, fixing bugs, or other typical software development tasks.
 
All project code has been analyzed and stored in embeddings to facilitate more insightful and context-aware interactions. 

You have access to specific tools to gather information about the codebase, such as fetching file 
contents, analyzing file structure, and more. Use these tools 
to gather the necessary context and data before making any modifications to the codebase. Do not make assumptions 
beyond the provided tools and retrieved information.
"""

# ...


def store_embeddings(content: str, metadata: dict | list[dict] | None = None) -> None:  # noqa F811
    """Store embeddings of content with associated metadata."""
    # Callers pass a single dict per snippet; Chroma expects one dict per text
    if metadata and not isinstance(metadata, list):
        metadata = [metadata]

    # Chroma embeds the text itself through the embedding_function it was created with
    chroma_db.add_texts([content], metadatas=metadata)


def analyze_and_store_code(file_path: str):
    """Analyze a Python file and store its embeddings, including functions and class methods."""
    structure = parse_python_file(file_path)

    # Store embeddings for standalone functions
    for func in structure["functions"]:
        code_snippet = get_lines_code(file_path, func["start_line"], func["end_line"])
        metadata = {"name": func["name"], "type": "function", "file": file_path}
        store_embeddings(code_snippet, metadata)

    # Store embeddings for classes and their methods
    for cls in structure.get("classes", []):
        class_snippet = get_lines_code(file_path, cls["start_line"], cls["end_line"])
        class_metadata = {"name": cls["name"], "type": "class", "file": file_path}

        # Store the class code snippet
        store_embeddings(class_snippet, class_metadata)

        # Store methods within the class
        for method in cls.get("methods", []):
            method_snippet = get_lines_code(file_path, method["start_line"], method["end_line"])
            method_metadata = {
                "name": method["name"],
                "type": "method",
                "class": cls["name"],
                "file": file_path,
            }
            store_embeddings(method_snippet, method_metadata)


if __name__ == "__main__":
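    # Preprocess: store embeddings for all Python files in the project root
    for file in get_project_files().strip().split("\n"):
        # Directories are listed with a trailing "/" and cannot be parsed, so skip them
        if file.endswith(".py"):
            analyze_and_store_code(file)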

    client = OpenAI()
    prompt_template = ChatPromptTemplate.from_template(PROMPT_TEMPLATE)

    messages = [{"role": "system", "content": SYSTEM_MESSAGE}]

    llm_tools = [
        schema(get_lines_code),
        schema(get_project_files),
        schema(parse_python_file),
        schema(store_embeddings),
        schema(query_embeddings_for_llm),
    ]

    # ...

It is worth noting that we also chose to provide a function for storing embeddings (store_embeddings), which also proved to be of no utility to the LLM.

Although the LLM has consistently opted not to use embeddings through query_embeddings_for_llm, it may be necessary to improve the system message and/or prompt to place more emphasis on this functionality. This option still requires testing: the LLM has not yet been asked, for example, to determine whether a specific function already provides a certain functionality or whether similar code is scattered throughout the project. Such questions, which concern code refactoring rather than development, are better suited to this type of function and should be posed to analyze what utility the LLM derives from it.

Based on the behavior observed so far, it is believed that the LLM's choice will continue to lean towards avoiding embeddings in favor of more direct functions, even if they require more operations and time.

### Results

The three initial tools, followed by the additional one to read complete files, were clearly the ones the LLM preferred and used most during the tests. Combined with other functionalities, they can also be crucial for building a broader global context of the project before any request is handed to the LLM, as was the case when they were used to generate embeddings from the existing code.

The tools for storing and retrieving embeddings did not prove as useful from the LLM's perspective, likely because their results are less predictable than those of the other tools, which are more concise, accurate, and deterministic.

It is necessary to conduct more tests on the system message and the prompt, and to analyze how the model reacts to these changes. Additionally, the same tests should be performed with other models to understand which ones are suitable for the objective at hand, as well as how they respond to the same prompts and system messages.

After these tests, it was decided to combine the two approaches, that is: to provide the results of the embeddings in the prompt while simultaneously offering the tools created and already described to the LLM.

## Combine embeddings and tools

This section explains the process followed to combine the previous methods, that is, to inject the embeddings results into the prompt while also providing the tools to the LLM. It is worth noting in advance that this approach has only been lightly evaluated and still requires extensive testing.

All the tools previously made available were used here. The main difference was that the embeddings results were incorporated into the prompt sent to the LLM.

The embeddings were the structured ones created according to the code (one per function). The alternative of splitting files into chunks and generating embeddings from them was not tested in this setup.

In [None]:
# ...

while True:
    user_input = input("? ")
    if not user_input:
        continue
    elif user_input == "exit":
        break

    # Retrieve similar code snippets using embeddings
    similar_content = query_embeddings_for_llm(user_input)

    # Construct the AI prompt with embedding-derived context
    prompt = prompt_template.format(request=user_input, similar_code="\n".join(similar_content))
    messages.append({"role": "user", "content": prompt})

# ...

### Results

Although this method has not been exhaustively tested, the few tests conducted suggest that the LLM tends to prefer the tools, seemingly ignoring the information provided in the context altogether.

One option to explore here is to say **"using the examples that follow"** instead of **"using the context that follows"**. Perhaps this slight change could trigger a different behavior from the LLM and encourage it to pay more attention to what is provided by the embeddings.
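The two wordings can be compared with a small prompt builder that changes only the framing phrase and leaves everything else identical. This is a sketch under assumptions: the template text below is hypothetical and stands in for the actual `prompt_template`, which is not shown in this section. Only the two quoted phrases come from the text above.

```python
# Hypothetical template; the real prompt_template used in the tests may differ.
PROMPT_TEMPLATE = (
    "Fulfil the request below, {framing}.\n\n"
    "Request:\n{request}\n\n"
    "Similar code:\n{similar_code}\n"
)

# The two framings under comparison.
FRAMINGS = {
    "context": "using the context that follows",
    "examples": "using the examples that follow",
}


def build_prompt(request, similar_content, framing="context"):
    """Build the user prompt with one of the two framing phrases."""
    return PROMPT_TEMPLATE.format(
        framing=FRAMINGS[framing],
        request=request,
        similar_code="\n".join(similar_content),
    )


if __name__ == "__main__":
    snippets = ["def total(items):\n    return sum(items)"]
    for name in FRAMINGS:
        print(f"--- {name} ---")
        print(build_prompt("Add an average() helper", snippets, framing=name))
```

Because only one phrase varies, any difference in how often the LLM reuses the injected snippets can be attributed to the wording rather than to other prompt changes.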

Another idea emerged from this approach: maintaining a database of embeddings of code that solved specific problems in other projects, so that it can be provided as examples for similar functionalities.

It should be stressed again that this approach has barely been tested and may still yield relevant results. It is also advisable to test changes to the prompt or the system message, to see how they influence the LLM and whether it pays more attention to the context provided. Finally, the approach should be tested on larger projects to verify whether the LLM truly benefits from the embeddings.

## Embeddings across different projects


This approach has not been tested, and it is believed that human intervention may be necessary. However, it could be worth exploring a persistent embeddings database that stores curated solutions that recur across projects and are proven to work. One example is rest_framework, where we programmers sometimes develop custom methods that solve common problems in an API.

It is essential to test the feasibility of what has been mentioned, to ascertain the actual benefit for the LLM, and to determine the human effort required to identify which solutions deserve to be stored. This does not take into account the need to add extra information to these solutions for the LLM to truly leverage the embeddings. However, it is likely that this last part can be done by the LLM itself before generating the embeddings.

It is also important to define human criteria to help decide what should be stored. This may include the frequency with which the solution is used, its complexity, or its utility in different contexts and projects.
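As a starting point for discussion, the criteria above could be written down as an explicit rule that a human reviewer can then override. The sketch below is purely illustrative: the field names, the 1 to 5 complexity scale, and every threshold are assumptions, not values derived from any test.

```python
from dataclasses import dataclass


@dataclass
class CandidateSolution:
    name: str
    times_reused: int      # how often the solution has been reused
    complexity: int        # reviewer's rating, 1 (trivial) to 5 (hard); assumed scale
    projects_used_in: int  # number of distinct projects where it was useful


def worth_storing(candidate, min_reuse=3, min_complexity=2, min_projects=2):
    """Hypothetical rule: store when at least two of the three criteria are met."""
    votes = [
        candidate.times_reused >= min_reuse,
        candidate.complexity >= min_complexity,
        candidate.projects_used_in >= min_projects,
    ]
    return sum(votes) >= 2


if __name__ == "__main__":
    candidate = CandidateSolution(
        name="custom pagination", times_reused=5, complexity=2, projects_used_in=1
    )
    print(candidate.name, worth_storing(candidate))
```

A rule like this only pre-filters candidates; the final decision would still rest with the person curating the database.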

The metadata accompanying these solutions should be studied. For example:

* Context of the problem and applicability
* Specific problem it solves
* Usage examples
* Possible limitations
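The metadata listed above could be modelled as a small record that produces both the text to embed and a flat metadata dictionary. This is a sketch, not a tested design: the class and field names are hypothetical, and the list fields are flattened to strings on the assumption that the vector store expects scalar metadata values.

```python
from dataclasses import dataclass, field


@dataclass
class CuratedSolution:
    code: str
    problem: str   # specific problem it solves
    context: str   # context of the problem and applicability
    usage_examples: list = field(default_factory=list)
    limitations: list = field(default_factory=list)

    def to_embedding_text(self):
        """Put the description before the code so both are embedded together."""
        return f"Problem: {self.problem}\nContext: {self.context}\n\n{self.code}"

    def to_metadata(self):
        """Flatten every field to a string so the values stay scalar."""
        return {
            "problem": self.problem,
            "context": self.context,
            "usage_examples": "\n".join(self.usage_examples),
            "limitations": "; ".join(self.limitations),
        }


if __name__ == "__main__":
    # Illustrative entry; the snippet is just a string and is never executed.
    solution = CuratedSolution(
        code="class StandardPagination(PageNumberPagination):\n    page_size = 50",
        problem="Consistent page size across API list endpoints",
        context="rest_framework projects with paginated list views",
        usage_examples=["pagination_class = StandardPagination"],
        limitations=["Fixed page size"],
    )
    print(solution.to_embedding_text())
    print(solution.to_metadata())
```

The two outputs could then be wrapped in a `Document(page_content=..., metadata=...)` and added to the Chroma collection created at the top of this notebook. Whether embedding the description together with the code improves retrieval is something that would still need to be tested.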

Although it has been mentioned that the LLM can perform this analysis, it remains relevant to have a human perspective to verify the depth of the problem or nuances in the solution that the LLM may not have captured.

I asked ChatGPT about this approach to get another opinion. Its response follows:

> The proposed approach presents a promising strategy to maximize the use of embeddings, especially in contexts where solutions are frequently reused. The combination of persistence, human curation, and rigorous testing can lead to significant results. However, it is vital to implement a clear and structured process to ensure that the stored solutions are truly useful and that the LLM can access them effectively.
>
> Suggestions for Further Exploration
>
> * Case Studies: Conduct case studies on various projects to see how different approaches work in practice.
> * LLM Feedback: Consider implementing a mechanism for the LLM to provide feedback on the usefulness of the stored solutions, assisting in the ongoing curation of the database.
> * Rapid Iteration: Adopt a rapid iteration cycle in testing and implementations, allowing for adjustments as new information and results become available.

## Conversation history

...

## Prompt engineering

[Prompting guide](https://www.promptingguide.ai/)