### Problem
How can I teach AI to understand my Python project?

I want to be able to ask AI questions about the project:
* Where is function X defined and how is it used in the project?
* What's the architecture of the project?
* How do I add new components?
* etc.

Below is my experience, which I'd like to share with you. I hope you find it an interesting starting point for experimenting with LLMs.

### Toolset installation

##### Python
Download and install Python 3.9 or higher. Then install the requirements.

```console
pip install -r requirements.txt
```

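This should install FAISS for vector search, Langchain for LLM-based applications, and other dependencies.

If you don't want to clutter the system-wide Python installation, use a virtual environment. The activation command below is for Windows; on Linux or macOS, use `source python_ai/bin/activate` instead.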

```console
python -m venv python_ai
python_ai\Scripts\activate
pip install -r requirements.txt
```

##### Ollama
Ollama is a framework for running LLMs locally.
[Download](https://github.com/ollama/ollama/releases) and install it as usual.

Once installed, pull the LLM that we are going to use in this tutorial.
```console
ollama pull llama3.2
```

Check that the model has been downloaded:
```console
ollama list
```

Finally, start the Ollama server if it's not running yet:
```console
ollama serve
```

Ollama is ready! It stays in the background and serves LLMs for us.

## Let's go!
Run the following code to make sure everything is working fine.

In [1]:
from langchain_ollama import ChatOllama

llm = ChatOllama(model="llama3.2", temperature=0.1) # the model we want to use; the higher the temperature, the more creative (and erratic) the output
response = llm.invoke("Hello, how are you?")
print(response.content)

I'm just a language model, so I don't have emotions or feelings like humans do. However, I'm functioning properly and ready to assist you with any questions or tasks you may have! How can I help you today?


You should get a response from the LLM saying that it's ready to help you. The cool thing here is that you can pull new models in Ollama and use them right away!

You can even use multiple models at the same time.

You can browse models on [Hugging Face](https://huggingface.co/models?pipeline_tag=text-generation&num_parameters=min:0,max:12B&sort=trending&search=gguf). It's like GitHub for LLMs. Ollama supports GGUF models.

Keep in mind that models can be very large, and the size directly affects both the memory required to load the model and the computation time.

**I don't recommend experimenting with models larger than 4 GB as it can be very slow and consume a lot of memory.**
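
To get a feel for these sizes, here is a rough back-of-the-envelope estimate. It's only a sketch: the helper name `estimate_model_size_gb` is made up for this example, and the formula counts the weights alone (parameters times bits per weight), ignoring runtime overhead and the fact that real quantized files mix several precisions.

```python
def estimate_model_size_gb(parameters_billion, bits_per_weight):
    """Approximate size of the weights alone, in gigabytes (10**9 bytes)."""
    bytes_total = parameters_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# (parameters in billions, bits per weight) - illustrative combinations only
for params, bits in [(3, 4), (3, 16), (8, 4)]:
    size = estimate_model_size_gb(params, bits)
    print(f"{params}B parameters at {bits}-bit: ~{size:.1f} GB")
```

By this estimate, the same model takes several times more memory at 16-bit precision than at 4-bit, which is why quantized GGUF files are the practical choice for a local machine.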

Ok, let's define some variables for the tutorial.

In [2]:
ProjectFolder = "./rigBuilder" # that's a path to your project, use your own path
TestFile = ProjectFolder+"/widgets.py" # some file for testing from the project
ProjectInfo = "Rig Builder is a tool for making UIs for python scripts." # a short project description that gives the LLM some initial context
Question = "How to add a custom template widget for my project?" # the question you want answered about the project


Here we start working with FAISS. FAISS is a library for storing and searching vectors.
An embedding model converts texts or documents to vectors, and FAISS stores them in a database that can be searched by similarity.

In [3]:
from langchain.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings
from langchain.schema import Document

with open(TestFile) as f:
    text = f.read()

testFileDocument = Document(page_content=text, metadata={"source": TestFile})

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2") # embeddings model converts text to vector
vector_db = FAISS.from_documents([testFileDocument], embeddings)

`Document` is a Langchain class that holds a piece of text (`page_content`) together with its metadata. Metadata is really helpful for filtering documents by type, file name, etc.

The vector store needs an embedding model to vectorize text. Here we use the `sentence-transformers/all-MiniLM-L6-v2` model. There are lots of such models, but once you've chosen one, you cannot change it for the same vector database, because vectors produced by different models are not comparable. For simplicity, think of this model as a function that converts text to a vector (an array of numbers).
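
To make the "text to vector" idea concrete, here is a toy sketch. It is not how `all-MiniLM-L6-v2` works: the hypothetical `toy_embed` simply counts words, whereas a real embedding model returns a dense vector of floats that captures meaning. The comparison step (cosine similarity) is the same idea, though.

```python
import math
from collections import Counter

def toy_embed(text):
    """Toy 'embedding': word counts. A real model returns a dense float vector."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as Counters."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

toy_query = toy_embed("add a custom template widget")
toy_close = toy_embed("class custom template widget")
toy_far = toy_embed("import os and sys")

# The text sharing more words with the query gets the higher score.
print(cosine_similarity(toy_query, toy_close) > cosine_similarity(toy_query, toy_far))
```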

Then you can search for similar texts in the database. 

In [4]:
for i, chunk in enumerate(vector_db.similarity_search(Question)):
    print(f"CHUNK: {i}")
    print(chunk.page_content)

CHUNK: 0
from PySide2.QtGui import *
from PySide2.QtCore import *
from PySide2.QtWidgets import *

import sys
import os
import json
import math
from .utils import *
from .editor import *
from .jsonWidget import JsonWidget

DCC = os.getenv("RIG_BUILDER_DCC") or "maya"

if sys.version_info.major > 2:
    RootPath = os.path.dirname(__file__) # Rig Builder root folder
else:
    RootPath = os.path.dirname(__file__.decode(sys.getfilesystemencoding())) # legacy

if DCC == "maya":
    import maya.cmds as cmds

class TemplateWidget(QFrame):
    somethingChanged = Signal()

    def __init__(self, *, executor=None, **kwargs):
        super().__init__(**kwargs)
        self.executor = executor # used to execute commands
...

The problem here is that we have a single file in the database, so the search will always return that same document.

Of course, we could add more files to the database and build a searcher that finds the file most relevant to the question. But it would still return whole files, which is redundant and inefficient, and it wouldn't point to the specific parts of a file that are relevant.

To use vector databases more efficiently, we need to split the text into smaller chunks before storing it. Each chunk is a portion of the file.
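
Before using a real splitter, it may help to see what "chunk size" and "chunk overlap" mean. The sketch below is a deliberately simplified fixed-window splitter; the function `split_with_overlap` is made up for illustration and does not reproduce Langchain's splitters, which also try to cut at text separators.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into fixed-size windows; neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]

sample_text = "abcdefghijklmnopqrstuvwxyz"
toy_chunks = split_with_overlap(sample_text, chunk_size=10, chunk_overlap=4)
for piece in toy_chunks:
    print(piece)
```

The overlap means that text cut at a chunk boundary still appears intact in one of the neighbouring chunks.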

In [5]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=100, chunk_overlap=20) # split by text separators and chunk size
chunks = text_splitter.split_documents([testFileDocument])

vector_db = FAISS.from_documents(chunks, embeddings)

for i, chunk in enumerate(vector_db.similarity_search(Question, k=5)): # k is the number of documents to return
    print(f"CHUNK: {i}")
    print(chunk.page_content)    

CHUNK: 0
widgets = value["widgets"]
        templates = value["templates"]
CHUNK: 1
for template, d in widgetsData:
                templates.append(template)
CHUNK: 2
layout.addWidget(self.buttonWidget)
CHUNK: 3
layout.addWidget(self.textWidget)
        layout.addWidget(okBtn)
CHUNK: 4
class TextTemplateWidget(TemplateWidget):
    def __init__(self, **kwargs):


Now we have more documents (pieces of the file) in the database, but as you can see, the text fragments are very small and often look random with respect to the question. Why?

The reason is that the vector database searches by semantic similarity. It doesn't understand the code at all, let alone the project structure.

If you search for `how to make a new component for my project?`, it may find chunks containing `def make_component` or `my_project.build()`. Your codebase probably doesn't contain phrases like `how to make ...`, so the search won't give any useful information about how to actually create a new component for the project. You'll just get the code chunks nearest to the question.

Still, let's pass these chunks to the LLM as context and see what it answers.

In [6]:
from langchain.prompts import ChatPromptTemplate
from langchain.chains import create_retrieval_chain

prompt_template = ChatPromptTemplate.from_messages([
    ("system", ProjectInfo+"\nUse the following context to answer the questions. Context:\n{context}"),
    ("user", "{input}"),
])

chain = prompt_template | llm # langchain chaining

retriever = vector_db.as_retriever(search_kwargs={"k": 5}) # number of chunks to retrieve from vector database
retrieval_chain = create_retrieval_chain(retriever, chain) # high-level chain that retrieves documents and passes them to `chain` as context

response = retrieval_chain.invoke({"input": Question}) # user question
print(response["answer"].content)

To add a custom template widget for your project, you can follow these steps:
1. Create a new class that inherits from `TemplateWidget`. This is the base class for all widgets in Rig Builder.
2. Define the properties and methods of your custom widget using Python's property decorator and method definitions.
3. In the `__init__` method, initialize any attributes or variables specific to your widget.
4. Use the `layout.addWidget(self.widget)` method to add your widget to the layout.

Here is an example:

```python
from rigBuilder.widgets import TemplateWidget

class CustomTemplateWidget(TemplateWidget):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.custom_attribute = kwargs.get('custom_attribute', 'default value')

    @property
    def custom_property(self):
        return f"Custom property: {self.custom_attribute}"

    def custom_method(self):
        print("This is a custom method")
```
To use this widget in your project, you would need to add it t

This answer doesn't reflect my project's structure, and the suggested code won't work. The code in the cell above is a bit complicated, so let's go through it step by step.

First, prompts. A prompt is one of the main components when working with LLMs. It's very important to understand how it works and how it affects the LLM. I've found that it's more art than science.
* Adding adjectives to the prompt can change the output.
* Rephrasing the prompt can change the output.
* Adding more context to the prompt can change the output.
* etc.

It's quite unusual to me that a program can behave differently just because the wording of the prompt changed.

In Langchain, a prompt is just a template with placeholders for variables (sometimes optional). The idea behind prompt templates is that you define them once and then reuse them many times with different variables.

A prompt has roles and messages. Briefly:

| Role | Message |
| -- | -- |
| system | You're an expert in AI, answer briefly without long descriptions. |
| user | What are the prompts in LLMs? |
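
The "template with placeholders" idea can be sketched in plain Python. This is only an illustration of the concept, not Langchain's implementation; `build_messages` and the placeholder names `topic` and `question` are made up for this example.

```python
def build_messages(template, **variables):
    """Fill the placeholders of every (role, text) pair in the template."""
    return [(role, text.format(**variables)) for role, text in template]

toy_template = [
    ("system", "You're an expert in {topic}, answer briefly."),
    ("user", "{question}"),
]

toy_messages = build_messages(toy_template, topic="AI", question="What are the prompts in LLMs?")
for role, text in toy_messages:
    print(f"{role}: {text}")
```

The same `toy_template` can be reused with any other topic and question, which is the whole point of a prompt template.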

Second, chaining. The following code creates a _chain_ of operations within the Langchain framework.
```python
chain = prompt_template | llm
```

Simply put, this means that you can pass data into `chain`, and it will pass the data through `prompt_template` and then through `llm`.

Each chain has an `invoke` method that actually executes the chain.
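
If the `|` syntax looks like magic, here is a minimal sketch of how such piping can be built with Python's `__or__` method. The `Step` class and the two fake steps are hypothetical stand-ins; Langchain's real runnables are much richer, but the composition idea is similar.

```python
class Step:
    """Wraps a function so that steps can be combined with the | operator."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # The output of this step becomes the input of the next one.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)

# Hypothetical stand-ins for a prompt template and a model.
fake_prompt_step = Step(lambda data: f"Question: {data['input']}")
fake_llm_step = Step(lambda prompt: f"(model received) {prompt}")

fake_chain = fake_prompt_step | fake_llm_step
print(fake_chain.invoke({"input": "Hello"}))
```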

An important detail here is that `create_retrieval_chain` expects the prompt to contain a `context` variable.

So, how does it work?

1. You ask a question (it goes to the user's `{input}`).
2. `create_retrieval_chain` uses the retriever to find the chunks nearest to the question in the vector database.
3. It passes those chunks to the prompt as the `context` variable.
4. Everything is combined, and `chain` processes the full prompt with the context.
5. `chain` calls the LLM and returns the answer.

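The result is a dictionary with the keys `input`, `context` and `answer`. As far as I can tell, these names are fixed by `create_retrieval_chain`; if you need different ones, you have to compose the chain yourself.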
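
The five steps above can be sketched without any libraries. Everything here is a stand-in made up for illustration: `fake_retriever` ranks chunks by shared words instead of vector similarity, and `fake_llm` only reports how long the prompt was.

```python
def fake_retriever(question, chunks, k=2):
    """Rank chunks by the number of words they share with the question."""
    words = set(question.lower().split())
    ranked = sorted(chunks, key=lambda c: len(words & set(c.lower().split())), reverse=True)
    return ranked[:k]

def fake_llm(prompt):
    """Stand-in for a model call; it only reports the prompt size."""
    return f"answer based on {len(prompt)} prompt characters"

def toy_retrieval_chain(question, chunks):
    context = fake_retriever(question, chunks)  # step 2: find the nearest chunks
    prompt = (                                  # steps 3-4: put them into the prompt
        "Use the following context to answer the questions. Context:\n"
        + "\n".join(context)
        + "\n" + question
    )
    # step 5: call the model and return everything as a dictionary
    return {"input": question, "context": context, "answer": fake_llm(prompt)}

sample_chunks = [
    "class TemplateWidget(QFrame):",
    "import os",
    "def add widget to layout",
]
toy_result = toy_retrieval_chain("how to add a template widget", sample_chunks)
print(toy_result["context"])
print(toy_result["answer"])
```

Note that the model never sees the whole file, only the few chunks the retriever picked.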

What do we have after the `chain` call? Did the result get better?

If you run the code, you may find that the result is still quite weird and a bit random. 

That's because the actual prompt sent to the LLM may look like this:

---
```
Use the following context to answer the questions. Context:
my_component = make_component()
    return make_component()
    """Make a new component for the project."""    
project.add_component()
...
how to make a new component for my project?
```
---
The LLM gets a mess of code chunks and isn't able to find the correct answer! It knows neither the structure of the project nor the whole code.

There are a number of solutions to this problem. First, split the file more precisely using a dedicated Python splitter.

In [7]:
from langchain_text_splitters import Language
from langchain_text_splitters import RecursiveCharacterTextSplitter

code_splitter = RecursiveCharacterTextSplitter.from_language(language=Language.PYTHON, chunk_size=400, chunk_overlap=100) # it will split by python syntax
chunks = code_splitter.split_documents([testFileDocument])

vector_db = FAISS.from_documents(chunks, embeddings)

for i, chunk in enumerate(chunks):
    print(f"CHUNK: {i}")
    print(chunk.page_content)

CHUNK: 0
from PySide2.QtGui import *
from PySide2.QtCore import *
from PySide2.QtWidgets import *

import sys
import os
import json
import math
from .utils import *
from .editor import *
from .jsonWidget import JsonWidget

DCC = os.getenv("RIG_BUILDER_DCC") or "maya"
CHUNK: 1
DCC = os.getenv("RIG_BUILDER_DCC") or "maya"

if sys.version_info.major > 2:
    RootPath = os.path.dirname(__file__) # Rig Builder root folder
else:
    RootPath = os.path.dirname(__file__.decode(sys.getfilesystemencoding())) # legacy

if DCC == "maya":
    import maya.cmds as cmds
CHUNK: 2
class TemplateWidget(QFrame):
    somethingChanged = Signal()

    def __init__(self, *, executor=None, **kwargs):
        super().__init__(**kwargs)
        self.executor = executor # used to execute commands

    def getDefaultData(self):
        return self.getJsonData()

    def getJsonData(self):
        raise Exception("getJsonData must be implemented")
CHUNK: 3
def getJsonData(self):
        raise Exception("getJsonData

This returns more accurate chunks based on Python syntax. But if you rerun the retrieval chain from cell 6 with this database, the result won't be much better. You can play with different `chunk_size` and `chunk_overlap` parameters or increase `k` in `as_retriever`, but you will get stuck after some time.
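
If you want chunks that always line up with whole functions and classes, one alternative worth trying is Python's standard `ast` module. This is a different, swapped-in technique rather than what the Langchain splitter does, and `split_by_definitions` is a name made up for this sketch. It only looks at top-level definitions and ignores everything else in the file.

```python
import ast

def split_by_definitions(source):
    """Return the source text of each top-level function and class."""
    tree = ast.parse(source)
    pieces = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            pieces.append(ast.get_source_segment(source, node))
    return pieces

sample_source = '''import os

class TemplateWidget:
    def getDefaultData(self):
        return self.getJsonData()

def helper():
    return os.getcwd()
'''

definitions = split_by_definitions(sample_source)
for piece in definitions:
    print(piece)
    print("---")
```

Large classes would still need further splitting (for example, per method) to fit a reasonable chunk size.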

This may lead to a question: "Does the LLM really understand what I want?", because the results are almost random and irrelevant.

We've come to one of the most important things about LLMs.

#### CONTEXT

The context is what the LLM sees while trying to answer the question. As it's not trained on your project, it doesn't know anything about it. If you pass just code chunks, it will not "automatically" create a high-level representation of your project.

**Context is limited**. If you have lots of files or a large database, you cannot just pass everything to the LLM and say: "Here is all the information you need. Solve my problems".

The context is typically limited to 8K tokens or more, depending on the model. If you have a small project, you can pass whole files directly to the LLM and ask questions, but sooner or later you will come to the point where you think the LLM doesn't understand your project at all.

The solution is to provide additional information in the context, but it has to be rather specific information.
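
The context limit can be made concrete with a rough budget check. This is only a sketch: the "about 4 characters per token" ratio is a common rule of thumb that I'm assuming here, real tokenizers give different numbers, and both function names are made up for the example.

```python
def estimate_tokens(text, chars_per_token=4):
    """Very rough token estimate; real tokenizers give different numbers."""
    return max(1, len(text) // chars_per_token)

def fit_to_budget(pieces, max_tokens):
    """Keep pieces in order until the estimated token budget is used up."""
    kept, used = [], 0
    for piece in pieces:
        cost = estimate_tokens(piece)
        if used + cost > max_tokens:
            break
        kept.append(piece)
        used += cost
    return kept, used

candidate_chunks = ["a" * 400, "b" * 400, "c" * 400]
kept_chunks, used_tokens = fit_to_budget(candidate_chunks, max_tokens=250)
print(len(kept_chunks), used_tokens)
```

Whatever doesn't fit into the budget is simply invisible to the model, so it matters a lot which pieces go in first.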

#### SUMMARIZATION

As the LLM has limited memory, it cannot keep all the details and facts in view.

Of course, you can write the documentation for the project yourself and comment every line of code, and the LLM will derive the project's logic from your comments and documentation.

But a nicer way is to generate different summaries of the project and pass them to the LLM as context before any code chunks.

In [None]:
from langchain_text_splitters import Language
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain.chains.combine_documents.stuff import create_stuff_documents_chain
from langchain.prompts import ChatPromptTemplate
from langchain.schema import Document
from langchain_core.runnables import RunnableLambda

def batch(lst, size):
    return [lst[i:i + size] for i in range(0, len(lst), size)]

def summarize_python_file(file_path):
    with open(file_path, "r") as f:
        content = f.read()

    # split into large chunks as we process them separately; if the results look strange, play with chunk_size
    code_splitter = RecursiveCharacterTextSplitter.from_language(language=Language.PYTHON, chunk_size=8000, chunk_overlap=50)
    chunks = code_splitter.split_text(content)
    print(f"FILE: {file_path}, CHUNKS COUNT: {len(chunks)}")

    chunk_template = ChatPromptTemplate.from_messages([
        ("system", "Analyze the chunk code in the context. It's a part of a larger file. Context: {context}"),
        ("user", "Generate a concise, structured architecture summary. List all classes and functions and their purpose, without long descriptions, examples and function arguments.")
    ])

    def inspect_state(state):
        #print("CONTEXT", state) # when you want to take a look at the final context before sending it to LLM
        return state

    basic_chain = RunnableLambda(inspect_state) | llm # call our function before going to llm
    chunk_chain = chunk_template | basic_chain

    # summarize each chunk separately
    summaries = []
    for chunk in chunks:
        response = chunk_chain.invoke({"context": chunk}) 
        summaries.append(response.content)

    # then summarize all the chunks' summaries
    summary_template = ChatPromptTemplate.from_messages([
        ("system", "In the context are the summaries of the chunks of the python code. Context: {context}"),
        ("user", "Combine the summaries from the context into cohesive reference to quickly understand the architecture of the code.")
    ])

    summary_chain = summary_template | RunnableLambda(inspect_state) | llm

    batch_summaries = []
    for summary_list in batch(summaries, 3): # process summaries in batches to avoid context size overflow
        summary = "\n\n".join(summary_list)
        response = summary_chain.invoke({"context": summary}) # fill the context in the prompt
        batch_summaries.append(response.content)

    full_summary = "\n\n".join(batch_summaries)
    response = summary_chain.invoke({"context": full_summary})
    return response.content

summary = summarize_python_file(TestFile)
print(summary)

FILE: ./rigBuilder/widgets.py, CHUNKS COUNT: 10
**Architecture Reference**

The provided Python code consists of several interconnected components that work together to provide a robust template-based interface.

### Core Classes

* `TemplateWidget`: A base class for all template widgets, providing a common interface for displaying and editing data.
* Specialized template widgets:
	+ `ButtonTemplateWidget`
	+ `CheckBoxTemplateWidget`
	+ `ComboBoxTemplateWidget`
	+ `CurveTemplateWidget`
	+ `JsonTemplateWidget`
	+ `LabelTemplateWidget`
	+ `LineEditTemplateWidget`
	+ `ListBoxTemplateWidget`
	+ `RadioButtonTemplateWidget`
	+ `TableTemplateWidget`
	+ `TextTemplateWidget`
	+ `VectorTemplateWidget`

### Data Access and Manipulation

* Functions for accessing and manipulating data in templates:
	+ `getDefaultData`
	+ `getJsonData`
	+ `setJsonData`
	+ `smartConversion`
	+ `fromSmartConversion`

### Event Handling and Context Management

* Functions for handling events related to buttons, text c

The downside of this approach is that it's slow, as we have to call the LLM many times. In general, I see it as a shamanic dance around the context size.

Also try playing with different prompts, and you will find that merely changing "summary" to "knowledge map" can significantly change the results. LLMs are whimsical and fickle.

There is a predefined function that applies the summary logic to chunks and then to the summaries of the chunks recursively: `langchain.chains.summarize.load_summarize_chain`. You can play with it and see if it works better. It also provides different summarization strategies, like "map_reduce" and "refine".
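
The "summaries of summaries" idea can be shown with a small sketch. It is not the code of `load_summarize_chain`: `fake_summarize` is a stand-in that just keeps the first few words instead of calling an LLM, and `map_reduce_summary` is a name made up here to show the map and reduce phases.

```python
def fake_summarize(text, max_words=5):
    """Stand-in for an LLM call: keeps only the first few words."""
    return " ".join(text.split()[:max_words])

def map_reduce_summary(pieces, batch_size=3):
    """Summarize each piece, then repeatedly summarize batches of summaries."""
    if batch_size < 2:
        raise ValueError("batch_size must be at least 2")
    if not pieces:
        return ""
    summaries = [fake_summarize(piece) for piece in pieces]  # map phase
    while len(summaries) > 1:                                # reduce phase
        batches = [summaries[i:i + batch_size] for i in range(0, len(summaries), batch_size)]
        summaries = [fake_summarize(" ".join(batch), max_words=8) for batch in batches]
    return summaries[0]

toy_pieces = [
    "class A does x",
    "class B does y",
    "function f helps A",
    "function g helps B",
]
print(map_reduce_summary(toy_pieces))
```

Each reduce round shrinks the list of summaries, so the final call always fits into a single prompt, at the cost of losing detail on the way.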

So, once we have a file summarization function, we can use it to summarize all the project's files and store the summaries in our vector database alongside the code chunks.

In [9]:
import os
from langchain.text_splitter import RecursiveCharacterTextSplitter, Language
from langchain.schema import Document

code_splitter = RecursiveCharacterTextSplitter.from_language(language=Language.PYTHON, chunk_size=400, chunk_overlap=50) # for code
text_splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=50) # for summaries

documents = []
for file in os.listdir(ProjectFolder):    
    if file.endswith(".py"):
        file_path = ProjectFolder + "/" + file

        with open(file_path) as f:
            content = f.read()

        code_doc = Document(page_content=content, metadata={"source": file, "type":"code"})
        documents += code_splitter.split_documents([code_doc])

        summary = summarize_python_file(file_path)

        summary_doc = Document(page_content=summary, metadata={"source": file, "type":"summary"})
        documents += text_splitter.split_documents([summary_doc])

vector_db = FAISS.from_documents(documents, embeddings)

vector_db.save_local("vector_db") # we can save it to disk
#vector_db = FAISS.load_local("vector_db", embeddings, allow_dangerous_deserialization=True) # and load

FILE: ./rigBuilder/core.py, CHUNKS COUNT: 5
FILE: ./rigBuilder/editor.py, CHUNKS COUNT: 9
FILE: ./rigBuilder/jsonWidget.py, CHUNKS COUNT: 4
FILE: ./rigBuilder/utils.py, CHUNKS COUNT: 2
FILE: ./rigBuilder/widgets.py, CHUNKS COUNT: 10
FILE: ./rigBuilder/__init__.py, CHUNKS COUNT: 15


Once the vector database is filled with the summaries and code chunks, you can use it to answer questions about the project. The answers should be much better (I hope).

In [None]:
from langchain.chains import create_retrieval_chain
from langchain.prompts import ChatPromptTemplate

prompt_template = ChatPromptTemplate.from_messages([
    ("system", ProjectInfo+"\nAnswer the user's questions about the project. Context: {context}"),
    ("human", "Question: {input}")
])
chain = prompt_template | llm # langchain chaining

retriever = vector_db.as_retriever(search_kwargs={"k": 10}) # number of chunks to retrieve
retrieval_chain = create_retrieval_chain(retriever, chain) # high-level chain that retrieves documents and passes them to `chain` as context

response = retrieval_chain.invoke({"input": Question}) # user query
print(response["answer"].content)

To add a custom template widget for your project, you can follow these steps:

1. **Create a new class**: In the `widgets.py` file, create a new class that inherits from `TemplateWidget`. This will be your custom template widget.

2. **Implement the necessary methods**: Your class should implement the following methods:
   - `__init__`: Initializes the widget with any necessary parameters.
   - `setJsonData`: Sets the JSON data for the widget.
   - `getDefaultData`: Returns the default data for the widget.
   - `template`: A property that returns the template name.

3. **Create a layout**: In the `__init__` method, create a layout (e.g., QVBoxLayout or QGridLayout) and set it as the main layout for your widget.

4. **Add widgets to the layout**: Add any necessary widgets to the layout, such as labels, buttons, or text editors.

5. **Connect signals and slots**: Connect any signals (e.g., button clicks) to slots (e.g., functions that update the data).

6. **Register the custom widget**:

That's much more relevant to my project, but truth be told, it continues to lie!

It's good to add other kinds of summaries, like a brief description of each file or the relationships between classes, which can help the LLM understand the project better. A basic project description written manually can also be very helpful. I've found that such summaries are the key to helping LLMs understand complex structures, given their restricted memory.

Langchain and FAISS provide lots of tools and high-level functions for manipulating vector databases and chains, like removing documents or filtering them by metadata.

In [None]:
vector_db.similarity_search("What is the main purpose of this project?", filter={"type":"summary"}) # search for summaries
retriever = vector_db.as_retriever(search_kwargs={"k": 7, "filter": {"type":"summary"}}) # create retriever with filter
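
Conceptually, a metadata filter just keeps the documents whose metadata matches the requested values. The sketch below shows that idea with plain dictionaries; the `matches` helper and the sample documents are made up, and this says nothing about how FAISS or Langchain implement filtering internally.

```python
toy_documents = [
    {"page_content": "class TemplateWidget(QFrame): ...", "metadata": {"source": "widgets.py", "type": "code"}},
    {"page_content": "widgets.py defines the template widgets.", "metadata": {"source": "widgets.py", "type": "summary"}},
    {"page_content": "utils.py contains helper functions.", "metadata": {"source": "utils.py", "type": "summary"}},
]

def matches(metadata, wanted):
    """True if every requested key has the requested value."""
    return all(metadata.get(key) == value for key, value in wanted.items())

summaries_only = [doc for doc in toy_documents if matches(doc["metadata"], {"type": "summary"})]
for doc in summaries_only:
    print(doc["metadata"]["source"], "->", doc["page_content"])
```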

### Summary

An LLM still doesn't understand the project the way a human does. Here are my conclusions:

* Prompts are very important and affect the overall quality of the response, more than it may seem.
* An LLM's context is limited, yet you have to give the LLM as much relevant information as possible for it to understand the problem.
* Summarize everything and store the summaries in the vector database alongside the code chunks.
* Lying is in its nature!

I hope you found this tutorial helpful and intriguing for your further research. Thanks for reading!