### Problem
How do I teach an AI to understand my Python project?

I want to ask the AI questions about the project, like:
* Where is function X defined and how is it used in the project?
* What's the architecture of the project?
* How do I add new components?
* etc.

This is my experience, which I'd like to share with you. I hope it inspires you to start experimenting with LLMs.

### Toolset installation

##### Python
Download and install Python 3.9 or higher. Then install the requirements.

```bash
pip install -r Requirements.txt
```

This should install FAISS (the vector database), LangChain (a framework for LLM-based applications) and other dependencies.

##### Ollama
Ollama is a framework for running LLMs locally.
[Download](https://github.com/ollama/ollama/releases) and install it as usual.

Once installed, pull an LLM:
```bash
ollama pull llama3.2
```

Make sure the model has been downloaded:
```bash
ollama list
```
Ollama is ready!

## Let's go!
Run the following code to make sure everything is working fine.

In [None]:
from langchain_ollama import ChatOllama

llm = ChatOllama(model="llama3.2") # the model we want to use
response = llm.invoke("Hello, how are you?")
print(response.content)

You should get a response from the LLM saying that it's ready to help you. The cool thing here is that you can pull new models in Ollama and use them right away!

You can even use multiple models at the same time.

You can browse models on [Hugging Face](https://huggingface.co/models?pipeline_tag=text-generation&num_parameters=min:0,max:12B&sort=trending&search=gguf). It's like GitHub for LLMs. Ollama supports models in the GGUF format.

Keep in mind that models can be very large, and their size directly affects the memory required to load them.

**I don't recommend experimenting with models larger than 4 GB as it can be very slow and consume a lot of memory.**
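
To get a feeling for the numbers before pulling a model, here is a back-of-the-envelope sketch. It assumes that the size of the weights is roughly the number of parameters multiplied by the bits stored per weight; the function name and the example values are hypothetical, and real model files and runtime memory usage will differ (the context cache and other overhead are ignored).

```python
def estimate_model_size_gb(num_parameters: float, bits_per_weight: float) -> float:
    """Rough size of the model weights in gigabytes (1 GB = 10^9 bytes).

    Assumption: size = parameters * bits per weight / 8.
    Runtime overhead is not included.
    """
    total_bytes = num_parameters * bits_per_weight / 8
    return total_bytes / 1e9


# Hypothetical example: a 3-billion-parameter model at different precisions.
for bits in (16, 8, 4):
    size = estimate_model_size_gb(3e9, bits)
    print(f"{bits}-bit weights: ~{size:.1f} GB")
```

This is why quantized models (fewer bits per weight) are much easier to run on a regular machine.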

Ok, let's define some variables for the tutorial.

In [None]:
ProjectFolder = "./rigBuilder" # that's a path to your project, use your own path
TestFile = "./rigBuilder/widgets.py" # a file for testing from the project
ProjectInfo = "Rig Builder is a tool for making UIs for python scripts." # something about your project for LLM to understand


Here we start working with FAISS, a vector database for storing and searching vectors.
Texts or documents are converted to vectors by an embedding model and then stored in the database.

In [None]:
from langchain_community.vectorstores import FAISS # current location of the FAISS wrapper
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_core.documents import Document

with open(TestFile) as f:
    text = f.read()

doc = Document(page_content=text, metadata={"source": TestFile})

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vector_db = FAISS.from_documents([doc], embeddings)

`Document` is a LangChain class that represents a text document. Its `page_content` attribute stores the text of the document and its `metadata` attribute stores additional information about the document, such as the file name or type. It is the basic document type used throughout LangChain.

FAISS requires a model for vectorization (an embedding model). Here we use the `sentence-transformers/all-MiniLM-L6-v2` model. There are lots of them, but once you've chosen one, you cannot switch to another for the same vector database, because vectors produced by different models are not comparable. For simplicity, think of this model as a function that converts text to a vector (an array of numbers).
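
To make the "function that converts text to a vector" idea concrete, here is a toy sketch in plain Python. It is not how `all-MiniLM-L6-v2` or FAISS work internally: the hypothetical `toy_embed` just counts words from a tiny hand-made vocabulary, whereas a real embedding model captures meaning. The principle of comparing vectors is the same, though.

```python
import math
import re
from collections import Counter


def toy_embed(text: str, vocabulary: list[str]) -> list[float]:
    """Toy 'embedding': count how often each vocabulary word occurs in the text."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    return [float(words[word]) for word in vocabulary]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means the same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical vocabulary and texts, for illustration only.
vocabulary = ["widget", "button", "label", "json", "dialog"]
texts = [
    "class ButtonTemplateWidget: a widget with a button",
    "def load_json(path): read json from a file",
]

query_vector = toy_embed("how to make a button widget", vocabulary)
ranked = sorted(
    texts,
    key=lambda text: cosine_similarity(query_vector, toy_embed(text, vocabulary)),
    reverse=True,
)
print(ranked[0])  # the text whose vector is closest to the query vector
```

A vector database does the same kind of "nearest vector" lookup, only with much better vectors and much faster search.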

You can then search for similar texts in the database. 

In [None]:
for i, chunk in enumerate(vector_db.similarity_search("how to make custom widgets?")):
    print(i, chunk.page_content)

The problem here is that we have a single file in the database, so the search will always return the same document.

Of course, we can add more files to the database and build a searcher that finds the file most relevant to the question. But it will still return the whole document, which is redundant and inefficient, not to mention that it won't give us precise answers about the documents.

The main idea behind such databases is that we need to split textual data into smaller chunks before storing them. Each chunk is a portion of the file.
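
As a rough illustration of what chunk size and chunk overlap mean, here is a simplified fixed-window splitter in plain Python. The function `split_with_overlap` is hypothetical and only a sketch: LangChain's `RecursiveCharacterTextSplitter` additionally tries to break the text at separators such as newlines instead of cutting at arbitrary positions.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters.

    Neighbouring windows share chunk_overlap characters, so a sentence cut
    at a border still appears whole in at least one chunk (if short enough).
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
        start += step
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
for piece in split_with_overlap(sample, chunk_size=10, chunk_overlap=3):
    print(piece)
```

Each printed piece starts with the last three characters of the previous one; that shared part is the overlap.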

In [None]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=100, chunk_overlap=20) # split by text separators and chunk size
chunks = text_splitter.split_documents([doc])

vector_db = FAISS.from_documents(chunks, embeddings)

for i, chunk in enumerate(chunks):
    print(i, chunk.page_content)

Now we have many more documents (pieces of information) in the database. Let's try to search and see if it returns something relevant.

In [None]:
for i, chunk in enumerate(vector_db.similarity_search("how to make custom widgets?", k=5)): # k is the number of documents to return
    print(i, chunk.page_content)

Now we get more results, but the text fragments are very small and often look "random". Why is that?

The reason is that the vector database searches by semantic similarity. It doesn't understand the code at all.

If you search for `how to make a new component for my project?` it may find chunks with `def make_component` or `my_project.build()`. Yeah, it's not that stupid, but it's not that smart either. You probably don't have phrases like `how to make ...` in your codebase, so the search won't return any useful information about how to actually create a new component for the project. You'll just get the code chunks nearest to the question.

In [None]:
from langchain.prompts import ChatPromptTemplate
from langchain.chains import create_retrieval_chain

prompt_template = ChatPromptTemplate.from_messages([
    ("system", ProjectInfo+"\nUse the following context to answer the questions.\nContext: {context}"),
    ("user", "{input}"),
])

chain = prompt_template | llm # langchain chaining

retriever = vector_db.as_retriever(search_kwargs={"k": 5}) # number of chunks to retrieve
retrieval_chain = create_retrieval_chain(retriever, chain) # high-level chain: retrieve relevant chunks first, then run `chain` on them

response = retrieval_chain.invoke({"input": "how to make a new component for my project?"})
print(response["answer"].content)

The code is a bit complicated, so let's go through it step by step.

First, prompts. Remember, guys (I'm saying this to myself as well), prompts are the main component of working with LLMs. It's very important to understand how they work and how they affect the LLM's output.

In LangChain, a prompt is just a template with placeholders for variables (some of them optional). The idea behind prompt templates is that you define them once and then reuse them many times with different variables.

A prompt consists of messages, and each message has a role.

| Role | Message |
| -- | -- |
| system | You're an expert in AI, answer briefly without long descriptions. |
| user | What are the prompts for LLMs? |
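
The template idea can be sketched in plain Python. This is only an illustration and not LangChain's implementation; the helper `render_prompt` and the placeholders `{topic}` and `{question}` are made up for the example.

```python
def render_prompt(template, **variables):
    """Fill the placeholders of every message and return (role, text) pairs."""
    return [(role, message.format(**variables)) for role, message in template]


# Defined once...
template = [
    ("system", "You're an expert in {topic}, answer briefly without long descriptions."),
    ("user", "{question}"),
]

# ...and reused with different variables.
messages = render_prompt(template, topic="AI", question="What are the prompts for LLMs?")
for role, text in messages:
    print(f"{role}: {text}")
```

`ChatPromptTemplate.from_messages` follows the same pattern: a list of role and message pairs with `{placeholders}` that are filled in when the prompt is invoked.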

Second, chaining. The following code creates a _chain_ of operations within the LangChain framework.
```python
chain = prompt_template | llm
```

Simply put, this means that you can pass data into `chain`, and it will go through `prompt_template` first and then through `llm`.

Each chain has an `invoke` method that actually executes the chain.
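
If the `|` syntax looks like magic, here is a minimal sketch of how such piping can be built in plain Python with operator overloading. The `Step` class and the fake model are hypothetical and heavily simplified; LangChain's real runnables do much more.

```python
class Step:
    """A tiny stand-in for a chain element: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # `a | b` builds a new step that runs `a` first and feeds its output to `b`
        return Step(lambda value: other.invoke(self.invoke(value)))


# Hypothetical stand-ins for a prompt template and an LLM.
fill_prompt = Step(lambda variables: "Question: {input}".format(**variables))
fake_llm = Step(lambda prompt: f"(model reply to '{prompt}')")

toy_chain = fill_prompt | fake_llm
result = toy_chain.invoke({"input": "Hello"})
print(result)
```

The data flows from left to right, exactly as it reads.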

An important thing here is that `create_retrieval_chain` expects the prompt to contain a `context` variable.

So, how does it work?

1. You ask a question (it goes to the user message as `{input}`).
2. `create_retrieval_chain` uses the retriever to search the vector database for the chunks nearest to the question.
3. Then it passes those chunks to the prompt under the `context` variable.
4. Then everything is combined and `chain` processes the full prompt with the context.
5. `chain` calls the LLM and returns the answer.

The `input`, `context` and `answer` keys are set by `create_retrieval_chain` itself. If you build the inner chain with `create_stuff_documents_chain`, the name of the `context` variable can be changed there with the `document_variable_name` argument.
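
The steps above can be sketched in plain Python. This is a simplified illustration of the data flow and not LangChain's code: `toy_retrieval_chain`, `fake_retriever` and `fake_answer_chain` are hypothetical stand-ins, and only the `input`, `context` and `answer` keys mirror the real chain.

```python
def toy_retrieval_chain(question, retriever, answer_chain):
    """Retrieve chunks for the question, then let the answer chain use them."""
    chunks = retriever(question)                     # step 2: nearest chunks
    state = {"input": question, "context": chunks}   # steps 1 and 3: fill the variables
    state["answer"] = answer_chain(state)            # steps 4 and 5: prompt + LLM
    return state


def fake_retriever(question):
    # Stand-in for the vector database search.
    return ["def make_component(): ...", "project.add_component()"]


def fake_answer_chain(state):
    # Stand-in for `prompt_template | llm`.
    return f"(reply based on {len(state['context'])} chunks)"


result = toy_retrieval_chain(
    "how to make a new component for my project?", fake_retriever, fake_answer_chain
)
print(result["answer"])
```

Note that the result is a dictionary that keeps the question and the retrieved chunks next to the answer, which is handy for debugging.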

What do we have after the `chain` call? Did the result get better?

If you run the code, you may find that the result is still quite weird and a bit random.

That's because the actual prompt sent to the LLM may look like this:
```code
Use the following context to answer the questions.
Context: def make_component():
    """Make a new component for the project."""    
    return Component()
project.add_component()
...
how to make a new component for my project?
```

The LLM gets a mess of code chunks and cannot find the correct answer in it!

There are a number of solutions to this problem. First, split the file more precisely using a dedicated Python splitter.

In [None]:
from langchain_text_splitters import Language
from langchain_text_splitters import RecursiveCharacterTextSplitter

code_splitter = RecursiveCharacterTextSplitter.from_language(language=Language.PYTHON, chunk_size=400, chunk_overlap=100)
chunks = code_splitter.split_documents([doc])

vector_db = FAISS.from_documents(chunks, embeddings)

for i, chunk in enumerate(chunks):
    print(i, chunk.page_content)

This returns more accurate chunks based on Python syntax. But if you rerun the retrieval chain cell with the new database, the result won't be much better, unfortunately. You can play with different `chunk_size` and `chunk_overlap` parameters or increase `k` in `as_retriever`, but you will get stuck after some time.

This may lead to the question "Does the LLM really understand what I want?", because the results are almost random and irrelevant.

Here we come to one of the most important things I've learnt regarding LLMs.

### CONTEXT

Context is limited. If you have lots of files or a large database of information, you cannot just pass it all to the LLM and say: "Here is all the information you need. Solve my problems".

The context is limited to a fixed number of tokens, typically 8K or more, depending on the model. If you have a small project, you can pass whole files directly to the LLM and ask questions, but sooner or later you will reach the point where the LLM seems not to understand your project at all.
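
To check quickly whether some text has a chance of fitting into the context, a rough estimate is enough. The sketch below assumes about 4 characters per token, which is only a common rule of thumb for English text; code often tokenizes differently and the exact number depends on the model's tokenizer. The function names and default limits are hypothetical.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token count. Assumption: ~4 characters per token."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(texts, context_limit=8000, reserved_for_answer=1000):
    """Check whether all texts fit, leaving some room for the model's answer."""
    used = sum(estimate_tokens(text) for text in texts)
    return used <= context_limit - reserved_for_answer


small_file = "x" * 4_000    # about 1K tokens under the assumption above
large_file = "x" * 40_000   # about 10K tokens

print(fits_in_context([small_file]))
print(fits_in_context([large_file]))
```

For exact numbers, use the tokenizer of the model you actually run.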

And the solution here is to provide additional information about your project in the context!

### SUMMARIZATION

This is an even more important thing I've learnt. As an LLM has limited memory (context), it cannot remember all the details and facts.

Of course, you can write the documentation for the project yourself and comment every line of code, and the LLM will derive the project's logic from your comments and documentation.

But a nicer way I found is to generate different summaries of the project and pass them to the LLM as context before any code-related chunks.
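
Here is a minimal sketch of what "summaries before code chunks" can look like when the context is assembled by hand. The helper `build_context`, its labels and the character budget are assumptions made for this illustration, not a fixed recipe.

```python
def build_context(summary: str, code_chunks: list[str], max_chars: int = 6000) -> str:
    """Put the project summary first, then add code chunks while the budget allows."""
    parts = ["Project summary:\n" + summary]
    used = len(summary)
    for chunk in code_chunks:
        if used + len(chunk) > max_chars:
            break  # stop before the context gets too large
        parts.append("Code chunk:\n" + chunk)
        used += len(chunk)
    return "\n\n".join(parts)


# Hypothetical summary and chunks, for illustration only.
context = build_context(
    "Widgets for building UIs.",
    ["def make_component(): ...", "project.add_component()"],
)
print(context)
```

The summary gives the LLM the big picture, so the code chunks that follow are no longer just a pile of unrelated fragments.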

In [None]:
from langchain_text_splitters import Language
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain.chains.combine_documents.stuff import create_stuff_documents_chain
from langchain.prompts import ChatPromptTemplate
from langchain.schema import Document
from langchain_core.runnables import RunnableLambda

def summarize_python_file(filePath):
    with open(filePath) as f:
        code = f.read()
    
    doc = Document(page_content=code)

    # split into fairly large chunks, because the whole file may not fit into the LLM context, which can lead to strange answers!
    code_splitter = RecursiveCharacterTextSplitter.from_language(language=Language.PYTHON, chunk_size=7000, chunk_overlap=100)
    code_docs = code_splitter.split_documents([doc])

    prompt_code_template = ChatPromptTemplate.from_messages([
        ("system", "In the context are the chunks of the python code. Context: {context}"),
        ("user", "Make a clear and concise summary of the context with classes and functions and their relations.")
    ])

    def inspect_state(state):
        #print("CONTEXT", state) # in case we want to take a look at the final context
        return state

    chain = RunnableLambda(inspect_state) | llm # call our function before going to llm
    stuff_chain = create_stuff_documents_chain(chain, prompt_code_template) # this creates a chain that "eats" documents and returns a string

    # summarize each chunk separately
    summaries = []
    for chunk in code_docs:
        response = stuff_chain.invoke({"context": [chunk]})
        print(response)
        summaries.append(response)

    # then summarize all the chunks' summaries
    summary_template = ChatPromptTemplate.from_messages([
        ("system", "In the context are the summaries of the chunks of the python code. Context: {context}"),
        ("user", "Make a final summary of the context keeping the logic and relations between the classes and functions.")
    ])

    prompt = summary_template.invoke({"context": "\n\n".join(summaries)}) # join the summaries into one text block instead of passing a Python list
    response = llm.invoke(prompt)
    
    return response.content
    
summary = summarize_python_file(TestFile)
summary

Here is a summary of the provided code:

**Classes**

1. **TemplateWidget**: A base class for all template widgets, responsible for handling JSON data and emitting a signal when something changes.
2. **EditTextDialog**: A dialog box for editing text, with an optional "python" parameter to enable code editor mode.
3. **EditJsonDialog**: A dialog box for editing JSON data, similar to `EditTextDialog` but designed for JSON-specific functionality.
4. **LabelTemplateWidget**: A template widget that displays a label, allowing the user to edit its text using an `EditTextDialog`.
5. **ButtonTemplateWidget**: A template widget that contains a button, which can be edited or have its command modified using an `EditTextDialog` or other methods.
6. **CheckBoxTemplateWidget**: A template widget that contains a checkbox, emitting a signal when the state changes.

**Functions**

1. `getDefaultData`: Returns the default JSON data for each template widget.
2. `getJsonData` and `setJsonData`: Used to get