# AI for SWEs

## MVP
- Query docs you have in a `data/` folder (relative to where you run the script):

```
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
response = query_engine.query("what is o1")
print(response)
```

This is essentially a retrieval-augmented generation (RAG) system that abstracts over

- Embeddings,
- Vector stores, and
- Chunking.

If this doesn't mean much to you yet, that's cool and you've come to the right place! By the end of this workshop, you'll know about all of this stuff.
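To make those three terms a bit more concrete before we dig in, here's a toy sketch in plain Python. It is not what LlamaIndex does internally: the word-count "embedding", the fixed 8-word chunks, and the `ToyVectorStore` class are all stand-ins invented for illustration. Real systems use learned embedding models and proper vector databases, but the overall shape (chunk, embed, store, retrieve, then build a prompt) is similar.

```python
import math
import re
from collections import Counter


def chunk(text, size=8):
    """Chunking: split text into pieces of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Embedding (toy): word counts instead of a learned vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Vector store (toy): a list of (vector, text) pairs searched linearly."""

    def __init__(self):
        self.rows = []

    def add(self, text):
        self.rows.append((embed(text), text))

    def top_k(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [text for _, text in ranked[:k]]


doc = ("Gradio builds simple web front ends in Python. "
       "Ollama runs language models on your own machine.")
store = ToyVectorStore()
for piece in chunk(doc, size=8):
    store.add(piece)

# Retrieve the most relevant chunk, then hand it to an LLM as context.
question = "Which tool runs models on my machine?"
context = store.top_k(question, k=1)
prompt = f"Context: {context[0]}\nQuestion: {question}"
print(prompt)
```

The final step, sending `prompt` to an LLM to get the generative response, is left out here so the sketch runs without any API key.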

The most important aspect to recognize is that we're able to 
- feed an LLM our own documents,
- query them,
- and get a generative response.

It's pretty incredible that we can do all of this in 5 lines of code, but we need to make sure we don't get stuck in proof-of-concept (POC) purgatory. Let's keep building!

[Say something briefly about this image:]
![Alt text](img/0-simple-RAG.png)

## Add Front End

Run `2-app-front-end.py` and you'll see something like this:


![Alt Text](img/1-gradio-fe.png)

Now drop a PDF in and query it. To prepare for the rest of the workshop, we encourage you to turn your LinkedIn profile into a PDF (print --> PDF) and query your own professional profile!

To build this basic app, we've

- used Gradio for the front end,
- used PyMuPDF to help Python read the PDF, and
- refactored our code.
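As a rough idea of how these pieces might fit together, here's a minimal sketch. It is not the contents of `2-app-front-end.py`: the body of `extract_text_from_pdf` is a guess at what such a helper could look like, and `build_app` and `answer_fn` are hypothetical names. It assumes PyMuPDF (imported as `fitz`) and Gradio are installed, and it accepts either a path string or a temp-file object because different Gradio versions hand uploads to your function differently.

```python
def extract_text_from_pdf(pdf_file):
    """Return the text of every page in the PDF, joined by newlines."""
    # Imported inside the function so this file still loads without PyMuPDF.
    import fitz  # PyMuPDF

    # Gradio may pass a path string or a temp-file object with a .name attribute.
    path = pdf_file if isinstance(pdf_file, str) else pdf_file.name
    with fitz.open(path) as pdf:
        return "\n".join(page.get_text() for page in pdf)


def build_app(answer_fn):
    """Wrap answer_fn(pdf_file, question) -> str in a simple Gradio UI."""
    import gradio as gr

    return gr.Interface(
        fn=answer_fn,
        inputs=[gr.File(label="Upload a PDF"), gr.Textbox(label="Question")],
        outputs=gr.Textbox(label="Answer"),
    )
```

You would then call something like `build_app(my_answer_fn).launch()`, where `my_answer_fn` extracts the text, builds an index from it, and returns the query engine's answer as a string.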

[say more about all of these things]

[write what type of questions to ask to prep for the app we build later :) ]

## Working locally

You may want to use a local model [talk about the reasons]. There are many ways to run models locally, and Ollama is a popular one.
If you have Ollama installed and running, it's relatively straightforward to swap the OpenAI model used above for any model you can run locally with Ollama. The first step is to replace the default LLM that the LlamaIndex query engine uses (an OpenAI model) with your local model:



```
# Requires the llama-index-llms-ollama package
from llama_index.llms.ollama import Ollama

# Initialize the Ollama LLM with the desired model (e.g., Llama 2)
llm = Ollama(model="llama2", request_timeout=60.0)
# Set up the query engine with the Ollama LLM
query_engine = index.as_query_engine(llm=llm)
```


Now, you may think this is enough, but it isn't quite! The following won't work [insert error msg here?]:

```
def process_pdf(pdf_file):
    extracted_text = extract_text_from_pdf(pdf_file)  # Extract text from the uploaded PDF
    document = Document(text=extracted_text)  # Create a proper Document object
    index = VectorStoreIndex.from_documents([document])  # Create index from document
    return index
```

This is because the LLM and the embedding model are configured separately. Unless you tell it otherwise, LlamaIndex embeds your documents into the vector store with a default OpenAI embedding model, even after you've swapped the LLM for a local one.

Now that we're using a local model, we need to specify explicitly which embedding model to use, for example:

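```
from llama_index.core import Document, VectorStoreIndex
# Requires the llama-index-embeddings-huggingface package
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

def process_pdf(pdf_file):
    extracted_text = extract_text_from_pdf(pdf_file)  # Extract text from the uploaded PDF
    document = Document(text=extracted_text)  # Create a proper Document object
    # Specify a Hugging Face model for local embeddings
    embed_model = HuggingFaceEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2")
    # Build the index with the local embedding model instead of the default
    index = VectorStoreIndex.from_documents([document], embed_model=embed_model)
    return index
```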

## Chatting with your PDF

Much of the time, we'd like to ask a follow-up or clarifying question [give example].
The system we've just built does not store conversation history, so it has no memory of earlier questions and answers. The next step is to give it one.
There are lots of clever ways to give an LLM enough context to know the history of the conversation (and this is currently a *very* active area of research). The most naive way is to prepend the entire conversation to the query as follows:

```
# Add previous conversation to the query for context
conversation = "\n".join([f"User: {h[0]}\nAssistant: {h[1]}" for h in history])
conversation += f"\nUser: {query}\n"

# Query the index using the user's question with context
response = query_engine.query(conversation)

history.append((query, response.response))
```

We've done this in `4-app-convo.py`. Let's now see it in action!
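If you'd like to poke at the naive approach outside the app, here's a small self-contained sketch. The function name `build_contextual_query` and the `max_turns` cap are hypothetical additions for illustration and are not taken from `4-app-convo.py`; the placeholder answers stand in for what `query_engine.query(...)` would return.

```python
def build_contextual_query(history, query, max_turns=None):
    """Prepend earlier (user, assistant) turns to the new question.

    If max_turns is given, only the most recent max_turns exchanges are kept,
    which stops the prompt from growing without bound.
    """
    if max_turns is None:
        turns = history
    else:
        turns = history[max(0, len(history) - max_turns):]
    lines = [f"User: {user}\nAssistant: {assistant}" for user, assistant in turns]
    lines.append(f"User: {query}")
    return "\n".join(lines)


history = []
for question in ["Where did I work in 2020?", "And what was my title there?"]:
    prompt = build_contextual_query(history, question)
    # Placeholder for: answer = query_engine.query(prompt).response
    answer = f"<answer {len(history) + 1}>"
    history.append((question, answer))

print(prompt)
```

Note how the second prompt carries the whole first exchange with it. That's what lets the model resolve "there" in the follow-up, and it's also why this approach gets expensive as conversations get long.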

## Logging Conversations