![Exercise header image](./header.png)

# Exercise 4: Retrieval Augmented Generation

The goal of this exercise is to build a chatbot demo that lets you ask questions about the content of documents. The underlying method is called retrieval-augmented generation (RAG).
The tasks in this exercise are:
- install a local large language model using the LM Studio application
- set up a new environment with the required packages
- implement a simple chatbot using llama-index
- test the chatbot on a specific technical document

Sources for llama-index with a local LLM [1], LM Studio [2], and small LLMs [3]:
- [1] [llama-index](https://docs.llamaindex.ai/en/stable/getting_started/starter_example_local/?h=embedding+model)
- [2] [https://lmstudio.ai/](https://lmstudio.ai/)
- [3] [https://huggingface.co/models](https://huggingface.co/models)

# Considerations

- Read the tutorials carefully, especially [1]
- Install LM Studio
- Install the additional packages into the environment by uncommenting the pip install commands once
- Select a model that fits your laptop's memory
- This exercise is less about coding and more about integrating a local LLM

# Requirements

- R0: Install the required packages using the pip commands
- R1: Install LM Studio
- R2: Find a model that runs on your machine
- R3: Start the server for the model in LM Studio
- R4: Connect the server to the notebook
- R5: Run the code cells up to the first query
- R6: Improve your query using the techniques from the class slides
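
For R3 and R4, it can help to confirm that the LM Studio server is reachable before wiring it into llama-index. The following is a minimal sketch, assuming the server runs at the default address used later in this notebook and exposes the OpenAI-style `/models` endpoint. The helper names `parse_model_ids` and `list_server_models` are made up for this illustration.

```python
import json
import urllib.request

# Assumption: LM Studio's default server address, as used later in this notebook.
LLM_URL = "http://localhost:1234/v1"


def parse_model_ids(payload: str) -> list[str]:
    """Extract model identifiers from an OpenAI-style /models response body."""
    data = json.loads(payload).get("data", [])
    return [entry["id"] for entry in data if "id" in entry]


def list_server_models(base_url: str = LLM_URL, timeout: float = 5.0) -> list[str]:
    """Ask the local server which models it serves; return [] if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as reply:
            return parse_model_ids(reply.read().decode("utf-8"))
    except OSError:  # covers refused connections, timeouts, and URLError
        return []
```

Calling `print(list_server_models())` in a cell should list at least one model once the server is started. An empty list suggests the server is not running or listens on a different port.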


In [None]:
#!pip install ipywidgets

In [None]:
#!pip install llama-index

In [None]:
#!pip install llama-index-llms-openai-like

In [None]:
#!pip install llama-index-embeddings-huggingface

# Code Snippet

In [None]:
from llama_index.llms.openai_like import OpenAILike

In [None]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings

In [None]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

In [None]:
from llama_index.core.embeddings import resolve_embed_model

## Prepare LLM

In [None]:
llm_url = 'http://localhost:1234/v1'

In [None]:
# api_key is only a placeholder here; the client requires a value
Settings.llm = OpenAILike(model="mistral", temperature=0.0, api_key='bla', api_base=llm_url)

## Prepare Embedding Model

In [None]:
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en", cache_folder='./')

## Generate Vector Store Content

In [None]:
documents = SimpleDirectoryReader("documents").load_data()

In [None]:
index = VectorStoreIndex.from_documents(documents)

## Setup Query Engine

In [None]:
query_engine = index.as_query_engine(streaming=False, verbose=True, similarity_top_k=2)
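
The `similarity_top_k=2` argument asks the retriever to pass only the two most similar document chunks to the LLM. The idea can be sketched with a toy example in plain Python. This illustrates the principle and is not llama-index's internal code; the three-dimensional vectors and chunk names are invented, and real embeddings have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k=2):
    """Return the names of the k chunks most similar to the query."""
    ranked = sorted(
        chunk_vecs,
        key=lambda name: cosine_similarity(query_vec, chunk_vecs[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical embeddings for three document chunks.
chunks = {
    "density and melting point": [0.9, 0.1, 0.0],
    "safety instructions": [0.1, 0.9, 0.1],
    "colour and odour": [0.7, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]  # hypothetical embedding of the user question

print(top_k(query, chunks, k=2))
# ['density and melting point', 'colour and odour']
```

A larger `similarity_top_k` gives the LLM more context at the cost of a longer prompt, which matters for small local models with limited context windows.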

## First Query

In [None]:
response = query_engine.query("What are the physical properties of the product?")

In [None]:
print(response)
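
To judge whether an answer is grounded in the document, it can be useful to look at which chunks were retrieved. The sketch below assumes that the response object exposes a `source_nodes` list whose items carry a `score` and a `node` with `get_content()`, as in current llama-index versions. The helper name `summarize_sources` is made up for this illustration.

```python
def summarize_sources(response, max_chars=80):
    """Return one line per retrieved chunk: similarity score and a text preview."""
    lines = []
    for hit in getattr(response, "source_nodes", []):
        # Collapse line breaks so each chunk fits on one line.
        preview = hit.node.get_content().replace("\n", " ")[:max_chars]
        score = "n/a" if hit.score is None else f"{hit.score:.3f}"
        lines.append(f"score={score}  {preview}")
    return lines
```

Running `print("\n".join(summarize_sources(response)))` after the query shows the retrieved passages. If none of them mention the property you asked about, rephrasing the query is likely to help more than changing the model.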

# Improved Query

In [None]:
...