# RAG Chatbot to assist an English Teacher

In this tutorial, we'll build a chatbot to assist a teacher with grading student assignments and help clarify student questions regarding the syllabus and study materials using **retrieval augmented generation (RAG)**.

We'll build our RAG chatbot with these tools:

- LlamaIndex as the data framework for building the RAG application
- Groq as an LLM vendor
- Instructor to get structured output with a consistent schema from our LLM 
- Weave for tracking and evaluating LLM applications

In [None]:
!pip install -qU rich
!pip install -qU wandb weave
!pip install -qU llama-index
!pip install -qU llama-index-embeddings-huggingface llama-index-llms-groq
!pip install -qU wget

In [1]:
import os
import zipfile
from getpass import getpass

import rich
import weave
import wget
from llama_index.llms.groq import Groq
from llama_index.core import Settings, SimpleDirectoryReader
from llama_index.core.node_parser import SemanticSplitterNodeParser
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import VectorStoreIndex
     

  from .autonotebook import tqdm as notebook_tqdm


In [2]:
groq_api_key = getpass("Enter your Groq API key: ")
os.environ["GROQ_API_KEY"] = groq_api_key

In [3]:
project_name = "llamaindex-groq-rag" # @param {type:"string"}

weave.init(project_name=project_name)

Logged in as Weights & Biases user: geekyrakshit.
View Weave data at https://wandb.ai/geekyrakshit/llamaindex-groq-rag/weave


<weave.weave_client.WeaveClient at 0x722f35df97b0>

In [4]:
class MetaData(weave.Model):
    source_document_url: str = "https://huggingface.co/datasets/wandb/weave_cookbook_datasets/resolve/main/flamingos_ncert.zip"
    embedding_model: str = "BAAI/bge-small-en-v1.5"
    splitter_buffer_size: int = 1
    splitter_breakpoint_percentile_threshold: int = 95
    vector_index_persist_dir: str = "./vector_embedding_storage"
    similarity_top_k: int = 10


metadata = MetaData()

In [5]:
zip_file = wget.download(metadata.source_document_url)
with zipfile.ZipFile(zip_file, "r") as zip_ref:
    zip_ref.extractall("./")
os.remove(zip_file)

In [6]:
reader = SimpleDirectoryReader(input_dir="chapters")
documents = reader.load_data(num_workers=4, show_progress=True)



In [7]:
embed_model = HuggingFaceEmbedding(model_name=metadata.embedding_model)
llm = Groq(model="llama3-8b-8192", api_key=os.environ.get("GROQ_API_KEY"))

Settings.llm = llm
Settings.embed_model = embed_model

In [8]:
splitter = SemanticSplitterNodeParser(
    buffer_size=metadata.splitter_buffer_size,
    breakpoint_percentile_threshold=metadata.splitter_breakpoint_percentile_threshold,
    embed_model=embed_model,
)
nodes = splitter.get_nodes_from_documents(documents, show_progress=True)

Generating embeddings: 100%|██████████| 23/23 [00:01<00:00, 19.55it/s]
Generating embeddings: 100%|██████████| 2/2 [00:00<00:00, 25.59it/s]
Generating embeddings: 100%|██████████| 33/33 [00:01<00:00, 28.89it/s]
Generating embeddings: 100%|██████████| 43/43 [00:01<00:00, 32.26it/s]
Generating embeddings: 100%|██████████| 37/37 [00:01<00:00, 21.18it/s]
Generating embeddings: 100%|██████████| 35/35 [00:01<00:00, 27.07it/s]
Generating embeddings: 100%|██████████| 23/23 [00:01<00:00, 18.69it/s]
Generating embeddings: 100%|██████████| 14/14 [00:01<00:00, 12.63it/s]
Generating embeddings: 100%|██████████| 23/23 [00:01<00:00, 15.96it/s]
Generating embeddings: 100%|██████████| 20/20 [00:01<00:00, 16.35it/s]
Generating embeddings: 100%|██████████| 21/21 [00:01<00:00, 18.64it/s]
Generating embeddings: 100%|██████████| 21/21 [00:01<00:00, 15.65it/s]
Generating embeddings: 100%|██████████| 26/26 [00:01<00:00, 23.80it/s]
Generating embeddings: 100%|██████████| 26/26 [00:01<00:00, 18.37it/s]
Generati

In [9]:
vector_index = VectorStoreIndex.from_documents(
    documents, show_progress=True, node_parser=nodes
)
vector_index.storage_context.persist(persist_dir=metadata.vector_index_persist_dir)

Parsing nodes: 100%|██████████| 66/66 [00:00<00:00, 1678.37it/s]
Generating embeddings: 100%|██████████| 66/66 [00:26<00:00,  2.53it/s]


In [10]:
query_engine = vector_index.as_query_engine(
    llm=llm,
    similarity_top_k=metadata.similarity_top_k,
)

In [11]:
query = (
    """In the story 'The Last Lesson', what was the mood in the classroom when M. Hamel gave his last French lesson?"""
)
response = query_engine.query(query).response

rich.print(response)