In this module, we will learn what LLMs and RAG are and implement a simple RAG pipeline that answers questions about the FAQ documents from our Zoomcamp courses.
What we will do:
- Index Zoomcamp FAQ documents
- Create a Q&A system for answering questions about these documents
YouTube Class: 1.1 - Introduction to LLM and RAG
- LLM
- RAG
- RAG architecture
- Course outcome
YouTube Class: 1.2 - Configuring Your Environment
Create a Python 3.9 virtual environment in the repository root (only once):
sudo apt-get install python3.9-venv
python3.9 -m venv venv
Activate this environment:
source venv/bin/activate
Install libraries:
make install
YouTube Class: 1.3 - Retrieval and Search
- Parse FAQ documents
- parse_faq.py: function that reads a FAQ document from a Google Docs file and converts its questions and answers into a list of dicts.
- faq_database.json: output of parsing the FAQ documents
- Indexing the documents
- minsearch.py: source code of the minimal search engine
- Performing the search
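To make the indexing and search steps concrete, here is a minimal keyword-search sketch in plain Python. It is not the code from minsearch.py; it only illustrates the idea of precomputing token counts and ranking documents by term overlap. The sample records and the field names `course`, `question` and `text` are assumptions for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents, text_fields):
    """Precompute a token counter for every document."""
    index = []
    for doc in documents:
        tokens = []
        for field in text_fields:
            tokens.extend(tokenize(doc.get(field, "")))
        index.append(Counter(tokens))
    return index


def search(query, documents, index, num_results=5):
    """Rank documents by how often the query tokens occur in them."""
    query_tokens = set(tokenize(query))
    scored = []
    for doc, counts in zip(documents, index):
        score = sum(counts[token] for token in query_tokens)
        if score > 0:
            scored.append((score, doc))
    # Sort on the score only, highest first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:num_results]]


# Hypothetical records in the "list of dicts" shape described above.
documents = [
    {
        "course": "data-engineering-zoomcamp",
        "question": "Can I still join the course?",
        "text": "Yes, you can join after the start date.",
    },
    {
        "course": "data-engineering-zoomcamp",
        "question": "How do I install Docker?",
        "text": "Follow the official Docker installation guide.",
    },
]
index = build_index(documents, text_fields=["question", "text"])
results = search("join the course", documents, index)
```

A real search engine would also weight rare terms more heavily (for example with TF-IDF); the raw counts here are kept only for readability.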
YouTube Class: 1.4 - Generating Answers with OpenAI GPT
- Invoking OpenAI API
- Building the prompt
- Getting the answer
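The prompt-building step can be sketched as follows. The prompt wording and the `question` and `text` field names are assumptions, not the exact template used in the class; the point is only that the retrieved FAQ entries are pasted into the prompt as context. Sending the prompt to the model is left out because it depends on the client library you use.

```python
def build_prompt(question, search_results):
    """Combine the user question and the retrieved FAQ entries into one prompt."""
    context_entries = []
    for doc in search_results:
        context_entries.append(f"question: {doc['question']}\nanswer: {doc['text']}")
    context = "\n\n".join(context_entries)
    return (
        "You are a course teaching assistant. Answer the QUESTION using only "
        "the facts in the CONTEXT taken from the FAQ database.\n\n"
        f"QUESTION: {question}\n\n"
        f"CONTEXT:\n{context}"
    )


# Hypothetical search result, in the same shape as the parsed FAQ records.
search_results = [
    {"question": "Can I still join the course?", "text": "Yes, you can join after the start date."}
]
prompt = build_prompt("Is it too late to join?", search_results)
```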
Bonus: OpenAI API Alternatives
I used the Gemini API from Google because I no longer have free credits for the OpenAI API, and Google does not yet require a billing account to use the Gemini API.
Moreover, the Gemini 1.5 Flash model has a free plan, which is well suited to study projects.
Gemini 1.5 Flash: free of charge (in June 2024)
Rate Limits
- 15 RPM (requests per minute)
- 1 million TPM (tokens per minute)
- 1,500 RPD (requests per day)
Price (input)
- Free of charge
Price (output)
- Free of charge
Context caching
- Not applicable
Prompts/responses used to improve our products
- Yes
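As a quick worked example of what these limits mean in practice, the arithmetic below derives the minimum spacing between requests and how long a script could run at full speed before hitting the daily cap. The constant names are illustrative only.

```python
# Free-tier limits quoted above (June 2024).
REQUESTS_PER_MINUTE = 15
REQUESTS_PER_DAY = 1500

# Minimum pause between two requests to stay under the per-minute limit.
min_interval_seconds = 60 / REQUESTS_PER_MINUTE

# Minutes of continuous use at the per-minute limit before the daily cap is reached.
minutes_until_daily_cap = REQUESTS_PER_DAY / REQUESTS_PER_MINUTE

print(min_interval_seconds, minutes_until_daily_cap)
```

So a loop that calls the API should wait about 4 seconds between calls, and at that pace the daily quota lasts roughly 100 minutes.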
API keys must be kept secret and never exposed publicly, so here the key is read from an environment variable declared in a .env file that is ignored by git.
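A minimal sketch of reading the key from the environment is shown below. The variable name `GEMINI_API_KEY` is an assumption; use whatever name your .env file declares. The sketch also assumes the .env file has already been loaded into the environment, for example by your shell or by a helper such as python-dotenv.

```python
import os


def get_api_key(name="GEMINI_API_KEY"):
    """Return the API key stored in the environment, or fail with a clear error."""
    api_key = os.environ.get(name)
    if not api_key:
        raise RuntimeError(f"Set the {name} environment variable before calling the API.")
    return api_key
```

Failing early with an explicit message is easier to debug than an authentication error returned by the API later on.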
YouTube Class: 1.5 - The RAG Flow Cleaning and Modularizing Code
- Cleaning the code
- Making it modular
Code in Jupyter Notebook: Intro_RAG.ipynb
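One way to picture the modular flow is a single function that chains the three steps, with each step passed in as a function so it can be swapped independently. This is a sketch under assumptions, not the notebook's actual code: the names `rag`, `search_fn`, `build_prompt_fn` and `llm_fn` are hypothetical, and the stand-in components below only demonstrate the wiring.

```python
def rag(query, search_fn, build_prompt_fn, llm_fn):
    """Run the RAG flow: retrieve documents, build the prompt, ask the model."""
    search_results = search_fn(query)
    prompt = build_prompt_fn(query, search_results)
    return llm_fn(prompt)


# Stand-in components, used only to show how the pieces connect.
def fake_search(query):
    return [{"question": "Can I still join the course?", "text": "Yes, you can."}]


def fake_build_prompt(query, search_results):
    context = "\n".join(doc["text"] for doc in search_results)
    return f"QUESTION: {query}\nCONTEXT: {context}"


def fake_llm(prompt):
    return f"(answer based on {len(prompt)} prompt characters)"


answer = rag("Can I join late?", fake_search, fake_build_prompt, fake_llm)
```

With this shape, replacing the search backend or the LLM provider means passing a different function, without touching the rest of the flow.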
YouTube Class: 1.6 - Search with Elastic Search
- Run Elasticsearch with Docker
- Index the documents
- Replace MinSearch with Elasticsearch
Running Elasticsearch locally:
docker run -it \
--rm \
--name elasticsearch \
-p 9200:9200 \
-p 9300:9300 \
-e "discovery.type=single-node" \
-e "xpack.security.enabled=false" \
docker.elastic.co/elasticsearch/elasticsearch:8.4.3
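Once the container is running, the search step can be expressed as an Elasticsearch query. The sketch below only builds the query body as a Python dict; the field names (`question`, `text`, `section`, `course`), the boost on `question` and the helper name `build_search_query` are assumptions for illustration. The `term` filter also assumes that `course` is mapped as a keyword field.

```python
def build_search_query(query, course, size=5):
    """Build a query body: full-text match on the FAQ fields, filtered by course."""
    return {
        "size": size,
        "query": {
            "bool": {
                # Full-text match across several fields; "^3" boosts the question field.
                "must": {
                    "multi_match": {
                        "query": query,
                        "fields": ["question^3", "text", "section"],
                        "type": "best_fields",
                    }
                },
                # Exact-match filter that restricts results to one course.
                "filter": {"term": {"course": course}},
            }
        },
    }


search_query = build_search_query("Can I still join the course?", "data-engineering-zoomcamp")
```

The resulting dict can be sent to the search endpoint of the index that holds the FAQ documents, for example through the official Python client.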