Module 1: Introduction

In this module, we will learn what LLMs and RAG are and implement a simple RAG pipeline to answer questions about the FAQ documents from our Zoomcamp courses.

What we will do:

1.1 Introduction to LLM and RAG

YouTube Class: 1.1 - Introduction to LLM and RAG

  • LLM
  • RAG
  • RAG architecture
  • Course outcome

1.2 Preparing the Environment

YouTube Class: 1.2 - Configuring Your Environment

Create a Python 3.9 virtual environment in the repository root (only once):

sudo apt-get install python3.9-venv
python3.9 -m venv venv

Activate this environment:

source venv/bin/activate

Install libraries:

make install

1.3 Retrieval and Search

YouTube Class: 1.3 - Retrieval and Search

  • Parse FAQ documents
    • parse_faq.py: function that reads a FAQ document from a Google Docs file and converts its questions and answers into a list of dictionaries.
    • faq_database.json: output of parsing the FAQ documents
  • Indexing the documents
  • Performing the search
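The indexing and search steps above can be sketched with the standard library alone. This is only an illustration of the idea, not the search library used in the class: the sample records, the field names (`course`, `question`, `text`) and the helpers `tokenize`, `build_index` and `search` are all assumptions made for the example, and the scoring is a plain boosted token count rather than a real ranking function.

```python
import re
from collections import Counter

# Hypothetical sample records; the real ones come from faq_database.json.
documents = [
    {"course": "course-a", "question": "Can I still join the course?",
     "text": "Yes, you can join after the start date."},
    {"course": "course-a", "question": "How do I install Docker?",
     "text": "Follow the official Docker install guide."},
    {"course": "course-b", "question": "Can I join late?",
     "text": "Yes, but deadlines still apply."},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs, text_fields):
    """Indexing: precompute one token counter per document and text field."""
    return [
        {field: Counter(tokenize(doc.get(field, ""))) for field in text_fields}
        for doc in docs
    ]


def search(query, docs, index, boost=None, course=None, num_results=5):
    """Searching: score documents by boosted query-token counts."""
    boost = boost or {}
    query_tokens = tokenize(query)
    scored = []
    for doc, counters in zip(docs, index):
        # Optional exact-match filter on the course field.
        if course is not None and doc.get("course") != course:
            continue
        score = sum(
            boost.get(field, 1.0) * counts[token]
            for field, counts in counters.items()
            for token in query_tokens
        )
        if score > 0:
            scored.append((score, doc))
    # Highest score first; sort on the score only so dicts are never compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:num_results]]


index = build_index(documents, ["question", "text"])
results = search("install docker", documents, index,
                 boost={"question": 3.0}, course="course-a")
```

Boosting the `question` field reflects the assumption that a match in the question is a stronger signal than a match in the answer text.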

1.4 Generation with OpenAI

YouTube Class: 1.4 - Generating Answers with OpenAI GPT

  • Invoking OpenAI API
  • Building the prompt
  • Getting the answer
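The prompt-building step can be sketched as a function that pastes the retrieved FAQ entries into a template. The template wording, the `build_prompt` name and the `question`/`text` record fields are assumptions for illustration, not the exact prompt from the class; the resulting string is what would be sent to the LLM API.

```python
# Hypothetical template: instructs the model to answer only from the context.
PROMPT_TEMPLATE = """You are a course teaching assistant.
Answer the QUESTION using only the facts in the CONTEXT from the FAQ database.
If the CONTEXT does not contain the answer, say that you don't know.

QUESTION: {question}

CONTEXT:
{context}"""


def build_prompt(question, search_results):
    """Combine the user question with the retrieved FAQ entries."""
    context_blocks = [
        f"question: {doc['question']}\nanswer: {doc['text']}"
        for doc in search_results
    ]
    context = "\n\n".join(context_blocks)
    return PROMPT_TEMPLATE.format(question=question, context=context).strip()


# Made-up search result, only to show the shape of the input.
example_results = [
    {"question": "Can I join late?", "text": "Yes, but deadlines still apply."},
]
prompt = build_prompt("Is it too late to join?", example_results)
print(prompt)
```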

Bonus: OpenAI API Alternatives

Personally, I used the Gemini API from Google because I no longer have free credits for the OpenAI API, and Google does not yet require a billing account to use the Gemini API.

Moreover, the Gemini 1.5 Flash model has a free plan, which is very convenient for study purposes.


Gemini 1.5 Flash: free of charge (as of June 2024)

  • Rate limits: 15 RPM (requests per minute), 1 million TPM (tokens per minute), 1,500 RPD (requests per day)
  • Price (input): free of charge
  • Price (output): free of charge
  • Context caching: not applicable
  • Prompts/responses used to improve our products: yes

API keys must be kept secret and never exposed publicly, so the key is read from an environment variable declared in a .env file that is ignored by git.
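A minimal sketch of reading the key from the environment is shown below. The variable name `GEMINI_API_KEY` and the helper `load_api_key` are assumptions for illustration. Note that Python does not read a .env file on its own: the variable has to be exported in the shell first, or loaded by a helper such as python-dotenv.

```python
import os


def load_api_key(env_var="GEMINI_API_KEY"):
    """Return the API key from the environment, failing loudly if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export it or add it to your .env file"
        )
    return key
```

Failing early with a clear message avoids sending requests with an empty key and keeps the key itself out of the notebook and out of version control.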

1.5 Cleaned RAG flow

YouTube Class: 1.5 - The RAG Flow Cleaning and Modularizing Code

  • Cleaning the code
  • Making it modular

Code in Jupyter Notebook: Intro_RAG.ipynb
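As a rough picture of what the modular flow can look like, the sketch below chains three steps behind one function. It is not the notebook's code: the `rag` signature and the stub functions are assumptions, and the stubs only stand in for the real search, prompt and LLM steps so the example runs on its own.

```python
def rag(query, search_fn, build_prompt_fn, llm_fn):
    """Run the three RAG steps: retrieve, build the prompt, generate."""
    search_results = search_fn(query)
    prompt = build_prompt_fn(query, search_results)
    return llm_fn(prompt)


# Stubs standing in for the real components (hypothetical, for illustration).
def fake_search(query):
    return [{"question": "Can I join late?", "text": "Yes, you can."}]


def fake_build_prompt(query, search_results):
    context = " ".join(doc["text"] for doc in search_results)
    return f"QUESTION: {query}\nCONTEXT: {context}"


def fake_llm(prompt):
    # A real implementation would call the LLM API here.
    return f"(answer based on a prompt of {len(prompt)} characters)"


answer = rag("Can I still enroll?", fake_search, fake_build_prompt, fake_llm)
print(answer)
```

Passing the steps in as functions makes each one replaceable on its own, for example swapping the search backend or the LLM provider without touching the rest of the flow.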

1.6 Searching with Elasticsearch

YouTube Class: 1.6 - Search with Elastic Search

  • Run Elasticsearch with Docker
  • Index the documents
  • Replace MinSearch with Elasticsearch

Running Elasticsearch locally:

docker run -it \
    --rm \
    --name elasticsearch \
    -p 9200:9200 \
    -p 9300:9300 \
    -e "discovery.type=single-node" \
    -e "xpack.security.enabled=false" \
    docker.elastic.co/elasticsearch/elasticsearch:8.4.3
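Once the container is running, a search is a JSON body sent to the index's `_search` endpoint. The sketch below only builds that body as a Python dict, so it runs without a server. The field names (`question`, `text`, `section`, `course`), the boost on `question` and the helper name `build_search_query` are assumptions for illustration; the `course` filter further assumes that field is mapped as a keyword.

```python
import json


def build_search_query(query, course=None, num_results=5):
    """Build an Elasticsearch query body: full-text match plus optional filter."""
    bool_query = {
        "must": {
            "multi_match": {
                "query": query,
                # "^3" boosts matches in the question field (assumed field names).
                "fields": ["question^3", "text", "section"],
                "type": "best_fields",
            }
        }
    }
    if course is not None:
        # Exact-match filter; it does not affect the relevance score.
        bool_query["filter"] = {"term": {"course": course}}
    return {"size": num_results, "query": {"bool": bool_query}}


body = build_search_query("How do I run Docker?", course="course-a")
print(json.dumps(body, indent=2))
```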