**Reference Link**: https://github.com/DataTalksClub/llm-zoomcamp/tree/main/01-intro

In [17]:
import json


### Reading the LLM Zoomcamp FAQ file (JSON format)

In [18]:
with open('documents-llm.json', 'rt') as f_in:
    docs_raw = json.load(f_in)

Each document contains `text`, `section`, and `question` fields. We add the name of its parent `course` to every document so that results can be filtered by course later.

In [19]:
documents = []

for course_dict in docs_raw:
    for doc in course_dict['documents']:
        doc['course'] = course_dict['course']
        documents.append(doc)

In [20]:
documents[0]

{'text': 'Yes, but if you want to receive a certificate, you need to submit your project while we’re still accepting submissions.',
 'section': 'General course-related questions',
 'question': 'I just discovered the course. Can I still join?',
 'course': 'llm-zoomcamp'}
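
A quick sanity check after flattening is to count the documents per course. The sketch below uses a tiny made-up `docs_raw_sample` (hypothetical data, not the real FAQ file) so that it runs on its own; with the real data, `documents` could be passed to `Counter` in the same way. Unlike the cell above, it copies each document instead of mutating it, which is a design choice of this sketch rather than something the notebook requires.

```python
from collections import Counter

# Hypothetical sample with the same shape as docs_raw (not the real FAQ data)
docs_raw_sample = [
    {
        "course": "course-a",
        "documents": [
            {"text": "Answer 1", "section": "General", "question": "Question 1"},
            {"text": "Answer 2", "section": "General", "question": "Question 2"},
        ],
    },
    {
        "course": "course-b",
        "documents": [
            {"text": "Answer 3", "section": "Setup", "question": "Question 3"},
        ],
    },
]

# Same flattening idea as above, but each doc is copied rather than modified in place
flat_docs = [
    {**doc, "course": course_dict["course"]}
    for course_dict in docs_raw_sample
    for doc in course_dict["documents"]
]

# Count how many documents ended up under each course name
docs_per_course = Counter(doc["course"] for doc in flat_docs)
print(docs_per_course)
```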

Next, we implement RAG using Elasticsearch as the retrieval component.

### 🧠 RAG (Retrieval-Augmented Generation)
**RAG** is a technique used in natural language processing (NLP) to improve the quality of generated text by retrieving relevant documents before generating a response.

### 🔍 How RAG Works:
**Retrieval Phase:**
- A query is sent to a document store (such as a search engine or a vector database).
- The system retrieves relevant documents or passages based on keyword relevance or semantic similarity.

**Generation Phase:**
- A language model (like GPT) uses the retrieved documents as context.
- It generates a response that’s grounded in the retrieved information.
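
The two phases can be sketched without any external service. This is a toy illustration only: `retrieve` ranks documents by naive word overlap (a stand-in for a real search engine), `echo_llm` is a placeholder that returns its prompt instead of calling a model, and `toy_docs` is made-up data. All of these names are assumptions for the example, not part of the course code.

```python
def retrieve(query, docs, top_k=2):
    """Retrieval phase: rank docs by how many query words they contain (toy scoring)."""
    query_words = set(query.lower().split())

    def overlap(doc):
        doc_words = set((doc["question"] + " " + doc["text"]).lower().split())
        return len(query_words & doc_words)

    ranked = sorted(docs, key=overlap, reverse=True)
    # Drop documents that share no word with the query
    return [doc for doc in ranked[:top_k] if overlap(doc) > 0]


def generate(query, context_docs, llm_fn):
    """Generation phase: put the retrieved docs into the prompt and call the model."""
    context = "\n".join(doc["text"] for doc in context_docs)
    prompt = f"QUESTION: {query}\n\nCONTEXT:\n{context}"
    return llm_fn(prompt)


def echo_llm(prompt):
    """Placeholder 'LLM' that simply returns the prompt it was given."""
    return prompt


# Hypothetical mini knowledge base
toy_docs = [
    {"question": "Can I still join?", "text": "Yes, you can join at any time."},
    {"question": "How do I get a certificate?", "text": "Submit your project."},
]

toy_query = "how do I get a certificate"
retrieved = retrieve(toy_query, toy_docs, top_k=1)
print(generate(toy_query, retrieved, echo_llm))
```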



#### 🔎 Elasticsearch
**Elasticsearch** is a search engine built on Apache Lucene, designed for fast and scalable full-text search.

**⚙️ Key Features:**
- Indexing: Stores data in a structured format for fast retrieval.
- Search: Supports keyword search, fuzzy search, and filtering.
- Analytics: Supports aggregations; visualizations are typically built on top of them with Kibana.

**🧠 How It Works:**
- Data is stored as JSON documents.
- You can query using a powerful DSL (Domain Specific Language).
- It’s optimized for text search, log analysis, and real-time data exploration.
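
Full-text engines built on Lucene rely on an inverted index: a mapping from each term to the documents that contain it. The sketch below is a heavily simplified illustration of that idea, assuming lowercase whitespace tokenization and no relevance scoring (real analyzers and ranking are far richer). The function names and sample texts are made up for the example.

```python
from collections import defaultdict


def build_inverted_index(texts):
    """Map each lowercase token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(texts):
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search_all_terms(index, query):
    """Return the ids of documents that contain every token of the query."""
    token_sets = [index.get(token, set()) for token in query.lower().split()]
    if not token_sets:
        return set()
    return set.intersection(*token_sets)


# Hypothetical example texts
texts = [
    "how to install docker",
    "how to run elasticsearch in docker",
    "course certificate requirements",
]
index = build_inverted_index(texts)
print(search_all_terms(index, "docker how"))            # documents 0 and 1
print(search_all_terms(index, "elasticsearch docker"))  # document 1 only
```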

To run Elasticsearch with Docker, run the following command in a terminal:

docker run -it \
    --rm \
    --name elasticsearch \
    -m 4GB \
    -p 9200:9200 \
    -p 9300:9300 \
    -e "discovery.type=single-node" \
    -e "xpack.security.enabled=false" \
    docker.elastic.co/elasticsearch/elasticsearch:8.4.3
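
The container can take a little while to start. One way to check that it is reachable before moving on is to request the root endpoint, which returns a JSON description of the node. The helper below is a sketch using only the standard library; `elasticsearch_is_up` is a hypothetical name, and the default URL assumes the port mapping and disabled security from the command above.

```python
import http.client
import json
import urllib.request


def elasticsearch_is_up(url="http://localhost:9200", timeout=2.0):
    """Return True if a GET on the root endpoint answers with a JSON object."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            info = json.load(response)
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, or a non-JSON body
        return False
    return isinstance(info, dict)


# Prints True once the container is accepting requests, False otherwise
print(elasticsearch_is_up())
```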

In [28]:
from elasticsearch import Elasticsearch


In [29]:
es_client = Elasticsearch('http://localhost:9200') 


In [1]:
index_settings = {
    "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0
    }, 
    "mappings": {
        "properties": {
            "text": {"type": "text"},
            "section": {"type": "text"},
            "question": {"type": "text"},
            "course": {"type": "keyword"} 
        }
    }
}

""" number_of_shards = 1

Shards split an index into smaller pieces, like splitting a folder into subfolders, so Elasticsearch can spread the data and search it in parallel.

Here, 1 means we’re keeping everything in a single shard (good for small datasets).

number_of_replicas = 0

Replicas are backup copies of your shards for fault tolerance.

Here, 0 means no backups: fine for testing, but risky for production.

The mappings part defines the structure of the data in the index, like setting column types in a database.

- text, section, question → type: "text"
- These fields will be analyzed for full-text search.
- Elasticsearch will tokenize and index them for efficient matching.

- course → type: "keyword"
- This field is not analyzed.
- Used for exact matches, filtering, and aggregations (e.g., grouping by course name).

"""

index_name = "course-questions"

es_client.indices.create(index=index_name, body=index_settings)

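NameError: name 'es_client' is not defined

Note: the execution count `In [1]` suggests this cell was run in a fresh kernel, before the cells that import `Elasticsearch` and create `es_client`. Run those cells first, then rerun this one.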
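
Creating an index that already exists makes Elasticsearch return an error, so rerunning the notebook from the top can fail on the create call. A small guard avoids that. `ensure_index` is a hypothetical helper written for this note; it relies on the client's `indices.exists` and `indices.create` calls and passes `body=` to match the cell above.

```python
def ensure_index(client, name, index_body):
    """Create the index only if it does not exist yet; return True if it was created."""
    if client.indices.exists(index=name):
        return False
    client.indices.create(index=name, body=index_body)
    return True


# Usage with the objects defined above (requires a running Elasticsearch):
# ensure_index(es_client, index_name, index_settings)
```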

In [31]:
documents[0]


{'text': 'Yes, but if you want to receive a certificate, you need to submit your project while we’re still accepting submissions.',
 'section': 'General course-related questions',
 'question': 'I just discovered the course. Can I still join?',
 'course': 'llm-zoomcamp'}

In [53]:
from tqdm.auto import tqdm


The following code inserts each document into the `course-questions` index in Elasticsearch.


In [54]:
for doc in tqdm(documents):
    es_client.index(index=index_name, document=doc)

  0%|          | 0/86 [00:00<?, ?it/s]
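
Indexing one document per request is fine for a small FAQ, but for larger collections the client's `helpers.bulk` function can send many documents in one request. The sketch below only builds the action dictionaries that `helpers.bulk` accepts, so it runs without a cluster; `to_bulk_actions` and `sample_docs` are made-up names for the example, and the actual bulk call is left as a comment.

```python
def to_bulk_actions(docs, index):
    """Yield one bulk action per document, targeting the given index."""
    for doc in docs:
        yield {"_index": index, "_source": doc}


# Hypothetical sample documents with the same fields as the FAQ entries
sample_docs = [
    {"text": "Answer 1", "section": "General", "question": "Question 1", "course": "course-a"},
    {"text": "Answer 2", "section": "General", "question": "Question 2", "course": "course-a"},
]
actions = list(to_bulk_actions(sample_docs, "course-questions"))
print(actions[0])

# With a running cluster, the actions could then be sent in a single call:
# from elasticsearch import helpers
# helpers.bulk(es_client, to_bulk_actions(documents, index_name))
```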

In [58]:
query="how to get access to saturn cloud"


In [None]:
def elastic_search(query):
    search_query = {
        "size": 5,  # Return at most 5 documents (results).
        "query": {
            "bool": {  # A bool query combines conditions: "must" match this AND pass that "filter".
                "must": {
                    "multi_match": {  # Search for the same query across multiple fields.
                        "query": query,
                        # "question^3" boosts matches in the question field by 3×;
                        # the text and section fields are searched as well.
                        "fields": ["question^3", "text", "section"],
                        "type": "best_fields"  # Score each document by its single best-matching field.
                    }
                },
                "filter": {  # Filters narrow down the results without affecting the score.
                    "term": {  # term = exact match on a field (no text analysis).
                        "course": "llm-zoomcamp"  # Keep only documents whose "course" field is exactly "llm-zoomcamp".
                    }
                }
            }
        }
    }

    response = es_client.search(index=index_name, body=search_query)

    result_docs = []

    for hit in response['hits']['hits']:  # Loop through the hits (results).
        result_docs.append(hit['_source'])  # Keep only the original document stored in _source.

    return result_docs
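
The function above hard-codes the course name and the result size. If the same index held several courses, the query body could be built by a small helper instead. This is a sketch under that assumption: `build_search_query` and its parameters are hypothetical, and the dictionary it returns has the same structure as `search_query` above.

```python
def build_search_query(query, course, size=5, question_boost=3):
    """Build the same query body as above, with course, size, and boost as parameters."""
    return {
        "size": size,
        "query": {
            "bool": {
                "must": {
                    "multi_match": {
                        "query": query,
                        "fields": [f"question^{question_boost}", "text", "section"],
                        "type": "best_fields",
                    }
                },
                # Exact-match filter on the keyword field
                "filter": {"term": {"course": course}},
            }
        },
    }


body = build_search_query("how to get access to saturn cloud", course="llm-zoomcamp")
print(body["query"]["bool"]["filter"])
```

The resulting `body` could be passed to `es_client.search(index=index_name, body=body)` in place of the inline dictionary.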

In [60]:
search_results = elastic_search(query)
search_results

[{'text': 'Please see the General section or use CTRL+F to search this doc.',
  'section': 'Module 2: Open-Source LLMs',
  'question': 'Saturn Cloud issues',
  'course': 'llm-zoomcamp'},
 {'text': 'Issue: I get the notice that due to traffic, I’m on a waitlist for new signups.\nAnswer: There was a form to submit our emails to, so Alexey can send it in bulk. If you missed that deadline, just sign up manually (or via request tech demo link) and use the chat to request for free hours for “llm zoomcamp”\nIssue: I’m a pre-existing user from a different zoomcamp and I’m not awarded the free hours even though I’ve submitted my email in the form.\nAnswer: Just request it via their chat, after you’ve logged in using your pre-existing account, citing “llm zoomcamp” .',
  'section': 'General course-related questions',
  'question': 'SaturnCloud - How do I get access?',
  'course': 'llm-zoomcamp'},
 {'text': 'Manually set the token as below:\naccess_token = <your_token>\nmodel  =

In [None]:
## Libraries Required
%pip install langchain-huggingface --quiet
## For API Calls
%pip install huggingface_hub --quiet
%pip install transformers --quiet
%pip install accelerate --quiet
%pip install  bitsandbytes --quiet
%pip install langchain --quiet


In [None]:
import os
# The `dotenv` module is published on PyPI as python-dotenv
%pip install python-dotenv --quiet
from dotenv import load_dotenv



In [None]:
load_dotenv()

True

In [None]:
# Read the token that load_dotenv() loaded from the .env file (raises KeyError if it is missing)
key = os.environ["HUGGINGFACEHUB_API_TOKEN"]

Calling an LLM through the Hugging Face `InferenceClient` (here routed to the `groq` provider)

In [None]:
import os
from huggingface_hub import InferenceClient


def llm(prompt):
    client = InferenceClient(
        provider="groq",
        api_key=key,
    )

    completion = client.chat.completions.create(
        model="openai/gpt-oss-120b",
        messages=[
            {
                "role": "user",
                "content": prompt
            }
        ],
    )

    return completion.choices[0].message.content

In [None]:
def build_prompt(query, search_results):
    prompt_template = """
You're a course teaching assistant. Answer the QUESTION based on the CONTEXT from the FAQ database.
Use only the facts from the CONTEXT when answering the QUESTION.

QUESTION: {question}

CONTEXT: 
{context}
""".strip()

    # Build the context from the Elasticsearch search results.
    # For each document we add the section name, the FAQ question, and the
    # answer (stored in `text`). Entries are separated by a blank line for readability.
    context = ""

    for doc in search_results:
        context = context + f"section: {doc['section']}\nquestion: {doc['question']}\nanswer: {doc['text']}\n\n"

    prompt = prompt_template.format(question=query, context=context).strip()
    return prompt

In [None]:
def rag(query):
    search_results = elastic_search(query)  # Retrieve the most relevant FAQ entries from Elasticsearch
    prompt = build_prompt(query, search_results)  # Combine the query and the retrieved entries into one prompt
    answer = llm(prompt)  # Generate the answer with the LLM called through Hugging Face
    return answer

In [61]:

query="how to get access to saturn cloud"
rag(query)

'**How to get access to Saturn\u202fCloud for the LLM Zoomcamp**\n\n1. **Submit your email**  \n   - There was a shared form where you could submit your email address. Alexey used that list to grant access in bulk.  \n   - **If you missed that deadline**, simply sign up on the Saturn\u202fCloud site (or use the “request tech demo” link) and then **use the Saturn\u202fCloud chat** to ask for free hours, mentioning **“llm zoomcamp.”**\n\n2. **If you already have a Saturn\u202fCloud account from a previous Zoomcamp**  \n   - Log in with your existing account.  \n   - Open the Saturn\u202fCloud chat and request the free hours, again citing **“llm zoomcamp.”**  \n\nThe key steps are:\u202fsubmit your email (or sign up manually), then request the free “LLM Zoomcamp” hours via the chat (or have Alexey add you if you’re on the original email list).'