# Module 1 of the LLM Zoomcamp from DataTalks.Club (DTC)

## LLM Zoomcamp 1.1 - Introduction to LLM and RAG

### LLM (Large Language Model)

- Language Model: A basic language (NLP) model predicts the next token/word based on the previous ones.
- LLM: A language model trained on huge amounts of data, built as a neural network with billions of parameters.
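
To make "predict the next word based on the previous ones" concrete, here is a minimal sketch of a bigram language model in plain Python. The toy corpus and the function names (`train_bigram`, `predict_next`) are assumptions made up for illustration. Real LLMs use neural networks over subword tokens rather than word counts.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        following[current_word][next_word] += 1
    return following


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if the word is unseen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Toy corpus (made up for illustration).
corpus = "the cat sat on the mat . the cat ate the fish ."
model = train_bigram(corpus)
print(predict_next(model, "the"))  # "cat" follows "the" most often in this corpus
```

The same idea scales up: an LLM also outputs a distribution over possible next tokens, only learned by a much larger model from far more text.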

A Large Language Model (LLM) is a type of artificial intelligence model that uses deep learning to understand and generate human language. It's trained on massive amounts of text data, allowing it to learn patterns and structures in language and perform various natural language processing (NLP) tasks. 


Here's a more detailed explanation:

-  Deep Learning:
LLMs are based on deep learning, a subfield of machine learning that uses artificial neural networks with multiple layers to analyze data and learn complex patterns. 

- Transformer Architecture:
Many LLMs are built upon the Transformer architecture, which allows them to process relationships between words in a sentence, even if they're far apart. 

- Training Data:
LLMs are trained on vast amounts of text, such as books, articles, and websites, to learn the nuances of language and its various forms. 

- Capabilities:
LLMs can perform a wide range of NLP tasks, including:
    - Text Generation: Creating different textual formats, like poems, code, scripts, musical pieces, emails, letters, etc.
    - Translation: Translating text between languages.
    - Question Answering: Answering questions based on provided information.
    - Summarization: Condensing large amounts of text into a shorter version.
    - Sentiment Analysis: Determining the emotional tone of a piece of text.
    - Code Generation: Writing code.

- Applications:
LLMs have a wide range of applications across various industries, including:
    - Customer Service: Providing automated customer support. 
    - Content Creation: Generating marketing copy, blog posts, and other content. 
    - Research: Analyzing large datasets of text to extract insights. 
    - Education: Helping students with writing and language learning.

In essence, LLMs are powerful tools that can understand, generate, and manipulate human language, making them valuable in many fields.


![What is LLM](LLM-zoomcamp-whatIsLLM.drawio.png)

### RAG (Retrieval Augmented Generation)

Retrieval-Augmented Generation (RAG) is a technique in natural language processing (NLP) that combines the strengths of retrieval and generative AI models. It works by first retrieving relevant information from a knowledge base and then using a large language model (LLM) to generate a response that incorporates the retrieved data. This allows for more accurate, up-to-date, and contextually relevant outputs.

Here's a more detailed breakdown:

*Retrieval*: RAG utilizes search algorithms to query external data sources like databases, knowledge bases, or even the web. 

*Integration*: The retrieved information is then added to the prompt that is sent to a pre-trained LLM.

*Generation*: The LLM uses the retrieved data to generate a response, which can be an answer to a question, a summary, or even new text.
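
The three steps above can be sketched end to end with simple stand-ins. Everything here is an assumption for illustration: the knowledge base is made up, retrieval is plain word overlap, and `generate` is a placeholder where a real LLM call would go.

```python
import re

# Made-up knowledge base for illustration.
KNOWLEDGE_BASE = [
    "You can join the course after the start date.",
    "Homework deadlines are listed on the course page.",
    "Certificates require a finished final project.",
]


def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query, documents, top_k=2):
    """Retrieval: rank documents by how many query words they share."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(doc)), doc) for doc in documents]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def integrate(query, retrieved):
    """Integration: put the retrieved text into the prompt."""
    context = "\n".join(retrieved)
    return f"QUESTION: {query}\n\nCONTEXT:\n{context}"


def generate(prompt):
    """Generation: placeholder for an LLM call; it only reports the prompt size."""
    return f"(an LLM would answer here, given a prompt of {len(prompt)} characters)"


question = "Can I still join the course?"
prompt = integrate(question, retrieve(question, KNOWLEDGE_BASE))
print(prompt)
print(generate(prompt))
```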

Benefits of RAG:
- Enhanced Accuracy and Relevance:
By accessing external knowledge, RAG can generate more precise and relevant responses. 

- Improved Contextual Understanding:
The retrieved information helps the LLM better understand the context of the user's query, leading to more fitting answers. 

- Real-time Updates:
RAG can incorporate up-to-date information from external sources, ensuring that the generated responses are current. 

- Source Attributions:
RAG can provide citations or references to the sources used to generate the response, improving trust and transparency. 

- Cost-Effective:
RAG can deliver some of the benefits of a custom LLM without the high cost of retraining or fine-tuning a new model. 


![What is RAG](LLM-zoomcamp-whatIsRAG.drawio.png)

## LLM Zoomcamp 1.2 - Configuring Your Environment

We will be using a GitHub Codespace from local VS Code, via Git.

1. Install the requirements

```bash
pip install tqdm notebook openai elasticsearch pandas scikit-learn ipywidgets
```

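2. Generate a key in OpenAI and export it in the terminal

```bash
export OPENAI_API_KEY="<your key>"
```

Note that the code cells in these notes read `GITHUB_TOKEN` and use the GitHub Models endpoint, so export that variable as well if you run them as written.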
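
A missing key usually surfaces later as a confusing error, so it can help to check for it up front. The helper below is a small sketch, and `require_env` is a hypothetical name, not part of the course code.

```python
import os


def require_env(name):
    """Return the value of an environment variable or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set. "
            f'Run: export {name}="<your key>"'
        )
    return value


# Usage (commented out so the block runs without a key being set):
# api_key = require_env("OPENAI_API_KEY")
```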

### 1.2: Test the OpenAI API

```python
import os
from openai import OpenAI

# The client talks to the GitHub Models endpoint, authenticated with a GitHub token
token = os.environ["GITHUB_TOKEN"]
endpoint = "https://models.github.ai/inference"
model = "openai/gpt-4.1"

client = OpenAI(
    base_url=endpoint,
    api_key=token,
)

response = client.chat.completions.create(
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant.",
        },
        {
            "role": "user",
            "content": "What is the capital of France?",
        }
    ],
    temperature=1.0,
    top_p=1.0,
    model=model
)

# The answer text is in the first choice
print(response.choices[0].message.content)
```



```python
# completion = client.chat.completions.create(
#   model=model,
#   messages=[
#     {
#       "role": "user",
#       "content": "What is the meaning of life?"
#     }
#   ]
# )

# print(completion.choices[0].message.content)
```

```python
completion = client.chat.completions.create(
  model=model,
  messages=[
    {
      "role": "user",
      "content": "Is it too late to join the course?"
    }
  ]
)

print(completion.choices[0].message.content)
```

## LLM Zoomcamp 1.3 Retrieval

### 1.3.1 Implement a Search Engine

To implement one from scratch, follow the original repo:

#### [Build Your Own Search Engine](https://github.com/alexeygrigorev/build-your-own-search-engine)

or go to the [internal folder](build_your_own_search_engine/README.md).

Instead of building a search engine, we can continue with a minimalist one built by Alexey Grigorev from DTC: [minsearch](https://github.com/alexeygrigorev/minsearch).

#### Intro to RAG

```python
import minsearch
```

```python
# load data

import requests

docs_url = 'https://github.com/alexeygrigorev/llm-rag-workshop/raw/main/notebooks/documents.json'
docs_response = requests.get(docs_url)
docs_response.raise_for_status()  # stop early if the download failed
documents_raw = docs_response.json()

documents = []

# Flatten the nested structure: one record per FAQ entry, tagged with its course
for course in documents_raw:
    course_name = course['course']

    for doc in course['documents']:
        doc['course'] = course_name
        documents.append(doc)
```
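
To see what the flattening step produces without downloading anything, here is a sketch on made-up records in the same nested shape the loop expects (a `course` name plus a `documents` list). The sample values are invented. Unlike the loop above, this variant builds new dictionaries instead of modifying the input records.

```python
# Made-up sample in the nested shape the loading loop expects.
sample_raw = [
    {
        "course": "course-a",
        "documents": [
            {"text": "Answer 1", "section": "General", "question": "Question 1"},
            {"text": "Answer 2", "section": "General", "question": "Question 2"},
        ],
    },
    {
        "course": "course-b",
        "documents": [
            {"text": "Answer 3", "section": "Setup", "question": "Question 3"},
        ],
    },
]

# One flat record per FAQ entry, each carrying its course name.
flat = [
    {**doc, "course": course["course"]}
    for course in sample_raw
    for doc in course["documents"]
]

print(len(flat))
print(flat[0])
```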

```python
documents[0]
```

```python
# Create the index; it is fitted on the documents in a later cell
# text_fields are searched, keyword_fields are used for exact-match filtering
index = minsearch.Index(
    text_fields=["question", "text", "section"],
    keyword_fields=["course"]
)
```

Filtering on a keyword field is an exact match, similar to this SQL-like condition:

```sql
SELECT * WHERE course = 'data-engineering-zoomcamp';
```

```python
q = 'the course already started, can I still enroll?'
```

```python
index.fit(documents)
```

```python
# Boosts set the relative importance of each text field:
# the higher the value, the more important the field is.
# Fields not listed keep the default value of 1.0.
boost = {
    'question': 3.0, 'section': 0.5
}
```
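
To illustrate how boost weights change the ranking, here is a simplified sketch that scores each field by word overlap and multiplies it by the field's boost. This is not minsearch's implementation (to my understanding it scores with TF-IDF vectors and cosine similarity). The function names and sample documents are made up.

```python
import re


def field_score(query, text):
    """Count how many distinct query words appear in the text."""
    query_words = set(re.findall(r"\w+", query.lower()))
    text_words = set(re.findall(r"\w+", text.lower()))
    return len(query_words & text_words)


def boosted_score(query, doc, boost, default_boost=1.0):
    """Sum the per-field scores, each multiplied by its boost weight."""
    total = 0.0
    for field, text in doc.items():
        total += boost.get(field, default_boost) * field_score(query, text)
    return total


# Made-up documents for illustration.
doc_a = {"question": "Can I still enroll?", "text": "Yes.", "section": "General"}
doc_b = {"question": "Where is the syllabus?", "text": "You can still enroll late.", "section": "General"}

demo_boost = {"question": 3.0, "section": 0.5}
demo_query = "can I still enroll"

print(boosted_score(demo_query, doc_a, demo_boost))  # match in the boosted question field
print(boosted_score(demo_query, doc_b, demo_boost))  # match only in the unboosted text field
```

A match in the `question` field counts three times as much as the same match in `text`, so `doc_a` ranks higher.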

```python
results = index.search(
    query=q,
    boost_dict=boost,
    num_results=5
)
```

```python
results
```

```python
# Same search, restricted to a single course
results = index.search(
    query=q,
    filter_dict={'course': 'data-engineering-zoomcamp'},
    boost_dict=boost,
    num_results=5
)
```

```python
results
```

## LLM Zoomcamp 1.4 - Generating Answers with LLM

```python
import os

# Read for reference only; the client below authenticates with GITHUB_TOKEN
api_key = os.getenv("OPENAI_API_KEY")
```

```python
import os
from openai import OpenAI

token = os.environ["GITHUB_TOKEN"]
endpoint = "https://models.github.ai/inference"
model = "openai/gpt-4.1"

client = OpenAI(
    base_url=endpoint,
    api_key=token,
)
```

```python
q
```

```python
# Ask the question directly, without any retrieved context
response = client.chat.completions.create(
    model=model,
    messages=[
        {"role": "user", "content": q}
    ]
)
```

```python
response.choices[0].message.content
```

```python
prompt_template = """
PROMPT:
You are a course teaching assistant.
Your task is to help students with their questions about the course material based on the context provided from the FAQ database.
Use only the context to answer the questions. If the context does not provide enough information, respond with "I don't know" or "I don't have enough information to answer that question."
Be polite and concise in your responses.

QUESTION: {question}

CONTEXT:
{context}
"""
```


```python
results
```

```python
# Turn the search results into one text block for the prompt
context = ""

for doc in results:
    context = context + f"section: {doc['section']}\nquestion: {doc['question']}\nanswer: {doc['text']}\n\n"
```

```python
print(context)
```

```python
prompt = prompt_template.format(question=q, context=context).strip()
print(prompt)
```

```python
# Ask again, this time with the retrieved context in the prompt
response = client.chat.completions.create(
    model=model,
    messages=[
        {"role": "user", "content": prompt}
    ]
)
```

```python
response.choices[0].message.content
```

## LLM Zoomcamp 1.5 - The RAG Flow: Cleaning and Modularizing Code

```python
def search(query):

    boost = {
        'question': 3.0,
        'section': 0.5
    }

    # Search the index with the query and the boosts.
    # The results are filtered to one course, here 'data-engineering-zoomcamp';
    # change or remove filter_dict to search other courses.
    results = index.search(
        query=query,
        filter_dict={'course': 'data-engineering-zoomcamp'},
        boost_dict=boost,
        num_results=5
    )
    return results
```

```python
def build_prompt(query, search_results):

    # The template is kept flush-left so the prompt lines carry no stray indentation
    prompt_template = """
PROMPT:
You are a course teaching assistant.
Your task is to help students with their questions about the course material based on the context provided from the FAQ database.
Use only the context to answer the questions. If the context does not provide enough information, respond with "I don't know" or "I don't have enough information to answer that question."
Be polite and concise in your responses.

QUESTION: {question}

CONTEXT:
{context}
""".strip()

    context = ""

    for doc in search_results:
        context = context + f"section: {doc['section']}\nquestion: {doc['question']}\nanswer: {doc['text']}\n\n"

    prompt = prompt_template.format(question=query, context=context).strip()

    return prompt
```

```python
import os
from openai import OpenAI

token = os.environ["GITHUB_TOKEN"]
endpoint = "https://models.github.ai/inference"
model = "openai/gpt-4.1"

client = OpenAI(
    base_url=endpoint,
    api_key=token,
)

def llm(prompt):
    # Send the prompt as a single user message and return the answer text
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "user", "content": prompt}
        ]
    )
    return response.choices[0].message.content
```

```python
query = "How do I run Kafka?"

def rag(query):
    """
    Run the RAG process: search, build prompt, and get answer from LLM.
    """
    # Search for relevant documents
    search_results = search(query)

    # Combine the question and the retrieved documents into one prompt
    prompt = build_prompt(query, search_results)

    # Ask the LLM to answer based on that prompt
    answer = llm(prompt)

    return answer
```

```python
rag("The course already started, can I still enroll?")
```

## LLM Zoomcamp 1.6 - Search with Elasticsearch

```python
documents[0]
```

Install the client version that matches the server, if it is not installed already:

```bash
pip install "elasticsearch==8.4.3"
```

Elasticsearch must be running before the client can reach it. Start a container with lower memory limits:

```bash
docker run -it \
  --rm \
  --name elasticsearch \
  -p 9200:9200 \
  -p 9300:9300 \
  -e "discovery.type=single-node" \
  -e "xpack.security.enabled=false" \
  -e "ES_JAVA_OPTS=-Xms512m -Xmx512m" \
  elasticsearch:8.4.3
```

```python
from elasticsearch import Elasticsearch

es_client = Elasticsearch('http://localhost:9200')
```

```python
# Check the connection: returns cluster and version information
es_client.info()
```

```python
index_settings = {
    "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0
    },
    "mappings": {
        "properties": {
            # "text" fields are analyzed for full-text search
            "text": {"type": "text"},
            "section": {"type": "text"},
            "question": {"type": "text"},
            # "keyword" fields are matched exactly, which suits filtering
            "course": {"type": "keyword"}
        }
    }
}

index_name = "course-questions"

es_client.indices.create(
    index=index_name,
    body=index_settings)
```
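
Running the `create` call a second time fails because the index already exists. A small sketch of one way around this while experimenting: check for the index, delete it, and create it again. `recreate_index` is a hypothetical helper name, and it passes `settings` and `mappings` as separate keyword arguments, which assumes an 8.x Python client.

```python
def recreate_index(client, name, index_body):
    """Drop the index if it exists, then create it from scratch.

    Warning: this deletes every document already stored in the index.
    """
    if client.indices.exists(index=name):
        client.indices.delete(index=name)
    return client.indices.create(
        index=name,
        settings=index_body["settings"],
        mappings=index_body["mappings"],
    )


# Usage with the objects defined above (needs a running Elasticsearch):
# recreate_index(es_client, index_name, index_settings)
```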

```python
documents[0]
```

```python
from tqdm.auto import tqdm
```

```python
# Index the documents one by one, with a progress bar
for doc in tqdm(documents):
    es_client.index(index=index_name, document=doc)
```
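
Sending one request per document is fine for a small FAQ. For larger collections, the Python client's `helpers.bulk` function can send many documents per request. The sketch below only builds the bulk actions so that it runs without a server; `to_bulk_actions` is a hypothetical name and the sample documents are made up.

```python
def to_bulk_actions(documents, index_name):
    """Yield one bulk 'index' action per document."""
    for doc in documents:
        yield {"_index": index_name, "_source": doc}


# Made-up documents to show the shape of the actions.
sample_docs = [
    {"text": "Answer 1", "section": "General", "question": "Question 1", "course": "course-a"},
    {"text": "Answer 2", "section": "General", "question": "Question 2", "course": "course-a"},
]
actions = list(to_bulk_actions(sample_docs, "course-questions"))
print(actions[0])

# With a running Elasticsearch and the client from above:
# from elasticsearch import helpers
# helpers.bulk(es_client, to_bulk_actions(documents, index_name))
```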

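```python
query
```

```python
query = "I just discovered the course, can I still enroll?"
```

```python
search_query = {
    "size": 5,
    "query": {
        "bool": {
            # "must" scores documents by text relevance
            "must": {
                "multi_match": {
                    "query": query,
                    # "question^3" boosts the question field by a factor of 3
                    "fields": ["question^3", "text", "section"],
                    "type": "best_fields"
                }
            },
            # "filter" restricts the results without affecting the score
            "filter": {
                "term": {
                    "course": "data-engineering-zoomcamp"
                }
            }
        }
    }
}
```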

```python
response = es_client.search(
    index=index_name,
    body=search_query
)
```

```python
response
```

```python
# Each hit stores the original document under "_source"
response['hits']['hits'][0]['_source']
```

```python
results_docs = []

for hit in response['hits']['hits']:
    results_docs.append(hit['_source'])
```

```python
results_docs
```

```python
# Cleaned-up version of the 1.6 search code

def elastic_search(query):
    search_query = {
        "size": 5,
        "query": {
            "bool": {
                "must": {
                    "multi_match": {
                        "query": query,
                        "fields": ["question^3", "text", "section"],
                        "type": "best_fields"
                    }
                },
                "filter": {
                    "term": {
                        "course": "data-engineering-zoomcamp"
                    }
                }
            }
        }
    }

    response = es_client.search(
        index=index_name,
        body=search_query
    )

    # Keep only the original documents from the hits
    results_docs = []

    for hit in response['hits']['hits']:
        results_docs.append(hit['_source'])

    return results_docs
```

```python
elastic_search(query)
```

```python
def rag(query):
    """
    Run the RAG process: search, build prompt, and get answer from LLM.
    """
    # Search for relevant documents, now with Elasticsearch
    search_results = elastic_search(query)

    prompt = build_prompt(query, search_results)

    answer = llm(prompt)

    return answer
```

```python
rag(query)
```
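
The search function above hard-codes the course and the number of results. A possible refinement, sketched under the assumption that other courses should be searchable too, is to build the request body in a helper with those values as parameters. `build_search_query` is a hypothetical name, and the body mirrors the query used in this section.

```python
def build_search_query(query, course, size=5):
    """Build the Elasticsearch request body with a configurable course and size."""
    return {
        "size": size,
        "query": {
            "bool": {
                "must": {
                    "multi_match": {
                        "query": query,
                        "fields": ["question^3", "text", "section"],
                        "type": "best_fields",
                    }
                },
                "filter": {"term": {"course": course}},
            }
        },
    }


demo_body = build_search_query("How do I run Kafka?", course="data-engineering-zoomcamp", size=3)
print(demo_body["query"]["bool"]["filter"])
```

The returned dictionary can be passed to `es_client.search` in place of the inline `search_query`.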