### About
### Build an LLM-powered RAG pipeline using Elasticsearch.

### Import necessary libraries and packages

In [1]:
%pip install openai -qq

Note: you may need to restart the kernel to use updated packages.


In [2]:
%pip install elasticsearch -qq 



In [3]:
%pip install --upgrade ipywidgets -qq




In [5]:
import pandas as pd
import ast
import json
import elasticsearch
from elasticsearch import Elasticsearch
from tqdm.notebook import tqdm, tqdm_notebook 

### Get data

In [6]:
# Keep the URL and the DataFrame in separate variables.
data_url = 'https://raw.githubusercontent.com/hariprasath-v/Nnet101_Assistant/refs/heads/main/data/Stackoverflow_data(neural_networks_stats)_pre_processed_Gemini_LLM.csv'
data = pd.read_csv(data_url)
data.head()

Unnamed: 0,question,q_link,tags,q_question_id,q_is_answered,q_accepted_answer_id,q_view_count,q_answer_count,q_score,q_last_activity_date,q_creation_date,a_score,a_creation_date,a_answer,answer
0,How to choose the number of hidden layers and ...,https://stats.stackexchange.com/questions/181/...,model-selection|neural-networks,181,True,1097,1145801,10,820,1661947755,1279584902,671,1280715630,"I realize this question has been answered, but...",**Network Configuration in Neural Networks**\n...
1,What should I do when my neural network doesn&...,https://stats.stackexchange.com/questions/3520...,neural-networks|faq,352036,True,352037,365434,9,368,1701358003,1529367960,455,1529367960,1. Verify that your code is bug free\nThere's...,**Key Considerations for Neural Network Develo...
2,"What exactly are keys, queries, and values in ...",https://stats.stackexchange.com/questions/4219...,neural-networks|natural-language|attention|mac...,421935,True,424127,261109,11,309,1708928023,1565686855,281,1567068576,The key/value/query formulation of attention i...,In the key/value/query formulation of attentio...
3,What is batch size in neural network?,https://stats.stackexchange.com/questions/1535...,neural-networks|python|terminology|keras,153531,True,153535,731148,6,305,1650529048,1432286121,421,1432288067,The batch size defines the number of samples t...,**Summary**\n\n**Batch Size**\n\nBatch size de...
4,What are the advantages of ReLU over sigmoid f...,https://stats.stackexchange.com/questions/1262...,machine-learning|neural-networks|sigmoid-curve...,126238,True,126362,290897,9,234,1723495231,1417486429,205,1417567286,Two additional major benefits of ReLUs are spa...,**Summary:**\n\nRectified Linear Units (ReLUs)...


In [7]:
data_dict = data[['question','tags','answer']].to_dict(orient='records')
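
The `tags` column holds pipe-delimited strings such as `neural-networks|python|terminology|keras`. If that field is indexed as an Elasticsearch `keyword`, the whole string is stored as a single term, so an individual tag will not match on its own. A minimal sketch of one way around this, assuming it is acceptable to store the tags as a list; `split_tags` is a hypothetical helper and not part of the original notebook:

```python
def split_tags(records, sep="|"):
    """Return copies of the records with 'tags' split into a list of strings."""
    converted = []
    for record in records:
        updated = dict(record)  # shallow copy so the input is left untouched
        raw = updated.get("tags")
        # Non-string values (None, or NaN read by pandas) become an empty list.
        if isinstance(raw, str):
            updated["tags"] = [tag for tag in raw.split(sep) if tag]
        else:
            updated["tags"] = []
        converted.append(updated)
    return converted


sample = [{"question": "What is batch size in neural network?",
           "tags": "neural-networks|python|terminology|keras",
           "answer": "..."}]
print(split_tags(sample)[0]["tags"])
```

In the notebook this could be applied as `data_dict = split_tags(data_dict)` before indexing.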

### Elasticsearch setup

In [8]:
es_client = Elasticsearch('http://localhost:9200/', request_timeout=60) 

In [9]:
!curl localhost:9200

{
  "name" : "048022699c89",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "dEW__8ZMTlS1ICK92s7vow",
  "version" : {
    "number" : "8.4.3",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "42f05b9372a9a4a470db3b52817899b99a76ee73",
    "build_date" : "2022-10-04T07:17:24.662462378Z",
    "build_snapshot" : false,
    "lucene_version" : "9.3.0",
    "minimum_wire_compatibility_version" : "7.17.0",
    "minimum_index_compatibility_version" : "7.0.0"
  },
  "tagline" : "You Know, for Search"
}


### Create the index and add documents to Elasticsearch

In [10]:
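index_settings = {
    # Single-node local setup: one primary shard, no replicas.
    "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0
    },
    "mappings": {
        "properties": {
            # "text" fields are analyzed for full-text search.
            "question": {"type": "text"},
            "answer": {"type": "text"},
            # "keyword" fields are stored as a single exact-match term.
            "tags": {"type": "keyword"}
        }
    }
}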

index_name = "nnet101"

es_client.indices.delete(index=index_name, ignore_unavailable=True)
es_client.indices.create(index=index_name, body=index_settings)

ObjectApiResponse({'acknowledged': True, 'shards_acknowledged': True, 'index': 'nnet101'})

In [11]:
for doc in tqdm_notebook(data_dict):
    es_client.index(index=index_name, document=doc)

  0%|          | 0/500 [00:00<?, ?it/s]
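
Indexing one document per request is fine for 500 rows but sends one HTTP call per document. A minimal sketch of the same step using the client's bulk helper, assuming the `es_client`, `data_dict`, and `index_name` objects defined above; `generate_actions` and `bulk_index` are hypothetical names introduced here for illustration:

```python
def generate_actions(docs, index_name):
    """Yield one bulk action per document for elasticsearch.helpers.bulk."""
    for doc in docs:
        yield {"_index": index_name, "_source": doc}


def bulk_index(es_client, docs, index_name):
    """Index all documents with the bulk helper; returns (success_count, errors)."""
    # Imported here so generate_actions can be inspected without the client installed.
    from elasticsearch import helpers
    return helpers.bulk(es_client, generate_actions(docs, index_name))


# Preview the actions for two toy documents (no cluster needed).
toy_docs = [{"question": "q1", "tags": "t1", "answer": "a1"},
            {"question": "q2", "tags": "t2", "answer": "a2"}]
actions = list(generate_actions(toy_docs, "nnet101"))
print(actions[0])
```

Against a running cluster, the call would be `bulk_index(es_client, data_dict, index_name)`.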

### Elasticsearch query parameters

In [12]:
def elastic_search(query):
    search_query = {
        "size": 5,
        "query": {
            "bool": {
                "must": {
                    "multi_match": {
                        "query": query,
                        "fields": ["question^3", "answer", "tags"],
                        "type": "best_fields"
                    }
                }
            }
        }
    }

    response = es_client.search(index=index_name, body=search_query)
    
    # Return only the stored documents, dropping scores and other hit metadata.
    return [hit['_source'] for hit in response['hits']['hits']]
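
In the `multi_match` query, `question^3` boosts matches in the `question` field by a factor of 3, and `best_fields` scores each document by its best-matching field. A small sketch that separates query construction from the search call so the result count and boost can be tuned; `build_search_query` and its parameters are hypothetical and not part of the original notebook:

```python
def build_search_query(query, size=5, question_boost=3):
    """Build the multi_match request body used for retrieval."""
    return {
        "size": size,  # number of hits to return
        "query": {
            "bool": {
                "must": {
                    "multi_match": {
                        "query": query,
                        # "field^N" multiplies that field's score contribution by N.
                        "fields": [f"question^{question_boost}", "answer", "tags"],
                        "type": "best_fields",
                    }
                }
            }
        },
    }


body = build_search_query("What is pooling layer?", size=3)
print(body["query"]["bool"]["must"]["multi_match"]["fields"])
```

The resulting dictionary can be passed to `es_client.search` in the same way as `search_query` above.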

### Sample Elasticsearch query

In [13]:
elastic_search(
    query="What is pooling layer?"
)

[{'question': 'What is global max pooling layer and what is its advantage over maxpooling layer?',
  'tags': 'neural-networks|convolutional-neural-network|pooling',
  'answer': '**Summary:**\n\nGlobal max pooling is a type of max pooling where the pool size is equal to the input size. Unlike regular max pooling, which produces a smaller output, global max pooling produces an output with the same dimensionality as the input.\n\nIn global max pooling, the maximum value across the entire input is extracted, providing a representation that focuses on the most prominent feature. This is useful in applications like natural language processing, where the most important words in a sentence are often indicative of its meaning.\n\nIn contrast, regular max pooling divides the input into smaller segments and extracts the maximum value from each segment, reducing the output size. This is more common in computer vision, where spatial information is important and reducing the size of the representati

### Ollama’s OpenAI-compatible API endpoint

In [14]:
from openai import OpenAI

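client = OpenAI(
    base_url='http://localhost:11434/v1',  # local Ollama server, OpenAI-compatible route
    api_key='ollama',  # required by the client library; Ollama ignores the value
)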


### Functions to create a prompt from the retrieved results and the user query

In [15]:
def build_prompt(query, search_results):
    prompt_template = """
You're a course teaching assistant. Answer the QUESTION based on the CONTEXT from the FAQ database.
Use only the facts from the CONTEXT when answering the QUESTION.

QUESTION: {question}

CONTEXT: 
{context}
""".strip()

    context = ""
    
    for doc in search_results:
        context = context + f"tags: {doc['tags']}\nquestion: {doc['question']}\nanswer: {doc['answer']}\n\n"
    
    prompt = prompt_template.format(question=query, context=context).strip()
    return prompt

def llm(prompt):
    response = client.chat.completions.create(
        model="gemma:2b",
        messages=[{"role": "user", "content": prompt}]
    )
    
    return response.choices[0].message.content
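
`build_prompt` concatenates every retrieved answer in full, so the prompt can grow long when the answers are long, which may matter for a small model. A minimal sketch of capping the context size, assuming a simple character budget is a good enough proxy for tokens; `build_context` and `max_chars` are hypothetical and not part of the original notebook:

```python
def build_context(search_results, max_chars=4000):
    """Concatenate retrieved documents until the character budget is used up."""
    parts = []
    used = 0
    for doc in search_results:
        entry = (f"tags: {doc['tags']}\n"
                 f"question: {doc['question']}\n"
                 f"answer: {doc['answer']}")
        remaining = max_chars - used
        if remaining <= 0:
            break  # budget spent; lower-ranked documents are dropped
        if len(entry) > remaining:
            entry = entry[:remaining]  # cut the last document to fit
        parts.append(entry)
        used += len(entry)
    # Separators between documents are not counted toward the budget.
    return "\n\n".join(parts)


docs = [{"tags": "t", "question": "q", "answer": "x" * 100},
        {"tags": "t", "question": "q", "answer": "y" * 100}]
print(len(build_context(docs, max_chars=150)))
```

Because Elasticsearch returns hits in descending score order, truncating from the end keeps the most relevant documents.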

### RAG
#### The function retrieves Elasticsearch results for the user's query, builds a prompt from those results, and passes it to the LLM to generate the final response.


In [16]:
def rag(query):
    search_results = elastic_search(query)
    prompt = build_prompt(query, search_results)
    answer = llm(prompt)
    return answer
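
The retrieve, prompt, generate flow described above can also return the retrieved questions next to the answer, which makes it easier to check what the response was grounded on. A minimal sketch with the three steps passed in as functions so it runs without Elasticsearch or Ollama; `rag_with_sources` and the stub functions are hypothetical and not part of the original notebook:

```python
def rag_with_sources(query, search_fn, prompt_fn, llm_fn):
    """Run retrieve -> prompt -> generate and return the answer with its sources."""
    search_results = search_fn(query)
    prompt = prompt_fn(query, search_results)
    return {
        "answer": llm_fn(prompt),
        "sources": [doc["question"] for doc in search_results],
    }


# Stub components so the flow can be checked without any running services.
def fake_search(query):
    return [{"question": "What is a pooling layer?", "tags": "pooling", "answer": "..."}]


def fake_prompt(query, results):
    return f"QUESTION: {query}\nCONTEXT: {results[0]['answer']}"


def fake_llm(prompt):
    return f"stub answer ({len(prompt)} chars of prompt)"


result = rag_with_sources("What is pooling layer?", fake_search, fake_prompt, fake_llm)
print(result["sources"])
```

With the notebook's own functions, the call would be `rag_with_sources(query, elastic_search, build_prompt, llm)`.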

In [17]:
%%time
# This calls the LLM directly, without retrieved context.
print(llm('what is pooling layer?'))

A pooling layer is a type of neural network layer that reduces the dimensionality of a feature map by performing a mathematical operation (such as averaging, max-pooling, or min-pooling) on the original feature map.

**Key features of pooling layers:**

* **Perform dimensionality reduction:** By taking a smaller subset of features from the original feature map, pooling layers reduce the model's complexity and improve performance.
* **Reduce computation time:** While reducing dimensionality, pooling layers can also perform computation more efficiently by reducing the amount of matrix multiplication required.
* **Adapt to different data types:** Pooling can be applied to both 1D and 2D feature maps, making it useful for tasks such as image classification, object detection, and semantic segmentation.
* **Improve feature distribution:** Pooling can help to spread out feature values, reducing their concentration in certain areas of the feature map.

**Types of pooling layers:**

* **Average