### About
### Build an LLM-powered RAG pipeline using Elasticsearch

### Import necessary libraries and packages

In [1]:
%pip install openai -qq

Note: you may need to restart the kernel to use updated packages.


In [12]:
%pip install elasticsearch -qq 



In [30]:
%pip install --upgrade ipywidgets -qq




In [1]:
import pandas as pd
import ast
import json
import elasticsearch
from elasticsearch import Elasticsearch
from tqdm.notebook import tqdm, tqdm_notebook 

### Get data

In [2]:
# Keep the URL and the DataFrame in separate names so the cell can be re-run safely.
data_url = 'https://raw.githubusercontent.com/hariprasath-v/Nnet101_Assistant/refs/heads/main/data/Stackoverflow_data(neural_networks_stats)_pre_processed_Gemini_LLM.csv'
data = pd.read_csv(data_url)
data.head()

Unnamed: 0,q_title,q_link,q_tags,q_question_id,q_is_answered,q_accepted_answer_id,q_view_count,q_answer_count,q_score,q_last_activity_date,q_creation_date,a_score,a_creation_date,a_answer,llm_answer_summary
0,How to choose the number of hidden layers and ...,https://stats.stackexchange.com/questions/181/...,"['model-selection', 'neural-networks']",181,True,1097,1145532,10,820,1661947755,1279584902,671,1280715630,"I realize this question has been answered, but...",**Network Configuration in Neural Networks**\n...
1,What should I do when my neural network doesn&...,https://stats.stackexchange.com/questions/3520...,"['neural-networks', 'faq']",352036,True,352037,365347,9,368,1701358003,1529367960,455,1529367960,1. Verify that your code is bug free\nThere's...,**Summary:**\n\nBuilding neural networks requi...
2,"What exactly are keys, queries, and values in ...",https://stats.stackexchange.com/questions/4219...,"['neural-networks', 'natural-language', 'atten...",421935,True,424127,260772,11,309,1708928023,1565686855,281,1567068576,The key/value/query formulation of attention i...,Attention is a retrieval process that involves...
3,What is batch size in neural network?,https://stats.stackexchange.com/questions/1535...,"['neural-networks', 'python', 'terminology', '...",153531,True,153535,730947,6,305,1650529048,1432286121,421,1432288067,The batch size defines the number of samples t...,**Batch Size: Optimization in Deep Learning**\...
4,What are the advantages of ReLU over sigmoid f...,https://stats.stackexchange.com/questions/1262...,"['machine-learning', 'neural-networks', 'sigmo...",126238,True,126362,290838,9,234,1723495231,1417486429,205,1417567286,Two additional major benefits of ReLUs are spa...,"ReLU (Rectified Linear Unit) functions, define..."


### Data pre-processing
#### Combine tags with a pipe separator

In [3]:
ast.literal_eval(data['q_tags'][0])

['model-selection', 'neural-networks']

In [4]:
data['q_tags'] = data['q_tags'].apply(lambda x: "|".join(i.strip() for i in ast.literal_eval(x)))
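
The `q_tags` column holds each tag list as a string, so it has to be parsed before joining. The sketch below shows that step on a single value outside pandas. `parse_tags` is a hypothetical helper name, not part of the notebook. One thing to keep in mind: the index mapping defined later declares `tags` as a `keyword` field, which is not analyzed, so the pipe-joined string is stored as one term. If matching on individual tags were needed, passing the parsed list instead of the joined string would be one option, since Elasticsearch indexes each element of an array separately.

```python
import ast


def parse_tags(raw):
    """Parse a stringified list such as "['a', 'b']" into a list of clean tag strings."""
    return [tag.strip() for tag in ast.literal_eval(raw)]


raw = "['model-selection', 'neural-networks']"
tags = parse_tags(raw)      # list form, one entry per tag
joined = "|".join(tags)     # single-string form used in this notebook

print(tags)
print(joined)
```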

#### Column rename

In [5]:
data_1 = data[['q_title','q_tags','llm_answer_summary']].rename(columns={'q_title':'question','q_tags':'tags','llm_answer_summary':'answer'}).to_dict(orient='records')

### Elasticsearch setup

In [6]:
es_client = Elasticsearch('http://localhost:9200/', request_timeout=60) 

In [7]:
!curl localhost:9200

{
  "name" : "4da7f4364af6",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "KEosNzKHRpCSdDQkICVeNA",
  "version" : {
    "number" : "8.4.3",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "42f05b9372a9a4a470db3b52817899b99a76ee73",
    "build_date" : "2022-10-04T07:17:24.662462378Z",
    "build_snapshot" : false,
    "lucene_version" : "9.3.0",
    "minimum_wire_compatibility_version" : "7.17.0",
    "minimum_index_compatibility_version" : "7.0.0"
  },
  "tagline" : "You Know, for Search"
}


### Create the index in Elasticsearch

In [8]:
index_settings = {
    "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0
    },
    "mappings": {
        "properties": {
            "question": {"type": "text"},
            "answer": {"type": "text"},
            "tags": {"type": "keyword"},
            
        }
    }
}

index_name = "nnet101"

es_client.indices.delete(index=index_name, ignore_unavailable=True)
es_client.indices.create(index=index_name, body=index_settings)

ObjectApiResponse({'acknowledged': True, 'shards_acknowledged': True, 'index': 'nnet101'})

In [9]:
for doc in tqdm_notebook(data_1):
    es_client.index(index=index_name, document=doc)
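
The loop above sends one request per document, which is fine for 100 records. For larger datasets, the bulk helper that ships with the Python client can batch the requests. This is a minimal sketch, not what the notebook runs: `build_bulk_actions` and `bulk_index` are hypothetical names, and the sample documents are made up for illustration.

```python
def build_bulk_actions(docs, index_name):
    """Yield one bulk action per document, in the dict shape accepted by elasticsearch.helpers.bulk."""
    for doc in docs:
        yield {"_index": index_name, "_source": doc}


def bulk_index(es_client, docs, index_name):
    """Index all documents in batched requests; returns (success_count, errors)."""
    # Imported here so the action builder above can be inspected without a running cluster.
    from elasticsearch import helpers

    return helpers.bulk(es_client, build_bulk_actions(docs, index_name))


# Illustrative records in the same shape as data_1 (values are invented).
sample_docs = [
    {"question": "q1", "tags": "a|b", "answer": "a1"},
    {"question": "q2", "tags": "c", "answer": "a2"},
]
actions = list(build_bulk_actions(sample_docs, "nnet101"))
print(actions[0])
```

With a live client, `bulk_index(es_client, data_1, index_name)` would replace the loop.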

  0%|          | 0/100 [00:00<?, ?it/s]

### Elasticsearch query function

In [10]:
def elastic_search(query):
    search_query = {
        "size": 5,
        "query": {
            "bool": {
                "must": {
                    "multi_match": {
                        "query": query,
                        "fields": ["question^3", "answer", "tags"],
                        "type": "best_fields"
                    }
                }
            }
        }
    }

    response = es_client.search(index=index_name, body=search_query)

    # Keep only the stored documents; drop scores and other hit metadata.
    return [hit['_source'] for hit in response['hits']['hits']]
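
In the query above, `question^3` multiplies the score of matches in the `question` field by three, and `best_fields` scores each document by its best-matching field. To experiment with these settings without touching the cluster, the query body can be built by a separate function. This is a sketch under assumptions: `build_search_query` and its `size` and `question_boost` parameters are hypothetical, introduced only for illustration.

```python
def build_search_query(query, size=5, question_boost=3):
    """Return the search body used in this notebook, with the result count and boost made adjustable."""
    fields = [f"question^{question_boost}", "answer", "tags"]
    return {
        "size": size,
        "query": {
            "bool": {
                "must": {
                    "multi_match": {
                        "query": query,
                        "fields": fields,
                        "type": "best_fields",
                    }
                }
            }
        },
    }


body = build_search_query("What is pooling layer?")
print(body["query"]["bool"]["must"]["multi_match"]["fields"])
```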

### Sample Elasticsearch query

In [11]:
elastic_search(
    query="What is pooling layer?"
)

[{'question': 'What is global max pooling layer and what is its advantage over maxpooling layer?',
  'tags': 'neural-networks|conv-neural-network|pooling',
  'answer': "Global max pooling is a max pooling operation where the pool size equals the input size. It outputs the maximum value for each feature across the input's temporal dimension. Ordinary max pooling, in contrast, takes a specified pool size and outputs maximum values within that window.\n\nIn Keras, the `GlobalMaxPooling1D` layer performs global max pooling on 1D temporal data. It converts a 3D tensor (samples, steps, features) to a 2D tensor (samples, features).\n\nGlobal max pooling is commonly used in domains like natural language processing, while ordinary max pooling is more prevalent in domains like computer vision."},
 {'question': 'What is an embedding layer in a neural network?',
  'tags': 'machine-learning|neural-networks|python|word-embeddings',
  'answer': 'Word2Vec, a natural language processing technique, repr

### Ollama’s OpenAI-compatible API endpoint

In [12]:
from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1',
    api_key='ollama',
)


### Functions to create a prompt from the retrieved results and the user query, and to call the LLM

In [13]:
def build_prompt(query, search_results):
    prompt_template = """
You're a course teaching assistant. Answer the QUESTION based on the CONTEXT from the FAQ database.
Use only the facts from the CONTEXT when answering the QUESTION.

QUESTION: {question}

CONTEXT: 
{context}
""".strip()

    context = ""
    
    for doc in search_results:
        context = context + f"tags: {doc['tags']}\nquestion: {doc['question']}\nanswer: {doc['answer']}\n\n"
    
    prompt = prompt_template.format(question=query, context=context).strip()
    return prompt

def llm(prompt):
    response = client.chat.completions.create(
        model="gemma:2b",
        messages=[{"role": "user", "content": prompt}]
    )
    
    return response.choices[0].message.content

### RAG
#### The `rag` function retrieves Elasticsearch results for the user's query, builds a prompt from those results, and passes it to the LLM to generate the final response.


In [14]:
def rag(query):
    search_results = elastic_search(query)
    prompt = build_prompt(query, search_results)
    answer = llm(prompt)
    return answer
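
The retrieve, prompt, generate flow can be exercised without Elasticsearch or Ollama running by passing the retriever and the generator in as arguments. The sketch below is an assumption-laden variant, not the notebook's code: `rag_with`, `fake_retrieve`, and `echo_generate` are hypothetical names, the prompt template is shortened, and the sample document is invented.

```python
PROMPT_TEMPLATE = """
Answer the QUESTION based on the CONTEXT from the FAQ database.

QUESTION: {question}

CONTEXT:
{context}
""".strip()


def rag_with(query, retrieve, generate):
    """Run retrieve -> build prompt -> generate, with both services injected as callables."""
    docs = retrieve(query)
    context = "".join(
        f"tags: {d['tags']}\nquestion: {d['question']}\nanswer: {d['answer']}\n\n"
        for d in docs
    )
    prompt = PROMPT_TEMPLATE.format(question=query, context=context).strip()
    return generate(prompt)


def fake_retrieve(query):
    # Stand-in for elastic_search: always returns one invented document.
    return [{"question": "What is pooling?", "tags": "pooling",
             "answer": "Pooling downsamples feature maps."}]


def echo_generate(prompt):
    # Stand-in for llm: returns the prompt so it can be inspected.
    return prompt


result = rag_with("What is pooling layer?", fake_retrieve, echo_generate)
print(result)
```

Calling `rag_with(query, elastic_search, llm)` would reproduce the behaviour of `rag` apart from the shortened template.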

In [15]:
%%time
print(llm('what is pooling layer?'))

Sure. Here's the definition of the pooling layer:

**Pooling Layer**

A pooling layer is a type of neural network layer used in computer vision and natural language processing (NLP) to reduce the dimensionality of data while preserving important features. It achieves this by taking multiple smaller feature maps and combining them into a single larger map. This technique helps to reduce computational costs and improve model performance.

**Key Characteristics of Pooling Layers:**

* **Input:** Pooling layers take multiple input feature maps of equal size as their input.
* **Output:** The output is a single feature map with the same dimensions as the input.
* **Method:** There are different pooling methods, such as max pooling, average pooling, and max-pooling with a kernel size of 2x2.
* **Regularization:** Pooling layers can be followed by a regularizer layer to prevent overfitting.

**How Pooling Layers Work:**

Pooling layers work by taking the average or maximum value of each pixel 