### About
### Build an LLM-powered retrieval-augmented generation (RAG) pipeline using a simple TF-IDF similarity search.

### Import necessary libraries and packages

In [1]:
%pip install openai -qq

Note: you may need to restart the kernel to use updated packages.


In [3]:
import pandas as pd
import json

### Get data

In [4]:
data = 'https://raw.githubusercontent.com/hariprasath-v/Nnet101_Assistant/refs/heads/main/data/Stackoverflow_data(neural_networks_stats)_pre_processed_Gemini_LLM.csv'
data = pd.read_csv(data)
data.head()

Unnamed: 0,question,q_link,tags,q_question_id,q_is_answered,q_accepted_answer_id,q_view_count,q_answer_count,q_score,q_last_activity_date,q_creation_date,a_score,a_creation_date,a_answer,answer
0,How to choose the number of hidden layers and ...,https://stats.stackexchange.com/questions/181/...,model-selection|neural-networks,181,True,1097,1145801,10,820,1661947755,1279584902,671,1280715630,"I realize this question has been answered, but...",**Network Configuration in Neural Networks**\n...
1,What should I do when my neural network doesn&...,https://stats.stackexchange.com/questions/3520...,neural-networks|faq,352036,True,352037,365434,9,368,1701358003,1529367960,455,1529367960,1. Verify that your code is bug free\nThere's...,**Key Considerations for Neural Network Develo...
2,"What exactly are keys, queries, and values in ...",https://stats.stackexchange.com/questions/4219...,neural-networks|natural-language|attention|mac...,421935,True,424127,261109,11,309,1708928023,1565686855,281,1567068576,The key/value/query formulation of attention i...,In the key/value/query formulation of attentio...
3,What is batch size in neural network?,https://stats.stackexchange.com/questions/1535...,neural-networks|python|terminology|keras,153531,True,153535,731148,6,305,1650529048,1432286121,421,1432288067,The batch size defines the number of samples t...,**Summary**\n\n**Batch Size**\n\nBatch size de...
4,What are the advantages of ReLU over sigmoid f...,https://stats.stackexchange.com/questions/1262...,machine-learning|neural-networks|sigmoid-curve...,126238,True,126362,290897,9,234,1723495231,1417486429,205,1417567286,Two additional major benefits of ReLUs are spa...,**Summary:**\n\nRectified Linear Units (ReLUs)...


#### Get the TF-IDF minsearch code from [DataTalks LLM Zoomcamp](https://github.com/DataTalksClub/llm-zoomcamp/blob/main/01-intro/minsearch.py)

In [6]:

!wget https://raw.githubusercontent.com/alexeygrigorev/minsearch/main/minsearch.py

--2024-10-05 06:37:13--  https://raw.githubusercontent.com/alexeygrigorev/minsearch/main/minsearch.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.109.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3832 (3.7K) [text/plain]
Saving to: ‘minsearch.py.1’


2024-10-05 06:37:13 (31.9 MB/s) - ‘minsearch.py.1’ saved [3832/3832]



### Import minsearch

In [6]:
import minsearch

In [8]:
data_dict = data[['question','tags','answer']].to_dict(orient='records')

### Data sample

In [9]:
data_dict[0]

{'question': 'How to choose the number of hidden layers and nodes in a feedforward neural network?',
 'tags': 'model-selection|neural-networks',
 'answer': "**Network Configuration in Neural Networks**\n\n**Standardization**\nThere is no single standardized method for configuring networks. However, guidelines exist for setting the number and type of network layers, as well as the number of neurons in each layer.\n\n**Initial Architecture Setup**\nBy following specific rules, one can establish a competent network architecture. This involves determining the number and type of neuronal layers and the number of neurons within each layer. This approach provides a foundational architecture but may not be optimal.\n\n**Iterative Tuning**\nOnce the network is initialized, its configuration can be iteratively tuned during training. Ancillary algorithms, such as pruning, can be used to eliminate unnecessary nodes, optimizing the network's size and performance.\n\n**Network Layer Types and Sizing

### Sample TF-IDF search

#### Index setting

In [10]:
index = minsearch.Index(
    text_fields=["question",  "answer"],
    keyword_fields=["tags"]  # must match the key used in data_dict
)

#### Fit the data

In [11]:
index.fit(data_dict)

<minsearch.Index at 0x7802d6e1f110>
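To make the fitting and search steps less of a black box, here is a minimal pure-Python sketch of TF-IDF retrieval with cosine similarity. It is an illustration only, not minsearch's code: the function names (`tokenize`, `fit_tfidf`, `to_vector`, `tfidf_search`) are hypothetical, the tokenizer is a simple regex split, and the smoothed IDF formula `ln((1 + n) / (1 + df)) + 1` is an assumption chosen to mirror scikit-learn's default.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split on non-word characters (a simplification).
    return [t for t in re.split(r"\W+", text.lower()) if t]


def to_vector(tokens, idf):
    # Term frequency times IDF, then L2-normalised; unknown terms are dropped.
    counts = Counter(tokens)
    vec = {t: c * idf[t] for t, c in counts.items() if t in idf}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def fit_tfidf(texts):
    # Learn IDF weights from the corpus and vectorise every document.
    docs_tokens = [tokenize(t) for t in texts]
    n_docs = len(docs_tokens)
    doc_freq = Counter()
    for tokens in docs_tokens:
        doc_freq.update(set(tokens))
    # Smoothed IDF (assumed form): ln((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    vectors = [to_vector(tokens, idf) for tokens in docs_tokens]
    return idf, vectors


def cosine(a, b):
    # Both vectors are unit length, so the dot product is the cosine similarity.
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def tfidf_search(query, texts, idf, vectors, num_results=5):
    # Score every document against the query and keep the best non-zero matches.
    q_vec = to_vector(tokenize(query), idf)
    scored = [(cosine(q_vec, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(texts[i], score) for score, i in scored[:num_results] if score > 0]


# Toy corpus for illustration only (not the Stack Exchange dataset).
texts = [
    "What is batch size in neural network?",
    "What is global max pooling layer?",
    "Advantages of ReLU over sigmoid",
]
idf, vectors = fit_tfidf(texts)
hits = tfidf_search("what is pooling layer?", texts, idf, vectors, num_results=2)
for text, score in hits:
    print(f"{score:.3f}  {text}")
```

Documents that share no terms with the query score zero and are dropped, which is why purely lexical search can miss paraphrased questions.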

#### Sample query

In [12]:
q = 'what is pooling layer?'

#### Sample search results

In [13]:
index.search(q)

[{'question': 'What is global max pooling layer and what is its advantage over maxpooling layer?',
  'tags': 'neural-networks|convolutional-neural-network|pooling',
  'answer': '**Summary:**\n\nGlobal max pooling is a type of max pooling where the pool size is equal to the input size. Unlike regular max pooling, which produces a smaller output, global max pooling produces an output with the same dimensionality as the input.\n\nIn global max pooling, the maximum value across the entire input is extracted, providing a representation that focuses on the most prominent feature. This is useful in applications like natural language processing, where the most important words in a sentence are often indicative of its meaning.\n\nIn contrast, regular max pooling divides the input into smaller segments and extracts the maximum value from each segment, reducing the output size. This is more common in computer vision, where spatial information is important and reducing the size of the representati

### Create a function to retrieve 5 results using TF-IDF search

In [14]:
def search(query):
    results = index.search(
        query=query,
        num_results=5
    )

    return results
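The `tags` column stores several tags in one pipe-delimited string (for example `neural-networks|pooling`). Judging by the minsearch source, keyword filters appear to compare the whole field value, so filtering on a single tag would likely need a post-processing step. The sketch below is one possible way to do that; `filter_by_tag` is a hypothetical helper, not part of minsearch, and the toy results are made up.

```python
def filter_by_tag(results, tag):
    # Keep only the results whose pipe-delimited "tags" string contains the tag.
    wanted = tag.strip().lower()
    kept = []
    for doc in results:
        # str() guards against missing values that pandas may have turned into NaN.
        doc_tags = {t.strip().lower() for t in str(doc.get("tags", "")).split("|")}
        if wanted in doc_tags:
            kept.append(doc)
    return kept


# Toy search results for illustration only.
toy_results = [
    {"question": "What is a pooling layer?", "tags": "neural-networks|pooling", "answer": "(toy answer 1)"},
    {"question": "What is batch size?", "tags": "neural-networks|keras", "answer": "(toy answer 2)"},
]
pooling_only = filter_by_tag(toy_results, "pooling")
print([doc["question"] for doc in pooling_only])
```

In the notebook this could be applied as `filter_by_tag(search(q), "pooling")`, at the cost of possibly returning fewer than 5 results.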

### Ollama’s OpenAI-compatible API endpoint

In [15]:
from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1',
    api_key='ollama',
)


### Functions to create a prompt from the retrieved results and the user query

In [16]:
def build_prompt(query, search_results):
    prompt_template = """
You're a course teaching assistant. Answer the QUESTION based on the CONTEXT from the FAQ database.
Use only the facts from the CONTEXT when answering the QUESTION.

QUESTION: {question}

CONTEXT: 
{context}
""".strip()

    context = ""
    
    for doc in search_results:
        context = context + f"tags: {doc['tags']}\nquestion: {doc['question']}\nanswer: {doc['answer']}\n\n"
    
    prompt = prompt_template.format(question=query, context=context).strip()
    return prompt

def llm(prompt):
    response = client.chat.completions.create(
        model="gemma:2b",
        messages=[{"role": "user", "content": prompt}]
    )
    
    return response.choices[0].message.content
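To see what the CONTEXT part of the prompt looks like before sending anything to the model, the per-document layout from `build_prompt` can be rendered on its own. This is a small sketch: `format_context` is a hypothetical helper that mirrors the loop in `build_prompt`, and the two documents are toy placeholders rather than real search results.

```python
def format_context(search_results):
    # Same per-document layout as the loop in build_prompt:
    # one "tags / question / answer" block per document, separated by a blank line.
    return "".join(
        f"tags: {doc['tags']}\nquestion: {doc['question']}\nanswer: {doc['answer']}\n\n"
        for doc in search_results
    )


# Toy documents for illustration only.
toy_results = [
    {"tags": "neural-networks|pooling", "question": "What is a pooling layer?", "answer": "(toy answer 1)"},
    {"tags": "neural-networks", "question": "What is batch size?", "answer": "(toy answer 2)"},
]
context = format_context(toy_results)
print(context.strip())
```

Printing the rendered context is a quick check that field names match the data and that the prompt stays at a size the model can handle.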

### RAG
#### The function retrieves TF-IDF results based on the user's query, creates a prompt using those results, and feeds it into the LLM to generate the final response.


In [17]:
def rag(query):
    search_results = search(query)
    prompt = build_prompt(query, search_results)
    answer = llm(prompt)
    return answer
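The retrieve, prompt, generate flow can also be exercised without a running Ollama server by passing the three steps in as functions. The sketch below is an assumption-laden illustration: `make_rag` and the stub functions are hypothetical names, and the stubs only imitate the shape of `search`, `build_prompt`, and `llm`.

```python
def make_rag(search_fn, prompt_fn, llm_fn):
    # Wire the three stages together: retrieve -> build prompt -> generate.
    def rag(query):
        search_results = search_fn(query)
        prompt = prompt_fn(query, search_results)
        return llm_fn(prompt)
    return rag


# Stub stages for offline testing (toy data, no model involved).
def fake_search(query):
    return [{"tags": "pooling", "question": "What is a pooling layer?", "answer": "(toy answer)"}]


def fake_prompt(query, search_results):
    return f"QUESTION: {query}\nCONTEXT: {search_results[0]['answer']}"


def echo_llm(prompt):
    # Stands in for the model: returns the prompt in upper case.
    return prompt.upper()


rag_offline = make_rag(fake_search, fake_prompt, echo_llm)
answer = rag_offline("what is pooling layer?")
print(answer)
```

With the real functions, `make_rag(search, build_prompt, llm)` would behave like the `rag` function defined in this notebook.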

In [18]:
%%time
# Baseline: query the LLM directly, without retrieved context (compare with rag(query))
print(llm('what is pooling layer?'))

Sure. Here's a definition of a pooling layer:

**Pooling Layer**

A pooling layer is a type of layer used in deep learning neural networks for reducing the dimensionality of feature maps and increasing computational efficiency. It involves taking a subset of the input features and using them to represent the entire input.

**Key Features:**

* **Supervised learning:** Polling layers are typically used in convolutional neural networks (CNNs) for tasks like image classification, object detection, and segmentation.
* **Spatial pooling:** They operate by taking a fixed size pool of features from the input and then applying a non-linear transformation on the pooled representation.
* **Feature extraction:** The output of a pooling layer is a smaller, more compact representation of the input that captures the essential features of the original data.
* **Dimensionality reduction:** By reducing the dimensionality of feature maps, pooling layers can help to:
    * Reduce the memory consumption o