### About
### Build an LLM-powered RAG pipeline using a simple TF-IDF similarity search.

### Import necessary libraries and packages

In [None]:
%pip install openai -qq

In [36]:
import pandas as pd
import ast
import json

### Get data

In [2]:
data = 'https://raw.githubusercontent.com/hariprasath-v/Nnet101_Assistant/refs/heads/main/data/Stackoverflow_data(neural_networks_stats)_pre_processed_Gemini_LLM.csv'
data = pd.read_csv(data)
data.head()

Unnamed: 0,q_title,q_link,q_tags,q_question_id,q_is_answered,q_accepted_answer_id,q_view_count,q_answer_count,q_score,q_last_activity_date,q_creation_date,a_score,a_creation_date,a_answer,llm_answer_summary
0,How to choose the number of hidden layers and ...,https://stats.stackexchange.com/questions/181/...,"['model-selection', 'neural-networks']",181,True,1097,1145532,10,820,1661947755,1279584902,671,1280715630,"I realize this question has been answered, but...",**Network Configuration in Neural Networks**\n...
1,What should I do when my neural network doesn&...,https://stats.stackexchange.com/questions/3520...,"['neural-networks', 'faq']",352036,True,352037,365347,9,368,1701358003,1529367960,455,1529367960,1. Verify that your code is bug free\nThere's...,**Summary:**\n\nBuilding neural networks requi...
2,"What exactly are keys, queries, and values in ...",https://stats.stackexchange.com/questions/4219...,"['neural-networks', 'natural-language', 'atten...",421935,True,424127,260772,11,309,1708928023,1565686855,281,1567068576,The key/value/query formulation of attention i...,Attention is a retrieval process that involves...
3,What is batch size in neural network?,https://stats.stackexchange.com/questions/1535...,"['neural-networks', 'python', 'terminology', '...",153531,True,153535,730947,6,305,1650529048,1432286121,421,1432288067,The batch size defines the number of samples t...,**Batch Size: Optimization in Deep Learning**\...
4,What are the advantages of ReLU over sigmoid f...,https://stats.stackexchange.com/questions/1262...,"['machine-learning', 'neural-networks', 'sigmo...",126238,True,126362,290838,9,234,1723495231,1417486429,205,1417567286,Two additional major benefits of ReLUs are spa...,"ReLU (Rectified Linear Unit) functions, define..."


### Data pre-processing
#### Combine tags into a single pipe-separated string

In [3]:
ast.literal_eval(data['q_tags'][0])

['model-selection', 'neural-networks']

In [4]:
data['q_tags'] = data['q_tags'].apply(lambda x: "|".join(i.strip() for i in ast.literal_eval(x)))

#### Column rename

In [22]:
data_1 = data[['q_title','q_tags','llm_answer_summary']].rename(columns={'q_title':'question','q_tags':'tags','llm_answer_summary':'answer'}).to_dict(orient='records')

#### Get the TF-IDF minsearch code from [DataTalks LLM Zoomcamp](https://github.com/DataTalksClub/llm-zoomcamp/blob/main/01-intro/minsearch.py)

In [6]:

!wget https://raw.githubusercontent.com/alexeygrigorev/minsearch/main/minsearch.py

--2024-10-05 06:37:13--  https://raw.githubusercontent.com/alexeygrigorev/minsearch/main/minsearch.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.109.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3832 (3.7K) [text/plain]
Saving to: ‘minsearch.py.1’


2024-10-05 06:37:13 (31.9 MB/s) - ‘minsearch.py.1’ saved [3832/3832]



### Import minsearch

In [7]:
import minsearch

### Data sample

In [37]:
data_1[0]

{'question': 'How to choose the number of hidden layers and nodes in a feedforward neural network?',
 'tags': 'model-selection|neural-networks',
 'answer': '**Network Configuration in Neural Networks**\n\nNeural networks require network configuration, which involves determining the number and types of layers and the number of neurons within each layer.\n\n**Standard Method:**\n\n* Initialize a competent network architecture using a set of rules that determine the number and size of input, hidden, and output layers.\n\n**Optimization:**\n\n* Once initialized, the network configuration can be iteratively tuned during training using pruning techniques.\n* Pruning eliminates unnecessary nodes based on their low weight values.\n\n**Layer Configuration:**\n\n* Input layer: Number of neurons determined by the number of features in the training data.\n* Output layer: Number of neurons determined by the model configuration (classifier vs. regressor).\n* Hidden layers: Typically one hidden layer

### Sample TF-IDF search

#### Index setting

In [32]:
index = minsearch.Index(
    text_fields=["question",  "answer"],
    keyword_fields=["tags"]
)

#### Fit the data

In [None]:
index.fit(data_1)
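
For intuition about what fitting and searching compute, here is a minimal pure-Python sketch of TF-IDF vectors and cosine similarity on three toy questions. It is an illustration only, not minsearch's actual implementation; the toy documents, the whitespace tokenizer, and the smoothed IDF formula are all assumptions.

```python
import math
from collections import Counter

# Toy corpus (assumed for illustration, not taken from the dataset)
docs = [
    "what is batch size in neural network",
    "why is max pooling necessary in convolutional neural networks",
    "what is global max pooling layer",
]

def tokenize(text):
    return text.lower().split()

# Document frequency: number of documents that contain each term
df = Counter(term for doc in docs for term in set(tokenize(doc)))
n_docs = len(docs)

def tfidf(text):
    """Return a sparse TF-IDF vector (term -> weight) for one text."""
    counts = Counter(tokenize(text))
    # Smoothed IDF; terms that never occur in the corpus are skipped
    return {
        t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1)
        for t, c in counts.items()
        if t in df
    }

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

query_vec = tfidf("what is pooling layer")
scores = [cosine(query_vec, tfidf(d)) for d in docs]
best = max(range(n_docs), key=scores.__getitem__)
print(docs[best])  # the global max pooling question ranks first
```

Documents sharing rare query terms such as "layer" score higher than documents that only share common terms such as "is".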

#### Sample query

In [46]:
q = 'what is pooling layer?'

#### Sample search results

In [47]:
index.search(q)

[{'question': 'What is global max pooling layer and what is its advantage over maxpooling layer?',
  'tags': 'neural-networks|conv-neural-network|pooling',
  'answer': "Global max pooling is a max pooling operation where the pool size equals the input size. It outputs the maximum value for each feature across the input's temporal dimension. Ordinary max pooling, in contrast, takes a specified pool size and outputs maximum values within that window.\n\nIn Keras, the `GlobalMaxPooling1D` layer performs global max pooling on 1D temporal data. It converts a 3D tensor (samples, steps, features) to a 2D tensor (samples, features).\n\nGlobal max pooling is commonly used in domains like natural language processing, while ordinary max pooling is more prevalent in domains like computer vision."},
 {'question': 'Why is max pooling necessary in convolutional neural networks?',
  'tags': 'deep-learning|conv-neural-network|pooling',
  'answer': "**Summary:**\n\nPooling layers, while providing transl

### Create a function to retrieve the top 5 results using TF-IDF search

In [21]:
def search(query):
    results = index.search(
        query=query,
        num_results=5
    )

    return results
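
A possible variant, sketched under assumptions: minsearch's `Index.search` also accepts a `boost_dict` that weights text fields, so matches in the question title can count more than matches in the answer body. The function name `search_boosted` and the weights below are hypothetical and would need tuning on real queries.

```python
def search_boosted(index, query, num_results=5):
    """Hypothetical variant of search(): weight question matches above answer matches."""
    # Assumed weights; a higher value makes matches in that field count more
    boost = {"question": 3.0, "answer": 1.0}
    return index.search(
        query=query,
        boost_dict=boost,
        num_results=num_results,
    )
```

It would be called as `search_boosted(index, q)` with the fitted `index` from above.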

### Ollama’s OpenAI-compatible API endpoint

In [39]:
from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1',
    api_key='ollama',
)


### Functions to create a prompt from the retrieved results and the user query, and to call the LLM

In [40]:
def build_prompt(query, search_results):
    prompt_template = """
You're a course teaching assistant. Answer the QUESTION based on the CONTEXT from the FAQ database.
Use only the facts from the CONTEXT when answering the QUESTION.

QUESTION: {question}

CONTEXT: 
{context}
""".strip()

    context = ""
    
    for doc in search_results:
        context = context + f"tags: {doc['tags']}\nquestion: {doc['question']}\nanswer: {doc['answer']}\n\n"
    
    prompt = prompt_template.format(question=query, context=context).strip()
    return prompt

def llm(prompt):
    response = client.chat.completions.create(
        model="gemma:2b",
        messages=[{"role": "user", "content": prompt}]
    )
    
    return response.choices[0].message.content
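
The summaries in the dataset can be long, and five of them are concatenated into one prompt. If prompt length becomes a concern for a small model, one option is to cap each answer before formatting. The sketch below is a hypothetical helper (`format_context` and `max_answer_chars` are not part of the original notebook) that produces the same per-document layout as `build_prompt`.

```python
def format_context(search_results, max_answer_chars=1000):
    """Hypothetical helper: render retrieved docs as the CONTEXT block,
    truncating each answer to max_answer_chars characters."""
    entries = []
    for doc in search_results:
        answer = doc["answer"][:max_answer_chars]
        entries.append(
            f"tags: {doc['tags']}\nquestion: {doc['question']}\nanswer: {answer}"
        )
    # Blank line between documents, as in build_prompt
    return "\n\n".join(entries)

# Toy record with an over-long answer (assumed data)
sample = [{"tags": "pooling", "question": "What is pooling?", "answer": "x" * 5000}]
print(format_context(sample)[:60])
```

Character-based truncation is a crude stand-in for token counting, but it needs no tokenizer.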

### RAG
#### The function retrieves TF-IDF results based on the user's query, creates a prompt using those results, and feeds it into the LLM to generate the final response.


In [41]:
def rag(query):
    search_results = search(query)
    prompt = build_prompt(query, search_results)
    answer = llm(prompt)
    return answer
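
To check the wiring of the three steps without a running Ollama server, the same flow can be written with each step passed in as a callable. This is a sketch: `rag_with` and the stand-in functions below are hypothetical and only mimic the shapes of `search`, `build_prompt`, and `llm`.

```python
def rag_with(query, search_fn, prompt_fn, llm_fn):
    """Same three steps as rag(), with each step supplied by the caller."""
    search_results = search_fn(query)
    prompt = prompt_fn(query, search_results)
    return llm_fn(prompt)

# Stand-ins for search(), build_prompt() and llm() (all hypothetical)
def fake_search(query):
    return [{"tags": "pooling", "question": "What is pooling?", "answer": "Downsampling."}]

def fake_prompt(query, docs):
    return f"QUESTION: {query}\nCONTEXT: {docs[0]['answer']}"

def fake_llm(prompt):
    # Echo the prompt in upper case instead of calling a model
    return prompt.upper()

result = rag_with("what is pooling layer?", fake_search, fake_prompt, fake_llm)
print(result)
```

With the real functions defined in this notebook, the equivalent call would be `rag_with(q, search, build_prompt, llm)`.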

In [44]:
%%time
# Baseline: query the LLM directly, without any retrieved context
print(llm('what is pooling layer?'))

Sure, here's a breakdown of the term "pooling layer":

**In the context of artificial intelligence (AI):**

A **pooling layer** is a specific processing layer in deep neural networks that performs a downsampling operation on the input data. It is used to **reduce the dimensionality** of the data and **speed up training and inference**.

**Key features of pooling layers:**

* They reduce the spatial dimensions of the input by extracting a small patch of features and applying a function (such as averaging or max-pooling) to the patch.
* They do not modify the number of channels in the input data.
* They are common at the **input layer** of deep neural networks.
* Pooling layers can be used with various **activation functions**, such as ReLU, max-out, and average-pooling.

**Types of pooling layers:**

* **Max pooling:** The maximum value from the input patch is selected.
* **Average pooling:** The average values from the input patch are calculated.
* **Min pooling:** The minimum value fr