
In [None]:
!pip install faiss-cpu mistralai

**1. Retrieval-Augmented Generation (RAG)**

RAG is a hybrid approach that combines retrieval and generation to enhance the capabilities of large language models (LLMs). It works in three stages:

+ **Retrieval**: Fetching relevant documents or text chunks from a knowledge base based on a user’s query.

+ **Augmentation**: Using the retrieved information to augment the prompt sent to the LLM.

+ **Generation**: Generating a response using the LLM, informed by the augmented context.

RAG is particularly useful for grounding LLMs in specific, external knowledge, reducing hallucination (when models generate incorrect or fabricated information) and enabling them to answer questions based on provided documents.
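
The three stages can be sketched end to end without any model. This is a toy illustration only: the word-overlap retriever and the `generate` stub are stand-ins invented for this example, not the embedding search and Mistral call used later in this notebook.

```python
# Toy illustration of the three RAG stages; no real embedding model or LLM is used.
knowledge_base = [
    "FAISS is a library for nearest-neighbor search over vectors.",
    "Chunking splits a long document into smaller pieces.",
    "Paris is the capital of France.",
]

def retrieve(query, documents, k=1):
    """Retrieval: rank documents by how many words they share with the query."""
    query_words = set(query.lower().split())

    def overlap(doc):
        return len(query_words & set(doc.lower().split()))

    return sorted(documents, key=overlap, reverse=True)[:k]

def augment(query, retrieved):
    """Augmentation: place the retrieved text in the prompt ahead of the question."""
    context = "\n".join(retrieved)
    return f"Context:\n{context}\n\nQuery: {query}\nAnswer:"

def generate(prompt):
    """Generation: stand-in for an LLM call; it only reports the prompt size."""
    return f"[an LLM would answer here, given a {len(prompt)}-character prompt]"

query = "What is the capital of France?"
retrieved = retrieve(query, knowledge_base)
prompt = augment(query, retrieved)
print(prompt)
print(generate(prompt))
```

In the real pipeline, `retrieve` becomes an embedding similarity search and `generate` becomes a chat-completion request.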

**2. Vector Embeddings**

Raw text cannot be compared directly in a way that is meaningful for similarity search. Vector embeddings are numerical representations of text in a high-dimensional space, where semantically similar texts map to nearby points. They are generated by embedding models such as BERT or, in this notebook, Mistral’s mistral-embed.



*   **How they work:** A pre-trained neural network transforms text into a fixed-length vector (e.g., 1024 dimensions in this code).

*   **Use case:** Embeddings allow similarity searches by computing distances (e.g., Euclidean distance) between vectors.
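
As a small illustration of distance-based similarity, the sketch below compares three made-up 3-dimensional vectors. The values are invented for the example; real embeddings come from a model and have far more dimensions.

```python
import numpy as np

# Made-up 3-dimensional "embeddings"; real ones come from a model and are much longer.
cat = np.array([0.9, 0.1, 0.0])
kitten = np.array([0.8, 0.2, 0.1])
car = np.array([0.0, 0.1, 0.9])

def euclidean(a, b):
    """L2 distance: square root of the summed squared differences."""
    return float(np.linalg.norm(a - b))

# The related pair should be closer together than the unrelated pair.
print(euclidean(cat, kitten))
print(euclidean(cat, car))
```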

**3. Vector Database and FAISS**

A vector database **stores embeddings** and enables efficient **similarity searches**.
**FAISS** (Facebook AI Similarity Search) is a library designed for fast nearest-neighbor searches in high-dimensional spaces.

+ **IndexFlatL2:** This is a simple FAISS index that uses L2 (Euclidean) distance to measure similarity between vectors. It’s suitable for small datasets, as it performs an exact search without approximations.

+ **Search:** Given a query embedding, FAISS returns the indices of the most similar embeddings in the database.
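
To make the exact search concrete, here is a brute-force NumPy sketch of the same idea. It is not FAISS code; `flat_l2_search` is a hypothetical helper written for this example. Like `IndexFlatL2`, it ranks by squared L2 distance, which gives the same ordering as the plain Euclidean distance.

```python
import numpy as np

def flat_l2_search(database, queries, k):
    """Exact k-nearest-neighbor search by squared L2 distance (brute force)."""
    # Pairwise differences, shape (n_queries, n_database, dim).
    diffs = queries[:, None, :] - database[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=2)
    # Indices of the k smallest distances per query, nearest first.
    idx = np.argsort(sq_dists, axis=1)[:, :k]
    dists = np.take_along_axis(sq_dists, idx, axis=1)
    return dists, idx

database = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 5.0]])
queries = np.array([[0.9, 0.1]])
D, I = flat_l2_search(database, queries, k=2)
print(I)  # positions of the two nearest database vectors
print(D)  # their squared distances to the query
```

Comparing the query against every stored vector is why this style of index suits small datasets: the cost grows linearly with the number of vectors.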

**4. Mistral AI**

Mistral AI provides models for both **text generation (mistral-small-2503)** and **embedding generation (mistral-embed)**. The code uses Mistral’s API to:

+ Generate embeddings for text chunks and queries.
+ Generate responses to augmented prompts.

**5. Chunking**

Large documents are often too long to process in one go due to model input limits or computational constraints. Chunking involves splitting text into smaller, manageable pieces (e.g., 2068 characters in this code) while preserving meaning as much as possible.
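
A minimal sketch of fixed-size chunking follows, using a placeholder string with the same length as the essay loaded later in this notebook (75014 characters). The `chunk_text` name is introduced here for illustration only.

```python
def chunk_text(text, chunk_size):
    """Split text into consecutive pieces of at most chunk_size characters."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

# Placeholder string with the same length as the essay used in this notebook.
sample = "x" * 75014
pieces = chunk_text(sample, chunk_size=2068)

# 36 full chunks cover 74448 characters; the remaining 566 form the last chunk.
print(len(pieces), len(pieces[-1]))
```

Slicing by character count ignores sentence and paragraph boundaries, so a chunk can start or end mid-sentence.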




In [None]:
from mistralai import Mistral
import requests
import numpy as np
import faiss
import os
from getpass import getpass

api_key=getpass("Enter the API key:")
client=Mistral(api_key=api_key)

Enter the API key:··········


In [None]:
model_id = "mistral-small-2503"
chat_response = client.chat.complete(
    model=model_id,
    messages=[{"role": "user", "content": "what is the meaning of life?"}],
)
print(chat_response.choices[0].message.content)



The meaning of life is a philosophical question that has been debated for centuries, and it doesn't have one definitive answer as it can vary greatly depending on personal beliefs, religious or spiritual views, and philosophical persuasions. Here are a few perspectives:

1. **Existentialism**: Existentialists like Jean-Paul Sartre argued that life has no inherent meaning, and it's up to each individual to create their own purpose.

2. **Religious and Spiritual Views**: Many religions provide their own answers. For example:
   - In Christianity, the purpose of life might be seen as loving and serving God.
   - In Buddhism, the purpose might be achieving enlightenment.

3. **Hedonism**: Hedonists believe the purpose of life is to seek pleasure and happiness.

4. **Altruism**: Some people find meaning in life through helping others and making the world a better place.

5. **Personal Growth**: Many people find purpose in learning, growing, and becoming the best version of themselves.

6. *

In [None]:
response = requests.get("https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt")
response.raise_for_status()  # stop early if the download failed
text = response.text
# The context manager closes the file once the write finishes.
with open('essay.txt', 'w', encoding='utf-8') as f:
  f.write(text)

In [None]:
len(text)

75014

Chunking the data

In [None]:
chunk_size=2068
chunks=[text[i:i+chunk_size] for i in range(0,len(text),chunk_size) ]


In [None]:
len(chunks)

37

Creating a Numerical Representation of the Data

In [None]:
def get_text_embeddings(text):
  # Embed a single string and return its vector as a list of floats.
  embeddings_batch_response = client.embeddings.create(
      model="mistral-embed",
      inputs=text
  )
  return embeddings_batch_response.data[0].embedding

In [None]:
get_text_embeddings(chunks[0])

In [None]:
def get_text_embedding_batch(batch):
  # Embed a list of strings in one request and return one vector per input.
  embedding_batch_response = client.embeddings.create(
      model="mistral-embed",
      inputs=batch
  )
  return [item.embedding for item in embedding_batch_response.data]
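
The batch function above sends every chunk in a single request. For a larger corpus one request may be too big for the API to accept; this is an assumption, since the notebook does not state any limit. A sketch of splitting the work into smaller groups is shown below. `embed_in_batches` and `fake_embed_batch` are hypothetical names, and the fake embedder exists only so the example runs without an API key.

```python
def embed_in_batches(texts, embed_batch, batch_size=16):
    """Embed texts in groups of batch_size and return one flat list of vectors."""
    vectors = []
    for start in range(0, len(texts), batch_size):
        vectors.extend(embed_batch(texts[start:start + batch_size]))
    return vectors

# Stand-in for get_text_embedding_batch so the sketch runs without an API key.
def fake_embed_batch(batch):
    return [[float(len(text))] for text in batch]

vectors = embed_in_batches(
    ["a", "bb", "ccc", "dddd", "eeeee"], fake_embed_batch, batch_size=2
)
print(vectors)
```

In the notebook, `get_text_embedding_batch` would be passed in place of `fake_embed_batch`.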



In [None]:
text_embeddings=np.array(get_text_embedding_batch(chunks))

In [None]:
text_embeddings.shape

(37, 1024)

Indexing (Populating the Vector DB with the Chunk Embeddings)

In [None]:
d=text_embeddings.shape[1]
index=faiss.IndexFlatL2(d)
index.add(text_embeddings)

In [None]:
question="What were the two main things the author worked on before college?"
question_embedding=np.array([get_text_embeddings(question)])
question_embedding.shape


(1, 1024)

In [None]:
question_embedding

array([[-0.05447388,  0.03479004,  0.0375061 , ..., -0.02787781,
        -0.00327492,  0.0029068 ]])

Finding the Most Similar Chunks (Retrieval)

In [None]:
D,I=index.search(question_embedding,k=2)
print(I)

[[ 0 32]]
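
Each row of `I` holds positions in the `chunks` list, nearest first. FAISS fills a slot with -1 when it has fewer than `k` results to return, and in Python `chunks[-1]` would silently pick the last chunk. The sketch below guards against that; `select_chunks` and the toy data are hypothetical and not part of the original notebook.

```python
import numpy as np

def select_chunks(chunks, indices):
    """Map one row of result indices to chunk texts, skipping -1 placeholders."""
    return [chunks[i] for i in indices if i >= 0]

toy_chunks = ["chunk zero", "chunk one", "chunk two"]
# Made-up search result for k=3 where only two neighbors were found.
toy_I = np.array([[2, 0, -1]])
print(select_chunks(toy_chunks, toy_I[0].tolist()))
```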


In [None]:
retrieved_chunk=[chunks[i] for i in I.tolist()[0]]
print(retrieved_chunk)

['\n\nWhat I Worked On\n\nFebruary 2021\n\nBefore college the two main things I worked on, outside of school, were writing and programming. I didn\'t write essays. I wrote what beginning writers were supposed to write then, and probably still are: short stories. My stories were awful. They had hardly any plot, just characters with strong feelings, which I imagined made them deep.\n\nThe first programs I tried writing were on the IBM 1401 that our school district used for what was then called "data processing." This was in 9th grade, so I was 13 or 14. The school district\'s 1401 happened to be in the basement of our junior high school, and my friend Rich Draves and I got permission to use it. It was like a mini Bond villain\'s lair down there, with all these alien-looking machines — CPU, disk drives, printer, card reader — sitting up on a raised floor under bright fluorescent lights.\n\nThe language we used was an early version of Fortran. You had to type programs on punch cards, then 

Augmentation

In [None]:
prompt=f"""
Context information is below.
----------------------------
{retrieved_chunk}
----------------------------
Given the context information and not prior knowledge, answer the query.
Query: {question}
Answer:
"""
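
Because `retrieved_chunk` is a Python list, the f-string above inserts its printed form, including brackets, quotes, and escaped newlines. One possible alternative, sketched under the assumption that plain joined text is preferable, is to build the context explicitly. `build_prompt` is a hypothetical helper, not part of the original notebook.

```python
def build_prompt(chunks, question):
    """Join retrieved chunks into one context block and append the question."""
    context = "\n\n".join(chunk.strip() for chunk in chunks)
    return (
        "Context information is below.\n"
        "---------------------\n"
        f"{context}\n"
        "---------------------\n"
        "Given the context information and not prior knowledge, answer the query.\n"
        f"Query: {question}\n"
        "Answer:"
    )

example = build_prompt(["First chunk.\n", "Second chunk."], "What is in the chunks?")
print(example)
```

With the notebook's variables, the call would be `build_prompt(retrieved_chunk, question)`.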


In [None]:
def run_mistral(user_message, model="mistral-small-2503"):
  # Send a single user message to the chat model and return the reply text.
  messages = [{"role": "user", "content": user_message}]
  chat_response = client.chat.complete(model=model, messages=messages)
  return chat_response.choices[0].message.content

In [None]:
run_mistral(prompt)

'Before college, the author worked on two main things outside of school: writing and programming. Specifically, the author wrote short stories and began programming on an IBM 1401 using an early version of Fortran.'

In [None]:
# Without the knowledge base: the bare question, no retrieved context
run_mistral(question)

'A. The author worked on a farm and did a lot of reading.'