In [None]:
from transformers import AutoModel, AutoTokenizer
import torch
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

In [10]:
def character_chunking(chunk_size):
    with open("../content.txt", "r") as file:
        content = file.read()

    chunks = []

    for i in range(0, len(content), chunk_size):
        chunk = content[i : i + chunk_size]
        chunks.append(chunk)

    return chunks

In [19]:
chunks = character_chunking(300)
chunks

['Retrieval-Augmented Generation (RAG) is a groundbreaking technique in natural language processing that combines the strengths of retrieval-based and generative models to create a system capable of producing highly accurate and contextually relevant text. By leveraging both retrieval and generation, ',
 'RAG models address many of the limitations of traditional models, offering a more robust and flexible approach to various tasks. The process begins with the retrieval of relevant documents or passages from a large corpus. The retrieval component typically uses dense embeddings, which are vector repr',
 'esentations learned to capture the semantic meaning of the text. These embeddings allow the model to measure the similarity between the input query and potential documents, even when they do not share exact keywords. Dense retrieval models, often based on transformer architectures like BERT, excel a',
 't finding contextually relevant information.\n\nOnce the most relevant documents ar

In [11]:
def calculate_embeddings(text, model, tokeniser):
    inputs = tokeniser(text, return_tensors = "pt", truncation = True, padding = True)
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.last_hidden_state.mean(dim = 1).squeeze().numpy()

In [12]:
def calculate_similarity(embedding1, embedding2):
    similarity = cosine_similarity([embedding1], [embedding2])
    return similarity[0][0]

In [13]:
def test_with_model(chunks, user_query, top_k, model_name = "bert-base-uncased"):
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)

    user_query_embedding = calculate_embeddings(user_query, model, tokenizer)
    embeddings = [calculate_embeddings(chunk, model, tokenizer) for chunk in chunks]
    count = [i for i in range(len(chunks))]
    scores = {i: calculate_similarity(user_query_embedding, embedding) for embedding, i in zip(embeddings, count)}
    sorted_chunks = sorted(scores.items(), key = lambda item: item[1])[:top_k]
    
    print("Matching chunks:\n")
    for chunk in sorted_chunks:
        print(chunks[chunk[0]])

In [57]:
test_with_model(chunks, user_query="What is rag?", top_k=3)

Matching chunks:

a powerful and versatile approach to natural language processing. By combining the strengths of retrieval-based and generative models, RAG models can provide accurate, contextually relevant, and coherent responses to a wide range of queries. Their ability to handle complex, open-domain tasks, integr
model is trained to optimize both retrieval and generation in a joint manner. This fine-tuning process helps the model adapt to the specific requirements of the target application, improving its performance on real-world tasks.

RAG models have shown significant promise in a variety of applications.
AG models can retrieve relevant medical literature and generate evidence-based recommendations, helping healthcare professionals make informed decisions. In patient education, RAG models can generate personalized explanations of medical conditions or treatments, improving patient understanding and e


In [17]:
# Using langchain
from langchain.text_splitter import CharacterTextSplitter

def langchain_character_chunking(chunk_size):
    text_splitter = CharacterTextSplitter(chunk_size = chunk_size, chunk_overlap = 0)

    with open("../content.txt", "r") as file:
        content = file.read()

    chunks = text_splitter.split_text(content)
    return chunks

In [21]:
chunks = langchain_character_chunking(300)
chunks

Created a chunk of size 944, which is longer than the specified 300
Created a chunk of size 739, which is longer than the specified 300
Created a chunk of size 712, which is longer than the specified 300
Created a chunk of size 739, which is longer than the specified 300
Created a chunk of size 703, which is longer than the specified 300
Created a chunk of size 499, which is longer than the specified 300
Created a chunk of size 925, which is longer than the specified 300
Created a chunk of size 727, which is longer than the specified 300
Created a chunk of size 523, which is longer than the specified 300
Created a chunk of size 586, which is longer than the specified 300
Created a chunk of size 490, which is longer than the specified 300
Created a chunk of size 501, which is longer than the specified 300
Created a chunk of size 576, which is longer than the specified 300
Created a chunk of size 513, which is longer than the specified 300
Created a chunk of size 546, which is longer tha

['Retrieval-Augmented Generation (RAG) is a groundbreaking technique in natural language processing that combines the strengths of retrieval-based and generative models to create a system capable of producing highly accurate and contextually relevant text. By leveraging both retrieval and generation, RAG models address many of the limitations of traditional models, offering a more robust and flexible approach to various tasks. The process begins with the retrieval of relevant documents or passages from a large corpus. The retrieval component typically uses dense embeddings, which are vector representations learned to capture the semantic meaning of the text. These embeddings allow the model to measure the similarity between the input query and potential documents, even when they do not share exact keywords. Dense retrieval models, often based on transformer architectures like BERT, excel at finding contextually relevant information.',
 'Once the most relevant documents are retrieved, t

In [22]:
test_with_model(chunks, user_query="What is rag?", top_k=3)

Matching chunks:

Once the retrieval component has identified the most relevant chunks, the generative component synthesizes this information to produce a coherent response. This process involves integrating the content of the retrieved chunks with the input query, allowing the generative model to generate text that is both accurate and contextually appropriate. The quality of the generated text is heavily influenced by the chunking method used, as well as the embeddings and similarity metrics employed during retrieval. Effective chunking ensures that the retrieved information is coherent and contextually relevant, while high-quality embeddings and accurate similarity calculations ensure that the most relevant information is retrieved.
RAG models have shown significant promise in a variety of applications. In knowledge-intensive tasks, RAG models can provide accurate and contextually relevant answers to complex questions by retrieving and synthesizing information from multiple sources.