# **RAG APPLICATION FOR TEAMSOLVE**

In [13]:
import chromadb
from pypdf import PdfReader
from langchain.text_splitter import RecursiveCharacterTextSplitter, SentenceTransformersTokenTextSplitter
from chromadb.utils.embedding_functions import SentenceTransformerEmbeddingFunction

In [14]:
reader = PdfReader("documents/RAG Input Doc.pdf")
pdf_texts = [p.extract_text().strip() for p in reader.pages]

# Filter the empty strings
pdf_texts = [text for text in pdf_texts if text]

print(pdf_texts[0])

Title:  MeshAnything: Artist -Created Mesh Generation with Autoregressive Transformers  
Authors:  buaacyw/meshanything  
Date:  14 Jun 2024  
Description:  Recently, 3D assets created via reconstruction and generation have matched the 
quality of manually crafted assets, highlighting their potential for replacement.  
Stats:  417, 5.09 stars / hour  
Categories:  Decoder  
Links:  Paper, Code  
 
Title:  Accessing GPT -4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self -
refine with LLaMa -3 8B  
Authors:  trotsky1997/mathblackbox  
Date:  11 Jun 2024  
Description:  This paper introduces the MCT Self -Refine algorithm, an innovative integration of 
Large Language Models (LLMs) with Monte Carlo Tree Search (MCTS), designed to enhance 
performance in complex mathematical reasoning tasks.  
Stats:  279, 2.35 stars / hour  
Categories:  Decision Making, GSM8K +2  
Links:  Paper, Code  
 
Title:  TextGrad: Automatic 'Differentiation' via Text  
Authors:  zou-group/textgrad 

##### Why split by characters first and then with SentenceTransformersTokenTextSplitter: the character splitter breaks the text at natural boundaries (paragraphs, lines, sentences) into chunks of bounded length, and the token splitter then makes sure each chunk fits within the embedding model's token limit.

In [19]:
character_splitter = RecursiveCharacterTextSplitter(
    # Separators are tried in order: paragraph break, newline, sentence end, space, then single characters
    separators=["\n\n", "\n", ". ", " ", ""],
    # Pieces that are still too long after splitting on a separator are split further, down to at most chunk_size (550) characters
    chunk_size=550,
    chunk_overlap=0
)
character_split_texts = character_splitter.split_text('\n\n'.join(pdf_texts))

print(f"\nTotal chunks: {len(character_split_texts)}")


Total chunks: 5
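To make the separator order concrete, here is a simplified pure-Python sketch of recursive splitting. It is an illustration under assumptions, not LangChain's actual implementation: `recursive_split` is a hypothetical helper, it drops the separator at chunk boundaries, and it ignores `chunk_overlap`.

```python
def recursive_split(text, separators, chunk_size):
    """Split text into pieces of at most chunk_size characters.

    Simplified illustration: try the coarsest separator first and fall back
    to finer ones only for pieces that are still too long.
    """
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # Last resort: hard cut every chunk_size characters
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        if not piece:
            continue
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep merging small neighbouring pieces
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # This piece alone is too long: retry with the finer separators
            chunks.extend(recursive_split(piece, finer, chunk_size))
    if current:
        chunks.append(current)
    return chunks


sample = "aaaa bbbb\n\ncccc dddd eeee"
print(recursive_split(sample, ["\n\n", "\n", ". ", " ", ""], chunk_size=10))
# ['aaaa bbbb', 'cccc dddd', 'eeee']
```

The first paragraph fits as a whole, while the second one is only broken at spaces because no coarser separator occurs inside it.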


##### The character splitter alone is not enough, because the embedding model we use has a limited context window of 256 tokens (tokens, not characters).

In [20]:
token_splitter = SentenceTransformersTokenTextSplitter(chunk_overlap=0, tokens_per_chunk=256) 

token_split_texts = []
for text in character_split_texts:
    token_split_texts += token_splitter.split_text(text)

print(f"\nTotal chunks: {len(token_split_texts)}")


Total chunks: 5
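The idea behind the token-based pass can be sketched as a fixed-size window over a token list. This is a rough illustration only: `split_by_tokens` is a hypothetical helper, and whitespace-separated words stand in for the subword tokens the real tokenizer produces, so actual counts will differ.

```python
def split_by_tokens(text, tokens_per_chunk=256, chunk_overlap=0):
    """Cut text into windows of at most tokens_per_chunk tokens."""
    if chunk_overlap >= tokens_per_chunk:
        raise ValueError("chunk_overlap must be smaller than tokens_per_chunk")
    tokens = text.split()  # stand-in tokenizer (assumption)
    step = tokens_per_chunk - chunk_overlap
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(" ".join(tokens[start:start + tokens_per_chunk]))
        if start + tokens_per_chunk >= len(tokens):
            break  # the window already reached the end of the text
        start += step
    return chunks


long_text = " ".join(f"w{i}" for i in range(600))
pieces = split_by_tokens(long_text)
print([len(p.split()) for p in pieces])  # [256, 256, 88]
```

A chunk that is already shorter than the window comes back unchanged, which would be consistent with the chunk count staying at 5 above.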


In [21]:
embedding_function = SentenceTransformerEmbeddingFunction()

In [23]:
chroma_client = chromadb.Client()
chroma_collection = chroma_client.create_collection("rag_app1", embedding_function=embedding_function)

ids = [str(i) for i in range(len(token_split_texts))]

chroma_collection.add(ids=ids, documents=token_split_texts)
chroma_collection.count()

5

### Question 1

In [24]:
query = "Which paper received the highest number of stars per hour?"

# Chroma embeds the query with the collection's embedding function and returns the closest stored documents
results = chroma_collection.query(query_texts=[query], n_results=5)
retrieved_documents = results['documents'][0]

for document in retrieved_documents:
    print(document)
    print('\n')

title : meshanything : artist - created mesh generation with autoregressive transformers authors : buaacyw / meshanything date : 14 jun 2024 description : recently, 3d assets created via reconstruction and generation have matched the quality of manually crafted assets, highlighting their potential for replacement. stats : 417, 5. 09 stars / hour categories : decoder links : paper, code title : accessing gpt - 4 level mathematical olympiad solutions via monte carlo tree self - refine with llama - 3 8b


categories : language modelling links : paper, code title : videollama 2 : advancing spatial - temporal modeling and audio understanding in video - llms authors : damo - nlp - sg / videollama2 date : 11 jun 2024 description : in this paper, we present the videollama 2, a set of video large language models ( video - llms ) designed to enhance spatial - temporal modeling and audio understanding in video and audio - oriented tasks. stats : 318, 1. 50 stars / hour categories : multiple - cho
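Conceptually, the `query` call above embeds the question and returns the stored chunks whose vectors are closest to it. The sketch below shows that ranking step with a toy bag-of-words embedding and cosine similarity. Both are assumptions chosen for illustration: the real embedding function is a neural model, and the distance metric Chroma uses by default may differ.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a neural model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def query(documents, query_text, n_results=2):
    """Return the n_results documents most similar to query_text."""
    q = embed(query_text)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:n_results]


docs = [
    "mesh generation with autoregressive transformers",
    "monte carlo tree search for mathematical reasoning",
    "video language models for audio understanding",
]
print(query(docs, "tree search reasoning", n_results=1))
# ['monte carlo tree search for mathematical reasoning']
```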

In [26]:
import os
import replicate

# Read the API token from the environment instead of hard-coding it in the notebook
replicate = replicate.Client(api_token=os.environ["REPLICATE_API_TOKEN"])

In [27]:
def rag(query, retrieved_documents):
    # Join the retrieved chunks into a single context block
    information = "\n\n".join(retrieved_documents)

    # Adjacent string literals are concatenated as-is, so each piece ends with an explicit space
    prompt = (
        "You are a helpful expert research assistant. Your users are asking questions about information contained in reports or files. "
        "You will be shown the user's question, and the relevant information from the files or reports. Answer the user's question using only this information. "
        f"Question: {query}. \n Information: {information}"
    )

    output = replicate.run(
        "meta/llama-2-70b-chat:02e509c789964a7ea8736978a43525956ef40397be9033abf9fd2badfe68c9e3",
        input={"prompt": prompt},
    )

    # The model output arrives in pieces; join them into a single string
    return "".join(str(item) for item in output)
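Python joins adjacent string literals without inserting any separator, so the pieces of a prompt need explicit spaces or newlines between them. A minimal sketch of assembling the prompt in a separate function that can be checked without calling the model (`build_prompt` is a hypothetical helper, and its wording is shortened for illustration):

```python
def build_prompt(query, retrieved_documents):
    """Assemble the RAG prompt from the question and the retrieved chunks."""
    information = "\n\n".join(retrieved_documents)
    return (
        "You are a helpful expert research assistant. "
        "Answer the user's question using only the information provided.\n\n"
        f"Question: {query}\n\n"
        f"Information:\n{information}"
    )


prompt = build_prompt("What is X?", ["X is a letter.", "Y follows X."])
print(prompt)
```

Keeping prompt construction apart from the API call makes it easy to print and inspect exactly what the model receives.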

In [28]:
output = rag(query=query, retrieved_documents=retrieved_documents)
print(output)

 The paper that received the highest number of stars per hour is "Textgrad : Automatic Differentiation via Text" by Zou-Group/Textgrad, with 485 stars and a rate of 2.04 stars per hour.
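Questions that compare a number across all entries can be cross-checked without an LLM. The sketch below assumes the `Title:` / `Stats:` layout seen in the printed page and uses the five entries visible in the retrieved chunks. `listing` is a hand-reconstructed, condensed sample, not the PDF text itself.

```python
import re

# Condensed sample reconstructed from the chunks printed in this notebook (assumption)
listing = """\
Title: MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers
Stats: 417, 5.09 stars / hour

Title: Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B
Stats: 279, 2.35 stars / hour

Title: TextGrad: Automatic 'Differentiation' via Text
Stats: 485, 2.04 stars / hour

Title: Scalable MatMul-free Language Modeling
Stats: 2,140, 1.98 stars / hour

Title: VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
Stats: 318, 1.50 stars / hour
"""

# Capture the title, the total star count (may contain a thousands comma), and the hourly rate
pattern = re.compile(r"Title:\s*(.+)\nStats:\s*([\d,]+),\s*([\d.]+) stars / hour")
rows = [
    (title.strip(), int(stars.replace(",", "")), float(rate))
    for title, stars, rate in pattern.findall(listing)
]

best = max(rows, key=lambda row: row[2])  # highest stars per hour
print(best[0], best[2])
```

On this sample the highest rate belongs to MeshAnything at 5.09 stars per hour, which differs from the generated answer above, so comparison-style questions are worth verifying this way.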


### Question 2

In [31]:
query = "What is the focus of the 'MeshAnything' project?"

results = chroma_collection.query(query_texts=[query], n_results=5)
retrieved_documents = results['documents'][0]

for document in retrieved_documents:
    print(document)
    print('\n')

title : meshanything : artist - created mesh generation with autoregressive transformers authors : buaacyw / meshanything date : 14 jun 2024 description : recently, 3d assets created via reconstruction and generation have matched the quality of manually crafted assets, highlighting their potential for replacement. stats : 417, 5. 09 stars / hour categories : decoder links : paper, code title : accessing gpt - 4 level mathematical olympiad solutions via monte carlo tree self - refine with llama - 3 8b


categories : language modelling links : paper, code title : videollama 2 : advancing spatial - temporal modeling and audio understanding in video - llms authors : damo - nlp - sg / videollama2 date : 11 jun 2024 description : in this paper, we present the videollama 2, a set of video large language models ( video - llms ) designed to enhance spatial - temporal modeling and audio understanding in video and audio - oriented tasks. stats : 318, 1. 50 stars / hour categories : multiple - cho

In [32]:
output = rag(query=query, retrieved_documents=retrieved_documents)
print(output)

 Based on the information provided, the focus of the 'MeshAnything' project appears to be on creating 3D assets using autoregressive transformers, with the goal of matching the quality of manually crafted assets. The project's description highlights the potential for replacement of manual asset creation with AI-generated assets.


### Question 3

In [33]:
query = "Which paper discusses the integration of Large Language Models with Monte Carlo Tree Search?"

results = chroma_collection.query(query_texts=[query], n_results=5)
retrieved_documents = results['documents'][0]

for document in retrieved_documents:
    print(document)
    print('\n')

authors : trotsky1997 / mathblackbox date : 11 jun 2024 description : this paper introduces the mct self - refine algorithm, an innovative integration of large language models ( llms ) with monte carlo tree search ( mcts ), designed to enhance performance in complex mathematical reasoning tasks. stats : 279, 2. 35 stars / hour categories : decision making, gsm8k + 2 links : paper, code title : textgrad : automatic'differentiation'via text authors : zou - group / textgrad date : 11 jun 2024


title : scalable matmul - free language modeling authors : ridgerchu / matmulfreellm date : 4 jun 2024 description : our experiments show that our proposed matmul - free models achieve performance on - par with state - of - the - art transformers that require far more memory during inference at a scale up to at least 2. 7b parameters. stats : 2, 140, 1. 98 stars / hour


categories : language modelling links : paper, code title : videollama 2 : advancing spatial - temporal modeling and audio unders

In [34]:
output = rag(query=query, retrieved_documents=retrieved_documents)
print(output)

 The paper that discusses the integration of Large Language Models with Monte Carlo Tree Search is "Textgrad: Automatic Differentiation via Text" by Trotsky1997/MathBlackbox, dated June 11, 2024. The paper introduces the mct-self-refine algorithm, which integrates Large Language Models (LLMs) with Monte Carlo Tree Search (MCTS) to enhance performance in complex mathematical reasoning tasks.


### Question 4

In [35]:
query = "What advancements does the 'VideoLLaMA 2' paper propose?"

results = chroma_collection.query(query_texts=[query], n_results=5)
retrieved_documents = results['documents'][0]

for document in retrieved_documents:
    print(document)
    print('\n')

categories : language modelling links : paper, code title : videollama 2 : advancing spatial - temporal modeling and audio understanding in video - llms authors : damo - nlp - sg / videollama2 date : 11 jun 2024 description : in this paper, we present the videollama 2, a set of video large language models ( video - llms ) designed to enhance spatial - temporal modeling and audio understanding in video and audio - oriented tasks. stats : 318, 1. 50 stars / hour categories : multiple - choice, question answering + 3 links : paper, code


title : meshanything : artist - created mesh generation with autoregressive transformers authors : buaacyw / meshanything date : 14 jun 2024 description : recently, 3d assets created via reconstruction and generation have matched the quality of manually crafted assets, highlighting their potential for replacement. stats : 417, 5. 09 stars / hour categories : decoder links : paper, code title : accessing gpt - 4 level mathematical olympiad solutions via m

In [36]:
output = rag(query=query, retrieved_documents=retrieved_documents)
print(output)

 The 'VideoLLaMA 2' paper proposes several advancements in spatial-temporal modeling and audio understanding for video language models (ViLlMs). The proposed model, VideoLLaMA 2, is designed to enhance performance in video and audio-oriented tasks, including question answering, multiple-choice, and decision-making.

The paper introduces several novel techniques, including:

1. Spatial-Temporal Modeling: VideoLLaMA 2 incorporates a spatial-temporal graph convolutional network (ST-GCN)


### Question 5

In [37]:
query = "Which paper was published most recently?"

results = chroma_collection.query(query_texts=[query], n_results=5)
retrieved_documents = results['documents'][0]

for document in retrieved_documents:
    print(document)
    print('\n')

authors : trotsky1997 / mathblackbox date : 11 jun 2024 description : this paper introduces the mct self - refine algorithm, an innovative integration of large language models ( llms ) with monte carlo tree search ( mcts ), designed to enhance performance in complex mathematical reasoning tasks. stats : 279, 2. 35 stars / hour categories : decision making, gsm8k + 2 links : paper, code title : textgrad : automatic'differentiation'via text authors : zou - group / textgrad date : 11 jun 2024


title : meshanything : artist - created mesh generation with autoregressive transformers authors : buaacyw / meshanything date : 14 jun 2024 description : recently, 3d assets created via reconstruction and generation have matched the quality of manually crafted assets, highlighting their potential for replacement. stats : 417, 5. 09 stars / hour categories : decoder links : paper, code title : accessing gpt - 4 level mathematical olympiad solutions via monte carlo tree self - refine with llama - 3 

In [38]:
output = rag(query=query, retrieved_documents=retrieved_documents)
print(output)

 The paper that was published most recently is "meshanything: artist-created mesh generation with autoregressive transformers" by buaacyw/meshanything, which was published on June 14, 2024.


### Question 6

In [39]:
query = "Identify a paper that deals with language modeling and its scalability."

results = chroma_collection.query(query_texts=[query], n_results=5)
retrieved_documents = results['documents'][0]

for document in retrieved_documents:
    print(document)
    print('\n')

title : scalable matmul - free language modeling authors : ridgerchu / matmulfreellm date : 4 jun 2024 description : our experiments show that our proposed matmul - free models achieve performance on - par with state - of - the - art transformers that require far more memory during inference at a scale up to at least 2. 7b parameters. stats : 2, 140, 1. 98 stars / hour


authors : trotsky1997 / mathblackbox date : 11 jun 2024 description : this paper introduces the mct self - refine algorithm, an innovative integration of large language models ( llms ) with monte carlo tree search ( mcts ), designed to enhance performance in complex mathematical reasoning tasks. stats : 279, 2. 35 stars / hour categories : decision making, gsm8k + 2 links : paper, code title : textgrad : automatic'differentiation'via text authors : zou - group / textgrad date : 11 jun 2024


categories : language modelling links : paper, code title : videollama 2 : advancing spatial - temporal modeling and audio unders

In [40]:
output = rag(query=query, retrieved_documents=retrieved_documents)
print(output)

 The paper that deals with language modeling and its scalability is "Scalable MatMul-Free Language Modeling" by Ridgerchu and MatMulFreeLLM. The paper was published on June 4, 2024, and the authors propose a matmul-free language model that achieves performance on par with state-of-the-art transformers while requiring less memory during inference. The paper also presents experiments that show the scalability of the proposed model up to 2.7 billion parameters.


### Question 7

In [41]:
query = "Which paper aims at improving accuracy in Google-Proof Question Answering?"

results = chroma_collection.query(query_texts=[query], n_results=5)
retrieved_documents = results['documents'][0]

for document in retrieved_documents:
    print(document)
    print('\n')

description : without modifying the framework, textgrad improves the zero - shot accuracy of gpt - 4o in google - proof question answering, yields significant relative performance gain in optimizing leetcode - hard coding problem solutions, improves prompts for reasoning, desi gns new druglike small molecules with desirable in silico binding, and designs radiation oncology treatment plans with high specificity. stats : 485, 2. 04 stars / hour categories : question answering, specificity links : paper, code


authors : trotsky1997 / mathblackbox date : 11 jun 2024 description : this paper introduces the mct self - refine algorithm, an innovative integration of large language models ( llms ) with monte carlo tree search ( mcts ), designed to enhance performance in complex mathematical reasoning tasks. stats : 279, 2. 35 stars / hour categories : decision making, gsm8k + 2 links : paper, code title : textgrad : automatic'differentiation'via text authors : zou - group / textgrad date : 11 

In [42]:
output = rag(query=query, retrieved_documents=retrieved_documents)
print(output)

 The paper that aims at improving accuracy in Google-Proof Question Answering is "Textgrad: Automatic Differentiation'via Text" by Trotsky1997/MathBlackBox. This paper introduces a new algorithm called MCT Self-Refine, which integrates Large Language Models (LLMs) with Monte Carlo Tree Search (MCTS) to enhance performance in complex mathematical reasoning tasks. The authors claim that their approach yields significant relative performance gain in optimizing LeetCode-hard coding problem solutions and improves prompts for reasoning. They also demonstrate the effect


### Question 8

In [43]:
query = "List the categories covered by the paper titled 'TextGrad: Automatic 'Differentiation' via Text'."

results = chroma_collection.query(query_texts=[query], n_results=5)
retrieved_documents = results['documents'][0]

for document in retrieved_documents:
    print(document)
    print('\n')

authors : trotsky1997 / mathblackbox date : 11 jun 2024 description : this paper introduces the mct self - refine algorithm, an innovative integration of large language models ( llms ) with monte carlo tree search ( mcts ), designed to enhance performance in complex mathematical reasoning tasks. stats : 279, 2. 35 stars / hour categories : decision making, gsm8k + 2 links : paper, code title : textgrad : automatic'differentiation'via text authors : zou - group / textgrad date : 11 jun 2024


description : without modifying the framework, textgrad improves the zero - shot accuracy of gpt - 4o in google - proof question answering, yields significant relative performance gain in optimizing leetcode - hard coding problem solutions, improves prompts for reasoning, desi gns new druglike small molecules with desirable in silico binding, and designs radiation oncology treatment plans with high specificity. stats : 485, 2. 04 stars / hour categories : question answering, specificity links : pap

In [44]:
output = rag(query=query, retrieved_documents=retrieved_documents)
print(output)

 Sure, I'd be happy to help! The paper titled "TextGrad: Automatic 'Differentiation' via Text" covers the following categories:

* Decision making
* GSM8K (a dataset of mathematical problems)
* Question answering
* Specificity
* Language modeling

The authors propose a method called TextGrad, which uses a combination of large language models and Monte Carlo Tree Search to enhance performance in complex mathematical reasoning tasks. They evaluate the effectiveness of TextGrad on several benchmarks, including Google Proof Question Answering and LeetCode-hard coding problem solutions


## **Additional Q/A from a PDF Research Paper**

In [45]:
reader = PdfReader("documents/Research paper.pdf")
pdf_texts = [p.extract_text().strip() for p in reader.pages]

pdf_texts = [text for text in pdf_texts if text]

print(pdf_texts[0])

979-8-3503-9478-8/24/$31.00 ©2024 IEEE Machine Learning and Mediapipe Assisted Sign 
Gesture-based Smart Keyboard for Deaf and Blind 
People 
Arsalan Ali  
Department of Biomedical Engineering  
University of Engineering and 
Technology, 
Lahore, 54890, Pakistan 
2019bme111@uet.edu.pk Muhammad Hamza Zulfiqar  
Department of Biomedical Engineering  
University of Engineering and 
Technology, 
Lahore, 54890, Pakistan  
hamzazulfiqar@uet.edu.pkMuhammad Qasim Mehmood 
Micro Nano Lab, Electrical 
Engineering Department, Information 
Technology University (ITU) of the 
Punjab, Ferozepur Road, Lahore 54600, 
Pakistan  
qasim.mehmood@itu.edu.pk 
 
Abstract — There are 1.1 billion deaf and 2.2 billion blind 
people, and many people have speech impairments. These 
disabilities cause them to be unable to communicate with others 
and use modern technologies. This paper presents a real-time 
smart touchless keypad by which people with blindness, 
deafness, and speech impairments can interact with t

##### Why split by characters first and then with SentenceTransformersTokenTextSplitter: the character splitter breaks the text at natural boundaries (paragraphs, lines, sentences) into chunks of bounded length, and the token splitter then makes sure each chunk fits within the embedding model's token limit.

In [47]:
character_splitter = RecursiveCharacterTextSplitter(
    separators=["\n\n", "\n", ". ", " ", ""],
    chunk_size=1000,
    chunk_overlap=0
)
character_split_texts = character_splitter.split_text('\n\n'.join(pdf_texts))

print(f"\nTotal chunks: {len(character_split_texts)}")


Total chunks: 32


##### The character splitter alone is not enough, because the embedding model we use has a limited context window of 256 tokens (tokens, not characters).

In [48]:
# tokens_per_chunk matches the embedding model's context window: each chunk holds at most 256 tokens
token_splitter = SentenceTransformersTokenTextSplitter(chunk_overlap=0, tokens_per_chunk=256)

token_split_texts = []
for text in character_split_texts:
    token_split_texts += token_splitter.split_text(text)

print(f"\nTotal chunks: {len(token_split_texts)}")


Total chunks: 36


In [49]:
embedding_function = SentenceTransformerEmbeddingFunction()

In [54]:
chroma_client = chromadb.Client()
chroma_collection = chroma_client.create_collection("rag_app2", embedding_function=embedding_function)

ids = [str(i) for i in range(len(token_split_texts))]

chroma_collection.add(ids=ids, documents=token_split_texts)
chroma_collection.count()

In [53]:
query = "Which algorithm was used for Machine Learning and Mediapipe Assisted Sign Gesture-based Smart Keyboard for Deaf and Blind People?"

results = chroma_collection.query(query_texts=[query], n_results=5)
retrieved_documents = results['documents'][0]

for document in retrieved_documents:
    print(document)
    print('\n')

deafness, and speech impairments can interact with the computer with the help of 37 hand gestures using a single camera. the presented solution uses a hand tracking module from the mediapipe library for feature extraction, a random forest algorithm for the classification of gestures, and a pyautogui library for controlling the mouse and keyboard according to the classified hand gestures. the presented solution has achieved an accuracy of 99. 66 % on the test set, indicating that this system can also detect gestures for communication with deaf or speech - impaired people. overall, this system employs computer vision and machine learning techniques to improve the lives of the deaf, blind, and mute people. keywords — sign gesture, mediapipe, smart keyboard, deaf, blind, human - computer interaction i. introduction communication is an effective tool to share ideas, thoughts, and feelings. communication between humans makes life


979 - 8 - 3503 - 9478 - 8 / 24 / $ 31. 00 ©2024 ieee machine

In [55]:
output = rag(query=query, retrieved_documents=retrieved_documents)
print(output)

 The algorithm used for Machine Learning and Mediapipe Assisted Sign Gesture-based Smart Keyboard for Deaf and Blind People is a Random Forest algorithm. It is used for the classification of gestures.


In [56]:
query = "How many signs were used in this research article?"

results = chroma_collection.query(query_texts=[query], n_results=5)
retrieved_documents = results['documents'][0]

for document in retrieved_documents:
    print(document)
    print('\n')

are placed in a specific 3d space, and the receiving nodes, after receiving the message from the transmitting nodes, classify the 24 signs of sign language with an accuracy of 97 % [ 13 ]. k. bhat presented a smart glove employing flex sensors to detect finger movement. the smart glove interprets the hand's signs using flex sensors and converts them into meaningful messages that can be heard with the android app 2024 5th international conference on advancements in computational sciences ( icacs ) | 979 - 8 - 3503 - 9478 - 8 / 24 / $ 31. 00 ©2024 ieee | doi : 10. 1109 / icacs60934. 2024. 10473252 authorized licensed use limited to : univ of engineering and technology lahore. downloaded on march 22, 2024 at 07 : 13 : 36 utc from ieee xplore. restrictions apply.


want to write ‘ c ’ so, we have shown the ‘ c ’ sign to the camera, and as expected, the prediction is the lett er ‘ c ’, so the pyautigui has clicked the ‘ c ’ and we have also got alphabet ‘ c ’ in our text document. in fig. 8

In [57]:
output = rag(query=query, retrieved_documents=retrieved_documents)
print(output)

 Based on the information provided, the answer to the question "How many signs were used in this research article?" is 37.

The article mentions that the proposed solution uses 37 signs, which is more than the 26 signs used in most present solutions. The authors also mention that they have achieved a decent accuracy of 99.66% with 37 signs, which is comparable to the accuracy of other solutions that use fewer signs.

Therefore, the answer to the question is 37, which is the number of signs used in the research article.
