<a href="https://colab.research.google.com/github/Sweta-Das/LangChain-HuggingFace-LLM/blob/main/RAG_using_Mistral_LangChain_ChromaDB.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
%pip install -q einops
%pip install -q chromadb
%pip install -q langchain
%pip install -q accelerate
%pip install -q bitsandbytes
%pip install -q transformers
%pip install -q llama-cpp-python
%pip install -q sentence-transformers

In [None]:
!pip install accelerate
!pip install -i https://pypi.org/simple/ bitsandbytes

*einops* -> (Einstein Operations) a tensor-manipulation library. It abstracts the complexity of tensor reshaping, slicing and rearranging operations, making code cleaner and more efficient. <br>
*chromadb* -> vector database <br>
*langchain* -> framework that provides building blocks for generative AI applications <br>
*accelerate* -> a library that lets the same PyTorch code run across any distributed configuration. It makes training and inference at scale efficient, simple and adaptable. <br>
*bitsandbytes* -> lightweight Python wrapper around CUDA custom functions, in particular 8-bit optimizers, matrix multiplication (LLM.int8()), and 8-bit and 4-bit quantization functions. <br>
*transformers* -> provides thousands of pretrained models to perform tasks on different modalities such as text, vision, and audio.
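
To make the point of 8-bit and 4-bit quantization concrete, here is a rough back-of-the-envelope sketch of how much memory the weights alone would need at different precisions. The parameter count (`N_PARAMS`) is an assumed round figure for a 7B-class model, and the helper `weight_memory_gb` is hypothetical; the estimate ignores activations, the KV cache and any per-block quantization overhead.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

# Assumption for illustration only: a round 7 billion parameters.
N_PARAMS = 7e9

for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(N_PARAMS, bits):.1f} GB")
```

Halving the bits per weight halves the estimate, which is why a quantized model can fit on hardware where the 16-bit weights would not.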

In [22]:
import torch
import chromadb
import accelerate
import transformers
from time import time
from torch import cuda, bfloat16
from google.colab import drive
from google.colab import userdata
from langchain.chains import RetrievalQA
from langchain.vectorstores import Chroma
from langchain.llms import HuggingFacePipeline
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from transformers import AutoModelForCausalLM, AutoTokenizer

In [3]:
# Defining the model ID and device
model_id = 'mistralai/Mixtral-8x7B-Instruct-v0.1'

# A torch device string must not contain a space: 'cuda:0', not 'cuda: 0'
device = f'cuda:{cuda.current_device()}' if cuda.is_available() else 'cpu'
device

'cuda:0'

In [5]:
# Importing HuggingFaceHub API Token
from google.colab import userdata

HF_KEY = userdata.get('HF_TOKEN')

In [7]:
# Mounting drive
from google.colab import drive

drive.mount('/content/drive/')

Mounted at /content/drive/


### Using LlamaCPP Approach

In [11]:
# Loading model

from langchain.llms import LlamaCpp

model = LlamaCpp(
    streaming=True,
    model_path = '/content/drive/MyDrive/LLM_Model/mistral-7b-instruct-v0.1.Q3_K_S.gguf',
    temperature = 0.75,
    top_p = 1,
    n_ctx = 4096
)
model

llama_model_loader: loaded meta data with 20 key-value pairs and 291 tensors from /content/drive/MyDrive/LLM_Model/mistral-7b-instruct-v0.1.Q3_K_S.gguf (version GGUF V2)
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.name str              = mistralai_mistral-7b-instruct-v0.1
llama_model_loader: - kv   2:                       llama.context_length u32              = 32768
llama_model_loader: - kv   3:                     llama.embedding_length u32              = 4096
llama_model_loader: - kv   4:                          llama.block_count u32              = 32
llama_model_loader: - kv   5:                  llama.feed_forward_length u32              = 14336
llama_model_loader: - kv   6:                 llama.rope.dimension_count u32              = 128
llama_model_loader: - k

LlamaCpp(client=<llama_cpp.llama.Llama object at 0x795dcfed70d0>, model_path='/content/drive/MyDrive/LLM_Model/mistral-7b-instruct-v0.1.Q3_K_S.gguf', n_ctx=4096, temperature=0.75, top_p=1.0)

Querying Model Directly

In [12]:
query = "What are some of the most popular Yoga asanas?"
prompt = f"""
<|system|>
You are a Yoga GPT that gives advices and suggestions related to Yoga. Please give correct and detailed stepwise instructions.
</s>
<|user|>
{query}
</s>
<|Yoga GPT|>
"""

t1 = time()
response = model.invoke(prompt)
t2 = time()
print(f"AI: {response} \n Time taken: {t2-t1} sec")


llama_print_timings:        load time =    4245.89 ms
llama_print_timings:      sample time =     145.39 ms /   256 runs   (    0.57 ms per token,  1760.76 tokens per second)
llama_print_timings: prompt eval time =   51329.55 ms /    69 tokens (  743.91 ms per token,     1.34 tokens per second)
llama_print_timings:        eval time =  171316.80 ms /   255 runs   (  671.83 ms per token,     1.49 tokens per second)
llama_print_timings:       total time =  223749.70 ms /   324 tokens


AI: Some of the most popular Yoga asanas (postures) include:

1. Downward-Facing Dog (Adho Mukha Svanasana): This pose helps to stretch and strengthen the muscles in your back, legs, and arms. It also helps to improve circulation and relieve stress.
2. Warrior II (Virabhadrasana II): This pose helps to strengthen the muscles in your legs, ankles, and feet. It also helps to improve balance and focus.
3. Tree Pose (Vrksasana): This pose helps to improve balance and stability, and can also help to relieve sciatica pain.
4. Child's Pose (Balasana): This pose is a gentle stretch for the hips, thighs, and ankles. It can also help to relieve stress and anxiety.
5. Sun Salutation (Surya Namaskar): This is a sequence of 12 poses that helps to warm up the body and improve flexibility.
6. Corpse Pose (Savasana): This pose is a deep relaxation pose that allows you to rest and rejuvenate.
7. Lotus P /n Time taken: 223.77555179595947 sec
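
The same prompt template is typed out again later for the RAG query. A small helper could build it once. The sketch below reproduces the template used in this notebook as-is; the names `build_prompt` and `DEFAULT_SYSTEM` are hypothetical, and nothing here claims that this is the template the model was trained with.

```python
# System message copied verbatim from the notebook's prompt.
DEFAULT_SYSTEM = (
    "You are a Yoga GPT that gives advices and suggestions related to Yoga. "
    "Please give correct and detailed stepwise instructions."
)

def build_prompt(query: str, system: str = DEFAULT_SYSTEM) -> str:
    """Fill the notebook's chat-style template with a system message and a user query."""
    return (
        "\n<|system|>\n"
        f"{system}\n"
        "</s>\n"
        "<|user|>\n"
        f"{query}\n"
        "</s>\n"
        "<|Yoga GPT|>\n"
    )

example_prompt = build_prompt("What are some of the most popular Yoga asanas?")
print(example_prompt)
```

The returned string can be passed to `model.invoke(...)` in place of the hand-written f-string.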


## RAG Implementation
Using Chroma DB as vector database

In [16]:
%pip install -q PyPDF2 PyPDF

In [14]:
import PyPDF2

# Reading PDF and extracting ToC
def extract_ToC(pdf_path, start_page, end_page):
  """Return the text lines of pages start_page..end_page (0-based, inclusive)."""
  toc_entries = []

  with open(pdf_path, 'rb') as file:
    pdf_reader = PyPDF2.PdfReader(file)

    for page_number in range(start_page, end_page+1):
      page = pdf_reader.pages[page_number]
      text = page.extract_text()
      # Strip the roman-numeral page labels. "viii" must be removed before
      # "vii", otherwise "viii" is reduced to a stray "i".
      text = text.replace("viii", "").replace("vii", "")

      toc_entries.extend(text.splitlines())

  return toc_entries

pdf_path = "/content/drive/MyDrive/LLM_Model/Yoga_Education_for_Children_Vol_1.pdf"
toc = extract_ToC(pdf_path, 7, 8)
toc

['Contents',
 'Introduction  1',
 'Yoga and Education  ',
 ' 1. The Need for a Y oga-Based Education System  13',
 ' 2. Yoga and Children’s Problems  22',
 ' 3. Yoga with Pre-School Children  25',
 ' 4. Yoga Lessons Begin at Age Eight  31',
 ' 5. Student Unr est and Its Remedy  34',
 ' 6. Yoga and the Youth Problem  39',
 ' 7. Better Ways of Educatio n 45',
 ' 8. Yoga at School  50',
 ' 9. Yoga and Education  57',
 '10. Questions and Answers  65',
 'Yoga as Therapy  ',
 '11. Yoga for Emotional Disturbances  77',
 '12. Yoga for the Disabled  83',
 '13. Yoga Benefits Juvenile Diabetes  87',
 'Practices  ',
 '14. Yoga Techniques for Pre-School Children  93',
 '15. Yoga Techniques for 7–14 Y ear-Olds  101',
 '16. Yoga Techniques for the Classroom  110',
 '17. Introduction to Asana  133',
 '18. Pawanmuktasana Series  139',
 'Pawanmuktasana 1: Anti-Rheumatic Asanas  141',
 'Pawanmuktasana 2: Anti-Gastric Asanas  156',
 'Pawanmuktasana 3: Energizing Asanas  165',
 '19.  Eye Exercises  171',
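
The extracted lines still fuse each title with its page number. As a possible next step, the sketch below splits a line into a `(title, page)` pair with a regular expression. The helper name `parse_toc_entry` is hypothetical, and the pattern assumes the page number is the last whitespace-separated token on the line; section headings without a number come back with `None` as the page.

```python
import re

# A title, then whitespace, then a trailing page number.
_TOC_LINE = re.compile(r"^(?P<title>.*?)\s+(?P<page>\d+)\s*$")

def parse_toc_entry(line: str):
    """Return (title, page) for a ToC line; page is None when no page number is present."""
    match = _TOC_LINE.match(line)
    if match is None:
        return line.strip(), None
    return match.group("title").strip(), int(match.group("page"))

# Sample lines taken from the output above.
sample = [
    "Introduction  1",
    "Yoga as Therapy  ",
    "13. Yoga Benefits Juvenile Diabetes  87",
]
for line in sample:
    print(parse_toc_entry(line))
```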
 

In [19]:
# Loading documents
from langchain.document_loaders import PyPDFLoader

loader = PyPDFLoader("/content/drive/MyDrive/LLM_Model/Yoga_Education_for_Children_Vol_1.pdf")
pages = loader.load()
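
`loader.load()` returns one document per PDF page, and this notebook embeds those whole pages even though `RecursiveCharacterTextSplitter` is imported. To illustrate what chunking with overlap means, here is a deliberately simplified stand-in in plain Python. It is not the LangChain splitter (which splits on separators recursively); the function `chunk_text` and its default sizes are assumptions chosen for illustration.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into character windows of chunk_size that overlap by `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks

print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
```

Smaller chunks would also shorten the context that is later stuffed into the prompt, which matters on slow hardware.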

In [23]:
# Embedding text into vector database - ChromaDB
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2",
                                   model_kwargs={"device": "cpu"})
vector_db = Chroma.from_documents(documents=pages,
                                  embedding=embeddings,
                                  persist_directory="chroma_db")
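
Conceptually, the vector database stores one embedding per document and returns the documents whose embeddings lie closest to the query embedding. The toy sketch below ranks made-up 3-dimensional vectors by cosine similarity. It only illustrates the idea: the vectors are invented, the helper names are hypothetical, and Chroma's own indexing and distance metric are not reproduced here.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=2):
    """Return the indices of the k document vectors most similar to the query."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]

# Invented vectors; real all-MiniLM-L6-v2 embeddings have 384 dimensions.
toy_docs = [[1.0, 0.0, 0.0], [0.7, 0.7, 0.0], [0.0, 0.0, 1.0]]
toy_query = [0.9, 0.1, 0.0]
print(top_k(toy_query, toy_docs, k=2))
```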

In [24]:
# Initializing chain
retriever = vector_db.as_retriever()

qa = RetrievalQA.from_chain_type(
    llm = model,
    chain_type = "stuff",
    retriever = retriever,
    verbose = True
)
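
With `chain_type = "stuff"`, every retrieved document is placed into a single prompt together with the question. The sketch below shows that idea in plain Python. The function `stuff_prompt` and the instruction wording are hypothetical; LangChain's actual prompt template differs.

```python
def stuff_prompt(question: str, documents: list[str]) -> str:
    """Concatenate all retrieved texts into one context block ahead of the question."""
    context = "\n\n".join(documents)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

# Placeholder texts standing in for retrieved pages.
print(stuff_prompt("Which practices are suggested?", ["First page text.", "Second page text."]))
```

Because all retrieved text goes into one prompt, the prompt length grows with the number and size of the retrieved documents.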

In [27]:
query = "What are some of the most popular Yoga that benefits Juvenile Diabetes?"
prompt = f"""
<|system|>
You are a Yoga GPT that gives advices and suggestions related to Yoga. Please give correct and detailed stepwise instructions.
</s>
<|user|>
{query}
</s>
<|Yoga GPT|>
"""

t1 = time()
response = qa.invoke(prompt)
t2 = time()
print(f"AI: {response} \n Time taken: {round(t2-t1, 3)} sec")



> Entering new RetrievalQA chain...


Llama.generate: prefix-match hit

llama_print_timings:        load time =    4245.89 ms
llama_print_timings:      sample time =     151.32 ms /   256 runs   (    0.59 ms per token,  1691.80 tokens per second)
llama_print_timings: prompt eval time =  743629.20 ms /  1288 tokens (  577.35 ms per token,     1.73 tokens per second)
llama_print_timings:        eval time =  186760.26 ms /   256 runs   (  729.53 ms per token,     1.37 tokens per second)
llama_print_timings:       total time =  931913.39 ms /  1544 tokens



> Finished chain.
AI: {'query': '\n<|system|>\nYou are a Yoga GPT that gives advices and suggestions related to Yoga. Please give correct and detailed stepwise instructions.\n</s>\n<|user|>\nWhat are some of the most popular Yoga that benefits Juvenile Diabetes?\n</s>\n<|Yoga GPT|>\n', 'result': "\n\nWhile there isn't a specific yoga style that has been proven to cure or prevent juvenile diabetes, certain yoga styles and poses can help manage symptoms and improve overall health. Here are some popular yoga styles and poses that may be beneficial for individuals with juvenile diabetes:\n\n1. Hatha Yoga: This is a gentle, slow-paced style of yoga that focuses on breath control and physical relaxation. It can help lower blood sugar levels and reduce stress, which can contribute to better insulin resistance.\n2. Vinyasa Yoga: This style of yoga involves flowing movements and linking breath with movement. It can improve blood circulation and flexibility, which can be helpful for ind
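
The retrieval run took far longer than the direct query (about 932 s against about 224 s). The `llama_print_timings` lines suggest why: the prompt grew from 69 to 1288 tokens once the retrieved pages were added, while the throughput stayed at roughly one to two tokens per second. The sketch below redoes that arithmetic from the logged figures; `tokens_per_second` is a hypothetical helper.

```python
def tokens_per_second(elapsed_ms: float, n_tokens: int) -> float:
    """Throughput from an elapsed time in milliseconds and a token count."""
    return n_tokens / (elapsed_ms / 1000.0)

# Figures copied from the "prompt eval time" lines of the two runs.
direct_prompt = tokens_per_second(51329.55, 69)
rag_prompt = tokens_per_second(743629.20, 1288)

print(f"direct prompt eval: {direct_prompt:.2f} tok/s")
print(f"RAG prompt eval:    {rag_prompt:.2f} tok/s")
print(f"RAG total time:     {931913.39 / 60000:.1f} min")
```

At this speed, prompt evaluation alone accounts for about 744 of the 932 seconds, so shortening the stuffed context is the most direct way to reduce latency.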

In [28]:
# Checking doc sources
docs = vector_db.similarity_search(query)
print(f"Query: {query}")
print(f"Retrieved docs: {len(docs)}")

for doc in docs:
    # Each Document exposes its metadata and text directly as attributes
    print("Source: ", doc.metadata['source'])
    print("Text: ", doc.page_content, "\n")

Query: What are some of the most popular Yoga that benefits Juvenile Diabetes?
Retrieved docs: 4
Source:  /content/drive/MyDrive/LLM_Model/Yoga_Education_for_Children_Vol_1.pdf
Text:  8713
Yoga Benefits 
Juvenile Diabetes
Dr Swami Karmananda Saraswati
A few years ago diabetes among children was a rare  
  phenomenon. Diabetes was usually a disease of old 
age, first appearing in those around fifty or sixty years, 
who consumed too much sugar and starch in their diet, carried too much weight and did not exercise. This state of exhaustion of the pancreas can be controlled and restored by yoga therapy and dietary regulations which resensitize the body’s tissues to its own insulin, rejuvenate the pancreas and digestive gland and restore correct body weight. We always have a few of these patients in the ashram, where they learn to stabilize their blood sugar levels through yogic practices.
 However, times have changed and the spectrum of  
diabetes is changing. Medical students report that 