## Research And Generate
RAG system written using the Ollama and LangChain libraries

1. Installing the required libraries, in this case LangChain for the encoding (embedding) step

In [1]:
!pip install --upgrade --quiet langchain langchain-community langchainhub gpt4all langchain-chroma pymupdf pypdf


Importing Installed Libraries

In [1]:
# from langchain_community.document_loaders import WebBaseLoader
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_chroma import Chroma  # provided by the langchain-chroma package installed above
from langchain_community.embeddings import GPT4AllEmbeddings

# Start from a clean slate: drop the default collection left over from earlier runs
vectorstore = Chroma()
vectorstore.delete_collection()



Loading an example PDF

In [2]:
loader = PyPDFLoader("nsbm_foc_data.pdf")
pages = []

# Top-level `async for` works in a notebook cell; each item is one PDF page as a Document
async for page in loader.alazy_load():
    pages.append(page)
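Top-level `async for` is accepted in a notebook cell but is a syntax error in a plain Python script. A minimal sketch of the same collection loop wrapped in a coroutine is given here, assuming a stand-in async generator: `fake_lazy_load` is hypothetical and takes the place of `loader.alazy_load()`.

```python
import asyncio


async def fake_lazy_load():
    """Hypothetical stand-in for loader.alazy_load(): yields one item per page."""
    for text in ["page one", "page two", "page three"]:
        await asyncio.sleep(0)  # hand control back to the event loop, as real I/O would
        yield text


async def collect_pages(page_source):
    """Drain an async iterator into a list, mirroring the notebook cell."""
    collected = []
    async for page in page_source:
        collected.append(page)
    return collected


# In a script, asyncio.run drives the coroutine to completion
pages_demo = asyncio.run(collect_pages(fake_lazy_load()))
print(len(pages_demo))
```

In a script the real call would presumably look like `asyncio.run(collect_pages(loader.alazy_load()))`.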

Storing the loaded pages in a vector database object. The pages are embedded whole here; the imported RecursiveCharacterTextSplitter is not applied. Doing this only once is enough.

In [3]:
vectorstore = Chroma.from_documents(documents=pages, embedding=GPT4AllEmbeddings())
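The cell above embeds each PDF page as a single document. If smaller pieces are wanted, the pages can be cut into overlapping chunks first. A minimal plain-Python sketch of fixed-size chunking with overlap follows. It only illustrates the idea and is not how `RecursiveCharacterTextSplitter` works internally; the `chunk_size` and `overlap` values are arbitrary assumptions.

```python
def split_with_overlap(text, chunk_size=200, overlap=50):
    """Cut text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters, so text near a boundary
    appears in both neighbouring chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks


# Invented sample text, repeated to get something long enough to split
sample = "The Faculty of Computing offers a plethora of pathways. " * 10
chunks = split_with_overlap(sample, chunk_size=120, overlap=20)
print(len(chunks), [len(c) for c in chunks])
```

Larger chunks keep more surrounding context per hit, while smaller chunks make each retrieved piece more focused; the right trade-off depends on the documents.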

Testing similarity search

In [4]:
question = "Who is the dean of Faculty of computing in NSBM?"
docs = vectorstore.similarity_search(question)
len(docs)
print(docs)

[Document(metadata={'creationdate': '2025-02-25T14:49:22+05:30', 'creator': 'Writer', 'page': 1, 'page_label': '2', 'producer': 'LibreOffice 7.3', 'source': 'nsbm_foc_data.pdf', 'total_pages': 4}, page_content="Click here\nDepartment of  Software Engineering & Information Systems\nClick here\nDegree Programmes\nThe Faculty of Computing offers a plethora of pathways and specializations for its undergraduates. \nThis vast choice ensures that the academic component of all interests and dream careers are fulfilled\nwhilst also promising a holistic educational experience in any discipline of your choice.\nDepartment of Computer and Data Science\nBSc (Hons) Computer Science – (Plymouth University – United Kingdom)\nBSc (Hons) in Computer Science – (UGC Approved – Offered By NSBM)\nBSc (Honours) in Data Science – (UGC Approved – Offered By NSBM)\nBSc (Hons) in Data Science – (Plymouth University – United Kingdom)\nDepartment of Computer Security and Network Systems\nBSc (Hons) Computer Networ
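Conceptually, a similarity search embeds the question and returns the stored documents whose vectors lie closest to it. A minimal sketch with hand-made three-dimensional vectors and cosine similarity is shown here. The vectors and texts are invented for illustration; the actual store uses embeddings produced by `GPT4AllEmbeddings` and whatever distance metric Chroma is configured with.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed_docs, k=2):
    """Return the k (score, text) pairs most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed_docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy "embeddings": invented numbers, not the output of a real embedding model
toy_index = [
    ("Dean of the Faculty of Computing", [0.9, 0.1, 0.0]),
    ("Degree programmes offered", [0.2, 0.9, 0.1]),
    ("Campus location and map", [0.0, 0.2, 0.9]),
]
toy_query = [1.0, 0.2, 0.0]

results = top_k(toy_query, toy_index, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```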

### *Okay, now the Research part is done*
Let's focus on the generation part now.

Importing the libraries required to run an LLM locally.

In [5]:
from ctransformers import AutoModelForCausalLM
from transformers import AutoTokenizer



In [None]:
!pip install sentencepiece
!pip install numba


Let's download the model now

In [8]:
!curl -L -o model.gguf https://huggingface.co/TheBloke/Llama-2-13B-chat-GGUF/resolve/main/llama-2-13b-chat.Q6_K.gguf


  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1121  100  1121    0     0   6491      0 --:--:-- --:--:-- --:--:--  6517
100  9.9G  100  9.9G    0     0   148M      0  0:01:08  0:01:08 --:--:--  109M
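A failed download can leave a small HTML error page saved under the model's filename. One quick check, sketched here, is to confirm that the file begins with the four-byte `GGUF` magic that GGUF files start with. The helper name `looks_like_gguf` is hypothetical, and the demonstration writes temporary files rather than touching `model.gguf`.

```python
import os
import tempfile


def looks_like_gguf(path):
    """Return True if the file starts with the GGUF magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(4) == b"GGUF"


# Demonstration on two throwaway files (not the real model)
with tempfile.TemporaryDirectory() as tmp_dir:
    good_path = os.path.join(tmp_dir, "good.gguf")
    bad_path = os.path.join(tmp_dir, "bad.gguf")
    with open(good_path, "wb") as handle:
        handle.write(b"GGUF" + b"\x00" * 16)  # fake header, for this check only
    with open(bad_path, "wb") as handle:
        handle.write(b"<!DOCTYPE html>")

    good_result = looks_like_gguf(good_path)
    bad_result = looks_like_gguf(bad_path)

print(good_result, bad_result)
```

In the notebook this would be called as `looks_like_gguf("model.gguf")` after the `curl` cell.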


In [6]:
!ls
!nvidia-smi
!nvcc --version

DevNote.md  model.gguf	nsbm_foc_data.pdf  RAG_V2_ResearchGA.ipynb  README.md
Tue Feb 25 18:29:11 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.120                Driver Version: 550.120        CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA GeForce MX330           Off |   00000000:01:00.0 Off |                  N/A |
| N/A   68C    P8             N/A / ERR!  |       7MiB /   2048MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+--------------------

Setting the model parameters

In [6]:

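model_path = "model.gguf"
# gpu_layers is the number of layers offloaded to the GPU; hf=True wraps the model for use with transformers
model = AutoModelForCausalLM.from_pretrained(model_path, model_type="llama", gpu_layers=22, hf=True)
# Note: this tokenizer comes from a different model repository than the Llama-2 GGUF weights above
tokenizer = AutoTokenizer.from_pretrained("cognitivecomputations/TinyDolphin-2.8-1.1b")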

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
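The `gpu_layers=22` setting can be sanity-checked against the GPU reported by `nvidia-smi` with some rough arithmetic. The sketch below assumes that the roughly 9.9 GiB file is spread evenly over the 40 transformer layers of Llama-2-13B and that 512 MiB of GPU memory is held back for buffers. Both the even split and the reserve are simplifying assumptions, so treat the result as an order-of-magnitude estimate only.

```python
def layers_that_fit(model_mib, n_layers, gpu_total_mib, gpu_used_mib, reserve_mib):
    """Estimate how many layers fit in free GPU memory, assuming equal-sized layers."""
    per_layer_mib = model_mib / n_layers
    free_mib = gpu_total_mib - gpu_used_mib - reserve_mib
    return max(0, int(free_mib // per_layer_mib)), per_layer_mib


MODEL_MIB = 9.9 * 1024   # download size reported by curl
N_LAYERS = 40            # transformer layers in Llama-2-13B
GPU_TOTAL_MIB = 2048     # from nvidia-smi
GPU_USED_MIB = 7         # from nvidia-smi
RESERVE_MIB = 512        # assumed headroom for buffers, not a measured value

fit, per_layer = layers_that_fit(MODEL_MIB, N_LAYERS, GPU_TOTAL_MIB, GPU_USED_MIB, RESERVE_MIB)
requested = 22
print(f"~{per_layer:.0f} MiB per layer, about {fit} layers fit, {requested} requested")
```

Under these assumptions far fewer than 22 layers fit in 2 GiB, so a lower `gpu_layers` value may be worth trying.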


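Testing the model capabilities. With hf=True, generate expects tokenized input IDs rather than a raw string, which is the likely cause of the error below.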

In [9]:
print(model.generate("What is the meaning of life",max_new_tokens=512))

AttributeError: 'str' object has no attribute 'shape'

Let's Combine the context and the question

In [11]:
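context = "\n".join([doc.page_content for doc in docs])

print(context)
prompt = f"Context:\n{context}\n\nQuestion: {question}\n\nAnswer:"
# Debug override: this replaces the RAG prompt above with a plain question.
# Remove the next line to send the retrieved context to the model.
prompt = "What is the meaning of life"

# Tokenize the prompt and move the tensors to the model's device
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=512)
answer = tokenizer.decode(output[0], skip_special_tokens=True)

print(answer)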

Click here
Department of  Software Engineering & Information Systems
Click here
Degree Programmes
The Faculty of Computing offers a plethora of pathways and specializations for its undergraduates. 
This vast choice ensures that the academic component of all interests and dream careers are fulfilled
whilst also promising a holistic educational experience in any discipline of your choice.
Department of Computer and Data Science
BSc (Hons) Computer Science – (Plymouth University – United Kingdom)
BSc (Hons) in Computer Science – (UGC Approved – Offered By NSBM)
BSc (Honours) in Data Science – (UGC Approved – Offered By NSBM)
BSc (Hons) in Data Science – (Plymouth University – United Kingdom)
Department of Computer Security and Network Systems
BSc (Hons) Computer Networks – (Plymouth University – United Kingdom)
BSc (Hons) Computer Security – (Plymouth University – United Kingdom)
Bachelor of Information Technology (Major in Cyber Security) – (Victoria University – Australia)
BSc (Hons) in

Number of tokens (513) exceeded maximum context length (512).
Number of tokens (514) exceeded maximum context length (512).
Number of tokens (515) exceeded maximum context length (512).
Number of tokens (516) exceeded maximum context length (512).
Number of tokens (517) exceeded maximum context length (512).
Number of tokens (518) exceeded maximum context length (512).


What is the meaning of life?
- What is the purpose of life?
- What is the ultimate goal of life?
- What is the significance of life?
- What is the essence of life?
- What is the meaning of existence?
- What is the purpose of existence?
- What is the ultimate purpose of existence?
- What is the significance of existence?
- What is the essence of existence?
- What is the meaning of the universe?
- What is the purpose of the universe?
- What is the ultimate purpose of the universe?
- What is the significance of the universe?
- What is the essence of the universe?
- What is the meaning of the cosmos?
- What is the purpose of the cosmos?
- What is the ultimate purpose of the cosmos?
- What is the significance of the cosmos?
- What is the essence of the cosmos?
- What is the meaning of the universe as we know it?
- What is the purpose of the universe as we know it?
- What is the ultimate purpose of the universe as we know it?
- What is the significance of the universe as we know it?
- What i
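The warnings above show generation running past the 512-token context window. One way to keep a RAG prompt inside a fixed window is to reserve room for the answer and trim the retrieved context to the remainder. The sketch below approximates token counts by whitespace-separated words, which is a crude assumption (a real tokenizer will count differently). `build_prompt` and its parameters are hypothetical names.

```python
def build_prompt(question, context_chunks, context_window=512, max_new_tokens=128):
    """Assemble a prompt whose approximate length leaves room for the answer.

    Token counts are approximated by word counts; this is an assumption,
    not what the model's tokenizer would report.
    """
    template_words = len("Context:\n\n\nQuestion: \n\nAnswer:".split())
    budget = context_window - max_new_tokens - template_words - len(question.split())
    if budget <= 0:
        raise ValueError("no room left for context; shorten the question or the answer")

    kept_words = []
    for chunk in context_chunks:  # chunks are assumed to be ranked best first
        remaining = budget - len(kept_words)
        if remaining <= 0:
            break
        kept_words.extend(chunk.split()[:remaining])

    # Joining on spaces drops the original line breaks inside the context
    context = " ".join(kept_words)
    return f"Context:\n{context}\n\nQuestion: {question}\n\nAnswer:"


demo_chunks = ["alpha " * 300, "beta " * 300]
demo_question = "Who is the dean of Faculty of computing in NSBM?"
demo_prompt = build_prompt(demo_question, demo_chunks)
print(len(demo_prompt.split()))
```

Raising the window itself is the other option: ctransformers documents a `context_length` setting for LLaMA-family models, which may be worth trying alongside a smaller `max_new_tokens`.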