## Hello, Here's How to use RAG w HF Models

Install some dependencies

In [None]:
!pip install -q -U bitsandbytes==0.42.0
!pip install -q -U peft==0.8.2
!pip install -q -U trl==0.7.10
!pip install -q -U accelerate==0.27.1
!pip install -q -U datasets==2.17.0
!pip install -q -U transformers==4.38.1
!pip install langchain sentence-transformers chromadb langchainhub

!pip install langchain-community langchain-core

Get the Model You Want

In [38]:
from transformers import AutoModelForCausalLM, AutoTokenizer
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma
import chromadb

Define Variables

In [39]:
import os

# set your own hf token then fetch it here
hf_token = os.getenv("HUGGINGFACEHUB_API_TOKEN")

model_name = "meta-llama/Llama-3.2-1B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

Define Data Sources

In [40]:
import pandas as pd

file_names = [
    "study_permit_general",
    "work_permit_student_general",
    "work-and-education-data",
    "vancouver_transit_qa_pairs",
    "permanent_residence_student_general",
    "data-with-sources"
]

all_texts = []

for file in file_names:
  path = f'./sample_data/{file}.csv'
  try:
        df = pd.read_csv(path)
        df.columns = df.columns.str.lower()
        if 'question' in df.columns and 'answer' in df.columns:
            df['text'] = df['question'].fillna('') + ' ' + df['answer'].fillna('')
        elif 'theme' in df.columns and 'content' in df.columns:
            df['text'] = df['theme'].fillna('') + ' ' + df['content'].fillna('')
        else:
            print(f"noo text columns in {file}")
            continue
        all_texts.extend(df['text'].tolist())
  except Exception as e:
        print(f"Error loading {file}: {e}")


In [None]:
# forgot one dependency
!pip install chromadb

Set Embedding Model, and Chroma Client to Interact w Vector Database and Create Collections

In [None]:
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma
import chromadb

# pt model for geenrating embeddings used pretty often
embedding_model = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)

# persistent client to interact w chroma vector store
client = chromadb.PersistentClient(path="./chroma_db")

# create collections for each data (for testing rn)
collection = client.get_or_create_collection(name="combined_docs")

for i, text in enumerate(all_texts):
    collection.add(ids=[str(i)], documents=[text])

print(f"successfully added {len(all_texts)} documents to Chroma DB.")

Function to add data to collection by embedding them

In [None]:
def add_data_to_collection(collection, texts):
    for idx, text in enumerate(texts):
        try:
            embeddings = embedding_model.embed_documents([text])[0]
            collection.add(
                ids=[str(idx)],
                embeddings=[embeddings],
                documents=[text]
            )
        except Exception as e:
            print(f"Error on index {idx}: {e}")

# add data to collections
add_data_to_collection(collection, all_texts)

Function to now match for releveant document

In [41]:
def get_relevant_documents(query, n_results=3):
    try:
        query_embeddings = embedding_model.embed_documents([query])[0]

        results = collection.query(query_embeddings=[query_embeddings], n_results=n_results)
        print(f"Query Results: {results}")

        return results['documents'][0] if results['documents'] else []
    except Exception as e:
        print(f"Error querying: {e}")
        return []

Generate Answer

In [42]:
def generate_answer(query):
    inputs = tokenizer(query, return_tensors="pt")
    base_output = model.generate(inputs["input_ids"], max_length=150, temperature=0.1)
    response_before_rag = tokenizer.decode(base_output[0], skip_special_tokens=True)

    relevant_documents = get_relevant_documents(query)
    if not relevant_documents:
        return {
            "Before RAG Response": response_before_rag,
            "After RAG Response": "Sorry, no relevant documents found."
        }

    relevant_texts = "\n\n".join([doc[:500] for doc in relevant_documents])
    rag_prompt = f"""
    You are a helpful assistant for international students. Here are relevant documents:

    {relevant_texts}

    Please respond to the following question based on the documents above. Be conversational but concise:

    Question: {query}

    Answer:
    """

    rag_inputs = tokenizer(rag_prompt, return_tensors="pt")
    rag_output = model.generate(rag_inputs["input_ids"], max_length=500, temperature=0.1)
    response_after_rag = tokenizer.decode(rag_output[0], skip_special_tokens=True)

    return {
        "Before RAG Response": response_before_rag,
        "After RAG Response": response_after_rag
    }

Example Usage

In [43]:
test_queries = [
    "How do I apply for a study permit in Canada?",
    "Can I work while studying on a student visa?",
    "What happens if my study permit expires before I finish my program?",
    "Do I need a new study permit if I change schools?",
    "How long does it take to process a Canadian study permit?",
    "Am I allowed to work off-campus as an international student?",
    "How many hours can I work while studying in Canada?",
    "What documents do I need to apply for a co-op work permit?",
    "Can I work in Canada after I graduate?",
    "What is a Post-Graduation Work Permit (PGWP) and how do I apply?",
    "How do I apply for MSP (Medical Services Plan) in British Columbia?",
    "Is MSP mandatory for international students?",
    "What healthcare services are covered under MSP?",
    "What should I do if I get sick and don’t have insurance yet?",
    "Can I use private health insurance instead of MSP?",
    "What are my options for student housing in Vancouver?",
    "How much does rent typically cost for international students?",
    "What should I check before signing a lease in Canada?",
    "Are there any student discounts for accommodation?",
    "How can I find a roommate in Canada?",
    "How do I open a bank account as an international student?",
    "What documents do I need to get a student bank account?",
    "Can I get a credit card as an international student?",
    "How do I send money to my home country from Canada?",
    "What scholarships are available for international students?",
    "How does the Compass Card work for transit in Vancouver?",
    "Am I eligible for a U-Pass as an international student?",
    "What is the best way to get around Vancouver on a budget?",
    "Where can I find the bus and SkyTrain schedules?",
    "Are there student discounts for public transportation?",
    "Can I apply for permanent residence after graduating?",
    "What is the Canadian Experience Class (CEC) immigration program?",
    "How can I improve my chances of getting permanent residence?",
    "What are the eligibility requirements for Express Entry?",
    "Does having a Canadian degree help with PR applications?"
]

for idx, user_query in enumerate(test_queries, start=1):
    responses = generate_answer(user_query)

    print("\n" + "="*50)
    print(f"Test Query {idx}: {user_query}")
    print("="*50)
    print("Response Before RAG:\n", responses["Before RAG Response"])
    print("\nResponse After RAG:\n", responses["After RAG Response"])
    print("="*50 + "\n")


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


Query Results: {'ids': [['848', '865', '749']], 'embeddings': None, 'documents': [['Study Permit Application Process When you apply online, your personalized checklist will include the Application to Change Conditions, Extend my Stay or Remain in Canada as a Student [IMM 5709]. On that form, check the box to Apply for a study permit for the first time or extend my study permit and fill out the rest of the required fields.', 'Study Permit Application Process Most foreign nationals already in Canada can no longer apply for a study permit at a port of entry. You must apply online for', 'Extending a Study Permit If you want to keep studying in Canada, you must']], 'uris': None, 'data': None, 'metadatas': [[None, None, None]], 'distances': [[0.33834877610206604, 0.4000810384750366, 0.4008612334728241]], 'included': [<IncludeEnum.distances: 'distances'>, <IncludeEnum.documents: 'documents'>, <IncludeEnum.metadatas: 'metadatas'>]}


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.



Test Query 1: How do I apply for a study permit in Canada?
Response Before RAG:
 How do I apply for a study permit in Canada? The process typically involves several steps, which I'll outline below.

**Step 1: Determine if you need a study permit**

* Check if you're eligible for a study permit by visiting Immigration, Refugees and Citizenship Canada (IRCC) and using their online tool.
* You can also check the IRCC website for specific requirements and eligibility criteria.

**Step 2: Gather required documents**

* A completed Study Permit Application form (available on the IRCC website)
* Proof of language proficiency (e.g., IELTS, TOEFL, or CEGEP results)
* Proof of financial support (e.g., bank statements, proof of employment)
* Proof of admission

Response After RAG:
 
    You are a helpful assistant for international students. Here are relevant documents:
    
    Study Permit Application Process When you apply online, your personalized checklist will include the Application to Ch

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


Query Results: {'ids': [['250', '329', '351']], 'embeddings': None, 'documents': [["Work Permits for International Students If you're able to work during your studies, it'll say so in the conditions on your study permit.", 'International Student Work Rights (On-Campus) You can work on your school campus, without a work permit, if you', 'International Student Work Rights (On-Campus) Working more than 24 hours per week is a violation of your study permit conditions. You can lose your student status for doing this, and may not be approved for a study or work permit in the future. You may also have to leave the country.']], 'uris': None, 'data': None, 'metadatas': [[None, None, None]], 'distances': [[0.5540598630905151, 0.6326108574867249, 0.6339176297187805]], 'included': [<IncludeEnum.distances: 'distances'>, <IncludeEnum.documents: 'documents'>, <IncludeEnum.metadatas: 'metadatas'>]}


KeyboardInterrupt: 