# Final Project
## ADSP 32021 IP01 Machine Learning Operations
### 5: Fine-Tuned Model Test
#### Group 2: Maria Clarissa Fionalita, Kajal Shukla, Mia Zhang, Priya Suvvaru Venkata

In [1]:
!python --version
!jupyter nbextension enable --py widgetsnbextension

Python 3.10.13
Enabling notebook extension jupyter-js-widgets/extension...
      - Validating: OK


In [2]:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = ""  # hide all GPUs so that inference runs on the CPU

In [3]:
# from huggingface_hub import notebook_login
# # https://huggingface.co/settings/tokens

# notebook_login()

# Load Fine-Tuned OPT-125M

In [4]:
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
from optimum.bettertransformer import BetterTransformer

from pprint import pprint

In [5]:
%%time

model_name = "facebook/opt-125m"
new_model_name = "model/opt_125_data_v4"

model = AutoModelForCausalLM.from_pretrained(new_model_name)
model = BetterTransformer.transform(model, keep_original_model=True)  # use the fused-kernel fast path for inference; not every architecture is supported

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"
tokenizer.add_eos_token = True
tokenizer.add_bos_token, tokenizer.add_eos_token

The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.


CPU times: user 4.87 s, sys: 10.3 s, total: 15.1 s
Wall time: 5.88 s


(True, True)

In [6]:
def inference(text, model, tokenizer, max_input_tokens = 1000, max_output_tokens = 100):
    device = model.device
    # Tokenize the prompt, truncating it to at most max_input_tokens tokens
    input_ids = tokenizer.encode(text, return_tensors="pt", truncation=True, max_length=max_input_tokens).to(device)

    # Generate; note that max_length caps prompt tokens plus generated tokens combined
    generated_tokens = model.generate(input_ids=input_ids, max_length=max_output_tokens, temperature = 0.4, pad_token_id=tokenizer.eos_token_id, do_sample = True)

    # Decode
    generated_text = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
    
    # Strip the prompt (assumes the decoded text begins with the prompt verbatim)
    generated_text_answer = generated_text[0][len(text):]
    
    return generated_text_answer
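
The prompt-stripping step above slices by character count, which only works when the decoded output starts with the prompt exactly as it was passed in. A minimal sketch of a more defensive variant follows; `strip_prompt` is a hypothetical helper name, not part of the notebook or of `transformers`, and the sample strings are made up for illustration.

```python
def strip_prompt(generated_text: str, prompt: str) -> str:
    """Return only the continuation, checking the prompt prefix first."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):]
    # The decoded text does not begin with the prompt (for example after
    # truncation), so return it unchanged instead of cutting at an
    # arbitrary character offset.
    return generated_text


# Hypothetical decoded output, for illustration only.
prompt = "question: hello?\nanswer:"
decoded = "question: hello?\nanswer: hi there"
print(repr(strip_prompt(decoded, prompt)))
```

If the prefix check fails, the caller receives the full text and can decide how to handle it.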

In [7]:
def qa_gen(text, model, tokenizer, max_output_tokens = 100):
    # instruction = "instruction: please answer the following question\n"
    question = "question: " + str(text) + "\n"
    prompt = question + "answer:"
    print(prompt)
    print("-------------------BELOW IS GENERATED BY LANGUAGE MODEL---------------------------")
    print(inference(text = prompt, model = model, tokenizer = tokenizer, max_output_tokens = max_output_tokens))
    print("-------------------END OF TEXT GENERATED BY LANGUAGE MODEL------------------------")

In [8]:
%%time

text = "hello?"

qa_gen(text = text, model = model, tokenizer = tokenizer, max_output_tokens = 30)

question: hello?
answer:
-------------------BELOW IS GENERATED BY LANGUAGE MODEL---------------------------
 hello: hello -- hosiery, clothing, shoes, eyeglasses, headband, hairbrush,
-------------------END OF TEXT GENERATED BY LANGUAGE MODEL------------------------
CPU times: user 26.5 s, sys: 0 ns, total: 26.5 s
Wall time: 3.32 s
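
The `inference` helper samples with `temperature = 0.4`. As a rough illustration of what that setting does, the sketch below applies temperature scaling to a small set of made-up logits (the values are hypothetical and not taken from OPT-125M). Dividing the logits by a temperature below 1 concentrates probability on the most likely token, which makes sampling less random.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three candidate tokens.
logits = [2.0, 1.0, 0.5]
print("T=1.0:", softmax_with_temperature(logits, 1.0))
print("T=0.4:", softmax_with_temperature(logits, 0.4))
```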


# N-Shot Learning

In [9]:
test_prompt = ["What types of exercise are best for people with asthma?", "How is obsessive-compulsive disorder diagnosed?", "When are you more likely to get a blood clot?", "How should you lift objects to prevent back pain?", "How can you be smart with antibiotics?"]

test_prompt[0]

'What types of exercise are best for people with asthma?'

# Zero-Shot

In [10]:
%%time

for prompt in test_prompt:
    qa_gen(text = prompt, model = model, tokenizer = tokenizer, max_output_tokens = 100)
    print()

question: What types of exercise are best for people with asthma?
answer:
-------------------BELOW IS GENERATED BY LANGUAGE MODEL---------------------------
 Aerobic exercise is also good for people with asthma. Aerobic exercise is also good for people with asthma. People with asthma can benefit from a aerobic exercise program, which includes walking, jogging and swimming. Aerobic exercise can also include light exercise such as jogging or swimming.
    
-------------------END OF TEXT GENERATED BY LANGUAGE MODEL------------------------

question: How is obsessive-compulsive disorder diagnosed?
answer:
-------------------BELOW IS GENERATED BY LANGUAGE MODEL---------------------------
 If you think you have obsessive compulsive disorder ( OCD), you probably have it. You might also have other mental health problems, like anxiety, depression, or learning disorders. To rule those out, you'll need to see a therapist and get treatment for the problem.
    
-------------------END OF TEXT GENER

# One-Shot

In [11]:
%%time

one_shot_sample = """
question: What should I do if I want to stop dialysis?\n
answer: But you can choose not to have it or stop at any time. If you do, make sure to talk to your doctor about other treatments that can help you. Changes to your diet or lifestyle may improve your quality of life. If you want to stop dialysis because you feel depressed or ashamed, your doctor may urge you to speak to a counselor first. Sharing your feelings, taking antidepressants, or doing both of these things may help you make a more informed decision.\n
question: """

for prompt in test_prompt:
    one_shot_qa = one_shot_sample + prompt + "\n" + "answer:"
    print(one_shot_qa)
    print("-------------------BELOW IS GENERATED BY LANGUAGE MODEL---------------------------")
    print(inference(text = one_shot_qa, model = model, tokenizer = tokenizer, max_output_tokens = 200))
    print("-------------------END OF TEXT GENERATED BY LANGUAGE MODEL------------------------")


question: What should I do if I want to stop dialysis?

answer: But you can choose not to have it or stop at any time. If you do, make sure to talk to your doctor about other treatments that can help you. Changes to your diet or lifestyle may improve your quality of life. If you want to stop dialysis because you feel depressed or ashamed, your doctor may urge you to speak to a counselor first. Sharing your feelings, taking antidepressants, or doing both of these things may help you make a more informed decision.

question: What types of exercise are best for people with asthma?
answer:
-------------------BELOW IS GENERATED BY LANGUAGE MODEL---------------------------
 Walking, jogging, swimming, and bicycling are all good exercise choices.
    
-------------------END OF TEXT GENERATED BY LANGUAGE MODEL------------------------

question: What should I do if I want to stop dialysis?

answer: But you can choose not to have it or stop at any time. If you do, make sure to talk to your doct

# Few-Shot

In [12]:
%%time

few_shot_sample = """
question: What should I do if I want to stop dialysis?\n
answer: But you can choose not to have it or stop at any time. If you do, make sure to talk to your doctor about other treatments that can help you. Changes to your diet or lifestyle may improve your quality of life. If you want to stop dialysis because you feel depressed or ashamed, your doctor may urge you to speak to a counselor first. Sharing your feelings, taking antidepressants, or doing both of these things may help you make a more informed decision.\n
question: What are some tips to stay healthy during dialysis?\n
answer: Hemodialysis patients are also at an increased risk for infections. Try these tips to stay healthy: Check your access site daily for redness, pus, and swelling. If you see any, call your doctor. Keep the bandage that covers your catheter clean and dry.\n
question:"""

for prompt in test_prompt:
    few_shot_qa = few_shot_sample + prompt + "\n" + "answer:"
    print(few_shot_qa)
    print("-------------------BELOW IS GENERATED BY LANGUAGE MODEL---------------------------")
    print(inference(text = few_shot_qa, model = model, tokenizer = tokenizer, max_output_tokens = 200))
    print("-------------------END OF TEXT GENERATED BY LANGUAGE MODEL------------------------")


question: What should I do if I want to stop dialysis?

answer: But you can choose not to have it or stop at any time. If you do, make sure to talk to your doctor about other treatments that can help you. Changes to your diet or lifestyle may improve your quality of life. If you want to stop dialysis because you feel depressed or ashamed, your doctor may urge you to speak to a counselor first. Sharing your feelings, taking antidepressants, or doing both of these things may help you make a more informed decision.

question: What are some tips to stay healthy during dialysis?

answer: Hemodialysis patients are also at an increased risk for infections. Try these tips to stay healthy: Check your access site daily for redness, pus, and swelling. If you see any, call your doctor. Keep the bandage that covers your catheter clean and dry.

question:What types of exercise are best for people with asthma?
answer:
-------------------BELOW IS GENERATED BY LANGUAGE MODEL---------------------------
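
The zero-shot, one-shot, and few-shot prompts in this section all follow the same `question:` / `answer:` layout, so they can be assembled by one function. The sketch below is a hypothetical helper (`build_n_shot_prompt` is not part of the notebook) that takes a list of worked examples and the new question. With an empty list it produces the same string that `qa_gen` builds; with one or two examples it produces a one-shot or few-shot prompt with consistent spacing. The example pair is shortened for illustration.

```python
def build_n_shot_prompt(examples, question):
    """Assemble an n-shot prompt from (question, answer) pairs.

    An empty `examples` list yields the zero-shot prompt format.
    """
    blocks = [f"question: {q}\nanswer: {a}\n" for q, a in examples]
    blocks.append(f"question: {question}\nanswer:")
    return "\n".join(blocks)


# Shortened example pair, for illustration only.
examples = [
    ("What should I do if I want to stop dialysis?",
     "Talk to your doctor about other treatments that can help you."),
]
print(build_n_shot_prompt(examples, "How can you be smart with antibiotics?"))
```

Keeping the examples in a list also makes it easy to vary the number of shots in a loop without editing a multi-line string by hand.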

# Retrieval Augmented Generation Prompt Engineering

## Load the Vector Store Database

In [13]:
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS
from langchain.chains import RetrievalQA

In [14]:
%%time

# Create a dictionary with model configuration options, specifying to use the CPU for computations
model_kwargs = {'device':'cpu'}

# Create a dictionary with encoding options, specifically setting 'normalize_embeddings' to False
encode_kwargs = {'normalize_embeddings': False}

# Initialize an instance of HuggingFaceEmbeddings with the specified parameters
embeddings = HuggingFaceEmbeddings(
    model_name = "sentence-transformers/all-MiniLM-l6-v2",     # Provide the pre-trained model's path
    model_kwargs = model_kwargs, # Pass the model configuration options
    encode_kwargs = encode_kwargs # Pass the encoding options
)

CPU times: user 561 ms, sys: 50.2 ms, total: 611 ms
Wall time: 391 ms
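
`encode_kwargs` leaves `normalize_embeddings` set to `False`, so the embedding vectors keep their original lengths. For reference, here is a minimal sketch of what L2 normalization does, written in plain Python with toy vectors; the function names are illustrative and do not come from LangChain or sentence-transformers. Once two vectors are normalized, their inner product equals their cosine similarity.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Toy vectors, for illustration only.
a, b = [3.0, 4.0], [4.0, 3.0]
print("raw inner product:", dot(a, b))
print("normalized inner product:", dot(l2_normalize(a), l2_normalize(b)))
```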


In [15]:
%%time

db = FAISS.load_local("data/RAG_data", embeddings)

CPU times: user 2.14 s, sys: 530 ms, total: 2.67 s
Wall time: 3.95 s


## Create the Retriever

https://github.com/langchain-ai/langchain/discussions/3115

In [16]:
# Create a retriever from 'db' that uses maximal marginal relevance (MMR) search and returns up to 4 documents per query.
retriever = db.as_retriever(search_type = "mmr", search_kwargs={"k": 4})
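
The retriever uses `search_type = "mmr"` (maximal marginal relevance), which trades off relevance to the query against redundancy among the documents already chosen. The sketch below shows the general selection rule in plain Python on toy two-dimensional vectors. It is an illustration of the idea only: the function names, the `lambda_mult` value, and the vectors are assumptions, and the library's implementation may differ in details such as the size of the candidate pool.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if not na or not nb:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def mmr_select(query_vec, doc_vecs, k=4, lambda_mult=0.5):
    """Greedily pick up to k document indices by maximal marginal relevance."""
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        best_idx, best_score = None, -math.inf
        for i in candidates:
            relevance = cosine(query_vec, doc_vecs[i])
            # Similarity to the closest already-selected document.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        candidates.remove(best_idx)
    return selected


# Toy vectors: documents 0 and 1 are near-duplicates, document 2 is different.
query = [1.0, 0.0]
docs = [[0.9, 0.1], [0.9, 0.12], [0.6, -0.8]]
print(mmr_select(query, docs, k=2))                   # balances relevance and diversity
print(mmr_select(query, docs, k=2, lambda_mult=1.0))  # relevance only
```

With `lambda_mult=1.0` the rule reduces to plain similarity ranking, so the two near-duplicate documents are both returned; lower values penalize the duplicate.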

In [17]:
# test retriever
for prompt in test_prompt:
    print(prompt)
    print(retriever.get_relevant_documents(prompt)[0].page_content)
    print()

What types of exercise are best for people with asthma?
Regular exercise can help you control your asthma. It can strengthen lung muscles, make it easier to manage your weight, and boost your immune system. Instead: Try different kinds of activities that are less challenging. Avoid weather conditions that might trigger symptoms.

How is obsessive-compulsive disorder diagnosed?
People with obsessive compulsive disorder ( OCD) have recurring and distressing thoughts, fears, or images (obsessions) that they cannot control. The anxiety (nervousness) produced by these thoughts leads to an urgent need to perform certain rituals or routines (compulsions). With BDD, a person's preoccupation with the defect often leads to ritualistic behaviors, such as constantly looking in a mirror or picking at the skin. The person with BDD eventually becomes so obsessed with the defect that his or her social, work, and home functioning suffers.

When are you more likely to get a blood clot?
You might need to

## Initialize the LLM Pipeline

### Define the Prompt Template

In [18]:
from langchain.prompts import PromptTemplate

qa_template = """You are a helpful assistant that can answer medical questions. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Context information is below:

{context}

Given the context information and not prior knowledge, answer the question: {question}
Answer: """

prompt_template = PromptTemplate(
    template = qa_template,
    input_variables = ["context", "question"]
)

In [19]:
# test prompt

print(
    prompt_template.invoke(
        {"context": "filler context", "question": "filler question"}
    ).to_string()
)

You are a helpful assistant that can answer medical questions. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Context information is below:

filler context

Given the context information and not prior knowledge, answer the question: filler question
Answer: 
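
A quick sanity check for a template like this is to confirm that its placeholders match the declared `input_variables`. The sketch below does that with the standard library only; `template_fields` is a hypothetical helper and not a LangChain function. It parses the same template text and lists the named fields it finds.

```python
from string import Formatter

qa_template = """You are a helpful assistant that can answer medical questions. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Context information is below:

{context}

Given the context information and not prior knowledge, answer the question: {question}
Answer: """


def template_fields(template):
    """Return the set of named placeholders in a format-style template."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}


declared = {"context", "question"}
found = template_fields(qa_template)
print("placeholders found:", sorted(found))
print("matches declared input_variables:", found == declared)
```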


### Load the Model into a Chain

In [20]:
import json
from pathlib import Path
from pprint import pprint
import ast

from langchain.llms.huggingface_pipeline import HuggingFacePipeline
from langchain.llms import HuggingFaceHub

In [21]:
%%time

pipe = pipeline("text-generation",
                model = model,
                tokenizer=tokenizer,
                # Pass generation settings directly so they reach generate();
                # model_kwargs is intended for loading a model by name
                temperature = 0.4,
                do_sample = True,
                pad_token_id = tokenizer.eos_token_id,
                max_new_tokens = 100)

llm = HuggingFacePipeline(pipeline = pipe)

CPU times: user 11.3 ms, sys: 186 µs, total: 11.5 ms
Wall time: 1.44 ms


In [22]:
from langchain.schema import StrOutputParser
from langchain.schema.runnable import RunnablePassthrough
# https://python.langchain.com/docs/use_cases/question_answering/

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt_template
    | llm
    | StrOutputParser()
)
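
The chain above pipes four stages together: retrieve documents, join them into one context string, fill the prompt template, and generate. The sketch below traces the same data flow in plain Python with stand-in functions, so it runs without LangChain or a model. `run_rag`, `fake_retrieve`, and `echo_generate` are hypothetical names introduced for illustration, and the shortened template is not the one used in the notebook.

```python
def format_docs(docs):
    """Join retrieved passages with blank lines between them."""
    return "\n\n".join(docs)


def run_rag(question, retrieve, template, generate):
    """Retrieve context, fill the template, and pass the prompt to the generator."""
    context = format_docs(retrieve(question))
    prompt = template.format(context=context, question=question)
    return generate(prompt)


def fake_retrieve(question):
    # Stand-in for the vector store retriever; returns fixed passages.
    return ["Passage one.", "Passage two."]


def echo_generate(prompt):
    # Stand-in for the language model; returns the prompt it received.
    return prompt


template = "Context:\n{context}\n\nQuestion: {question}\nAnswer: "
print(run_rag("What is asthma?", fake_retrieve, template, echo_generate))
```

Echoing the prompt back is a convenient way to inspect exactly what text the model would receive before spending time on generation.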

## Generate Output

In [23]:
%%time

for prompt in test_prompt:
    print("question:", prompt)
    result = rag_chain.invoke(prompt)
    print("-------------------BELOW IS GENERATED BY LANGUAGE MODEL---------------------------")
    print()
    print(result)
    print()
    print("-------------------END OF TEXT GENERATED BY LANGUAGE MODEL------------------------")
    print()

question: What types of exercise are best for people with asthma?
-------------------BELOW IS GENERATED BY LANGUAGE MODEL---------------------------

  
Question: If you have a cold air pollution.
    
Question: If you have a cold air conditioner.
    
Question: If you have a cold air.
    

-------------------END OF TEXT GENERATED BY LANGUAGE MODEL------------------------

question: How is obsessive-compulsive disorder diagnosed?
-------------------BELOW IS GENERATED BY LANGUAGE MODEL---------------------------

   disorder).
Question: " I am a person's reaction."
     
Question: " I am a person's reaction."
    
Question: " I am a person's reaction."
      Answer: I am a person's reaction."
    

-------------------END OF TEXT GENERATED BY LANGUAGE MODEL------------------------

question: When are you more likely to get a blood clot?
-------------------BELOW IS GENERATED BY LANGUAGE MODEL---------------------------

 

-------------------END OF TEXT GENERATED BY LANGUAGE MODEL-------