In [1]:
pip install -U FlagEmbedding langchain-community



In [2]:
pip install --upgrade optimum transformers



In [3]:
pip install faiss-cpu



In [5]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [1]:
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import PromptTemplate
from langchain_community.llms import HuggingFacePipeline
from FlagEmbedding import FlagAutoModel

embed_model = FlagAutoModel.from_finetuned('BAAI/bge-base-en-v1.5',
                                      query_instruction_for_retrieval="Represent this sentence for searching relevant passages:",
                                      use_fp16=True)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


In [2]:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig, pipeline

model_name = "deepseek-ai/deepseek-llm-7b-chat"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")
model.generation_config = GenerationConfig.from_pretrained(model_name)
model.generation_config.pad_token_id = model.generation_config.eos_token_id



In [3]:
vector_store_path = "/content/drive/MyDrive/my_vector_store"
vector_store = FAISS.load_local(vector_store_path, embeddings=embed_model, allow_dangerous_deserialization=True)
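
The store loaded above is queried later with `filter={"category": ...}` and `k=1`. As far as the LangChain FAISS wrapper documents it, the filter is applied after the nearest `fetch_k` candidates (default 20) have been fetched, so a matching passage ranked below that cutoff would never be returned. The toy model below sketches that behaviour under this assumption; it is not the library's code, and `filtered_top_k` and the sample documents are hypothetical.

```python
def filtered_top_k(ranked_docs, category, k=1, fetch_k=20):
    """Toy model of post-filtering: keep the fetch_k nearest, then filter."""
    candidates = ranked_docs[:fetch_k]  # ranked_docs is ordered nearest first
    matches = [d for d in candidates if d["category"] == category]
    return matches[:k]

# Hypothetical ranking: 25 'general' passages, then one 'clinical data' passage.
ranked = [{"id": i, "category": "general"} for i in range(25)]
ranked.append({"id": 25, "category": "clinical data"})

print(filtered_top_k(ranked, "clinical data"))              # []
print(filtered_top_k(ranked, "clinical data", fetch_k=30))  # [{'id': 25, 'category': 'clinical data'}]
```

If a filtered search comes back empty, passing a larger `fetch_k` to `similarity_search_by_vector` is one thing worth trying.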



In [5]:
def generate(user_query):
  """Retrieve one passage for the query and answer from that passage only."""
  torch.cuda.empty_cache()
  query = user_query
  embed_query = embed_model.encode([query])

  # Route the query to a metadata category by keyword; default to 'general'.
  if "pregnant" in query.lower():
    category = 'specific population usage'
  elif "uses" in query.lower():
    category = 'general'
  elif "clinical" in query.lower():
    category = 'clinical data'
  elif "dose" in query.lower():
    category = 'dosage and administration'
  else:
    category = 'general'

  # Fetch the single closest passage within the chosen category.
  results = vector_store.similarity_search_by_vector(
      embed_query[0],
      k=1,
      filter={"category": category}
  )

  cleaned_context = [doc.page_content for doc in results]
  prompt_template = """
  You are an expert medical assistant explaining medicine to someone without any prior knowledge.

  Use ***ONLY*** the provided context to answer the user's question. Respond only with exact information found in the provided text.
  If the context does not contain the answer, say "The context does not provide enough information."

  # Context: {context}

  # Question: {question}

  Answer:
  """
  prompt = PromptTemplate(template=prompt_template, input_variables=['context', 'question'])
  llm = HuggingFacePipeline(
      pipeline=pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=300)
  )
  qa_chain = prompt | llm
  response = qa_chain.invoke({'context' : cleaned_context, 'question' : query})

  def extract_answer(text):
    # Keep only what follows the first "Answer:" marker;
    # return the text unchanged if the marker is absent.
    if "Answer:" in text:
      return text.split("Answer:", 1)[1]
    return text

  result = extract_answer(response)
  return result, cleaned_context
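
The keyword routing inside `generate` can also be written as a data table, which makes adding a rule a one-line change. The sketch below reuses the category strings from the function above; `CATEGORY_RULES` and `route_category` are hypothetical names introduced for illustration. Note that a "side effects" question matches no rule and falls through to `'general'`.

```python
# Hypothetical rule table: (keyword, category), checked in order.
CATEGORY_RULES = [
    ("pregnant", "specific population usage"),
    ("uses", "general"),
    ("clinical", "clinical data"),
    ("dose", "dosage and administration"),
]
DEFAULT_CATEGORY = "general"

def route_category(query, rules=CATEGORY_RULES, default=DEFAULT_CATEGORY):
    """Return the category of the first keyword found in the query."""
    lowered = query.lower()
    for keyword, category in rules:
        if keyword in lowered:
            return category
    return default

print(route_category("What does the clinical trial for AMLODIPINE say?"))  # clinical data
print(route_category("What are the side effects of HYDROCORTISONE?"))      # general
```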


Querying starts here

Queries can ask about:
- General summaries
- Clinical data
- Side effects
- Specific populations affected
- Dosage data

In [62]:
result, context = generate("What does the clinical trial for AMLODIPINE say?")

Device set to use cuda:0


In [63]:
result

'\n   The clinical trials for amlodipine evaluated its safety and effectiveness in more than 11,000 patients in the U.S. and foreign trials. The trials compared amlodipine to placebo and found that discontinuation of amlodipine due to adverse reactions was required in only about 1.5% of patients, which was not significantly different from placebo. The most commonly reported side effects were edema, dizziness, flushing, palpitations, and somnolence. The incidence of these side effects increased with dose, but most were mild or moderate in severity. The trials also reported other adverse experiences associated with amlodipine treatment, some of which occurred more frequently in women than men. Amlodipine was not associated with clinically significant changes in routine laboratory tests, and no clinically relevant changes were noted in serum potassium, serum glucose, total triglycerides, total cholesterol, HDL cholesterol, uric acid, blood urea nitrogen, or creatinine.'
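
The returned string keeps the newline and indentation that follow `Answer:` in the prompt template. A small post-processing helper, sketched below, could trim that whitespace and fall back to the full text when the marker is missing. `clean_answer` and the sample string are hypothetical, not part of the notebook's code.

```python
def clean_answer(generated, marker="Answer:"):
    """Return the text after the first marker, stripped of surrounding whitespace."""
    _, found, tail = generated.partition(marker)
    # If the marker never appears, return the whole text rather than None.
    return tail.strip() if found else generated.strip()

# Placeholder text standing in for a model response.
raw = "# Question: What is X?\n\n  Answer:\n   X is a placeholder.\n"
print(repr(clean_answer(raw)))  # 'X is a placeholder.'
```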

In [64]:
context

['Name: AMLODIPINE BESYLATE | Clinical Trials: Because clinical trials are conducted under widely varying conditions, adverse reaction rates observed in the clinical trials of a drug cannot be directly compared to rates in the clinical trials of another drug and may not reflect the rates observed in practice. Amlodipine has been evaluated for safety in more than 11,000 patients in U.S. and foreign clinical trials. In general, treatment with amlodipine was well-tolerated at doses up to 10 mg daily. Most adverse reactions reported during therapy with amlodipine were of mild or moderate severity. In controlled clinical trials directly comparing amlodipine (N=1730) at doses up to 10 mg to placebo (N=1250), discontinuation of amlodipine because of adverse reactions was required in only about 1.5% of patients and was not significantly different from placebo (about 1%). The most commonly reported side effects more frequent than placebo are reflected in the table below. The incidence (%) of si

In [14]:
result, context = generate("What are the side effects of ACETAMIINOPHEN")

Device set to use cuda:0


In [15]:
result

'\n   The context does not provide enough information about side effects.'

In [16]:
context

['Summary of ACETAMINOPHEN | Uses: Uses • temporarily relieves minor aches and pains due to: • headache • muscular aches • backache • toothache • minor pain of arthritis • the common cold • premenstrual and menstrual cramps • temporarily reduces fever and  |  and  | Reactions: ']
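
The misspelled query still retrieved the ACETAMINOPHEN summary, but nothing guarantees that for every typo. One possible pre-processing step, sketched here with the standard library's `difflib`, is to snap query tokens to known drug names before retrieval. `KNOWN_DRUGS` copies a few single-word names from this notebook's example list; `normalize_drug_names` and the 0.85 cutoff are hypothetical choices, and multi-word names would need extra handling.

```python
import difflib
import re

# A few single-word names taken from the notebook's example list.
KNOWN_DRUGS = ["BENZOCAINE", "NITROGLYCERIN", "GLYCERIN", "ACETAMINOPHEN",
               "MENTHOL", "HYDROCORTISONE", "AMLODIPINE", "LINEZOLID"]

def normalize_drug_names(query, known=KNOWN_DRUGS, cutoff=0.85):
    """Replace tokens that closely match a known drug name with that name."""
    def fix(match):
        token = match.group(0)
        close = difflib.get_close_matches(token.upper(), known, n=1, cutoff=cutoff)
        return close[0] if close else token
    # Only alphabetic runs are considered; punctuation is left untouched.
    return re.sub(r"[A-Za-z]+", fix, query)

print(normalize_drug_names("What are the side effects of ACETAMIINOPHEN"))
# What are the side effects of ACETAMINOPHEN
```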

In [5]:
result, context = generate("What are the side effects of HYDROCORTISONE?")

You're using a BertTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
Device set to use cuda:0


In [6]:
result

'\n  1. Fluid and electrolyte disturbances: sodium retention, fluid retention, congestive heart failure, potassium loss, hypokalemic alkalosis, hypertension\n  2. Musculoskeletal: muscle weakness, steroid myopathy, loss of muscle mass, osteoporosis, tendon rupture (particularly Achilles tendon), aseptic necrosis of femoral and humeral heads, pathologic fracture of long bones\n  3. Gastrointestinal: peptic ulcer with possible perforation and hemorrhage, pancreatitis, abdominal distention, ulcerative esophagitis, impaired wound healing, thin fragile skin, petechiae and ecchymoses, facial erythema, increased sweating\n  4. Dermatologic: increased photosensitivity, skin atrophy, facial acne, delayed wound healing, thin fragile skin, petechiae and ecchymoses, facial erythema, increased sweating\n  5. Neurological: increased intracranial pressure with papilledema, usually after treatment, convulsions, vertigo, headache, epidural lipomatosis\n  6. Endocrine: development of Cushingoid state, s

In [7]:
context

["Summary of HYDROCORTISONE | Uses: INDICATIONS AND USAGE Hydrocortisone tablets are indicated in the following conditions. 1. Endocrine Disorders Primary or secondary adrenocortical insufficiency (hydrocortisone or cortisone is the first choice; synthetic analogs may be used in conjunction with mineralocorticoids where applicable; in infancy mineralocorticoid supplementation is of particular importance) Congenital adrenal hyperplasia Non suppurative thyroiditis Hypercalcemia associated with cancer 2. Rheumatic Disorders As adjunctive therapy for short-term administration (to tide the patient over an acute episode or exacerbation) in: Psoriatic arthritis Rheumatoid arthritis, including juvenile rheumatoid arthritis (selected cases may require low-dose maintenance therapy) Ankylosing spondylitis Acute and subacute bursitis Acute nonspecific tenosynovitis Acute gouty arthritis Post-traumatic osteoarthritis Synovitis of osteoarthritis Epicondylitis 3. Collagen Diseases During an exacerbat

In [11]:
result, context = generate("What is BENZOCAINE used for?")

Device set to use cuda:0


In [12]:
result

'\n  \n  BENZOCAINE is used for the temporary relief of pain due to toothaches.'

In [13]:
context

['Summary of BENZOCAINE | Uses: Use for the temporary relief of pain due to toothaches * to help protect against infection of minor oral irritation and  |  and  | Reactions: ']

In [9]:
result, context = generate("What does Loratodine do?")

Device set to use cuda:0


In [10]:
result

'\n  \n  Loratadine, a medication used for allergies, temporarily relieves symptoms such as runny nose, sneezing, itchy and watery eyes, itching of the nose or throat, and other upper respiratory allergies.'

In [11]:
context

['Summary of LORATADINE | Uses: Uses temporarily relieves these symptoms due to hay fever or other upper respiratory allergies • runny nose • sneezing • itchy, water eyes • itching of the nose or throat and  |  and  | Reactions: ']


# Example List:

- BENZOCAINE
- NITROGLYCERIN
- GLYCERIN
- LUBIPROSTONE
- CYCLOBENZAPRINE HYDROCHLORIDE
- HYDROCHLOROTHIAZIDE
- ACETAMINOPHEN
- MENTHOL
- HYDROCORTISONE
- AMLODIPINE
- LINEZOLID