<a href="https://colab.research.google.com/github/Chirag314/QA-chatbot/blob/main/Adam_Chatbot.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This notebook is based on the following Medium article:
https://medium.com/@Siddharth.jh/conversational-chat-bot-using-open-source-llm-model-dolly-2-0-with-added-memory-acfacc13a69e

In [1]:
!pip install langchain
!pip install pypdf
!pip install sentence_transformers #For embedding
!pip install chromadb #Store text data and embedding
!pip install --upgrade accelerate
!pip install bitsandbytes #8-bit model loading



In [2]:
from langchain.docstore.document import Document
import os
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import CharacterTextSplitter

from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma

In [3]:
loader=PyPDFLoader('/content/ADaMIG_v1.3.pdf')
documents=loader.load()
text_splitter=CharacterTextSplitter(chunk_size=400,chunk_overlap=40)
documents=text_splitter.split_documents(documents)
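
To make chunk_size=400 and chunk_overlap=40 concrete, here is a minimal sketch of a fixed-size sliding window over characters. It is a simplification and not how CharacterTextSplitter is implemented: the real splitter first cuts on a separator (a blank line by default) and then merges the pieces, so it can return chunks longer than chunk_size when a page contains no separator. The function name split_with_overlap is hypothetical.

```python
def split_with_overlap(text, chunk_size=400, chunk_overlap=40):
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "x" * 1000
chunks = split_with_overlap(sample)
print([len(c) for c in chunks])  # prints [400, 400, 280]
```

Each chunk repeats the last 40 characters of the previous one, so a sentence cut at a chunk boundary still appears whole in at least one chunk.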

In [4]:

!pip install chromadb
from langchain.vectorstores import Chroma



In [5]:
# Embed every chunk and store the vectors in a Chroma collection persisted under /content
pdf_vector_db_path="/content/"
hf_embed=HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
vector_db=Chroma.from_documents(collection_name="document_docs",documents=documents,embedding=hf_embed,persist_directory=pdf_vector_db_path)
vector_db.similarity_search("dummy") # quick smoke test of the index
vector_db.persist()
# Reopen the persisted collection; the embedding model is reused rather than loaded a second time
db=Chroma(collection_name="document_docs",embedding_function=hf_embed,persist_directory=pdf_vector_db_path)

The text chunks and their embeddings are now stored in the database.

In [6]:
def get_similar_docs(question,similar_doc_count):
  return vector_db.similarity_search(question,k=similar_doc_count)
for doc in get_similar_docs("What are different domain types in CDISC ADaM guide?",2):
  print(doc.page_content)

CDISC Analysis Data Model Implementation Guide ( 1.3 Final)  
© 2021 Clinical Data Interchange Standards Consortium, Inc. All rights reserved   Page 8 
2021-11-29 • An analysis dataset that does not follow the ADaM fundamental principles  
It is important not to refer to non -ADaM analysis datasets as "ADaM datasets." 
To prevent confusion, non -ADaM analysis dataset names should not start with the prefix AD. It is good practice to 
start the names of non -ADaM analysis datasets with the two -letter prefix "AX" (see Figure 1.6.1).  
Currently ADaM has 3 structures: ADSL, BDS, and  OCCDS, which correspond to the SUBJECT LEVEL 
ANALYSIS DATASET, BASIC DATA STRUCTURE, and OCCURRENCE DATA STRUCTURE classes of 
ADaM datasets. Analysis datasets that follow the ADaM fundamental principles  and other ADaM conventions, but 
which do not follow one of the 3 defined structures (ADSL, BDS, OCCDS), are considered to be ADaM datasets 
with a class of ADAM OTHER. Controlled Terminology for the class 

Using the function above, we can fetch the chunks relevant to our question from the database. When we ask a question, an embedding is created for it with the same model, and a similarity score is computed against every stored chunk. We can then fetch the n chunks most similar to the question.
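
The ranking step can be sketched in plain Python. This is an illustration only: the three-dimensional vectors stand in for real embeddings, the scan is brute force, and cosine similarity is an assumption here, since Chroma's own index and distance metric are not shown. The names cosine_similarity, top_k, toy_chunks and toy_question are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(question_vec, chunk_vecs, k):
    """Score every chunk against the question and return the k best names."""
    scored = [(cosine_similarity(question_vec, vec), name)
              for name, vec in chunk_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Toy vectors standing in for real embeddings (made-up numbers)
toy_chunks = {
    "chunk about ADSL": [0.9, 0.1, 0.0],
    "chunk about BDS": [0.7, 0.6, 0.1],
    "chunk about licensing": [0.0, 0.1, 0.9],
}
toy_question = [1.0, 0.2, 0.0]
print(top_k(toy_question, toy_chunks, k=2))
```
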
Let's create the pipeline for our model.

In [7]:
from transformers import AutoTokenizer, AutoModelForCausalLM,pipeline
import torch
from langchain import PromptTemplate
from langchain.llms import HuggingFacePipeline
from langchain.chains.question_answering import load_qa_chain

We have to define the prompt template in the format the model expects. The template consists of an instruction, the context, and the question.

In [13]:
template=""" Below is an instruction that describes a task. Write a response that appropriately completes the request.

  Instruction:
  You know about CDISC ADaM specifiction ,provide best answer for question.
  Use only information in the following paragraphs to answer the question.
  Explain the answer with reference to these paragraphs.
  If you don't have the information in paragraph then give response "I don't know"

  {context}

  Question: {question}

  Response:
  """

def build_qa_chain():
  torch.cuda.empty_cache()
  model_name="databricks/dolly-v2-3b"  # the larger dolly-v2-7b or dolly-v2-12b can be used if GPU memory allows
  # Increase max_new_tokens for a longer response
  instruct_pipeline=pipeline(model=model_name,torch_dtype=torch.bfloat16,trust_remote_code=True,device_map="auto",
                             return_full_text=True,max_new_tokens=256,top_p=0.95,top_k=50,model_kwargs={'load_in_8bit': True})
  # The pipeline loads the model itself, so no separate AutoModelForCausalLM.from_pretrained call is needed;
  # loading a second, unused copy would only take up GPU memory.

  prompt=PromptTemplate(input_variables=['context','question'],template=template)

  hf_pipe=HuggingFacePipeline(pipeline=instruct_pipeline)
  return load_qa_chain(llm=hf_pipe,chain_type='stuff',prompt=prompt,verbose=True)
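
To illustrate what the 'stuff' chain type does with the template, here is a minimal sketch without LangChain: all retrieved chunks are joined into one string and substituted for {context}, and the question fills {question}. The shortened toy_template and the function stuff_prompt are hypothetical stand-ins, not the chain's actual code.

```python
toy_template = """Use only the following paragraphs to answer the question.

{context}

Question: {question}

Response:"""


def stuff_prompt(template, chunks, question):
    """Join all retrieved chunks into one context string and fill the template."""
    context = "\n\n".join(chunks)
    return template.format(context=context, question=question)


prompt_text = stuff_prompt(
    toy_template,
    ["ADaM has 3 structures: ADSL, BDS, and OCCDS.",
     "Non-ADaM dataset names should not start with AD."],
    "How many structures does ADaM have?",
)
print(prompt_text)
```

Because every chunk is pasted into a single prompt, the number and size of the retrieved chunks directly determine how long the model input becomes.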

In [14]:
qa_chain=build_qa_chain()

In [15]:
#Code to display HTML in Jupyter notebook
from IPython.display import HTML, display

def displayHTML(html):
  """Display HTML in Jupyter Notebook"""
  display(HTML(html))

Now we will see how to call the chain, passing our question and the relevant chunks to the model to generate the response.

In [16]:
def answer_question(question):
  similar_docs=get_similar_docs(question,similar_doc_count=3)
  result=qa_chain({"input_documents":similar_docs,"question":question})
  result_html=f"<p><blockquote style=\"font-size:24px\">{question}</blockquote></p>"
  result_html += f"<p><blockquote style=\"font-size:18px\">{result['output_text']}</blockquote></p>"
  result_html+="<p><hr/></p>"

  for d in result["input_documents"]:
    # The source is the path of the loaded PDF, so show it as plain text rather than as a web link
    source_id=d.metadata['source']
    result_html+=f"<p><blockquote>{d.page_content}<br/>(source: {source_id})</blockquote></p>"
  displayHTML(result_html)


We can now call answer_question with our question.

In [17]:
answer_question("What are special purpose datasets?")



> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
 Below is an instruction that describes a task. Write a response that appropriately completes the request.

  Instruction:
  You know about CDISC ADaM specifiction ,provide best answer for question.
  Use only information in the following paragraphs to answer the question.
  Explain the answer with reference to these paragraphs.
  If you don't have the information in paragraph then give response "I don't know"

  CDISC Analysis Data Model Implementation Guide ( 1.3 Final)  
© 2021 Clinical Data Interchange Standards Consortium, Inc. All rights reserved   Page 11 
2021-11-29 Very complex derivations may require the creation of intermediate analysis datasets. In these situations, traceability 
may be accomplished by submitting those intermediate analysis datasets along with the ir associated metadata. 
Traceability would then involve several steps. The an

This is a friendly reminder - the current text generation call will exceed the model's predefined maximum length (2048). Depending on the model, you may observe exceptions, performance degradation, or nothing at all.



> Finished chain.

> Finished chain.
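
The length warning in the output above means the formatted prompt plus the requested new tokens went past the model's 2048-token limit. One possible guard, sketched here under the rough assumption of about four characters per token (a heuristic, not the model's tokenizer), is to keep the best-ranked chunks only while the prompt still fits. The names CHARS_PER_TOKEN, estimate_tokens and fit_chunks_to_budget are hypothetical.

```python
CHARS_PER_TOKEN = 4  # rough heuristic; a real tokenizer gives exact counts


def estimate_tokens(text):
    """Approximate token count from the character count."""
    return len(text) // CHARS_PER_TOKEN + 1


def fit_chunks_to_budget(chunks, fixed_prompt, max_length=2048, max_new_tokens=256):
    """Keep chunks in ranked order while prompt plus reply still fit in max_length tokens."""
    budget = max_length - max_new_tokens - estimate_tokens(fixed_prompt)
    kept = []
    for chunk in chunks:
        cost = estimate_tokens(chunk)
        if cost > budget:
            break  # this chunk and all lower-ranked ones are dropped
        kept.append(chunk)
        budget -= cost
    return kept


ranked_chunks = ["a" * 4000, "b" * 4000, "c" * 4000]
kept = fit_chunks_to_budget(ranked_chunks, fixed_prompt="p" * 800)
print(len(kept))  # prints 1
```

In the notebook itself, counting with the model's own tokenizer would be more accurate than the character heuristic.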


Adding memory to make the bot conversational

In [19]:
from transformers import AutoTokenizer, AutoModelForCausalLM,pipeline,AutoModelForSeq2SeqLM
from langchain import PromptTemplate
from langchain.llms import HuggingFacePipeline
from langchain.chains.question_answering import load_qa_chain
from langchain.memory import ConversationSummaryBufferMemory

Here we create a buffer memory with LangChain to store the previous conversation. Most LLMs, however, accept only a limited number of input tokens, so instead of sending the full conversation we send a summary of it.
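
The idea behind a summary buffer can be sketched without LangChain: recent turns are kept verbatim, and once the buffer grows past a token limit the oldest turns are folded into a running summary. This is an illustration of the concept, not LangChain's implementation. The class SummaryBuffer, its word-count token estimate, and the truncating toy_summarize are all hypothetical stand-ins (the notebook uses t5-small as the summarizer).

```python
class SummaryBuffer:
    def __init__(self, summarize, max_token_limit=300):
        self.summarize = summarize  # callable: text -> shorter text
        self.max_token_limit = max_token_limit
        self.summary = ""
        self.turns = []

    @staticmethod
    def count_tokens(text):
        return len(text.split())  # crude word count instead of a tokenizer

    def add_turn(self, human, ai):
        self.turns.append(f"Human: {human}\nAI: {ai}")
        # Move the oldest turns into the summary until the buffer fits again
        while len(self.turns) > 1 and sum(map(self.count_tokens, self.turns)) > self.max_token_limit:
            oldest = self.turns.pop(0)
            self.summary = self.summarize((self.summary + "\n" + oldest).strip())

    def chat_history(self):
        parts = ([f"Summary: {self.summary}"] if self.summary else []) + self.turns
        return "\n".join(parts)


def toy_summarize(text, max_words=20):
    """Stand-in summarizer: keeps only the first max_words words."""
    return " ".join(text.split()[:max_words])


memory = SummaryBuffer(toy_summarize, max_token_limit=30)
memory.add_turn("What is ADSL?", "ADSL is the subject-level analysis dataset " + "detail " * 20)
memory.add_turn("And BDS?", "BDS is the basic data structure.")
print(memory.chat_history())
```

After the second turn the first exchange survives only as a shortened summary, while the latest exchange is still available word for word.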

In [23]:
def build_qa_chain():
  torch.cuda.empty_cache()
  template=""" You are a chatbot having voncersaion with a human . Your job is to help providing the best answer for questions on CDISC ADaM IG specificaations.
  Use only information in the following paragraphs to answer the question at the end. Explain the answer with reference to these paragraphs.
  If you don't have relevent informtion in paragraph then give response "I don't know".

  {context}

  {chat_history}

  {human_input}

  Response:

  """

  prompt=PromptTemplate(input_variables=['context','human_input','chat_history'],template=template)

  model_name="databricks/dolly-v2-3b"
  instruct_pipeline=pipeline(model=model_name, torch_dtype=torch.bfloat16, trust_remote_code=True, device_map="auto",
                               return_full_text=True, max_new_tokens=256, top_p=0.95, top_k=50, model_kwargs={'load_in_8bit': True})
  hf_pipe=HuggingFacePipeline(pipeline=instruct_pipeline)

  # t5-small summarizes older turns so the chat history stays short
  summarize_model=AutoModelForSeq2SeqLM.from_pretrained("t5-small", device_map="auto", torch_dtype=torch.bfloat16, trust_remote_code=True)
  summarize_tokenizer=AutoTokenizer.from_pretrained("t5-small",padding_side="left",model_max_length=512)
  pipe_summary=pipeline("summarization",model=summarize_model,tokenizer=summarize_tokenizer)

  hf_summary=HuggingFacePipeline(pipeline=pipe_summary)

  memory=ConversationSummaryBufferMemory(llm=hf_summary,memory_key="chat_history", input_key="human_input", max_token_limit=300, human_prefix = "", ai_prefix = "")

  print("Loading chain, this can take some time.....")

  return load_qa_chain(llm=hf_pipe,chain_type="stuff", prompt=prompt, verbose=True, memory=memory)

In [25]:
class ChatBot():
  def __init__(self, db):
    self.reset_context()
    self.db = db

  def reset_context(self):
    self.sources = []
    self.discussion = []
    # Building the chain will load Dolly and can take some time depending on the model size and your GPU
    self.qa_chain = build_qa_chain()
    displayHTML("<h1>Hi! I'm a chat bot specialized in CDISC documents. How can I help you today?</h1>")

  def get_similar_docs(self, question, similar_doc_count):
    return self.db.similarity_search(question, k=similar_doc_count)

  def chat(self, question):
    # Search with the last 3 questions so a follow-up keeps the topic of the conversation
    self.discussion.append(question)
    similar_docs = self.get_similar_docs(" \n".join(self.discussion[-3:]), similar_doc_count=2)
    # Drop chunks already shown for the last questions (they are already in the history).
    # Every chunk of one PDF shares the same 'source', so the page number is part of the key.
    seen = self.sources[-3:]
    similar_docs = [doc for doc in similar_docs if (doc.metadata['source'], doc.metadata.get('page')) not in seen]

    result = self.qa_chain({"input_documents": similar_docs, "human_input": question})
    # Clean up the answer for better display: uppercase the first letter only, since
    # str.capitalize() would also lowercase the rest (e.g. "ADaM" -> "adam")
    answer = result['output_text'].strip()
    answer = answer[:1].upper() + answer[1:]
    result_html = f"<p><blockquote style=\"font-size:24px\">{question}</blockquote></p>"
    result_html += f"<p><blockquote style=\"font-size:18px\">{answer}</blockquote></p>"
    result_html += "<p><hr/></p>"
    for d in result["input_documents"]:
      source_id = d.metadata["source"]
      self.sources.append((source_id, d.metadata.get("page")))
      result_html += f"<p><blockquote>{d.page_content}<br/>(Source: {source_id})</blockquote></p>"
    displayHTML(result_html)

chat_bot = ChatBot(vector_db)

Loading chain, this can take some time.....
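
The two retrieval tricks in the chat method can be sketched on their own: the search query is built from the last three questions, and chunks shown recently are skipped. Plain dictionaries stand in for LangChain documents, and the names build_search_query and drop_seen are hypothetical.

```python
def build_search_query(discussion, window=3):
    """Join the most recent questions so a vague follow-up keeps its topic."""
    return " \n".join(discussion[-window:])


def drop_seen(docs, seen_ids, window=3):
    """Skip documents whose id was already shown in the last few answers."""
    recent = set(seen_ids[-window:])
    return [doc for doc in docs if doc["id"] not in recent]


discussion = ["Do you know about CDISC standards?", "Can you explain me about that?"]
query = build_search_query(discussion)
docs = [{"id": "page-8", "text": "..."}, {"id": "page-11", "text": "..."}]
print(query)
print(drop_seen(docs, seen_ids=["page-8"]))
```

When all chunks come from one PDF they share the same source value, so a filter keyed on the source alone would discard every chunk after the first answer; the sketch therefore uses a per-chunk id.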


In [26]:
#Call bot using below code
def displayHTML(html):
    """Display HTML in Jupyter notebook."""
    from IPython.display import HTML
    display(HTML(html))

#Initiate the bot
chat_bot.chat("Do you know about CDISC standards?")



> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
 You are a chatbot having voncersaion with a human . Your job is to help providing the best answer for questions on CDISC ADaM IG specificaations.
  Use only information in the following paragraphs to answer the question at the end. Explain the answer with reference to these paragraphs.
  If you don't have relevent informtion in paragraph then give response "I don't know".

  CDISC Analysis Data Model Implementation Guide ( 1.3 Final)  
© 2021 Clinical Data Interchange Standards Consortium, Inc. All rights reserved   Page 88 
2021-11-29 Appendix C:  Representations and Warranties, Limitations of 
Liability, and Disclaimers 
CDISC Patent Disclaimers  
It is possible that implementation of and complian ce with this standard may require use of subject matter covered by 
patent rights. By publication of this standard, no position is taken with respect to th



OutOfMemoryError: ignored

In [27]:
chat_bot.chat("Can you explain me about that?")



> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
 You are a chatbot having voncersaion with a human . Your job is to help providing the best answer for questions on CDISC ADaM IG specificaations.
  Use only information in the following paragraphs to answer the question at the end. Explain the answer with reference to these paragraphs.
  If you don't have relevent informtion in paragraph then give response "I don't know".

  CDISC Analysis Data Model Implementation Guide ( 1.3 Final)  
© 2021 Clinical Data Interchange Standards Consortium, Inc. All rights reserved   Page 88 
2021-11-29 Appendix C:  Representations and Warranties, Limitations of 
Liability, and Disclaimers 
CDISC Patent Disclaimers  
It is possible that implementation of and complian ce with this standard may require use of subject matter covered by 
patent rights. By publication of this standard, no position is taken with respect to th



OutOfMemoryError: ignored
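
The OutOfMemoryError above means the GPU ran out of memory during the call. A rough estimate of the memory taken by the model weights alone helps judge how much headroom is left. The sketch assumes about 2.8 billion parameters for dolly-v2-3b and ignores activations and the generation cache, so real usage is higher; the names BYTES_PER_PARAM, weight_memory_gib and DOLLY_3B_PARAMS are hypothetical.

```python
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}


def weight_memory_gib(num_params, dtype):
    """Memory needed for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


DOLLY_3B_PARAMS = 2.8e9  # assumption: approximate parameter count

for dtype in ("bfloat16", "int8"):
    print(dtype, round(weight_memory_gib(DOLLY_3B_PARAMS, dtype), 1), "GiB per loaded copy")
```

Each call to build_qa_chain loads another copy of the model, so an earlier chain that is still referenced keeps its copy on the GPU. A plausible remedy, depending on the GPU, is to delete the earlier chain (for example `del qa_chain`) and call `torch.cuda.empty_cache()` before building a new one.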