# AfyaMum Bot: AI-powered pregnancy support chatbot

## Introduction
This notebook combines Llama 2, an openly available large language model, with a reference document on maternal reproductive health to give expectant mothers tailored information and support. The first part prototypes the retrieval prompts with OpenAI's GPT-3.5; the second part builds the bot on a locally hosted Llama 2 model. The goal is to offer personalized insights, risk assessment, prevention methods, and guidance for a healthier pregnancy.

## How it works
1. **Data Input**: Expectant mothers describe how they feel through WhatsApp.

2. **Personalization**: Llama 2 will process the input data to generate personalized recommendations and assessments.

The Personalized Maternal Reproductive Health Guide, powered by Llama 2, aims to give expectant mothers the knowledge, tools, and support needed for a healthy and informed pregnancy.
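
The two steps above can be sketched as a single message handler. This is a minimal sketch only: `handle_message` and `echo_bot` are hypothetical names, the WhatsApp transport is left out entirely, and `bot` stands in for whatever callable produces the answer (for example, a retrieval chain wrapped in a function).

```python
from typing import Callable


def handle_message(text: str, bot: Callable[[str], str]) -> str:
    """Pass one incoming message to the bot and return its reply.

    Hypothetical helper: the messaging transport (e.g. WhatsApp) is out of scope.
    """
    cleaned = text.strip()
    if not cleaned:
        # Nothing to analyse, so ask the user to describe how she feels.
        return "Please describe how you are feeling."
    return bot(cleaned)


def echo_bot(question: str) -> str:
    """Stand-in for the real model so the sketch runs without any dependencies."""
    return f"Received: {question}"


print(handle_message("  I have had headaches for two days  ", echo_bot))
```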


In [1]:
#!pip install python-dotenv
#!pip install openai

In [2]:
#!pip install --upgrade langchain

In [3]:
#!pip install chromadb

## Using OpenAI's GPT-3.5 to do prompt engineering

### 1.0: Loading the required packages

In [4]:
import os
import openai
import sys
sys.path.append('../..')
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())

openai.api_key = os.environ['OPENAI_API_KEY']

In [5]:
import datetime
current_date = datetime.datetime.now().date()
if current_date < datetime.date(2023, 9, 2):
    llm_name = "gpt-3.5-turbo-0301"
else:
    llm_name = "gpt-3.5-turbo"
print(llm_name)

gpt-3.5-turbo


In [6]:
langchain_tracing_v2 = os.environ['LANGCHAIN_TRACING_V2']
langchain_endpoint = os.environ['LANGCHAIN_ENDPOINT']
langchain_api_key = os.environ['LANGCHAIN_API_KEY']

In [8]:
from langchain.vectorstores import Chroma
from langchain.embeddings.openai import OpenAIEmbeddings
persist_directory = 'docs/chroma/'
embedding = OpenAIEmbeddings()
vectordb = Chroma(persist_directory=persist_directory, embedding_function=embedding)

In [None]:
from langchain.chat_models import ChatOpenAI
llm = ChatOpenAI(model_name=llm_name, temperature=0)
llm.predict("Hello!")

'Hello! How can I assist you today?'

In [None]:
# !pip install tiktoken

In [None]:
# Build prompt
from langchain.prompts import PromptTemplate
template = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer. Use three sentences maximum. Keep the answer as concise as possible. Always say "Thanks for asking and I'm optimistic everything will be well!" at the end of the answer. 
{context}
Question: {question}
Helpful Answer:"""
QA_CHAIN_PROMPT = PromptTemplate(input_variables=["context", "question"],template=template,)

# Run chain
from langchain.chains import RetrievalQA
question = "What are the major causes of pregnancy complications?"
qa_chain = RetrievalQA.from_chain_type(llm,
                                       retriever=vectordb.as_retriever(),
                                       return_source_documents=True,
                                       chain_type_kwargs={"prompt": QA_CHAIN_PROMPT})


result = qa_chain({"query": question})
result["result"]

"The major causes of pregnancy complications can include pre-existing health conditions, such as diabetes or high blood pressure, infections during pregnancy, and problems with the placenta or umbilical cord. Thanks for asking and I'm optimistic everything will be well!"
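
Because the chain was built with `return_source_documents=True`, `result` should also carry a `source_documents` list next to `result["result"]`. The sketch below shows one way to summarise it. `summarize_sources` is a hypothetical helper, `guide.pdf` is a made-up file name, and `FakeDoc` is a stand-in that only mimics the `page_content` and `metadata` attributes of a LangChain document so that the example runs on its own.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDoc:
    """Stand-in exposing the two attributes the helper reads."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def summarize_sources(result: dict, max_chars: int = 80) -> list:
    """Return one short line per retrieved chunk: source, page, text preview."""
    lines = []
    for doc in result.get("source_documents", []):
        source = doc.metadata.get("source", "unknown")
        page = doc.metadata.get("page", "?")
        # Collapse whitespace so multi-line chunks fit on one line.
        preview = " ".join(doc.page_content.split())[:max_chars]
        lines.append(f"{source} (page {page}): {preview}")
    return lines


demo = {
    "result": "...",
    "source_documents": [
        FakeDoc("Placental problems\ncan complicate pregnancy.",
                {"source": "guide.pdf", "page": 3}),
    ],
}
for line in summarize_sources(demo):
    print(line)
```

Checking the retrieved chunks this way makes it easier to tell whether an answer is grounded in the document or comes from the model alone.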

### 1.1: Storing chat history in memory

In [None]:
from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True
)

### 1.2: Implementing ConversationalRetrievalChain

In [None]:
from langchain.chains import ConversationalRetrievalChain
retriever=vectordb.as_retriever()
qa = ConversationalRetrievalChain.from_llm(
    llm,
    retriever=retriever,
    memory=memory
)

In [None]:
question = "Is infections during a pregnancy a major cause of pregnancy complications and what number of expectant women encounter these complications?"
result = qa({"question": question})

In [None]:
result['answer']

'Infections during pregnancy can indeed be a major cause of pregnancy complications. Certain infections, such as urinary tract infections, bacterial vaginosis, and sexually transmitted infections, can increase the risk of preterm labor, premature rupture of membranes, and low birth weight. Other infections, like rubella, cytomegalovirus, and Zika virus, can lead to birth defects in the baby.\n\nAs for the number of expectant women who encounter these complications, it can vary depending on various factors such as geographic location, socioeconomic status, access to healthcare, and individual health conditions. It is challenging to provide an exact number without specific data or statistics. It is always recommended for pregnant women to seek regular prenatal care and follow the guidance of healthcare professionals to minimize the risk of infections and associated complications.'

In [None]:
question = "Which infections among the following: urinary tract infections, bacterial vaginosis, and sexually transmitted infections is the major cause of complications in pregnancy?"
result = qa({"question": question})

In [None]:
result['answer']

'Among the infections listed, sexually transmitted infections (STIs) are generally considered to be the major cause of complications in pregnancy. STIs such as chlamydia, gonorrhea, syphilis, and HIV can pose significant risks to both the mother and the developing fetus if left untreated. It is important for pregnant individuals to receive regular prenatal care and be tested for STIs to prevent complications.'

## Using Llama 2 Model to develop AfyaMum Bot

### Develop AfyaMum Bot

In [9]:
from langchain.chains import ConversationalRetrievalChain, RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import PyPDFLoader, TextLoader
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.memory import ConversationBufferMemory
from langchain.prompts import PromptTemplate
from langchain.text_splitter import CharacterTextSplitter, RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS, DocArrayInMemorySearch

In [None]:
# !pip install pypdf

In [None]:
# !pip install ctransformers

In [10]:
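from langchain.llms import CTransformers


def load_llm():
    """Load the locally downloaded Llama 2 chat model through CTransformers."""
    llm = CTransformers(
        model="llama-2-7b-chat.ggmlv3.q8_0.bin",
        model_type="llama",
        # Generation settings are passed through `config`, the route shown in
        # the LangChain CTransformers documentation.
        config={"max_new_tokens": 512, "temperature": 0.5},
    )
    return llm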

In [11]:
custom_prompt_template = """Use the following pieces of information to answer the user's question in a simple manner which will enable him or her to easily understand.
If you don't know the answer, just say that you don't know, don't try to make up an answer.

Context: {context}
Question: {question}

Only return the helpful answer, risk level e.g high, medium or low risk, and possible remedies if low risk involved
below and nothing else.
Helpful answer for expectant mothers:

Symptom Assessment: Provide an assessment of the symptoms described in the query.

Risk Evaluation: Evaluate the potential risks and complications expectant mothers may be predisposed to based on the conditions presented in the query.

Personalized Guidance: Offer recommended actions, preventive measures, and self-care tips tailored to the user's specific situation.

Specialist Referral: Indicate whether specialized medical attention may be needed and, if so, provide guidance on seeking such care.

Risk Level: Assess the risk level associated with the symptoms or conditions mentioned (e.g., high, medium, or low risk).

Possible Remedies (for Low Risk): If the risk level is low, suggest appropriate remedies or self-care steps.
"""
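
Before handing a template like this to `PromptTemplate`, it can help to check which placeholders it contains and to preview the rendered text. The following is a minimal sketch using only the standard library. `placeholders` and `preview_prompt` are hypothetical helpers, and `demo_template` is a shortened stand-in for `custom_prompt_template`.

```python
from string import Formatter


def placeholders(template: str) -> set:
    """Return the names of all {placeholders} found in the template."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def preview_prompt(template: str, **values: str) -> str:
    """Render the template, failing early if a placeholder has no value."""
    missing = placeholders(template) - set(values)
    if missing:
        raise KeyError(f"Missing values for: {sorted(missing)}")
    return template.format(**values)


# Shortened stand-in for the real template (assumption for illustration).
demo_template = "Context: {context}\nQuestion: {question}\nHelpful answer:"
print(sorted(placeholders(demo_template)))
print(preview_prompt(demo_template, context="(retrieved text)", question="What is anaemia?"))
```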

In [12]:
def set_custom_prompt():
    """
    Prompt template for ConversationalRetrievalChain
    """
    prompt = PromptTemplate(template=custom_prompt_template,
                            input_variables=['context', 'question'])
    return prompt

In [13]:
DB_FAISS_PATH = 'vectorstore/db_faiss'

In [None]:
#!pip install faiss_cpu

In [14]:
def create_vector_db(file):
    # load documents
    loader = PyPDFLoader(file)
    documents = loader.load()
    # split documents
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
    docs = text_splitter.split_documents(documents)
    # define embedding
    embeddings = HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2', model_kwargs={'device': 'cpu'})
    # create vector database from data
    # db = DocArrayInMemorySearch.from_documents(docs, embeddings)
    db = FAISS.from_documents(docs, embeddings)
    db.save_local(DB_FAISS_PATH)
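
To see what `chunk_size=500` and `chunk_overlap=50` mean in `create_vector_db`, here is a simplified sketch. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraphs and newlines). `sliding_chunks` is a hypothetical fixed-window helper that only illustrates how consecutive chunks share text.

```python
def sliding_chunks(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list:
    """Cut text into fixed-size windows; each window repeats the tail of the previous one."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


sample = "".join(str(i % 10) for i in range(1200))
chunks = sliding_chunks(sample)
print([len(c) for c in chunks])
print(chunks[0][-50:] == chunks[1][:50])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.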
    

In [15]:
create_vector_db("./Data/Oxford_Gynecology.pdf")


In [None]:
# ConversationalRetrievalChain
def conversational_retrieval_chain(llm, prompt, db, k, chain_type):
    # create a chatbot chain. Memory is managed externally.
    # define retriever
    retriever = db.as_retriever(search_type="similarity", search_kwargs={"k": k})
    qa = ConversationalRetrievalChain.from_llm(
        llm=llm, 
        chain_type=chain_type, 
        retriever=retriever, 
        return_source_documents=True,
        return_generated_question=True,
        combine_docs_chain_kwargs={'prompt': prompt}
    )
    return qa 
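
The retriever above returns the `k` chunks whose embeddings are closest to the query embedding. As a rough illustration of that idea (not the FAISS implementation, and not necessarily the same distance metric), the sketch below ranks toy vectors by cosine similarity. `cosine` and `top_k` are hypothetical helpers and the vectors are made up.

```python
import math


def cosine(a: list, b: list) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list, vectors: dict, k: int = 4) -> list:
    """Return the names of the k vectors most similar to the query."""
    ranked = sorted(vectors, key=lambda name: cosine(query, vectors[name]), reverse=True)
    return ranked[:k]


# Toy two-dimensional "embeddings" (assumption for illustration).
toy = {
    "chunk_a": [1.0, 0.0],
    "chunk_b": [0.7, 0.7],
    "chunk_c": [0.0, 1.0],
    "chunk_d": [-1.0, 0.0],
}
print(top_k([1.0, 0.1], toy, k=2))
```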

In [None]:
# !pip install param

In [None]:
import param

class cbfs(param.Parameterized):
    chat_history = param.List([])
    answer = param.String("")
    db_query = param.String("")
    db_response = param.List([])

    def __init__(self, **params):
        super().__init__(**params)
        # Build the chain once: local Llama 2 model, custom prompt, and the saved FAISS index.
        llm = load_llm()
        prompt = set_custom_prompt()
        embeddings = HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2', model_kwargs={'device': 'cpu'})
        db = FAISS.load_local(DB_FAISS_PATH, embeddings)
        # Retrieve the 4 most similar chunks and "stuff" them into a single prompt.
        self.qa = conversational_retrieval_chain(llm, prompt, db, 4, "stuff")

    def convchain(self, query):
        """Answer one question and record the exchange in chat_history."""
        result = self.qa({"question": query, "chat_history": self.chat_history})
        self.chat_history.append((query, result["answer"]))
        self.db_query = result["generated_question"]
        self.db_response = result["source_documents"]
        self.answer = result['answer']
        return self.answer

## AfyaMum Bot Instance

In [None]:
#!pip install sentence_transformers

In [None]:
#!pip install "langchain[docarray]"

In [None]:
cb = cbfs()
cb.convchain("What is preclampsia?")

'During preterminal cord is a velamented in the placentralges well. Fetal cord extends into the vasa preterminal cord passes over the umbilaterally displacerv\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nBleakin place the fetal vessels traverse the fetal vessels, but does not present between  The following are essential for the umbilamin. or  What can help@ If the cervagain, or velamentary exposed through a velamented. A few mmembarranged in the placentrstretcheducation of vasa preterminal cord is not present on the membraneously found anteriorly soft (Mosted.\n\n\n\n\n\n\n\nIf you should consultation to ensure that is present over the umbilaterally absent or when placent of blood from or percepticants, or other types of the presence. It may cause\nIt has an access.\n\n\n\n\n\nDuring preterm birth defective, or without the fetal cord is a) The placentrstretched, if you can occur in contact your healthcarefulc. In addition'