# LLM using LangChain

In [1]:
!pip install langchain langchain-mistralai

In [2]:
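# Set up the API key (read it from the environment or prompt for it; never hard-code secrets)
import os
from getpass import getpass

if "MISTRAL_API_KEY" not in os.environ:
    os.environ["MISTRAL_API_KEY"] = getpass("Mistral API key: ")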

# Import the ChatMistralAI model from langchain_mistralai
from langchain_mistralai import ChatMistralAI

# Initialize the model
model = ChatMistralAI(model="mistral-large-latest")

In [3]:
from langchain_core.messages import HumanMessage

model.invoke([HumanMessage(content="What's your name?")])

AIMessage(content="I don't have a name. I'm a text-based AI model created by the Mistral AI team. You can give me a name if you'd like! What would you like to call me?", response_metadata={'token_usage': {'prompt_tokens': 9, 'total_tokens': 54, 'completion_tokens': 45}, 'model': 'mistral-large-latest', 'finish_reason': 'stop'}, id='run-9e5d66aa-f024-4d90-a714-6ec9b014fc02-0', usage_metadata={'input_tokens': 9, 'output_tokens': 45, 'total_tokens': 54})

In [4]:
from langchain.prompts import ChatPromptTemplate
from langchain_mistralai import ChatMistralAI
from langchain_core.output_parsers import StrOutputParser

# Define the chat prompt template
chat_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("user", "{input}")
])

# Initialize the model
llm_model = ChatMistralAI(model="mistral-large-latest")

# Create a chain with the model and the prompt template
chain = chat_prompt|llm_model|StrOutputParser()

# Use the invoke method to generate a response
response = chain.invoke({"input": "Provide me with a python code to detect palindrome?"})
print(response)

Certainly! A palindrome is a word, phrase, number, or other sequence of characters that reads the same forward and backward (ignoring spaces, punctuation, and capitalization). Below is a Python code to detect if a given string is a palindrome:

```python
def is_palindrome(s):
    # Remove non-alphanumeric characters and convert to lowercase
    cleaned_string = ''.join(filter(str.isalnum, s)).lower()
    # Check if cleaned string is equal to its reverse
    return cleaned_string == cleaned_string[::-1]

# Example usage
if __name__ == "__main__":
    test_string = "A man, a plan, a canal, Panama"
    if is_palindrome(test_string):
        print(f'"{test_string}" is a palindrome.')
    else:
        print(f'"{test_string}" is not a palindrome.')
```

This code defines a function `is_palindrome` that:
1. Removes non-alphanumeric characters from the input string.
2. Converts the resulting string to lowercase to ensure the comparison is case-insensitive.
3. Checks if this cleaned string is 
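
The `chat_prompt|llm_model|StrOutputParser()` expression in the cell above composes three steps with the `|` operator, so the output of each step becomes the input of the next. The following is a toy sketch of that idea in plain Python. The `Step` class and the three stand-in steps are hypothetical and are not LangChain's implementation; they only show how an overloaded `|` can build a pipeline.

```python
class Step:
    """Toy stand-in for a chain component: wraps a plain function."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # a | b builds a new Step that runs a, then feeds its result to b
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


# Hypothetical stand-ins for the prompt template, the model, and the output parser
fill_prompt = Step(lambda variables: f"user: {variables['input']}")
fake_model = Step(lambda prompt_text: {"content": prompt_text.upper()})
to_string = Step(lambda message: message["content"])

toy_chain = fill_prompt | fake_model | to_string
result = toy_chain.invoke({"input": "hello"})
print(result)  # USER: HELLO
```

As in the real chain, the caller only sees `invoke` on the composed object; the intermediate values (the filled prompt and the message object) stay internal.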

# Retrieval-Augmented Generation (RAG)

### Vector DB

In [5]:
!pip install langchain-community pypdf transformers sentence-transformers langchain-huggingface chromadb==0.3.29

In [6]:
import os
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Path to your PDF directory
pdf_directory = '/home/u3/prasan/cancer'
pdf_files = [os.path.join(pdf_directory, file) for file in os.listdir(pdf_directory) if file.endswith('.pdf')]

# Load and split the documents into pages
documents = []
for pdf_file in pdf_files:
    loader = PyPDFLoader(pdf_file)
    pages = loader.load_and_split()
    documents.extend(pages)

# Re-split the pages into chunks of at most 1200 characters with a 150-character overlap
splitter = RecursiveCharacterTextSplitter(chunk_size=1200, chunk_overlap=150)
split_documents = splitter.split_documents(documents)

In [7]:
print(len(documents))
print(len(split_documents))

115
208
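
To make the `chunk_size` and `chunk_overlap` parameters concrete, here is a minimal sketch of fixed-window splitting with overlap. It is a simplification: `RecursiveCharacterTextSplitter` additionally tries to break at separators such as paragraph and line breaks, which this sketch ignores. The function name `split_with_overlap` is hypothetical.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(chunks)  # ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk starts with the last three characters of the previous one, so a sentence cut at a chunk boundary still appears whole in at least one chunk when it is shorter than the overlap.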


In [8]:
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

# Load the model from HuggingFace's SentenceTransformers to create embeddings
model_name = 'sentence-transformers/all-MiniLM-L6-v2'
huggingface_embeddings = HuggingFaceEmbeddings(model_name=model_name)

# Create the Chroma vector store
persist_directory = 'chroma-vector-store'
if not os.path.exists(persist_directory):
    os.makedirs(persist_directory)

vectorstore = Chroma.from_documents(split_documents, huggingface_embeddings, persist_directory=persist_directory)

# Perform a similarity search for an example query to test the retrieval step of RAG
query = "What should I do when I discover I have cancer?"
results = vectorstore.similarity_search(query, k=3)  # Retrieve the top 3 most similar documents

# Display the results
print("Top 3 most relevant documents:")
for result in results:
    print(f"Page {result.metadata['page']}: {result.page_content[:300]}")  # Print the first 300 characters of the chunk

Top 3 most relevant documents:
Page 7: 2
Learning that you have cancer can come as a shock. How did you react? You may 
have felt numb, frightened, or angry. You may not have believed what the doctor 
was saying. You may have felt all alone, even if your friends and family were in 
the same room with you. These feelings are normal. 
For 
Page 7: 2
Learning that you have cancer can come as a shock. How did you react? You may 
have felt numb, frightened, or angry. You may not have believed what the doctor 
was saying. You may have felt all alone, even if your friends and family were in 
the same room with you. These feelings are normal. 
For 
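
Conceptually, `similarity_search` embeds the query, compares that vector with every stored chunk vector, and returns the `k` closest chunks. The sketch below illustrates this with cosine similarity over hand-made vectors. The three-dimensional vectors and chunk labels are hypothetical (embeddings from `all-MiniLM-L6-v2` have 384 dimensions), and the distance metric Chroma actually uses depends on its configuration, so treat cosine similarity here as an assumption for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vector, stored, k):
    """Return the texts of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in stored]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Hypothetical toy "embeddings" for three chunks
stored = [
    ("coping with a diagnosis", [0.9, 0.1, 0.0]),
    ("nutrition during treatment", [0.1, 0.9, 0.1]),
    ("talking with family", [0.7, 0.2, 0.3]),
]
query_vector = [1.0, 0.0, 0.1]
hits = top_k(query_vector, stored, k=2)
print(hits)  # ['coping with a diagnosis', 'talking with family']
```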


In [9]:
from langchain.chains import RetrievalQA
from langchain.prompts import ChatPromptTemplate

# Convert the vector store to a retriever
retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 3}
)

system_prompt = (
    "You are an assistant. Use the following pieces of retrieved context to answer the question. "
    "If you don't know the answer, say that you don't know. Don't try to make up an answer."
    "\n\n"
    "{context}"
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{question}"),
    ]
)

# Create a Retrieval QA chain with the retriever and language model
llm_model = ChatMistralAI(model="mistral-large-latest")
qa_chain = RetrievalQA.from_chain_type(
    llm=llm_model,
    chain_type="stuff",
    retriever=retriever,
    chain_type_kwargs={"prompt": prompt},
    return_source_documents=True
)

# Example query
question = "What should I do when I discover I have cancer?"
response = qa_chain.invoke({"query": question})
print(f"Answer: {response['result']}")



In [10]:
# Display the retrieved contexts
print("Retrieved Contexts:")
for context_doc in response["source_documents"]:
    print(f"\nContext from Page {context_doc.metadata.get('page', 'Unknown')}:")
    print(context_doc.page_content[:300])  # Display the first 300 characters for readability

# Display the final answer
print("\nAnswer:")
print(response['result'])

Retrieved Contexts:

Context from Page 7:
2
Learning that you have cancer can come as a shock. How did you react? You may 
have felt numb, frightened, or angry. You may not have believed what the doctor 
was saying. You may have felt all alone, even if your friends and family were in 
the same room with you. These feelings are normal. 
For 

Context from Page 7:
2
Learning that you have cancer can come as a shock. How did you react? You may 
have felt numb, frightened, or angry. You may not have believed what the doctor 
was saying. You may have felt all alone, even if your friends and family were in 
the same room with you. These feelings are normal. 
For 

Answer:
Based on the provided context, here are some suggestions on what to do when you discover you have cancer:

1. **Acknowledge and Accept Your Feelings**: Understand that feelings of shock, numbness, fear, anger, or disbelief are normal. It's also normal for these feelings to change often.

2. **Take Time to Process**: The fir
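
With `chain_type="stuff"`, all retrieved chunks are concatenated and placed into the `{context}` slot of a single prompt. The sketch below shows that step in plain Python, with a de-duplication pass added first. The retrieved contexts above contain the same Page 7 chunk twice; one plausible cause (an assumption, not confirmed by the notebook) is that the same documents were added to the persisted Chroma store on more than one run. The helper names `dedupe_chunks` and `stuff_prompt` are hypothetical.

```python
def dedupe_chunks(chunks):
    """Drop exact duplicate chunks while keeping the original order."""
    seen = set()
    unique = []
    for chunk in chunks:
        if chunk not in seen:
            seen.add(chunk)
            unique.append(chunk)
    return unique


def stuff_prompt(system_template, chunks, question):
    """Join the unique chunks and fill them into the {context} placeholder."""
    context = "\n\n".join(dedupe_chunks(chunks))
    return [
        ("system", system_template.format(context=context)),
        ("human", question),
    ]


template = "Use the following context to answer the question.\n\n{context}"
retrieved = [
    "Learning that you have cancer can come as a shock.",
    "Learning that you have cancer can come as a shock.",
    "These feelings are normal.",
]
messages = stuff_prompt(template, retrieved, "What should I do?")
print(messages[0][1])
```

Removing duplicates before stuffing keeps the prompt shorter and leaves room for other, distinct chunks within the same `k`.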

# Chatbot on LangChain (context-aware multi-turn conversational bot)


Before, the retriever received the raw query: query -> retriever -> LLM

Now the query is first rewritten using the conversation history: (query, conversation history) -> LLM -> rephrased query -> retriever -> LLM
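
The flow above can be sketched with stub functions in place of the two LLM calls and the retriever. Everything here is hypothetical (the keyword-based `retrieve`, the string-concatenating `rephrase`, the toy corpus); the point is only to show why a follow-up question needs rewriting before retrieval.

```python
def rephrase(question, chat_history):
    """Stub for the first LLM call: make the question standalone."""
    human_turns = [text for role, text in chat_history if role == "human"]
    if not human_turns:
        return question
    return f"{question} (in the context of: {human_turns[-1]})"


def retrieve(standalone_question, corpus):
    """Stub retriever: return documents whose keyword appears in the question."""
    lowered = standalone_question.lower()
    return [text for keyword, text in corpus.items() if keyword in lowered]


def answer(question, context_docs):
    """Stub for the second LLM call."""
    return f"Answer to {question!r} using {len(context_docs)} document(s)"


def history_aware_rag(question, chat_history, corpus):
    standalone = rephrase(question, chat_history)
    docs = retrieve(standalone, corpus)
    return {"input": question, "context": docs, "answer": answer(question, docs)}


corpus = {
    "cancer": "Doc about coping with cancer.",
    "diet": "Doc about eating well.",
}
history = [
    ("human", "What should I do when I discover I have cancer?"),
    ("ai", "Take time to process the news."),
]
follow_up = "What about treatment options?"

without_history = history_aware_rag(follow_up, [], corpus)
with_history = history_aware_rag(follow_up, history, corpus)
print(without_history["context"])  # []
print(with_history["context"])     # ['Doc about coping with cancer.']
```

On its own, the follow-up question never mentions cancer, so the stub retriever finds nothing; once the history is folded into the question, the relevant document is retrieved.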

In [11]:
from langchain.prompts import ChatPromptTemplate
from langchain.chains import create_history_aware_retriever
from langchain_core.prompts import MessagesPlaceholder

contextualize_q_system_prompt = (
    "Given a chat history and the latest user question which might reference context in the chat history, "
    "formulate a standalone question which can be understood without the chat history. "
    "Do NOT answer the question, just reformulate it if needed and otherwise return it as is."
)

contextualize_q_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", contextualize_q_system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)

history_aware_retriever = create_history_aware_retriever(
    llm_model, retriever, contextualize_q_prompt
)

In [12]:
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains import create_retrieval_chain

qa_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)

question_answer_chain = create_stuff_documents_chain(llm_model, qa_prompt)

# Final RAG chain
rag_chain = create_retrieval_chain(history_aware_retriever, question_answer_chain)

In [None]:
# # Example of a chat history with prior conversation
# chat_history = [
#     ("human", "What is cancer?"),
#     ("assistant", "Cancer is a group of diseases characterized by uncontrolled cell growth.")
# ]

# # New question from the user
# question = "What should I do when I discover I have cancer?"

# # Invoke the RAG chain with the chat history and the new question
# response = rag_chain.invoke({"input": question, "chat_history": chat_history})

# # Print the full response to understand its structure
# print("Full response:", response)

# # Access the answer generated by the chain
# if 'answer' in response:
#     print("Answer:", response['answer'])
# else:
#     print("No answer found in response.")

# # Display the retrieved context (if any)
# if 'context' in response:
#     print("Retrieved Contexts:")
#     for context_doc in response['context']:
#         print(f"\nContext from {context_doc.metadata.get('source', 'Unknown')}, Page {context_doc.metadata.get('page', 'Unknown')}:")
#         print(context_doc.page_content[:300])  # Display first 300 characters for readability
# else:
#     print("No context found.")

In [13]:
from langchain.memory import ConversationBufferMemory

# Initialize memory
memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True
)

# Answer a question with the RAG chain and record the exchange in memory
def get_response(question: str):
    # Get chat history from memory
    chat_history = memory.load_memory_variables({})["chat_history"]
    
    # Run the chain
    response = rag_chain.invoke({
        "input": question,
        "chat_history": chat_history
    })
    
    # Save the interaction to memory
    memory.save_context(
        {"input": question},
        {"output": response["answer"]}
    )
    
    return response["answer"]

In [14]:
question1 = "What should I do when I discover I have cancer?"
answer1 = get_response(question1)
print("Answer 1:", answer1)

Answer 1: The provided context discusses various feelings and reactions people might have upon learning they have cancer, but it does not provide specific advice on what to do upon receiving a cancer diagnosis. Therefore, I don't have a specific answer for you based on the given information. It's always recommended to consult with healthcare professionals for guidance tailored to your specific situation.


In [15]:
question2 = "What about treatment options?"
answer2 = get_response(question2)
print("Answer 2:", answer2)

print("\nChat History:")
chat_history = memory.load_memory_variables({})["chat_history"]
for message in chat_history:
    print(f"{message.type}: {message.content}")

Answer 2: Based on the provided context, here's what you should do regarding treatment options upon discovering you have cancer:

1. **Educate yourself**: Learn as much as you can about the available treatment options for your specific type of cancer.

2. **Discuss with your healthcare team**: Talk openly with your doctor about the potential treatments, their benefits, and side effects. They can provide personalized advice based on your health profile and the specifics of your cancer.

3. **Consider side effects and their management**: Understand that treatments can have side effects, such as impacting fertility or affecting your diet. Many side effects can be managed with the help of your healthcare team, such as adjusting medications or suggesting suitable foods to eat.

4. **Get a second opinion**: If you're unsure about the proposed treatment options, don't hesitate to seek a second opinion from another specialist.

Chat History:
human: What should I do when I discover I have cance

In [None]:
memory.clear()
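
For intuition, a buffer memory of this kind can be pictured as a list of (role, text) pairs that grows by two entries per exchange and is emptied by `clear()`. The class below is a toy sketch with hypothetical names, not the `ConversationBufferMemory` implementation.

```python
class ToyBufferMemory:
    """Toy conversation buffer: stores every turn in order."""

    def __init__(self):
        self.messages = []

    def save_context(self, user_text, ai_text):
        # One exchange adds a human turn followed by an AI turn
        self.messages.append(("human", user_text))
        self.messages.append(("ai", ai_text))

    def load(self):
        # Return a copy so callers cannot mutate the stored history
        return list(self.messages)

    def clear(self):
        self.messages.clear()


toy_memory = ToyBufferMemory()
toy_memory.save_context("What should I do when I discover I have cancer?", "first answer")
toy_memory.save_context("What about treatment options?", "second answer")
history_before_clear = toy_memory.load()
toy_memory.clear()
history_after_clear = toy_memory.load()
print(len(history_before_clear), len(history_after_clear))  # 4 0
```

Because the buffer keeps every turn, the history passed to the chain grows with each question; clearing it starts a fresh conversation.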