
# **Building a Gen AI RAG Chatbot from Scratch**


**Step 1: Install the necessary libraries**

In [None]:
!pip install langchain chromadb pypdf langchain_community sentence_transformers langchain_huggingface pyngrok streamlit langchain-groq


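After the install finishes, it can help to confirm that each package is actually available before moving on. The following is a minimal sketch using only the standard library. The `REQUIRED` list and the `installed_versions` helper are names made up for this illustration; the distribution names are taken from the pip command above.

```python
from importlib import metadata

# Distribution names copied from the pip install command (hypothetical list name).
REQUIRED = [
    "langchain", "chromadb", "pypdf", "langchain-community",
    "sentence-transformers", "langchain-huggingface",
    "pyngrok", "streamlit", "langchain-groq",
]

def installed_versions(names):
    """Return {name: version string, or None if the package is not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions

# Print one line per package so a missing install is easy to spot.
for name, version in installed_versions(REQUIRED).items():
    print(f"{name}: {version or 'MISSING'}")
```

Any package reported as `MISSING` can be installed again with pip before continuing.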

Now that we understand the purposes of Streamlit and ngrok, let's create a chat interface that stores chat history. We will use ngrok to open a tunnel from our Google Colab environment to a public URL, so that others can access our chat application.

Steps to follow:
1) Streamlit runs an app from a single Python script, and ngrok only forwards traffic to it, so we first write our notebook code to a .py file
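Streamlit reruns the whole script from top to bottom on every user interaction, which is why the chat history has to be kept in `st.session_state` rather than in an ordinary variable. As a rough illustration of the data structure involved, here is a plain-Python sketch with no Streamlit dependency. The `add_message` and `render` helpers are hypothetical names, and the `history` list stands in for `st.session_state.messages`.

```python
def add_message(history, role, content):
    """Append one chat turn; role is "user" or "assistant"."""
    history.append({"role": role, "content": content})
    return history

def render(history):
    """Format every stored turn, the way the app replays history on each rerun."""
    return [f'{m["role"]}: {m["content"]}' for m in history]

history = []  # stands in for st.session_state.messages
add_message(history, "user", "What is RAG?")
add_message(history, "assistant", "Retrieval-augmented generation.")
print("\n".join(render(history)))
```

In the app itself, each stored turn is drawn with `st.chat_message` instead of being printed.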

In [None]:

%%writefile app.py
# This cell writes everything below to app.py, the script that Streamlit will run
import os
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_community.document_loaders import TextLoader, PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_groq import ChatGroq
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate
import streamlit as st

# Define constants
GROQ_API_KEY = os.environ["GROQ_API_KEY"]  # Read the Groq API key from the environment (set it in the notebook before launching Streamlit); never hard-code it
PERSIST_DIRECTORY = "/content/chroma_db/"
folder_path="/content/data"
DOCUMENT_PATH = "/content/data/2307.pdf"  # Make sure this path is correct
# Here we define the title for our page or chatbot interface
st.title("RAG Chatbot Interface")

# Note: this function is deliberately not cached
def load_or_create_vector_store(texts, embeddings):
    """Load or create a new Chroma vector store."""
    if os.path.exists(PERSIST_DIRECTORY):
        return Chroma(persist_directory=PERSIST_DIRECTORY, embedding_function=embeddings)

    return Chroma.from_documents(texts, embeddings, persist_directory=PERSIST_DIRECTORY)

def load_documents(folder):
    """Load every PDF in `folder` and split it into overlapping chunks."""
    # Build the splitter once and reuse it for every PDF
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=1000,
        chunk_overlap=200,
        length_function=len,
        separators=["\n\n", "\n", " ", ""]
    )
    all_texts = []  # Chunks collected from all PDFs
    # Iterate over each file in the folder
    for filename in os.listdir(folder):
        # Only process files that end with .pdf (use TextLoader instead for .txt files)
        if filename.endswith(".pdf"):
            file_path = os.path.join(folder, filename)
            loader = PyPDFLoader(file_path)  # Loader for this individual PDF
            documents = loader.load()  # One Document per page
            # Split the loaded pages into smaller chunks
            all_texts.extend(text_splitter.split_documents(documents))
    return all_texts

# Load documents and create embeddings
loading_message = st.empty()  # Create an empty placeholder for the loading message
loading_message.text("Loading documents and setting up embeddings...")  # Set the initial loading message

texts = load_documents(folder_path)
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/msmarco-distilbert-base-v4")
db = load_or_create_vector_store(texts, embeddings)
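# With chromadb 0.4 and later the store is written to PERSIST_DIRECTORY automatically, so no explicit db.persist() call is needed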

# Clear the loading message
loading_message.empty()  # Remove the loading message

# Set up the language model with the API key
llm = ChatGroq(model="llama-3.1-70b-versatile", api_key=GROQ_API_KEY)  # Pass the API key here

# Define the prompt template
template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.

{context}

Question: {question}
Answer: """
QA_CHAIN_PROMPT = PromptTemplate(input_variables=["context", "question"], template=template)

# Set up the RAG pipeline
qa_chain = RetrievalQA.from_chain_type(
    llm,
    retriever=db.as_retriever(),
    chain_type_kwargs={"prompt": QA_CHAIN_PROMPT}
)

# Initialize chat history
if "messages" not in st.session_state:
    st.session_state.messages = []

# Display chat messages from history on app rerun
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

# Accept user input
if prompt := st.chat_input("What is up?"):
    # Add user message to chat history
    st.session_state.messages.append({"role": "user", "content": prompt})
    # Display user message in chat message container
    with st.chat_message("user"):
        st.markdown(prompt)

    # Get the assistant's response (invoke replaces the deprecated direct call on the chain)
    response = qa_chain.invoke({"query": prompt})  # Use the prompt to get the response
    st.session_state.messages.append({"role": "assistant", "content": response['result']})

    # Display assistant response in chat message container
    with st.chat_message("assistant"):
        st.markdown(response['result'])


Overwriting app.py
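To make the retrieval step in `app.py` less of a black box, here is a toy sketch of the idea behind it: rank stored chunks by similarity to the query, then paste the best ones into the prompt template. This is not how Chroma or `RetrievalQA` is implemented. The three-dimensional vectors are made up by hand, cosine similarity is an assumption (the metric actually used depends on the vector store configuration), and `cosine`, `retrieve`, and `build_prompt` are hypothetical helper names.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, store, k=2):
    """Return the k chunk texts whose vectors are most similar to the query."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Shortened version of the template used in app.py.
TEMPLATE = """Use the following pieces of context to answer the question at the end.

{context}

Question: {question}
Answer: """

def build_prompt(question, chunks):
    """Join the retrieved chunks and fill both template slots."""
    return TEMPLATE.format(context="\n\n".join(chunks), question=question)

# Toy store of (chunk text, hand-made embedding) pairs; real embeddings
# would come from the sentence-transformers model.
store = [
    ("Chroma stores embeddings on disk.", [1.0, 0.0, 0.0]),
    ("Streamlit renders the chat UI.", [0.0, 1.0, 0.0]),
    ("ngrok exposes a local port.", [0.0, 0.0, 1.0]),
]
query_vec = [0.9, 0.1, 0.0]  # pretend embedding of the question
chunks = retrieve(query_vec, store, k=2)
prompt = build_prompt("Where are embeddings stored?", chunks)
print(prompt)
```

The printed prompt is the kind of text the language model finally receives: retrieved context first, then the user's question.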


In [None]:
import os
from pyngrok import ngrok

# Read the ngrok auth token from the environment instead of hard-coding it in the notebook
ngrok.set_auth_token(os.environ["NGROK_AUTHTOKEN"])

# Start Streamlit server on a specific port
!nohup streamlit run app.py --server.port 5011 &

# Start ngrok tunnel to expose the Streamlit server
ngrok_tunnel = ngrok.connect(addr='5011', proto='http', bind_tls=True)

# Print the URL of the ngrok tunnel
print(' * Tunnel URL:', ngrok_tunnel.public_url)

nohup: appending output to 'nohup.out'
 * Tunnel URL: https://596e-34-126-182-175.ngrok-free.app
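Because the Streamlit server is started in the background, the tunnel URL may be printed before the app is ready, and the page may fail to load at first. One possible way to handle this is to poll the port before sharing the URL. The sketch below uses only the standard library; `wait_for_port` is a hypothetical helper, and the timeout values are arbitrary.

```python
import socket
import time

def wait_for_port(port, host="127.0.0.1", timeout=30.0, interval=0.5):
    """Return True once host:port accepts TCP connections, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)  # not up yet; try again shortly
    return False

# Example use in the notebook, after starting Streamlit on port 5011:
# if wait_for_port(5011):
#     print("Streamlit is up; the tunnel URL should now load")
```

If the function returns `False`, the contents of `nohup.out` are a good first place to look for a startup error.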
