<a href="https://colab.research.google.com/github/AarifCha/RAG-HF-Langchain/blob/main/Website_RAG_with_Streamlit.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Install Dependencies

Let's first pip install all the dependencies needed for this project.

In [7]:
# !pip install --quiet torch transformers sentence-transformers
# !pip install --quiet langchain langchain_community
# !pip install --quiet pypdf
# !pip install --quiet chromadb
!pip install --quiet streamlit
!pip install --quiet streamlit_chat
# We will use ngrok to host our webapp
!pip install pyngrok



# Writing the .py file for Streamlit

First we will write a StreamlitApp.py file that contains the Streamlit application code that we will later run and host using ngrok.

The layout of the application is pretty simple. We will have two tabs, one where we chat with the bot, and one where we can upload documents. Currently the bot just returns the uppercased version of the input message, except for when the message is "file names", in which case it returns the names of the files that have been uploaded.

Soon I will modify this to have the bot respond using the RAG setup in the HuggingFaceWLangChain.ipynb notebook.
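
The placeholder behaviour described above can be isolated into a pure function, which makes it easy to test outside Streamlit and to swap out for the RAG call later. This is only a minimal sketch: `generate_response` and its `file_names` parameter are hypothetical names and do not appear in the app below, which keeps this logic inline in its `submit` callback.

```python
def generate_response(prompt, file_names):
    """Placeholder bot logic: list uploaded files or echo the prompt in uppercase.

    `generate_response` and `file_names` are illustrative names (assumptions),
    not part of StreamlitApp.py.
    """
    if prompt == "file names":
        # Build a markdown bullet list with one entry per uploaded file name
        lines = ["Here are the file names:"]
        lines += [f"- {name}" for name in file_names]
        return "\n".join(lines) + "\n"
    # Any other message is simply uppercased
    return prompt.upper()


print(generate_response("hello", []))
print(generate_response("file names", ["a.pdf", "b.pdf"]))
```

Keeping the response logic separate from the UI code means that only this one function would need to change once the RAG pipeline is ready.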

In [13]:
%%writefile StreamlitApp.py
import streamlit as st

st.set_page_config(page_title="Chatbot", page_icon=":robot:")
st.title("Chatbot :robot_face:")

# Initialize chat history
if "messages" not in st.session_state:
    st.session_state.messages = []

# Initialize PDF files
if "pdf_files" not in st.session_state.keys():
    st.session_state["pdf_files"] = []

# Initialize uploaded files
if "uploaded_files" not in st.session_state.keys():
    st.session_state["uploaded_files"] = []

# Layout: Two tabs, one for the chat and one for document uploading
tab1, tab2 = st.tabs(["Chat", "Documents"])

# The chat window is cleared out every time we upload a file or
# input new chat. Therefore we need to print out the chat after
# any operation we do. To make it easier we will just make a function
# that prints out the chat history and just call it at the end of
# every operation.
def print_all_chat():
    # This reprints all the chat history
    with tab1:
        with col1:
            for message in reversed(st.session_state.messages):
                with st.chat_message(message["role"]):
                    st.markdown(message["content"])

def files_update():
    # Callback for when a set of files is uploaded.
    # We simply append the files that haven't been previously uploaded
    # to the pdf_files session state and clear out uploaded_files.

    # This will eventually be replaced by code that updates the vector store
    # with the newly uploaded files.

    for uploaded_file in st.session_state["uploaded_files"]:
        if uploaded_file not in st.session_state["pdf_files"]:
            st.session_state["pdf_files"].append(uploaded_file)

    st.session_state["uploaded_files"] = []

    print_all_chat()

def submit():
    # Callback for when a new chat input is submitted
    prompt = st.session_state.chat_input
    st.session_state.chat_input = ""

    # We first check if the input is "file names", if so
    # we print out the names of all the files that have been uploaded.
    # If not, we will simply uppercase the input and return it as a response.

    # This will eventually be replaced by code that uses a RAG to generate a
    # response to the inputted question.

    if prompt == "file names":
        response = "Here are the file names:\n"
        for file in st.session_state["pdf_files"]:
            response += f"- {file.name}\n"
    else:
        response = prompt.upper()

    # Add user message to chat history
    st.session_state.messages.append({"role": "user", "content": prompt})
    # Add assistant response to chat history
    st.session_state.messages.append({"role": "assistant", "content": response})

    # Print out the previous messages
    print_all_chat()

with tab1:
    # In this tab we set up two columns, one for the chat window and one for
    # the clear chat button. When a new chat is inputted we call the submit
    # function to generate a response and reprint the chat.
    col1, col2 = st.columns([4, 1])

    # Column for the chat window
    with col1:
        st.text_input("Enter your message", key="chat_input", on_change=submit)

    # Column for the clear button (clears the chat history)
    with col2:
        if st.button("Clear Chat"):
            st.session_state.messages = []
            st.rerun()

with tab2:
    # In this tab, we will have a form that lets us upload
    # documents. The uploaded documents are processed according
    # to the files_update function.
    with st.form("my-form", clear_on_submit=True):
        file = st.file_uploader("FILE UPLOADER")
        submitted = st.form_submit_button("UPLOAD!")

    # After the upload button has been clicked
    # and a file has been selected, we add the file to the
    # uploaded_files session state and process it.
    if submitted and file is not None:
        st.write("UPLOADED!")
        st.session_state["uploaded_files"] = [file]
        files_update()

Overwriting StreamlitApp.py
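
One possible refinement of `files_update`, assuming that two uploads should count as the same document whenever their names match, is to de-duplicate by file name instead of comparing the uploaded objects themselves. The sketch below is not part of the app: `merge_uploads` is a hypothetical helper, and `SimpleNamespace` objects stand in for uploaded files, relying only on the `.name` attribute that the app already uses.

```python
from types import SimpleNamespace


def merge_uploads(existing, new_files):
    """Return `existing` plus every file in `new_files` whose name is not yet present.

    Hypothetical helper (an assumption, not part of StreamlitApp.py).
    Files are treated as duplicates when their `.name` attributes match.
    """
    seen = {f.name for f in existing}
    merged = list(existing)
    for f in new_files:
        if f.name not in seen:
            merged.append(f)
            seen.add(f.name)
    return merged


# Stand-ins for uploaded files; only the `.name` attribute matters here
stored = [SimpleNamespace(name="report.pdf")]
incoming = [SimpleNamespace(name="report.pdf"), SimpleNamespace(name="notes.pdf")]

print([f.name for f in merge_uploads(stored, incoming)])
```

Matching on name alone would also skip a different file that happens to share a name, so comparing a hash of the file contents may be a safer choice in practice.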


# Hosting the WebApp

Here we will use ngrok to host our webapp. To do so, you will need to create an ngrok account to get an API authtoken. It is free to use, but you must add a credit/debit card to your account. The card will not be charged; this is their way of preventing abuse. For more information, check out their website: https://ngrok.com/

We will use port 8501 because it is the port Streamlit uses by default; it is listed as an unofficial TCP port for Streamlit (https://en.wikipedia.org/wiki/List_of_TCP_and_UDP_port_numbers). It is not a randomly chosen number!
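
If the public link ever shows an ngrok error page instead of the app, one quick thing to check is whether anything is actually listening on the port. The helper below is a small sketch using only the standard library; `is_port_open` is a hypothetical name, and the check assumes the app runs on the same machine as the notebook.

```python
import socket


def is_port_open(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper (an assumption, not part of the notebook).
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise
        return sock.connect_ex((host, port)) == 0


# True only while a server (for example the Streamlit app) is listening on 8501
print(is_port_open(8501))
```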

In [14]:
import getpass
from pyngrok import ngrok, conf

# First we ask the user for their ngrok authtoken in order to
# authenticate with ngrok
print("Enter your authtoken, which can be copied from https://dashboard.ngrok.com/get-started/your-authtoken")
conf.get_default().auth_token = getpass.getpass()

# We then open a tunnel to port 8501 and get its public URL
public_url = ngrok.connect('8501').public_url
print("Here is your website link:\n", public_url)

# Now we simply run the streamlit app on port 8501 and our
# webapp is ready to be used!
!streamlit run --server.port 8501 StreamlitApp.py >/dev/null

Enter your authtoken, which can be copied from https://dashboard.ngrok.com/get-started/your-authtoken
··········
Here is your website link:
 https://b353-34-29-74-191.ngrok-free.app
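
The `!streamlit run` line keeps the cell busy for as long as the app is running. If you would rather keep the notebook usable, one alternative is to start the app as a background process. The sketch below assumes the `streamlit` command is available on the PATH; `build_streamlit_command` and `launch_streamlit` are hypothetical helper names, not part of this notebook.

```python
import subprocess


def build_streamlit_command(script="StreamlitApp.py", port=8501):
    """Build the argument list for running a Streamlit script on a given port.

    Hypothetical helper (an assumption, not part of the notebook).
    """
    return [
        "streamlit", "run", script,
        "--server.port", str(port),
        "--server.headless", "true",  # do not try to open a browser window
    ]


def launch_streamlit(script="StreamlitApp.py", port=8501):
    """Start the app in the background and return the process handle."""
    return subprocess.Popen(
        build_streamlit_command(script, port),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )


print(" ".join(build_streamlit_command()))
```

Calling `process = launch_streamlit()` returns immediately, and `process.terminate()` stops the app when you are done.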
