## Retrieval Augmented Generation for LLM Bots with LangChain

In [1]:
# You can either store the OpenAI key in the "OPENAI_API_KEY" environment variable
# or read it from a config.ini file, as done below.
import configparser
workingFolder=r'C:\Users\jfrancis\OneDrive - GalaxE. Solutions, Inc\GalaxE D Drive\AI Journey\Gen AI'
# Read the configuration file
config = configparser.ConfigParser()
config.read(workingFolder+'\\config.ini')
OPENAI_API_KEY=config.get('General','OPENAI_API_KEY')
ACTIVELOOP_TOKEN=config.get('General','ACTIVELOOP_TOKEN')
ACTIVELOOP_ORG_ID=config.get('General','ACTIVELOOP_ORG_ID')
HUGGINGFACEHUB_API_TOKEN=config.get('General','HUGGINGFACEHUB_API_TOKEN')
GOOGLE_API_KEY=config.get('General','GOOGLE_API_KEY')
GOOGLE_CSE_ID=config.get('General','GOOGLE_CSE_ID')
COHERE_API_KEY=config.get('General','COHERE_API_KEY')
APIFY_API_TOKEN=config.get('General','APIFY_API_TOKEN')
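
The cell above expects a `config.ini` file with a `[General]` section. As a minimal sketch of that layout, the snippet below parses an in-memory sample and looks a key up, preferring an environment variable when one is set. The `get_secret` helper and the placeholder values are hypothetical and not part of the original notebook.

```python
import configparser
import os

# Hypothetical file contents; real keys should never be committed to version control.
SAMPLE_INI = """
[General]
OPENAI_API_KEY = sk-placeholder
COHERE_API_KEY = cohere-placeholder
"""


def get_secret(config, name, section="General"):
    """Return the environment variable `name` if it is set, else the config value."""
    value = os.environ.get(name)
    if value:
        return value
    return config.get(section, name)


config = configparser.ConfigParser()
config.read_string(SAMPLE_INI)  # stands in for config.read(path_to_config_ini)
```

Preferring the environment variable keeps the same code usable on machines where no `config.ini` is present.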

In [2]:
# Obtain the tokens from the OpenAI and Activeloop websites beforehand. Here they are read from config.ini.
import os
os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY
os.environ["ACTIVELOOP_TOKEN"] = ACTIVELOOP_TOKEN
os.environ["APIFY_API_TOKEN"] = APIFY_API_TOKEN
os.environ["COHERE_API_KEY"] = COHERE_API_KEY

# create Deep Lake dataset
# TODO: use your organization id here. (by default, org id is your username)
my_activeloop_org_id = ACTIVELOOP_ORG_ID

In [3]:
#langchain==0.0.208
#deeplake==3.6.5
#openai==0.27.8
#tiktoken==0.4.0
#cohere==4.34.0
#apify-client==1.5.0
#streamlit==1.26.0

### What is Retrieval Augmented Generation (RAG) in AI?

Retrieval Augmented Generation (RAG) is a technique that combines information retrieval with text generation. It handles knowledge-intensive tasks by pulling relevant information from external sources and feeding it into a Large Language Model (LLM). When a RAG system receives an input, it searches specified sources (e.g., Wikipedia or a company knowledge base) for pertinent documents, combines the retrieved text with the input, and generates an answer that can cite its references. Because the knowledge lives in the external sources, new and evolving information can be incorporated by updating those sources rather than retraining the model from scratch. RAG can be combined with fine-tuning, but it extends the model's effective knowledge beyond its training data without requiring it.
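
To make the retrieve-then-generate flow concrete, here is a toy sketch in plain Python. It assumes a bag-of-words cosine similarity in place of real embeddings and stops at building the prompt, so no LLM is called. The documents, function names, and prompt wording are all illustrative assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, k=2):
    """Return the k documents most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d["text"]))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, retrieved):
    """Combine the retrieved passages with the question, keeping source references."""
    context = "\n".join(
        f"[{i + 1}] {d['text']} (source: {d['source']})"
        for i, d in enumerate(retrieved)
    )
    return f"Answer using only the context below.\n\n{context}\n\nQuestion: {question}"


# Hypothetical knowledge base
documents = [
    {"text": "Deep Lake stores embeddings and text for retrieval.", "source": "doc-a"},
    {"text": "Streamlit builds chat user interfaces in Python.", "source": "doc-b"},
    {"text": "Cohere Rerank reorders retrieved passages by relevance.", "source": "doc-c"},
]
question = "Where are embeddings stored for retrieval?"
prompt = build_prompt(question, retrieve(question, documents, k=1))
```

In a real pipeline the resulting `prompt` would be sent to the LLM, and the bag-of-words scoring would be replaced by embedding similarity, as in the steps that follow.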

### Step 1: Loading the Data with RecursiveCharacterTextSplitter

In this stage, we gather the data needed to provide context to the chatbot. We use ApifyWrapper to scrape the content of a specific website. The RecursiveCharacterTextSplitter then splits the data into smaller, manageable chunks. Next, we embed the chunks using CohereEmbeddings, which translates the text into numerical vectors that can be compared by similarity. Lastly, we load the embedded data into Deep Lake.

Helper Functions

    ApifyWrapper(): Scrapes the content from websites.


In [4]:
from langchain.document_loaders import ApifyDatasetLoader
from langchain.utilities import ApifyWrapper
from langchain.document_loaders.base import Document
import os

apify = ApifyWrapper()
loader = apify.call_actor(
    actor_id="apify/website-content-crawler",
    run_input={"startUrls": [{"url": "https://python.langchain.com/docs/get_started/introduction"}]},
    dataset_mapping_function=lambda dataset_item: Document(
        page_content=dataset_item["text"] if dataset_item["text"] else "No content available",
        metadata={
            "source": dataset_item["url"],
            "title": dataset_item["metadata"]["title"]
        }
    ),
)

docs = loader.load()




In [5]:
docs

[Document(page_content='Tuesday, September 26, 2023 \nThe sample return capsule on Sunday. \nImage: NASA. \nRelated articles\n26 September 2023: NASA\'s OSIRIS-REx arrives in Houston, US after returning asteroid samples to Earth\n29 August 2023: US government sues SpaceX, claims hiring discrimination against asylees\n2 July 2023: European Space Agency\'s Euclid telescope launches from Florida, US\n17 May 2023: Scientists: Rock that hit New Jersey home is 4.6 billion-year-old meteorite\n9 May 2023: First NASA TROPICS satellites launch to monitor tropical storms\nCollaborate!\nPillars of Wikinews writing\nWriting an article\nYesterday, a capsule from NASA\'s Origins, Spectral Interpretation, Resource Identification and Security – Regolith Explorer (OSIRIS-REx) spacecraft containing samples from the asteroid Bennu arrived in the US city of Houston, Texas. The capsule was en route to the Johnson Space Center (JSC) after landing in the Utah Test and Training Range the day before. This was N

    RecursiveCharacterTextSplitter(): Splits the scraped content into manageable chunks.

In [6]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

# we split the documents into smaller chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=20, length_function=len
)
docs_split = text_splitter.split_documents(docs)
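
To illustrate what `chunk_size` and `chunk_overlap` mean, here is a simplified sketch that cuts text into fixed-size character windows. It is not how RecursiveCharacterTextSplitter works internally (that class prefers to split on separators such as paragraph and line breaks); the `sliding_chunks` function is a hypothetical stand-in that only shows the overlap idea.

```python
def sliding_chunks(text, chunk_size, chunk_overlap):
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


chunks = sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij']
```

Each chunk repeats the last character of the previous one, so a sentence cut at a boundary still appears with some of its surrounding context in the next chunk.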

In [7]:
docs_split

[Document(page_content="Tuesday, September 26, 2023 \nThe sample return capsule on Sunday. \nImage: NASA. \nRelated articles\n26 September 2023: NASA's OSIRIS-REx arrives in Houston, US after returning asteroid samples to Earth\n29 August 2023: US government sues SpaceX, claims hiring discrimination against asylees\n2 July 2023: European Space Agency's Euclid telescope launches from Florida, US\n17 May 2023: Scientists: Rock that hit New Jersey home is 4.6 billion-year-old meteorite\n9 May 2023: First NASA TROPICS satellites launch to monitor tropical storms\nCollaborate!\nPillars of Wikinews writing\nWriting an article\nYesterday, a capsule from NASA's Origins, Spectral Interpretation, Resource Identification and Security – Regolith Explorer (OSIRIS-REx) spacecraft containing samples from the asteroid Bennu arrived in the US city of Houston, Texas. The capsule was en route to the Johnson Space Center (JSC) after landing in the Utah Test and Training Range the day before. This was NASA


    CohereEmbeddings(): Translates text data into numerical data.
    DeepLake(): Stores and retrieves the transformed data.
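
As a rough illustration of what a vector store does with these embeddings, the sketch below ranks stored vectors by cosine similarity to a query vector. The three-dimensional vectors and the `top_k` helper are hypothetical; the dataset summary in this notebook shows that the Cohere model used here produces 4096-dimensional embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_k(query_vector, store, k):
    """Return the ids of the k stored vectors closest to the query."""
    ranked = sorted(
        store,
        key=lambda doc_id: cosine_similarity(query_vector, store[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy embeddings keyed by chunk id
store = {
    "chunk-1": [1.0, 0.0, 0.0],
    "chunk-2": [0.0, 1.0, 0.0],
    "chunk-3": [0.7, 0.7, 0.0],
}
nearest = top_k([0.9, 0.1, 0.0], store, k=2)
print(nearest)  # ['chunk-1', 'chunk-3']
```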


In [8]:
from langchain.embeddings.cohere import CohereEmbeddings
from langchain.vectorstores import DeepLake

embeddings = CohereEmbeddings(model = "embed-english-v2.0")

username = my_activeloop_org_id # replace with your username from app.activeloop.ai
db_id = 'kb-material'# replace with your database name
DeepLake.force_delete_by_path(f"hub://{username}/{db_id}")

dbs = DeepLake(dataset_path=f"hub://{username}/{db_id}", embedding_function=embeddings)
dbs.add_documents(docs_split)

 

Your Deep Lake dataset has been successfully created!
The dataset is private so make sure you are logged in!


 

Dataset(path='hub://jfrancis/kb-material', tensors=['embedding', 'id', 'metadata', 'text'])

  tensor      htype      shape     dtype  compression
  -------    -------    -------   -------  ------- 
 embedding  embedding  (4, 4096)  float32   None   
    id        text      (4, 1)      str     None   
 metadata     json      (4, 1)      str     None   
   text       text      (4, 1)      str     None   


['88eedeed-822f-11ee-b60b-401c83da435e',
 '88eedeee-822f-11ee-98a7-401c83da435e',
 '88eedeef-822f-11ee-b082-401c83da435e',
 '88eedef0-822f-11ee-b8f9-401c83da435e']

### Step 2: Retrieve Data

In this step, we set up the environment to retrieve data from Deep Lake. CohereEmbeddings embeds each query into the same vector space as the stored chunks so that similar chunks can be found. We then use ContextualCompressionRetriever and CohereRerank to search, rerank, and return the most relevant chunks.

First we set the COHERE_API_KEY and ACTIVELOOP_TOKEN environment variables, allowing us to access the Cohere and ActiveLoop services.

    DeepLake() retrieve data
    CohereEmbeddings()

Following this, we define a data_lake function. Inside this function, we instantiate a CohereEmbeddings object with a specific model, embed-english-v2.0.

Next, we create a DeepLake object, passing in the dataset path, setting it to read-only mode, and passing in the embedding function. We then turn it into a retriever and set its search parameters.

    ContextualCompressionRetriever() & CohereRerank()
    Reranking (cohere.com)

We then instantiate a CohereRerank object with a specific model and number of top items to consider (top_n), and finally create a ContextualCompressionRetriever object, passing in the compressor and retriever objects. The data_lake function returns the DeepLake object, the compression retriever, and the retriever.

The data retrieval process is set up by calling the data_lake function and unpacking its return values into dbs, compression_retriever, and retriever.

The Rerank endpoint acts as the last stage reranker of a search flow.
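
The two-stage idea (fetch a broad candidate set, then rerank and keep only the best few) can be sketched as follows. The word-overlap scorer is a toy stand-in for the Cohere Rerank model, and all names here are illustrative assumptions. In the notebook the first stage fetches 20 candidates and the reranker keeps the top 5.

```python
def word_overlap(query, text):
    """Toy relevance score: fraction of query words that appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words)


def rerank(query, candidates, score_fn, top_n):
    """Score each candidate against the query and keep the top_n best."""
    scored = [(score_fn(query, text), text) for text in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


# Hypothetical first-stage results, in the order the vector search returned them
candidates = [
    "streamlit renders chat messages",
    "deep lake is a vector store",
    "a retriever queries the vector store in deep lake",
]
best = rerank("retriever deep lake vector store", candidates, word_overlap, top_n=2)
```

Here the third candidate moves to the top because it matches every query word, while the unrelated first candidate is dropped.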


In [10]:
import streamlit as st
from langchain.vectorstores import DeepLake
from langchain.embeddings.cohere import CohereEmbeddings
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import CohereRerank

@st.cache_resource()
def data_lake():
    embeddings = CohereEmbeddings(model = "embed-english-v2.0")

    dbs = DeepLake(
        dataset_path="hub://jfrancis/kb-material", 
        read_only=True, 
        embedding_function=embeddings
        )
    retriever = dbs.as_retriever()
    retriever.search_kwargs["distance_metric"] = "cos"
    retriever.search_kwargs["fetch_k"] = 20
    retriever.search_kwargs["maximal_marginal_relevance"] = True
    retriever.search_kwargs["k"] = 20

    compressor = CohereRerank(
        model = 'rerank-english-v2.0',
        top_n=5
        )
    compression_retriever = ContextualCompressionRetriever(
        base_compressor=compressor, base_retriever=retriever
        )
    return dbs, compression_retriever, retriever

dbs, compression_retriever, retriever = data_lake()


2023-11-13 19:49:54.087 
  command:

    streamlit run C:\ProgramData\Anaconda3b\envs\env_llm\Lib\site-packages\ipykernel_launcher.py [ARGUMENTS]


Deep Lake Dataset in hub://jfrancis/kb-material already exists, loading from the storage


### Step 3: Use ConversationBufferWindowMemory to Build Conversation Chain with Memory

In this step, we will build a memory system for our chatbot using the ConversationBufferWindowMemory.

The memory function instantiates a ConversationBufferWindowMemory object with a specific buffer size (k), a key for storing chat history, and parameters for returning messages and output key. The function returns the instantiated memory object.

We then instantiate the memory by calling the memory function.
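
The windowing behavior behind the `k` parameter can be sketched with a bounded deque: only the last k exchanges are kept, and older ones fall off. The `WindowMemory` class is a hypothetical simplification, not the LangChain implementation.

```python
from collections import deque


class WindowMemory:
    """Keep only the last k (question, answer) exchanges."""

    def __init__(self, k):
        # A deque with maxlen discards the oldest item when a new one is appended
        self.turns = deque(maxlen=k)

    def save(self, question, answer):
        self.turns.append((question, answer))

    def load(self):
        return list(self.turns)


window = WindowMemory(k=3)
for i in range(1, 6):
    window.save(f"question {i}", f"answer {i}")

print([q for q, _ in window.load()])  # ['question 3', 'question 4', 'question 5']
```

A small window keeps the prompt short, at the cost of the chatbot forgetting anything said more than k exchanges ago.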

In [11]:
from langchain.memory import ConversationBufferWindowMemory

@st.cache_resource()
def memory():
    memory=ConversationBufferWindowMemory(
        k=3,
        memory_key="chat_history",
        return_messages=True, 
        output_key='answer'
        )
    return memory

memory=memory()


We instantiate the LLM chat model with the ChatOpenAI class.
Next, we build the conversation chain using the ConversationalRetrievalChain. We use the from_llm class method, passing in the llm, retriever, memory, and several additional parameters. The resulting chain object is stored in the qa variable.

In [13]:
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationalRetrievalChain

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

qa = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=compression_retriever,
    memory=memory,
    verbose=True,
    chain_type="stuff",
    return_source_documents=True
)
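
As a loose sketch of the flow such a chain follows, the function below first condenses a follow-up question and the chat history into a standalone question, then retrieves documents, and finally places all of them into one context (the idea behind `chain_type="stuff"`). The three callables are hypothetical stand-ins for the LLM and the retriever; the real chain uses its own prompts.

```python
def conversational_retrieval(question, chat_history, condense, retrieve, answer):
    """Condense the follow-up into a standalone question, retrieve, then answer."""
    # With no history there is nothing to condense
    standalone = condense(question, chat_history) if chat_history else question
    documents = retrieve(standalone)
    # "stuff": every retrieved document goes into a single context
    context = "\n\n".join(documents)
    return {"answer": answer(standalone, context), "source_documents": documents}


def toy_condense(question, chat_history):
    """Stand-in for the LLM call that rewrites a follow-up question."""
    last_question, _ = chat_history[-1]
    return f"{question} (follow-up to: {last_question})"


def toy_retrieve(standalone_question):
    """Stand-in for the retriever."""
    return ["Deep Lake is a vector store.", "Retrievers return documents."]


def toy_answer(standalone_question, context):
    """Stand-in for the LLM call that answers from the context."""
    return f"Answered: {standalone_question}"


history = [("What is Deep Lake?", "A vector store.")]
result = conversational_retrieval(
    "How do I query it?", history, toy_condense, toy_retrieve, toy_answer
)
```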

### Step 4: Building the Chat UI

In this final step, we set up the chat user interface (UI).

We start by creating a button that, when clicked, triggers the clearing of cache and session states, effectively starting a new chat session.

Then, we initialize the chat history if it does not exist and display previous chat messages from the session state.
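
Streamlit reruns the script from top to bottom on every interaction, which is why the history has to live in session state. The sketch below simulates that with a plain dict standing in for `st.session_state`; the `run_script` function is a hypothetical illustration and does not use Streamlit.

```python
def run_script(session_state, user_input=None):
    """Simulate one top-to-bottom script run and return what would be displayed."""
    # Initialize chat history on the first run only
    if "messages" not in session_state:
        session_state["messages"] = []

    # Replay earlier messages
    displayed = [f"{m['role']}: {m['content']}" for m in session_state["messages"]]

    # Handle new input, if any
    if user_input is not None:
        session_state["messages"].append({"role": "user", "content": user_input})
        displayed.append(f"user: {user_input}")
    return displayed


state = {}  # plain dict standing in for st.session_state
first_run = run_script(state, "Hello")
second_run = run_script(state)  # a rerun with no new input still shows the history
```

Clearing the dict corresponds to starting a new chat: the next run finds no `messages` key and begins with an empty history.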

In [None]:
# From this point on, the code cannot be run in a Jupyter notebook.
# Save the entire code as a .py file and run it from the CLI with your file name:
# streamlit run name_of_your_chatbot.py

In [None]:
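# Create a button that clears cached resources and session state.
# Streamlit has no st.clear_cache_and_session(), so each is cleared explicitly.
if st.sidebar.button("Start a New Chat Interaction"):
    st.cache_resource.clear()
    for key in list(st.session_state.keys()):
        del st.session_state[key]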

# Initialize chat history
if "messages" not in st.session_state:
    st.session_state.messages = []

# Display chat messages from history on app rerun
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

In [9]:
import streamlit as st
import io
import re
import sys
from typing import Any, Callable

def capture_and_display_output(func: Callable[..., Any], args, **kwargs) -> Any:
    # Capture the standard output
    original_stdout = sys.stdout
    sys.stdout = output_catcher = io.StringIO()

    # Run the given function and capture its output,
    # restoring the standard output even if the call raises
    try:
        response = func(args, **kwargs)
    finally:
        sys.stdout = original_stdout

    # Clean the captured output
    output_text = output_catcher.getvalue()
    clean_text = re.sub(r"\x1b\[[0-?]*[ -/]*[@-~]", "", output_text)

    # Custom CSS for the response box
    st.markdown("""
    <style>
        .response-value {
            border: 2px solid #6c757d;
            border-radius: 5px;
            padding: 20px;
            background-color: #f8f9fa;
            color: #3d3d3d;
            font-size: 20px;  /* Change this value to adjust the text size */
            font-family: monospace;
        }
    </style>
    """, unsafe_allow_html=True)

    # Create an expander titled "See Langchain Thought Process"
    with st.expander("See Langchain Thought Process"):
        # Display the cleaned text in Streamlit as code
        st.code(clean_text)

    return response
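
The capture-and-clean step can be tried outside Streamlit. The sketch below uses `contextlib.redirect_stdout` as an alternative to swapping `sys.stdout` by hand, and strips ANSI color sequences of the kind a verbose chain prints. The `run_and_capture` and `noisy_add` names are hypothetical.

```python
import contextlib
import io
import re

# Matches ANSI CSI escape sequences such as "\x1b[1m" and "\x1b[0m"
ANSI_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def run_and_capture(func, *args, **kwargs):
    """Run func and return (result, captured stdout with ANSI codes removed)."""
    buffer = io.StringIO()
    # redirect_stdout restores sys.stdout on exit, even if func raises
    with contextlib.redirect_stdout(buffer):
        result = func(*args, **kwargs)
    return result, ANSI_CSI.sub("", buffer.getvalue())


def noisy_add(a, b):
    """Stand-in for a verbose chain that prints colored log lines."""
    print("\x1b[1m> Entering new chain...\x1b[0m")
    return a + b


result, log = run_and_capture(noisy_add, 2, 3)
```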

The chat_ui function is used to handle the chat interactions. Inside this function, we accept user input, add the user’s message to the chat history and display it, load the memory variables which include the chat history, and predict and display the chatbot’s response.

The function also displays the top 2 retrieved sources relevant to the response and appends the chatbot’s response to the session state. The chat_ui function is then called, passing in the ConversationalRetrievalChain object.

In [14]:
def chat_ui(qa):
    # Accept user input
    if prompt := st.chat_input(
        "Ask me questions: How can I retrieve data from Deep Lake in Langchain?"
    ):

        # Add user message to chat history
        st.session_state.messages.append({"role": "user", "content": prompt})

        # Display user message in chat message container
        with st.chat_message("user"):
            st.markdown(prompt)

        # Display assistant response in chat message container
        with st.chat_message("assistant"):
            message_placeholder = st.empty()
            full_response = ""

            # Load the memory variables, which include the chat history
            memory_variables = memory.load_memory_variables({})

            # Predict the AI's response in the conversation
            with st.spinner("Searching course material"):
                response = capture_and_display_output(
                    qa, ({"question": prompt, "chat_history": memory_variables})
                )

            # Display chat response
            full_response += response["answer"]
            message_placeholder.markdown(full_response + "▌")
            message_placeholder.markdown(full_response)

            # Display up to the top 2 retrieved sources
            with st.expander("See Resources"):
                for doc in response["source_documents"][:2]:
                    source = doc.metadata
                    st.write(f"Title: {source['title'].split('·')[0].strip()}")
                    st.write(f"Source: {source['source']}")
                    st.write(f"Relevance to Query: {source['relevance_score'] * 100}%")

        # Append message to session state
        st.session_state.messages.append(
            {"role": "assistant", "content": full_response}
        )


In [15]:
# Run function passing the ConversationalRetrievalChain
chat_ui(qa)

<img src="https://github.com/JohnsonFrancis/GenerativeAIProjects/blob/main/RAG%20for%20LLM%20Bots/ChatApp1.png"/>

<img src="https://github.com/JohnsonFrancis/GenerativeAIProjects/blob/main/RAG%20for%20LLM%20Bots/ChatApp2.png"/>