# Build Your Own RAG using RAGStack
This notebook walks through using the DataStax Enterprise v7 Vector Store to ground LLM responses in your own data, making them more meaningful and reducing hallucinations. The approach taken here is Retrieval Augmented Generation (RAG).

You'll learn:
1. About the content in a CNN dataset
2. How to interact with the local Mistral Chat Model *without* providing this context
3. How to load this context into DataStax Enterprise v7
4. How to run a semantic similarity search on DataStax Enterprise v7
5. How to use this context *with* the local Mistral Chat Model

## Install dependencies

In [1]:
!pip install ragstack-ai datasets pipdeptree

Collecting datasets
  Downloading datasets-2.18.0-py3-none-any.whl (510 kB)
Collecting pipdeptree
  Downloading pipdeptree-2.16.1-py3-none-any.whl (27 kB)
Collecting multiprocess
  Downloading multiprocess-0.70.16-py39-none-any.whl (133 kB)
Collecting pyarrow-hotfix
  Downloading pyarrow_hotfix-0.6-py3-none-any.whl (7.9 kB)
Collecting xxhash
  Downloading xxhash-3.4.1-cp39-cp39-macosx_11_0_arm64.whl (30 kB)
Collecting dill<0.3.9,>=0.3.0
  Downloading dill-0.3.8-py3-none-any.whl (116 kB)
Installing collected packages: dill, xxhash, pyarrow-hotfix, multiprocess, pipdeptree, dataset

## Visualize RAGStack dependencies
RAGStack is a curated stack of the best open-source software for easing implementation of the RAG pattern in production-ready applications using DataStax Enterprise, Astra Vector DB or Apache Cassandra as a vector store.

A single command (pip install ragstack-ai) unlocks all the open-source packages required to build production-ready RAG applications with LangChain and DataStax Enterprise, Astra Vector DB or Apache Cassandra.

For each open-source project included in RAGStack, we select a version lineup and then test the combination for compatibility, performance, and security. Our extensive test suite ensures that RAGStack components work well together so you can confidently deploy them in production. We also run security scans on all components using industry-standard tools to ensure that you are not exposed to known vulnerabilities.

In [2]:
!pipdeptree -p ragstack-ai

ragstack-ai==0.8.0
├── astrapy [required: >=0.7.0,<0.8.0, installed: 0.7.7]
│   ├── cassio [required: >=0.1.4,<0.2.0, installed: 0.1.5]
│   │   ├── cassandra-driver [required: >=3.28.0, installed: 3.29.0]
│   │   │   └── geomet [required: >=0.1,<0.3, installed: 0.2.1.post1]
│   │   │       ├── click [required: Any, installed: 8.1.7]
│   │   │       └── six [required: Any, installed: 1.16.0]
│   │   ├── numpy [required: >=1.0, installed: 1.26.4]
│   │   └── requests [required: >=2, installed: 2.31.0]
│   │       ├── certifi [required: >=2017.4.17, installed: 2024.2.2]
│   │       ├── charset-normalizer [required: >=2,<4, installed: 3.3.2]
│   │       ├── idna [required: >=2.5,<4, installed: 3.6]
│   │       └── urllib3 [required: >=1.21.1,<3, installed: 2.2.1]
│   ├── deprecation [required: >=2.1.0,<2.2.0, installed: 2.1.0]
│   │   └── packaging [requ
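
If you only want to confirm which versions of a few packages ended up in your environment, a minimal sketch using the standard library could look like the following. The helper name `installed_version` is an illustration, and the package names are the ones visible in the dependency tree in this notebook.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Package names taken from the pipdeptree output in this notebook
for name in ["ragstack-ai", "astrapy", "cassio", "cassandra-driver"]:
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

Comparing these values with the `required:` ranges in the tree is a quick way to spot a package that was upgraded outside the tested lineup.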

## Keeping it all local and within the enterprise firewall
In this notebook we'll keep all services local to ensure maximum safety:

- For the Vector Database, [DataStax Enterprise 7](https://www.datastax.com/blog/get-started-with-the-datastax-enterprise-7-0-developer-vector-search-preview) will be used.
- For the Foundational Model we'll be using [Mistral](https://mistral.ai/).

Read more about Mistral and how it stacks up to GPT-4 [here](https://www.zdnet.com/article/what-to-know-about-mistral-ai-the-company-behind-the-latest-gpt-4-rival/).

# Get an inference engine with Mistral started
There are a multitude of inference engines. You can go for [LM Studio](https://lmstudio.ai/) which has a nice UI. In this notebook, we'll use [Ollama](https://ollama.com/).

1. Get started by [downloading](https://ollama.com/download) it
2. Install it to your machine
3. Start the inference engine, while downloading Mistral (~4GB) with the command `ollama run mistral` in a terminal

In case this all fails, because of RAM limitations, you can opt to use [tinyllama](https://ollama.com/library/tinyllama) as a model.
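
Before calling the model from the notebook, it can help to verify that the inference engine is actually running. The sketch below assumes Ollama is listening on its default address `http://localhost:11434` and uses its `/api/tags` endpoint, which lists the models available locally. The helper name `list_ollama_models` is an illustration.

```python
import json
import urllib.error
import urllib.request


def list_ollama_models(base_url="http://localhost:11434", timeout=2.0):
    """Return the names of locally available models, or None if Ollama is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a response that is not valid JSON
        return None
    return [model.get("name") for model in payload.get("models", [])]


models = list_ollama_models()
print("Ollama is not reachable" if models is None else f"Available models: {models}")
```

If the returned list does not contain a Mistral entry, the model has probably not finished downloading yet.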

## Call Mistral's Chat Model
In this example we'll ask what Daniel Radcliffe receives when he turns 18.

As Mistral has no access to the CNN documents, it will come up with some answer that is very generic.

In [3]:
from langchain.prompts import ChatPromptTemplate
from langchain_community.chat_models.ollama import ChatOllama
from langchain.schema.runnable import RunnableMap
from langchain.schema.output_parser import StrOutputParser

template = """
You are a philosopher that draws inspiration from great thinkers of the past
to craft well-thought answers to user questions. Use the provided context as the basis
for your answers and do not make up new reasoning paths - just mix-and-match what you are given.
Your answers must be extensively written.

QUESTION: {question}

YOUR ANSWER:"""
prompt = ChatPromptTemplate.from_messages([("system", template)])

llm = ChatOllama(
    model="mistral:latest", 
    num_ctx=18192
)

inputs = RunnableMap({
  'question': lambda x: x['question']
})
chain = inputs | prompt | llm | StrOutputParser()

chain.invoke({"question": "What kind of fortune does Daniel Radcliffe get when he turns 18?"})



' I\'m glad you asked this question, as it provides an opportunity to delve into the intersection of philosophy, literature, and the real world. The concept of "fortune" is a complex one that has been explored by many great thinkers throughout history. In the context of your query, I believe we can draw insights from the works of Aristotle, who in his Nicomachean Ethics, discusses the idea of eudaimonia or human flourishing, and Seneca, who wrote extensively on the nature of fortune and virtue.\n\nDaniel Radcliffe turning 18 signifies the attainment of legal adulthood. However, the kind of fortune that comes with it is not something that can be neatly packaged or defined. According to Aristotle, eudaimonia or human flourishing is the highest good for a human being. It is achieved through living a virtuous life and realizing one\'s potential. Radcliffe, at 18, would have the freedom to make his own choices and shape his future. His fortune would lie in his ability to use this freedom wi

## Load data from CNN

In [4]:
import datasets

def load_articles(n=5):
  dataset = datasets.load_dataset('cnn_dailymail', '3.0.0', split='train', streaming=True)
  data = dataset.take(n)
  return [d['article']
          for d in data]

articles = load_articles()

  from .autonotebook import tqdm as notebook_tqdm
Downloading readme: 100%|██████████| 15.6k/15.6k [00:00<00:00, 12.5MB/s]


## Check out some content
In this example we can read that when Daniel Radcliffe turns 18, he'll gain access to £20 million.

In [5]:
print(articles[0])

LONDON, England (Reuters) -- Harry Potter star Daniel Radcliffe gains access to a reported £20 million ($41.1 million) fortune as he turns 18 on Monday, but he insists the money won't cast a spell on him. Daniel Radcliffe as Harry Potter in "Harry Potter and the Order of the Phoenix" To the disappointment of gossip columnists around the world, the young actor says he has no plans to fritter his cash away on fast cars, drink and celebrity parties. "I don't plan to be one of those people who, as soon as they turn 18, suddenly buy themselves a massive sports car collection or something similar," he told an Australian interviewer earlier this month. "I don't think I'll be particularly extravagant. "The things I like buying are things that cost about 10 pounds -- books and CDs and DVDs." At 18, Radcliffe will be able to gamble in a casino, buy a drink in a pub or see the horror film "Hostel: Part II," currently six places below his number one movie on the UK box office chart. Details of ho

## Generate chunks to load into the Vector Store
Now let's prepare the CNN data for the DataStax Enterprise Vector Store.
1. First we'll chunk up the data so that it can be loaded in multiple pieces.
2. Then we'll create a new Vector Store on DataStax Enterprise.
3. Lastly, we'll load up the documents. As part of this step, the data will be vectorized and its embeddings stored in the Vector Store.
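
To make the role of `chunk_size` and `chunk_overlap` concrete, here is a simplified sketch that cuts text at fixed character offsets. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraphs and whitespace); the function name `chunk_text` is an illustration only.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=0):
    """Split text into fixed-size character windows that may overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    # Each window starts (chunk_size - chunk_overlap) characters after the previous one
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=0))
print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2))
```

With an overlap of 0, as used in this notebook, every character lands in exactly one chunk. A non-zero overlap repeats the tail of each chunk at the start of the next, which can help when a relevant sentence straddles a chunk boundary.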

In [6]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)

documents = splitter.create_documents(articles)
document_chunks = splitter.split_documents(documents)

print(document_chunks[0])

page_content='LONDON, England (Reuters) -- Harry Potter star Daniel Radcliffe gains access to a reported £20 million ($41.1 million) fortune as he turns 18 on Monday, but he insists the money won\'t cast a spell on him. Daniel Radcliffe as Harry Potter in "Harry Potter and the Order of the Phoenix" To the disappointment of gossip columnists around the world, the young actor says he has no plans to fritter his cash away on fast cars, drink and celebrity parties. "I don\'t plan to be one of those people who, as soon as they turn 18, suddenly buy themselves a massive sports car collection or something similar," he told an Australian interviewer earlier this month. "I don\'t think I\'ll be particularly extravagant. "The things I like buying are things that cost about 10 pounds -- books and CDs and DVDs." At 18, Radcliffe will be able to gamble in a casino, buy a drink in a pub or see the horror film "Hostel: Part II," currently six places below his number one movie on the UK box office ch

# Now let's run DSE 7 Vector Store
Make sure you have [Docker](https://www.docker.com/) installed.

Run DSE 7 in either of these two ways from a terminal window:
1. `docker-compose up` (using the docker-compose.yml file in the root of this repository)
2. `docker run -e DS_LICENSE=accept -p 9042:9042 datastax/dse-server:7.0.0-alpha.4`
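
The database can take a while to start inside the container, and connecting too early fails. A minimal sketch for waiting until the CQL port accepts TCP connections is shown below. It assumes the port mapping `9042:9042` from the command above; the helper name `wait_for_port` and its timeout values are illustrative.

```python
import socket
import time


def wait_for_port(host="localhost", port=9042, timeout=120.0, interval=2.0):
    """Poll until a TCP connection to host:port succeeds; return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling `wait_for_port()` before creating the `Cluster` returns `True` once the port is open. An open port does not strictly guarantee that the database is ready to serve queries, so a short retry around the first connection attempt may still be useful.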

And then create a default keyspace as follows:

In [10]:
from cassandra.cluster import Cluster

# Connect to DSE7
cluster = Cluster(["localhost"])
session = cluster.connect()

# Create the default keyspace
session.execute("CREATE KEYSPACE IF NOT EXISTS default_keyspace WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}")

<cassandra.cluster.ResultSet at 0x29d0fe0a0>

# Get the Vector Store
The following code will create a new Vector Store in DataStax Enterprise. For embeddings we'll be using the default model from Hugging Face.

In [12]:
from langchain_community.vectorstores import Cassandra
from langchain_community.embeddings import HuggingFaceEmbeddings

# Create a new DataStax Enterprise Vector Store
vector_store = Cassandra(
    session=session,
    keyspace="default_keyspace",
    table_name="dse_vector_table",
    embedding=HuggingFaceEmbeddings()
)

In [13]:
# Load the CNN documents into the DataStax Enterprise Vector Store (only the first time)
vector_store.add_documents(document_chunks)

['1649bc5479aa4bc986a85a7c99c5fe77',
 '9eecd9bee26d4776bc84a8d233c33c9d',
 '6b3159e26fc74edcb40427d2f5e8632d',
 'c3391603d03b45229d139276e860d8f5',
 'f98f509dcf0f4443ba9d4546d3584f3b',
 '2dfbb37a218746ac99c567619599d50e',
 '35968b97806640ac989416cbbbb9ac45',
 'edae471d56484478846e963659fde459',
 'f21851954b134d0aa5dd099e99c3d431',
 'cfabaf4e831e4d0f8dfb493d7cd2dbd9',
 'ead6c0c3e54e47268e74e1954dacfb83',
 '7e2b4b48ed0a4a1da77229ccaff27cea',
 '16efc42d94a54340b34a910efc43bba0',
 '537f07a501ff41fab7ea118aa7487cc0',
 '158605bfc0f947da8505f0b060fcea06',
 '61716101bb5a43b7aa6b2553ec381153',
 'fc95641fd5b04f98a279a0b9e695c0fc',
 '09e229b9a5b347a7a6ec6af38083dc0c',
 'e30fc3d06d2145308e9812997904843a',
 '56f9c6bdafbb46879779a76121fe1b9e',
 '30c7650694d14a42a4731a33b235fc40']

## Run a semantic query on the DataStax Enterprise Vector Store
Here you'll see that DataStax Enterprise retrieves relevant documents given the query.

In [14]:
query = 'What kind of fortune does Daniel Radcliffe get when he turns 18?'
vector_store.similarity_search(query, k=2)

[Document(page_content='LONDON, England (Reuters) -- Harry Potter star Daniel Radcliffe gains access to a reported £20 million ($41.1 million) fortune as he turns 18 on Monday, but he insists the money won\'t cast a spell on him. Daniel Radcliffe as Harry Potter in "Harry Potter and the Order of the Phoenix" To the disappointment of gossip columnists around the world, the young actor says he has no plans to fritter his cash away on fast cars, drink and celebrity parties. "I don\'t plan to be one of those people who, as soon as they turn 18, suddenly buy themselves a massive sports car collection or something similar," he told an Australian interviewer earlier this month. "I don\'t think I\'ll be particularly extravagant. "The things I like buying are things that cost about 10 pounds -- books and CDs and DVDs." At 18, Radcliffe will be able to gamble in a casino, buy a drink in a pub or see the horror film "Hostel: Part II," currently six places below his number one movie on the UK box
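
The idea behind this kind of search is that the query and every chunk are turned into vectors, and the chunks whose vectors lie closest to the query vector are returned. The sketch below illustrates that ranking with cosine similarity on made-up three-dimensional vectors. The metric actually used by the vector store depends on its configuration, so treat cosine similarity here as an assumption, and the names `cosine_similarity` and `top_k` as illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the ids of the k documents most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Toy embeddings; real embeddings have hundreds of dimensions
docs = {
    "radcliffe": [0.9, 0.1, 0.0],
    "vick": [0.1, 0.9, 0.1],
    "weather": [0.0, 0.2, 0.9],
}
print(top_k([0.8, 0.2, 0.0], docs, k=2))  # → ['radcliffe', 'vick']
```

Note that `k` always returns that many results, even when only the first one is a close match.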

## Call Mistral's Chat Model again
Now let's run the query again on the Mistral Chat Model, this time inserting the relevant context from the DataStax Enterprise Vector Store to make the response meaningful and reduce hallucination.

In [15]:
from langchain.prompts import ChatPromptTemplate
from langchain_community.chat_models.ollama import ChatOllama
from langchain.schema.runnable import RunnableMap
from langchain.schema.output_parser import StrOutputParser

# Get the retriever for the Chat Model
retriever = vector_store.as_retriever(
    search_kwargs={"k": 5}
)

# Create the prompt template
template = """
You are a philosopher that draws inspiration from great thinkers of the past
to craft well-thought answers to user questions. Use the provided context as the basis
for your answers and do not make up new reasoning paths - just mix-and-match what you are given.
Your answers must be extensively written.

CONTEXT:
{context}

QUESTION: {question}

YOUR ANSWER:"""
prompt = ChatPromptTemplate.from_messages([("system", template)])

# Define the chain
inputs = RunnableMap({
  'context': lambda x: retriever.get_relevant_documents(x['question']),
  'question': lambda x: x['question']
})
chain = inputs | prompt | llm | StrOutputParser()

# Call the chain with the question
chain.invoke({"question": "What kind of fortune does Daniel Radcliffe get when he turns 18?"})

' Daniel Radcliffe is reportedly set to gain access to a £20 million ($41.1 million) fortune when he turns 18. This substantial wealth, accumulated from his successful career as Harry Potter, has been held in a trust fund up until now. However, Radcliffe has expressed his intentions to avoid living an extravagant lifestyle and instead plans on buying more modest items like books, CDs, and DVDs. Despite the media speculation and potential scrutiny that comes with newfound wealth and adulthood, Radcliffe remains grounded and focused, as evidenced by his continued acting projects outside of the Harry Potter series.\n\nIn contrast, Michael Vick\'s fortune took a drastically different turn when he was involved in an illegal activity, specifically dog fighting. As a result, the NFL commissioner, Roger Goodell, stated that any conduct which tarnishes the good reputation of the NFL will not be tolerated. In this case, Vick\'s actions led to his suspension and potential financial consequences,
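
In the chain above, the list of retrieved `Document` objects is placed into `{context}` as-is, so the prompt receives their string representation, including wrapper text such as `Document(page_content=...)`. One common alternative is to join only the text of each chunk. The sketch below shows that idea; `Doc` is a hypothetical stand-in for a LangChain `Document` so the example runs on its own, and `format_docs` and its separator are illustrative choices.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document; only page_content is used."""
    page_content: str


def format_docs(docs, separator="\n\n---\n\n"):
    """Join the text of the retrieved chunks into one context string."""
    return separator.join(doc.page_content.strip() for doc in docs)


retrieved = [Doc("Radcliffe gains access to a reported fortune. "), Doc(" He plans to buy books and CDs.")]
print(format_docs(retrieved))
```

In the notebook this could be wired in as `'context': lambda x: format_docs(retriever.get_relevant_documents(x['question']))`, since the real `Document` objects also expose `page_content`.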

# Extra points 🤩 - Let's add a chat interface
In this part of the demo, we'll actually create a fully working chatbot using Streamlit!

In [None]:
!pip install streamlit

Collecting streamlit
  Downloading streamlit-1.31.1-py2.py3-none-any.whl (8.4 MB)
Collecting validators<1,>=0.2 (from streamlit)
  Downloading validators-0.22.0-py3-none-any.whl (26 kB)
Collecting gitpython!=3.1.19,<4,>=3.0.7 (from streamlit)
  Downloading GitPython-3.1.42-py3-none-any.whl (195 kB)
Collecting pydeck<1,>=0.8.0b4 (from streamlit)
  Downloading pydeck-0.8.1b0-py2.py3-none-any.whl (4.8 MB)

## Install the local tunnel to view the webpage

In [None]:
!npm install localtunnel

npm WARN saveError ENOENT: no such file or directory, open '/content/package.json'
npm notice created a lockfile as package-lock.json. You should commit this file.
npm WARN enoent ENOENT: no such file or directory, open '/content/package.json'
npm WARN content No description
npm WARN content No repository field.
npm WARN content No README data
npm WARN content No license field.

+ localtunnel@2.0.2
added 22 packages from 22 contributors and audited 22 packages in 2.033s

3 packages are looking for funding
  run `npm fund` for details

found 1 moderate severity vulnerability
  run `npm audit fix` to fix them, or `npm audit` for details

# Create the Chatbot

In [None]:
%%writefile app.py

import streamlit as st
import tempfile, os
from langchain_openai import OpenAIEmbeddings
from langchain_openai import ChatOpenAI
from langchain_community.vectorstores import AstraDB
from langchain.schema.runnable import RunnableMap
from langchain.prompts import ChatPromptTemplate
from langchain.callbacks.base import BaseCallbackHandler
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Streaming call back handler for responses
class StreamHandler(BaseCallbackHandler):
    def __init__(self, container, initial_text=""):
        self.container = container
        self.text = initial_text

    def on_llm_new_token(self, token: str, **kwargs):
        self.text += token
        self.container.markdown(self.text + "▌")

# Function for Vectorizing uploaded data into Astra DB
def vectorize_text(uploaded_files, vector_store):
    for uploaded_file in uploaded_files:
        if uploaded_file is not None:

            # Write to temporary file
            temp_dir = tempfile.TemporaryDirectory()
            file = uploaded_file
            print(f"""Processing: {file}""")
            temp_filepath = os.path.join(temp_dir.name, file.name)
            with open(temp_filepath, 'wb') as f:
                f.write(file.getvalue())

            # Process TXT
            if uploaded_file.name.endswith('txt'):
                file = [uploaded_file.read().decode()]

                text_splitter = RecursiveCharacterTextSplitter(
                    chunk_size = 1500,
                    chunk_overlap  = 100
                )

                texts = text_splitter.create_documents(file, [{'source': uploaded_file.name}])
                vector_store.add_documents(texts)
                st.info(f"Loaded {len(texts)} chunks")

# Cache prompt for future runs
@st.cache_data()
def load_prompt():
    template = """You're a helpful AI assistant tasked to answer the user's questions.
You're friendly and you answer extensively with multiple sentences. You prefer to use bullet points to summarize.

CONTEXT:
{context}

QUESTION:
{question}

YOUR ANSWER:"""
    return ChatPromptTemplate.from_messages([("system", template)])

# Cache OpenAI Chat Model for future runs
@st.cache_resource()
def load_chat_model(openai_api_key):
    return ChatOpenAI(
        openai_api_key=openai_api_key,
        temperature=0.3,
        model='gpt-3.5-turbo',
        streaming=True,
        verbose=True
    )

# Cache the Astra DB Vector Store for future runs
@st.cache_resource(show_spinner='Connecting to Astra DB Vector Store')
def load_vector_store(astra_db_endpoint, astra_db_secret, openai_api_key):
    # Connect to the Vector Store
    vector_store = AstraDB(
        embedding=OpenAIEmbeddings(openai_api_key=openai_api_key),
        collection_name="my_store",
        api_endpoint=astra_db_endpoint,
        token=astra_db_secret
    )
    return vector_store

# Cache the Retriever for future runs
@st.cache_resource(show_spinner='Getting retriever')
def load_retriever(_vector_store):
    # Get the retriever for the Chat Model
    retriever = _vector_store.as_retriever(
        search_kwargs={"k": 5}
    )
    return retriever

# Start with empty messages, stored in session state
if 'messages' not in st.session_state:
    st.session_state.messages = []

# Draw a title and some markdown
st.title("Your personal Efficiency Booster")
st.markdown("""Generative AI is considered to bring the next Industrial Revolution.
Why? Studies show a **37% efficiency boost** in day to day work activities!""")

# Get the secrets
astra_db_endpoint = st.sidebar.text_input('Astra DB Endpoint', type="password")
astra_db_secret = st.sidebar.text_input('Astra DB Secret', type="password")
openai_api_key = st.sidebar.text_input('OpenAI API Key', type="password")

# Draw all messages, both user and bot so far (every time the app reruns)
for message in st.session_state.messages:
    st.chat_message(message['role']).markdown(message['content'])

# Draw the chat input box
if not openai_api_key.startswith('sk-') or not astra_db_endpoint.startswith('https') or not astra_db_secret.startswith('AstraCS'):
    st.warning('Please enter your Astra DB Endpoint, Astra DB Secret and Open AI API Key!', icon='⚠️')

else:
    prompt = load_prompt()
    chat_model = load_chat_model(openai_api_key)
    vector_store = load_vector_store(astra_db_endpoint, astra_db_secret, openai_api_key)
    retriever = load_retriever(vector_store)

    # Include the upload form for new data to be Vectorized
    with st.sidebar:
        st.divider()
        uploaded_file = st.file_uploader('Upload a document for additional context', type=['txt'], accept_multiple_files=True)
        submitted = st.button('Save to Astra DB')
        if submitted:
            vectorize_text(uploaded_file, vector_store)

    if question := st.chat_input("What's up?"):
            # Store the user's question in a session object for redrawing next time
            st.session_state.messages.append({"role": "human", "content": question})

            # Draw the user's question
            with st.chat_message('human'):
                st.markdown(question)

            # UI placeholder to start filling with agent response
            with st.chat_message('assistant'):
                response_placeholder = st.empty()

            # Generate the answer by calling OpenAI's Chat Model
            inputs = RunnableMap({
                'context': lambda x: retriever.get_relevant_documents(x['question']),
                'question': lambda x: x['question']
            })
            chain = inputs | prompt | chat_model
            response = chain.invoke({'question': question}, config={'callbacks': [StreamHandler(response_placeholder)]})
            answer = response.content

            # Store the bot's answer in a session object for redrawing next time
            st.session_state.messages.append({"role": "ai", "content": answer})

            # Write the final answer without the cursor
            response_placeholder.markdown(answer)

Writing app.py
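
The app decides whether to show the chat box with three prefix checks on the sidebar inputs. If you want the warning to name exactly which input is missing or malformed, a small helper along these lines could be used. The function name `missing_credentials` is hypothetical; the prefixes and labels are the ones used in `app.py` above.

```python
def missing_credentials(openai_api_key, astra_db_endpoint, astra_db_secret):
    """Return the labels of the inputs that fail the same prefix checks used in app.py."""
    checks = [
        ("OpenAI API Key", openai_api_key, "sk-"),
        ("Astra DB Endpoint", astra_db_endpoint, "https"),
        ("Astra DB Secret", astra_db_secret, "AstraCS"),
    ]
    # Treat None like an empty string so an untouched input counts as missing
    return [label for label, value, prefix in checks if not (value or "").startswith(prefix)]


print(missing_credentials("", "http://example.invalid", "AstraCS:placeholder"))
```

A prefix check only catches obvious typos; it does not prove that a key or token is valid.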


# Now run the Chatbot!
We need to know the public IP address of the server, which serves as the tunnel password.
Once we know that, we can kick off the Streamlit app through the tunnel.

In [None]:
import urllib.request
print("Password/Endpoint IP for localtunnel is:", urllib.request.urlopen('https://ipv4.icanhazip.com').read().decode('utf8').strip("\n"))

Password/Endpoint IP for localtunnel is: 35.225.208.37


In [None]:
!streamlit run app.py &>/content/logs.txt &
!npx localtunnel --port 8501

npx: installed 22 in 2.478s
your url is: https://smooth-wombats-speak.loca.lt
