In [None]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
%cd /content/drive/MyDrive/RAG_Workshop

/content/drive/MyDrive/RAG_Workshop


In [None]:
%pwd

'/content/drive/MyDrive/RAG_Workshop'

In [None]:
!pip install -q llama-index qdrant_client llama-index-vector-stores-qdrant python-dotenv llama-index-experimental

In [None]:
def custom_print(text, max_line_length=80):
    """Print `text` wrapped at word boundaries to roughly `max_line_length` characters per line."""
    words = text.split(' ')
    line = ''
    for word in words:
        if len(line) + len(word) + 1 <= max_line_length:
            line += word + ' '
        else:
            print(line)
            line = word + ' '
    print(line)

In [None]:
import nest_asyncio

nest_asyncio.apply()

# Basic RAG

Retrieval-Augmented Generation (RAG) pairs a retriever with a generative model: passages relevant to a query are fetched from a knowledge base and handed to the model as context, which grounds the generated text and improves its accuracy. In this section, we introduce the fundamental components of a basic RAG system. We'll start by loading our documents, breaking them down into manageable chunks, and converting these chunks into embeddings, which are numerical vector representations that can be compared for similarity.

![Basic RAG Pipeline](https://miro.medium.com/v2/resize:fit:1200/1*J7vyY3EjY46AlduMvr9FbQ.png)
*Image Source: [Source](https://medium.com/@drjulija/what-is-retrieval-augmented-generation-rag-938e4f6e03d1)*

## Loading document

Our first step is to load the documents that serve as the RAG system's knowledge base. These can range from plain text files to long PDFs. Each loaded document is then split into smaller segments called 'chunks' (in this notebook, that happens later, when the index is built). Chunking keeps each piece small enough to embed and retrieve on its own, so a query can pull back just the relevant passages instead of a whole document.
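
To make the idea of chunking concrete, here is a minimal sketch that splits text into overlapping, fixed-size chunks. It is only an illustration: the function name `chunk_words` and its parameters are hypothetical, and it counts words, whereas LlamaIndex's own splitter works on tokens and tries to keep sentences intact.

```python
def chunk_words(text: str, chunk_size: int = 8, overlap: int = 2) -> list[str]:
    """Split `text` into chunks of at most `chunk_size` words.

    Consecutive chunks share `overlap` words, so a sentence cut at a chunk
    boundary still appears with some of its context in the next chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the window already reached the end of the text
    return chunks


if __name__ == "__main__":
    for chunk in chunk_words("a b c d e f g h i j", chunk_size=4, overlap=1):
        print(chunk)
```

The overlap is a design choice: it costs some extra storage but reduces the chance that an answer is split across two chunks with neither containing enough context.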


In [None]:
from llama_index.core import SimpleDirectoryReader

# load data
documents = SimpleDirectoryReader(input_files=["./transcripts/AIAct.pdf"]).load_data()

## Define Vector Store

After chunking our documents, we need a place to store the resulting embeddings. This is where the Vector Store comes into play. Think of it as a specialized database optimized for storing and querying high-dimensional vectors. By storing our embeddings in the Vector Store, we facilitate efficient retrieval of relevant document chunks based on user queries, laying the groundwork for our RAG system's retrieval component.

![Vector Store](https://miro.medium.com/v2/resize:fit:1400/1*wyaikzoEA397xahVKEscnA.png)
*Image Source: [Source](https://medium.com/aimonks/introduction-to-chromadb-vector-store-for-generative-ai-llms-28f90535086)*
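
As a rough mental model of what a vector store does, the sketch below keeps vectors in a Python list and answers queries by cosine similarity. The class `ToyVectorStore` and the two-dimensional vectors are made up for illustration; a real store such as Qdrant avoids this linear scan by using approximate nearest-neighbour indexes.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical in-memory store: a list of (vector, text) pairs."""

    def __init__(self):
        self._items = []

    def add(self, vector, text):
        self._items.append((list(vector), text))

    def query(self, vector, top_k=2):
        """Return the top_k (score, text) pairs, most similar first."""
        scored = [(cosine_similarity(vector, v), t) for v, t in self._items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]


if __name__ == "__main__":
    store = ToyVectorStore()
    store.add([1.0, 0.0], "prohibited practices")
    store.add([0.0, 1.0], "transparency obligations")
    store.add([0.9, 0.1], "emotion recognition ban")
    for score, text in store.query([1.0, 0.0], top_k=2):
        print(f"{score:.3f}  {text}")
```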

In [None]:
from llama_index.core import StorageContext
from llama_index.vector_stores.qdrant import QdrantVectorStore
import qdrant_client

# define the vector_store
client = qdrant_client.QdrantClient(location=":memory:")
vector_store = QdrantVectorStore(client=client, collection_name="test_store")
storage_context = StorageContext.from_defaults(vector_store=vector_store)

## Define the basic index for querying

With our Vector Store ready, the next step involves creating an index to organize our embeddings in a way that optimizes retrieval performance. This index acts as a searchable catalog of embeddings, allowing our RAG system to quickly find the most relevant document chunks in response to a query. This process is akin to how a library's indexing system enables fast retrieval of books based on specific topics or keywords.


In [None]:
from llama_index.core import Settings, VectorStoreIndex
from dotenv import load_dotenv
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

load_dotenv("env")

rag_llm = OpenAI(
    model="gpt-3.5-turbo",
    temperature=0.2,
)
embed_model = OpenAIEmbedding()

Settings.llm = rag_llm
Settings.embed_model = embed_model
Settings.chunk_size = 512


# build VectorStoreIndex that takes care of chunking documents
# and encoding chunks to embeddings for future retrieval
index = VectorStoreIndex.from_documents(documents,storage_context=storage_context)

## Defining the query engine

The heart of our RAG system is the Query Engine, which orchestrates the retrieval and generation steps. It utilizes the previously set up index to fetch relevant document chunks and then leverages a Large Language Model (LLM) to generate coherent and contextually appropriate responses based on those chunks. This seamless integration of retrieval and generation is what makes RAG systems particularly powerful for tasks requiring nuanced understanding and synthesis of information.
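
The retrieve-then-generate flow can be sketched in a few lines. This is a simplified illustration, not LlamaIndex's implementation: `build_prompt`, `answer`, and the prompt wording are hypothetical, and `retrieve` and `generate` stand in for the index lookup and the LLM call.

```python
def build_prompt(question, chunks):
    """Combine the retrieved chunks and the question into a single prompt."""
    context = "\n\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def answer(question, retrieve, generate, top_k=2):
    """Retrieval step followed by generation step.

    retrieve(question, top_k) -> list of chunk strings
    generate(prompt)          -> answer string
    """
    chunks = retrieve(question, top_k)
    return generate(build_prompt(question, chunks))


if __name__ == "__main__":
    # Stand-ins for a real retriever and a real LLM call.
    fake_retrieve = lambda q, k: ["chunk one", "chunk two"][:k]
    fake_generate = lambda prompt: f"(LLM sees {len(prompt)} characters of prompt)"
    print(answer("What is RAG?", fake_retrieve, fake_generate))
```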

In [None]:
# The QueryEngine class is equipped with the generator
# and facilitates the retrieval and generation steps
query_engine = index.as_query_engine()

In [None]:
question = "What does the document say about emotion recognition?"

![Semantic Similarity](https://miro.medium.com/v2/resize:fit:1400/1*pAfG2R9dO8_SWvbkRNg4bg.png)
*Image Source: [Source](https://medium.com/@adrian.white/cosine-similarity-in-snowflake-ove-eed3b57f4e6f)*

In [None]:
from llama_index.core.response.pprint_utils import pprint_response

# Use your Default RAG
response = query_engine.query(question)
pprint_response(response)

Final Response: The document mentions that there are concerns about
the definitions of 'emotion recognition' being technically flawed and
recommends adjustments. Additionally, AccessNow calls for a wider ban
on the use of AI for emotion recognition, categorizing people based on
physiological, behavioral, or biometric data, as well as dangerous
uses in the context of policing, migration, asylum, and border
management.


# Success Requirements for RAG

For a RAG system to succeed, in the sense of giving useful and relevant answers to user questions, it must meet two high-level requirements:

1. Retrieval must find the documents most relevant to the user query.

2. Generation must make good use of the retrieved documents to answer the user query.
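
One way to put a number on the first requirement is to check, for a set of test questions, whether the chunk known to contain the answer shows up among the retrieved results. The sketch below computes two common retrieval metrics, hit rate at k and mean reciprocal rank. The function names and the chunk ids are hypothetical, and the sketch assumes exactly one relevant chunk per question.

```python
def hit_rate_at_k(retrieved, relevant, k):
    """Fraction of queries whose relevant chunk id is among the top-k results.

    retrieved: one ranked list of chunk ids per query
    relevant:  the single relevant chunk id per query
    """
    if not retrieved or len(retrieved) != len(relevant):
        raise ValueError("need one ranked list and one relevant id per query")
    hits = sum(1 for ranked, rel in zip(retrieved, relevant) if rel in ranked[:k])
    return hits / len(retrieved)


def mean_reciprocal_rank(retrieved, relevant):
    """Average of 1/rank of the relevant chunk (0 when it was not retrieved)."""
    if not retrieved or len(retrieved) != len(relevant):
        raise ValueError("need one ranked list and one relevant id per query")
    total = 0.0
    for ranked, rel in zip(retrieved, relevant):
        if rel in ranked:
            total += 1.0 / (ranked.index(rel) + 1)
    return total / len(retrieved)


if __name__ == "__main__":
    retrieved = [["c1", "c7", "c3"], ["c4", "c2", "c9"], ["c5", "c6", "c8"]]
    relevant = ["c7", "c4", "c0"]
    print(hit_rate_at_k(retrieved, relevant, k=2))
    print(mean_reciprocal_rank(retrieved, relevant))
```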

# Advanced RAG

As we move beyond basic RAG, we explore advanced techniques to refine and enhance our system's performance. This includes optimizing the size of our document chunks for balanced retrieval and generation, improving the relevancy of retrieved content through semantic reranking, and employing sophisticated response synthesizers for more coherent outputs. These enhancements aim to address common challenges and elevate the quality of the RAG-generated text.

![Advanced RAG](https://miro.medium.com/v2/resize:fit:2000/0*Gr_JqzdpHu7enWG9.png)
*Image Source: [Source](https://pub.towardsai.net/advanced-rag-techniques-an-illustrated-overview-04d193d8fec6)*


## Chunk-Size Optimization

Chunk size is one of the most influential parameters of a RAG system. Small chunks make retrieval more precise but can cut away context the generator needs; large chunks keep more context but dilute the embedding and take up more of the LLM's prompt. Rather than guessing, we determine a suitable value empirically: we build the pipeline with several chunk sizes, answer a fixed set of evaluation questions with each, and compare the answers against reference responses.
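
The search itself is an ordinary grid search. The sketch below shows the bare mechanism in plain Python, under the assumption that every parameter combination can be scored by a single function. `grid_search` and the toy scoring function are hypothetical; in this notebook the score would be the mean semantic similarity between generated and reference answers.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Score every combination in `param_grid` and return the best one.

    param_grid: dict mapping parameter name -> list of candidate values
    score_fn:   callable taking a params dict and returning a number
                (higher is better)
    """
    names = list(param_grid)
    results = []
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        results.append((score_fn(params), params))
    best_score, best_params = max(results, key=lambda r: r[0])
    return best_params, best_score, results


if __name__ == "__main__":
    # Toy objective that happens to peak at chunk_size=512, top_k=2.
    toy_score = lambda p: -abs(p["chunk_size"] - 512) - p["top_k"]
    best_params, best_score, _ = grid_search(
        {"chunk_size": [256, 512, 1024], "top_k": [2, 5]}, toy_score
    )
    print(best_params, best_score)
```

Because every combination triggers a full index build and a round of LLM calls, the grid is best kept small.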


In [None]:
from llama_index.core import load_index_from_storage
from llama_index.core.node_parser import SimpleNodeParser
import os
from pathlib import Path

def _build_index(chunk_size, docs):
    """Build a vector index for `chunk_size`, or load it from disk if it was built before."""
    index_out_path = f"./storage_{chunk_size}"
    if not os.path.exists(index_out_path):
        # parse docs into nodes of the requested size
        node_parser = SimpleNodeParser.from_defaults(chunk_size=chunk_size)
        base_nodes = node_parser.get_nodes_from_documents(docs)

        # build index
        index = VectorStoreIndex(base_nodes)
        # create the directory only after a successful build, so that a failed
        # build does not leave an empty folder that is later mistaken for a cache
        Path(index_out_path).mkdir(parents=True, exist_ok=True)
        # save index to disk
        index.storage_context.persist(index_out_path)
    else:
        # rebuild storage context
        storage_context = StorageContext.from_defaults(
            persist_dir=index_out_path
        )
        # load index
        index = load_index_from_storage(
            storage_context,
        )
    return index

In [None]:
from llama_index.experimental.param_tuner.base import ParamTuner, RunResult
from llama_index.core.evaluation import SemanticSimilarityEvaluator, BatchEvalRunner
from llama_index.core.evaluation.eval_utils import get_responses
import numpy as np

### Recipe
### Perform hyperparameter tuning as in traditional ML via grid-search
### 1. Define an objective function that ranks different parameter combos
### 2. Build ParamTuner object
### 3. Execute hyperparameter tuning with ParamTuner.tune()

# 1. Define objective function
def objective_function(params_dict):
    chunk_size = params_dict["chunk_size"]
    docs = params_dict["docs"]
    top_k = params_dict["top_k"]
    eval_qs = params_dict["eval_qs"]
    ref_response_strs = params_dict["ref_response_strs"]

    # build RAG pipeline
    index = _build_index(chunk_size, docs)  # helper defined in the previous cell
    query_engine = index.as_query_engine(similarity_top_k=top_k)

    # perform inference with the RAG pipeline on the provided questions `eval_qs`
    pred_response_objs = get_responses(
        eval_qs, query_engine, show_progress=True
    )

    # perform evaluations of predictions by comparing them to reference
    # responses `ref_response_strs`
    evaluator = SemanticSimilarityEvaluator(embed_model=OpenAIEmbedding())
    eval_batch_runner = BatchEvalRunner(
        {"semantic_similarity": evaluator}, workers=2, show_progress=True
    )
    eval_results = eval_batch_runner.evaluate_responses(
        eval_qs, responses=pred_response_objs, reference=ref_response_strs
    )

    # get semantic similarity metric
    mean_score = np.array(
        [r.score for r in eval_results["semantic_similarity"]]
    ).mean()

    return RunResult(score=mean_score, params=params_dict)




In [None]:
eval_qs = [
    "What is the primary goal of the EU's proposed AI Act of April 2021?",
    "Which AI systems are considered to pose 'unacceptable' risks under the EU's proposed AI Act?",
    "What are the obligations for 'high-risk' AI systems according to the EU's proposed AI Act?",
    "What does the EU's proposed AI Act require from AI systems that present only 'limited risk'?",
    "When did the EU Member States agree on their general position regarding the AI Act?",
    "What are the significant amendments proposed by the EU Parliament to the AI Act?",
    "How does the proposed AI Act define AI systems?",
    "What are the concerns raised about the AI Act's definition of AI systems?",
    "What is the risk-based approach proposed by the EU's AI Act?",
    "What are the proposed sanctions for non-compliance with the AI Act?"
]

ref_response_strs = [
    "The primary goal is to ensure the proper functioning of the single market by creating conditions for the development and use of trustworthy AI systems in the Union.",
    "AI systems that deploy subliminal techniques or exploit vulnerable groups, among others, are considered to pose unacceptable risks.",
    "High-risk AI systems must undergo an ex-ante conformity assessment, be registered in an EU database, and comply with specific requirements such as risk management and data governance.",
    "AI systems presenting limited risk are subject to transparency obligations, such as clear user information.",
    "The EU Member States agreed on their general position in December 2021.",
    "Significant amendments include revising the definition of AI systems, broadening the list of prohibited AI systems, and imposing obligations on general-purpose and generative AI models.",
    "AI systems are defined as software developed with specific techniques and approaches that can generate outputs influencing the environments they interact with.",
    "Concerns include the broad definition that could cover simple algorithms and the lack of clarity that may lead to legal uncertainty.",
    "The risk-based approach classifies AI systems based on their level of risk, from unacceptable to high, limited, and minimal, tailoring legal interventions accordingly.",
    "Sanctions include administrative fines up to €30 million or 6% of the total worldwide annual turnover, depending on the infringement's severity."
]


In [None]:
docs = SimpleDirectoryReader(input_files=["./transcripts/AIAct.pdf"]).load_data()


# 2. Build ParamTuner object
param_dict = {"chunk_size": [256, 512, 1024]}  # params/values to search over
fixed_param_dict = {  # fixed hyperparams
    "top_k": 2,
    "docs": docs,
    "eval_qs": eval_qs[:10],
    "ref_response_strs": ref_response_strs[:10],
}
param_tuner = ParamTuner(
    param_fn=objective_function,
    param_dict=param_dict,
    fixed_param_dict=fixed_param_dict,
    show_progress=True,
)

# 3. Execute hyperparameter search
results = param_tuner.tune()
best_result = results.best_run_result
best_chunk_size = results.best_run_result.params["chunk_size"]


In [None]:
print("best chunk size is: ", best_chunk_size)

best chunk size is:  256


## Response Synthesizer
A Response Synthesizer is what generates a response from an LLM, using a user query and a given set of text chunks. The output of a response synthesizer is a Response object.

The method for doing this can take many forms, from as simple as iterating over text chunks, to as complex as building a tree. The main idea here is to simplify the process of generating a response using an LLM across your data.

### Tree Summarize
Query the LLM with the `summary_template` prompt as many times as needed to cover all concatenated chunks. Each call yields an answer; these answers are then treated as chunks themselves and passed recursively through another `tree_summarize` LLM call, and so on, until only one chunk remains, which is the final answer.

![Tree Summarize](https://static.bluelabellabs.com/wp-content/uploads/2024/02/Tree-Summarize.png)
*Image Source: [Source](https://www.bluelabellabs.com/blog/llamaindex-response-modes-explained/)*
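
The recursion described above can be sketched with a stand-in for the LLM call. This is a simplification under stated assumptions: the function below groups a fixed number of texts per call (`fan_out`, a hypothetical parameter), whereas the description above packs as many chunks into each call as fit. The `summarize` callable is a placeholder for the actual LLM query.

```python
def tree_summarize(chunks, summarize, fan_out=2):
    """Reduce `chunks` to one answer by summarizing groups level by level.

    summarize: callable taking a list of texts and returning one text
               (a stand-in for an LLM call with a summary prompt)
    fan_out:   how many texts are combined per call; must be at least 2
    """
    if not chunks:
        raise ValueError("need at least one chunk")
    if fan_out < 2:
        raise ValueError("fan_out must be at least 2")

    level = list(chunks)
    while True:
        # One pass over the current level: each group becomes one summary.
        level = [
            summarize(level[i:i + fan_out])
            for i in range(0, len(level), fan_out)
        ]
        if len(level) == 1:
            return level[0]


if __name__ == "__main__":
    # A fake "LLM" that shows how the tree is built instead of summarizing.
    show_structure = lambda group: "(" + "+".join(group) + ")"
    print(tree_summarize(["a", "b", "c", "d", "e"], show_structure))
```

With five chunks and a fan-out of two, the sketch makes six calls across three levels, which illustrates why this mode costs more LLM calls than a single-prompt answer.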


In [None]:
from llama_index.core import get_response_synthesizer
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.response.pprint_utils import pprint_response

# configure retriever
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=5)

# configure response synthesizer
response_synthesizer = get_response_synthesizer(
    response_mode="tree_summarize")

# assemble query engine
query_engine = RetrieverQueryEngine(
    retriever=retriever,
    response_synthesizer=response_synthesizer,)

response = query_engine.query(question)

pprint_response(response)

Final Response: The document mentions that there are concerns about
the definitions of 'emotion recognition' being technically flawed and
recommends adjustments. Additionally, there are calls for a wider ban
on the use of AI for emotion recognition, categorizing people based on
physiological, behavioral, or biometric data, and for stronger impact
assessment and transparency requirements in this regard.


## Semantic Reranker

The Semantic Reranker is an advanced component that enhances the relevance of the retrieved document chunks before they are used for response generation. By applying semantic analysis, the reranker evaluates and prioritizes chunks based on their contextual alignment with the user's query, ensuring that the most pertinent information is considered. This process not only improves the accuracy and relevance of the generated responses but also addresses common issues such as information redundancy and irrelevance. This section will delve into the implementation and integration of a Semantic Reranker within the RAG framework.

![Semantic Reranker](https://framerusercontent.com/images/4atMCrE67i6XDQdSySevR8sVhM.png)
*Image Source: [Source](https://dify.ai/blog/hybrid-search-rerank-rag-improvement)*
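
The retrieve-wide, then rerank-narrow pattern can be sketched as follows. The scoring function here is a deliberately crude word-overlap measure standing in for a learned reranking model, and the names `overlap_score` and `rerank` are hypothetical; only the overall shape (score each candidate against the query, keep the best `top_n`) is the point.

```python
def overlap_score(query, text):
    """Toy relevance score: share of query words that also occur in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & text_words) / len(query_words)


def rerank(query, candidates, score_fn=overlap_score, top_n=2):
    """Re-order retrieved candidates by score_fn and keep the best top_n."""
    ranked = sorted(candidates, key=lambda c: score_fn(query, c), reverse=True)
    return ranked[:top_n]


if __name__ == "__main__":
    retrieved = [
        "fines for non-compliance",
        "ban on emotion recognition systems",
        "emotion recognition is limited risk",
        "definition of AI systems",
    ]
    for text in rerank("emotion recognition ban", retrieved, top_n=2):
        print(text)
```

Retrieving more candidates than the generator will finally see gives the reranker room to promote passages that the embedding search ranked too low.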


In [None]:
%pip install -q llama-index-postprocessor-cohere-rerank


In [None]:
from llama_index.postprocessor.cohere_rerank import CohereRerank
import os

api_key = os.getenv("COHERE_API_KEY")
cohere_rerank = CohereRerank(api_key=api_key, top_n=2)

# assemble query engine
query_engine = RetrieverQueryEngine(
    retriever=retriever,
    response_synthesizer=response_synthesizer,
    node_postprocessors=[cohere_rerank])

response = query_engine.query(question)

pprint_response(response)

Final Response: The document mentions that some stakeholders argue
that the definitions of 'emotion recognition' are technically flawed
and recommend adjustments. Additionally, it states that AI systems
presenting 'limited risk', such as emotion recognition systems, would
be subject to a limited set of transparency obligations.


## Integration with LangChain (Optional)

To further expand our generative AI application's capabilities, we can integrate LangChain, a framework that provides additional flexibility and power through its tools, agents and chains. LangChain allows for the creation of complex workflows, enabling our app to handle more intricate queries and perform a broader range of tasks. This integration showcases the extensibility of RAG systems and their potential for customization to meet specific application needs.

![LangChain as a central hub](https://www.kdnuggets.com/wp-content/uploads/c_langchain_101_build_gptpowered_applications_1.jpg)
*Image Source: [Source](https://www.kdnuggets.com/2023/04/langchain-101-build-gptpowered-applications.html)*



In [None]:
!pip -q install langchain


In [None]:
from llama_index.core.langchain_helpers.agents import IndexToolConfig, LlamaIndexTool
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(memory_key='chat_history', return_messages=True)

tool_config = IndexToolConfig(
    query_engine=query_engine,
    name="Vector Index",
    description="useful for when you want to answer queries about the document",
    tool_kwargs={"return_direct": True},
    memory=memory)

# create the tool
tool = LlamaIndexTool.from_tool_config(tool_config)

response=tool.run(question)

custom_print(response)

The document mentions that some stakeholders argue the definitions of 'emotion 
recognition' are technically flawed and recommend adjustments. Additionally, it 
states that AI systems presenting 'limited risk', such as emotion recognition 
systems, would be subject to a limited set of transparency obligations. 


# Getting Ready for Production

This section guides you through packaging the RAG application for a production environment. We cover the essential steps: dependency management, wrapping the application with Streamlit for user interaction, and setting up a secure tunnel for public access. After these steps, the RAG system can be deployed and used by end users.
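
Before launching the app, it can help to verify that the API keys it depends on are actually present. The helper below is a small sketch of such a check. `missing_env_vars` is a hypothetical name; `COHERE_API_KEY` is the variable read elsewhere in this notebook, while `OPENAI_API_KEY` is assumed here as the variable the OpenAI client reads.

```python
import os


def missing_env_vars(names, env=None):
    """Return the names from `names` that are unset or empty in `env`.

    env defaults to the process environment; passing a dict makes the
    function easy to test.
    """
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    # Assumed variable names; adjust to whatever your .env file defines.
    missing = missing_env_vars(["OPENAI_API_KEY", "COHERE_API_KEY"])
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required environment variables are set.")
```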


In [None]:
!pip install -q -r requirements.txt

In [None]:
%%writefile app.py


import streamlit as st
from llama_index.core import(
    SimpleDirectoryReader,
    VectorStoreIndex,
    ServiceContext,
    Document,
    get_response_synthesizer,
    StorageContext
)
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.langchain_helpers.agents import IndexToolConfig, LlamaIndexTool
from llama_index.vector_stores.qdrant import QdrantVectorStore
from llama_index.postprocessor.cohere_rerank import CohereRerank
import qdrant_client
from langchain.memory import ConversationBufferMemory
import time
import nest_asyncio
from dotenv import load_dotenv
import tempfile
import os

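nest_asyncio.apply()

load_dotenv()

llm = OpenAI(
    model="gpt-3.5-turbo",
    temperature=0.2,
)
embed_model = OpenAIEmbedding()

# Register the models globally so that indexing and querying actually use them
from llama_index.core import Settings
Settings.llm = llm
Settings.embed_model = embed_model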

client = qdrant_client.QdrantClient(location=":memory:")
vector_store = QdrantVectorStore(client=client, collection_name="test_store")

storage_context = StorageContext.from_defaults(vector_store=vector_store)

cohere_api_key = os.getenv("COHERE_API_KEY")
cohere_rerank = CohereRerank(api_key=cohere_api_key, top_n=2)

# Set the header of the Streamlit application
st.header("Workshop Transcript Chatbot")

# Initialize session state to store the chat history
if "messages" not in st.session_state.keys(): # Initialize the chat message history
    st.session_state.messages = [
        {"role": "assistant", "content": "Ask me a question about DPS Workshops!"}
    ]

@st.cache_resource(show_spinner=False)
def load_data(uploaded_files):
    """
    Load and index workshop transcripts uploaded by the user.

    Args:
        uploaded_files: A list of uploaded file objects.

    Returns:
        VectorStoreIndex: An indexed representation of the workshop transcripts.
    """
    with st.spinner(text="Indexing uploaded workshop docs – hang tight! This might take some time."):
        with tempfile.TemporaryDirectory() as temp_dir:
            for uploaded_file in uploaded_files:
                if uploaded_file is not None:
                    file_path = os.path.join(temp_dir, uploaded_file.name)
                    with open(file_path, "wb") as f:
                        f.write(uploaded_file.getbuffer())

            reader = SimpleDirectoryReader(input_dir=temp_dir, recursive=True)
            docs = reader.load_data()

            if docs:
                # Chunk and embed the documents, then store the vectors in Qdrant
                index = VectorStoreIndex.from_documents(docs, storage_context=storage_context)

                return index
            else:
                return None

# Streamlit file uploader
uploaded_files = st.file_uploader("Upload workshop documents", accept_multiple_files=True)

if uploaded_files:
    index = load_data(uploaded_files)

    if index:

        # configure retriever
        retriever = VectorIndexRetriever(
            index=index,
            similarity_top_k=5,
        )

        # configure response synthesizer
        response_synthesizer = get_response_synthesizer(
            response_mode="tree_summarize",
        )

        # assemble query engine
        query_engine = RetrieverQueryEngine(
            retriever=retriever,
            response_synthesizer=response_synthesizer,
            node_postprocessors=[cohere_rerank])


        memory = ConversationBufferMemory(
            memory_key='chat_history', return_messages=True
            )


        tool_config = IndexToolConfig(
            query_engine=query_engine,
            name="Vector Index",
            description="useful for when you want to answer queries about the document",
            tool_kwargs={"return_direct": True},
            memory=memory
            )

        # create the tool
        tool = LlamaIndexTool.from_tool_config(tool_config)

        # Chat interface for user input and displaying chat history
        if prompt := st.chat_input("Your question"):
            st.session_state.messages.append({"role": "user", "content": prompt})
        for message in st.session_state.messages:
            with st.chat_message(message["role"]):
                st.write(message["content"])

        # Generate and display a response when the last message is from the user
        if st.session_state.messages[-1]["role"] != "assistant":
            with st.chat_message("assistant"):
                with st.spinner("Thinking..."):
                    # The tool returns the answer as a plain string
                    response = tool.run(prompt)
                    st.write(response)
                    message = {"role": "assistant", "content": response}
                    st.session_state.messages.append(message)  # Add response to message history


In [None]:
!npm install localtunnel

In [None]:
!streamlit run app.py &>/content/logs.txt &

In [None]:
!curl https://loca.lt/mytunnelpassword


In [None]:
!npx localtunnel --port 8501