# **1. Install Required Libraries**

This section installs the Python libraries that the RAG application depends on.  
Each `pip install` uses the `-q` flag to keep the installation output quiet.


1. `chromadb`: For vector database operations, crucial for storing and retrieving document embeddings.

2. `streamlit`: To build the interactive web application interface.

3. `torch`: PyTorch, a foundational deep learning framework often used by Hugging Face models.

4. `jedi`: An autocompletion and static analysis library for Python, often a dependency for development environments.

5. `transformers`: Hugging Face's library, providing access to state-of-the-art NLP models.

6. `llama-index`: A data framework designed for building LLM applications, simplifying data ingestion, indexing, and querying.

7. `llama-index-llms-huggingface`: Integration for using Hugging Face Large Language Models with LlamaIndex.

8. `llama-index-postprocessor-colbert-rerank`: A LlamaIndex post-processor to re-rank search results, improving relevance.

9. `llama-index-vector-stores-chroma`: Integration for using ChromaDB as the vector store with LlamaIndex.

10. `llama-index-embeddings-huggingface`: Integration for using Hugging Face embedding models with LlamaIndex.


In [1]:
!pip install -q chromadb
!pip install -q streamlit
!pip install -q torch
!pip install -q jedi
!pip install -q transformers
!pip install -q llama-index
!pip install -q llama-index-llms-huggingface
!pip install -q llama-index-postprocessor-colbert-rerank
!pip install -q llama-index-vector-stores-chroma
!pip install -q llama-index-embeddings-huggingface
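
Because `-q` hides most of pip's output, it can help to confirm which packages are actually present in the environment. The sketch below uses only the standard library's `importlib.metadata`. The helper name `installed_versions` is hypothetical and not part of the original notebook.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names exactly as passed to pip in the install cell.
REQUIRED = [
    "chromadb", "streamlit", "torch", "jedi", "transformers", "llama-index",
    "llama-index-llms-huggingface", "llama-index-postprocessor-colbert-rerank",
    "llama-index-vector-stores-chroma", "llama-index-embeddings-huggingface",
]

def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found

# Print one line per package so a failed install is easy to spot.
for name, ver in installed_versions(REQUIRED).items():
    print(f"{name:45s} {ver or 'NOT INSTALLED'}")
```

Any package reported as `NOT INSTALLED` can be reinstalled without `-q` to see pip's full error output.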


# **2. Navigate to the Project Directory**

The notebook must run from the directory that contains `rag_app.py` and the other project files.  
This section checks the current working directory and changes it to the project folder. The path below assumes Google Drive is already mounted at `/content/drive`.

In [2]:
!pwd

/content


In [3]:
%cd "/content/drive/MyDrive/RAG Project"

/content/drive/MyDrive/RAG Project


In [4]:
!pwd

/content/drive/MyDrive/RAG Project
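
Being in the right directory does not guarantee the project files are there. A small check like the following, a sketch using the standard library's `pathlib` with a hypothetical helper name `missing_files`, can catch a missing `rag_app.py` before the app is launched.

```python
from pathlib import Path

def missing_files(directory, names):
    """Return the entries of `names` that are not regular files in `directory`."""
    base = Path(directory)
    return [name for name in names if not (base / name).is_file()]

# Check the current working directory for the app script.
missing = missing_files(Path.cwd(), ["rag_app.py"])
if missing:
    print(f"Missing from {Path.cwd()}: {missing}")
else:
    print("All expected project files found.")
```

The list of expected files can be extended with any other files the project needs.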


# **Install Bitsandbytes**

`bitsandbytes` provides 8-bit and 4-bit quantization for models. Quantized weights take far less GPU memory, which lets larger models fit on the GPUs available in Colab. Inference speed may or may not improve, depending on the model and hardware.

In [5]:
!pip install -q bitsandbytes
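
To make the memory saving concrete, here is a back-of-the-envelope calculation. It counts weight storage only (no activations, cache, or quantization overhead), and the 7-billion-parameter model size is an illustrative assumption, not something taken from this notebook.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate memory for model weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

# Illustrative 7-billion-parameter model (an assumed size).
params = 7_000_000_000
for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: ~{weight_memory_gb(params, bits):.1f} GB")
```

Under these assumptions, going from 16-bit to 8-bit weights halves the footprint, and 4-bit halves it again.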


# **Hugging Face Hub Login**

This step logs you into the Hugging Face Hub, which is required to download private or gated models (models whose access terms you must accept on the Hub first).
The token is read from Colab's Secrets panel under the name `HuggingFace`, so it never appears in the notebook itself.

In [6]:
from huggingface_hub import login
from google.colab import userdata

hf_token = userdata.get('HuggingFace')

if hf_token:
  login(token=hf_token)
  print("Logged in to HuggingFace")
else:
  print("HuggingFace login token not found")

Logged in to HuggingFace
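
One caveat, stated as an assumption about Colab's behavior rather than a documented guarantee: `userdata.get` may raise an exception when a secret is missing or notebook access has not been granted, instead of returning an empty value. In that case the `else` branch above would never run. A defensive sketch follows. The helper `read_secret` and the choice of an environment-variable fallback are hypothetical additions, not part of the original notebook.

```python
import os

def read_secret(name, getter, env_var=None):
    """Return the secret `name` via `getter`, falling back to an environment variable.

    `getter` is any callable taking the secret name (in Colab, `userdata.get`).
    Returns None when neither source yields a non-empty value.
    """
    try:
        value = getter(name)
    except Exception:  # e.g. secret not defined or access not granted
        value = None
    if not value and env_var:
        value = os.environ.get(env_var)
    return value or None
```

In this notebook it could be called as `read_secret('HuggingFace', userdata.get, env_var='HF_TOKEN')`, with the result passed to `login(token=...)` only when it is not `None`.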


# **Install pyngrok**

`pyngrok` is a Python wrapper around ngrok. It opens a tunnel from a public URL to a local port, which is how a Streamlit app running inside a Colab environment can be reached from a browser.

In [7]:
!pip install -q pyngrok

# **Install ngrok**

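Install the `ngrok` package from PyPI quietly. Note that `pyngrok`, installed above, also provides an `ngrok` command that wraps the ngrok binary, so the command used in the next step is likely available either way.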

In [8]:
!pip install -q ngrok


# **Configure ngrok Authtoken**

This step authenticates the ngrok client. An authtoken ties tunnels to your ngrok account, which allows longer session times and enables account features such as custom subdomains (depending on your plan). The token is read from Colab secrets under the name `ngrok`, so it is not stored in the notebook.

In [9]:
from google.colab import userdata

ngrok_auth_token = userdata.get('ngrok')

if ngrok_auth_token:
  get_ipython().system(f'ngrok config add-authtoken {ngrok_auth_token}')
  print("ngrok authtoken added")
else:
  print("ngrok authtoken not found")

Authtoken saved to configuration file: /root/.config/ngrok/ngrok.yml
ngrok authtoken added
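
The cell above interpolates the token directly into a shell command. That works for ordinary tokens, but quoting the value is a cheap safeguard against stray whitespace or shell metacharacters. A minimal sketch using the standard library's `shlex.quote`. The helper name `authtoken_command` and the sample token are hypothetical.

```python
import shlex

def authtoken_command(token):
    """Build the `ngrok config add-authtoken` command with the token shell-quoted."""
    if not token or not token.strip():
        raise ValueError("empty ngrok authtoken")
    return f"ngrok config add-authtoken {shlex.quote(token.strip())}"

# Placeholder value for illustration only; never hard-code a real token.
print(authtoken_command("example_token_123"))
```

The returned string can be passed to `get_ipython().system(...)` in place of the f-string. `pyngrok` also exposes `ngrok.set_auth_token(token)`, which avoids the shell entirely.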


# **Setup and Run Streamlit with ngrok**

This section starts the Streamlit app in the background, discarding its output with `&>/dev/null`.  
ngrok then opens a secure tunnel to port 8501, Streamlit's default port, and the cell prints the publicly accessible URL.

In [10]:
!streamlit run rag_app.py &>/dev/null&

from pyngrok import ngrok
public_url = ngrok.connect('8501')
print(f"Streamlit App URL: {public_url}")

Streamlit App URL: NgrokTunnel: "https://5252-34-105-8-145.ngrok-free.app" -> "http://localhost:8501"
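
Because Streamlit is started in the background with its output discarded, the tunnel may open before the app is listening, and a failed start goes unnoticed. The sketch below polls the port with the standard library's `socket` module. The helper name `wait_for_port` and its default timings are assumptions for illustration.

```python
import socket
import time

def wait_for_port(port, host="localhost", timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

Calling `wait_for_port(8501)` before `ngrok.connect('8501')` and opening the tunnel only when it returns `True` makes a failed Streamlit start visible right away. If it returns `False`, rerunning `streamlit run rag_app.py` without the redirect should show the error.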
