# **[Streamlit Setup] AyikaBot - Generative QA Chatbot for Climate Education**

This project sets up and tests a climate-focused Q&A chatbot built on a fine-tuned T5 model. It uses Streamlit for the web interface and ngrok to tunnel the local app to a public URL from Google Colab. This setup is meant for testing the app before deploying it to Streamlit Cloud.

#### **Features:**
- Loads a custom fine-tuned T5 model for climate education Q&A.
- Runs a Streamlit app to interact with the chatbot.
- Uses ngrok to generate a public link for live testing from Google Colab.

#### **Useful Links:**
- Check out more about the project here: https://github.com/eadewusic/Domain-Specific-QA-Chatbot-using-Transformer-Models
- You can also find the fine-tuned pretrained model and other files here: https://huggingface.co/Climi/Climate-Education-QA-Chatbot

**Author:** Eunice Adewusi

**Date:** June 2025

In [1]:
!pip install streamlit
!pip install pyngrok

Collecting streamlit
  Downloading streamlit-1.46.0-py3-none-any.whl.metadata (9.0 kB)
Collecting watchdog<7,>=2.1.5 (from streamlit)
  Downloading watchdog-6.0.0-py3-none-manylinux2014_x86_64.whl.metadata (44 kB)
Collecting pydeck<1,>=0.8.0b4 (from streamlit)
  Downloading pydeck-0.9.1-py2.py3-none-any.whl.metadata (4.1 kB)
Downloading streamlit-1.46.0-py3-none-any.whl (10.1 MB)
Downloading pydeck-0.9.1-py2.py3-none-any.whl (6.9 MB)
Downloading watchdog-6.0.0-py3-none-manylinux2014_x86_64.whl (79 kB)

In [2]:
# Install localtunnel, an alternative tunneling tool (the app itself is exposed through ngrok later in this notebook)
!npm install -g localtunnel

[1G[0K⠙[1G[0K⠹[1G[0K⠸[1G[0K⠼[1G[0K⠴[1G[0K⠦[1G[0K⠧[1G[0K⠇[1G[0K⠏[1G[0K⠋[1G[0K⠙[1G[0K⠹[1G[0K⠸[1G[0K⠼[1G[0K⠴[1G[0K⠦[1G[0K⠧[1G[0K⠇[1G[0K⠏[1G[0K⠋[1G[0K⠙[1G[0K⠹[1G[0K⠸[1G[0K⠼[1G[0K⠴[1G[0K⠦[1G[0K⠧[1G[0K⠇[1G[0K⠏[1G[0K⠋[1G[0K⠙[1G[0K⠹[1G[0K⠸[1G[0K⠼[1G[0K⠴[1G[0K⠦[1G[0K
added 22 packages in 5s
[1G[0K⠦[1G[0K
[1G[0K⠦[1G[0K3 packages are looking for funding
[1G[0K⠦[1G[0K  run `npm fund` for details
[1G[0K⠦[1G[0K

In [3]:
# Import necessary libraries
# Standard library
import os
import re
import subprocess
import sys
import threading
import time
from typing import Tuple, List, Optional

# Third-party packages
import requests
import streamlit as st
from transformers import T5Tokenizer, TFT5ForConditionalGeneration

In [4]:
# set ngrok authentication token
from pyngrok import ngrok
ngrok.set_auth_token("insert_your_auth_key_from_ngrok_site")
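Hardcoding the token in a notebook makes it easy to leak when the notebook is shared. As a minimal sketch, the token could be read from an environment variable instead. The helper name `load_ngrok_token` and the variable name `NGROK_AUTHTOKEN` are assumptions chosen for illustration, not part of the original notebook.

```python
import os


def load_ngrok_token(env_var: str = "NGROK_AUTHTOKEN") -> str:
    """Return the ngrok auth token stored in an environment variable.

    The variable name is an assumption for this sketch; use whatever
    name you set in your own Colab session.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return token


# Usage in the notebook (requires pyngrok, installed in the first cell):
# from pyngrok import ngrok
# ngrok.set_auth_token(load_ngrok_token())
```

Raising early when the variable is missing gives a clearer error than a failed tunnel later in the notebook.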



In [None]:
# Get the tunnel password from loca.lt (the localtunnel service).
# The tunnel landing page asks for this value before it shows the app.
password = requests.get("https://loca.lt/mytunnelpassword").text.strip()
print(f"\n Tunnel Password: {password}\n")

- When this cell is run, the output looks like "Tunnel Password: ##.###.##.###" (where each # is a digit).

- Enter that value on the tunnel landing page (you'll get the link from the last cell in this notebook) so your app can load.

Hiding mine :)
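
Since the password is expected to look like an IP address, a quick format check can catch an error page or an empty response before you paste it anywhere. The sketch below assumes the value is always an IPv4 or IPv6 address; the helper name `looks_like_tunnel_password` is hypothetical, and the address shown is a documentation-only example.

```python
import ipaddress


def looks_like_tunnel_password(text: str) -> bool:
    """Return True if `text` parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(text.strip())
        return True
    except ValueError:
        return False


print(looks_like_tunnel_password("203.0.113.7"))         # True
print(looks_like_tunnel_password("<html>error</html>"))  # False
```

In the notebook, this check could be applied to `password` right after the `requests.get(...)` call.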

In [6]:
# check folder content
!ls /content/climate_chatbot_BEST_exp4c

added_tokens.json	       generation_config.json	spiece.model
ARCHITECTURE.md		       model_architecture.json	tf_model.h5
ayikabot_complete_pipeline.py  optimal_generation.py	tokenizer_config.json
comprehensive_results.json     run_chatbot.py
config.json		       special_tokens_map.json


In [7]:
# Load the best fine-tuned pre-trained T5 model and tokenizer
model = TFT5ForConditionalGeneration.from_pretrained("/content/climate_chatbot_BEST_exp4c")
tokenizer = T5Tokenizer.from_pretrained("/content/climate_chatbot_BEST_exp4c")

# Save loaded model and tokenizer to a new directory
model.save_pretrained("/content/ayikabot_clean")
tokenizer.save_pretrained("/content/ayikabot_clean")

All model checkpoint layers were used when initializing TFT5ForConditionalGeneration.

All the layers of TFT5ForConditionalGeneration were initialized from the model checkpoint at /content/climate_chatbot_BEST_exp4c.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFT5ForConditionalGeneration for predictions without further training.


('/content/ayikabot_clean/tokenizer_config.json',
 '/content/ayikabot_clean/special_tokens_map.json',
 '/content/ayikabot_clean/spiece.model',
 '/content/ayikabot_clean/added_tokens.json')

Re-saving the loaded model and tokenizer to `/content/ayikabot_clean` gives a cleaner, more focused directory that contains only the files needed for deployment.
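
Before pointing the Streamlit app at `/content/ayikabot_clean`, it can help to confirm the directory is complete. The following is a minimal sketch: the helper `missing_files` and the `REQUIRED_FILES` list are assumptions based on the directory listing and save output shown above, not an official list of required files.

```python
from pathlib import Path
from typing import Iterable, List

# Assumed list, taken from the directory listing and save output above.
REQUIRED_FILES = (
    "config.json",
    "tf_model.h5",
    "tokenizer_config.json",
    "special_tokens_map.json",
    "spiece.model",
)


def missing_files(model_dir: str, required: Iterable[str] = REQUIRED_FILES) -> List[str]:
    """Return the names in `required` that are not files inside `model_dir`."""
    base = Path(model_dir)
    return [name for name in required if not (base / name).is_file()]


# Example call in the notebook:
# missing = missing_files("/content/ayikabot_clean")
# if missing:
#     print("Missing files:", missing)
```

An empty list means every expected file is present.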

In [8]:
# The "/content/ayikabot_streamlit_app.py" file contains all the code needed for streamlit to run.
# Whatever is to be changed on the UI should be changed in this file
def run_streamlit():
    with open("streamlit_log.txt", "w") as f:
        subprocess.run(
            ["streamlit", "run", "/content/ayikabot_streamlit_app.py", "--server.port", "8501", "--server.headless", "true"],
            stdout=f,
            stderr=subprocess.STDOUT
        )

# Start Streamlit in a background thread
streamlit_thread = threading.Thread(target=run_streamlit)
streamlit_thread.start()

# Wait for Streamlit to boot up fully
print("Waiting for Streamlit to start...")
time.sleep(20)  # Increased wait time to 20 seconds for larger applications

# Kill any existing ngrok tunnels before starting a new one
print("Killing any existing ngrok tunnels...")
ngrok.kill()
time.sleep(5) # Give ngrok time to shut down

# Now start ngrok tunnel AFTER Streamlit is fully running
try:
    print("Starting new ngrok tunnel...")
    public_url = ngrok.connect(8501).public_url # Access the public_url attribute
    print(f"App is live at: {public_url}")
except Exception as e:
    print(f"Failed to start ngrok tunnel: {e}")

Waiting for Streamlit to start...
Killing any existing ngrok tunnels...
Starting new ngrok tunnel...
App is live at: https://02a0-34-125-57-200.ngrok-free.app
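
The fixed `time.sleep(20)` in the cell above is a guess at how long Streamlit takes to boot. As a hedged alternative, the notebook could poll the port until it accepts connections. The helper `wait_for_port` below is a hypothetical sketch using only the standard library. Note that an open port shows the server is accepting TCP connections, not that the model has finished loading inside the app.

```python
import socket
import time


def wait_for_port(port: int, host: str = "localhost",
                  timeout: float = 60.0, interval: float = 0.5) -> bool:
    """Poll `host:port` until a TCP connection succeeds or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connection means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False


# Example call in the notebook, in place of time.sleep(20):
# if not wait_for_port(8501):
#     print("Streamlit did not start in time; check streamlit_log.txt")
```

Polling returns as soon as the server is reachable and reports a clear failure when it never comes up, instead of starting the tunnel against a dead port.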


The Streamlit app link is https://02a0-34-125-57-200.ngrok-free.app

The app still needs to be deployed, because the generated public URL becomes invalid once the Colab runtime stops and the ngrok tunnel closes. I used this approach to test the code and functionality before deploying to Streamlit Cloud.