# Learning Gate

## Challenge | Building and deploying an interactive dashboard to production

### Data Science 2 - Web Applications for Data Science

## Introduction

A streaming company offers you a free film database so that you can build applications and dashboards that display films, run searches, and filter films by director, among other functions. Your task is to develop a Python and Streamlit project that accesses the data in Firestore to query the required information.

## Objective

Build and deploy to production an interactive dashboard with Streamlit and Firestore.


## Installing Dependencies, Connecting to GitHub and Creating the Database

### Installing Dependencies

In [1]:
!pip install firebase-admin
!pip install streamlit
!pip install pyngrok
!pip install toml
!git --version

Collecting streamlit
  Downloading streamlit-1.41.1-py2.py3-none-any.whl.metadata (8.5 kB)
Collecting watchdog<7,>=2.1.5 (from streamlit)
  Downloading watchdog-6.0.0-py3-none-manylinux2014_x86_64.whl.metadata (44 kB)
Collecting pydeck<1,>=0.8.0b4 (from streamlit)
  Downloading pydeck-0.9.1-py2.py3-none-any.whl.metadata (4.1 kB)
Downloading streamlit-1.41.1-py2.py3-none-any.whl (9.1 MB)
Downloading pydeck-0.9.1-py2.py3-none-any.whl (6.9 MB)
Downloading watchdog-6.0.0-py3-none-manylinux2014_x86_64.whl (79 kB)

### Connecting to GitHub

In [2]:
!git clone https://xxxxxxxxxxxxxxxxxxxxx@github.com/hazutecuhtli/TLG_StreamlitDeployment.git

Cloning into 'TLG_StreamlitDeployment'...
remote: Enumerating objects: 7, done.
remote: Counting objects: 100% (7/7), done.
remote: Compressing objects: 100% (5/5), done.
remote: Total 7 (delta 0), reused 4 (delta 0), pack-reused 0 (from 0)
Receiving objects: 100% (7/7), done.


- ***Validating the cloned GitHub repository***

In [3]:
%cd /content/TLG_StreamlitDeployment
!ls -la

/content/TLG_StreamlitDeployment
total 32
drwxr-xr-x 3 root root 4096 Jan 21 18:04 .
drwxr-xr-x 1 root root 4096 Jan 21 18:02 ..
drwxr-xr-x 8 root root 4096 Jan 21 18:02 .git
-rw-r--r-- 1 root root 2412 Jan 21 18:04 names-firebase.json
-rw-r--r-- 1 root root   25 Jan 21 18:02 README.md
-rw-r--r-- 1 root root  124 Jan 21 18:02 requirements.txt
-rw-r--r-- 1 root root 4239 Jan 21 18:02 streamlit_app.py


- ***Checking the GitHub remote***

In [16]:
#!git remote -v

### Populating the Firestore database by migrating movie data from a CSV file

- ***Creating a directory to store the Firestore secrets***

In [5]:
!mkdir .streamlit
!ls -la

total 36
drwxr-xr-x 4 root root 4096 Jan 21 18:04 .
drwxr-xr-x 1 root root 4096 Jan 21 18:02 ..
drwxr-xr-x 8 root root 4096 Jan 21 18:02 .git
-rw-r--r-- 1 root root 2412 Jan 21 18:04 names-firebase.json
-rw-r--r-- 1 root root   25 Jan 21 18:02 README.md
-rw-r--r-- 1 root root  124 Jan 21 18:02 requirements.txt
drwxr-xr-x 2 root root 4096 Jan 21 18:04 .streamlit
-rw-r--r-- 1 root root 4239 Jan 21 18:02 streamlit_app.py


- ***Creating a TOML file to store the Firestore credentials***

In [6]:
import toml

output_file = ".streamlit/secrets.toml"

with open("names-firebase.json") as json_file:
  json_text = json_file.read()

config = {"textkey":json_text}
toml_config = toml.dumps(config)

with open(output_file, "w") as target:
  target.write(toml_config)
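
Before the key is copied into `secrets.toml`, it can help to confirm that the JSON file is complete. The following is a minimal sketch of such a check, not part of the original notebook: the helper name `missing_key_fields` and the list of required fields are assumptions based on the usual layout of a Google service-account key file.

```python
import json

# Fields this sketch treats as mandatory; assumed from the usual layout of a
# Google service-account key file, not taken from the notebook.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email", "token_uri")


def missing_key_fields(json_text):
    """Return the required fields that are absent or empty in the key JSON."""
    key_dict = json.loads(json_text)  # raises json.JSONDecodeError on bad input
    return [field for field in REQUIRED_FIELDS if not key_dict.get(field)]


# Example with a fake, incomplete key (no real credentials involved)
sample = json.dumps({"type": "service_account", "project_id": "demo"})
print(missing_key_fields(sample))
```

In the notebook, the same function could be called on `json_text` right after reading `names-firebase.json`; an empty list would mean that none of the assumed fields is missing.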

- ***Validating the created TOML file***

In [18]:
#!cat .streamlit/secrets.toml

- ***Migrating movie data from a CSV file to the Firestore NoSQL database***

In [10]:
# Importing libraries
import os

import firebase_admin
import pandas as pd
from firebase_admin import credentials, firestore

# Loading the service-account credentials for the Firestore connection
cred = credentials.Certificate("names-firebase.json")
firebase_admin.initialize_app(cred)
# Connecting to the "movies" Firestore collection
db = firestore.client()
doc_ref = db.collection("movies")
# Reading data from the CSV file
df = pd.read_csv(os.path.join(os.getcwd(), 'movies.csv'))
# Migrating the data: one Firestore document per CSV row
tmp = df.to_dict(orient='records')
# add() returns an (update_time, DocumentReference) tuple for each new document
[doc_ref.add(record) for record in tmp]

[(DatetimeWithNanoseconds(2025, 1, 21, 2, 29, 54, 977585, tzinfo=datetime.timezone.utc),
  <google.cloud.firestore_v1.document.DocumentReference at 0x7a91cfe80a50>),
 (DatetimeWithNanoseconds(2025, 1, 21, 2, 29, 55, 99676, tzinfo=datetime.timezone.utc),
  <google.cloud.firestore_v1.document.DocumentReference at 0x7a91ce3dabd0>),
 (DatetimeWithNanoseconds(2025, 1, 21, 2, 29, 55, 275400, tzinfo=datetime.timezone.utc),
  <google.cloud.firestore_v1.document.DocumentReference at 0x7a91cea313d0>),
 (DatetimeWithNanoseconds(2025, 1, 21, 2, 29, 55, 575627, tzinfo=datetime.timezone.utc),
  <google.cloud.firestore_v1.document.DocumentReference at 0x7a91cf07ee50>),
 (DatetimeWithNanoseconds(2025, 1, 21, 2, 29, 55, 690277, tzinfo=datetime.timezone.utc),
  <google.cloud.firestore_v1.document.DocumentReference at 0x7a91ce6b7310>),
 (DatetimeWithNanoseconds(2025, 1, 21, 2, 29, 55, 962478, tzinfo=datetime.timezone.utc),
  <google.cloud.firestore_v1.document.DocumentReference at 0x7a91cfe68b90>),
 (Dat
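
The cell above issues one write request per CSV row. For larger files, an alternative is to group the writes with Firestore batched writes. The sketch below is an assumption-laden variant, not the method used in this notebook: `migrate_in_batches` is a hypothetical helper, `db` is expected to be the client returned by `firestore.client()`, and the default `batch_size` of 500 is an assumed, conservative value.

```python
def migrate_in_batches(db, collection_name, records, batch_size=500):
    """Write records to a Firestore collection in groups of batch_size.

    db is expected to be a Firestore client such as the one returned by
    firestore.client(); batch_size=500 is an assumed, conservative default.
    Returns the number of committed batches.
    """
    collection = db.collection(collection_name)
    committed = 0
    for start in range(0, len(records), batch_size):
        batch = db.batch()
        for record in records[start:start + batch_size]:
            # document() with no argument creates a reference with an
            # auto-generated ID, similar to what add() does
            batch.set(collection.document(), record)
        batch.commit()
        committed += 1
    return committed


# Possible usage in the notebook (not executed here):
# migrate_in_batches(db, "movies", df.to_dict(orient="records"))
```

Each batch is committed as a unit, so the number of round trips drops from one per row to one per group of rows.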

## Installing Ngrok

Ngrok opens a public tunnel to the app running locally, which emulates a production deployment during development.

In [8]:
# Downloading Ngrok
!wget https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip
# Installing
!unzip ngrok-stable-linux-amd64.zip

--2025-01-21 18:13:44--  https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip
Resolving bin.equinox.io (bin.equinox.io)... 13.248.244.96, 35.71.179.82, 99.83.220.108, ...
Connecting to bin.equinox.io (bin.equinox.io)|13.248.244.96|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 13921656 (13M) [application/octet-stream]
Saving to: ‘ngrok-stable-linux-amd64.zip’


2025-01-21 18:13:47 (6.23 MB/s) - ‘ngrok-stable-linux-amd64.zip’ saved [13921656/13921656]

Archive:  ngrok-stable-linux-amd64.zip
  inflating: ngrok                   


## Development Environment for the Streamlit app deployment

#### Creating the temporary Ngrok server

The Ngrok setup code shared in the TLG course content did not work for me, so the tunnel is created with the code below instead. Ngrok tunnels port 8501, where the Streamlit app is served from the Google Colab runtime, as follows:

In [9]:
#ngrok.kill()
web_options = 'NGROK'
authtoken = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
if web_options == "NGROK":
  from pyngrok import ngrok
  # Terminate any open tunnels
  ngrok.kill()

  # Setting the authtoken (optional)
  # Get your authtoken from https://dashboard.ngrok.com/auth
  NGROK_AUTH_TOKEN = authtoken
  ngrok.set_auth_token(NGROK_AUTH_TOKEN)

  # Open a tunnel to http://localhost:8501, the port used by Streamlit
  public_url = ngrok.connect(8501, proto="http")
  !echo "----------------------------------------"
  !echo "use once web server has initiated"
  !echo "----------------------------------------"
  print("Tracking URL:", public_url)

----------------------------------------
use once web server has initiated
----------------------------------------
Tracking URL: NgrokTunnel: "https://5926-35-187-148-226.ngrok-free.app" -> "http://localhost:8501"
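
The tunnel URL only responds once something is listening on port 8501. The sketch below shows one way to check for that, assuming Streamlit was started in the background (for example with `subprocess.Popen`) rather than in a blocking cell. The helper `wait_for_port` and its default values are hypothetical and not part of the original notebook.

```python
import socket
import time


def wait_for_port(port, host="localhost", timeout=30.0, interval=0.5):
    """Return True once host:port accepts TCP connections, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connection means a server is listening on the port
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)  # not ready yet; retry until the deadline
    return False


# Possible usage in the notebook (not executed here):
# if wait_for_port(8501):
#     print("Streamlit is up, the tracking URL can be opened")
```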


### Python code for the streamlit app to be deployed

In [10]:
%%writefile /content/TLG_StreamlitDeployment/streamlit_app.py

# *****************************************************************************
# Importing Libraries
# *****************************************************************************
import os, json
import streamlit as st
import pandas as pd
from google.cloud import firestore
from google.oauth2 import service_account

# *****************************************************************************
# Defining Functions
# *****************************************************************************

# Function to load data when the streamlit app starts

@st.cache_data
def loading_data():
  names_ref = list(db.collection(u'movies').stream())
  names_dict = list(map(lambda x: x.to_dict(), names_ref))
  data = pd.DataFrame(names_dict)
  return data

# Function that displays the filtered results in the main Streamlit area

def display_results(space, df, Title, flag=True, Message=''):
  if flag:
    space.subheader(Title)
    space.dataframe(df)
  else:
    space.subheader(Title)
    space.write(Message)

# *****************************************************************************
# Main
# *****************************************************************************

if __name__=="__main__":

  # Defining the credentials to connect to the Firestore db
  key_dict = json.loads(st.secrets["textkey"])
  creds = service_account.Credentials.from_service_account_info(key_dict)
  # Connecting to the Firestore database
  db = firestore.Client(credentials=creds)
  dbNames = db.collection("movies")

  # Displaying the app main title
  st.title('Movie Information')
  # Cache loading of Firestore data to reduce latency
  data = loading_data()

  # Creating a sidebar space within the Streamlit app for a friendly interface
  sidebar = st.sidebar

  # Creating sidebar space to filter movie information from the loaded db
  sidebar.title("Getting Information")
  sidebar.write("Data filtering parameters")
  # Defining variables to store the user filtering parameters for the search
  movie2look = sidebar.text_input("Movie: ")
  movie_search = sidebar.button("Movie Name Search")
  genre2look = sidebar.text_input("Genre: ")
  genre_search = sidebar.button("Genre Search")
  # Obtaining movie name information from the loaded Firestore db
  if movie_search:
    Title = f"Search results for: {movie2look}"
    if movie2look:
      # Case-insensitive literal match; rows with a missing name are skipped
      mask = data["name"].str.lower().str.contains(movie2look.lower(), regex=False, na=False)
      filtered_df = data[mask].reset_index(drop=True)
      search_result = filtered_df.shape[0]
      display_results(st, filtered_df, f"Found {search_result} results")
    else:
      message = 'Please enter a movie name to search for'
      display_results(st, ' ', Title, False, message)
  # Obtaining movie genre information from the loaded Firestore db
  if genre_search:
    Title = f"Search results for movies with genre: {genre2look}"
    if genre2look:
      # Case-insensitive literal match; rows with a missing genre are skipped
      mask = data["genre"].str.lower().str.contains(genre2look.lower(), regex=False, na=False)
      filtered_df = data[mask].reset_index(drop=True)
      search_result = filtered_df.shape[0]
      display_results(st, filtered_df, f"Found {search_result} results")
    else:
      message = 'Please enter a genre to search for'
      display_results(st, ' ', Title, False, message)

  # Creating sidebar space to add movie information to the loaded db
  sidebar.title("Adding New Movie")
  sidebar.write("Defining Movie")
  # Defining the movie information that will be added to the db
  name = sidebar.text_input("Name: ")
  genre = sidebar.text_input("Genre : ")
  director = sidebar.text_input("Director: ")
  company = sidebar.text_input("Company: ")
  submit = sidebar.button("Add New Movie")
  # Adding the new movie to the Firestore db
  if submit:
    if name and genre and director and company:
      # Use the movie name as the document ID; the literal "name" would make
      # every new movie overwrite the same document
      doc_ref = db.collection("movies").document(name)
      doc_ref.set({
        "name": name,
        "genre": genre,
        "director": director,
        "company": company
      })
      # Clear the cached data so the new movie appears in later searches
      loading_data.clear()

      sidebar.write("New Movie added correctly!")
    else:
      sidebar.write("Please fill in all fields before adding a movie")


Overwriting /content/TLG_StreamlitDeployment/streamlit_app.py
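
The two search branches in the app apply the same rule: a case-insensitive substring match on one column. As an illustration only, the sketch below expresses that rule over plain dictionaries (the shape returned by `to_dict()` on each Firestore document) instead of a pandas DataFrame. The function `filter_records` and the sample rows are made up for this example.

```python
def filter_records(records, field, query):
    """Case-insensitive substring match on one field of each record."""
    needle = query.strip().lower()
    if not needle:
        return []  # an empty query matches nothing, as in the app
    return [
        record for record in records
        # Skip records where the field is missing or not a string
        if isinstance(record.get(field), str) and needle in record[field].lower()
    ]


# Made-up sample rows, shaped like the dictionaries returned by to_dict()
movies = [
    {"name": "Alien", "genre": "Horror"},
    {"name": "Aliens", "genre": "Action"},
    {"name": "Heat", "genre": None},
]
print(len(filter_records(movies, "name", "ALIEN")))
```

Keeping the rule in one function means the movie and genre searches differ only in the `field` argument.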


#### Running the Streamlit app in the development environment

In [11]:
!streamlit run /content/TLG_StreamlitDeployment/streamlit_app.py


Collecting usage statistics. To deactivate, set browser.gatherUsageStats to false.

  You can now view your Streamlit app in your browser.

  Local URL: http://localhost:8501
  Network URL: http://172.28.0.12:8501
  External URL: http://35.187.148.226:8501

  Stopping...
  Stopping...


### Updating the GitHub Repository

- ***Creating the requirements.txt file containing the app dependencies***

In [15]:
%%writefile /content/TLG_StreamlitDeployment/requirements.txt
firebase-admin==6.6.0
streamlit==1.41.1
pandas==2.2.2
numpy==1.26.4
pyngrok==7.2.3
google-cloud==0.34.0
google-auth==2.27.0

Overwriting /content/TLG_StreamlitDeployment/requirements.txt
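
Pinned versions are only useful if they match what the app was actually tested with. The sketch below compares `name==version` pins against the packages installed in the current environment, using the standard-library `importlib.metadata` module. The helpers `parse_pins` and `mismatched_pins` are hypothetical names introduced for this example, and the parser only handles simple `name==version` lines.

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Map package name -> pinned version for lines of the form name==version."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def mismatched_pins(pins, lookup=metadata.version):
    """Return {name: (pinned, installed)} where installed differs or is missing."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = lookup(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


# Possible usage in the notebook (not executed here):
# with open("requirements.txt") as handle:
#     print(mismatched_pins(parse_pins(handle.read())))
```

An empty dictionary would indicate that every pinned package is installed at the pinned version.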


- ***Adding the files to be updated in the GitHub repo***

In [12]:
#!git add requirements.txt
#!git add streamlit_app.py
#!git add .streamlit
!git add README.md
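
Since `.streamlit/secrets.toml` holds the Firestore key, it is usually kept out of version control. The sketch below shows one way to make sure a pattern is listed in `.gitignore` before staging files. The helper `ensure_ignored` is hypothetical and not part of the original notebook.

```python
from pathlib import Path


def ensure_ignored(pattern, gitignore=".gitignore"):
    """Append pattern to the .gitignore file unless it is already listed.

    Returns True when the file was changed, False when nothing was needed.
    """
    path = Path(gitignore)
    text = path.read_text() if path.exists() else ""
    if pattern in (line.strip() for line in text.splitlines()):
        return False
    # Start on a fresh line if the existing content lacks a trailing newline
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    path.write_text(text + prefix + pattern + "\n")
    return True


# Possible usage in the notebook (not executed here):
# ensure_ignored(".streamlit/secrets.toml")
# ensure_ignored("names-firebase.json")
```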

- ***Setting the Git user identity***

In [13]:
!git config --global user.email "useremal@mail.com"
!git config --global user.name "username"

- ***GitHub Commit***

In [14]:
!git commit -m "Second commit, version 1.0"

[main fc7fadb] Second commit, version 1.0
 1 file changed, 28 insertions(+), 1 deletion(-)
 rewrite README.md (100%)


- ***Pushing the GitHub Commits***

In [15]:
!git push -u origin main

Enumerating objects: 5, done.
Counting objects: 100% (5/5), done.
Delta compression using up to 2 threads
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 1.27 KiB | 1.27 MiB/s, done.
Total 3 (delta 0), reused 0 (delta 0), pack-reused 0
To https://github.com/hazutecuhtli/TLG_StreamlitDeployment.git
   277206c..fc7fadb  main -> main
Branch 'main' set up to track remote branch 'main' from 'origin'.


## End