## <span style='color:#ff5f27'> 📝 Imports

In [1]:
from xgboost import XGBRegressor
import hopsworks
from functions.llm_chain import load_model, get_llm_chain, generate_response
import pandas as pd
import warnings
warnings.filterwarnings("ignore")

## <span style="color:#ff5f27;"> 🔮 Connect to Hopsworks Feature Store </span>

In [2]:
project = hopsworks.login()
fs = project.get_feature_store() 

Connected. Call `.close()` to terminate connection gracefully.

Logged in to project, explore it here https://snurran.hops.works/p/5240
Connected. Call `.close()` to terminate connection gracefully.


In [3]:
# Get_or_create the 'air_quality_fv' feature view
feature_view = fs.get_feature_view(
    name='air_quality_fv',
    version=1
)

# Initialize batch scoring
feature_view.init_batch_scoring(1)


## <span style="color:#ff5f27;">🪝 Retrieve AirQuality Model from Model Registry</span>

In [4]:
# Retrieve the model registry
mr = project.get_model_registry()

# Retrieve the 'air_quality_xgboost_model' from the model registry
retrieved_model = mr.get_model(
    name="air_quality_xgboost_model",
    version=1,
)

# Download the saved model artifacts  to a local directory
saved_model_dir = retrieved_model.download()

Connected. Call `.close()` to terminate connection gracefully.
Downloading model artifact (1 dirs, 6 files)... DONE

In [5]:
# Loading the XGBoost regressor model and label encoder from the saved model directory
# model_air_quality = joblib.load(saved_model_dir + "/xgboost_regressor.pkl")
model_air_quality = XGBRegressor()

model_air_quality.load_model(saved_model_dir + "/model.json")

# Displaying the retrieved XGBoost regressor model
model_air_quality

In [6]:
from functions.air_quality_data_retrieval import *
date_start = "2024-02-02"
date_end = "2024-02-04"
res = get_historical_data_in_date_range(date_start, date_end, feature_view, model_air_quality)
print(res)

Finished: Reading data from Hopsworks, using ArrowFlight (0.85s) 
         date  pm25
0  2024-02-02  22.0
1  2024-02-03  12.0
2  2024-02-04  17.0
3  2024-02-05  20.0


## <span style='color:#ff5f27'>⬇️ LLM Loading

In [7]:
import time
start_time = time.time()

# Load the LLM and its corresponding tokenizer.
model_llm, tokenizer = load_model()

duration = time.time() - start_time
print(f"The code execution took {duration} seconds.")

You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


Loading model from disk


`low_cpu_mem_usage` was None, now set to True since model is quantized.


The code execution took 11.15595006942749 seconds.


## <span style='color:#ff5f27'>⛓️ LangChain

In [8]:
import time
start_time = time.time()


# Create and configure a language model chain.
llm_chain = get_llm_chain(
    model_llm,
    tokenizer,
)

duration = time.time() - start_time
print(f"The code execution took {duration} seconds.")

The code execution took 0.39171886444091797 seconds.


## <span style='color:#ff5f27'>🧬 Model Inference


In [9]:
QUESTION7 = "Hi!"

response7 = generate_response(
    QUESTION7,
    feature_view,
    model_llm, 
    tokenizer,
    model_air_quality,
    llm_chain,
    verbose=True,
)

print(response7)

🗓️ Today's date: Saturday, 2024-03-23
📖 

Hello! How can I assist you with air quality information today?


In [10]:
QUESTION = "Who are you?"

response = generate_response(
    QUESTION,
    feature_view,
    model_llm,
    tokenizer,
    model_air_quality,
    llm_chain,
    verbose=True,
)

print(response)

🗓️ Today's date: Saturday, 2024-03-23
📖 

I am an AI assistant designed to provide air quality information based on the data provided by the user. I can help you with air quality information for a specific date or location.


In [11]:
QUESTION1 = "What was the average air quality from 2024-01-10 till 2024-01-14?"

response1 = generate_response(
    QUESTION1, 
    feature_view, 
    model_llm, 
    tokenizer, 
    model_air_quality, 
    llm_chain,
    verbose=True,
)

print(response1)

Finished: Reading data from Hopsworks, using ArrowFlight (0.81s) 
🗓️ Today's date: Saturday, 2024-03-23
📖 Air Quality Measurements:
Date: 2024-01-10; Air Quality: 9.0
Date: 2024-01-11; Air Quality: 8.0
Date: 2024-01-12; Air Quality: 9.0
Date: 2024-01-13; Air Quality: 14.0
Date: 2024-01-14; Air Quality: 13.0
Date: 2024-01-15; Air Quality: 8.0

The average air quality from 2024-01-10 to 2024-01-14 was 10.2. This indicates that the air quality during that period was generally moderate, and it may have been safe for most people to go outside. However, it's always a good idea to check local air quality advisories and take precautions if you have respiratory sensitivities.


In [12]:
QUESTION11 = "When and what was the air quality like last week?"

response11 = generate_response(
    QUESTION11, 
    feature_view, 
    model_llm,
    tokenizer,
    model_air_quality,
    llm_chain,
    verbose=True,
)

print(response11)

Finished: Reading data from Hopsworks, using ArrowFlight (0.88s) 
🗓️ Today's date: Saturday, 2024-03-23
📖 Air Quality Measurements:
Date: 2024-03-19; Air Quality: 17.0
Date: 2024-03-20; Air Quality: 17.0
Date: 2024-03-21; Air Quality: 41.0
Date: 2024-03-22; Air Quality: 14.0
Date: 2024-03-23; Air Quality: 20.0

Last week, the air quality was as follows:

- On 2024-03-19, the air quality was 17.0, which indicates that the air quality was generally moderate and it may have been safe for most people to go outside.
- On 2024-03-20, the air quality was 17.0, which is the same as the day before. This also indicates moderate air quality and it may have been safe for most people to go outside.
- On 2024-03-21, the air quality was 41.0, which indicates that the air quality was very poor and it may have been best to limit outdoor activities, especially if you have respiratory sensitivities.
- On 2024-03-22, the air quality was 14.0, which indicates that the air quality was unhealthy for sensitiv

In [13]:
QUESTION12 = "When and what was the minimum air quality from 2024-01-10 till 2024-01-14?"

response12 = generate_response(
    QUESTION12, 
    feature_view, 
    model_llm, 
    tokenizer, 
    model_air_quality, 
    llm_chain,
    verbose=True,
)

print(response12)

Finished: Reading data from Hopsworks, using ArrowFlight (0.89s) 
🗓️ Today's date: Saturday, 2024-03-23
📖 Air Quality Measurements:
Date: 2024-01-10; Air Quality: 9.0
Date: 2024-01-11; Air Quality: 8.0
Date: 2024-01-12; Air Quality: 9.0
Date: 2024-01-13; Air Quality: 14.0
Date: 2024-01-14; Air Quality: 13.0
Date: 2024-01-15; Air Quality: 8.0

The minimum air quality from 2024-01-10 to 2024-01-14 was on 2024-01-15, with an air quality of 8.0. This indicates that the air quality on that day was generally good, and it may have been safe for most people to go outside. However, it's always a good idea to check local air quality advisories and take precautions if you have respiratory sensitivities.


In [14]:
QUESTION2a = "What was the air quality like last week?"

response2 = generate_response(
    QUESTION2a,
    feature_view, 
    model_llm,
    tokenizer,
    model_air_quality,
    llm_chain,
    verbose=True,
)

print(response2)

Finished: Reading data from Hopsworks, using ArrowFlight (0.89s) 
🗓️ Today's date: Saturday, 2024-03-23
📖 Air Quality Measurements:
Date: 2024-03-19; Air Quality: 17.0
Date: 2024-03-20; Air Quality: 17.0
Date: 2024-03-21; Air Quality: 41.0
Date: 2024-03-22; Air Quality: 14.0
Date: 2024-03-23; Air Quality: 20.0

Last week, the air quality was as follows:

- On 2024-03-19, the air quality was 17.0, which indicates that the air quality was generally moderate and it may have been safe for most people to go outside.
- On 2024-03-20, the air quality was 17.0, which is the same as the day before. This also indicates moderate air quality and it may have been safe for most people to go outside.
- On 2024-03-21, the air quality was 41.0, which indicates that the air quality was very poor and it may have been best to limit outdoor activities, especially if you have respiratory sensitivities.
- On 2024-03-22, the air quality was 14.0, which indicates that the air quality was unhealthy for sensitiv

In [15]:
QUESTION2 = "What was the air quality like yesterday?"

response2 = generate_response(
    QUESTION2,
    feature_view, 
    model_llm,
    tokenizer,
    model_air_quality,
    llm_chain,
    verbose=True,
)

print(response2)

Finished: Reading data from Hopsworks, using ArrowFlight (0.89s) 
🗓️ Today's date: Saturday, 2024-03-23
📖 Air Quality Measurements:
Date: 2024-03-22; Air Quality: 14.0

Yesterday, the air quality was 20.0, which indicates that the air quality was unhealthy for sensitive groups, similar to the day before. It may have been safe for most people to go outside, but it's best to check local air quality advisories for updates.


In [16]:
QUESTION3 = "What will the air quality be like on 2024-03-20?"

response3 = generate_response(
    QUESTION3, 
    feature_view, 
    model_llm, 
    tokenizer,
    model_air_quality,
    llm_chain,
    verbose=True,
)

print(response3)

Connection closed.
Connected. Call `.close()` to terminate connection gracefully.

Logged in to project, explore it here https://snurran.hops.works/p/5240
Connected. Call `.close()` to terminate connection gracefully.
Finished: Reading data from Hopsworks, using ArrowFlight (0.39s) 
🗓️ Today's date: Saturday, 2024-03-23
📖 Air Quality Measurements:


On 2024-03-20, the air quality was 17.0, which indicates that the air quality was generally moderate and it may have been safe for most people to go outside.


In [17]:
QUESTION4 = "What will the air quality be like the day after tomorrow?"

response4 = generate_response(
    QUESTION4, 
    feature_view, 
    model_llm, 
    tokenizer, 
    model_air_quality, 
    llm_chain,
    verbose=True,
)

print(response4)

Connection closed.
Connected. Call `.close()` to terminate connection gracefully.

Logged in to project, explore it here https://snurran.hops.works/p/5240
Connected. Call `.close()` to terminate connection gracefully.
Finished: Reading data from Hopsworks, using ArrowFlight (0.38s) 
🗓️ Today's date: Saturday, 2024-03-23
📖 Air Quality Measurements:


I'm sorry, but I don't have any information about the air quality for the day after tomorrow. Could you please provide me with the date so I can check the air quality for that specific day?


In [18]:
QUESTION5 = "What will the air quality be like this Sunday?"

response5 = generate_response(
    QUESTION5, 
    feature_view, 
    model_llm, 
    tokenizer, 
    model_air_quality, 
    llm_chain,
    verbose=True,
)

print(response5)

Connection closed.
Connected. Call `.close()` to terminate connection gracefully.

Logged in to project, explore it here https://snurran.hops.works/p/5240
Connected. Call `.close()` to terminate connection gracefully.
Finished: Reading data from Hopsworks, using ArrowFlight (0.41s) 
🗓️ Today's date: Saturday, 2024-03-23
📖 Air Quality Measurements:


On Sunday, 2024-03-24, the air quality was 20.0, which indicates that the air quality was unhealthy for sensitive groups. It may have been safe for most people to go outside, but it's best to check local air quality advisories for updates.


In [19]:
QUESTION7 = "What will the air quality be like for the rest of the week?"

response7 = generate_response(
    QUESTION7, 
    feature_view,
    model_llm,
    tokenizer, 
    model_air_quality, 
    llm_chain,
    verbose=True,
)

print(response7)

Connection closed.
Connected. Call `.close()` to terminate connection gracefully.

Logged in to project, explore it here https://snurran.hops.works/p/5240
Connected. Call `.close()` to terminate connection gracefully.
Finished: Reading data from Hopsworks, using ArrowFlight (0.39s) 
🗓️ Today's date: Saturday, 2024-03-23
📖 Air Quality Measurements:
Date: 2024-03-23 00:00:00; Air Quality: 29.37
Date: 2024-03-24 00:00:00; Air Quality: 34.19
Date: 2024-03-25 00:00:00; Air Quality: 46.52
Date: 2024-03-26 00:00:00; Air Quality: 43.97
Date: 2024-03-27 00:00:00; Air Quality: 21.64
Date: 2024-03-28 00:00:00; Air Quality: 24.7
Date: 2024-03-29 00:00:00; Air Quality: 13.1

For the rest of the week, the air quality is expected to be generally unhealthy for sensitive groups, with some days being unhealthy for the general population. It's best to check local air quality advisories for updates and to plan outdoor activities accordingly.


In [20]:
QUESTION = "Will the air quality be safe or not for the next week?"

response = generate_response(
    QUESTION7, 
    feature_view, 
    model_llm, 
    tokenizer,
    model_air_quality,
    llm_chain,
    verbose=True,
)

print(response)

Connection closed.
Connected. Call `.close()` to terminate connection gracefully.

Logged in to project, explore it here https://snurran.hops.works/p/5240
Connected. Call `.close()` to terminate connection gracefully.
Finished: Reading data from Hopsworks, using ArrowFlight (0.38s) 
🗓️ Today's date: Saturday, 2024-03-23
📖 Air Quality Measurements:
Date: 2024-03-23 00:00:00; Air Quality: 29.37
Date: 2024-03-24 00:00:00; Air Quality: 34.19
Date: 2024-03-25 00:00:00; Air Quality: 46.52
Date: 2024-03-26 00:00:00; Air Quality: 43.97
Date: 2024-03-27 00:00:00; Air Quality: 21.64
Date: 2024-03-28 00:00:00; Air Quality: 24.7
Date: 2024-03-29 00:00:00; Air Quality: 13.1

For the rest of the week, the air quality is expected to be generally unhealthy for sensitive groups, with some days being unhealthy for the general population. It's best to check local air quality advisories for updates and to plan outdoor activities accordingly.


In [21]:
QUESTION = "Is tomorrow's air quality level dangerous?"

response = generate_response(
    QUESTION, 
    feature_view, 
    model_llm, 
    tokenizer,
    model_air_quality, 
    llm_chain,
    verbose=True,
)

print(response)

Connection closed.
Connected. Call `.close()` to terminate connection gracefully.

Logged in to project, explore it here https://snurran.hops.works/p/5240
Connected. Call `.close()` to terminate connection gracefully.
Finished: Reading data from Hopsworks, using ArrowFlight (0.42s) 
🗓️ Today's date: Saturday, 2024-03-23
📖 Air Quality Measurements:


On Monday, 2024-03-25, the air quality was 30.0, which indicates that the air quality was unhealthy for sensitive groups. It may have been safe for most people to go outside, but it's best to check local air quality advisories for updates.


In [22]:
QUESTION = "Can you please explain different air quality levels?"

response = generate_response(
    QUESTION, 
    feature_view, 
    model_llm, 
    tokenizer,
    model_air_quality, 
    llm_chain,
    verbose=True,
)

print(response)

🗓️ Today's date: Saturday, 2024-03-23
📖 

Of course! Here are the different air quality levels and their descriptions:

1. Good (0-50): The air quality is considered good, and it is generally safe for everyone to be outside.
2. Moderate (51-100): The air quality is acceptable, but sensitive groups like children, the elderly, and those with respiratory issues may experience some discomfort.
3. Unhealthy for Sensitive Groups (101-150): The air quality is unhealthy for sensitive groups, and everyone else should limit prolonged outdoor exertion.
4. Unhealthy (151-200): The air quality is unhealthy for the general population, and everyone should limit prolonged outdoor exertion.
5. Very Unhealthy (201-300): The air quality is very unhealthy, and everyone should avoid prolonged outdoor exertion.
6. Hazardous (>300): The air quality is hazardous and poses a serious health risk to everyone, and all outdoor activities should be avoided.

These levels are based on the Air Quality Index (AQI), wh

In [23]:
import gradio as gr
from transformers import pipeline
import numpy as np
import hopsworks
from xgboost import XGBRegressor
from functions.llm_chain import load_model, get_llm_chain, generate_response


2024-03-23 18:51:12,356 INFO: HTTP Request: GET https://api.gradio.app/gradio-messaging/en "HTTP/1.1 200 OK"


In [24]:
# Initialize the ASR pipeline
transcriber = pipeline("automatic-speech-recognition", model="openai/whisper-base.en")

def transcribe(audio):
    sr, y = audio
    y = y.astype(np.float32)
    if y.ndim > 1 and y.shape[1] > 1:
        y = np.mean(y, axis=1)
    y /= np.max(np.abs(y))
    return transcriber({"sampling_rate": sr, "raw": y})["text"]

def generate_query_response(user_query):
    response = generate_response(
        user_query,
        feature_view,
        model_llm,
        tokenizer,
        model_air_quality,
        llm_chain,
        verbose=False,
    )
    return response

def handle_input(text_input=None, audio_input=None):
    if audio_input is not None:
        user_query = transcribe(audio_input)
    else:
        user_query = text_input
    
    if user_query:
        return generate_query_response(user_query)
    else:
        return "Please provide input either via text or voice."

iface = gr.Interface(
    fn=handle_input,
    inputs=[gr.Textbox(placeholder="Type here or use voice input..."), gr.Audio()],
    outputs="text",
    title="🌤️ AirQuality AI Assistant 💬",
    description="Ask your questions about air quality or use your voice to interact."
)

iface.launch(share=True)


Running on local URL:  http://127.0.0.1:7860
2024-03-23 18:51:14,155 INFO: HTTP Request: GET http://127.0.0.1:7860/startup-events "HTTP/1.1 200 OK"
2024-03-23 18:51:14,231 INFO: HTTP Request: GET https://checkip.amazonaws.com/ "HTTP/1.1 200 "
2024-03-23 18:51:14,638 INFO: HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-03-23 18:51:15,001 INFO: HTTP Request: POST https://api.gradio.app/gradio-initiated-analytics/ "HTTP/1.1 200 OK"
2024-03-23 18:51:16,211 INFO: HTTP Request: HEAD http://127.0.0.1:7860/ "HTTP/1.1 200 OK"
2024-03-23 18:51:17,289 INFO: HTTP Request: GET https://api.gradio.app/v2/tunnel-request "HTTP/1.1 200 OK"
Running on public URL: https://c674f22c8da86c7674.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
2024-03-23 18:51:18,652 INFO: HTTP Request: HEAD https://c674f22c8da86c7674.gradio.live "HTTP/1.1 200 OK"




2024-03-23 18:51:19,374 INFO: HTTP Request: POST https://api.gradio.app/gradio-launched-telemetry/ "HTTP/1.1 200 OK"
Connection closed.
Connected. Call `.close()` to terminate connection gracefully.

Logged in to project, explore it here https://snurran.hops.works/p/5240
Connected. Call `.close()` to terminate connection gracefully.
Finished: Reading data from Hopsworks, using ArrowFlight (0.43s) 
Connection closed.
Connected. Call `.close()` to terminate connection gracefully.

Logged in to project, explore it here https://snurran.hops.works/p/5240
Connected. Call `.close()` to terminate connection gracefully.
Finished: Reading data from Hopsworks, using ArrowFlight (0.42s) 
Connection closed.
Connected. Call `.close()` to terminate connection gracefully.

Logged in to project, explore it here https://snurran.hops.works/p/5240
Connected. Call `.close()` to terminate connection gracefully.
Finished: Reading data from Hopsworks, using ArrowFlight (0.42s) 
Connection closed.
Connected. C

---