## <span style='color:#ff5f27'> 📝 Imports </span>

In [2]:
import joblib

from functions.llm_chain import load_model, get_llm_chain, generate_response

## <span style="color:#ff5f27;"> 🔮 Connect to Hopsworks Feature Store </span>

In [3]:
import hopsworks

project = hopsworks.login()

fs = project.get_feature_store() 

Connected. Call `.close()` to terminate connection gracefully.

Logged in to project, explore it here https://snurran.hops.works/p/5242


## <span style="color:#ff5f27;"> ⚙️ Feature View Retrieval</span>

In [4]:
# Retrieve the 'air_quality_fv' feature view
feature_view = fs.get_feature_view(
    name='air_quality_fv',
    version=1,
)

# Initialize batch scoring, using training dataset version 1 for transformation statistics
feature_view.init_batch_scoring(1)

## <span style="color:#ff5f27;">🪝 Retrieve AirQuality Model from Model Registry</span>

In [5]:
# Retrieve the model registry
mr = project.get_model_registry()

# Retrieve the 'air_quality_xgboost_model' from the model registry
retrieved_model = mr.get_model(
    name="air_quality_xgboost_model",
    version=1,
)

# Download the saved model artifacts to a local directory
saved_model_dir = retrieved_model.download()

Connected. Call `.close()` to terminate connection gracefully.
Downloading model artifact (0 dirs, 6 files)... DONE

In [6]:
# Load the XGBoost regressor model and label encoder from the saved model directory
model_air_quality = joblib.load(saved_model_dir + "/xgboost_regressor.pkl")
encoder = joblib.load(saved_model_dir + "/label_encoder.pkl")

# Display the retrieved XGBoost regressor model
model_air_quality

## <span style='color:#ff5f27'>⬇️ LLM Loading</span>

In [7]:
# Load the LLM and its corresponding tokenizer.
model_llm, tokenizer = load_model()

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


2024-03-17 13:56:41,741 INFO: We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).



## <span style='color:#ff5f27'>⛓️ LangChain</span>

In [8]:
# Create and configure a language model chain.
llm_chain = get_llm_chain(
    model_llm, 
    tokenizer,
)



## <span style='color:#ff5f27'>🧬 Model Inference</span>


In [9]:
QUESTION7 = "Hi!"

response7 = generate_response(
    QUESTION7,
    feature_view,
    model_llm, 
    tokenizer,
    model_air_quality,
    encoder,
    llm_chain,
    verbose=True,
)

print(response7)

🗓️ Today's date: Sunday, 2024-03-17
📖 

Hello! How can I help you with air quality today?


In [10]:
QUESTION = "Who are you?"

response = generate_response(
    QUESTION,
    feature_view,
    model_llm,
    tokenizer,
    model_air_quality,
    encoder,
    llm_chain,
    verbose=True,
)

print(response)

🗓️ Today's date: Sunday, 2024-03-17
📖 

I am an AI Air Quality assistant, here to help you with any air quality-related questions or concerns you may have. I can provide information on current and historical air quality data for your location, offer advice on whether it's safe to go outside, and suggest ways to improve air quality. How can I help you today?


In [11]:
QUESTION1 = "What was the average air quality from 2024-01-10 till 2024-01-14 in New York?"

response1 = generate_response(
    QUESTION1, 
    feature_view, 
    model_llm, 
    tokenizer, 
    model_air_quality, 
    encoder,
    llm_chain,
    verbose=True,
)

print(response1)

Finished: Reading data from Hopsworks, using ArrowFlight (7.49s) 
🗓️ Today's date: Sunday, 2024-03-17
📖 Air Quality Measurements for New York:
Date: 2024-01-10; Air Quality: 7.2
Date: 2024-01-11; Air Quality: 5.9
Date: 2024-01-12; Air Quality: 10.8
Date: 2024-01-13; Air Quality: 5.9
Date: 2024-01-14; Air Quality: 5.1

The average air quality in New York from January 10th to January 14th was 6.8. This indicates that the air quality was generally safe for most people to breathe, but individuals with respiratory issues may still need to take precautions.
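
Because the LLM does the arithmetic in free text, it can be worth recomputing the aggregate directly from the retrieved context. The sketch below is not part of the notebook's `functions` package; it simply copies the five daily values printed above into a dictionary (the name `measurements` is an illustrative assumption) and averages them. On these values the direct computation gives 6.98, which differs from the 6.8 stated in the response.

```python
from statistics import mean

# Daily values copied from the context printed above
# (New York, 2024-01-10 to 2024-01-14).
measurements = {
    "2024-01-10": 7.2,
    "2024-01-11": 5.9,
    "2024-01-12": 10.8,
    "2024-01-13": 5.9,
    "2024-01-14": 5.1,
}

# Arithmetic mean of the five daily values, rounded to two decimals.
average = round(mean(measurements.values()), 2)
print(average)
```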


In [12]:
QUESTION11 = "When and what was the maximum air quality from 2024-01-10 till 2024-01-14 in New York?"

response11 = generate_response(
    QUESTION11, 
    feature_view, 
    model_llm,
    tokenizer,
    model_air_quality,
    encoder,
    llm_chain,
    verbose=True,
)

print(response11)

Finished: Reading data from Hopsworks, using ArrowFlight (7.40s) 
🗓️ Today's date: Sunday, 2024-03-17
📖 Air Quality Measurements for New York:
Date: 2024-01-10; Air Quality: 7.2
Date: 2024-01-11; Air Quality: 5.9
Date: 2024-01-12; Air Quality: 10.8
Date: 2024-01-13; Air Quality: 5.9
Date: 2024-01-14; Air Quality: 5.1

The maximum air quality in New York from January 10th to January 14th was on January 12th with an air quality of 10.8. This indicates that the air quality on that day was not safe for most people, especially those with respiratory issues, and it would be advisable to limit outdoor activities.


In [13]:
QUESTION12 = "When and what was the minimum air quality from 2024-01-10 till 2024-01-14 in New York?"

response12 = generate_response(
    QUESTION12, 
    feature_view, 
    model_llm, 
    tokenizer, 
    model_air_quality, 
    encoder,
    llm_chain,
    verbose=True,
)

print(response12)

Finished: Reading data from Hopsworks, using ArrowFlight (7.47s) 
🗓️ Today's date: Sunday, 2024-03-17
📖 Air Quality Measurements for New York:
Date: 2024-01-10; Air Quality: 7.2
Date: 2024-01-11; Air Quality: 5.9
Date: 2024-01-12; Air Quality: 10.8
Date: 2024-01-13; Air Quality: 5.9
Date: 2024-01-14; Air Quality: 5.1

The minimum air quality in New York from January 10th to January 14th was on January 11th with an air quality of 5.9. This indicates that the air quality on that day was generally safe for most people to breathe, but individuals with respiratory issues may still need to take precautions.
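
The minimum and maximum can be checked the same way. This is a minimal sketch over the values printed in the context above, with `measurements` again being an illustrative name rather than anything from the notebook's helpers. On these values the lowest reading is 5.1 on 2024-01-14, not 5.9 on January 11th as the response states, while the maximum (10.8 on 2024-01-12) matches the earlier answer.

```python
# Daily values copied from the context printed above
# (New York, 2024-01-10 to 2024-01-14).
measurements = {
    "2024-01-10": 7.2,
    "2024-01-11": 5.9,
    "2024-01-12": 10.8,
    "2024-01-13": 5.9,
    "2024-01-14": 5.1,
}

# Pick the dates whose values are smallest and largest.
min_date = min(measurements, key=measurements.get)
max_date = max(measurements, key=measurements.get)

print(f"Minimum: {measurements[min_date]} on {min_date}")
print(f"Maximum: {measurements[max_date]} on {max_date}")
```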


In [14]:
QUESTION2 = "What was the air quality yesterday in London?"

response2 = generate_response(
    QUESTION2, 
    feature_view, 
    model_llm, 
    tokenizer, 
    model_air_quality, 
    encoder,
    llm_chain,
    verbose=True,
)

print(response2)

Finished: Reading data from Hopsworks, using ArrowFlight (7.39s) 
🗓️ Today's date: Sunday, 2024-03-17
📖 Air Quality Measurements for London:
Date: 2024-03-16; Air Quality: 9.5

The air quality yesterday in London was 9.5, which indicates that the air quality was generally safe for most people to breathe, but individuals with respiratory issues may still need to take precautions.


In [15]:
QUESTION3 = "What will the air quality be like in London on 2024-03-23?"

response3 = generate_response(
    QUESTION3, 
    feature_view, 
    model_llm, 
    tokenizer,
    model_air_quality,
    encoder,
    llm_chain,
    verbose=True,
)

print(response3)

Finished: Reading data from Hopsworks, using ArrowFlight (7.47s) 
🗓️ Today's date: Sunday, 2024-03-17
📖 Air Quality Measurements for London:
Date: 2024-03-17; Air Quality: 7.6
Date: 2024-03-18; Air Quality: 9.88
Date: 2024-03-19; Air Quality: 9.18
Date: 2024-03-20; Air Quality: 9.34
Date: 2024-03-21; Air Quality: 9.37
Date: 2024-03-22; Air Quality: 9.37
Date: 2024-03-23; Air Quality: 9.37

The air quality in London on 2024-03-23 is expected to be 9.37, which indicates that the air quality is generally safe for most people to breathe, but individuals with respiratory issues may still need to take precautions.


In [16]:
QUESTION4 = "What will the air quality be like in Chicago the day after tomorrow?"

response4 = generate_response(
    QUESTION4, 
    feature_view, 
    model_llm, 
    tokenizer, 
    model_air_quality, 
    encoder,
    llm_chain,
    verbose=True,
)

print(response4)

Finished: Reading data from Hopsworks, using ArrowFlight (7.39s) 
🗓️ Today's date: Sunday, 2024-03-17
📖 Air Quality Measurements for Chicago:
Date: 2024-03-17; Air Quality: 6.1
Date: 2024-03-18; Air Quality: 8.37
Date: 2024-03-19; Air Quality: 7.39

The air quality in Chicago the day after tomorrow, on 2024-03-19, is expected to be 7.39, which indicates that the air quality is generally safe for most people to breathe, but individuals with respiratory issues may still need to take precautions.
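
Relative phrases such as "yesterday" or "the day after tomorrow" have to be resolved against today's date before any data can be fetched. How `generate_response` does this internally is not shown here, so the following is only a sketch of one way to do it; the `RELATIVE_OFFSETS` table and the `resolve_relative_day` helper are hypothetical names introduced for illustration.

```python
from datetime import date, timedelta

# Hypothetical lookup table: phrase -> offset in days from today.
RELATIVE_OFFSETS = {
    "yesterday": -1,
    "today": 0,
    "tomorrow": 1,
    "the day after tomorrow": 2,
}


def resolve_relative_day(phrase: str, today: date) -> date:
    """Return the calendar date a relative phrase refers to."""
    return today + timedelta(days=RELATIVE_OFFSETS[phrase.lower()])


# The date printed in the notebook output.
today = date(2024, 3, 17)

print(resolve_relative_day("yesterday", today))
print(resolve_relative_day("the day after tomorrow", today))
```

With the notebook's date of 2024-03-17, these resolve to 2024-03-16 and 2024-03-19, matching the dates in the London and Chicago answers above.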


In [17]:
QUESTION5 = "What will the air quality be like in London on Sunday?"

response5 = generate_response(
    QUESTION5, 
    feature_view, 
    model_llm, 
    tokenizer, 
    model_air_quality, 
    encoder,
    llm_chain,
    verbose=True,
)

print(response5)

Finished: Reading data from Hopsworks, using ArrowFlight (7.38s) 
🗓️ Today's date: Sunday, 2024-03-17
📖 Air Quality Measurements for London:
Date: 2024-03-17; Air Quality: 7.6

The air quality in London on Sunday, 2024-03-17, is expected to be 7.6, which indicates that the air quality is generally safe for most people to breathe, but individuals with respiratory issues may still need to take precautions.
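
A weekday name is ambiguous when today is that same weekday: "Sunday" could mean today or a week from now. In the run above the answer uses today's date, 2024-03-17. The sketch below shows one way to make that choice explicit; `resolve_weekday` and its `include_today` flag are hypothetical and are not taken from the notebook's helper functions.

```python
from datetime import date, timedelta

WEEKDAYS = [
    "monday", "tuesday", "wednesday", "thursday",
    "friday", "saturday", "sunday",
]


def resolve_weekday(name: str, today: date, include_today: bool = True) -> date:
    """Return the next date falling on the named weekday.

    If today is already that weekday, return today when include_today
    is True, otherwise the same weekday one week later.
    """
    target = WEEKDAYS.index(name.lower())
    # date.weekday() counts Monday as 0 and Sunday as 6.
    days_ahead = (target - today.weekday()) % 7
    if days_ahead == 0 and not include_today:
        days_ahead = 7
    return today + timedelta(days=days_ahead)


today = date(2024, 3, 17)  # a Sunday, as printed in the notebook output

print(resolve_weekday("Sunday", today))
print(resolve_weekday("Sunday", today, include_today=False))
```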


In [18]:
QUESTION7 = "What will the air quality be like on March 21 in London?"

response7 = generate_response(
    QUESTION7, 
    feature_view,
    model_llm,
    tokenizer, 
    model_air_quality, 
    encoder,
    llm_chain,
    verbose=True,
)

print(response7)

Finished: Reading data from Hopsworks, using ArrowFlight (7.58s) 
🗓️ Today's date: Sunday, 2024-03-17
📖 Air Quality Measurements for London:
Date: 2024-03-17; Air Quality: 7.6
Date: 2024-03-18; Air Quality: 9.88
Date: 2024-03-19; Air Quality: 9.18
Date: 2024-03-20; Air Quality: 9.34
Date: 2024-03-21; Air Quality: 9.37

The air quality in London on March 21 is expected to be 9.37, which indicates that the air quality is generally safe for most people to breathe, but individuals with respiratory issues may still need to take precautions.
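
For a future date, the context lists one value per day from today through the requested date (here 2024-03-17 to 2024-03-21). The sketch below shows how a date given without a year might be parsed and how that span of days can be enumerated. Both helpers are hypothetical, and the choice to default to the current year is an assumption, not something the notebook confirms. The `%B` directive also assumes an English locale.

```python
from datetime import date, datetime, timedelta


def parse_month_day(text: str, today: date) -> date:
    """Parse text such as 'March 21', assuming the current year."""
    return datetime.strptime(f"{text} {today.year}", "%B %d %Y").date()


def forecast_dates(today: date, target: date) -> list[date]:
    """List every date from today through target, inclusive."""
    n_days = (target - today).days
    return [today + timedelta(days=i) for i in range(n_days + 1)]


today = date(2024, 3, 17)  # the date printed in the notebook output
target = parse_month_day("March 21", today)

for day in forecast_dates(today, target):
    print(day.isoformat())
```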


In [19]:
QUESTION = "Is this air quality level dangerous?"

response = generate_response(
    QUESTION, 
    feature_view, 
    model_llm, 
    tokenizer,
    model_air_quality, 
    encoder,
    llm_chain,
    verbose=True,
)

print(response)

🗓️ Today's date: Sunday, 2024-03-17
📖 





The air quality level is not dangerous, but individuals with respiratory issues may still need to take precautions.


In [20]:
QUESTION = "Can you please explain different air quality levels?"

response = generate_response(
    QUESTION, 
    feature_view, 
    model_llm, 
    tokenizer,
    model_air_quality, 
    encoder,
    llm_chain,
    verbose=True,
)

print(response)

🗓️ Today's date: Sunday, 2024-03-17
📖 





Of course! Air quality levels are typically measured on a scale, and the specific scale can vary depending on the location and the organization providing the measurements. Generally, air quality levels are categorized into different ranges, with each range corresponding to a specific level of air quality. Here is a general overview of air quality levels:

1. Good (0-50): The air quality is considered good, and it is safe for most people to breathe.
2. Moderate (51-100): The air quality is acceptable, but it may cause a slight irritation to some people with respiratory issues.
3. Poor (101-150): The air quality is considered unhealthy for sensitive groups, such as children, the elderly, and those with respiratory issues. It is advisable for these groups to limit their outdoor activities.
4. Very Poor (151-200): The air quality is considered unhealthy, and it may cause respiratory issues for most people. Outdoor activities should be limited.
5. Hazardous (over 200): The air quality is c
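
The ranges listed in this reply can be expressed as a small lookup. This is a sketch only: the bounds and labels are transcribed from the assistant's answer above and are not an official scale, and they are not necessarily the scale on which the values elsewhere in this notebook are reported. `LEVELS` and `describe_level` are hypothetical names.

```python
# Upper bounds and labels transcribed from the assistant's reply above.
# Assumption: these are the model's stated ranges, not an official standard.
LEVELS = [
    (50, "Good"),
    (100, "Moderate"),
    (150, "Poor"),
    (200, "Very Poor"),
]


def describe_level(value: float) -> str:
    """Map an air quality value to the label of the range it falls in."""
    if value < 0:
        raise ValueError("air quality value must be non-negative")
    for upper_bound, label in LEVELS:
        if value <= upper_bound:
            return label
    # Anything above the last bound (over 200) falls in the final category.
    return "Hazardous"


print(describe_level(42))
print(describe_level(175))
```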

---