# Vehicle Sales Price Predictions Workshop - Part 3 of 3


## Inference Pipeline

To build a machine learning system from this dataset, we have structured the service into three pipelines:

1. feature engineering pipeline notebook (see Part 1)
2. training pipeline notebook (see Part 2)
3. inferencing pipeline notebook (this Part 3)

This notebook covers the third step, i.e. the inference pipeline.

In [1]:
# We need to install a library to deploy the model. This install throws an error in Colab, but it still works.
!pip install --quiet "hsfs[python] @ git+https://github.com/logicalclocks/feature-store-api@master#subdirectory=python"


In [1]:
# We will use the Hopsworks Model Registry to retrieve the model

import hopsworks
import joblib
import torch
import torch.nn as nn

proj = hopsworks.login()
fs = proj.get_feature_store()
mr = proj.get_model_registry()

feature_view = fs.get_feature_view("car_prices", version=1)

model = mr.get_model(
    "car_prices",
    version=1,
)

# Download the model directory from the Model Registry
model_dir = model.download()

# Load the fitted label encoders with joblib from the downloaded model directory
label_encoders = joblib.load(model_dir + "/label_encoders.pkl")

# Definition of the model
class DeepRegressor(nn.Module):
    def __init__(self, input_size):
        super(DeepRegressor, self).__init__()
        self.fc1 = nn.Linear(input_size, 64)
        self.fc2 = nn.Linear(64, 64)
        self.fc3 = nn.Linear(64, 1)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.fc3(x)
        return x

# Load the saved model file onto the CPU
model_data = torch.load(model_dir + '/regression_model.pth', map_location=torch.device('cpu'))

Connected. Call `.close()` to terminate connection gracefully.

Multiple projects found. 

	 (1) Car_Prices
	 (2) GraphEmbeddingsDemo
	 (3) rixdemo
	 (4) BeerVolumePrediction

Logged in to project, explore it here https://c.app.hopsworks.ai:443/p/818324
2024-06-24 15:44:57,626 INFO: Initializing external client
2024-06-24 15:44:57,627 INFO: Base URL: https://c.app.hopsworks.ai:443

Connected. Call `.close()` to terminate connection gracefully.
Connected. Call `.close()` to terminate connection gracefully.
Downloading model artifact (1 dirs, 4 files)... DONE
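
The cell above defines `DeepRegressor` but never instantiates it, which only works when the file holds a whole pickled module. If the file instead holds a state dict, the weights have to be loaded into a fresh instance. The following is a minimal sketch of that step, not the workshop's own code: `build_model` is a hypothetical helper, the input size of 10 is assumed from the ten feature columns, and an in-memory buffer with randomly initialised weights stands in for `regression_model.pth`.

```python
import io

import torch
import torch.nn as nn


class DeepRegressor(nn.Module):
    """Same three-layer architecture as defined in the notebook."""

    def __init__(self, input_size):
        super().__init__()
        self.fc1 = nn.Linear(input_size, 64)
        self.fc2 = nn.Linear(64, 64)
        self.fc3 = nn.Linear(64, 1)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return self.fc3(x)


def build_model(loaded, input_size):
    """Hypothetical helper: return a usable module from either save format."""
    if isinstance(loaded, nn.Module):
        return loaded  # the whole model was pickled
    if isinstance(loaded, dict):
        # Accept a checkpoint dict or a bare state dict.
        state_dict = loaded.get("model_state_dict", loaded)
        model = DeepRegressor(input_size)
        model.load_state_dict(state_dict)
        return model
    raise ValueError("Unexpected object in model file")


# Round-trip a freshly initialised model through an in-memory buffer
# (a stand-in for regression_model.pth; these weights are random, not trained).
buffer = io.BytesIO()
torch.save({"model_state_dict": DeepRegressor(10).state_dict()}, buffer)
buffer.seek(0)

restored = build_model(torch.load(buffer, map_location="cpu"), input_size=10)
restored.eval()
with torch.no_grad():
    sample_output = restored(torch.zeros(1, 10))
print(sample_output.shape)
```

Note that recent PyTorch releases load with `weights_only=True` by default, so a file containing a whole pickled module may need `weights_only=False` and should only be loaded from a trusted source.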

In [2]:
# Check the type of the loaded object
if isinstance(model_data, dict) and 'model_state_dict' in model_data:
    state_dict = model_data['model_state_dict']
elif isinstance(model_data, torch.nn.Module):
    state_dict = model_data.state_dict()
else:
    raise ValueError("The file does not contain a valid PyTorch model or the format is unexpected.")

# Show weights of each layer
print("Show the weights of the different layers of the model:")
for layer_name, weights in state_dict.items():
    print(f"\nLayer: {layer_name}")
    print(f"Weights: {weights.shape}")
    print(weights)

# If the object contains another structure, inspect it
if isinstance(model_data, dict):
    print("\nComplete structure of the saved model:")
    for key, value in model_data.items():
        if key != 'model_state_dict':
            print(f"\nStructure of the key '{key}':")
            if isinstance(value, torch.Tensor):
                print(f"  - Tensor with shape : {value.shape}")
            elif isinstance(value, dict):
                print("  - Dictionary with keys :")
                for subkey in value.keys():
                    print(f"    - {subkey}")
            else:
                print(f"  - Type : {type(value)}")


# If the object is directly the model, display its architecture
elif isinstance(model_data, torch.nn.Module):
    print("\nThe saved model is a direct PyTorch model.")
    print("Architecture of the model :")
    print(model_data)

Show the weights of the different layers of the model:

Layer: fc1.weight
Weights: torch.Size([64, 10])
tensor([[ 3.7739e-03, -4.1663e-01, -3.2566e-01, -1.7974e-01, -1.1622e-01,
         -5.3949e-02, -1.5235e-01, -6.7243e-02, -3.7296e-01, -3.1163e-01],
        [ 6.9914e-03, -4.0423e-01, -5.3683e-01, -6.0665e-01, -4.9663e-01,
         -4.5980e-01,  3.5866e-01, -2.2980e-01, -9.2960e-03, -3.4243e-02],
        [ 1.8259e+00,  1.0723e-01, -9.7697e-02,  4.8672e-03,  8.2941e-01,
         -8.4416e-01,  1.6125e+00, -3.4593e-02,  7.9182e-01,  5.6379e-01],
        [ 4.3295e-03, -1.1289e-01, -5.1054e-01, -7.7657e-01, -1.3810e-01,
         -4.5050e-01,  1.0690e-01, -4.7684e-02,  8.4692e-02, -3.5154e-01],
        [-2.7155e-01,  1.0009e-01, -1.5411e-01,  3.0566e-01, -5.4428e-02,
          1.6371e-02,  9.8408e-02, -1.5999e-01,  1.5704e-01,  2.8539e-01],
        [ 1.9840e-02, -2.1818e-01, -9.5115e-01, -8.8015e-01, -2.3550e-01,
         -4.8558e-01,  9.6716e-02, -8.6441e-02,  5.8155e-02, -4.2638e-01],
  

In [3]:
test_data = feature_view.get_batch_data(start_time="2015-07-01 00:00")
test_data

Finished: Reading data from Hopsworks, using Hopsworks Feature Query Service (3.86s) 


| | year | make | model | trim | body | transmission | condition | odometer | color | interior |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2013 | Toyota | Tacoma | V6 | double cab | manual | 27.0 | 43988.0 | green | gray |
| 1 | 2011 | GMC | Yukon XL | Denali | suv | automatic | 38.0 | 29767.0 | black | black |
| 2 | 2007 | Chevrolet | Silverado 1500 Classic | LS2 | crew cab | automatic | 24.0 | 76663.0 | gold | gray |
| 3 | 2012 | Chevrolet | Sonic | LT | Hatchback | manual | 36.0 | 64312.0 | brown | gray |
| 4 | 2013 | Dodge | Grand Caravan | SXT | Minivan | automatic | 36.0 | 70859.0 | gold | black |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 831 | 2014 | Ford | Fusion Hybrid | Titanium | Sedan | automatic | 45.0 | 33761.0 | silver | black |
| 832 | 2015 | Volvo | V60 | T5 Drive-E | Wagon | automatic | 44.0 | 20846.0 | silver | black |
| 833 | 2006 | Infiniti | G35 | Base | Coupe | manual | 22.0 | 1.0 | gray | gray |
| 834 | 2013 | Nissan | Maxima | 3.5 SV | Sedan | automatic | 33.0 | 50363.0 | silver | black |


In [5]:
from sklearn.preprocessing import LabelEncoder

# Define the function to encode categorical data
def encode_categorical_data(dataset, label_encoders):    
    # Iterate over the columns of the DataFrame
    for column in dataset.columns:
        # Check if the column is of type 'object' (categorical)
        if dataset[column].dtype == 'object':
            # Retrieve already fitted LabelEncoder
            label_encoder = label_encoders[column.lower()]
            
            # Encode every value in the column with the fitted encoder
            dataset[column] = label_encoder.transform(dataset[column])
    return dataset
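
scikit-learn's `LabelEncoder.transform` raises a `ValueError` when it meets a label that was not present during fitting, so a category that appears only in the inference data stops the function above. A minimal sketch of a pre-check follows; `find_unseen_categories` is a hypothetical helper, and plain lists stand in for a DataFrame column and for an encoder's `classes_` attribute.

```python
def find_unseen_categories(values, known_classes):
    """Return the sorted distinct values that are absent from known_classes."""
    known = set(known_classes)
    return sorted({v for v in values if v not in known})


# Plain lists for illustration; with a fitted encoder, pass
# label_encoder.classes_ as known_classes and dataset[column] as values.
known_makes = ["BMW", "Ford", "Toyota"]
incoming = ["Toyota", "Tesla", "Ford", "Tesla"]

unseen = find_unseen_categories(incoming, known_makes)
print(unseen)
```

Running such a check before encoding makes it possible to report or filter the offending rows instead of failing in the middle of the transform.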

In [6]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
import numpy as np
import pandas as pd

# Define the function to predict selling prices
def predict_selling_price(data, label_encoders, model):
        
    # Preprocess the data
    processed_data = encode_categorical_data(data, label_encoders)
    
    # Convert data to PyTorch tensors
    X_test = torch.tensor(processed_data.values.astype(np.float32))
    
    # Pass data to model for prediction
    with torch.no_grad():
        model.eval()
        predictions = model(X_test).numpy()
        
    predictions_df = pd.DataFrame(predictions, columns=['predicted_sale_price'])
    data['predicted_sale_price'] = predictions_df['predicted_sale_price']
    return data

df_encoded = predict_selling_price(test_data, label_encoders, model_data)
df_encoded

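| | year | make | model | trim | body | transmission | condition | odometer | color | interior | predicted_sale_price |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2013 | 45 | 577 | 1057 | 50 | 1 | 27.0 | 43988.0 | 8 | 6 | 10083.130859 |
| 1 | 2011 | 13 | 656 | 429 | 72 | 0 | 38.0 | 29767.0 | 1 | 1 | 10160.800781 |
| 2 | 2007 | 7 | 546 | 654 | 45 | 0 | 24.0 | 76663.0 | 6 | 6 | 9733.910156 |
| 3 | 2012 | 7 | 558 | 657 | 21 | 1 | 36.0 | 64312.0 | 3 | 6 | 9839.119141 |
| 4 | 2013 | 9 | 286 | 958 | 25 | 0 | 36.0 | 70859.0 | 6 | 1 | 9804.382812 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 831 | 2014 | 12 | 249 | 1018 | 32 | 0 | 45.0 | 33761.0 | 15 | 1 | 10155.495117 |
| 832 | 2015 | 47 | 610 | 996 | 38 | 0 | 44.0 | 20846.0 | 15 | 1 | 10281.016602 |
| 833 | 2006 | 18 | 256 | 321 | 8 | 1 | 22.0 | 1.0 | 7 | 6 | 9679.805664 |
| 834 | 2013 | 33 | 382 | 184 | 32 | 0 | 33.0 | 50363.0 | 15 | 1 | 9894.855469 |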
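
Because `encode_categorical_data` overwrites the categorical columns in place, the result table shows integer codes rather than the original labels. One way to make the output readable again is `LabelEncoder.inverse_transform`. The sketch below assumes a toy encoder fitted on three makes purely for illustration; in the notebook the fitted encoders would come from the `label_encoders` dictionary.

```python
from sklearn.preprocessing import LabelEncoder

# Toy encoder fitted here for illustration only; the real encoders are the
# ones loaded from label_encoders.pkl.
make_encoder = LabelEncoder().fit(["BMW", "Ford", "Toyota"])

# Encode two labels, then map the integer codes back to the labels.
codes = make_encoder.transform(["Toyota", "BMW"])
decoded = make_encoder.inverse_transform(codes)
print(list(codes), list(decoded))
```

Applied per column, for example `df["make"] = label_encoders["make"].inverse_transform(df["make"])`, this would restore the labels next to the predicted prices.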


### 8. Use your trained model to make price estimates

Build a prediction function that uses the trained model saved in the regression_model.pth file, then test it on data that the user enters manually. As a reminder, the categorical features were label-encoded, and the fitted encoders were saved in the label_encoders.pkl file. The user must also supply the numeric variables.

In [7]:
!pip install --quiet gradio==3.48.0




In [8]:
import gradio as gr
from functools import partial
import warnings
warnings.filterwarnings('ignore')

# Define the callback that builds a one-row DataFrame from the entered values and returns the predicted price
def print_values(year, make, model, trim, body, transmission, condition, odometer, color, interior, model_data, label_encoders):
    data = {
        "Year": [year],
        "Make": [make],
        "Model": [model],
        "Trim": [trim],
        "Body": [body],
        "Transmission": [transmission],
        "Condition": [condition],
        "Odometer": [odometer],
        "Color": [color],
        "Interior": [interior]
    }
    df = pd.DataFrame(data)
        
    df_encoded = predict_selling_price(df, label_encoders, model_data)
    # Convert the NumPy scalar to a plain Python float so it serializes cleanly
    return float(df_encoded.iloc[0]['predicted_sale_price'])

print_values_partial = partial(
    print_values, 
    model_data=model_data, 
    label_encoders=label_encoders,
)

In [9]:
# Create the Gradio interface
with gr.Blocks() as demo:
    with gr.Row():
        year = gr.Number(label="Year", value=2014)
        make = gr.Dropdown(label="Make", choices=["Toyota", "Ford", "Volkswagen", "BMW"], value="Toyota")
        model = gr.Dropdown(label="Model", choices=["Prius", "Mustang", "Jetta", "X1"], value="Prius")
        trim = gr.Dropdown(label="Trim", choices=["Base", "SL", "GLS", "Luxury"], value="Base")
        body = gr.Dropdown(label="Body", choices=["Hatchback", "Sedan", "SUV", "Minivan"], value="Hatchback")

    with gr.Row():
        transmission = gr.Dropdown(label="Transmission", choices=["automatic", "manual"], value="automatic")
        condition = gr.Number(label="Condition", value=45.0)
        odometer = gr.Number(label="Odometer", value=33761.0)
        color = gr.Dropdown(label="Color", choices=["red", "white", "black", "silver"], value="red")
        interior = gr.Dropdown(label="Interior", choices=["black", "gray", "brown"], value="black")

    submit_button = gr.Button("Submit")
    output = gr.JSON(label="Predicted Sale Price")

    submit_button.click(
        print_values_partial,
        inputs=[year, make, model, trim, body, transmission, condition, odometer, color, interior],
        outputs=output,
    )

# Launch the interface
demo.launch(share=True)

Running on local URL:  http://127.0.0.1:7860
2024-06-24 15:46:04,304 INFO: Found credentials in shared credentials file: ~/.aws/credentials
IMPORTANT: You are using gradio version 3.48.0, however version 4.29.0 is available, please upgrade.
--------
Running on public URL: https://243ef10c902b522d8a.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
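
The dropdown choices in the interface are hard-coded, so a value that the fitted encoders never saw (for example a different capitalisation of a body type) would make the encoding step fail. A possible alternative, sketched here under assumptions, is to derive the choices from each encoder's `classes_` attribute. `dropdown_choices` is a hypothetical helper, and a `SimpleNamespace` stands in for a fitted `LabelEncoder`.

```python
from types import SimpleNamespace


def dropdown_choices(label_encoders, column):
    """Hypothetical helper: list the categories a fitted encoder knows for a column."""
    return [str(c) for c in label_encoders[column.lower()].classes_]


# Stand-in for the real label_encoders dict; a fitted LabelEncoder exposes
# the same classes_ attribute.
fake_encoders = {"transmission": SimpleNamespace(classes_=["automatic", "manual"])}

choices = dropdown_choices(fake_encoders, "Transmission")
print(choices)
```

In the interface this could be used as `gr.Dropdown(label="Make", choices=dropdown_choices(label_encoders, "make"))`, which keeps the form limited to values the model can encode.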




This completes the feature engineering, training, and inference pipelines, and with them the delivery of an end-to-end machine learning system.

---