***

# Taxi Trip Fare Prediction - Model 1

***

The goal of this example is to train and serve a taxi trip fare prediction model. We will:
- train an ML model on historical taxi trip fare data
- serve the ML model to predict the fare for new trips

### Prepare your data

The trip table is a csv file containing information about taxi trips. Each row includes the pickup datetime, pickup latitude and longitude, dropoff latitude and longitude, pickup and dropoff zipcodes, passenger count and fare amount. We will download the csv file using `wget` and print the first few lines using the `head` command.

In [None]:
!wget http://<wget server address>:8011/trip_table11.csv

In [None]:
!head -n 5 trip_table11.csv
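
The same peek can be done from Python, which is convenient when you want the rows split into fields. The following is a minimal sketch: the inline sample stands in for `trip_table11.csv`, and its column order and values are invented for illustration.

```python
import csv
import io
from itertools import islice


def peek_csv(stream, n=5):
    """Return the first n parsed lines (header included), like `head -n 5`."""
    return list(islice(csv.reader(stream), n))


# Hypothetical sample; the real file may order its columns differently.
sample = io.StringIO(
    "pickup_datetime,pickup_latitude,pickup_longitude,dropoff_latitude,"
    "dropoff_longitude,pickup_zipcode,dropoff_zipcode,passenger_count,fare_amount\n"
    "2022-10-27 08:39:00,40.7514,-73.994,40.7599,-73.9795,10001,10111,2,9.5\n"
    "2022-10-27 18:57:00,40.754,-73.9721,40.7296,-73.987,10017,10003,1,12.0\n"
)

rows = peek_csv(sample)
for row in rows:
    print(row)
```

To read the downloaded file instead, pass `open("trip_table11.csv", newline="")` in place of the sample.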

### Upload your data

We will use MySQL as the data source for the trip table. We will upload the csv file to a MySQL server and connect that MySQL server to the Elevo platform. Use `mysql-load-csv.py` to upload a csv file to the MySQL server. It accepts the following options:
- `-b` specifies the IP address of the MySQL server
- `-u` and `-p` specify the MySQL username and password
- `-i` specifies the input csv file name
- `-k` specifies the MySQL table name
- `-m` specifies the MySQL index column names
- `-n` specifies the MySQL primary key column names
- `-g` obtains the MySQL server credentials
- `-h` displays help

Note the `mysql source meta` from the upload output. It will be used later to connect MySQL to the Elevo platform.

In [None]:
!mysql-load-csv.py -b <mysql host> -u '<mysql user>' -p '<mysql password>' -i trip_table11.csv -k trip_table -m pickup_datetime -n 'pickup_datetime,pickup_longitude,pickup_latitude,dropoff_longitude,dropoff_latitude'

In [None]:
!mysql-load-csv.py -b <mysql host> -u '<mysql user>' -p '<mysql password>' -g

***

### Create a Foresight project

We will create a project called `trip_fare` for this example.

In [None]:
create project trip_fare

***

# Connect your Data Sources

<html><img src="1_1.png"/></html>

In this step we will connect the MySQL data source to Elevo. This will allow Elevo to read the trip table from the MySQL server.

### Create a Foresight ML sources file

Data sources are connected to Elevo via a Foresight ML sources file. Create a Foresight ML sources file using the templates and code snippets available at the icons to the left. Refer to the Elevo Foresight User Manual for help. The MySQL server credentials must be specified in the sources file.
Alternatively you may use the Foresight ML sources file from this tutorial.

**Make sure you update the Foresight ML sources file with the correct MySQL server URL and user credentials obtained from the *"Upload your data"* step above.**
<br>The relevant section in the `trip_fare_data_sources_1.yml` file looks like this:
    
            meta:
              source_type: mysql
              source_format: jdbc
              url: jdbc:mysql://<mysql host>:3306/tutorial_client_<xxxx_xxxxxx>       <<<
              user: <mysql user>                                                      <<<
              password: <mysql password>                                              <<<
              driver: com.mysql.jdbc.Driver

In [None]:
!cat ~/tutorial/examples/trip_fare_prediction_model_1/trip_fare_data_sources_1.yml
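
Before moving on, it can help to confirm that no `<...>` placeholder was left in the sources file. The sketch below is one way to check, not part of Foresight: it scans text for angle-bracket placeholders with a regular expression. The filled-in host, database suffix, user and password are hypothetical values used only for the demonstration.

```python
import re

# Matches <placeholder> tokens on a single line; the "<<<" edit markers do not match.
PLACEHOLDER = re.compile(r"<[^<>\n]+>")


def find_placeholders(text):
    """Return every unfilled <...> placeholder found in the text."""
    return PLACEHOLDER.findall(text)


template = """\
meta:
  source_type: mysql
  source_format: jdbc
  url: jdbc:mysql://<mysql host>:3306/tutorial_client_<xxxx_xxxxxx>       <<<
  user: <mysql user>                                                      <<<
  password: <mysql password>                                              <<<
  driver: com.mysql.jdbc.Driver
"""

# Hypothetical values, for illustration only.
filled = (
    template.replace("<mysql host>", "10.0.0.5")
    .replace("<xxxx_xxxxxx>", "abcd_123456")
    .replace("<mysql user>", "tutorial_user")
    .replace("<mysql password>", "example-password")
)

unfilled_before = find_placeholders(template)
unfilled_after = find_placeholders(filled)
print("before:", unfilled_before)
print("after:", unfilled_after)
```

To check your own file, pass the contents of `trip_fare_data_sources_1.yml` to `find_placeholders`. A password that itself contains text in angle brackets would be reported as a false positive.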

#### Add column schema to your data sources file

Foresight can automatically infer the column schema from your data sources. Use the `add columns` command to infer the schema and update the ML sources file with it. After this command completes, you must review the column schema for correctness and, if necessary, edit the ML sources file to fix column names or data types. Alternatively, you may manually edit the ML sources file and add all the column names and data types to match your data source schema.

In [None]:
add columns trip_fare_data_sources_1.yml

In [None]:
!cat ~/tutorial/examples/trip_fare_prediction_model_1/trip_fare_data_sources_1.yml
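
To see why the review step matters, consider how value-based type inference behaves in general. The sketch below is a simplified illustration, not Foresight's algorithm: it picks the narrowest of three hypothetical type labels (`integer`, `float`, `string`) that fits every sampled value. The sample rows are invented.

```python
import csv
import io


def infer_type(values):
    """Pick the narrowest of integer, float or string that fits every value."""
    for caster, name in ((int, "integer"), (float, "float")):
        try:
            for value in values:
                caster(value)
            return name
        except ValueError:
            continue
    return "string"


def infer_schema(stream):
    """Infer a {column: type} mapping from all rows of a csv stream."""
    reader = csv.DictReader(stream)
    rows = list(reader)
    return {col: infer_type([row[col] for row in rows]) for col in reader.fieldnames}


# Hypothetical sample rows, for illustration only.
sample = io.StringIO(
    "pickup_datetime,pickup_zipcode,dropoff_zipcode,passenger_count,fare_amount\n"
    "2022-10-27 08:39:00,10001,10111,2,9.5\n"
    "2022-10-27 18:57:00,10017,10003,1,12.0\n"
)

schema = infer_schema(sample)
for column, type_name in schema.items():
    print(f"{column}: {type_name}")
```

Here the zipcode columns come out as `integer` because their values look numeric, although a zipcode is usually better treated as a categorical value. An inferred schema can be technically valid and still need a manual fix.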

If you are using the Foresight ML sources file from this tutorial, copy it to your project location using the `cp` command in the cell below.

In [None]:
!cp ~/tutorial/examples/trip_fare_prediction_model_1/trip_fare_data_sources_1.yml ~/projects/trip_fare/

***

# Create a Training Dataset

<html><img src="1_2.png"/></html>

In this step we will create a training dataset using the trip table data source. We will use the pickup_zipcode, dropoff_zipcode and passenger_count as input features to the ML model. The fare_amount will be the target or label for the ML model to train. 

### Create a Foresight ML job file to generate a training dataset

The training dataset will be created using a SQL command. SQL commands can be executed via Foresight ML job files. Create a Foresight ML job file using the templates and code snippets available at the icons to the left. Refer to the Elevo Foresight User Manual for help.
Alternatively you may view and copy the Foresight ML job file from this tutorial to your project location using the `cp` command in the cells below.

In [None]:
!cat ~/tutorial/examples/trip_fare_prediction_model_1/trip_fare_train_dataset_1.ml

In [None]:
!cp ~/tutorial/examples/trip_fare_prediction_model_1/trip_fare_train_dataset_1.ml ~/projects/trip_fare/
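
The SQL in the job file is not reproduced here, but the idea of selecting the feature and label columns from the trip table can be sketched as follows. This is an assumed query, run against an in-memory SQLite table that stands in for the MySQL source; the rows are invented and the actual job file may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trip_table ("
    "pickup_datetime TEXT, pickup_zipcode TEXT, dropoff_zipcode TEXT, "
    "passenger_count INTEGER, fare_amount REAL)"
)

# Hypothetical rows, for illustration only.
conn.executemany(
    "INSERT INTO trip_table VALUES (?, ?, ?, ?, ?)",
    [
        ("2022-10-27 08:39:00", "10001", "10111", 2, 9.5),
        ("2022-10-27 18:57:00", "10017", "10003", 1, 12.0),
    ],
)

# Input features plus the fare_amount label; pickup_datetime is kept
# so the dataset can later be explored along a datetime column.
train_rows = conn.execute(
    "SELECT pickup_datetime, pickup_zipcode, dropoff_zipcode, "
    "passenger_count, fare_amount "
    "FROM trip_table ORDER BY pickup_datetime"
).fetchall()

for row in train_rows:
    print(row)
conn.close()
```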

### Create the dataset

Use the `create dataset` command to execute the Foresight ML job file to create the training dataset in Elevo. The `list datasets` command will list the created datasets within a project. The `display dataset` command will display the first few rows of the training dataset.

**This command may take up to 10 minutes due to the size of the dataset.**

In [None]:
create dataset trip_fare_train_dataset_1

In [None]:
list datasets

In [None]:
display dataset trip_fare_train_dataset_1

### Explore the dataset

Use the `explore dataset` command to visually explore the dataset using the Elevo Foresight data explorer. The `target_column` is the target or label for ML training. Click on the output url to visualize the dataset.

**This command may take a few minutes due to the size of the dataset.**

In [None]:
explore dataset trip_fare_train_dataset_1,datetime_column=pickup_datetime,target_column=fare_amount

***

# Train an ML Model

<html><img src="1_3.png"/></html>

In this step we will train an ML model using the training dataset that was created. We will use the pickup_zipcode, dropoff_zipcode and passenger_count as input features to the ML model. The fare_amount will be the target or label for the ML model to train. 

### Create a Foresight ML job file for model training

ML model training is initiated via a Foresight ML job file which specifies the ML training parameters. Create a Foresight ML job file using the templates and code snippets available at the icons to the left. Refer to the Elevo Foresight User Manual for help.
Alternatively you may view and copy the Foresight ML job file from this tutorial to your project location using the `cp` command in the cells below.

In [None]:
!cat ~/tutorial/examples/trip_fare_prediction_model_1/trip_fare_model_train_1.ml

In [None]:
!cp ~/tutorial/examples/trip_fare_prediction_model_1/trip_fare_model_train_1.ml ~/projects/trip_fare/

### Start ML model training

Use the `start training` command to execute the Foresight ML job file to start the model training in Elevo. The `status training` command will show the status of the model training. 

**Click the url shown in the output to open a *TensorBoard* session that displays the training progress and metrics.** After opening the *TensorBoard* url, click on the reload button at the top right of the *TensorBoard* page.

In [None]:
start training trip_fare_model_train_1

In [None]:
list tensorboard trip_fare_model_1,1

#### Wait for ML model training to complete

Use the `status training` command to check the status of the model training. Wait until the status shows COMPLETED.

**Training could take 10 minutes or more to complete.**

In [None]:
status training trip_fare_model_train_1
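
If you prefer to script the wait rather than re-run the cell by hand, a generic polling loop is one option. The sketch below assumes a hypothetical `get_status` callable that returns the current status as a string. Foresight does not necessarily expose such a function, and the `RUNNING` and `FAILED` values are assumptions; only `COMPLETED` is taken from this tutorial.

```python
import time


def wait_for_completion(get_status, poll_seconds=30.0, timeout_seconds=1800.0,
                        sleep=time.sleep):
    """Poll get_status() until it reports a final state or the timeout passes."""
    waited = 0.0
    while True:
        status = get_status()
        if status in ("COMPLETED", "FAILED"):  # "FAILED" is an assumed state
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"still {status!r} after {waited:.0f} seconds")
        sleep(poll_seconds)
        waited += poll_seconds


# Simulated status source standing in for the real status check.
statuses = iter(["RUNNING", "RUNNING", "COMPLETED"])
final_status = wait_for_completion(
    lambda: next(statuses), poll_seconds=0, sleep=lambda seconds: None
)
print(final_status)
```

The `sleep` parameter is injectable so the loop can be exercised without real delays, as in the simulated run above.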

## Register a trained ML model

After the training is complete, the `status training` command will show COMPLETED status. The trained ML model must be registered before it can be used for predictions. The `list trained-models` command will list all the trained models within a project. The `register model` command will register a trained model. The `list registered-models` command will list all registered models within a project.

In [None]:
list trained-models trip_fare_model_1

In [None]:
register model trip_fare_model_1,1,PRODUCTION

In [None]:
list registered-models

***

# Serve an ML Model

<html><img src="1_4.png"/></html>

In this step we will deploy a trained ML model to serve prediction requests. 

### Create a Foresight ML job file for model serving

ML models are deployed via a Foresight ML job file which specifies the ML serving options. Create a Foresight ML job file using the templates and code snippets available at the icons to the left. Refer to the Elevo Foresight User Manual for help.
Make sure to create a prediction Foresight ML sources file to match your ML job file. You will need to add two REST sources, one for the prediction REST request and one for the prediction REST response. You will need to add a prediction log table definition.

Alternatively you may view and copy the Foresight ML job file and ML sources file from this tutorial to your project location using the `cp` command in the cells below.

In [None]:
!cat ~/tutorial/examples/trip_fare_prediction_model_1/trip_fare_model_serve_1.ml

In [None]:
!cat ~/tutorial/examples/trip_fare_prediction_model_1/trip_fare_prediction_sources_1.yml

In [None]:
!cp ~/tutorial/examples/trip_fare_prediction_model_1/trip_fare_model_serve_1.ml ~/projects/trip_fare/

In [None]:
!cp ~/tutorial/examples/trip_fare_prediction_model_1/trip_fare_prediction_sources_1.yml ~/projects/trip_fare/

### Deploy the model

Use the `start prediction` command to execute the Foresight ML job file to deploy a model in Elevo. The `status prediction` command will show the status of the model serving. The url shown in the output is the endpoint to which REST prediction requests may be sent via `curl` or some other means.

In [None]:
start prediction trip_fare_model_serve_1

In [None]:
status prediction trip_fare_model_serve_1

## Predict trip fare amounts

Use the `curl` command to send prediction requests to the deployed model via the serving url shown above. Change the http url in the two cells below to match the url shown above and execute the `curl` commands.

For predictions, get the current datetime by executing the cell below and use that datetime as the `pickup_datetime` value in the prediction `curl` request.

In [None]:
!date -u +'"pickup_datetime":"%Y-%m-%d %H:%M:%S", "hour_of_day":%H, "calendar_day":"%Y-%m-%d"'

In [None]:
!curl -X GET http://<use url info from above status prediction cmd> -H "Content-Type: application/json" -d \
'[{"pickup_datetime": "2022-10-27 08:39:00", "pickup_latitude": "40.7514", "pickup_longitude": "-73.994", "dropoff_latitude": "40.7599", "dropoff_longitude": "-73.9795", "pickup_zipcode": "10001", "dropoff_zipcode": "10111", "passenger_count": 2}]'

In [None]:
!curl -X GET http://<use url info from above status prediction cmd> -H "Content-Type: application/json" -d \
'[{"pickup_datetime": "2022-10-27 18:57:00", "pickup_latitude": "40.754", "pickup_longitude": "-73.9721", "dropoff_latitude": "40.7296", "dropoff_longitude": "-73.987", "pickup_zipcode": "10017", "dropoff_zipcode": "10003", "passenger_count": 1}]'
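
The same request can be assembled in Python, which makes it easier to generate many trips programmatically. The following is a minimal sketch using only the standard library. It mirrors the `curl` payloads above, including the GET method with a JSON body. `SERVING_URL` is a placeholder assumption and the helper names are hypothetical; the request is built but not sent.

```python
import json
import urllib.request
from datetime import datetime

# Placeholder endpoint (assumption); use the url reported by `status prediction`.
SERVING_URL = "http://localhost:8080/predict"


def build_trip(when, pickup, dropoff, pickup_zipcode, dropoff_zipcode, passenger_count):
    """Build one trip record shaped like the curl payloads above."""
    return {
        "pickup_datetime": when.strftime("%Y-%m-%d %H:%M:%S"),
        "pickup_latitude": str(pickup[0]),
        "pickup_longitude": str(pickup[1]),
        "dropoff_latitude": str(dropoff[0]),
        "dropoff_longitude": str(dropoff[1]),
        "pickup_zipcode": pickup_zipcode,
        "dropoff_zipcode": dropoff_zipcode,
        "passenger_count": passenger_count,
    }


def build_request(url, trips):
    """Wrap a list of trips in a JSON request, mirroring `curl -X GET ... -d`."""
    body = json.dumps(trips).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="GET"
    )


trip = build_trip(
    datetime(2022, 10, 27, 8, 39, 0),
    (40.7514, -73.994),
    (40.7599, -73.9795),
    "10001",
    "10111",
    2,
)
request = build_request(SERVING_URL, [trip])
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To send it to a live endpoint: urllib.request.urlopen(request).read()
```

For a prediction at the current time, pass `datetime.now(timezone.utc)` (with `timezone` imported from `datetime`) as the first argument, which matches the UTC output of the `date -u` cell.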

### Stop the deployed model

Use the `stop prediction` command to stop ML model serving when you have completed the prediction requests. This step is optional; you may choose to leave the model deployed.

In [None]:
stop prediction trip_fare_model_serve_1