In [3]:
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.metrics import mean_squared_error

## Package a machine learning model
Train a LinearRegression model from scikit-learn to predict the profit of a Unicorn Company.
Use MLflow's built-in scikit-learn flavor to package the model. The flavor's autologging function automatically logs metrics, parameters, and the model to MLflow Tracking when the estimator's fit method is called.

## Storage Format
MLflow uses a specific storage format in order to standardize the way models are packaged. 

## What's in an MLmodel file?
You learned that the MLmodel file is used to define specific information about how our models can be loaded and integrated with existing ML tools.

An MLmodel file has been assigned to a variable called "mlmodel". Run print(mlmodel) in the IPython Shell to view the contents of the file.
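As a rough illustration of what such a file can contain, the sketch below embeds a simplified MLmodel-style text and lists the flavors it declares. The sample text and the helper `list_flavors` are hypothetical: the values are placeholders rather than output from a real run, and a real file carries more fields. For anything beyond a quick look, a YAML parser is the better tool.

```python
# Simplified, illustrative MLmodel contents. Every value is a placeholder,
# not output copied from a real run.
SAMPLE_MLMODEL = """\
artifact_path: model
flavors:
  python_function:
    loader_module: mlflow.sklearn
    model_path: model.pkl
  sklearn:
    pickled_model: model.pkl
    serialization_format: cloudpickle
run_id: <run-id>
"""


def list_flavors(mlmodel_text):
    """Return the flavor names nested directly under the top-level `flavors:` key."""
    flavors = []
    in_flavors = False
    for line in mlmodel_text.splitlines():
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip())
        if indent == 0:
            # A new top-level key starts; only `flavors:` opens the section we want.
            in_flavors = line.strip() == "flavors:"
        elif in_flavors and indent == 2 and line.rstrip().endswith(":"):
            flavors.append(line.strip()[:-1])
    return flavors


print(list_flavors(SAMPLE_MLMODEL))
```

Each flavor listed in the file is one way the same saved model can be loaded, which is why a scikit-learn model typically appears under both a library-specific flavor and the generic python_function flavor.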


## Model API
MLflow Models are a way to standardize how ML models are packaged. The Model API is used to interact with models. With the Model API users can save, log, and load an MLflow model using a particular flavor.

## MLflow REST API
MLflow uses a REST API that allows users to create, list, and retrieve information programmatically from every component of MLflow. 
MLflow integrates with several common libraries such as scikit-learn. Using the mlflow.sklearn module, users can do the following: the save_model function saves a model to the local filesystem; the log_model function logs the model to MLflow Tracking as an artifact within a run; and the load_model function loads a model from either the local filesystem or MLflow Tracking.

## Load model
When loading an MLflow Model, it is important to understand which location formats are supported. When loading a model from the local filesystem, load_model supports both relative and absolute paths. To load a model from MLflow Tracking, MLflow uses a "runs" format in which the run id and model path must be included. MLflow also supports loading models from AWS S3 and other cloud storage services.
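The location formats described above can be sketched with plain string handling. The helpers `runs_uri` and `describe_model_location` are hypothetical names introduced for illustration only; they are not part of MLflow, and the absolute-path check assumes POSIX-style paths.

```python
def runs_uri(run_id, artifact_path):
    """Build a Tracking URI of the form runs:/<run_id>/<artifact_path>."""
    return f"runs:/{run_id}/{artifact_path.strip('/')}"


def describe_model_location(location):
    """Classify a model location string (illustrative, not exhaustive)."""
    if location.startswith("runs:/"):
        return "mlflow-tracking"
    if location.startswith("s3://"):
        return "s3"
    if location.startswith("/"):
        # Assumes POSIX-style absolute paths.
        return "local-absolute"
    return "local-relative"


print(runs_uri("<run-id>", "lr_tracking"))
print(describe_model_location("lr_local_v1"))
```

The resulting string is what gets passed to load_model, for example the `runs:/{run_id}/lr_tracking` form used later in this notebook.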

## Logging and loading a model
The Model API provides a way to interact with our models by logging and loading them directly from MLflow Tracking in a standardized manner. Being able to interact with models is crucial during the ML lifecycle for the Model Engineering and Model Evaluation steps.

## Creating a custom Python Class
MLflow lets users create custom models to support a wide variety of use cases. To create one, define a Python class that inherits from the mlflow.pyfunc.PythonModel class. PythonModel provides customization through methods for loading artifact dependencies (load_context) and for custom inference logic (predict).

## MLflow Models Deployment
MLflow Models is used to standardize model packaging, log models for tracking, and evaluate model performance. These features, in combination, cover the "Model Engineering" and "Model Evaluation" steps of the ML Lifecycle.

MLflow serves models as a REST API. MLflow's API used for deploying models defines four endpoints: The ping and health endpoints are used to get health information about the REST API service. The version endpoint is used to retrieve the version of MLflow used on the REST API. And finally, the invocations endpoint is used to retrieve a score from the deployed model. The REST API uses port 5000 by default. Each endpoint can be reached once a model is deployed by going to the URL of where MLflow is running.
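A small sketch of how the four endpoint URLs are formed, assuming the server runs on localhost with the default port 5000. The helper `endpoint_urls` is a hypothetical name used only for illustration; it builds the URLs and does not contact any server.

```python
from urllib.parse import urljoin

# The four endpoints named in the text above.
ENDPOINTS = ("ping", "health", "version", "invocations")


def endpoint_urls(host="localhost", port=5000):
    """Return a mapping from endpoint name to its full URL."""
    base = f"http://{host}:{port}/"
    return {name: urljoin(base, name) for name in ENDPOINTS}


for name, url in endpoint_urls().items():
    print(f"{name}: {url}")
```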

## Invocations endpoint
The invocations endpoint accepts either CSV or JSON as input. Each request must also include a Content-Type header set to either application/json or application/csv to specify the input format.

### CSV and JSON format
When using CSV input, the payload must be the CSV representation of a valid pandas DataFrame, which the DataFrame's to_csv method produces. JSON input must be a dictionary with exactly one of the fields dataframe_split or dataframe_records. The field name tells the REST API how the input data is laid out.
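To make the two JSON layouts and the CSV form concrete, here is a minimal sketch using only the standard library. The helper names and the sample columns and values are hypothetical; with pandas available, a DataFrame's to_dict and to_csv methods would normally produce these payloads instead.

```python
import csv
import io
import json


def to_dataframe_split(columns, rows):
    """Column names once, then the rows as lists."""
    return {"dataframe_split": {"columns": list(columns),
                                "data": [list(r) for r in rows]}}


def to_dataframe_records(columns, rows):
    """One dictionary per row, keyed by column name."""
    return {"dataframe_records": [dict(zip(columns, r)) for r in rows]}


def to_csv_text(columns, rows):
    """Header line followed by one CSV line per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample input.
columns = ["x", "y"]
rows = [[10, 20], [30, 40]]

print(json.dumps(to_dataframe_split(columns, rows)))
print(json.dumps(to_dataframe_records(columns, rows)))
print(to_csv_text(columns, rows))
```

The split layout avoids repeating column names for every row, while the records layout is easier to read for a handful of rows.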

## Serving Models
To serve a model, the MLflow command line interface includes the serve command, run as mlflow models serve. It launches a local webserver that runs the REST API used for serving models.
Model deployment is another important step of the ML Lifecycle. Models can be served from the local filesystem, from MLflow Tracking, and from cloud storage such as AWS S3.

```bash
mlflow models serve -m runs:/7de9bbe306224c2c9842beb11357d084/model
```
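When the server is launched from a script, the same command can be assembled as an argument list. This is a sketch under assumptions: `serve_command` is a hypothetical helper, the `-p` option sets the port, and `--env-manager` is assumed to be available in the installed MLflow version. The command is only built and printed here, not executed.

```python
import shlex


def serve_command(model_uri, port=5000, env_manager=None):
    """Build the argument list for `mlflow models serve`."""
    cmd = ["mlflow", "models", "serve", "-m", model_uri, "-p", str(port)]
    if env_manager is not None:
        # Assumption: the installed MLflow version supports --env-manager.
        cmd += ["--env-manager", env_manager]
    return cmd


cmd = serve_command("runs:/7de9bbe306224c2c9842beb11357d084/model")
print(shlex.join(cmd))
```

Passing the list to subprocess.Popen would start the server as a child process, which keeps the calling script free to send requests once the server is up.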

## Score from a served model
Once a model has been served with the mlflow models serve command, request a score by sending a POST request to the invocations endpoint:

```bash
curl -d '{"dataframe_split": {"columns": ["x"], "data": [[10]]}}' -H 'Content-Type: application/json' -X POST localhost:5000/invocations
```
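The same request can be sent from Python. The sketch below uses only the standard library and assumes a model is being served at localhost:5000; the function names are hypothetical. Building the request is separated from sending it, so the request can be inspected without a running server.

```python
import json
import urllib.request

DEFAULT_URL = "http://localhost:5000/invocations"


def build_invocation_request(payload, url=DEFAULT_URL):
    """Create a POST request carrying a JSON payload."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def score(payload, url=DEFAULT_URL):
    """Send the request and return the decoded JSON response.

    Requires a model server to be running at `url`.
    """
    with urllib.request.urlopen(build_invocation_request(payload, url)) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Same payload as the curl command above; nothing is sent yet.
request = build_invocation_request(
    {"dataframe_split": {"columns": ["x"], "data": [[10]]}}
)
print(request.get_method(), request.full_url)
```

Calling `score(...)` with the same payload would return whatever the served model responds with.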



In [None]:
# Import Scikit-learn flavor
import mlflow.sklearn

# Set the experiment to "Sklearn Model"
mlflow.set_experiment("Sklearn Model")

# Set Auto logging for Scikit-learn flavor 
mlflow.sklearn.autolog()

lr = LinearRegression()
lr.fit(X_train, y_train)

# Get a prediction from test data
print(lr.predict(X_test.iloc[[5]]))

In [1]:
import mlflow

In [2]:
# Create new experiment
mlflow.create_experiment("Unicorn Model")
# Set the experiment
mlflow.set_experiment("Unicorn Model")

# Tag new experiment
mlflow.set_experiment_tag("version", "1.0")



## What is MLflow Tracking?
+ Model metrics
    - F1, Recall, Accuracy, MSE, etc.
+ Parameters
    - library specific
+ Code
    - train.py
+ Other artifacts
    - tokenizers, pickle, etc.

In [None]:
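# Start a run
mlflow.start_run()

# Model training code here
lr = LinearRegression(n_jobs=1)
lr.fit(X_train, y_train)

# Model evaluation code here
y_pred = lr.predict(X_test)

# score() takes the test features and true targets; for a regressor it returns R-squared
r2_score = lr.score(X_test, y_test)

# Take the square root of the mean squared error to get the RMSE
rmse = mean_squared_error(y_test, y_pred) ** 0.5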


In [None]:
# Log the metric r2_score as "r2_score"
mlflow.log_metric("r2_score", r2_score)

# Log parameter n_jobs as "n_jobs"
mlflow.log_param("n_jobs", 1)

# Log the training code
mlflow.log_artifact("train.py")

In [None]:
# Create a filter string for R-squared score
r_squared_filter = "metrics.r2_score > .70"

# Search runs
mlflow.search_runs(experiment_names=["Unicorn Sklearn Experiments", "Unicorn Other Experiments"], 
                   filter_string=r_squared_filter, 
                   order_by=["metrics.r2_score DESC"])

In [None]:
# Load model from local filesystem
model = mlflow.sklearn.load_model("lr_local_v1")

# Training Data
X = df[["R&D Spend", "Administration", "Marketing Spend", "State"]]
y = df[["Profit"]]
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7,random_state=0)
# Train Model
model.fit(X_train, y_train)

# Save model to local filesystem
mlflow.sklearn.save_model(model, "lr_local_v2")

In [None]:
# Log model to MLflow Tracking
mlflow.sklearn.log_model(lr_model, "lr_tracking")

# Get the last run
run = mlflow.last_active_run()

# Get the run_id of the above run
run_id = run.info.run_id


# Load model from MLflow Tracking
model = mlflow.sklearn.load_model(f"runs:/{run_id}/lr_tracking")

In [None]:
# Create Python Class
class CustomPredict(mlflow.pyfunc.PythonModel):
    # Set method for loading model
    def load_context(self, context):
        self.model = mlflow.sklearn.load_model("./lr_model/")
    # Set method for custom inference     
    def predict(self, context, model_input):
        predictions = self.model.predict(model_input)
        decoded_predictions = []  
        for prediction in predictions:
            if prediction == 0:
                decoded_predictions.append("female")
            else:
                decoded_predictions.append("male")
        return decoded_predictions

In [None]:
# Eval Data: copy X_test so that adding the target column does not modify it
eval_data = X_test.copy()
eval_data["sex"] = y_test
# Log the lr_class model using Scikit-Learn Flavor
mlflow.sklearn.log_model(lr_class, "model")

# Get run id
run = mlflow.last_active_run()
run_id = run.info.run_id

# Evaluate the logged model with eval_data data
mlflow.evaluate(f"runs:/{run_id}/model", 
        data = eval_data, 
        targets="sex",
        model_type="classifier"
)