# Installation
Install MLflow with `pip install mlflow`.

Verify the installation by running `import mlflow`.
## Potential issue
If you run into `ModuleNotFoundError: No module named 'google.protobuf.internal'`, run the following command in the terminal:
`conda install protobuf`
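
Before applying the fix, it can help to confirm which module is actually missing. The following is a minimal sketch that uses only the standard library. The helper name `is_installed` is made up for illustration and is not part of MLflow.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if `module_name` can be found in the current environment."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name is missing.
        return False


# Check MLflow itself and the protobuf submodule named in the error message.
for name in ("mlflow", "google.protobuf.internal"):
    print(f"{name}: {'found' if is_installed(name) else 'missing'}")
```

If `mlflow` is reported as found but `google.protobuf.internal` is missing, the protobuf fix above is the likely remedy.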

### Sample code: using MLflow Tracking

The function below takes a set of parameters for a `GradientBoostingRegressor` and fits the regressor with them.
It then logs the parameters to MLflow (`mlflow.log_params`), along with a couple of other interesting things: the metrics (here only the MSE) and the model itself!

MLflow creates a folder called `mlruns` in the working directory (i.e. the directory the code is executed from; use `!pwd` to see the notebook's working directory). This folder stores all of the tracked metadata.
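
To see where that folder will land without leaving Python, a small sketch along these lines may help. It assumes the default local file store described above and does not call MLflow at all. The variable names are illustrative.

```python
from pathlib import Path

# Assumption: MLflow uses its default local file store, as described above,
# so tracked runs end up in an "mlruns" folder inside the working directory.
working_dir = Path.cwd()
expected_store = working_dir / "mlruns"

print(f"Working directory: {working_dir}")
print(f"Expected tracking folder: {expected_store}")
print(f"Already exists: {expected_store.is_dir()}")
```

Running it before and after the first call to `train` should show the folder appearing.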

In [49]:
def train(params):
    """Fit a GradientBoostingRegressor with `params` and log the run to MLflow."""
    import numpy as np
    from sklearn import datasets, ensemble
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import train_test_split

    import mlflow
    import mlflow.sklearn

    # Load the diabetes dataset and hold out 10% of it for testing
    diabetes = datasets.load_diabetes()
    X, y = diabetes.data, diabetes.target
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=13)

    # Everything logged inside this block is attached to a single MLflow run
    with mlflow.start_run():

        reg = ensemble.GradientBoostingRegressor(**params)
        reg.fit(X_train, y_train)

        # Evaluate on the held-out test set
        mse = mean_squared_error(y_test, reg.predict(X_test))

        print(f"MSE: {mse}")

        # Log the hyperparameters, the metric, and the fitted model
        mlflow.log_params(params)
        mlflow.log_metric("mse", mse)

        mlflow.sklearn.log_model(reg, "model")

In [50]:
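# Note: 'squared_error' is the current name of the least-squares loss.
# The older alias 'ls' was deprecated in scikit-learn 1.0 and removed in 1.2.
params_1 = {'n_estimators': 500,
            'max_depth': 4,
            'min_samples_split': 5,
            'learning_rate': 0.01,
            'loss': 'squared_error'}

params_2 = {'n_estimators': 100,
            'max_depth': 2,
            'min_samples_split': 5,
            'learning_rate': 0.1,
            'loss': 'squared_error'}

params_3 = {'n_estimators': 1000,
            'max_depth': 6,
            'min_samples_split': 5,
            'learning_rate': 0.05,
            'loss': 'squared_error'}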

train(params_1)
train(params_2)
train(params_3)

MSE: 3034.2704768058793
MSE: 3115.8133716365646
MSE: 3678.989046022396


In [43]:
! ls mlruns

0


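The `0` above is the ID of the default experiment. Each run gets its own subfolder under it, with one file per logged parameter. For example, here we can see the value of the `max_depth` parameter for one of the executed runs: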

In [48]:
cat mlruns/0/42d2f797f34f43ab8f3976ae497b5097/params/max_depth  

2
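
The same lookup can be done for every run at once. The sketch below reads the parameter files straight from disk, assuming the `mlruns/<experiment>/<run>/params/<name>` layout shown above. The function `read_run_params` is a hypothetical helper, not an MLflow API.

```python
from pathlib import Path


def read_run_params(run_dir):
    """Return {param_name: value} for one run folder; values are raw strings."""
    params_dir = Path(run_dir) / "params"
    if not params_dir.is_dir():
        return {}
    return {
        p.name: p.read_text().strip()
        for p in sorted(params_dir.iterdir())
        if p.is_file()
    }


# Walk experiment "0" (the folder listed by `ls mlruns`) and print each run.
experiment_dir = Path("mlruns") / "0"
if experiment_dir.is_dir():
    for run_dir in sorted(experiment_dir.iterdir()):
        if run_dir.is_dir():
            print(run_dir.name, read_run_params(run_dir))
```

Values come back as strings because that is how they are stored in the files; convert them yourself if you need numbers.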

# How to access the MLflow UI
This metadata is meant to be browsed through the MLflow UI rather than read directly from disk.
Run the following command from the working directory (i.e. the directory that contains the `mlruns` folder): `mlflow ui`

This starts a server at http://127.0.0.1:5000. Paste the link into your web browser and you should see the MLflow UI.
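
If the page does not load, a quick check of whether anything is listening on that address can narrow things down. This is a rough sketch using only the standard library. `port_is_open` is an illustrative helper, and a successful connection only shows that some process is listening, not that it is MLflow.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# 127.0.0.1:5000 is the address given above for `mlflow ui`.
if port_is_open("127.0.0.1", 5000):
    print("Something is listening on http://127.0.0.1:5000")
else:
    print("Nothing is listening on port 5000; is `mlflow ui` running?")
```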