# Connect to the remote Databricks MLflow server
- In your terminal (Anaconda Prompt or PowerShell), run ```databricks configure --token``` and enter the following when prompted:
  - The host name, as explained [here](https://learn.microsoft.com/en-us/azure/databricks/workspace/workspace-details#per-workspace-url)
    - Metyis databricks host name: https://adb-8894339424313813.13.azuredatabricks.net/
    - Adaptfy databricks host name: https://adb-243386025177033.13.azuredatabricks.net/
  - Generate **AND COPY** a REST API token through Databricks as shown in the image below, then paste the token at the command-line prompt ![Example Image](\imgs\generate__rest_api_token_through_db.png)
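
To double-check that the configuration step worked, you can inspect the file the CLI writes. This is a minimal sketch, assuming the CLI stored the settings in `~/.databrickscfg` under the `DEFAULT` profile with `host` and `token` keys; `check_databricks_config` is a hypothetical helper, not part of any library.

```python
import configparser
from pathlib import Path


def check_databricks_config(path=Path.home() / ".databrickscfg"):
    """Return True if the DEFAULT profile in `path` has an https host and a token."""
    # Assumption: the file uses INI syntax with a [DEFAULT] section.
    config = configparser.ConfigParser(interpolation=None)
    # read() silently skips missing files, so a missing file yields no values.
    config.read(path)
    defaults = config.defaults()
    host = defaults.get("host", "")
    token = defaults.get("token", "")
    return host.startswith("https://") and bool(token)
```

Calling `check_databricks_config()` without arguments checks the default location in your home directory. It only verifies that both values are present, not that the token is still valid.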

## Set the tracking URI and create an experiment
- **Only one team member has to do this**

In [None]:
import mlflow

team_number = "1" # Please fill in your team number
remote_server_uri = "databricks"
user_name = "...@metyis.com"  # Please fill in your e-mail address: "...@metyis.com" or "...@adaptfy.com"
experiment_name = f"experiment team {team_number}"

# Set the tracking URI; with "databricks", MLflow automatically uses the host and token configured above
mlflow.set_tracking_uri(remote_server_uri)

# Create an experiment for your team; create_experiment returns the new experiment's ID
experiment_id = mlflow.create_experiment(f'/Users/{user_name}/{experiment_name}')

![Created an experiment on DB](\imgs\experiment_created.png)
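
Running the cell above a second time fails, because `mlflow.create_experiment` raises an error when an experiment with that name already exists. A minimal sketch of a re-runnable variant follows; `get_or_create_experiment` is a hypothetical helper, and its `tracking` argument is assumed to be the `mlflow` module.

```python
def get_or_create_experiment(tracking, experiment_path):
    """Return the ID of the experiment at `experiment_path`, creating it if needed.

    `tracking` is expected to be the `mlflow` module (or any object exposing
    `get_experiment_by_name` and `create_experiment`).
    """
    # get_experiment_by_name returns None when no such experiment exists.
    existing = tracking.get_experiment_by_name(experiment_path)
    if existing is not None:
        return existing.experiment_id
    return tracking.create_experiment(experiment_path)
```

In this notebook it could be called as `get_or_create_experiment(mlflow, f'/Users/{user_name}/{experiment_name}')` after the tracking URI has been set.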

---
# Exercise
### Run experiments and log all hyperparameters and metrics
Using the cells below, run your `make_dataset` and `build_features` steps, then train different models and log their hyperparameters and metrics through MLflow.
#### Optional:
- Use scikit-learn's grid search
- Think about anything else that would be relevant to log per model run, such as plots, small tables, ...

MLflow integrates well with scikit-learn; see the `mlflow.sklearn` documentation [here](https://mlflow.org/docs/latest/python_api/mlflow.sklearn.html)
- Check out the [autologging functionality](https://mlflow.org/docs/latest/tracking.html#scikit-learn), which also supports libraries beyond scikit-learn (Spark, XGBoost, ...)
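
If you prefer to log by hand rather than rely on autologging, the exercise can be sketched as one MLflow run per hyperparameter combination. This is only an illustration: `log_grid_runs` and `train_and_score` are hypothetical names, the metric name `"accuracy"` is an assumption, and `tracking` is assumed to be the `mlflow` module.

```python
from itertools import product


def log_grid_runs(tracking, param_grid, train_and_score):
    """Start one run per parameter combination and log its params and score.

    `tracking` is expected to be the `mlflow` module; `train_and_score` is a
    hypothetical callable that takes a params dict and returns a float metric.
    """
    names = sorted(param_grid)
    results = []
    # Iterate over the Cartesian product of all candidate values.
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        with tracking.start_run():
            tracking.log_params(params)
            score = train_and_score(params)
            tracking.log_metric("accuracy", score)
        results.append((params, score))
    return results
```

With the data from the cells below, a call might look like `log_grid_runs(mlflow, {"n_estimators": [50, 100], "max_depth": [4, 6]}, lambda p: RandomForestClassifier(**p).fit(x_train, y_train).score(x_test, y_test))`.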

### Autolog example

In [None]:
# Set the active experiment to your team's experiment
team_number = "team chkpt3" # Please fill in your team number
user_name = "robin.opdam@metyis.com" # "...@metyis.com" # or "...@adaptfy.com"
experiment_name = f"experiment team {team_number}"

mlflow.set_experiment(f'/Users/{user_name}/{experiment_name}')

In [None]:
from src.data.make_dataset import make_dataset
from src.features.build_features import build_features
from sklearn.ensemble import RandomForestClassifier

df = make_dataset()
x_train, x_test, y_train, y_test = build_features(df)

mlflow.autolog()

# Create and train models.
rf = RandomForestClassifier(n_estimators=100, max_depth=6, max_features=3)
rf.fit(x_train, y_train)

# Use the model to make predictions on the test dataset.
predictions = rf.predict(x_test)
# Retrieve the run that autologging created for the fit above.
autolog_run = mlflow.last_active_run()

# Have a look at your experiment!
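
Besides the Databricks UI, the autologged run can also be inspected from Python. This is a minimal sketch, assuming `autolog_run` is the run object returned by `mlflow.last_active_run()` above; `summarize_run` is a hypothetical helper.

```python
def summarize_run(run):
    """Collect the ID, parameters, and metrics of an MLflow run into a dict."""
    return {
        "run_id": run.info.run_id,
        # Copy the mappings so the summary is independent of the run object.
        "params": dict(run.data.params),
        "metrics": dict(run.data.metrics),
    }
```

Calling `summarize_run(autolog_run)` returns the logged hyperparameters and metrics, which makes it easy to compare runs in the notebook.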