# MLflow Experiments
This notebook collects small MLflow experiments that verify the project's tracking setup.

## Imports

In [1]:
import mlflow
import dagshub
import os
from dotenv import load_dotenv

### Load environment variables

In [2]:
load_dotenv()

True
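
A `True` result from `load_dotenv()` indicates that at least one variable was loaded, not that every variable read later in this notebook is defined. A minimal sketch of an explicit check follows; the helper name `missing_env_vars` is hypothetical, and the variable names are the ones this notebook passes to `os.getenv`.

```python
import os

# Variable names taken from the os.getenv calls in this notebook.
REQUIRED_VARS = ("MLFLOW_TRACKING_URI", "DAGSHUB_REPO_NAME", "DAGSHUB_REPO_OWNER")


def missing_env_vars(names=REQUIRED_VARS, environ=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


# Hypothetical usage: report missing variables before configuring MLflow.
missing = missing_env_vars()
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

Running a check like this right after loading the `.env` file can surface a missing value before it reaches `mlflow.set_tracking_uri` or `dagshub.init` as `None`.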

### Initialize the MLflow experiment and DagsHub

In [3]:
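# Point MLflow at the tracking server before selecting the experiment.
mlflow.set_tracking_uri(os.getenv("MLFLOW_TRACKING_URI"))

# Connect to the DagsHub repository that hosts the tracking server.
dagshub.init(
    repo_name=os.getenv("DAGSHUB_REPO_NAME"),
    repo_owner=os.getenv("DAGSHUB_REPO_OWNER"),
)

mlflow.set_experiment("test-experiments")

# Log parameters, metrics, and models automatically for supported libraries.
mlflow.autolog()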

## Experiments

In [4]:
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Load the diabetes regression dataset and hold out a test split.
db = load_diabetes()
X_train, X_test, y_train, y_test = train_test_split(db.data, db.target)

# Create and train the model; autologging records this fit as an MLflow run.
rf = RandomForestRegressor(n_estimators=100, max_depth=6, max_features=3)
rf.fit(X_train, y_train)

# Use the model to make predictions on the test dataset.
predictions = rf.predict(X_test)

2024/10/26 11:44:50 INFO mlflow.tracking.fluent: Autologging successfully enabled for sklearn.
2024/10/26 11:44:51 INFO mlflow.utils.autologging_utils: Created MLflow autologging run with ID '9904573d4fbd4775a11af1055e9fcb72', which will track hyperparameters, performance metrics, model artifacts, and lineage information for the current sklearn workflow
2024/10/26 11:44:58 INFO mlflow.tracking._tracking_service.client: 🏃 View run persistent-mule-430 at: https://dagshub.com/nachoogriis/TAED2_YOLOs.mlflow/#/experiments/1/runs/9904573d4fbd4775a11af1055e9fcb72.
2024/10/26 11:44:58 INFO mlflow.tracking._tracking_service.client: 🧪 View experiment at: https://dagshub.com/nachoogriis/TAED2_YOLOs.mlflow/#/experiments/1.


## Conclusion
In this notebook, we set up MLflow with DagsHub and ran one experiment that trains a `RandomForestRegressor` on the diabetes dataset. Autologging recorded the run's parameters, metrics, and model artifacts on the remote tracking server, where they can be reviewed and analyzed.

### Next Steps
- Experiment with different model types and parameters to observe changes in metrics.
- Use the saved experiment logs to compare model performance and identify the best configuration.