# MLJAR AutoML 

MLJAR is an Automated Machine Learning (AutoML) framework. It is available as a Python package, with source code on GitHub: https://github.com/mljar/mljar-supervised

MLJAR AutoML can work in several modes:
- Explain - ideal for initial data exploration
- Perform - suited to production-level ML systems
- Compete - for ML competitions under a restricted time budget
- Optuna - for ML competitions without a restricted time budget for computations :)

The `Optuna` mode is experimental. It is available on the `dev` branch, so you need to install the package manually or directly from GitHub.

In the `Optuna` mode, each algorithm is tuned with the `Optuna` hyperparameter optimization framework within a selected time budget (controlled with `optuna_time_budget`).
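
To make the idea of tuning within a time budget concrete, here is a minimal sketch in plain Python. It is not how MLJAR or `Optuna` is implemented: a simple random search stands in for `Optuna`'s samplers, and the names `tune_within_budget`, `toy_loss`, and `sample_toy_params` are hypothetical helpers invented for this illustration.

```python
import random
import time


def tune_within_budget(objective, sample_params, time_budget, seed=0):
    """Try candidate parameters until the time budget (seconds) runs out.

    Plain random search, used here as a stand-in for Optuna's samplers.
    """
    rng = random.Random(seed)
    deadline = time.monotonic() + time_budget
    best_params, best_value, n_trials = None, float("inf"), 0
    # Always run at least one trial, then keep going until the deadline.
    while n_trials == 0 or time.monotonic() < deadline:
        params = sample_params(rng)
        value = objective(params)
        n_trials += 1
        if value < best_value:
            best_params, best_value = params, value
    return best_params, best_value, n_trials


# Toy objective: a made-up validation loss that is lowest at learning_rate = 0.1.
def toy_loss(params):
    return (params["learning_rate"] - 0.1) ** 2


def sample_toy_params(rng):
    return {"learning_rate": rng.uniform(0.001, 1.0)}


best_params, best_value, n_trials = tune_within_budget(
    toy_loss, sample_toy_params, time_budget=0.05
)
print(f"{n_trials} trials, best params {best_params}, loss {best_value:.6f}")
```

The number of trials depends on how fast the objective runs, so a larger budget simply means more candidates are evaluated before the best one is kept.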


Example usage of the `Optuna` mode with `MLJAR`:

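```python
from supervised.automl import AutoML  # mljar-supervised

# X holds the feature columns and y the target values
automl = AutoML(mode="Optuna",
                optuna_time_budget=1800,   # seconds of tuning per algorithm
                optuna_init_params={},     # no precomputed parameters
                algorithms=["LightGBM", "Xgboost", "Extra Trees"],
                total_time_limit=24*3600)  # seconds for the whole run
automl.fit(X, y)
```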

Description of parameters:
- `optuna_time_budget` - time budget, in seconds, for `Optuna` to tune each algorithm,
- `optuna_init_params` - precomputed `Optuna` parameters, if you have them; models whose parameters are passed here are not tuned again,
- `algorithms` - the list of algorithms to tune and train,
- `total_time_limit` - the total time limit, in seconds, for AutoML training.
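
A quick back-of-the-envelope check shows how the two time parameters relate. This sketch assumes that each algorithm is tuned one after another and uses its full `optuna_time_budget`; both are assumptions for illustration, and the estimate ignores the time needed to train the final models.

```python
algorithms = ["LightGBM", "Xgboost", "Extra Trees"]
optuna_time_budget = 1800        # seconds of tuning per algorithm
total_time_limit = 24 * 3600     # seconds for the whole AutoML run

# Assumption: tuning runs sequentially and uses the full budget each time.
tuning_seconds = len(algorithms) * optuna_time_budget
remaining_seconds = total_time_limit - tuning_seconds

print(f"tuning: {tuning_seconds / 3600:.1f} h")                  # tuning: 1.5 h
print(f"left for other steps: {remaining_seconds / 3600:.1f} h")  # left for other steps: 22.5 h
```

With these values the tuning itself takes about 1.5 hours, so `total_time_limit` leaves plenty of room for the remaining training steps.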

Let's try MLJAR with the Tabular Playground Series (TPS) March 2021 data!

---

MLJAR GitHub: https://github.com/mljar/mljar-supervised

Optuna GitHub: https://github.com/optuna/optuna

<img src="https://raw.githubusercontent.com/mljar/visual-identity/main/media/mljar_AutomatedML.png" style="width: 50%;"/>

In [None]:
!pip install -q -U git+https://github.com/mljar/mljar-supervised.git@dev
!pip install -q -U matplotlib==3.1.3 

In [None]:
import numpy as np
import pandas as pd
from supervised.automl import AutoML # mljar-supervised


In [None]:
train = pd.read_csv('../input/tabular-playground-series-mar-2021/train.csv')
test = pd.read_csv('../input/tabular-playground-series-mar-2021/test.csv')

In [None]:
# Features: every column except the first (id) and the last (target)
x_cols = train.columns[1:-1]
y_col = "target"
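
The positional slice `[1:-1]` relies on the id being the first column and the target being the last. As a hedged alternative, the features can be selected by name. The helper `select_feature_columns` and the short column list below are hypothetical and only illustrate the idea; the real dataset has more columns.

```python
def select_feature_columns(columns, id_col="id", target_col="target"):
    """Return all column names except the id and target columns."""
    excluded = {id_col, target_col}
    missing = excluded - set(columns)
    if missing:
        raise ValueError(f"expected columns not found: {sorted(missing)}")
    return [c for c in columns if c not in excluded]


# Illustrative column names only, not the full list from the competition data.
example_columns = ["id", "cat0", "cat1", "cont0", "cont1", "target"]
feature_cols = select_feature_columns(example_columns)
print(feature_cols)  # ['cat0', 'cat1', 'cont0', 'cont1']

# For this layout the result matches the positional slice used in the notebook.
assert feature_cols == example_columns[1:-1]
```

In the notebook this would be called as `select_feature_columns(train.columns)` on the training frame only, since the test frame has no target column; the resulting list can then be reused to index `test`.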

In [None]:
automl = AutoML(mode="Optuna", 
                eval_metric="auc",
                algorithms=["LightGBM", "Xgboost", "Extra Trees"],
                optuna_time_budget=1800,   # tune each algorithm for 30 minutes
                total_time_limit=48*3600,  # total time limit, set large enough to have time to compute all steps
                features_selection=False)
automl.fit(train[x_cols], train[y_col])
preds = automl.predict_proba(test[x_cols])

In [None]:
# preds has one column per class; column 1 is the predicted probability of class 1
submission = pd.DataFrame({'id': test['id'], 'target': preds[:, 1]})
submission.to_csv('1_submission.csv', index=False)

In [None]:
automl.report()

### Thank you!

You can check MLJAR's code on GitHub: https://github.com/mljar/mljar-supervised

MLJAR's AutoML Documentation: https://supervised.mljar.com/

MLJAR's website: https://mljar.com/

