# Example: Multiclass classification
------------------------------------

This example shows how to compare the performance of three models on a multiclass classification task.

Import the wine dataset from [sklearn.datasets](https://scikit-learn.org/stable/datasets/index.html#breast-cancer-wisconsin-diagnostic-dataset). This is a small, easy-to-train dataset. The goal is to classify wines into three groups (the cultivator each wine comes from) using features derived from a chemical analysis.

## Load the data

In [1]:
# Import packages
from sklearn.datasets import load_wine
from atom import ATOMClassifier

In [2]:
# Load data
X, y = load_wine(return_X_y=True, as_frame=True)

# Let's have a look
X.head()

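|   | alcohol | malic_acid | ash | alcalinity_of_ash | magnesium | total_phenols | flavanoids | nonflavanoid_phenols | proanthocyanins | color_intensity | hue | od280/od315_of_diluted_wines | proline |
| - | ------- | ---------- | --- | ----------------- | --------- | ------------- | ---------- | -------------------- | --------------- | --------------- | --- | ---------------------------- | ------- |
| 0 | 14.23 | 1.71 | 2.43 | 15.6 | 127.0 | 2.8 | 3.06 | 0.28 | 2.29 | 5.64 | 1.04 | 3.92 | 1065.0 |
| 1 | 13.2 | 1.78 | 2.14 | 11.2 | 100.0 | 2.65 | 2.76 | 0.26 | 1.28 | 4.38 | 1.05 | 3.4 | 1050.0 |
| 2 | 13.16 | 2.36 | 2.67 | 18.6 | 101.0 | 2.8 | 3.24 | 0.3 | 2.81 | 5.68 | 1.03 | 3.17 | 1185.0 |
| 3 | 14.37 | 1.95 | 2.5 | 16.8 | 113.0 | 3.85 | 3.49 | 0.24 | 2.18 | 7.8 | 0.86 | 3.45 | 1480.0 |
| 4 | 13.24 | 2.59 | 2.87 | 21.0 | 118.0 | 2.8 | 2.69 | 0.39 | 1.82 | 4.32 | 1.04 | 2.93 | 735.0 |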
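
The preview shows only the features. Before training, it can help to check how the three target classes are distributed. The following is a minimal sketch using pandas and scikit-learn only; the variable names `n_samples`, `n_features` and `class_counts` are illustrative and not part of ATOM.

```python
from sklearn.datasets import load_wine

# Load the features and the target as pandas objects
X, y = load_wine(return_X_y=True, as_frame=True)

# Number of rows and feature columns (the target is not included in X)
n_samples, n_features = X.shape

# Number of samples per class, ordered by class label
class_counts = y.value_counts().sort_index()

print(f"{n_samples} samples, {n_features} features")
print(class_counts)
```

The classes are moderately imbalanced, which is one reason to look at averaged metrics per class rather than plain accuracy alone.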


## Run the pipeline

In [3]:
atom = ATOMClassifier(X, y, n_jobs=1, verbose=2, random_state=1)

# Fit the pipeline with the selected models
atom.run(
    models=["LR", "LDA", "RF"],
    metric="roc_auc_ovr",
    n_trials=14,
    n_bootstrap=5,
    errors="raise",
)

Trial 0 failed with parameters: {'penalty': 'l1', 'C': 0.0054, 'solver': 'saga', 'max_iter': 480, 'l1_ratio': 0.7} because of the following error: AxisError(1, 1, None).
Traceback (most recent call last):
  File "C:\Users\Mavs\Documents\Python\ATOM\venv311\Lib\site-packages\optuna\study\_optimize.py", line 200, in _run_trial
    value_or_values = func(trial)
                      ^^^^^^^^^^^
  File "C:\Users\Mavs\Documents\Python\ATOM\atom\basemodel.py", line 1045, in objective
    results = Parallel(n_jobs=self.n_jobs)(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mavs\Documents\Python\ATOM\venv311\Lib\site-packages\joblib\parallel.py", line 1863, in __call__
    return output if self.return_generator else list(output)
                                                ^^^^^^^^^^^^
  File "C:\Users\Mavs\Documents\Python\ATOM\venv311\Lib\site-packages\joblib\parallel.py", line 1792, in _get_sequential_output
    res = func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
 


Algorithm task: Multiclass classification.

Shape: (178, 14)
Train set size: 143
Test set size: 35
-------------------------------------
Memory: 19.36 kB
Scaled: False
Outlier values: 12 (0.6%)


Models: LR, LDA, RF
Metric: roc_auc_ovr


Running hyperparameter tuning for LogisticRegression...
| trial | penalty |       C |  solver | max_iter | l1_ratio | roc_auc_ovr | best_roc_auc_ovr | time_trial | time_ht |    state |
| ----- | ------- | ------- | ------- | -------- | -------- | ----------- | ---------------- | ---------- | ------- | -------- |

Exception encountered while running the LR model.
AxisError: axis 1 is out of bounds for array of dimension 1


AxisError: axis 1 is out of bounds for array of dimension 1
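
The run above stops with an `AxisError`, which indicates that a one-dimensional array reached code that indexes a second axis. The truncated traceback does not show the root cause inside ATOM. For reference, scikit-learn's `roc_auc_score` with `multi_class="ovr"` expects a two-dimensional array of class probabilities with one column per class. The sketch below computes the same metric for the three model types with scikit-learn alone. It is not ATOM's implementation: it assumes `LDA` and `RF` stand for `LinearDiscriminantAnalysis` and `RandomForestClassifier`, uses default hyperparameters instead of tuning, and the 80/20 stratified split and the scaling step for logistic regression are choices made here for illustration.

```python
from sklearn.datasets import load_wine
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True, as_frame=True)

# Assumption: a stratified 80/20 split; ATOM's own split may differ
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1
)

# Assumed scikit-learn counterparts of the LR, LDA and RF acronyms
models = {
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "LDA": LinearDiscriminantAnalysis(),
    "RF": RandomForestClassifier(random_state=1),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # predict_proba returns a 2D array: one row per sample, one column per class
    proba = model.predict_proba(X_test)
    # One-vs-rest ROC AUC, averaged over the three classes
    scores[name] = roc_auc_score(y_test, proba, multi_class="ovr")

for name, score in scores.items():
    print(f"{name}: roc_auc_ovr = {score:.3f}")
```

Passing only a single probability column, or the predicted labels, to `roc_auc_score` in a multiclass setting raises an error, so checking `proba.shape` is a quick first diagnostic.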

## Analyze the results

In [None]:
atom.results

In [None]:
# Show the score for some different metrics
atom.evaluate(["precision_macro", "recall_macro", "jaccard_weighted"])
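
The three metric names follow scikit-learn's scorer naming, where the suffix selects the averaging strategy across classes. As a rough sketch of what these names compute, the example below evaluates them directly with `sklearn.metrics` on a single model. The split and the choice of `LinearDiscriminantAnalysis` are assumptions for illustration and do not reproduce ATOM's results.

```python
from sklearn.datasets import load_wine
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import jaccard_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

X, y = load_wine(return_X_y=True, as_frame=True)

# Assumption: a stratified 80/20 split, chosen only for this illustration
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1
)

model = LinearDiscriminantAnalysis().fit(X_train, y_train)
y_pred = model.predict(X_test)

metrics = {
    # "macro": unweighted mean of the per-class scores
    "precision_macro": precision_score(y_test, y_pred, average="macro"),
    "recall_macro": recall_score(y_test, y_pred, average="macro"),
    # "weighted": mean of the per-class scores weighted by class support
    "jaccard_weighted": jaccard_score(y_test, y_pred, average="weighted"),
}

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Macro averaging treats every class equally, while weighted averaging gives larger classes more influence; on an imbalanced dataset the two can diverge noticeably.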

In [None]:
# Some plots allow you to choose the target class to look at
atom.rf.plot_probabilities(rows="train", target=0)
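
To make the role of `target` concrete, the sketch below computes the quantity such a plot is understood to display: the predicted probability of one chosen class, grouped by the true class of each sample. It uses scikit-learn and pandas directly rather than ATOM's plotting code, and the split and variable names are assumptions for illustration.

```python
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_wine(return_X_y=True, as_frame=True)

# Assumption: a stratified 80/20 split, chosen only for this illustration
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1
)

rf = RandomForestClassifier(random_state=1).fit(X_train, y_train)

# Class whose predicted probability we inspect
target = 0

# Column position of the target class in the predict_proba output
target_col = list(rf.classes_).index(target)

# Predicted probability of the target class for every training sample
proba_target = pd.Series(
    rf.predict_proba(X_train)[:, target_col], index=y_train.index
)

# Mean predicted probability of the target class, per true class
mean_by_class = proba_target.groupby(y_train).mean()
print(mean_by_class)
```

Samples that truly belong to the target class should receive a much higher probability than the others; overlapping distributions would point to classes the model struggles to separate.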

In [None]:
atom.lda.plot_shap_heatmap(target=2, show=7)