# Tutorial

This tutorial gives an overview of how to use the `classifierpromax` library to train multiple classification models in parallel.

## Requirements
The following are required to run `classifierpromax`. They should already have been installed as dependencies during the installation process.
- `python` 3.12 or newer
- `pandas` 2.2.3 or newer
- `scikit-learn` 1.6.1 or newer
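
If you want to confirm that your environment meets these minimums, the following is a minimal sketch using only the standard library. The helper names `as_tuple` and `check_requirements` are hypothetical (they are not part of `classifierpromax`), and the version comparison is deliberately simple: it only looks at the leading numeric parts of a version string.

```python
import sys
from importlib import metadata

# Minimum versions copied from the requirements list above.
MINIMUMS = {"pandas": "2.2.3", "scikit-learn": "1.6.1"}


def as_tuple(version):
    """Turn '2.2.3' into (2, 2, 3), stopping at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def check_requirements(minimums):
    """Return {package: (installed version or None, meets minimum?)}."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in this environment.
            report[name] = (None, False)
        else:
            report[name] = (installed, as_tuple(installed) >= as_tuple(minimum))
    return report


print("python", sys.version.split()[0], sys.version_info >= (3, 12))
for name, (installed, ok) in check_requirements(MINIMUMS).items():
    print(name, installed, ok)
```

A package that is missing is reported as `(None, False)` rather than raising an error, so the check can run before everything is installed.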

## Dataset
This tutorial uses the toy "wine" dataset from `scikit-learn`. We will perform a multi-class classification of wines based on their attributes. The target classes are `class_0`, `class_1`, and `class_2`. We start by importing the dataset.

Use `print(datasets.load_wine().DESCR)` for more information on the "wine" dataset.

In [1]:
import pandas as pd
from sklearn import datasets

# Load the dataset once and reuse it for the features, column names, and target
wine = datasets.load_wine()
X = pd.DataFrame(wine.data, columns=wine.feature_names)
y = wine.target
X.head()

| | alcohol | malic_acid | ash | alcalinity_of_ash | magnesium | total_phenols | flavanoids | nonflavanoid_phenols | proanthocyanins | color_intensity | hue | od280/od315_of_diluted_wines | proline |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 14.23 | 1.71 | 2.43 | 15.6 | 127.0 | 2.8 | 3.06 | 0.28 | 2.29 | 5.64 | 1.04 | 3.92 | 1065.0 |
| 1 | 13.2 | 1.78 | 2.14 | 11.2 | 100.0 | 2.65 | 2.76 | 0.26 | 1.28 | 4.38 | 1.05 | 3.4 | 1050.0 |
| 2 | 13.16 | 2.36 | 2.67 | 18.6 | 101.0 | 2.8 | 3.24 | 0.3 | 2.81 | 5.68 | 1.03 | 3.17 | 1185.0 |
| 3 | 14.37 | 1.95 | 2.5 | 16.8 | 113.0 | 3.85 | 3.49 | 0.24 | 2.18 | 7.8 | 0.86 | 3.45 | 1480.0 |
| 4 | 13.24 | 2.59 | 2.87 | 21.0 | 118.0 | 2.8 | 2.69 | 0.39 | 1.82 | 4.32 | 1.04 | 2.93 | 735.0 |
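
Before training, it can help to check how the three target classes are distributed. The sketch below maps the integer labels to their class names and counts them; the variable names `wine_demo` and `class_counts` are illustrative only and are not used elsewhere in this tutorial.

```python
import pandas as pd
from sklearn.datasets import load_wine

wine_demo = load_wine()

# Map the integer labels (0, 1, 2) to their class names and count each one.
class_counts = (
    pd.Series(wine_demo.target)
    .map(dict(enumerate(wine_demo.target_names)))
    .value_counts()
    .sort_index()
)
print(class_counts)
```

The classes are not perfectly balanced, which is worth keeping in mind when choosing a scoring metric.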


## Model Training

Since all of the features are numerical, we will use a `StandardScaler` as a preprocessor for our models. 

In [2]:
from sklearn.preprocessing import StandardScaler

preprocessor = StandardScaler()
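
To illustrate what the scaler does, here is a small sketch on a toy array (not the wine data). `StandardScaler` subtracts each column's mean and divides by its population standard deviation, so columns on very different scales end up comparable. The names `toy`, `scaler`, `scaled`, and `manual` are illustrative only.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Two columns on very different scales.
toy = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])

scaler = StandardScaler()
scaled = scaler.fit_transform(toy)

# The same transformation written out by hand: (x - mean) / std, per column.
manual = (toy - toy.mean(axis=0)) / toy.std(axis=0)

print(scaled)
print(np.allclose(scaled, manual))
```

After scaling, every column has mean 0 and standard deviation 1, which matters for scale-sensitive models such as `LogisticRegression` and `SVC`.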

Import `ClassifierTrainer` and build four baseline models: `DummyClassifier`, `LogisticRegression`, `SVC`, and `RandomForestClassifier`.

In [3]:
from classifierpromax.ClassifierTrainer import ClassifierTrainer

baseline_models, baseline_score = ClassifierTrainer(preprocessor, X, y, seed=123)
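
For intuition, the sketch below does something comparable by hand in plain `scikit-learn`: it wraps each of the four estimators in a pipeline with a `StandardScaler` and cross-validates it. This is an assumption about the general approach, not the internals of `ClassifierTrainer`; the dictionary keys, the 5-fold setting, the accuracy metric, and the `sketch_*` names are all choices made for this illustration.

```python
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_demo, y_demo = load_wine(return_X_y=True, as_frame=True)

# Hypothetical model names; the seed mirrors seed=123 from the call above.
estimators = {
    "dummy": DummyClassifier(strategy="most_frequent"),
    "logreg": LogisticRegression(max_iter=1000, random_state=123),
    "svc": SVC(random_state=123),
    "random_forest": RandomForestClassifier(random_state=123),
}

sketch_models = {}
sketch_scores = {}
for name, estimator in estimators.items():
    # Scale the features, then fit the estimator.
    pipe = make_pipeline(StandardScaler(), estimator)
    cv = cross_validate(pipe, X_demo, y_demo, cv=5, scoring="accuracy")
    sketch_scores[name] = cv["test_score"].mean()
    # Refit on the full data so the pipeline can be reused later.
    sketch_models[name] = pipe.fit(X_demo, y_demo)

print(pd.Series(sketch_scores).round(3))
```

The dummy model only predicts the most frequent class, so it serves as the floor that the other three models should beat.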

## Feature Selection and Hyperparameter Optimization

Perform feature selection on the baseline models, keeping three features, and then optimize the hyperparameters of the resulting models.

In [4]:
from classifierpromax.ClassifierOptimizer import ClassifierOptimizer
from classifierpromax.FeatureSelector import FeatureSelector

fs_models = FeatureSelector(preprocessor, baseline_models, X, y, n_features_to_select=3)
opt_models, opt_score = ClassifierOptimizer(fs_models, X, y)


Training logreg...

Training svc...

Training random_forest...

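
As a rough illustration of the two steps, the sketch below uses recursive feature elimination (`RFE`) followed by a randomized search over one hyperparameter, in plain `scikit-learn`. Whether `FeatureSelector` and `ClassifierOptimizer` work this way internally is not stated in this tutorial, so treat the choice of `RFE`, the search range for `C`, the number of iterations, and the accuracy metric as assumptions made for the example.

```python
from scipy.stats import loguniform
from sklearn.datasets import load_wine
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X_demo, y_demo = load_wine(return_X_y=True, as_frame=True)

pipe = Pipeline([
    ("scale", StandardScaler()),
    # Keep the 3 features that a logistic regression ranks as most useful.
    ("select", RFE(LogisticRegression(max_iter=1000), n_features_to_select=3)),
    ("model", LogisticRegression(max_iter=1000)),
])

# Sample the regularization strength C on a log scale.
search = RandomizedSearchCV(
    pipe,
    param_distributions={"model__C": loguniform(1e-2, 1e2)},
    n_iter=10,
    cv=5,
    scoring="accuracy",
    random_state=123,
)
search.fit(X_demo, y_demo)

selected = X_demo.columns[search.best_estimator_.named_steps["select"].support_]
print(list(selected))
print(search.best_params_, round(search.best_score_, 3))
```

Placing the selector inside the pipeline means feature selection is redone within each cross-validation fold, which avoids leaking information from the validation folds into the selection step.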

## Model Score

In [None]:
from classifierpromax.ResultHandler import ResultHandler