# AutoML in Python

This notebook shows how to use AutoML frameworks in Python, using GAMA as an example.

In [1]:
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)

First we need some data to work with. We use a hold-out task on the Letter dataset, fetched through OpenML:


In [2]:
import openml
letter_holdout = openml.tasks.get_task(236)
train, test = letter_holdout.get_train_test_split_indices()
x, y = letter_holdout.get_X_and_y()
x_train, x_test, y_train, y_test = x[train, :], x[test, :], y[train], y[test]
f'x_train: {x_train.shape}, x_test: {x_test.shape}, y_train: {y_train.shape}, y_test: {y_test.shape}'

'x_train: (13400, 16), x_test: (6600, 16), y_train: (13400,), y_test: (6600,)'
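
The split above relies on NumPy integer-array indexing: `train` and `test` hold row positions, and `x[train, :]` selects those rows. A minimal sketch on synthetic data, assuming `x` and `y` are NumPy arrays (the toy arrays and index values below are made up for illustration and are not part of the Letter task):

```python
import numpy as np

# Toy stand-ins for the arrays returned by the task (hypothetical values).
x_toy = np.arange(30).reshape(10, 3)   # 10 rows, 3 features
y_toy = np.arange(10) % 2              # 10 labels

# Row positions for each side of the hold-out split.
train_idx = np.array([0, 1, 2, 3, 4, 5, 6])
test_idx = np.array([7, 8, 9])

# Integer-array indexing selects the listed rows; ':' keeps every column.
x_toy_train, x_toy_test = x_toy[train_idx, :], x_toy[test_idx, :]
y_toy_train, y_toy_test = y_toy[train_idx], y_toy[test_idx]

print(x_toy_train.shape, x_toy_test.shape, y_toy_train.shape, y_toy_test.shape)
```

If `get_X_and_y()` returns pandas objects instead, which may depend on the installed openml-python version, positional selection would need `.iloc` rather than `x[train, :]`.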

Next we configure an AutoML framework. For demonstration purposes we use GAMA, but many AutoML tools follow the same conventions:

In [3]:
import logging
from gama import GamaClassifier

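automl = GamaClassifier(
    max_total_time=180,  # search budget in seconds
    verbosity=logging.INFO,  # log progress messages at INFO level
    scoring='accuracy',  # metric to optimize during search
    store="nothing",  # do not keep logs or models on disk after the run
)
automl.fit(x_train, y_train)  # search for pipelines, then fit the best one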

Using GAMA version 21.0.0.
INIT:GamaClassifier(store=nothing,output_directory=gama_66cdc2b8-5802-41c0-9e26-ed7ce2f922b6,post_processing=BestFitPostProcessing(),search=AsyncEA(),verbosity=20,max_memory_mb=None,n_jobs=None,max_eval_time=None,max_total_time=180,random_state=None,max_pipeline_length=None,regularize_length=True,scoring=accuracy)
START: preprocessing default
STOP: preprocessing default after 0.0782s.
START: search AsyncEA
Starting EA with new population.
-inf: StandardScaler>RobustScaler>MultinomialNB
0.54619: MultinomialNB
0.55149: LogisticRegression
0.89828: RandomForestClassifier
0.93746: MinMaxScaler>RandomForestClassifier
0.93336: KNeighborsClassifier
0.93985: SelectFwe>MinMaxScaler>RandomForestClassifier
Search phase evaluated 100 individuals.
STOP: search AsyncEA after 162.0313s.
START: postprocess BestFitPostProcessing
STOP: postprocess BestFitPostProcessing after 1.8001s.


In [4]:
automl.score(x_test, y_test)

0.9421212121212121
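
Since `scoring='accuracy'` was set, the score above is presumably the fraction of test rows whose predicted label matches the true label. A minimal sketch of that computation in plain Python, using a hypothetical helper `accuracy` and made-up labels:

```python
def accuracy(y_true, y_pred):
    """Return the fraction of positions where prediction and truth agree."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy of an empty sequence")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labels for illustration; 4 of 5 predictions are correct.
toy_true = ["A", "B", "C", "A", "B"]
toy_pred = ["A", "B", "C", "A", "C"]
toy_score = accuracy(toy_true, toy_pred)
print(toy_score)  # 0.8
```

Under that reading, a score of 0.9421 on 6600 test rows corresponds to 6218 correct predictions.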

## Other Frameworks

There are many other AutoML frameworks in Python.
The code snippets below require additional dependencies not listed in `requirements.txt`.
GAMA follows the scikit-learn conventions, and so do many other AutoML frameworks in Python, for example:

*auto-sklearn*

In [None]:
import autosklearn.classification
automl = autosklearn.classification.AutoSklearnClassifier()
automl.fit(x_train, y_train)
predictions = automl.predict(x_test)

*TPOT*

In [None]:
from tpot import TPOTClassifier
automl = TPOTClassifier()
automl.fit(x_train, y_train)
predictions = automl.predict(x_test)

*Auto-PyTorch*

In [None]:
from autoPyTorch import AutoNetClassification

automl = AutoNetClassification("tiny_cs")  # config preset
automl.fit(x_train, y_train, validation_split=0.3)
predictions = automl.predict(x_test)

among many others.
*Note*: the cells above require additional modules to be installed.
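
Because these frameworks share the `fit`/`predict` conventions, evaluation code can be written once and reused across them. A rough sketch, in which `MajorityClassifier` is a hypothetical stand-in for an AutoML estimator (so that the example runs without extra dependencies) and `holdout_accuracy` is an illustrative helper, not part of any of the libraries above:

```python
from collections import Counter


class MajorityClassifier:
    """Hypothetical stand-in: always predicts the most common training label."""

    def fit(self, X, y):
        # Remember the most frequent label seen during training.
        self.majority_ = Counter(y).most_common(1)[0][0]
        return self  # returning self mirrors the scikit-learn convention

    def predict(self, X):
        # One prediction per input row.
        return [self.majority_ for _ in X]


def holdout_accuracy(estimator, X_tr, y_tr, X_te, y_te):
    """Fit any estimator exposing fit/predict and return its hold-out accuracy."""
    estimator.fit(X_tr, y_tr)
    predictions = estimator.predict(X_te)
    return sum(p == t for p, t in zip(predictions, y_te)) / len(y_te)


# Made-up data for illustration.
X_tr, y_tr = [[0], [1], [2], [3]], ["A", "A", "A", "B"]
X_te, y_te = [[4], [5]], ["A", "B"]
print(holdout_accuracy(MajorityClassifier(), X_tr, y_tr, X_te, y_te))  # 0.5
```

In principle, `MajorityClassifier()` could be swapped for `GamaClassifier(...)` or one of the estimators above, provided it accepts the same `fit(X, y)` and `predict(X)` calls.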