Auto Classification
The fastautoml.AutoClassifier class automatically selects the best model for a classification problem. In particular, it computes the optimal subset of features, selects the best family of models, and chooses the best hyperparameters for the selected model. Please refer to the Reference API (TBD) for a list of supported families of models.
In this example we apply the AutoClassifier class to the problem of recognizing handwritten digits, using the digits dataset included with scikit-learn.
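To make the three decisions above concrete (feature subset, model family, hyperparameters), here is a minimal sketch of the same kind of search done by hand with plain scikit-learn. It is not how fastautoml is implemented; the chi-squared feature selector, the two candidate families, and the small parameter grids are all assumptions chosen only for illustration.

```python
from sklearn.datasets import load_digits
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# Two steps: pick a subset of features, then fit a classifier.
# The "clf" step is a placeholder that the grid below swaps out.
pipeline = Pipeline([
    ("select", SelectKBest(chi2)),
    ("clf", DecisionTreeClassifier(random_state=1)),
])

# One grid per model family (illustrative choices, not fastautoml's list).
# Each grid varies the number of kept features and one hyperparameter.
param_grid = [
    {
        "select__k": [32, 64],
        "clf": [DecisionTreeClassifier(random_state=1)],
        "clf__max_depth": [5, 10],
    },
    {
        "select__k": [32, 64],
        "clf": [KNeighborsClassifier()],
        "clf__n_neighbors": [3, 5],
    },
]

# Cross-validated search over features, families, and hyperparameters.
search = GridSearchCV(pipeline, param_grid, cv=3)
search.fit(X_train, y_train)

# A single fitted pipeline comes back, not an ensemble.
best_model = search.best_estimator_
print(type(best_model.named_steps["clf"]).__name__)
print(search.best_params_["select__k"])
print(best_model.score(X_test, y_test))
```

An exhaustive grid like this grows quickly with every added family or parameter, which is the manual effort an automated selector is meant to remove.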
>>> from fastautoml.fastautoml import AutoClassifier
>>> from sklearn.datasets import load_digits
>>> from sklearn.model_selection import train_test_split
>>> (X, y) = load_digits(return_X_y=True)
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)
>>> model = AutoClassifier()
>>> model.fit(X_train, y_train)
AutoClassifier()
>>> model.score(X_test, y_test)
0.9622222222222222
>>> type(model.model)
sklearn.svm.classes.LinearSVC
A key difference between the fastautoml library and other AutoML libraries is that we return a single model as the best possible model, instead of an ensemble of models. As a result, the data scientist can reuse the output of the AutoClassifier and continue with the analysis.
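As a rough illustration of what continuing the analysis could look like, the sketch below inspects a single fitted classifier with standard scikit-learn tools. So that it runs without fastautoml installed, it fits a plain LinearSVC as a stand-in for `model.model`. It assumes, as the `type(model.model)` output above suggests, that the selected model is an ordinary fitted scikit-learn estimator; the `max_iter` and `random_state` values are arbitrary choices for the stand-in.

```python
from sklearn.datasets import load_digits
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# Stand-in for the estimator that AutoClassifier exposes as `model.model`
# (assumption: it behaves like any fitted scikit-learn classifier).
best = LinearSVC(max_iter=5000, random_state=1)
best.fit(X_train, y_train)

# Because there is one model rather than an ensemble, it can be
# examined directly: per-class errors via a confusion matrix...
cm = confusion_matrix(y_test, best.predict(X_test))
print(cm.shape)

# ...and, for a linear model, one weight vector per digit class
# over the 64 pixel features.
print(best.coef_.shape)
```

With a fitted AutoClassifier at hand, `best` would simply be replaced by `model.model`; the `coef_` attribute is only available when the selected family is linear.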
For more information about how to select the optimal classifier see the following blog entries:
- A Comparison of Time and Accuracy in AutoML libraries (TBD)
- AutoML for Financial Data (TBD)
(TBD)