## EvalML: AutoML

EvalML is an AutoML library that builds, optimizes, and evaluates machine learning pipelines using domain-specific objective functions.

**Key Functionality**

- Automation - Makes machine learning easier: no need to train and tune models by hand. Includes data quality checks, cross-validation, and more.
- Data Checks - Catches and warns about problems with your data and problem setup before modeling.
- End-to-end - Constructs and optimizes pipelines that include state-of-the-art preprocessing, feature engineering, feature selection, and a variety of modeling techniques.
- Model Understanding - Provides tools to understand and introspect models, so you can learn how they will behave in your problem domain.
- Domain-specific - Includes a repository of domain-specific objective functions and an interface for defining your own.
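
To make "domain-specific objective" concrete, here is a minimal sketch in plain Python. It is not EvalML's objective interface; the function name `misclassification_cost` and the cost values are hypothetical, chosen only to show how a cost-weighted objective can rank two models differently than accuracy does.

```python
def misclassification_cost(y_true, y_pred, fn_cost=100.0, fp_cost=5.0):
    """Total cost of prediction errors (lower is better).

    fn_cost and fp_cost are hypothetical business costs: here a missed
    positive is assumed to be 20x more expensive than a false alarm.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    cost = 0.0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 0:
            cost += fn_cost  # false negative
        elif actual == 0 and predicted == 1:
            cost += fp_cost  # false positive
    return cost


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(a == p for a, p in zip(y_true, y_pred)) / len(y_true)


y_true = [1, 0, 1, 0, 0]
model_a = [1, 1, 1, 1, 0]  # two false positives, no false negatives
model_b = [0, 0, 1, 0, 0]  # one false negative, no false positives

cost_a = misclassification_cost(y_true, model_a)
cost_b = misclassification_cost(y_true, model_b)
acc_a = accuracy(y_true, model_a)
acc_b = accuracy(y_true, model_b)

print("model_a: accuracy", acc_a, "cost", cost_a)
print("model_b: accuracy", acc_b, "cost", cost_b)
```

Under these assumed costs, `model_b` has the higher accuracy but `model_a` has the lower cost, which is the kind of trade-off a domain-specific objective is meant to capture.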

In [13]:
import orchest
## EVALML
from evalml.automl import AutoMLSearch
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
import warnings

warnings.filterwarnings("ignore")

In [14]:
data = orchest.get_inputs()
train, test = data["data"]

In [3]:
X = train.drop(['target'], axis=1)
y = train['target']

In [7]:
# import woodwork as ww
# X = ww.DataTable(X)

# Explicitly mark the 'text' column as natural language in case
# Woodwork does not detect it automatically.
from evalml.utils import infer_feature_types
X = infer_feature_types(X, {'text': 'NaturalLanguage'})
# y = ww.DataColumn(y)

In [8]:
from evalml.preprocessing import split_data

X_train, X_holdout, y_train, y_holdout = split_data(X, y, problem_type='binary', test_size=0.2)

In [9]:
# max_time is given in seconds, so the search is capped at 5 minutes.
automl = AutoMLSearch(X_train=X_train, y_train=y_train, problem_type='binary', max_time=300)

In [10]:
automl.search()


Do you really want to exit search (y/n)?  y


In [11]:
%%time
pipeline = automl.best_pipeline
pipeline.fit(X_train, y_train)

KeyboardInterrupt: 

In [12]:
# Evaluate on the holdout split created by split_data above.
preds = pipeline.predict(X_holdout)

In [None]:
print("AUC score:", roc_auc_score(y_holdout, preds))
orchest.output(automl,name='automl')
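
One caveat about the AUC cell: `predict` returns class labels, while AUC is a ranking metric that is normally computed from scores or probabilities. The sketch below is a plain-Python illustration, not EvalML or scikit-learn code. The helper `auc_from_scores` and the toy data are hypothetical; the helper computes AUC as the fraction of positive/negative pairs ranked correctly and shows how thresholding to labels can lower the result.

```python
def auc_from_scores(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as half a win."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: predicted probabilities for the positive class.
y_true = [0, 0, 1, 1]
proba = [0.1, 0.4, 0.45, 0.8]

# Hard labels at a 0.5 threshold discard the ranking among the
# examples that fall on the same side of the threshold.
labels = [1 if p >= 0.5 else 0 for p in proba]

auc_proba = auc_from_scores(y_true, proba)
auc_labels = auc_from_scores(y_true, labels)

print("AUC from probabilities:", auc_proba)
print("AUC from hard labels:  ", auc_labels)
```

In this notebook, the analogous change would likely be to pass the positive-class column of `pipeline.predict_proba(X_holdout)` to `roc_auc_score` instead of `preds`. Which column holds the positive class depends on the label encoding, so that detail is left as an assumption here.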