## Using Pipelines
Because the evaluation function accepts scikit-learn compatible estimators, scikit-learn <a href="https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html">pipelines</a> can be used to build models concisely. A pipeline chains feature transformers and ends with an estimator. In the following, we evaluate a multinomial naive Bayes classifier as such a pipeline model, with a custom column selector and a `TfidfVectorizer` chained in front of it as preprocessing steps.

In [70]:
from sklearn.pipeline import Pipeline
from sklearn import preprocessing
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
import evaluation

# Set up the model as a transformer pipeline ending in a naive Bayes classifier
model = Pipeline([
    # Extract the `text` feature
    ('col-selector', preprocessing.FunctionTransformer(func=lambda X: X[:, 2])),
    # TF-IDF vectorizer
    ('tfidf', TfidfVectorizer()),
    # Naive Bayes classifier
    ('clf', MultinomialNB()),
])

# Evaluate model pipeline
evaluation.evaluate(model, store_model=False, store_submission=False)

INFO:root:Loading training data from ../data/external/kaggle/train.csv...
INFO:root:-> Number of samples: 7613
INFO:root:-> Number of features: 3
INFO:root:Evaluating model with 1 experiment(s) of 10-fold Cross Validation...
INFO:root:Run 1/10 finished
INFO:root:Run 2/10 finished
INFO:root:Run 3/10 finished
INFO:root:Run 4/10 finished
INFO:root:Run 5/10 finished
INFO:root:Run 6/10 finished
INFO:root:Run 7/10 finished
INFO:root:Run 8/10 finished
INFO:root:Run 9/10 finished
INFO:root:Run 10/10 finished
INFO:root:---
INFO:root:Expected submission results (F1-Score): around 0.73
INFO:root:F1-Score: 0.85 (training); 0.73 (test)
INFO:root:Accuracy: 88.78% (training); 79.77% (test)
INFO:root:Recall: 76.96% (training); 62.18% (test)
INFO:root:Precision: 96.17% (training); 87.03% (test)
INFO:root:Evaluation finished.
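
The `evaluation` module used above is specific to this project and its internals are not shown here. As a rough, self-contained sketch of the same idea, the following runs an equivalent pipeline through scikit-learn's own `cross_validate`. The toy data, the two-fold split, and the helper name `select_text_column` are assumptions made for illustration only; they are not part of the project code, and the scores it prints say nothing about the real data set.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer


def select_text_column(X):
    """Return column 2 of a 2-D array, assumed here to hold the raw text."""
    return X[:, 2]


# Hypothetical toy data with three feature columns: keyword, location, text.
X = np.array([
    ["fire", "US", "forest fire spreading near the town"],
    ["flood", "UK", "river flood destroys several homes"],
    ["quake", "JP", "strong earthquake shakes the city"],
    ["storm", "US", "storm damage leaves thousands without power"],
    ["fire", "DE", "this new album is fire"],
    ["flood", "FR", "my inbox is a flood of newsletters"],
    ["quake", "IT", "the bass made the room quake with joy"],
    ["storm", "ES", "she took the stage by storm last night"],
], dtype=object)
y = np.array([1, 1, 1, 1, 0, 0, 0, 0])

# Same three steps as the pipeline above: select text, vectorize, classify.
model = Pipeline([
    ("col-selector", FunctionTransformer(func=select_text_column)),
    ("tfidf", TfidfVectorizer()),
    ("clf", MultinomialNB()),
])

# Two stratified folds only because the toy data set is tiny.
cv = StratifiedKFold(n_splits=2, shuffle=True, random_state=0)
metrics = ["f1", "accuracy", "recall", "precision"]
scores = cross_validate(model, X, y, cv=cv, scoring=metrics,
                        return_train_score=True)

# Report the mean of each metric over the folds.
for name in metrics:
    train_mean = scores[f"train_{name}"].mean()
    test_mean = scores[f"test_{name}"].mean()
    print(f"{name}: {train_mean:.2f} (training); {test_mean:.2f} (test)")
```

The sketch uses a named function instead of a `lambda` for the column selector because a `lambda` cannot be serialized with the standard `pickle` module, which may matter if a fitted pipeline is to be stored on disk.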

