# Train and visualize a model in TensorFlow - Part 2: Scikit Learn

Throughout this tutorial we will explain the **multilayer perceptron**, one of the simplest forms of *artificial feed-forward neural network*. We will use the 20 Newsgroups dataset prepared in the previous part of the tutorial.

We will implement the same algorithm in three ways: using *scikit-learn*'s `MLPClassifier`, using *TensorFlow*'s `DNNClassifier`, and finally writing the whole neural network from scratch with the TensorFlow API.

The idea is to compare how the different approaches serve different purposes. This notebook covers the simplest of the three, using scikit-learn.

In [1]:
import numpy as np
import warnings
warnings.filterwarnings('ignore')

from sklearn.metrics import accuracy_score, classification_report
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

## Scikit Learn

Scikit-learn offers a simple API for machine learning, especially compared to TensorFlow. Its main limitation is that most of its models are shallow (e.g. logistic regression, SVM). The `MLPClassifier` does provide a neural network classifier, but it is considerably slow to train and offers no GPU support. The [documentation](http://scikit-learn.org/stable/modules/neural_networks_supervised.html) itself says the implementation of `MLPClassifier` is not intended for large-scale applications.
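
To get a feeling for the trade-off between a shallow model and the `MLPClassifier`, one option is to fit both on the same data and time them. The sketch below is only an illustration: it uses synthetic data from `make_classification` as a stand-in (an assumption, not the newsgroup dataset), and the sizes, the `candidates` dictionary and the hyperparameters are hypothetical choices. The numbers it prints depend on the machine and say nothing about the results on the real dataset.

```python
import time

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data (an assumption; this is NOT the newsgroup dataset)
X, y = make_classification(n_samples=500, n_features=50, n_informative=20,
                           n_classes=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# One shallow model and one small neural network (hypothetical settings)
candidates = {
    'logistic regression': LogisticRegression(max_iter=1000),
    'multilayer perceptron': MLPClassifier(
        hidden_layer_sizes=(25,), max_iter=300, random_state=0),
}

results = {}
for name, clf in candidates.items():
    # Time only the call to fit
    start = time.perf_counter()
    clf.fit(X_train, y_train)
    elapsed = time.perf_counter() - start

    test_accuracy = accuracy_score(y_test, clf.predict(X_test))
    results[name] = (test_accuracy, elapsed)
    print("%s: accuracy %.2f, fit time %.2fs" % (name, test_accuracy, elapsed))
```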

To keep things simple, we will create a multilayer perceptron with a single hidden layer of 5000 units (half the size of the input) and see how it performs.
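
As a rough illustration of what a network with one hidden layer computes at prediction time, here is a minimal NumPy sketch of the forward pass: a linear map, a ReLU, a second linear map, and a softmax over the classes. The sizes are scaled down, the weights are random rather than trained, and all names (`W1`, `b1`, `W2`, `b2`, `relu`, `softmax`) are hypothetical. It is not scikit-learn's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Scaled-down sizes (assumptions); the tutorial's hidden layer has 5000 units
n_samples, n_features, n_hidden, n_classes = 4, 10, 5, 3

# Random inputs and untrained random weights, for illustration only
X = rng.random((n_samples, n_features))
W1 = rng.normal(scale=0.1, size=(n_features, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, n_classes))
b2 = np.zeros(n_classes)


def relu(z):
    """Rectified Linear Unit: negative values become zero."""
    return np.maximum(z, 0.0)


def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = z - z.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


# Forward pass: input -> hidden layer (ReLU) -> output layer (softmax)
hidden = relu(X @ W1 + b1)
probs = softmax(hidden @ W2 + b2)

# The predicted class is the one with the highest probability
predicted = probs.argmax(axis=1)
print(probs.shape, predicted.shape)
```

Training consists of adjusting `W1`, `b1`, `W2` and `b2` so that these probabilities match the labels, which is the part `MLPClassifier.fit` takes care of.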

In [None]:
# Load the dataset
newsgroups = np.load('./resources/newsgroup.npz')

# Define the model
model = MLPClassifier(
    activation='relu',  # Rectified Linear Unit activation
    hidden_layer_sizes=(5000,),  # 1 hidden layer of size 5000
    max_iter=5,  # Each epoch takes a long time, so we limit training to 5
    batch_size=100,  # Mini-batches of 100 examples
    solver='adam')  # We use the Adam solver

model.fit(newsgroups['train_data'],
          newsgroups['train_target'])
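
Since the cell above needs the dataset file and takes a while to run, a scaled-down version of the same configuration can be useful for experimenting. The sketch below is an assumption-laden stand-in: it trains on synthetic data from `make_classification` instead of the newsgroup data, shrinks the hidden layer to 10 units, and uses the hypothetical name `small_model`. It then inspects the fitted weight matrices through the `coefs_` attribute.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data (an assumption; this is NOT the newsgroup dataset)
X, y = make_classification(n_samples=300, n_features=20, n_informative=10,
                           n_classes=3, random_state=0)

# Same settings as the tutorial's model, with a much smaller hidden layer
small_model = MLPClassifier(
    activation='relu',
    hidden_layer_sizes=(10,),
    max_iter=5,
    batch_size=100,
    solver='adam',
    random_state=0)

# With only 5 epochs scikit-learn warns that training has not converged;
# the warning is silenced here because the short run is intentional
with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    small_model.fit(X, y)

# One weight matrix per layer: input -> hidden and hidden -> output
print([weights.shape for weights in small_model.coefs_])
print("Epochs run:", small_model.n_iter_)
```

The first matrix has one row per input feature and one column per hidden unit, so in the full-size model above it is by far the largest set of parameters.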

In [2]:
# Predict once and reuse the predictions for both metrics
predictions = model.predict(newsgroups['test_data'])

accuracy = accuracy_score(newsgroups['test_target'], predictions)

print("Accuracy: %.2f" % accuracy)

print(classification_report(newsgroups['test_target'], predictions))

Accuracy: 0.92
             precision    recall  f1-score   support

          0       0.96      0.96      0.96       160
          1       0.82      0.84      0.83       195
          2       0.89      0.80      0.84       197
          3       0.73      0.82      0.77       196
          4       0.90      0.83      0.87       192
          5       0.87      0.93      0.90       196
          6       0.84      0.87      0.85       194
          7       0.92      0.91      0.92       198
          8       0.97      0.96      0.97       199
          9       0.99      0.98      0.99       199
         10       0.99      0.97      0.98       200
         11       0.99      0.93      0.96       198
         12       0.90      0.89      0.90       196
         13       0.93      0.96      0.95       198
         14       0.98      0.96      0.97       197
         15       0.98      0.94      0.96       200
         16       0.93      0.95      0.94       182
         17       0.99      0.