# Neural networks
A family of algorithms known as neural networks has become increasingly popular during the past few years under
the name “deep learning.” While deep learning shows great promise in many machine
learning applications, deep learning algorithms are often tailored very carefully to a
specific use case. Here, we will only discuss some relatively simple methods, namely
multilayer perceptrons for classification and regression, that can serve as a starting
point for more involved deep learning methods. Multilayer perceptrons (MLPs) are
also known as (vanilla) feed-forward neural networks, or sometimes just neural
networks.

MLPs can be viewed as generalizations of linear models that perform multiple stages
of processing to come to a decision. Remember that the prediction by a linear regressor is given as:

ŷ = w[0] * x[0] + w[1] * x[1] + ... + w[p] * x[p] + b

In plain English, ŷ is a weighted sum of the input features x[0] to x[p], weighted by
the learned coefficients w[0] to w[p]. We could visualize this graphically as shown in the following figure:

<img src="perceptron.png">
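The weighted sum above can be written out in a few lines of NumPy. This is a minimal sketch; the feature values, coefficients, and intercept are hypothetical numbers chosen only for illustration.

```python
import numpy as np

# Hypothetical values, chosen only for illustration
x = np.array([1.0, 2.0, 3.0])    # input features x[0], x[1], x[2]
w = np.array([0.5, -1.0, 2.0])   # learned coefficients w[0], w[1], w[2]
b = 0.25                         # intercept

# Explicit weighted sum, term by term
y_hat_loop = sum(w[i] * x[i] for i in range(len(x))) + b

# The same computation expressed as a dot product
y_hat_dot = np.dot(w, x) + b

print(y_hat_loop, y_hat_dot)
```

Both expressions compute the same prediction; the dot-product form is what linear models use in practice.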

In an MLP this process of computing weighted sums is repeated multiple times, first
computing hidden units that represent an intermediate processing step, which are
again combined using weighted sums to yield the final result. Note that there can be several layers of hidden units, each of which will create a more complex representation of the input.

<img src="MLP.png">

Computing a series of weighted sums is mathematically the same as computing just
one weighted sum, so to make this model truly more powerful than a linear model,
we need one extra trick. After computing a weighted sum for each hidden unit, a
nonlinear function is applied to the result.
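The claim that stacked weighted sums collapse into a single one can be checked numerically. The sketch below uses small hypothetical weight matrices and, as an assumption, ReLU as the nonlinear function (one common choice among several).

```python
import numpy as np

# Hypothetical weights for a network with 2 inputs, 2 hidden units, 1 output
x = np.array([1.0, 2.0])
W1 = np.array([[1.0, -1.0],
               [-1.0, 1.0]])   # input -> hidden
b1 = np.zeros(2)
W2 = np.array([[1.0, 1.0]])    # hidden -> output
b2 = np.zeros(1)

# Two stacked weighted sums without a nonlinearity ...
two_layer = W2 @ (W1 @ x + b1) + b2

# ... equal one weighted sum with merged weights and intercept
W_merged = W2 @ W1
b_merged = W2 @ b1 + b2
one_layer = W_merged @ x + b_merged
print(np.allclose(two_layer, one_layer))

# ReLU (assumed here) sets negative hidden values to zero
def relu(z):
    return np.maximum(z, 0)

# With the nonlinearity in between, the merged linear model
# no longer reproduces the network's output
with_relu = W2 @ relu(W1 @ x + b1) + b2
print(two_layer, with_relu)
```

In this example the purely linear network and the merged model agree, while inserting the nonlinearity changes the result, which is what allows an MLP to represent functions a linear model cannot.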

Let’s look into the workings of the MLP by applying the `MLPClassifier` to the
two_moons dataset. Import the `make_moons` function from the sklearn library and generate a dataset with 100 samples, `noise=0.25`, and `random_state=3`. Then make a scatter plot of the data with matplotlib's `scatter` function.
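One possible way to carry out this step is sketched below. The figure size, colormap, and axis labels are assumptions made for illustration; only the `make_moons` arguments come from the text.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_moons

# Generate the two-moons data with the parameters given in the text
X, y = make_moons(n_samples=100, noise=0.25, random_state=3)

# Scatter plot of the two features, colored by class label
fig, ax = plt.subplots(figsize=(7, 7))   # figure size is an assumption
ax.scatter(X[:, 0], X[:, 1], c=y, cmap='RdBu', s=30)
ax.set_xlabel("Feature 0")
ax.set_ylabel("Feature 1")
plt.show()
```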

Now import the `Perceptron` classifier from sklearn and fit it to the data.
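A minimal sketch of this step might look as follows. The data is regenerated so the snippet runs on its own, and `random_state=0` for the perceptron is an assumption added for reproducibility, not something the text prescribes.

```python
from sklearn.datasets import make_moons
from sklearn.linear_model import Perceptron

# Same data as described above
X, y = make_moons(n_samples=100, noise=0.25, random_state=3)

# random_state=0 is an assumption, used only to make the run repeatable
perceptron = Perceptron(random_state=0)
perceptron.fit(X, y)

print("Training accuracy:", perceptron.score(X, y))
```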

Define the following helper function, which plots the input data together with the regions that the model assigns to each class.

def visualize_classifier(model, X, y, ax=None, cmap='RdBu', figsize=(7,7)):
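    # Imported here so the function does not depend on names defined elsewhere
    import numpy as np
    import matplotlib.pyplot as plt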
    
    if ax is None:
        plt.figure(figsize=figsize)
    
    ax = ax or plt.gca()
    
    # Plot the training points
    ax.scatter(X[:, 0], X[:, 1], c=y, s=30, cmap=cmap,
               clim=(y.min(), y.max()), zorder=3)
    ax.axis('tight')
    xlim = ax.get_xlim()
    ylim = ax.get_ylim()

    xx, yy = np.meshgrid(np.linspace(*xlim, num=200),
                         np.linspace(*ylim, num=200))
    Z = model.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

    # Create a color plot with the results
    n_classes = len(np.unique(y))
    contours = ax.contourf(xx, yy, Z, alpha=0.3,
                           levels=np.arange(n_classes + 1) - 0.5,
                           cmap=cmap,
                           zorder=1)

    ax.set(xlim=xlim, ylim=ylim)

Visualize the classifier by calling `visualize_classifier` on the fitted perceptron. Is the shape of the decision boundary what you expected? Why is the perceptron a linear classifier?
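One way to probe the last question is to reproduce the perceptron's predictions by hand from its learned coefficients. The sketch below regenerates the data and refits the model so that it runs on its own; `random_state=0` is again an assumption.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.linear_model import Perceptron

X, y = make_moons(n_samples=100, noise=0.25, random_state=3)
clf = Perceptron(random_state=0).fit(X, y)   # random_state is an assumption

# The model's score is a weighted sum of the features plus an intercept
scores = X @ clf.coef_.ravel() + clf.intercept_[0]

# Class 1 is predicted wherever the score is positive, so the boundary
# between the classes is the straight line where the score equals zero
manual_pred = (scores > 0).astype(int)

print(np.array_equal(manual_pred, clf.predict(X)))
```

Because the prediction depends only on the sign of a single weighted sum, the decision boundary in two dimensions is always a straight line.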

Now import the `MLPClassifier` and create a model with `random_state=0` and `solver='lbfgs'`, leaving the remaining parameters at their defaults. Fit the MLP to the data and plot the results.
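The fitting step could be sketched as follows. The data is regenerated so the snippet is self-contained, and printing the training accuracy is an addition for illustration.

```python
from sklearn.datasets import make_moons
from sklearn.neural_network import MLPClassifier

X, y = make_moons(n_samples=100, noise=0.25, random_state=3)

# All other parameters keep their defaults, including a single
# hidden layer with 100 units
mlp = MLPClassifier(solver='lbfgs', random_state=0)
mlp.fit(X, y)

print("Training accuracy:", mlp.score(X, y))
```

Passing the fitted model to the helper defined earlier, as in `visualize_classifier(mlp, X, y)`, then draws the decision regions, which are no longer separated by a straight line.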