# 3. Why can deep neural networks get smarter?

In this section, we will learn why a deep neural network (DNN) can be powerful and smart enough to solve complex problems.

## 3-1. The power of the hidden layers

In the last section, we saw that a single-layer neural network can only **draw a straight line** on the map to classify Manhattan, so it cannot separate the geolocations of Manhattan and Brooklyn with a curved boundary.

Instead, by using a DNN with 4 **hidden layers**, you can draw a curved boundary between Manhattan and Brooklyn and get 99.9% accuracy. So, what are hidden layers, and how can they be so powerful?

There's a reason why neural networks can get much smarter with hidden layers. Let's take a look at another sample dataset from TensorFlow Playground.

<br/>
![](https://cloud.google.com/blog/big-data/2016/07/images/146798944178238/neural-networks-4.png)
<br/>

This dataset cannot be classified by a single neuron, as the two groups of data points can't be divided by a single line. This is a so-called **non-linear classification problem**. In the real world, there's no end to non-linear and complex datasets such as this one, and the question is how to capture these sorts of complex patterns.
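To make the limitation concrete, here is a minimal sketch in plain Python. It assumes a five-point stand-in for the circle dataset (the coordinates are made up for illustration, not taken from the Playground) and tries many random lines to see whether any single neuron can classify all the points.

```python
import random

# Toy stand-in for the Playground "circle" dataset (coordinates are made up):
# one point inside the ring (label 1) and four points around it (label 0).
POINTS = [
    ((0.0, 0.0), 1),
    ((2.0, 0.0), 0),
    ((-2.0, 0.0), 0),
    ((0.0, 2.0), 0),
    ((0.0, -2.0), 0),
]

def single_neuron(w1, w2, b, x, y):
    """Return 1 if (x, y) lies on the positive side of the line w1*x + w2*y + b = 0."""
    return 1 if w1 * x + w2 * y + b > 0 else 0

def accuracy(w1, w2, b):
    """Fraction of POINTS that the line classifies correctly."""
    correct = sum(single_neuron(w1, w2, b, x, y) == label for (x, y), label in POINTS)
    return correct / len(POINTS)

# Try many random lines and keep the best accuracy found.
random.seed(0)
best_accuracy = max(
    accuracy(random.uniform(-5, 5), random.uniform(-5, 5), random.uniform(-5, 5))
    for _ in range(10_000)
)
print(f"best accuracy of any single line tried: {best_accuracy:.2f}")
```

No line gets all five points right. The score at the center is the average of the scores at two opposite outer points, so the center can't be on the positive side while both of those outer points are on the negative side.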

### Lab: Solve a non-linear problem with hidden layers

- [Click this link](http://playground.tensorflow.org/#activation=sigmoid&regularization=L2&batchSize=10&dataset=circle&regDataset=reg-plane&learningRate=0.03&regularizationRate=0&noise=0&networkShape=3&seed=0.84062&showTestData=false&discretize=false&percTrainData=50&x=true&y=true&xTimesY=false&xSquared=false&ySquared=false&cosX=false&sinX=false&cosY=false&sinY=false&collectStats=false&problem=classification&initZero=false&showTestData_hide=true&learningRate_hide=true&regularizationRate_hide=true&percTrainData_hide=true&numHiddenLayers_hide=true&discretize_hide=true&activation_hide=true&problem_hide=true&noise_hide=true&regularization_hide=true&dataset_hide=true&batchSize_hide=true&playButton_hide=false) to open the Playground for this problem, and click the Run button to start training
- After the training, move your cursor over each neuron in the hidden layer and see what kind of line it draws
- Move your cursor over each of the weight lines (the dashed lines in blue and orange) and see the weight values

### Hidden layers "tweak" the data to extract features

![](https://cloud.google.com/blog/big-data/2016/07/images/146798944178238/neural-networks-1.png)

<br/>

What's happening here? If you click each one of the neurons in the hidden layer, you see they're each doing a simple, single-line classification:

- The first neuron checks if a data point is on the left or right
- The second neuron checks if it's in the top right
- The third one checks if it's in the bottom right

These three results are called features of the data. Outputs from these neurons indicate the strength of their corresponding features.

Finally, the neuron on the output layer uses these features to classify the data. If you draw a three-dimensional space consisting of the feature values, the final neuron can simply divide this space with a flat plane. This is an example of a transformation of the original data into a feature space.
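The idea can be sketched in a few lines of plain Python. The weights below are hand-picked for illustration and are an assumption, not the values the Playground learns. The sketch uses three sigmoid hidden neurons, each drawing one line, and an output neuron that cuts the resulting three-dimensional feature space with a single flat plane.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

STEEPNESS = 10.0  # assumed value; makes each sigmoid act almost like a hard threshold

def hidden_features(x, y):
    """Three hidden neurons, each a single-line classifier (hand-picked weights)."""
    left = sigmoid(STEEPNESS * (-x - 1.0))                           # far left?
    top_right = sigmoid(STEEPNESS * (0.5 * x + 0.866 * y - 1.0))     # top right?
    bottom_right = sigmoid(STEEPNESS * (0.5 * x - 0.866 * y - 1.0))  # bottom right?
    return left, top_right, bottom_right

def output_neuron(features):
    """One flat plane in feature space: fires only when no feature is strong."""
    return sigmoid(STEEPNESS * (0.5 - sum(features)))

def classify(x, y):
    """Return 1 for points inside the triangle formed by the three lines, else 0."""
    return 1 if output_neuron(hidden_features(x, y)) > 0.5 else 0

for point in [(0.0, 0.0), (3.0, 0.0), (-3.0, 0.0), (0.0, 3.0), (0.0, -3.0)]:
    print(point, "->", classify(*point))
```

Each hidden neuron on its own is still just a straight-line classifier. The curved-looking boundary only appears because the output neuron combines the three feature values.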

[colah's blog](http://colah.github.io/posts/2015-01-Visualizing-Representations/) has many great visualizations of this concept. You can see in those graphs that the hidden layers are **tweaking** the original data to extract the important features, such as blue or red, so that the final neuron can just draw a straight line (or a hyperplane in n-dimensional space) to classify them by looking at the features.

![](https://cloud.google.com/blog/big-data/2016/07/images/146798944178238/neural-networks-28.png)
![](http://colah.github.io/posts/2014-03-NN-Manifolds-Topology/img/spiral.1-2.2-2-2-2-2-2.gif)

![](https://cloud.google.com/blog/big-data/2016/07/images/146798944178238/neural-networks-16.png)

In the case of the Playground demo, the transformation results in a composition of multiple features corresponding to a triangular or rectangular area. If you add more neurons by clicking the "plus" button, you'll see that the output neuron can capture much more sophisticated polygonal shapes from the dataset.
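As a rough illustration of why extra neurons help, the sketch below builds a hypothetical network whose hidden neurons each draw one line, spaced evenly around the origin. The function name `polygon_net`, the weights, and the test point are all assumptions made for this example. With 3 neurons the positive region is a triangle; with 8 it is an octagon that hugs a circle more closely.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def polygon_net(num_hidden, x, y, steepness=10.0):
    """Hypothetical net: num_hidden single-line neurons spaced evenly around the origin."""
    total = 0.0
    for i in range(num_hidden):
        angle = 2.0 * math.pi * i / num_hidden
        # Each hidden neuron fires when the point lies beyond its own line.
        total += sigmoid(steepness * (math.cos(angle) * x + math.sin(angle) * y - 1.0))
    # Output neuron: returns 1 only when no hidden neuron fires strongly.
    return 1 if sigmoid(steepness * (0.5 - total)) > 0.5 else 0

# A point 1.5 units from the origin, in the direction of one triangle corner.
corner_point = (1.5 * math.cos(math.pi / 3), 1.5 * math.sin(math.pi / 3))

print("3 neurons:", polygon_net(3, *corner_point))
print("8 neurons:", polygon_net(8, *corner_point))
print("8 neurons, center:", polygon_net(8, 0.0, 0.0))
```

The 3-neuron network still treats the corner point as inside, because the triangle's corners stick out far from the center. The 8-neuron network puts it outside while keeping the center inside.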

Getting back to the office worker analogy, you can say the transformation is extracting the insights that an experienced professional has in their daily work. A new employee gets confused and distracted by random signals coming from e-mails, phones, the boss, customers, etc., but senior employees are very efficient at extracting the essential signal from those inputs and organizing the chaos according to a few important principles.

Neural networks work the same way: they try to extract the most important features in a dataset to solve the problem. That's why neural networks can sometimes get smart enough to handle some pretty complex tasks.

![](https://cloud.google.com/blog/big-data/2016/07/images/146798944178238/neural-network-23.png)

### Lab: How do the hidden layers work to "tweak" the data?

- Discuss with your buddy what it means to say "the hidden layers tweak the data to make it linearly classifiable"

## 3-2. We need to go deeper: building a hierarchy of abstractions

With more neurons in a single hidden layer, you can capture more features. And with more hidden layers, you can extract more complex constructs from the dataset. You can see how powerful this can be in the next example.

<br/>
![](https://cloud.google.com/blog/big-data/2016/07/images/146798944178238/neural-networks-9.png)
<br/>

What kind of code would you write to classify this dataset? Dozens of IF statements with many many conditions and thresholds, each checking which small area a given data point is in? I personally wouldn’t want to do that.
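Instead of hand-written IF statements, a deep network applies the same small operation layer after layer, with each layer reading only the features produced by the previous one. The following is a minimal sketch of that forward pass in plain Python. It assumes the 2-input, 8-8-5 hidden shape and ReLU activation that appear in the first Playground link of this section's lab (`networkShape=8,8,5`, `activation=relu`). The weights are random and untrained, the `tanh` on the output is an assumption, and the helper names are hypothetical.

```python
import math
import random

def relu(z):
    return max(0.0, z)

def make_layer(num_inputs, num_neurons, rng):
    """One dense layer: a weight list and a bias per neuron (random, untrained)."""
    return [
        ([rng.uniform(-1, 1) for _ in range(num_inputs)], rng.uniform(-1, 1))
        for _ in range(num_neurons)
    ]

def forward(layers, inputs):
    """Feed inputs through every layer; each layer sees only the previous layer's outputs."""
    activations = list(inputs)
    for index, layer in enumerate(layers):
        is_output = index == len(layers) - 1
        squash = math.tanh if is_output else relu  # tanh on the output is an assumption
        activations = [
            squash(sum(w * a for w, a in zip(weights, activations)) + bias)
            for weights, bias in layer
        ]
    return activations

rng = random.Random(0)
sizes = [2, 8, 8, 5, 1]  # inputs x and y, hidden layers of 8, 8 and 5 neurons, one output
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
num_parameters = sum(len(weights) + 1 for layer in layers for weights, _ in layer)

print("output for (0.5, -0.3):", forward(layers, [0.5, -0.3]))
print("trainable parameters:", num_parameters)
```

The whole network here has 147 numbers to tune. Training adjusts those numbers automatically, which is what replaces the dozens of hand-tuned conditions and thresholds.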

### Lab:

TODO

- [Click this link](http://playground.tensorflow.org/#activation=relu&batchSize=10&dataset=spiral&regDataset=reg-plane&learningRate=0.03&regularizationRate=0&noise=0&networkShape=8,8,5&seed=0.53586&showTestData=false&discretize=false&percTrainData=50&x=true&y=true&xTimesY=false&xSquared=false&ySquared=false&cosX=false&sinX=false&cosY=false&sinY=false&collectStats=false&problem=classification&initZero=false&showTestData_hide=true&activation_hide=true&problem_hide=true&noise_hide=true&discretize_hide=true&regularization_hide=true&dataset_hide=true&batchSize_hide=true&learningRate_hide=true&regularizationRate_hide=true&percTrainData_hide=true&numHiddenLayers_hide=true) 

- [Click this link](http://playground.tensorflow.org/#activation=tanh&batchSize=10&dataset=spiral&regDataset=reg-plane&learningRate=0.03&regularizationRate=0&noise=0&networkShape=4,2&seed=0.63289&showTestData=false&discretize=false&percTrainData=50&x=true&y=true&xTimesY=false&xSquared=false&ySquared=false&cosX=false&sinX=false&cosY=false&sinY=false&collectStats=false&problem=classification&initZero=false&hideText=false&regularizationRate_hide=true&noise_hide=true&batchSize_hide=true&percTrainData_hide=true&regularization_hide=true&problem_hide=true)
