# Ch 3 - Getting Started with Neural Networks

## 3.1 Anatomy of a Neural Network

Training a neural network revolves around the following objects:

- **Layers**, which are combined into a **network** (or **model**)

- The **input data** and corresponding **targets**

- The **loss function**, which defines the feedback signal used for learning

- The **optimizer**, which determines how learning proceeds



![NN1](Images/03_01.jpg)


The network maps the input data to predictions. The loss function then compares these predictions to the targets, producing a loss value: a measure of how well the network's predictions match what was expected. The optimizer uses this loss value to update the network's weights. 



### 3.1.1 Layers: the Building Blocks of Deep Learning


The **layer** is the fundamental data structure in neural networks. It is a data-processing module that takes one or more tensors as input and outputs one or more tensors. 

Some layers are stateless, but most have a state: the layer's **weights**, one or several tensors learned with stochastic gradient descent, which together contain the network's knowledge.
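The difference between a stateless and a stateful layer can be sketched in plain Python. This is a simplified illustration, not the Keras implementation: `relu` and `TinyDense` are hypothetical names, and the random initialization range is an arbitrary assumption.

```python
import random


def relu(values):
    """A stateless layer: the output depends only on the input."""
    return [max(0.0, v) for v in values]


class TinyDense:
    """A stateful layer: holds a weight matrix and a bias vector."""

    def __init__(self, input_dim, units, seed=0):
        rng = random.Random(seed)
        # The layer's state: these values are what training would adjust.
        self.kernel = [[rng.uniform(-0.1, 0.1) for _ in range(units)]
                       for _ in range(input_dim)]
        self.bias = [0.0] * units

    def __call__(self, vector):
        # output[j] = sum_i vector[i] * kernel[i][j] + bias[j]
        return [
            sum(x * row[j] for x, row in zip(vector, self.kernel)) + self.bias[j]
            for j in range(len(self.bias))
        ]


dense = TinyDense(input_dim=4, units=3)
output = relu(dense([1.0, 2.0, 3.0, 4.0]))
```

Here `dense.kernel` and `dense.bias` play the role of the weights, while `relu` carries nothing from one call to the next.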

Different layers are appropriate for different tensor formats and different types of data processing. 

- **Dense Layers**: Simple vector data stored in 2D tensors of shape (samples, features) is often processed by densely connected layers, also called fully connected or dense layers (the Dense class in Keras).

- **Recurrent Layers**: Sequence data stored in 3D tensors of shape (samples, timesteps, features) is typically processed by recurrent layers such as an LSTM layer.

- **2D Convolution Layers**: Image data stored in 4D tensors is usually processed by 2D convolution layers (Conv2D).
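The three tensor formats listed above can be illustrated with nested Python lists. This is a rough sketch: `nested_shape` is a hypothetical helper, the sizes are arbitrary, and the 4D layout (samples, height, width, channels) is an assumption, since the text does not specify the axis order for image data.

```python
def nested_shape(data):
    """Return the shape of a regularly nested, non-empty list as a tuple."""
    shape = []
    while isinstance(data, list):
        shape.append(len(data))
        data = data[0]
    return tuple(shape)


# 2D: (samples, features), here 2 samples with 3 features each.
vector_batch = [[0.0] * 3 for _ in range(2)]

# 3D: (samples, timesteps, features), here 2 sequences of 5 timesteps.
sequence_batch = [[[0.0] * 3 for _ in range(5)] for _ in range(2)]

# 4D: assumed layout (samples, height, width, channels), here 2 images of 4x4x3.
image_batch = [[[[0.0] * 3 for _ in range(4)] for _ in range(4)] for _ in range(2)]

vector_shape = nested_shape(vector_batch)
sequence_shape = nested_shape(sequence_batch)
image_shape = nested_shape(image_batch)
```

In practice these would be NumPy arrays or framework tensors; the nested lists only make the number of axes visible.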


Building deep-learning models in Keras is done by clipping together compatible layers to form useful data-transformation pipelines. The notion of **layer compatibility** here refers specifically to the fact that every layer will only accept input tensors of a certain shape and will return output tensors of a certain shape.




#### Example: a dense layer with 32 output units

```python
from keras import layers

# A dense layer with 32 output units that expects 784 input features per sample.
layer = layers.Dense(32, input_shape=(784,))
```

This creates a layer that only accepts 2D tensors of shape (batch_size, 784) as input. The batch dimension (axis 0) is left unspecified, so any batch size is accepted. The layer returns a tensor of shape (batch_size, 32): the feature axis is transformed from 784 to 32 dimensions.

This layer can only be connected to a downstream layer that expects 32-dimensional vectors as its input.
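The compatibility rule can be expressed as a simple check over a chain of layers. This is a sketch in plain Python: `check_compatibility` is a hypothetical helper, not a Keras function, and each layer is reduced to an assumed `(input_dim, output_dim)` pair.

```python
def check_compatibility(layer_specs):
    """Verify each layer's input size equals the previous layer's output size.

    layer_specs is a list of (input_dim, output_dim) pairs.
    """
    for index in range(1, len(layer_specs)):
        previous_output = layer_specs[index - 1][1]
        expected_input = layer_specs[index][0]
        if previous_output != expected_input:
            raise ValueError(
                f"Layer {index} expects {expected_input}-dimensional input "
                f"but receives {previous_output}-dimensional vectors."
            )
    return True


# A 784 -> 32 layer followed by a 32 -> 10 layer is compatible.
compatible = check_compatibility([(784, 32), (32, 10)])

# A 784 -> 32 layer followed by a layer expecting 64 inputs is not.
try:
    check_compatibility([(784, 32), (64, 10)])
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```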

When using Keras, you don't have to worry about compatibility, because the layers you add to your models are dynamically built to match the output shape of the preceding layer.

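```python
from keras import layers, models

model = models.Sequential()
# Only the first layer needs an explicit input shape.
model.add(layers.Dense(32, input_shape=(784,)))
# No input shape is passed to the second layer.
model.add(layers.Dense(32))
```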

The second layer didn't receive an input shape argument. Instead, it automatically inferred its input shape from the output shape of the preceding layer.
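One way to picture this inference is a container that fills in a missing input size when a layer is added. The sketch below is plain Python and does not reflect how Keras is actually implemented: `SketchDense` and `SketchSequential` are hypothetical classes introduced only for illustration.

```python
class SketchDense:
    """Hypothetical stand-in for a dense layer whose input size may be deferred."""

    def __init__(self, units, input_dim=None):
        self.units = units
        self.input_dim = input_dim  # None until it is known


class SketchSequential:
    """Hypothetical container that infers each layer's input size."""

    def __init__(self):
        self.layers = []

    def add(self, layer):
        if layer.input_dim is None:
            if not self.layers:
                raise ValueError("The first layer needs an explicit input size.")
            # Infer the input size from the previous layer's output size.
            layer.input_dim = self.layers[-1].units
        self.layers.append(layer)


model = SketchSequential()
model.add(SketchDense(32, input_dim=784))
model.add(SketchDense(32))
shapes = [(layer.input_dim, layer.units) for layer in model.layers]
```

After both calls to `add`, the second layer's input size has been set to the first layer's 32 output units without being passed explicitly.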

### 3.1.2 Models: Networks of Layers





### 3.1.3 Loss Functions and Optimizers: Keys to Configuring the Learning Process

## 3.2 Introduction to Keras


### 3.2.1 Keras, TensorFlow, Theano, and CNTK

### 3.2.2 Developing with Keras: a Quick Overview

## 3.3 Setting Up a Deep-Learning Workstation

### 3.3.1 Jupyter Notebooks: the Preferred Way to Run Deep-Learning Experiments

### 3.3.2 Getting Keras Running: Two Options

### 3.3.3 Running Deep-Learning Jobs in the Cloud: Pros and Cons

### 3.3.4 What is the Best GPU for Deep Learning?

## 3.4 Classifying Movie Reviews: a Binary Classification Example

### 3.4.1 The IMDB Dataset

### 3.4.2 Preparing the Data

### 3.4.3 Building Your Network

### 3.4.4 Validating Your Approach

### 3.4.5 Using a Trained Network to Generate Predictions on New Data

### 3.4.6 Further Experiments

### 3.4.7 Wrapping Up

## 3.5 Classifying Newswires: a Multiclass Classification Example

### 3.5.1 The Reuters Dataset

### 3.5.2 Preparing the Data

### 3.5.3 Building Your Network

### 3.5.4 Validating Your Approach

### 3.5.5 Generating Predictions on New Data

### 3.5.6 A Different Way to Handle the Labels and the Loss

### 3.5.7 The Importance of Having Sufficiently Large Intermediate Layers

### 3.5.8 Further Experiments

### 3.5.9 Wrapping Up

## 3.6 Predicting House Prices: a Regression Example

### 3.6.1 The Boston Housing Price Dataset

### 3.6.2 Preparing the Data

### 3.6.3 Building Your Network

### 3.6.4 Validating Your Approach Using K-fold Validation

### 3.6.5 Wrapping Up