<!-- TITLE: A Single Neuron -->

# Welcome to Deep Learning! #

Welcome to Kaggle's *Introduction to Deep Learning* course! You're about to learn all you need to get started building your own deep neural networks. Using Keras and TensorFlow, you'll learn how to:
- create a **fully-connected** neural network architecture
- apply neural nets to two classic ML problems: **regression** and **classification**
- train neural nets with **stochastic gradient descent**, and
- improve performance with **dropout**, **batch normalization**, and other techniques

The tutorials will introduce you to these topics with fully-worked examples. In the exercises, you'll explore them in more depth and apply them to real-world datasets.

Let's get started!

# The Linear Unit #

<mark><b>TODO: What is a "neural network" and what does it mean in the context of a course on deep learning?  Currently missing: [1] definition of neural network so user can understand "neuron" as part of "neural network", [2] the idea of that neural network maps inputs to outputs (to set us up for understanding single neuron as doing the same).</b></mark>

Let's begin with the most basic part of a neural network: a single neuron. As a diagram, a **neuron** (or **unit**) with one input looks like:

<figure style="padding: 1em;">
<img src="https://i.imgur.com/d8rXmAr.png" width="250" alt="Diagram of a linear unit.">
<figcaption style="textalign: center; font-style: italic"><center>The Linear Unit: $y = w x + b$
</center></figcaption>
</figure>

The input is `x`. Its connection to the neuron has a **weight** which is `w`. Whenever a value flows through a connection, you multiply the value by the connection's weight. For the input `x`, what reaches the neuron is `w * x`. A neural network "learns" by modifying its weights.

The `b` is a special kind of weight we call the **bias**. The bias doesn't have any input data associated with it; instead, we put a `1` in the diagram so that the value that reaches the neuron is just `b` (since `1 * b == b`). The bias enables the neuron to "shift" its output independently of its inputs.

The `y` is the value the neuron ultimately outputs. To get the output, the neuron sums up all the values it receives through its connections. This neuron's activation is `y = w * x + b`, or as a formula $y = w x + b$.  

<blockquote style="margin-right:auto; margin-left:auto; background-color: #ebf9ff; padding: 1em; margin:24px;">
    <strong>Does the formula $y=w x + b$ look familiar?</strong><br>
 It's an equation of a line! It's the slope-intercept equation, where $w$ is the slope and $b$ is the y-intercept. 
</blockquote>
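To make the slope-intercept reading concrete, here is a minimal sketch in plain Python. The function name `linear_unit` and the values `w=0.5`, `b=1.0` are assumptions chosen only for illustration; they do not come from Keras or from any dataset.

```python
def linear_unit(x, w, b):
    """Output of a single-input linear unit: y = w * x + b."""
    return w * x + b

# Hypothetical weight and bias, chosen only for illustration.
w, b = 0.5, 1.0

# At x = 0 the output is just the bias (the y-intercept).
intercept = linear_unit(0.0, w, b)

# Increasing x by 1 changes the output by the weight (the slope).
slope = linear_unit(3.0, w, b) - linear_unit(2.0, w, b)

print(intercept, slope)
```

Changing `b` moves the whole line up or down, while changing `w` tilts it.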



# Example #

<mark><b>TODO: Instead of working with a toy example here, recommend instead grounding in a real example.  x=-4 (or you can change the value) should correspond to a feature from a real dataset, and the "y" value should be something that's predicted by a model.  I moved "house prices" here so that the input could come from a feature from that dataset, and then you can identify "y" as the intended output</b></mark>

In the [House Prices](https://www.kaggle.com/c/house-prices-advanced-regression-techniques) competition, your task is to predict the price of a house (the target) from things like how many rooms it has or what year it was built (the features).

Say the weights on our neuron happened to be `w=3` and `b=2`. What would we get if we plugged in `x=-4`?

<figure style="padding: 1em;">
<img src="https://i.imgur.com/Ihh7iaj.png" width="800" alt="Diagram of neural computation.">
<figcaption style="textalign: center; font-style: italic"><center>Computing with the linear unit.
</center></figcaption>
</figure>

This agrees with our formula: $y = 3(-4) + 2 = -10$.
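The same computation can be traced step by step in plain Python. This is only a sketch of the arithmetic; the variable name `weighted_input` is introduced here for illustration.

```python
w, b = 3, 2   # weight and bias from the example
x = -4        # input from the example

# The value that reaches the neuron over the input connection.
weighted_input = w * x

# The neuron adds the bias to produce its output.
y = weighted_input + b

print(weighted_input, y)  # -12 -10
```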

# Multiple Inputs #

<mark><b>TODO: Can motivate this section along the lines of house prices: want to use more than just one feature</b></mark>

What if we wanted to work with more than one input? That's easy enough. We can just add more input connections to the neuron. To find the output, multiply each input by its connection weight, add the results together, and then add the bias.

<figure style="padding: 1em;">
<img src="https://i.imgur.com/mgncuD1.png" width="300" alt="Three input connections: x0, x1, and x2, along with the bias.">
<figcaption style="textalign: center; font-style: italic"><center>A linear unit with three inputs.
</center></figcaption>
</figure>

The formula for this neuron would be $y = w_0 x_0 + w_1 x_1 + w_2 x_2 + b$. A linear unit with two inputs will fit a plane, and a unit with more inputs than that will fit a hyperplane.
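As a rough sketch, the three-input formula can be written as a weighted sum in plain Python. The helper `linear_unit` and all of the numbers below are hypothetical, picked only so the arithmetic is easy to follow.

```python
def linear_unit(inputs, weights, bias):
    """Weighted sum of the inputs plus the bias: y = sum(w_i * x_i) + b."""
    if len(inputs) != len(weights):
        raise ValueError("need exactly one weight per input")
    return sum(w * x for w, x in zip(weights, inputs)) + bias

# Hypothetical inputs, weights, and bias for a three-input unit.
x = [1.0, 2.0, 3.0]    # x_0, x_1, x_2
w = [0.5, -1.0, 2.0]   # w_0, w_1, w_2
b = 4.0

y = linear_unit(x, w, b)
print(y)  # 0.5 - 2.0 + 6.0 + 4.0 = 8.5
```

The same function works for any number of inputs, as long as there is one weight per input.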

# Linear Units in Keras #

<mark><b>TODO: this input_shape number should match the number of inputs from the "multiple inputs" section</b></mark>

In Keras, you create a model with `keras.Sequential`.  For now, we define a model with a single linear unit, using what's called a `Dense` layer.  

Most neural networks are built by combining multiple layers of neurons, which we'll learn about in Lesson 2. Stacking layers is what `keras.Sequential` does.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Create a network with 1 linear unit
model = keras.Sequential([
    layers.Dense(units=1, input_shape=[2])
])
```

The `units` argument sets how many output values the layer produces. For most of this course, we will predict a single value (a price, say) for each set of inputs, so we use `units=1`.

The `input_shape` argument tells Keras the dimensions of the inputs.  Setting `input_shape=[2]` ensures the model will accept two features as input.  

<blockquote style="margin-right:auto; margin-left:auto; background-color: #ebf9ff; padding: 1em; margin:24px;">
    <strong>Why is <code>input_shape</code> a Python list?</strong><br>
The data we'll use in this course will be tabular data, like in a Pandas dataframe. We'll have one input for each feature in the dataset. The features are arranged by column, so we'll always have <code>input_shape=[num_columns]</code>.

The reason Keras uses a list here is to permit use of more complex datasets. Image data, for instance, might need three dimensions: <code>[height, width, channels]</code>.
</blockquote>

When first created, a model has its weights initialized randomly. We'll need to fit the network to the training data before we make predictions, which we'll learn how to do in Lesson 3.
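One way to see this is to inspect the weights of a freshly created model. The sketch below rebuilds the same two-input model and assumes TensorFlow and NumPy are installed; the input values in `x` are hypothetical. With the default `Dense` settings the connection weights start at random values and the bias starts at zero, so the printed weights will differ from run to run.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Same model as above: one linear unit with two inputs.
model = keras.Sequential([
    layers.Dense(units=1, input_shape=[2])
])

# One weight per input (shape (2, 1)) plus a single bias (shape (1,)).
w, b = model.get_weights()
print(w.shape, b.shape)
print(w, b)  # random starting weights; values change on every run

# Hypothetical input row with two features.
x = np.array([[1.0, 2.0]], dtype="float32")

# The untrained model already computes y = w_0 * x_0 + w_1 * x_1 + b.
y_keras = model.predict(x, verbose=0)
y_manual = x @ w + b
print(np.allclose(y_keras, y_manual, atol=1e-5))
```

Because the weights are arbitrary at this point, the prediction is not meaningful yet; the check only confirms that the layer applies the linear unit formula.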


# Conclusion #

Now check out the exercises and learn about **TODO**