<!-- TITLE: A Single Neuron -->

# Welcome to Deep Learning! #

Welcome to Kaggle's *Introduction to Deep Learning* course! You're about to learn all you need to get started building your own deep neural networks. Using Keras and TensorFlow, you'll learn how to:
- create a **fully-connected** neural network architecture
- apply neural nets to two classic ML problems: **regression** and **classification**
- train neural nets with **stochastic gradient descent**, and
- improve performance with **dropout**, **batch normalization**, and other techniques

The tutorials will introduce you to these topics with fully-worked examples, and then in the exercises, you'll explore these topics in more depth and apply them to real-world datasets.

Let's get started!

# The Linear Unit #

Let's begin with the simplest and most basic part of a neural network -- a single neuron. Though it might seem simple, even a single neuron can be surprisingly powerful.

As a diagram, a **neuron** (or **unit**) with one input looks like:

<figure style="padding: 1em;">
<img src="https://i.imgur.com/d8rXmAr.png" width="250" alt="Diagram of a linear unit.">
<figcaption style="textalign: center; font-style: italic"><center>The Linear Unit: $y = w x + b$
</center></figcaption>
</figure>

The input is `x`. Its connection to the neuron has a **weight** which is `w`. Whenever a value flows through a connection, you multiply the value by the connection's weight. For the input `x`, what reaches the neuron is `w * x`. A neural network "learns" by modifying its weights.

The `b` is a special kind of weight we call the **bias**. The bias doesn't have any input data associated with it; instead, we put a `1` in the diagram so that the value that reaches the neuron is just `b` (since `1 * b == b`). The bias enables the neuron to "shift" its output independently of its inputs.

The value the neuron ultimately outputs is the neuron's **activation**. To get the activation, the neuron sums up all the values it receives through its connections. This neuron's activation is `y = w * x + b`, or as a formula $y = w x + b$.

# Example #

Say the weights on our neuron happened to be `w=3` and `b=2`. What would we get if we plugged in `x=-4`?

<figure style="padding: 1em;">
<img src="https://i.imgur.com/Ihh7iaj.png" width="800" alt="Diagram of neural computation.">
<figcaption style="textalign: center; font-style: italic"><center>Computing with the linear unit.
</center></figcaption>
</figure>

This agrees with our formula: $y = 3(-4) + 2 = -10$.

(By the way, running all of your training data through a network like this is sometimes called doing the *forward pass*.)
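
As a quick check of the computation above, here is a minimal sketch of a one-input linear unit in plain NumPy. The function name `linear_unit` and the extra batch inputs are illustrative assumptions, not part of Keras.

```python
import numpy as np

def linear_unit(x, w, b):
    """Activation of a one-input linear unit: y = w * x + b."""
    return w * x + b

# Weights from the example above
w, b = 3.0, 2.0

# A single input
y = linear_unit(-4.0, w, b)
print(y)  # -10.0

# A forward pass over a small batch of inputs (values chosen arbitrarily)
xs = np.array([-4.0, 0.0, 1.0])
ys = linear_unit(xs, w, b)
print(ys)  # array with the values -10, 2, and 5
```

Because NumPy broadcasts the arithmetic, the same function handles one input or a whole batch.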

# A Linear Unit Fits a Line #

Until Lesson 6, the problems we'll work on will be **regression** problems. In a regression problem, we want to predict a numeric *target* from some inputs, the *features*. In the [House Prices](https://www.kaggle.com/c/house-prices-advanced-regression-techniques) competition, your task is to predict the price of a house (the target) from things like how many rooms it has or what year it was built (the features).

You could think about regression as a *curve-fitting* problem. If your features are $x$ and your target is $y$, regression is trying to fit a curve to all the points $(x, y)$.

What kind of curve does a linear unit fit? Does the formula $y=w x + b$ look familiar? It's an equation of a line! It's the slope-intercept equation, where $w$ is the slope and $b$ is the y-intercept. A linear unit computes a linear function and has a linear graph.

<figure style="padding: 1em;">
<img src="https://i.imgur.com/9crcufH.png" width="700" alt="On the left, training data points with a line placed at random. On the right, the same data points with the line running through the middle.">
<figcaption style="textalign: center; font-style: italic"><center><strong>Left: </strong>The untrained linear unit. <strong>Right: </strong>The trained linear unit.
</center></figcaption>
</figure>

When first created, a neuron typically has its weights set randomly. The goal of training is to find values for the weights that make the curve fit the data as closely as possible. For our one-input linear unit, we're trying to find the best slope $w$ and y-intercept $b$.
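
To make "finding the best slope and intercept" concrete, the sketch below fits a line to a few synthetic points. It uses NumPy's closed-form least-squares fit (`np.polyfit`) as a stand-in; this is not how the course trains networks, which is with stochastic gradient descent. The data points are invented for illustration.

```python
import numpy as np

# Synthetic data lying exactly on the line y = 3x + 2 (made up for illustration)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 3.0 * x + 2.0

# A degree-1 least-squares fit returns the slope first, then the intercept
w, b = np.polyfit(x, y, deg=1)
print(f"w = {w:.2f}, b = {b:.2f}")  # w = 3.00, b = 2.00
```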

# Multiple Inputs #

What if we wanted to fit a curve to more than one input? That's easy enough. We can just add more input connections to the neuron. To find the output, multiply each input by its connection weight and then add them all together.

<figure style="padding: 1em;">
<img src="https://i.imgur.com/mgncuD1.png" width="300" alt="Three input connections: x0, x1, and x2, along with the bias.">
<figcaption style="textalign: center; font-style: italic"><center>A linear unit with three inputs.
</center></figcaption>
</figure>

The formula for this neuron would be $y = w_0 x_0 + w_1 x_1 + w_2 x_2 + b$. A linear unit with two inputs will fit a plane, and a unit with more inputs than that will fit a hyperplane.
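
The three-input formula is a dot product plus a bias, which can be sketched in NumPy as follows. The weight, bias, and input values here are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import numpy as np

# Hypothetical weights and bias for a three-input linear unit
w = np.array([2.0, -1.0, 0.5])   # w_0, w_1, w_2
b = 1.0

# One example with three features: x_0, x_1, x_2
x = np.array([1.0, 3.0, 4.0])

# y = w_0*x_0 + w_1*x_1 + w_2*x_2 + b, written as a dot product
y = np.dot(w, x) + b
print(y)  # 2.0
```

Term by term, this is $2(1) + (-1)(3) + 0.5(4) + 1 = 2$.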

# Linear Models in Keras #

In Keras, you can create a model with a single linear unit using what's called a `Dense` layer. Most neural networks are built by stacking layers of neurons that connect in a particular way, which we'll learn about in Lesson 2. Stacking layers is what `Sequential` does.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Create a network with 1 linear unit
model = keras.Sequential([
    layers.Dense(units=1, input_shape=[2])
])
```

The `input_shape` argument tells Keras the dimensions of the inputs. All of our inputs will be something like `[num_xs]`, where we're just counting how many input connections the unit has. This model would accept two features as input -- two columns from a dataframe, say.

For the number of output values we want, we use the `units` argument. For our regression problems, we're just trying to predict a single value (a price, say) for each set of inputs; so, we use `units=1`.

As we mentioned earlier, when first created, a model has its weights initialized randomly. We'll need to fit the network to the training data before we make predictions, which we'll learn how to do in Lesson 3.
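
To see what Keras created, the following sketch inspects the untrained model's weights and checks that its output matches the linear formula. It assumes TensorFlow is installed; the input rows in `X` are made up for illustration.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Same model as above: one linear unit with two inputs
model = keras.Sequential([
    layers.Dense(units=1, input_shape=[2])
])

# get_weights() returns NumPy arrays: the connection weights, then the bias
w, b = model.get_weights()
print(w.shape, b.shape)  # (2, 1) (1,)

# Two made-up rows of input with two features each
X = np.array([[1.0, 2.0], [3.0, 4.0]], dtype="float32")

# The untrained model's output should match the linear formula X @ w + b
y_keras = model(X).numpy()
y_manual = X @ w + b
print(np.allclose(y_keras, y_manual, atol=1e-5))
```

With the default `Dense` settings, the connection weights start at small random values and the bias starts at zero, so the printed predictions will differ from run to run until the model is trained.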

<blockquote style="margin-right:auto; margin-left:auto; background-color: #ebf9ff; padding: 1em; margin:24px;">
    <strong>Input Shape</strong><br>
The data we'll use in this course will be tabular data, like in a Pandas dataframe. We'll have one input for each feature in the dataset. The features are arranged by column, so we'll always have <code>input_shape=[num_columns]</code>.

The reason we use a list here is that some data may need more than one "axis". Image data, for instance, might need three: <code>[height, width, channels]</code>.
</blockquote>
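
As a small sketch of how this looks in practice, the snippet below reads the number of feature columns from a dataframe. The column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical feature table: three feature columns, no target column
X = pd.DataFrame({
    "rooms": [3, 4, 2],
    "year_built": [1995, 2008, 1978],
    "lot_area": [8450, 9600, 11250],
})

# X.shape is (num_rows, num_columns); only the column count goes in input_shape
input_shape = [X.shape[1]]
print(input_shape)  # [3]
```

The resulting list could then be passed to the first layer, as in `layers.Dense(units=1, input_shape=input_shape)`.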

# Conclusion #

Now check out the exercises and learn about **TODO**