# SINGLE NEURON

Learn about linear units, the building blocks of deep learning.



## Welcome to Deep Learning!
Welcome to Kaggle's Introduction to Deep Learning course! You're about to learn all you need to get started building your own deep neural networks. Using Keras and TensorFlow, you'll learn how to:

- create a fully-connected neural network architecture
- apply neural nets to two classic ML problems: regression and classification
- train neural nets with stochastic gradient descent, and
- improve performance with dropout, batch normalization, and other techniques

The tutorials will introduce you to these topics with fully-worked examples, and then in the exercises, you'll explore these topics in more depth and apply them to real-world datasets.

Let's get started!

## What is Deep Learning?
Some of the most impressive advances in artificial intelligence in recent years have been in the field of deep learning. Natural language translation, image recognition, and game playing are all tasks where deep learning models have neared or even exceeded human-level performance.

So what is deep learning? Deep learning is an approach to machine learning characterized by deep stacks of computations. This depth of computation is what has enabled deep learning models to disentangle the kinds of complex and hierarchical patterns found in the most challenging real-world datasets.

Through their power and scalability, neural networks have become the defining model of deep learning. Neural networks are composed of neurons, where each neuron individually performs only a simple computation. The power of a neural network comes instead from the complexity of the connections these neurons can form.



## The Linear Unit
So let's begin with the fundamental component of a neural network: the individual neuron. As a diagram, a neuron (or unit) with one input looks like:
https://storage.googleapis.com/kaggle-media/learn/images/mfOlDR6.png

The input is x. Its connection to the neuron has a weight which is w. Whenever a value flows through a connection, you multiply the value by the connection's weight. For the input x, what reaches the neuron is w * x. A neural network "learns" by modifying its weights.

The b is a special kind of weight we call the bias. The bias doesn't have any input data associated with it; instead, we put a 1 in the diagram so that the value that reaches the neuron is just b (since 1 * b = b). The bias enables the neuron to modify the output independently of its inputs.

The y is the value the neuron ultimately outputs. To get the output, the neuron sums up all the values it receives through its connections. This neuron's activation is y = w * x + b, or as a formula $y = wx + b$.



## Example - The Linear Unit as a Model
Though individual neurons will usually only function as part of a larger network, it's often useful to start with a single neuron model as a baseline. Single neuron models are linear models.

Let's think about how this might work on a dataset like 80 Cereals (https://www.kaggle.com/crawford/80-cereals). Training a model with 'sugars' (grams of sugars per serving) as input and 'calories' (calories per serving) as output, we might find the bias is b=90 and the weight is w=2.5. We could estimate the calorie content of a cereal with 5 grams of sugar per serving like this:

https://storage.googleapis.com/kaggle-media/learn/images/yjsfFvY.png
And, checking against our formula, we have $\text{calories} = 2.5 \times 5 + 90 = 102.5$, just like we expect.
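The calculation above can be sketched in a few lines of plain Python. The function name `linear_unit` is made up for illustration, and the weight and bias are the hypothetical values from the example, not fitted parameters.

```python
def linear_unit(x, w, b):
    """Return the output of a single linear unit: y = w * x + b."""
    return w * x + b

# Hypothetical parameters from the example above (not fitted values).
w = 2.5   # weight on 'sugars'
b = 90.0  # bias

calories = linear_unit(5, w, b)
print(calories)  # 102.5
```

With an input of 0 grams of sugar the unit returns the bias alone, which is one way to read what `b` means in this model.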



### Multiple Inputs
The 80 Cereals dataset has many more features than just 'sugars'. What if we wanted to expand our model to include things like fiber or protein content? That's easy enough. We can just add more input connections to the neuron, one for each additional feature. To find the output, we would multiply each input by its connection weight and then add them all together. The diagram below shows a linear unit with three inputs:
https://storage.googleapis.com/kaggle-media/learn/images/vyXSnlZ.png

The formula for this neuron would be $y = w_0 x_0 + w_1 x_1 + w_2 x_2 + b$. A linear unit with two inputs will fit a plane, and a unit with more inputs than that will fit a hyperplane.
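As a rough sketch, the three-input formula is a dot product plus a bias. The weights, bias, and feature values below are made-up numbers chosen only to make the arithmetic easy to follow; they are not fitted to the 80 Cereals data.

```python
import numpy as np

# Hypothetical weights and bias, for illustration only.
w = np.array([2.5, -1.0, 4.0])   # weights for sugars, fiber, protein
b = 90.0

# One hypothetical cereal: sugars, fiber, protein (grams per serving).
x = np.array([5.0, 2.0, 3.0])

# y = w0*x0 + w1*x1 + w2*x2 + b, written as a dot product.
y = np.dot(w, x) + b
print(y)  # 112.5
```

Adding another feature only means adding one more entry to `w` and `x`; the formula itself does not change.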





#### Linear Units in Keras
The easiest way to create a model in Keras is through keras.Sequential, which creates a neural network as a stack of layers. We can create models like those above using a dense layer (which we'll learn more about in the next lesson).

We could define a linear model accepting three input features ('sugars', 'fiber', and 'protein') and producing a single output ('calories') like so:


In [1]:
from tensorflow import keras
from tensorflow.keras import layers

# Create a network with 1 linear unit (neuron)
model = keras.Sequential([
    layers.Dense(units=1, input_shape=[3])
])




With the first argument, units, we define how many outputs we want. In this case we are just predicting 'calories', so we'll use units=1.

With the second argument, input_shape, we tell Keras the dimensions of the inputs. Setting input_shape=[3] ensures the model will accept three features as input ('sugars', 'fiber', and 'protein').

This model is now ready to be fit to training data!
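To make the computation concrete, here is a NumPy sketch of what a dense layer with `units=1` and three input features does to a batch of rows. It assumes the usual layout of a weight matrix (called `kernel` here) of shape `[3, 1]` and a bias of shape `[1]`; the values are random placeholders, not trained weights, and this is a stand-in for the layer rather than Keras itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the layer's parameters: one weight per input feature
# (shape [3, 1]) and one bias per unit (shape [1]).
kernel = rng.normal(size=(3, 1))
bias = np.zeros(1)

# A batch of 4 rows with 3 features each ('sugars', 'fiber', 'protein').
X = rng.normal(size=(4, 3))

# A dense layer with no activation computes X @ kernel + bias.
y = X @ kernel + bias
print(y.shape)  # (4, 1): one prediction per row
```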



Why is input_shape a Python list?
The data we'll use in this course will be tabular data, like in a Pandas dataframe. We'll have one input for each feature in the dataset. The features are arranged by column, so we'll always have input_shape=[num_columns]. The reason Keras uses a list here is to permit use of more complex datasets. Image data, for instance, might need three dimensions: [height, width, channels].
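One way to see the pattern: `input_shape` describes a single example, so it is the shape of the data with the leading row (batch) dimension dropped. The array sizes below are hypothetical and only illustrate the bookkeeping.

```python
import numpy as np

# Hypothetical tabular data: 100 rows, 3 feature columns.
X_tabular = np.zeros((100, 3))
input_shape_tabular = [X_tabular.shape[1]]
print(input_shape_tabular)  # [3]

# Hypothetical batch of 10 RGB images, 28 x 28 pixels each.
X_images = np.zeros((10, 28, 28, 3))
input_shape_images = list(X_images.shape[1:])
print(input_shape_images)  # [28, 28, 3]
```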




# MAKE YOUR MODELS DEEP

## Introduction
In this lesson we're going to see how we can build neural networks capable of learning the complex kinds of relationships deep neural nets are famous for.

The key idea here is modularity, building up a complex network from simpler functional units. We've seen how a linear unit computes a linear function -- now we'll see how to combine and modify these single units to model more complex relationships.



### LAYERS

Neural networks typically organize their neurons into layers. When we collect together linear units having a common set of inputs we get a dense layer, as pictured below:

https://storage.googleapis.com/kaggle-media/learn/images/2MA4iMV.png
(Linear Networks with two inputs and a bias)
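A small NumPy sketch can show why a dense layer is just several linear units sharing the same inputs. The two units and their parameters below are hypothetical; the point is that stacking each unit's weights as a column of one matrix gives the same outputs as evaluating the units one by one.

```python
import numpy as np

x = np.array([1.0, 2.0])   # two inputs shared by both units

# Hypothetical parameters for two linear units.
w_unit0, b_unit0 = np.array([0.5, -1.0]), 0.1
w_unit1, b_unit1 = np.array([2.0, 0.25]), -0.3

# Evaluate each unit separately...
y_separate = np.array([w_unit0 @ x + b_unit0, w_unit1 @ x + b_unit1])

# ...or stack the weights as columns and evaluate the whole layer at once.
W = np.column_stack([w_unit0, w_unit1])   # shape (2 inputs, 2 units)
b = np.array([b_unit0, b_unit1])
y_layer = x @ W + b

print(np.allclose(y_separate, y_layer))  # True
```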

You could think of each layer in a neural network as performing some kind of relatively simple transformation. Through a deep stack of layers, a neural network can transform its inputs in more and more complex ways. In a well-trained neural network, each layer is a transformation getting us a little bit closer to a solution.

A "layer" in Keras is a very general kind of thing. A layer can be, essentially, any kind of data transformation. Many layers, like the convolutional and recurrent layers, transform data through use of neurons and differ primarily in the pattern of connections they form. Others though are used for feature engineering or just simple arithmetic. There's a whole world of layers to discover -- check them out!



### Activation Function

It turns out, however, that two dense layers with nothing in between are no better than a single dense layer by itself. Dense layers by themselves can never move us out of the world of lines and planes. What we need is something nonlinear. What we need are activation functions.

https://storage.googleapis.com/kaggle-media/learn/images/OLSUEYT.png

######  Without activation functions, neural networks can only learn linear relationships. In order to fit curves, we'll need to use activation functions.
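The claim that two dense layers with nothing in between are no better than one can be checked numerically. In the sketch below (random placeholder weights, layer sizes chosen arbitrarily), composing two linear layers gives exactly the same outputs as a single layer with weights `W1 @ W2` and bias `b1 @ W2 + b2`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two dense layers with no activation: 3 inputs -> 4 units -> 1 unit.
W1, b1 = rng.normal(size=(3, 4)), rng.normal(size=4)
W2, b2 = rng.normal(size=(4, 1)), rng.normal(size=1)

X = rng.normal(size=(5, 3))   # batch of 5 rows

two_layers = (X @ W1 + b1) @ W2 + b2

# The same map written as one dense layer.
W = W1 @ W2
b = b1 @ W2 + b2
one_layer = X @ W + b

print(np.allclose(two_layers, one_layer))  # True
```

Expanding the product shows why: (X W1 + b1) W2 + b2 = X (W1 W2) + (b1 W2 + b2), which is still a linear function of X.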

An activation function is simply some function we apply to each of a layer's outputs (its activations). The most common is the rectifier function $\max(0, x)$, shown below:
https://storage.googleapis.com/kaggle-media/learn/images/aeIyAlF.png

The rectifier function has a graph that's a line with the negative part "rectified" to zero. Applying the function to the outputs of a neuron will put a bend in the data, moving us away from simple lines.

When we attach the rectifier to a linear unit, we get a rectified linear unit or ReLU. (For this reason, it's common to call the rectifier function the "ReLU function".) Applying a ReLU activation to a linear unit means the output becomes max(0, w * x + b), which we might draw in a diagram like:

https://storage.googleapis.com/kaggle-media/learn/images/eFry7Yu.png
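A minimal NumPy sketch of a rectified linear unit follows. The helper name `relu` and the weight and bias values are illustrative assumptions; the computation is the max(0, w * x + b) described above.

```python
import numpy as np

def relu(z):
    """Rectifier: max(0, z), applied elementwise."""
    return np.maximum(0, z)

# Hypothetical weight and bias for a single-input unit.
w, b = 2.0, -1.0
x = np.array([-2.0, 0.0, 0.5, 3.0])

linear = w * x + b        # [-5., -1.,  0.,  5.]
rectified = relu(linear)  # [ 0.,  0.,  0.,  5.]
print(rectified)
```

Every negative value of the linear output is replaced by zero, which is the "bend" the rectifier introduces.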

## STACKING DENSE LAYERS

Now that we have some nonlinearity, let's see how we can stack layers to get complex data transformations. The image below shows such a stack:
https://storage.googleapis.com/kaggle-media/learn/images/Y5iwFQZ.png
(A stack of dense layers makes a "fully-connected" network.)

### Building Sequential Models

The Sequential model we've been using will connect together a list of layers in order from first to last: the first layer gets the input, the last layer produces the output. This creates the model in the figure above:

In [2]:
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    # the hidden ReLU layers
    layers.Dense(units=4, activation='relu', input_shape=[2]),
    layers.Dense(units=3, activation='relu'),
    # the linear output layer
    layers.Dense(units=1),
])
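To see how data flows through a stack like this, here is a NumPy sketch of the forward pass with the same layer sizes (2 inputs, then 4 and 3 ReLU units, then 1 linear unit). The weights are random placeholders and the helper `relu` is defined here for illustration; this mimics the shapes of the computation, not Keras internals.

```python
import numpy as np

def relu(z):
    """Rectifier applied elementwise."""
    return np.maximum(0, z)

rng = np.random.default_rng(0)

# Random placeholder parameters with the same shapes as the model above.
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # hidden layer, 4 units
W2, b2 = rng.normal(size=(4, 3)), np.zeros(3)   # hidden layer, 3 units
W3, b3 = rng.normal(size=(3, 1)), np.zeros(1)   # linear output unit

X = rng.normal(size=(6, 2))   # batch of 6 rows, 2 features

h1 = relu(X @ W1 + b1)        # shape (6, 4)
h2 = relu(h1 @ W2 + b2)       # shape (6, 3)
y = h2 @ W3 + b3              # shape (6, 1), no activation

print(h1.shape, h2.shape, y.shape)
```

Each layer's output becomes the next layer's input, which is the "first to last" ordering that `Sequential` expresses.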

In [4]:
# Another way to attach activations: give each one its own Activation layer
model = keras.Sequential([
    layers.Dense(units=32, input_shape=[8]),
    layers.Activation('relu'),
    layers.Dense(32),
    layers.Activation('relu'),
    layers.Dense(1),
])
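A quick way to sanity-check a stack like this is to count its parameters by hand. The helper `dense_params` below is a hypothetical name; it assumes each dense layer has one weight per input-unit pair plus one bias per unit. The separate activation layers apply a fixed function and add no parameters, so they do not appear in the count.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases for one dense layer."""
    return n_inputs * n_units + n_units

# Layer sizes from the model above: 8 inputs -> 32 -> 32 -> 1.
sizes = [8, 32, 32, 1]
per_layer = [dense_params(i, o) for i, o in zip(sizes[:-1], sizes[1:])]

print(per_layer)       # [288, 1056, 33]
print(sum(per_layer))  # 1377
```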

# STOCHASTIC GRADIENT DESCENT
