<a href="https://colab.research.google.com/github/marinarhianna/python-tutorials/blob/main/AI_Introduction.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Intro to Artificial Intelligence 👾


---


The purpose of this notebook is to introduce you to the **code** behind a neural network.


*   This notebook gives you a full explanation of what the code is doing, and then shows you a cell to try running.
*   Read **carefully** through the description of what each cell is doing **before** you run it.
*   Read through the code and the **comments**.
*   Run the cell.


Please ask for clarification on anything that is confusing!!





## Step 1: Import Libraries 📖


---


The Python library `torch` is what we will use to build our neural network.

Within `torch`, we need to import its *neural network* package and its *optimizer* package.



In [None]:
# import the machine learning libraries and give the longer ones a nickname
import torch
import torch.nn as nn   # import the neural network package from torch
import torch.optim as optim  # import the optimizer package from torch

# import other useful packages for maths and plotting
import numpy as np
import matplotlib.pyplot as plt

## Step 2: Data 💽


---


Our neural network needs **data** to be **trained** and **tested** on.

In this example, our dataset will just contain groups of numbers that are either "big" or "small". The aim of this mini-neural-net will be to classify each group into one of 2 classes: big or small.


For our **training** dataset:

*   In this example, it is a dataset containing 4 groups of numbers.
*   We store this within a list: `X_train = [[group 1], [group 2], [group 3], [group 4]]`
*   The example in the cell contains two "small" number groups and two "big" number groups.

For our network to **learn**, the data needs to be **labelled**:

*   We store these labels within another list: `y_train = [label 1, label 2, ...]` where our labels go in the same order as what they are describing in `X_train`.
*   We don't use words for labels, we use digits 0, 1, ...
*   It is very important to make note of what digits we associate with each label in the comments.

In [None]:
# create our training set, keep an eye on the brackets here!
X_train = [[1, 2, 3, 4], [20, 30, 40, 50], [2, 4, 6, 8], [15, 30, 45, 60]]

# label our training set: 0 = small numbers, 1 = big numbers
y_train = [0, 1, 0, 1]  # ie. small, big, small, big
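
If you want to double-check that each label lines up with the right group, here is a small sketch you could try. The dictionary `label_names` is a made-up helper for this example only, it is not something `torch` needs.

```python
# Same training data and labels as in the cell above
X_train = [[1, 2, 3, 4], [20, 30, 40, 50], [2, 4, 6, 8], [15, 30, 45, 60]]
y_train = [0, 1, 0, 1]

# label_names is a hypothetical helper, used here only to print words instead of digits
label_names = {0: "small", 1: "big"}

# zip pairs the first group with the first label, the second with the second, and so on
pairs = list(zip(X_train, y_train))

for group, label in pairs:
    print(group, "is labelled", label, "which means", label_names[label])
```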

Now, we need to create a dataset to test on.

A testing dataset is usually smaller than a training dataset.

*   We store the testing set within another list: `X_test = [[group 1], [group 2]]`

We also need to label our testing set using the same digit associations we used for the training set.
*   We store these labels within a list: `y_test = [label 1, label 2]`

In [None]:
# create our testing dataset
X_test = [[3, 6, 9, 12], [20, 40, 60, 80]]

# label our testing dataset
y_test = [0, 1]

In [None]:
# CHALLENGE: Edit this cell to print out a statement telling us which is the first and last group in the training set.

###########
# REPLACE #
###########

# HINT: Remember indexing from the previous session!

In [None]:
#@title Example Solution
print("The first group in the training set is", X_train[0], "and the last group in the training set is", X_train[3])
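
Another way to reach the last group, sketched below, is to use the index `-1`. This is standard Python list indexing, and it still works if the training set later grows beyond 4 groups. The names `first_group` and `last_group` are just chosen for this example.

```python
# Same training data as earlier in the notebook
X_train = [[1, 2, 3, 4], [20, 30, 40, 50], [2, 4, 6, 8], [15, 30, 45, 60]]

first_group = X_train[0]   # index 0 is always the first item
last_group = X_train[-1]   # index -1 is always the last item, however long the list is

print("The first group is", first_group, "and the last group is", last_group)
```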

Finally, we need to get our data into the correct **format** for the neural network to be able to understand it.

`torch` can be picky about what form of data is inputted.

It doesn't like lists, but it does like something called a `tensor`.

A `tensor` is a bit like a *vector* or a table of numbers: it is a single object that can hold numbers arranged in **multiple dimensions**.

To convert our lists into tensors, we need to:


1.   Create a new variable for our tensor objects.
2.   Call the `torch` library and apply its inbuilt function `.tensor(...)`

Recall from last week that a function requires certain information to be inputted to its brackets.

The `.tensor` function requires:


*   Name of the list we want to convert into a tensor
*   The type of data, aka `dtype`

There are different types of data, and we need to let `torch` know which ones to convert our lists into.

*   **Whole numbers** aka integers aka NO decimal places:
    * Suitable data type for *labels*, which we use for classification of data.
    * To tell `torch` about this data type, we say `dtype=torch.long`

*   **Decimals** aka NON-whole numbers:
    * Referred to as a "float"
    * Suitable data type for actual *contents of datasets*.
    * To tell torch about this data type, we say `dtype=torch.float32`
    * If you're curious, 32 means that each number is stored using 32 bits of computer memory.
    * What this does is convert `[1, 2, 3]` into `[1.0, 2.0, 3.0]`
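
To see the two data types side by side, here is a minimal sketch using a short made-up list. The variable names `as_float` and `as_long` are just for this example.

```python
import torch

# The same whole numbers stored with two different data types
as_float = torch.tensor([1, 2, 3], dtype=torch.float32)
as_long = torch.tensor([1, 2, 3], dtype=torch.long)

print(as_float)        # the numbers are now stored as decimals
print(as_float.dtype)  # torch.float32
print(as_long.dtype)   # torch.int64 (torch.long is another name for this type)
```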

Finally, `torch` needs to know the exact **shape** of the input data.

At the moment, our `X_train` dataset has the shape of `(4, 4)`.
* There are 4 groups.
* Each group has 4 numbers inside it.

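We need to add in a third dimension, called the "channel" dimension, telling `torch` how many **features** our data has.

Our dataset only has one feature: the size of the numbers. So, our channel dimension would just be 1, which gives our data the shape `(4, 1, 4)`. To tell this to `torch`:
* Add to the end `.unsqueeze(1)`, which inserts a new dimension of size 1 at position 1 (counting from 0). Note that the 1 in the brackets is the *position* of the new dimension, not the number of features.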
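
A quick way to check what `.unsqueeze(1)` does is to print the shape before and after. This is only a sketch, and the names `before` and `after` are chosen for this example.

```python
import torch

# Same training data as earlier in the notebook
X_train = [[1, 2, 3, 4], [20, 30, 40, 50], [2, 4, 6, 8], [15, 30, 45, 60]]

before = torch.tensor(X_train, dtype=torch.float32)
after = before.unsqueeze(1)  # insert a new dimension of size 1 at position 1

print(before.shape)  # torch.Size([4, 4])
print(after.shape)   # torch.Size([4, 1, 4])
```

Reading the new shape from left to right: 4 groups, 1 channel, 4 numbers per group.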

Discussion: How many features would a RGB colour image have?

In [None]:
# convert the training dataset into a tensor
X_train_tensor = torch.tensor(X_train, dtype=torch.float32).unsqueeze(1)

# convert the training labels into a tensor
y_train_tensor = torch.tensor(y_train, dtype=torch.long)

In [None]:
# CHALLENGE: Edit this cell to convert the testing dataset and testing labels into tensor objects

# convert testing dataset into a tensor
# ... replace ...

# convert testing labels into a tensor
# ... replace ...

In [None]:
#@title See Solution

# convert testing dataset into a tensor
X_test_tensor = torch.tensor(X_test, dtype=torch.float32).unsqueeze(1)

# convert testing labels into a tensor
y_test_tensor = torch.tensor(y_test, dtype=torch.long)

## Step 3: Design and Build our Neural Network 🤖


---

* A `class` is a bit like a blueprint for our CNN, that we give a name. In this example, I have named it `MyNetwork`, but we can name it whatever we want.

* To set up a `class`, we need to build on `Module` from `torch.nn`, which is how we let Python know that we are about to build a neural network. Remember that when we imported `torch.nn`, we gave it the nickname `nn`.

So it would look like:



```
class MyNetwork(nn.Module):
```



* Next, we need to define a `function` inside our `class` in which we can design the layers of our network.
    * Python has a special inbuilt function to get us started with this, that we call using

    `def __init__(self):`
    
    where `init` is short for *initialise*, and `self` just means *this specific object*, ie. this neural network. Don't worry about the underscores, they are just there to let Python know that we are initialising our network using its special inbuilt function.

    * To start setting up our layers, we need to say `super(MyNetwork, self).__init__()` within the above function. `super` is how we tell Python to first run the standard set-up that every `torch` neural network needs, before we add our own layers.
    * Don't worry at all if this is confusing. The initial set-up of the neural network can feel like a totally different language, but things will become closer to normal English again soon!

Now we can start defining the layers of our network. In this simple case, let's start with just one convolutional layer and one fully connected layer. Our convolutional layer acts as our pattern-detector, and the fully-connected layer acts as our decision-maker.

To set up the convolutional layer:
* We define a variable `self.conv1` which just means *convolutional layer number 1*.
* If we were to add two convolutional layers, we would name the second one `self.conv2`, and so on...
* We define this variable as `self.conv1 = nn.Conv1d(...)`, which is how we get `torch` to create an actual 1D convolutional layer in our neural network by calling `torch.nn.Conv1d` (again, remember our nickname when we imported it).

* Within the brackets of `.Conv1d`, we need to define:
    * Number of inputs, also called input channels (in this case = 1, because there is one row of numbers per group)
    * Number of outputs, also called output channels (in this case = 4, so the layer learns four patterns)
    * Kernel size (example = 2, looks at numbers 2 at a time)

Recall that the kernel is the window that we slide across the input to perform the convolution and learn features bit by bit.
* The kernel size determines how the dimensions of our data change as it moves through the network.
* If we have four numbers: [1, 2, 3, 4], and the kernel size is 2 (ie looks at numbers 2 at a time), what happens is:
    * First step: [1, 2]
    * Second step: [2, 3]
    * Third step: [3, 4]

* This means once a convolution has been performed by the kernel, ***our sequence size shrinks from 4 to 3.***
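
The sliding steps can be sketched in plain Python. This only shows *which* numbers the kernel looks at on each step, not the calculation the layer does with them, and the variable names are chosen for this example.

```python
numbers = [1, 2, 3, 4]
kernel_size = 2

# Collect every window the kernel sees as it slides along one step at a time
windows = [numbers[i:i + kernel_size] for i in range(len(numbers) - kernel_size + 1)]

print(windows)       # [[1, 2], [2, 3], [3, 4]]
print(len(windows))  # 3
```

In general, with this kind of sliding, the new sequence size is the old size minus the kernel size plus 1 (here 4 - 2 + 1 = 3).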

To set up our fully-connected layer:
* Define a variable `self.fc1` which just means *fully-connected layer number 1*.
* Define this variable as `self.fc1 = nn.Linear(...)` which is how we tell `torch` that this layer in our NN is just made of neurons and no convolutions.
* The contents of our fully-connected layer depend on the outputs of the layer before:
    * For each group, the convolutional layer produces four rows of numbers (one row for each of its 4 outputs).
    * After the convolution, there are 3 entries within each row (due to kernel size).
* This means for our fully-connected layer:
    * Number of inputs = 4 * 3
    * Number of outputs = 2 (our classification options: big or small)
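
If you would rather check the `4 * 3` than take it on trust, one option is to pass a dummy group through a convolutional layer and print the shape that comes out. This is a sketch: `one_group` is a made-up tensor of zeros, used only to look at shapes.

```python
import torch
import torch.nn as nn

conv = nn.Conv1d(1, 4, 2)         # 1 input, 4 outputs, kernel size 2
one_group = torch.zeros(1, 1, 4)  # 1 group, 1 channel, 4 numbers (dummy values)

out = conv(one_group)

print(out.shape)                    # torch.Size([1, 4, 3])
print(out.shape[1] * out.shape[2])  # 12
```

The 4 and the 3 in the output shape are exactly the two numbers we multiply together for the fully-connected layer.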


So, putting the above all together, it would look like:



```
    def __init__(self):
        super(MyNetwork, self).__init__()
        self.conv1 = nn.Conv1d(1, 4, 2)
        self.fc1 = nn.Linear(4 * 3, 2)
```



Next, we define another function within the class which has the purpose of *passing the information forward through the network*.
* We define a function `def forward(self, x):`
* The input `self` is how we tell Python that we are referring to the above layers.
* The input `x` is how we describe the data that we will pass through the network.

Within the function, we update `x` by passing it through each layer we named in our above function, in the order we want the data to travel.

This is how we tell Python to pass our data, which we have named `x`, through each layer in the network.

We also include something called a Rectified Linear Unit (ReLU) **Activation Function**, which allows the network to learn non-linear patterns:
* `torch.relu`
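
To get a feel for what ReLU does, here is a small sketch with made-up values: negative numbers become 0 and everything else is left alone.

```python
import torch

values = torch.tensor([-2.0, -0.5, 0.0, 3.0])
activated = torch.relu(values)

print(activated)  # tensor([0., 0., 0., 3.])
```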

Recall that between convolutional layers and fully connected layers we need to **flatten** our data. This essentially is a re-shaping which we do by telling Python to *view* the data, `x`, in a specified way:
* `x.view(x.size(0), -1)`
    * `size(0)` keeps the number of groups constant.
    * `-1` tells `torch` to put everything else into one row and work out its length for us (here 4 * 3 = 12).
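
Here is a minimal sketch of the flattening step on its own, using a dummy tensor of zeros with the same shape our data has after the convolutional layer.

```python
import torch

x = torch.zeros(4, 4, 3)      # 4 groups, 4 rows per group, 3 entries per row (dummy values)
flat = x.view(x.size(0), -1)  # keep the 4 groups, flatten the rest into one row each

print(x.shape)     # torch.Size([4, 4, 3])
print(flat.shape)  # torch.Size([4, 12])
```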

So it would look like:


```
    def forward(self, x):
        x = self.conv1(x)
        x = torch.relu(x)
        x = x.view(x.size(0), -1)
        x = self.fc1(x)
        return x
```

In [None]:
# CHALLENGE: Edit this cell to piece together the above chunks of code to set up a neural network, adding your own comments to each line.

###########
# REPLACE #
###########

# Hint: remember to indent your functions properly within the class!

In [None]:
#@title See Solution:

# set up neural network and give it a name
class MyNetwork(nn.Module):

    # define a function for creating layers
    def __init__(self):

        # run the standard set-up from nn.Module before we add our own layers
        super(MyNetwork, self).__init__()

        # define convolutional layer: (input, output, kernel size)
        self.conv1 = nn.Conv1d(1, 4, 2)

        # define fully connected layer: (input, output)
        self.fc1 = nn.Linear(4 * 3, 2)

    # define function for passing info forward through network
    def forward(self, x):

        # pass data x into convolutional layer
        x = self.conv1(x)

        # apply activation function
        x = torch.relu(x)

        # flatten the data
        x = x.view(x.size(0), -1)

        # pass data x through fully-connected layer
        x = self.fc1(x)

        # return data x after it has gone through the layers
        return x
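
Once the class is written, a quick sanity check is to create one network from the blueprint and pass the training data through it. The sketch below repeats the class in a shorter form so that it runs on its own. The names `model` and `scores` are chosen for this example, and because the network has not been trained yet, the numbers it produces are not meaningful. Only the shape is worth looking at.

```python
import torch
import torch.nn as nn

# Same network as the solution above, written more compactly
class MyNetwork(nn.Module):
    def __init__(self):
        super(MyNetwork, self).__init__()
        self.conv1 = nn.Conv1d(1, 4, 2)
        self.fc1 = nn.Linear(4 * 3, 2)

    def forward(self, x):
        x = torch.relu(self.conv1(x))  # convolution, then activation
        x = x.view(x.size(0), -1)      # flatten
        return self.fc1(x)             # fully-connected layer

# Same training data and conversion as earlier in the notebook
X_train = [[1, 2, 3, 4], [20, 30, 40, 50], [2, 4, 6, 8], [15, 30, 45, 60]]
X_train_tensor = torch.tensor(X_train, dtype=torch.float32).unsqueeze(1)

model = MyNetwork()             # build one network from the blueprint
scores = model(X_train_tensor)  # calling the model runs forward() for us

print(scores.shape)  # torch.Size([4, 2])
```

The output has one row per group and two numbers per row, one for each of our classes (small and big).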