# 1. Introduction to PyTorch, a Deep Learning Library

Self-driving cars, smartphones, search engines... Deep learning is now everywhere. Before you begin building complex models, you will become familiar with PyTorch, a deep learning framework. You will learn how to manipulate tensors, create PyTorch data structures, and build your first neural network in PyTorch.

### Preparing the environment

In [1]:
import expectexception

## 1.1 Introduction to deep learning with PyTorch

### Importing PyTorch and related packages

In [2]:
import torch
import torch.nn as nn

In [3]:
import numpy as np

### Tensors: the building blocks of networks in PyTorch

In [4]:
# Load from list
lst = [[1, 2, 3], [4, 5, 6]]
tensor = torch.tensor(lst)
tensor

tensor([[1, 2, 3],
        [4, 5, 6]])

In [5]:
# Load from NumPy array
lst = [[1, 2, 3], [4, 5, 6]]
np_array = np.array(lst)
np_tensor = torch.from_numpy(np_array)
np_tensor

tensor([[1, 2, 3],
        [4, 5, 6]], dtype=torch.int32)

### Tensor attributes

In [6]:
# Tensor shape
lst = [[1, 2, 3], [4, 5, 6]]
tensor = torch.tensor(lst)
tensor.shape

torch.Size([2, 3])

In [7]:
# Tensor data type
tensor.dtype

torch.int64

In [8]:
# Tensor device
tensor.device

device(type='cpu')

### Getting started with tensor operations

In [9]:
# Compatible shapes
a = torch.tensor([[1, 1], [2, 2]])
b = torch.tensor([[2, 2], [3, 3]])

# Addition / subtraction
a + b

tensor([[3, 3],
        [5, 5]])

In [10]:
%%expect_exception RuntimeError

# Incompatible shapes
a = torch.tensor([[1, 1], [2, 2]])
c = torch.tensor([[2, 2, 4], [3, 3, 5]])

# Addition / subtraction
a + c

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[10], line 6
      3 c = torch.tensor([[2, 2, 4], [3, 3, 5]])
      5 # Addition / subtraction
----> 6 a + c

RuntimeError: The size of tensor a (2) must match the size of tensor b (3) at non-singleton dimension 1


In [11]:
# Element-wise multiplication
a = torch.tensor([[1, 1], [2, 2]])
b = torch.tensor([[2, 2], [3, 3]])
a * b

tensor([[2, 2],
        [6, 6]])

In [12]:
# Transposition
a = torch.tensor([[4, 1], [5, 3], [2, 1]])
print(a)

torch.transpose(a, 0, 1)

tensor([[4, 1],
        [5, 3],
        [2, 1]])


tensor([[4, 5, 2],
        [1, 3, 1]])

In [13]:
# Matrix multiplication
a = torch.tensor([[1, 2],
                  [3, 4]])
b = torch.tensor([[2, 3],
                  [4, 5]])

# [[1*2 + 2*4, 1*3 + 2*5],
#  [3*2 + 4*4, 3*3 + 4*5]]
torch.matmul(a, b)

tensor([[10, 13],
        [22, 29]])

In [14]:
# Difference with element-wise multiplication
# [[1*2, 2*3],
#  [3*4, 4*5]]
a * b

tensor([[ 2,  6],
        [12, 20]])

In [15]:
# Concatenation
a = torch.tensor([[1, 2],
                  [3, 4]])
b = torch.tensor([[2, 3],
                  [4, 5]])

torch.cat([a, b], dim=0)

tensor([[1, 2],
        [3, 4],
        [2, 3],
        [4, 5]])

In [16]:
torch.cat([a, b], dim=1)

tensor([[1, 2, 2, 3],
        [3, 4, 4, 5]])

### Ex.1 - Creating tensors and accessing attributes

Tensors are the primary data structure in PyTorch and will be the building blocks for our deep learning models. They share many similarities with NumPy arrays but have some unique attributes too.

In this exercise, you'll practice creating a tensor from a Python list and displaying some of its attributes.

**Instructions**
1. Begin by importing PyTorch. (Already done!)
2. Create a tensor from the Python list `list_a`.
3. Display the tensor device.
4. Display the tensor data type.

In [17]:
list_a = [1, 2, 3, 4]

# Create a tensor from list_a
tensor_a = torch.tensor(list_a)
print(tensor_a)

# Display the tensor device
print(tensor_a.device)

# Display the tensor data type
print(tensor_a.dtype)

tensor([1, 2, 3, 4])
cpu
torch.int64


### Ex.2 - Creating tensors from NumPy arrays
Tensors are the fundamental data structure of PyTorch. You can create complex deep learning algorithms by learning how to manipulate them.

**Instructions**

1. Create two tensors, `tensor_a` and `tensor_b`, from the NumPy arrays `array_a` and `array_b`, respectively.
2. Subtract `tensor_b` from `tensor_a` and assign the result to `tensor_c`.
3. Perform an element-wise multiplication of `tensor_a` and `tensor_b` and assign the result to `tensor_d`.
4. Add the two resulting tensors, `tensor_c` and `tensor_d`, and assign the sum to `tensor_e`.

In [18]:
array_a = np.array([[1, 1, 1],
                    [2, 3, 4],
                    [4, 5, 6]])

array_b = np.array([[7, 5, 4],
                    [2, 2, 8],
                    [6, 3, 8]])

In [19]:
# Create two tensors from the arrays
tensor_a = torch.tensor(array_a)
tensor_b = torch.tensor(array_b)

# Subtract tensor_b from tensor_a 
tensor_c = tensor_a - tensor_b

# Multiply tensor_a and tensor_b element-wise
tensor_d = tensor_a * tensor_b

# Add tensor_c to tensor_d
tensor_e = tensor_c + tensor_d
print(tensor_e)

tensor([[ 1,  1,  1],
        [ 4,  7, 28],
        [22, 17, 46]], dtype=torch.int32)


## 1.2 Creating our first neural network

### Our first neural network

In [20]:
# Create input_tensor with three features
input_tensor = torch.tensor([[0.3471, 0.4547, -0.2356]])
input_tensor

tensor([[ 0.3471,  0.4547, -0.2356]])

In [21]:
# A linear layer takes an input, applies a linear function, and returns the output
# Define our first linear layer
linear_layer = nn.Linear(in_features=3, out_features=2)
linear_layer

Linear(in_features=3, out_features=2, bias=True)

In [22]:
# Pass input through linear layer
output = linear_layer(input_tensor)
print(output)

tensor([[-0.3043, -0.3788]], grad_fn=<AddmmBackward0>)


### Getting to know the linear layer operation

In [23]:
# Each linear layer has a .weight
linear_layer.weight

Parameter containing:
tensor([[-0.2864,  0.0919,  0.0344],
        [-0.3922, -0.4002, -0.4589]], requires_grad=True)

In [24]:
# and .bias property
linear_layer.bias

Parameter containing:
tensor([-0.2386, -0.1688], requires_grad=True)
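
The weight and bias printed above are enough to reproduce the layer's output by hand. The following is a minimal sketch, assuming the rounded values shown in the cells above (the variable names `weight`, `bias`, `manual_output`, and `layer` are introduced here for illustration only). It shows that a linear layer multiplies the input by the transposed weight matrix and adds the bias.

```python
import torch
import torch.nn as nn

# Values copied (rounded to 4 decimals) from the printed weight and bias above
weight = torch.tensor([[-0.2864, 0.0919, 0.0344],
                       [-0.3922, -0.4002, -0.4589]])
bias = torch.tensor([-0.2386, -0.1688])
input_tensor = torch.tensor([[0.3471, 0.4547, -0.2356]])

# A linear layer computes: output = input @ weight.T + bias
manual_output = input_tensor @ weight.T + bias
print(manual_output)  # approximately tensor([[-0.3043, -0.3788]])

# Load the same values into a fresh layer to confirm it performs the same operation
layer = nn.Linear(in_features=3, out_features=2)
with torch.no_grad():
    layer.weight.copy_(weight)
    layer.bias.copy_(bias)
print(torch.allclose(layer(input_tensor), manual_output, atol=1e-6))
```

Note that the weight has shape `(out_features, in_features)`, which is why it is transposed before the multiplication.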

### Stacking layers with nn.Sequential()

In [25]:
model = nn.Sequential(
    nn.Linear(10, 18),
    nn.Linear(18, 20),
    nn.Linear(20, 5)
)
model

Sequential(
  (0): Linear(in_features=10, out_features=18, bias=True)
  (1): Linear(in_features=18, out_features=20, bias=True)
  (2): Linear(in_features=20, out_features=5, bias=True)
)

In [26]:
input_tensor = torch.tensor([[-0.0014, 0.4038, 1.0305, 0.7521, 0.7489, -0.3968, 0.0113, -1.3844, 0.8705, -0.9743]])
input_tensor.size(), input_tensor.shape

(torch.Size([1, 10]), torch.Size([1, 10]))

In [27]:
output_tensor = model(input_tensor)
print(output_tensor)

tensor([[ 0.1823, -0.3461,  0.5391,  0.2288, -0.3502]],
       grad_fn=<AddmmBackward0>)


### Ex.3 - Your first neural network

In this exercise, you will implement a small neural network containing two linear layers. The first layer takes an eight-dimensional input, and the last layer outputs a one-dimensional tensor.

The `torch` package and the `torch.nn` package have already been imported for you.

**Instructions**

1. Create a neural network of two linear layers that takes a tensor of dimensions $ 1 \times 8 $ as input, representing 8 features, and outputs a tensor of dimensions $ 1 \times 1 $.
2. Use any output dimension for the first layer you want.

In [28]:
input_tensor = torch.Tensor([[2, 3, 6, 7, 9, 3, 2, 1]])

# Implement a small neural network with exactly two linear layers
model = nn.Sequential(
    nn.Linear(8, 12),
    nn.Linear(12, 1)
)

output = model(input_tensor)
print(output)

# Good job on creating your first neural network! In practice, you'll find that modern neural
# networks can contain hundreds of layers and millions of parameters. Recall that the model
# output is not meaningful until the model is trained, i.e. until the weights and biases of
# each layer can meaningfully be used to produce output.

tensor([[-1.9584]], grad_fn=<AddmmBackward0>)
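
To make the remark about parameter counts concrete, here is a small sketch that counts the learnable parameters of the two-layer network above. The hidden size of 12 is the one chosen in the solution; the names `total_parameters` and `per_layer` are introduced here for illustration only.

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(8, 12),
    nn.Linear(12, 1)
)

# Each linear layer holds in_features * out_features weights plus out_features biases
per_layer = [sum(p.numel() for p in layer.parameters()) for layer in model]
total_parameters = sum(p.numel() for p in model.parameters())

print(per_layer)         # [108, 13]: 8*12 + 12 and 12*1 + 1
print(total_parameters)  # 121
```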


## 1.3 Discovering activation functions

### Meet the sigmoid function

In [29]:
input_tensor = torch.tensor([[6.0]])
sigmoid = nn.Sigmoid()
output = sigmoid(input_tensor)
output

tensor([[0.9975]])
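
The value above can be checked against the definition of the sigmoid function, $\sigma(x) = \frac{1}{1 + e^{-x}}$. The sketch below computes it by hand for the same input and compares the result with `nn.Sigmoid()`; the names `manual` and `module_output` are introduced here for illustration only.

```python
import math

import torch
import torch.nn as nn

x = 6.0

# Sigmoid by definition: 1 / (1 + e^(-x))
manual = 1.0 / (1.0 + math.exp(-x))
print(round(manual, 4))  # 0.9975

# Same input passed through the PyTorch module
module_output = nn.Sigmoid()(torch.tensor([[x]]))
print(torch.allclose(module_output, torch.tensor([[manual]])))
```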

### Activation function as the last layer

In [30]:
# Using a sigmoid as the last step of a network of linear layers is equivalent to
# traditional logistic regression.
model = nn.Sequential(
    nn.Linear(6, 4), # First linear layer
    nn.Linear(4, 1), # Second linear layer
    nn.Sigmoid() # Sigmoid activation function
)
model

Sequential(
  (0): Linear(in_features=6, out_features=4, bias=True)
  (1): Linear(in_features=4, out_features=1, bias=True)
  (2): Sigmoid()
)
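
One way to see why this model behaves like logistic regression: with no activation between them, two linear layers compose into a single linear map. The sketch below is an illustration under that observation, not part of the original course material; the seed, the random input `x`, and the names `combined_weight` and `combined_bias` are assumptions introduced here.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the random weights and input are reproducible

model = nn.Sequential(
    nn.Linear(6, 4),
    nn.Linear(4, 1),
    nn.Sigmoid()
)
first, second = model[0], model[1]

with torch.no_grad():
    # second(first(x)) = x @ (W2 @ W1).T + (W2 @ b1 + b2)
    combined_weight = second.weight @ first.weight             # shape (1, 6)
    combined_bias = second.weight @ first.bias + second.bias   # shape (1,)

    x = torch.randn(5, 6)  # five hypothetical samples with six features each
    collapsed = torch.sigmoid(x @ combined_weight.T + combined_bias)
    original = model(x)

# Both routes give the same probabilities, up to floating-point rounding
print(torch.allclose(collapsed, original, atol=1e-6))
```

The collapsed form, a single weighted sum followed by a sigmoid, is exactly the shape of a logistic regression model.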

### Getting acquainted with softmax

- Outputs a probability distribution:
    - each element is a probability (it's bounded between 0 and 1)
    - the sum of the output vector is equal to 1
- `dim = -1` indicates softmax is applied to the input tensor's last dimension
- The softmax function outputs a tensor of the same dimension as its input.

In [31]:
# Create an input tensor
input_tensor = torch.tensor([[4.3, 6.1, 2.3]])

# Apply softmax along the last dimension
probabilities = nn.Softmax(dim=-1)
output_tensor = probabilities(input_tensor)
print(output_tensor)

tensor([[0.1392, 0.8420, 0.0188]])
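
The properties listed above can be verified directly. This is a minimal sketch that recomputes softmax by hand (exponentiate each element, then divide by the sum along the last dimension) and checks it against `nn.Softmax`; the names `exponentials`, `manual`, and `module_output` are introduced here for illustration only.

```python
import torch
import torch.nn as nn

input_tensor = torch.tensor([[4.3, 6.1, 2.3]])

# Softmax by definition: exp(x_i) / sum_j exp(x_j), along the last dimension
exponentials = torch.exp(input_tensor)
manual = exponentials / exponentials.sum(dim=-1, keepdim=True)
module_output = nn.Softmax(dim=-1)(input_tensor)

print(manual)                              # approximately tensor([[0.1392, 0.8420, 0.0188]])
print(manual.sum(dim=-1))                  # sums to 1, up to floating-point rounding
print(manual.shape == input_tensor.shape)  # True: same shape as the input
print(torch.allclose(manual, module_output))
```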


### Ex.4 - The sigmoid and softmax functions

The sigmoid and softmax functions are two of the most popular activation functions in deep learning. They are both usually used as the last step of a neural network. Sigmoid functions are used for binary classification problems, whereas softmax functions are often used for multi-class classification problems. This exercise will familiarize you with creating and using both functions.

Let's say that you have a neural network that returned the values contained in `input_tensor` as a pre-activation output. You will apply activation functions to this output.

`torch.nn` is already imported as `nn`.

**Instructions**

1. Create a sigmoid function and apply it to `input_tensor` to generate a probability.
2. Create a softmax function and apply it to `input_tensor` to generate a probability distribution.

In [32]:
input_tensor = torch.tensor([[0.8]])

# Create a sigmoid function and apply it on input_tensor
sigmoid = nn.Sigmoid()
probability = sigmoid(input_tensor)
print(probability)

tensor([[0.6900]])


In [33]:
input_tensor = torch.tensor([[1.0, -6.0, 2.5, -0.3, 1.2, 0.8]])

# Create a softmax function and apply it on input_tensor
softmax = nn.Softmax(dim=-1)
probabilities = softmax(input_tensor)
print(probabilities)

tensor([[1.2828e-01, 1.1698e-04, 5.7492e-01, 3.4961e-02, 1.5669e-01, 1.0503e-01]])


-----------------