# Notebook link
Notebook link: https://github.com/MITDeepLearning/introtodeeplearning/blob/master/lab1/PT_Part1_Intro.ipynb

# Creating 1-d Tensors

In [0]:
import torch
import torch.nn as nn



import numpy as np
import matplotlib.pyplot as plt

PyTorch provides an interface for creating and manipulating tensors, which are data structures that you can think of as multi-dimensional arrays. Tensors are represented as n-dimensional arrays of a base numeric datatype such as an integer or a floating-point number; they provide a way to generalize vectors and matrices to higher dimensions. PyTorch provides the ability to perform computation on these tensors, define neural networks, and train them efficiently.

The shape of a PyTorch tensor defines its number of dimensions and the size of each dimension. The `ndim` attribute (or, equivalently, the `dim()` method) of a PyTorch tensor gives the number of dimensions.

In [0]:
# Caution: torch.Tensor(76) does NOT create a 0-d tensor holding 76;
# it allocates a 1-d tensor with 76 uninitialized elements
num = torch.Tensor(76)
print(f'num is {num}, it has {num.dim()} dimensions, and is of shape {num.shape}, and has size: {num.size()}')

num is tensor([-7.0700e-28,  4.5615e-41, -7.0700e-28,  4.5615e-41,  5.3249e-43,
         0.0000e+00,  4.8231e+28, -5.5506e-02,  1.3868e-38,  5.2128e-43,
         0.0000e+00,  0.0000e+00,  6.0979e-38,  6.2815e-38,  0.0000e+00,
         0.0000e+00, -1.9629e-13,  8.4638e-43,  1.8588e-37,  4.2981e-38,
         0.0000e+00,  0.0000e+00,  4.6290e-38,  3.5032e-44,  0.0000e+00,
         6.9060e-37,  0.0000e+00,  4.1905e-38,  1.4235e-38,  1.8515e-37,
         5.0727e-43,  0.0000e+00,  1.4235e-38,  5.1539e-35,  2.2964e-39,
         0.0000e+00,  0.0000e+00,  2.6802e-36,  7.4359e-37,  1.9562e-42,
         0.0000e+00,  0.0000e+00,  6.0984e-38,  6.2815e-38,  0.0000e+00,
         0.0000e+00,  1.8514e-37,  1.2247e-42,  0.0000e+00,  3.2209e-36,
         2.2964e-39,  0.0000e+00,  0.0000e+00,  1.2261e-42,  8.3961e-16,
         4.6286e-38,  1.0721e-35,  6.8769e-37,  0.0000e+00,  0.0000e+00,
         4.2882e-35,  4.6288e-38,  3.5032e-44,  0.0000e+00,  1.4235e-38,
         2.3625e-34,  1.1473e-35,  0.0000e+0

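Note that `torch.Tensor(76)` above returns a 1-d tensor of 76 uninitialized values (whatever happened to be in memory), not a 0-d tensor holding 76. The capital T is the reason: `torch.Tensor` is the tensor class constructor and interprets an integer argument as a size. See below for the use of `torch.tensor`, which interprets its argument as data.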
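To make the distinction concrete, here is a minimal sketch comparing the two constructors side by side. It assumes the default dtype has not been changed; `torch.empty` is included only because it is the explicit way to request uninitialized storage.

```python
import torch

# torch.Tensor(n) treats an integer argument as a SIZE:
# a 1-d tensor with n uninitialized elements.
uninitialized = torch.Tensor(3)

# torch.empty(n) asks for the same thing explicitly.
also_uninitialized = torch.empty(3)

# torch.tensor(n) treats the argument as DATA: a 0-d tensor holding n.
scalar = torch.tensor(3)

print(uninitialized.shape, uninitialized.dtype)
print(also_uninitialized.shape, also_uninitialized.dtype)
print(scalar.shape, scalar.dtype, scalar.item())
```

The values printed for the first two tensors are arbitrary and will differ between runs; only their shapes are meaningful.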

In [0]:
num_1 = torch.tensor(74)
print(f'{num_1} is a torch tensor of data type {num_1.dtype}\n'
      f'num_1 has the following shape: {num_1.shape}\n'
      f'num_1 has the following dimensions: {num_1.ndim}-d\n'
      f'num_1 has the following size: {num_1.size()}')


#print(f'num_1 is {num_1}, it has shape {num_1.shape} \nand is {num_1.dim()}-d and has size: {num_1.size()}')
                    

74 is a torch tensor of data type torch.int64
num_1 has the following shape: torch.Size([])
num_1 has the following dimensions: 0-d
num_1 has the following size: torch.Size([])


In [0]:
num_1 = torch.tensor(24)
print(f'num_1 is {num_1}, it has {num_1.dim()} dimensions, and is of shape {num_1.shape}, and has size: {num_1.size()}')

num_1 is 24, it has 0 dimensions, and is of shape torch.Size([]), and has size: torch.Size([])


In [0]:
# Passing in a list gives a 1-d tensor rather than a 0-d tensor.
# Note: torch.Tensor (capital T) produces float32 values by default, even for integer inputs.
integer = torch.Tensor([1,2,99,42.3])
integer

tensor([ 1.0000,  2.0000, 99.0000, 42.3000])

In [0]:
integer.shape == integer.size()

True

In [0]:
integer.ndim

1

In [0]:
integer.shape

torch.Size([4])

In [0]:
integer.dim()

1

## Using unsqueeze

In [0]:
integer

tensor([ 1.0000,  2.0000, 99.0000, 42.3000])

In [0]:
integer.ndim

1

In [0]:
integer_1 = integer.unsqueeze(0)
integer_1

tensor([[ 1.0000,  2.0000, 99.0000, 42.3000]])

In [0]:
integer_1.ndim

2

In [0]:
integer_1.shape

torch.Size([1, 4])

In [0]:
two_d_tensor = torch.Tensor([[1,8,9],[9,3,2]])
print(f'two_d_tensor:\n{two_d_tensor}\nShape is {two_d_tensor.shape}\nSize is {two_d_tensor.size()}\nDimension is: {two_d_tensor.dim()}')

two_d_tensor:
tensor([[1., 8., 9.],
        [9., 3., 2.]])
Shape is torch.Size([2, 3])
Size is torch.Size([2, 3])
Dimension is: 2


In [0]:
# Use unsqueeze to add a batch dimension: we now have 1 batch containing 2 rows × 3 columns.
two_d_tensor = two_d_tensor.unsqueeze(0)
print(f'two_d_tensor:\n{two_d_tensor}\nShape is {two_d_tensor.shape}\nSize is {two_d_tensor.size()}\nDimension is: {two_d_tensor.dim()}')

two_d_tensor:
tensor([[[1., 8., 9.],
         [9., 3., 2.]]])
Shape is torch.Size([1, 2, 3])
Size is torch.Size([1, 2, 3])
Dimension is: 3


The `unsqueeze` function adds a dimension of size 1 at the index we choose. For example, `.unsqueeze(0)` adds a dimension at index 0 (i.e., right at the beginning). In PyTorch, the dimension at index 0 is conventionally treated as the batch dimension.
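As a small illustration of how the chosen index changes the result, the sketch below unsqueezes the same 1-d tensor at different positions and then undoes one of them with `squeeze`. The variable names are only for this example.

```python
import torch

v = torch.tensor([1.0, 2.0, 99.0, 42.3])  # shape (4,)

row = v.unsqueeze(0)    # new axis in front            -> shape (1, 4)
col = v.unsqueeze(1)    # new axis after the data axis -> shape (4, 1)
last = v.unsqueeze(-1)  # negative indices count from the end -> shape (4, 1)

# squeeze(dim) removes a size-1 axis again
restored = row.squeeze(0)

print(row.shape, col.shape, last.shape, restored.shape)
```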

In [0]:
integer = torch.tensor(1234)
decimal = torch.tensor(3.14159265359)

print(f"`integer` is a {integer.ndim}-d Tensor: {integer}")
print(f"`decimal` is a {decimal.ndim}-d Tensor: {decimal}")

`integer` is a 0-d Tensor: 1234
`decimal` is a 0-d Tensor: 3.1415927410125732


In [0]:
print(f"integer is {integer}, has the shape of {integer.shape}, and has dimension {integer.dim()}.")
print(f"integer is a {integer.ndim}-d Tensor.")

integer is 1234, has the shape of torch.Size([]), and has dimension 0.
integer is a 0-d Tensor.


Vectors and lists can be used to create 1-d tensors:

In [0]:
# range(100) is a range object (not a list) that yields the 100 integers 0..99
len(range(100))

100

In [0]:
fibonacci = torch.tensor([1, 1, 2, 3, 5, 8])
count_to_100 = torch.tensor(range(100))

print(f"`fibonacci` is a {fibonacci.ndim}-d Tensor with shape: {fibonacci.shape}")
print(f"`count_to_100` is a {count_to_100.ndim}-d Tensor with shape: {count_to_100.shape}")

`fibonacci` is a 1-d Tensor with shape: torch.Size([6])
`count_to_100` is a 1-d Tensor with shape: torch.Size([100])
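For sequences of consecutive integers, PyTorch also offers `torch.arange`, which builds the tensor directly instead of converting a Python `range`. The sketch below checks that the two approaches agree; it is an alternative, not what the original lab uses.

```python
import torch

from_range = torch.tensor(range(100))  # convert a Python range
from_arange = torch.arange(100)        # build the same values natively

print(torch.equal(from_range, from_arange))
print(from_arange.dtype, from_arange.shape)
```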


# Creating 2-dimensional matrices
Next, let’s create 2-d (i.e., matrices) and higher-rank tensors. In image processing and computer vision, we will use 4-d Tensors with dimensions corresponding to batch size, number of color channels, image height, and image width.

In [0]:
### Defining higher-order Tensors ###

'''TODO: Define a 2-d Tensor'''
matrix = torch.tensor([[1,2,3,9],[2,4,9,16]])

assert isinstance(matrix, torch.Tensor), "matrix must be a torch Tensor object"
assert matrix.ndim == 2

'''TODO: Define a 4-d Tensor.'''
# Use torch.zeros to initialize a 4-d Tensor of zeros with size 10 x 3 x 256 x 256.
#   You can think of this as 10 images where each image is RGB 256 x 256.
images = torch.zeros(10,3,256,256)

assert isinstance(images, torch.Tensor), "images must be a torch Tensor object"
assert images.ndim == 4, "images must have 4 dimensions"
assert images.shape == (10, 3, 256, 256), "images is incorrect shape"
print(f"images is a {images.ndim}-d Tensor with shape: {images.shape}")
print(f"matrix is a {matrix.ndim}-d Tensor with shape: {matrix.shape}")
matrix

images is a 4-d Tensor with shape: torch.Size([10, 3, 256, 256])
matrix is a 2-d Tensor with shape: torch.Size([2, 4])


tensor([[ 1,  2,  3,  9],
        [ 2,  4,  9, 16]])

In [0]:
images

tensor([[[[0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          ...,
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.]],

         [[0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          ...,
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.]],

         [[0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          ...,
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.]]],


        [[[0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          ...,
          [0., 0., 0.,  ..., 0., 0., 0.],
        

The shape of a tensor provides the number of elements in each tensor dimension. The shape is quite useful, and we'll use it often. You can also use slicing to access subtensors within a higher-rank tensor:

In [0]:
matrix

tensor([[ 1,  2,  3,  9],
        [ 2,  4,  9, 16]])

In [0]:
row_vector = matrix[1]
print(row_vector)

tensor([ 2,  4,  9, 16])


In [0]:
#getting a single element from the tensor
single_element = matrix[0,2]
print(single_element)

tensor(3)


In [0]:
#getting a column from the tensor
column_vector = matrix[:,2] # i.e., "all rows" of the column at index 2 (the third column)
column_vector 

tensor([3, 9])

In [0]:
#You can also use slicing to access subtensors within a higher-rank tensor:
row_vector = matrix[1] #row number 1
column_vector = matrix[:,1] #all rows, column number 1
matrix_scalar = matrix[0,2] #the scalar at row 0, column 2
print(f"row vector: {row_vector}")
print(f"column vector: {column_vector}")
print(f"matrix scalar: {matrix_scalar}")

row vector: tensor([ 2,  4,  9, 16])
column vector: tensor([2, 4])
matrix scalar: 3
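The same indexing rules extend to higher-rank tensors. The sketch below slices a 4-d tensor shaped like the `images` batch defined earlier; treating channel index 0 as "red" is an assumption made only for illustration.

```python
import torch

# 10 images, 3 channels, 256 x 256 pixels (batch, channels, height, width)
images = torch.zeros(10, 3, 256, 256)

first_image = images[0]               # one image:            (3, 256, 256)
red_channels = images[:, 0]           # channel 0 of each:    (10, 256, 256)
top_left = images[:, :, :128, :128]   # top-left quadrant:    (10, 3, 128, 128)
pixel = images[0, 0, 0, 0]            # a single value:       0-d tensor

print(first_image.shape, red_channels.shape, top_left.shape, pixel.shape)
```

Each integer index removes one dimension, while each slice (`:` or `:128`) keeps it.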


# 1.2 Computations on Tensors

A convenient way to think about and visualize computations in a machine learning framework like PyTorch is in terms of graphs. We can define this graph in terms of tensors, which hold data, and the mathematical operations that act on these tensors in some order. Let's look at a simple example, and define this computation using PyTorch:



![Computational Graph](https://raw.githubusercontent.com/michaellevine00/mit_intro_to_deep_learning/main/images/computational_graph.png)

## Adding images in Databricks
To display the image above in Databricks, I first uploaded the PNG by clicking New > upload data > upload data to DBFS, and then referenced it in a markdown cell: `![Computational Graph](/files/tables/data/computational_graph.png)`.

In [0]:
#Create the nodes in the graph and initialize values 
a = torch.tensor(15)
b = torch.tensor(61)

#add the terms to create new tensors
c1 = torch.add(a,b)
c2 = a + b  # PyTorch overrides the "+" operation so that it is able to act on Tensors
print(f'c1: {c1}. c1 is of type: {c1.type()} and is of size {c1.size()}')
print(f'c2: {c2}. c2 is of type: {c2.type()} and is of size {c2.size()}')

c1: 76. c1 is of type: torch.LongTensor and is of size torch.Size([])
c2: 76. c2 is of type: torch.LongTensor and is of size torch.Size([])
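Addition is not the only overloaded operator. As a minimal sketch, the snippet below checks that `-` and `*` match `torch.sub` and `torch.mul` element-wise, and shows that mixing an integer tensor with a floating-point tensor produces a floating-point result.

```python
import torch

a = torch.tensor([1, 2, 3])
b = torch.tensor([10, 20, 30])

# Operators and their function forms give identical results
print(torch.equal(torch.sub(b, a), b - a))
print(torch.equal(torch.mul(a, b), a * b))

# Integer tensor + floating-point tensor is promoted to a floating-point dtype
mixed = a + torch.tensor(0.5)
print(mixed, mixed.dtype)
```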


A slightly more complicated example:

![math_operation](https://raw.githubusercontent.com/michaellevine00/mit_intro_to_deep_learning/main/images/math_operation.png)

Here, we take two inputs, `a` and `b`, and compute an output `e`. Each node in the graph represents an operation that takes some input, does some computation, and passes its output to another node.

In [0]:
### Defining Tensor Computations ###

# Construct a simple computation function
def func(a, b):
    '''TODO: Define the operation for c, d, e.'''
    c = a + b
    d = b - 1
    e = c * d
    return e

Now, we can call this function to execute the computation graph given some inputs a,b:


In [0]:
a, b = 1.5, 2.5
e_out = func(a,b)
print(f'e_out: {e_out}.')
type(e_out)

e_out: 6.0.


float
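Because `a` and `b` above are plain Python floats, `func` returns a Python `float`. The sketch below passes 0-d tensors instead, so that every step of the graph is carried out by PyTorch and the result is a tensor; `func` is repeated so the snippet runs on its own.

```python
import torch

def func(a, b):
    c = a + b   # first node: addition
    d = b - 1   # second node: subtraction
    e = c * d   # third node: multiplication
    return e

a = torch.tensor(1.5)
b = torch.tensor(2.5)
e_out = func(a, b)

print(f"e_out: {e_out}")
print(type(e_out))
```

The numeric value is the same as before, (1.5 + 2.5) * (2.5 - 1) = 6.0, but it now lives in a `torch.Tensor`.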

## Math notation - Latex
For a linear layer:  
$$ y = \sigma(Wx + b) $$  
where  
- $ W \in \mathbb{R}^{d_{\text{out}} \times d_{\text{in}}} $ is the weight matrix  
- $ x \in \mathbb{R}^{d_{\text{in}}} $ is the input vector  
- $ b \in \mathbb{R}^{d_{\text{out}}} $ is the bias vector  
- $ \sigma(\cdot) $ is the activation function (e.g., ReLU, tanh, etc.)
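To connect the notation to tensor shapes, here is a minimal sketch that evaluates the formula with a sigmoid as the activation. The sizes `d_in = 2` and `d_out = 3` are arbitrary choices for illustration. It also shows the row-vector form `x @ W.T`, which is the layout that arises when inputs are stored as rows of a batch, as in the dense layer later in this notebook.

```python
import torch

d_in, d_out = 2, 3
W = torch.randn(d_out, d_in)  # weight matrix, shape (d_out, d_in)
x = torch.randn(d_in)         # input vector,  shape (d_in,)
b = torch.randn(d_out)        # bias vector,   shape (d_out,)

# Column-vector form, exactly as written in the formula
y = torch.sigmoid(W @ x + b)               # shape (d_out,)

# Row-vector (batched) form: one input per row, weights transposed
x_row = x.unsqueeze(0)                     # shape (1, d_in)
y_row = torch.sigmoid(x_row @ W.T + b)     # shape (1, d_out)

print(y.shape, y_row.shape)
print(torch.allclose(y, y_row.squeeze(0), atol=1e-6))
```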


# 1.3 Neural Networks in PyTorch
We can also define neural networks in PyTorch. PyTorch uses torch.nn.Module, which serves as a base class for all neural network modules in PyTorch and thus provides a framework for building and training neural networks.

Let's consider the example of a simple perceptron defined by just one dense (aka fully-connected or linear) layer: $$y = \sigma(Wx + b)$$ where $W$ represents a matrix of weights, $b$ is a bias, $x$ is the input, $\sigma$ is the activation function, and $y$ is the output.

![neural_network_sigmoid_activation](https://raw.githubusercontent.com/michaellevine00/mit_intro_to_deep_learning/main/images/neural_network_sigmoid_activation.png)

We will use `torch.nn.Module` to define layers -- the building blocks of neural networks. Layers implement common neural network operations. In PyTorch, when we implement a layer, we subclass `nn.Module` and define the parameters of the layer as attributes of our new class. We also override the `forward` function, which defines the forward pass computation that is performed at every step. All classes subclassing `nn.Module` should override the `forward` function.

Let's write a dense layer class to implement the perceptron defined above.



In [0]:
### Defining a dense layer ###

# num_inputs: number of input nodes
# num_outputs: number of output nodes
# x: input to the layer

class OurDenseLayer(torch.nn.Module):
    def __init__(self, num_inputs, num_outputs):
        super(OurDenseLayer, self).__init__()
        # Define and initialize parameters: a weight matrix W and bias b
        # Note that the parameter initialization is random!
        self.W = torch.nn.Parameter(torch.randn(num_inputs, num_outputs))
        self.bias = torch.nn.Parameter(torch.randn(num_outputs))

    def forward(self, x):
        '''TODO: define the operation for z (hint: use torch.matmul).'''
        z = torch.matmul(x,self.W)+self.bias

        '''TODO: define the operation for out (hint: use torch.sigmoid).'''
        y = torch.sigmoid(z)
        return y

In [0]:
rand_num = torch.randn(2,4,3,7)
print(f"shape of rand_num: {rand_num.shape}")
rand_num

shape of rand_num: torch.Size([2, 4, 3, 7])


tensor([[[[ 0.4433, -0.3157, -0.0253,  0.0615,  0.2681, -1.9439, -1.5134],
          [ 0.5198, -1.1448,  0.6331,  0.3692, -0.3164, -1.0006,  0.2223],
          [-0.5676,  2.2107,  0.6146, -0.9115, -0.2115,  0.3972, -0.4228]],

         [[ 1.1905, -0.9105,  1.3880,  0.2813,  0.7428,  0.0605, -0.6952],
          [-0.4152,  0.7824,  0.3718,  1.6486, -0.2518,  0.8194, -2.2059],
          [ 1.1276,  0.4035,  1.1075, -0.6207, -1.7927,  0.3005, -0.0719]],

         [[-0.4334, -0.4440,  0.0081,  0.4319, -0.4454, -0.4123,  1.3321],
          [-1.5695,  0.5450, -1.6057,  0.0911,  1.0655, -0.4435, -0.3389],
          [-0.8114, -2.2463, -0.3694,  0.0245,  0.6396, -0.2188, -1.1950]],

         [[-0.5981, -0.4013,  0.3619,  0.2992,  0.7335, -0.1032, -0.4479],
          [ 0.0927, -1.1998, -0.0579, -0.1270, -1.0262, -1.1647,  1.1661],
          [-1.3425,  0.9142, -0.1464, -0.9816,  2.2514,  0.8341,  1.4661]]],


        [[[ 1.0344, -1.0010,  0.1127, -1.4099, -1.3201, -1.0933,  0.8420],
          [-1.4

In [0]:
torch.randn?

Docstring:
randn(*size, *, generator=None, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False, pin_memory=False) -> Tensor


Returns a tensor filled with random numbers from a normal distribution
with mean `0` and variance `1` (also called the standard normal
distribution).

.. math::
    \text{out}_{i} \sim \mathcal{N}(0, 1)

For complex dtypes, the tensor is i.i.d. sampled from a `complex normal distribution`_ with zero mean and
unit variance as

.. math::
    \text{out}_{i} \sim \mathcal{CN}(0, 1)

This is equivalent to separately sampling the real :math:`(\operatorname{Re})` and imaginary
:math:`(\operatorname{Im})` part of :math:`\text{out}_i` as

.. math::
    \operatorname{Re}(\text{out}_{i}) \sim \mathcal{N}(0, \frac{1}{2}),\quad
    \operatorname{Im}(\text{out}_{i}) \sim \mathcal{N}(0, \frac{1}{2})

The shape of the tensor is defined by the variable argument :attr:`size`.


Args:
    size (int...): a sequence of integers defining the shape 
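Since `torch.randn` draws fresh samples on every call, results differ between runs unless the random state is fixed. The sketch below uses the `generator` argument shown in the signature above with two identically seeded `torch.Generator` objects; the seed value `0` and the sample size are arbitrary choices.

```python
import torch

# Two generators with the same seed produce the same sequence of samples
g1 = torch.Generator().manual_seed(0)
g2 = torch.Generator().manual_seed(0)

s1 = torch.randn(2, 3, generator=g1)
s2 = torch.randn(2, 3, generator=g2)
print(torch.equal(s1, s2))

# With many samples, the mean and variance approach 0 and 1
big = torch.randn(100_000, generator=g1)
print(big.mean().item(), big.var().item())
```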

Now, let's test the output of our layer.



In [0]:
# Define a layer and test the output!
num_inputs = 2
num_outputs = 3
layer = OurDenseLayer(num_inputs, num_outputs)
x_input = torch.tensor([[1, 5.]])
y = layer(x_input)

print(f"input shape: {x_input.shape}")
print(f"output shape: {y.shape}")
print(f"output result: {y}")

input shape: torch.Size([1, 2])
output shape: torch.Size([1, 3])
output result: tensor([[0.0014, 0.6433, 0.0666]], grad_fn=<SigmoidBackward0>)
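The `grad_fn=<SigmoidBackward0>` in the output indicates that PyTorch is tracking the computation for differentiation. As a sketch of what wrapping tensors in `nn.Parameter` provides, the snippet below lists the parameters the module has registered and runs a backward pass; the layer is repeated in condensed form so the example runs on its own, and summing the output is just a convenient way to get a scalar to differentiate.

```python
import torch

class OurDenseLayer(torch.nn.Module):
    def __init__(self, num_inputs, num_outputs):
        super().__init__()
        self.W = torch.nn.Parameter(torch.randn(num_inputs, num_outputs))
        self.bias = torch.nn.Parameter(torch.randn(num_outputs))

    def forward(self, x):
        return torch.sigmoid(torch.matmul(x, self.W) + self.bias)

layer = OurDenseLayer(2, 3)

# Attributes wrapped in nn.Parameter are registered automatically
names = []
for name, param in layer.named_parameters():
    names.append(name)
    print(name, tuple(param.shape), param.requires_grad)

# A backward pass fills in .grad for every registered parameter
y = layer(torch.tensor([[1, 5.]]))
y.sum().backward()
print(layer.W.grad.shape, layer.bias.grad.shape)
```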


Conveniently, PyTorch has defined a number of `nn.Modules` (or Layers) that are commonly used in neural networks, for example an `nn.Linear` or `nn.Sigmoid` module.

Now, instead of using a single `Module` to define our simple neural network, we'll use the `nn.Sequential` module from PyTorch and a single `nn.Linear` layer to define our network. With the `Sequential` API, you can readily create neural networks by stacking together layers like building blocks.

In [0]:
### Defining a neural network using the PyTorch Sequential API ###

# define the number of inputs and outputs
n_input_nodes = 2
n_output_nodes = 3

# Define the model
'''TODO: Use the Sequential API to define a neural network with a
    single linear (dense!) layer, followed by non-linearity to compute z'''
model = nn.Sequential(
    nn.Linear(n_input_nodes, n_output_nodes),
    nn.ReLU()
)

We've defined our model using the Sequential API. Now, we can test it out using an example input:

In [0]:
# Test the model with example input
x_input = torch.tensor([[1, 7.]])
model_output = model(x_input)
print(f"input shape: {x_input.shape}")
print(f"output shape: {model_output.shape}")
print(f"output result: {model_output}")

input shape: torch.Size([1, 2])
output shape: torch.Size([1, 3])
output result: (values vary from run to run, because `nn.Linear` initializes its weights randomly)


# Continue here
With PyTorch, we can create more flexible models by subclassing `nn.Module`. The `nn.Module` class allows us to group layers together flexibly to define new architectures.
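As a minimal sketch of what this flexibility can look like, the class below combines an `nn.Linear` layer with an `nn.Sigmoid` activation and lets the caller switch the activation off at call time, something a plain `nn.Sequential` stack does not offer. The class name `LinearWithOptionalSigmoid` and the `apply_activation` flag are hypothetical names chosen for this example.

```python
import torch
import torch.nn as nn

class LinearWithOptionalSigmoid(nn.Module):  # hypothetical example class
    def __init__(self, num_inputs, num_outputs):
        super().__init__()
        self.linear = nn.Linear(num_inputs, num_outputs)
        self.activation = nn.Sigmoid()

    def forward(self, x, apply_activation=True):
        z = self.linear(x)
        # The flag controls behavior at call time
        return self.activation(z) if apply_activation else z

model = LinearWithOptionalSigmoid(2, 3)
x_input = torch.tensor([[1, 7.]])

activated = model(x_input)                       # sigmoid applied
raw = model(x_input, apply_activation=False)     # linear output only

print(f"activated: {activated}")
print(f"raw: {raw}")
```

The printed values depend on the random initialization of `nn.Linear`, so they will differ between runs.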