
# Coding Lecture 1

## Hello

## Welcome to Google Colab
We are using Google Colab for our class.
For an introduction to using Colab, please refer to Google's official notebook: [https://colab.research.google.com/notebooks/intro.ipynb](https://colab.research.google.com/notebooks/intro.ipynb)

This is a text cell and it uses Markdown syntax.

For example, we can enter inline math with `$ $`: $E = mc^2$, and use `$$   $$` for a displayed equation:
$$
\int^1_0 f(x) dx + \int^1_0 f^{-1}(x) dx  = 1.
$$

We can type Python code as well:
```python
import numpy as np
x = np.array([9,1,1])
print(x)
```

In [None]:
from time import time
print("Welcome to Comp-Sci 5590 / Math 5555.")
print(f"{time():.2f}") # f-string

Welcome to Comp-Sci 5590 / Math 5555.
1694701562.38


In [None]:
import numpy as np
import torch

## Introduction of PyTorch and GPUs

Colab uses an NVIDIA Tesla T4 and Kaggle uses an NVIDIA Tesla P100. Both are powerful GPUs, though slower than newer cards such as the Ampere-generation RTX 3090 and A4000 or the Ada Lovelace RTX 4090.

A GPU instance has a time limit (12h on Colab, 9h on Kaggle). However, Colab's GPU limit is less transparent, as stated in Colab's documentation:
> Colab resources are not guaranteed and not unlimited, and the usage limits sometimes fluctuate. This is necessary for Colab to be able to provide resources for free. For more details, see Resource Limits.

If you want to get into serious machine learning on a budget, my personal recommendation is to build a workstation around an RTX 3060 12GB and learn Linux. If you start working on CV (computer vision) or NLP (natural language processing), an RTX 3090 or RTX 4090 is recommended; either is an ideal single-GPU setup for large-scale deep learning.

At the time of writing this tutorial (Sept 2023), the current stable version of PyTorch is 2.0. A `+cu` suffix at the end of the version string indicates a build with GPU (CUDA) support.

In [None]:
torch.__version__ # cu means cuda

'2.0.1+cu118'

In [None]:
torch.cuda.is_available()

In [None]:
!nvidia-smi
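
A common pattern, sketched here under the assumption that the same notebook may run with or without a GPU, is to pick a device once and create tensors on it. The names `device`, `x`, and `y` are illustrative only.

```python
import torch

# Pick the GPU when one is visible to PyTorch, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create a tensor directly on the chosen device.
x = torch.ones(2, 3, device=device)
y = (x * 2).sum()  # computed on the same device as x

print(device, x.device, y.item())
```

Writing code against `device` rather than hard-coding `"cuda"` keeps it runnable when Colab does not assign a GPU.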

As in every machine learning framework, PyTorch provides stochastic functions, such as random number generators. However, it is good practice to set up your code so that it is reproducible with the exact same random numbers. This is why we set a seed below.

In [None]:
torch.manual_seed(42) # Setting the seed for the CPU and all CUDA devices
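
To see what seeding buys us, here is a minimal sketch that uses `torch.manual_seed`, which seeds the default generator: re-seeding with the same value reproduces the same draw. The variable names are illustrative.

```python
import torch

torch.manual_seed(42)   # seed the random number generator
first = torch.rand(3)

torch.manual_seed(42)   # reset to the same seed
second = torch.rand(3)

third = torch.rand(3)   # continuing without re-seeding gives a new draw

print(torch.equal(first, second))  # the two seeded draws are identical
print(torch.equal(second, third))
```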

### Compare a `torch.tensor` with a `numpy.ndarray`

- Initialization
- Convert one to the other and vice versa
- Common methods (functions) associated with them
- PyTorch has "in-place" operations, marked by an underscore `_` as a suffix of the method name (e.g. `add_`), meaning they modify the underlying tensor instead of returning a new one.


Tensors are the PyTorch equivalent of NumPy arrays, with added support for GPU acceleration of various operations.
The name "tensor" is a generalization of concepts you already know. For instance, a vector is a 1-D tensor, and a matrix a 2-D tensor. When working with neural networks, we will use tensors of various shapes and number of dimensions.

Most common functions you know from NumPy can be used on tensors as well. In fact, since NumPy arrays are so similar to tensors, we can convert most tensors to NumPy arrays (and back), although we don't need to do so very often.
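
The conversion can be sketched as follows. One detail worth knowing is that `torch.from_numpy` shares memory with the array, while `torch.tensor` copies the data. The variable names are illustrative.

```python
import numpy as np
import torch

x_np = np.arange(4)                 # NumPy array: [0 1 2 3]
x_shared = torch.from_numpy(x_np)   # tensor sharing memory with x_np
x_copy = torch.tensor(x_np)         # independent copy of the data

x_np[0] = 100                       # modify the NumPy array in place
print(x_shared)  # sees the change, since memory is shared
print(x_copy)    # unaffected, since it is a copy

back = x_copy.numpy()               # tensor -> NumPy (for CPU tensors)
print(type(back))
```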

In [None]:
np.__version__

#### Initialization

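Let's start by looking at different ways of creating a tensor. There are many options; the simplest are to call `torch.Tensor` or `torch.tensor`. Given a shape, `torch.Tensor` returns an uninitialized `FloatTensor` (single precision); given a list, it builds a `FloatTensor` from the list's entries. `torch.tensor` always takes data rather than a shape and infers the dtype from it: Python integers give a `LongTensor` (64-bit integers), and Python floats give a single-precision `FloatTensor` by default. As a running example, consider the following matrix in MATLAB syntax, followed by its NumPy and PyTorch counterparts: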

```matlab
x = [1 10; 0 1; 10 2]
```

In [None]:
x = np.array([[1, 10], [0,  1], [10, 2]])
print(x)
print(x.dtype)



[[ 1 10]
 [ 0  1]
 [10  2]]
int64


In [None]:
x = torch.Tensor([[1, 10],
                  [0, 1],
                  [10, 2]])
print(x)
print(x.dtype)

tensor([[ 1., 10.],
        [ 0.,  1.],
        [10.,  2.]])
torch.float32
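
The `float32` above comes from `torch.Tensor`, which always uses the default float dtype, whereas `torch.tensor` infers the dtype from the data. A small sketch, assuming the default dtype has not been changed (variable names are illustrative):

```python
import torch

ints = torch.tensor([1, 2, 3])       # dtype inferred from Python ints
floats = torch.tensor([1.0, 2.0])    # dtype inferred from Python floats
legacy = torch.Tensor([1, 2, 3])     # default float dtype, even for ints
doubles = torch.tensor([1, 2, 3], dtype=torch.float64)  # explicit dtype

print(ints.dtype, floats.dtype, legacy.dtype, doubles.dtype)
```

Passing `dtype=` explicitly is the safest choice when the precision matters.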


In [None]:
# x = np.array(range(10))
x = np.arange(10)
print(x)
# from 0 to 10, non-inclusive of the right end

[0 1 2 3 4 5 6 7 8 9]


In [None]:
np.sum(x)

45

In [None]:
dir(x) # all attributes and methods associated with x

['T',
 '__abs__',
 '__add__',
 '__and__',
 '__array__',
 '__array_finalize__',
 '__array_function__',
 '__array_interface__',
 '__array_prepare__',
 '__array_priority__',
 '__array_struct__',
 '__array_ufunc__',
 '__array_wrap__',
 '__bool__',
 '__class__',
 '__class_getitem__',
 '__complex__',
 '__contains__',
 '__copy__',
 '__deepcopy__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__divmod__',
 '__dlpack__',
 '__dlpack_device__',
 '__doc__',
 '__eq__',
 '__float__',
 '__floordiv__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__iand__',
 '__ifloordiv__',
 '__ilshift__',
 '__imatmul__',
 '__imod__',
 '__imul__',
 '__index__',
 '__init__',
 '__init_subclass__',
 '__int__',
 '__invert__',
 '__ior__',
 '__ipow__',
 '__irshift__',
 '__isub__',
 '__iter__',
 '__itruediv__',
 '__ixor__',
 '__le__',
 '__len__',
 '__lshift__',
 '__lt__',
 '__matmul__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__neg__',
 '__new__',
 '__or__',
 '__pos__',

In [None]:
x.sum() # same as np.sum(x)

45

In [None]:
torch.tensor([1, 2, 5]) # a tensor built from a Python list

In [None]:
x_t = torch.tensor(range(10))
print(x_t)

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])


In [None]:
x_t.sum()

tensor(45)

In [None]:
x

In [None]:
x = torch.tensor(x)

In [None]:
x_np = x_t.numpy()
print(type(x_np))

In [None]:
print(type(x)) # without print, only the last expression in a cell is displayed
print(x)

In [None]:
# relu on the NumPy array obtained from x_t
x_np.clip(min=0)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [None]:
np.array([-0.3, -0.1, 2, 4]).clip(min=0)

array([0., 0., 2., 4.])

In [None]:
x_t = torch.Tensor([-0.3, -0.1, 2, 4])
x_t.clamp(min=0)

tensor([0., 0., 2., 4.])

In [None]:
print(x_t.add(1))
print(x_t)

tensor([0.7000, 0.9000, 3.0000, 5.0000])
tensor([-0.3000, -0.1000,  2.0000,  4.0000])


In [None]:
x_t.add(-3).clamp(min=0)

tensor([0., 0., 0., 1.])

In [None]:
x_t.add_(1) # in-place operations

tensor([0.7000, 0.9000, 3.0000, 5.0000])

In [None]:
print(x_t) # similar to matlab disp

tensor([0.7000, 0.9000, 3.0000, 5.0000])


In [None]:
x_t.__repr__

<bound method Tensor.__repr__ of tensor([0.7000, 0.9000, 3.0000, 5.0000])>

When called with a shape, the function `torch.Tensor` allocates memory for the desired tensor but does not initialize it, so the tensor contains whatever values happen to be in that memory. To directly assign values to the tensor during initialization, there are many alternatives, including:

* `torch.zeros`: Creates a tensor filled with zeros
* `torch.ones`: Creates a tensor filled with ones
* `torch.rand`: Creates a tensor with random values uniformly sampled between 0 and 1
* `torch.randn`: Creates a tensor with random values sampled from a normal distribution with mean 0 and variance 1
* `torch.arange`: Creates a tensor containing the values $N, N+k, N+2k, \ldots, N+(m-1)k$ with $m=\lceil (M-N)/k\rceil$, i.e. the values in $[N, M)$ spaced by $k$; the right end $M$ is excluded.
* `torch.Tensor` (input list): Creates a tensor from the list elements you provide
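
The constructors listed above can be tried out in a few lines. This is a minimal sketch with arbitrarily chosen shapes and ranges:

```python
import torch

z = torch.zeros(2, 3)        # 2x3 tensor of zeros
o = torch.ones(2, 3)         # 2x3 tensor of ones
u = torch.rand(2, 3)         # uniform samples in [0, 1)
g = torch.randn(2, 3)        # standard normal samples
r = torch.arange(2, 10, 3)   # start 2, stop 10 (excluded), step 3
s = torch.arange(0, 10, 5)   # the stop value 10 is not included

print(z.shape, o.sum().item())
print(r)  # tensor([2, 5, 8])
print(s)  # tensor([0, 5])
```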


## Linear algebra: Numpy vs PyTorch

### Operations needed to implement, or to modify others' base code for, optimization algorithms for neural networks
- Inner product
- Matrix-vector multiplication
- Element-wise operation
- (Advanced) Einstein summation
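
The four operations in this list can be sketched in PyTorch as follows, using a small floating-point matrix `a` and vector `x` chosen only for illustration:

```python
import torch

a = torch.tensor([[1.0, 0.0], [2.0, 3.0]])
x = torch.tensor([2.0, -1.0])

inner = torch.dot(x, x)                     # inner product: 2*2 + (-1)*(-1)
matvec = a @ x                              # matrix-vector multiplication
elementwise = a * a                         # entry-by-entry product
einsum_mv = torch.einsum("ij,j->i", a, x)   # the same matrix-vector product

print(inner, matvec, elementwise, einsum_mv, sep="\n")
```

In the Einstein summation string `"ij,j->i"`, the repeated index `j` is summed over, which is exactly the definition of a matrix-vector product.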


Most operations that exist in NumPy also exist in PyTorch. A full list of operations can be found in the [PyTorch documentation](https://pytorch.org/docs/stable/tensors.html#), but we will review the most important ones here.

- `add` and `add_`
- `mm`
- `argmax`
- `bincount` (`accumarray` in MATLAB)
- many others.
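
As an illustration of `argmax` and `bincount` working together, here is a sketch with a hypothetical `scores` matrix (one row per sample, one column per class):

```python
import torch

scores = torch.tensor([[0.1, 0.7, 0.2],
                       [0.5, 0.3, 0.2],
                       [0.2, 0.2, 0.6],
                       [0.1, 0.8, 0.1]])

labels = scores.argmax(dim=1)     # index of the largest entry in each row
counts = torch.bincount(labels)   # how many times each index occurs

print(labels)  # tensor([1, 0, 2, 1])
print(counts)  # tensor([1, 2, 1])
```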



In [None]:
# recall numpy's various operations
a = np.array([[1,0], [2,3]])
print(a)

[[1 0]
 [2 3]]


In [None]:
x = np.array([2,-1])
print(x)

[ 2 -1]


In [None]:
# a*x is not the correct way to implement matrix-vector multiplication
a.dot(x) # a times x

array([2, 1])

In [None]:
# pytorch's counterparts
a_t = torch.tensor(a)
x_t = torch.tensor(x)
print(a_t,'\n', x_t)

tensor([[1, 0],
        [2, 3]]) 
 tensor([ 2, -1])


In [None]:
a_t.mm(x_t) # a times x; this fails because mm requires 2-D matrices and x_t is 1-D

RuntimeError: ignored

In [None]:
print(x_t.shape)
print(x_t.reshape(-1,1).shape)
# .reshape(-1,1) returns a new tensor that is a (2, 1) matrix; x_t itself is unchanged
# -1 means that this dimension is inferred from the number of elements

torch.Size([2])
torch.Size([2, 1])


In [None]:
# x_t has to be a tensor of Size(2,1)
a_t.mm(x_t.reshape(-1,1))

tensor([[2],
        [1]])
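
As an alternative to reshaping, `torch.mv` and the `@` operator accept a 1-D vector directly. The sketch below uses floating-point copies of the matrix and vector above for illustration:

```python
import torch

a_t = torch.tensor([[1.0, 0.0], [2.0, 3.0]])
x_t = torch.tensor([2.0, -1.0])         # 1-D, shape (2,)

via_mv = torch.mv(a_t, x_t)             # matrix times 1-D vector
via_matmul = a_t @ x_t                  # @ also accepts a 1-D right operand
via_mm = a_t.mm(x_t.reshape(-1, 1))     # mm needs a (2, 1) matrix

print(via_mv.shape, via_matmul.shape, via_mm.shape)
```

Note the difference in result shapes: `mv` and `@` return a 1-D tensor of size 2, while `mm` returns a (2, 1) matrix.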

In [None]:
# relu example: start from a random matrix
y_t = torch.randn((2,5)) # random matrix of size (2, 5), entries standard normal
print(y_t)

tensor([[-0.4088, -0.2307, -0.6435,  0.2234,  0.4697],
        [-0.2065, -0.3559,  0.3202,  1.0066,  0.0781]])


In [None]:
y_t.clamp(min=0)

tensor([[0.0000, 0.0000, 0.0000, 0.2234, 0.4697],
        [0.0000, 0.0000, 0.3202, 1.0066, 0.0781]])

In [None]:
# boolean array
y_t > 0

tensor([[False, False, False,  True,  True],
        [False, False,  True,  True,  True]])

In [None]:
y_t[y_t<=0]

tensor([-0.4088, -0.2307, -0.6435, -0.2065, -0.3559])

In [None]:
# boolean array as indices
y_t[y_t<=0] = 0 # y_t<=0 selects the entries of y_t that are <= 0,
# and the assignment sets these entries to 0 in place
print(y_t)

tensor([[0.0000, 0.0000, 0.0000, 0.2234, 0.4697],
        [0.0000, 0.0000, 0.3202, 1.0066, 0.0781]])
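
The three ways of computing a ReLU seen so far (clamping, boolean-mask assignment, and the built-in `torch.relu`) give the same result. A minimal sketch on a small hand-picked tensor `y`:

```python
import torch

y = torch.tensor([[-0.5, 0.2], [1.5, -2.0]])

by_clamp = y.clamp(min=0)     # out-of-place clamp, y is unchanged
by_relu = torch.relu(y)       # built-in ReLU
by_mask = y.clone()           # copy, so that y itself is left untouched
by_mask[by_mask <= 0] = 0     # boolean-mask assignment, in place on the copy

print(torch.equal(by_clamp, by_relu), torch.equal(by_relu, by_mask))
```

Unlike `clamp` and `torch.relu`, the mask assignment modifies its tensor in place, which is why the sketch works on a clone.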
