# Intro to PyTorch
<figure style='float:right;max-width:30%;'>
<img src='https://upload.wikimedia.org/wikipedia/commons/thumb/c/c6/PyTorch_logo_black.svg/640px-PyTorch_logo_black.svg.png' style='padding:10px;background-color:white'>
<figcaption style='text-align:right'>Source: <a href=https://commons.wikimedia.org/wiki/File:PyTorch_logo_black.svg>Wikimedia Commons</a></figcaption>
</figure>

PyTorch is a machine learning framework with a major focus on neural networks used for computer vision, audio, and natural language processing. The user-facing frontend is written in Python, but the number-crunching is handled by an optimized C++ backend, including support for offloading computations to graphics cards (GPUs) for a substantial increase in speed. PyTorch was originally created by Meta (formerly known as Facebook), but has always been open source and permissively licensed ([BSD-3](https://en.wikipedia.org/wiki/BSD_licenses#3-clause)). Since September 2022 it has been managed by the non-profit PyTorch Foundation, a subsidiary of the [Linux Foundation](https://en.wikipedia.org/wiki/Linux_Foundation).

The accessible interface, huge community, and optimized implementations have established PyTorch among the top choices for education, research, and production in the field of neural network design.

**Caveat emptor:** PyTorch is not really "better" or "worse" than other popular frameworks like [Keras](https://keras.io/) or [TensorFlow](https://www.tensorflow.org/). While each framework has its particular strengths, they differ more in their style, philosophy, and user base than in their feature lists and performance. You should absolutely explore other options available to you and find what you like best!


## Structure

The core package of PyTorch is called [`torch`](https://pypi.org/project/torch/). This package contains all the code required to set up and compute general-purpose neural networks. It is complemented by packages that offer more specialized functions and objects for particular applications: [`torchvision`](https://pytorch.org/vision/stable/index.html) for computer vision (working with images or videos), [`torchaudio`](https://pytorch.org/audio/stable/index.html) for audio processing (e.g. speech recognition or synthesis), and [`torchtext`](https://pytorch.org/text/stable/index.html) for natural language processing. Various other packages extend PyTorch further and together comprise the [*PyTorch Ecosystem*](https://pytorch.org/ecosystem/).
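
As a quick sanity check after installation, the following minimal sketch imports the core package, prints its version, and picks a device to compute on. It assumes `torch` is installed; the names `device` and `x` are purely illustrative, and the code falls back to the CPU when no CUDA-capable GPU is available.

```python
import torch

# Report which version of the core package is installed
print(torch.__version__)

# Use a GPU if PyTorch can see one, otherwise stay on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create a small tensor directly on the selected device
x = torch.ones(3, device=device)
print(x.device)
```

The same `device` object can later be passed to other tensor-creating functions, so that the code runs unchanged on machines with or without a GPU.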

## Core components

PyTorch offers a plethora of functions and objects, but arguably the most important basic components are:

1. The [`Tensor`](https://pytorch.org/docs/stable/tensors.html#tensor-class-reference) class
2. The differentiation engine [`Autograd`](https://pytorch.org/docs/stable/autograd.html#module-torch.autograd)
3. The neural network building blocks (layers and activation functions) found in [`torch.nn`](https://pytorch.org/docs/stable/nn.html#module-torch.nn)

Before we build our first neural network from scratch, let us walk through these components one at a time:

### The `Tensor` class
<figure style='float:right;max-width=10%;'>
<img src=https://imgs.xkcd.com/comics/machine_learning.png style='padding-right:10px'>
<figcaption style='text-align:right;padding-right:10px'>Source: <a href=https://xkcd.com/license.html>XKCD</a> </figcaption>
</figure>

Neural networks are essentially a sequence of linear algebra operations. A [mathematical tensor](https://en.wikipedia.org/wiki/Tensor) is a general algebraic object from which simpler algebraic objects can be derived as special cases:

- A scalar is a tensor of rank 0:
$$
\left[ 0 \right]
$$
- A vector is a tensor of rank 1 (a.k.a. a collection of rank 0 tensors):
$$
\begin{bmatrix} \left[ 0 \right], \left[ 1 \right], \left[ 2 \right] \end{bmatrix}
$$
- A matrix is a tensor of rank 2 (a.k.a. a collection of rank 1 tensors):
$$
\begin{bmatrix} 
  \begin{bmatrix} \left[ 0 \right], \left[ 1 \right], \left[ 2 \right] \end{bmatrix} \\
  \begin{bmatrix} \left[ 3 \right], \left[ 4 \right], \left[ 5 \right] \end{bmatrix} \\
  \begin{bmatrix} \left[ 6 \right], \left[ 7 \right], \left[ 8 \right] \end{bmatrix} 
\end{bmatrix}
$$
- An $n$-dimensional array is a tensor of rank $n$ (a.k.a. a collection of rank $n-1$ tensors):
$$
\begin{bmatrix}
\begin{bmatrix} 
  \begin{bmatrix} \left[ 0 \right], \left[ 1 \right], \left[ 2 \right] \end{bmatrix} \\
  \begin{bmatrix} \left[ 3 \right], \left[ 4 \right], \left[ 5 \right] \end{bmatrix} \\
  \begin{bmatrix} \left[ 6 \right], \left[ 7 \right], \left[ 8 \right] \end{bmatrix} 
\end{bmatrix}, 
\begin{bmatrix} 
  \begin{bmatrix} \left[ 9 \right], \left[ 10 \right], \left[ 11 \right] \end{bmatrix} \\
  \begin{bmatrix} \left[ 12 \right], \left[ 13 \right], \left[ 14 \right] \end{bmatrix} \\
  \begin{bmatrix} \left[ 15 \right], \left[ 16 \right], \left[ 17 \right] \end{bmatrix} 
\end{bmatrix}, 
\ldots
\end{bmatrix}
$$

**Note:** Describing a mathematical tensor as a generalized matrix is not [the whole story](https://medium.com/@quantumsteinke/whats-the-difference-between-a-matrix-and-a-tensor-4505fbdc576c). For the purposes of this introduction, this simplified definition shall, however, suffice.

In PyTorch, everything runs on tensors: your data is encoded in a tensor, the parameters of a neural network are stored as tensors, and sending the data through the network is a series of transformations on a tensor. All of these tensors are represented by a class named [`Tensor`](https://pytorch.org/docs/stable/tensors.html#torch-tensor) found in the core `torch` module.

In [16]:
import torch

torch.Tensor([[0, 1, 2], [3, 4, 5]])

tensor([[0., 1., 2.],
        [3., 4., 5.]])
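
To connect the notion of rank to code, here is a small sketch that builds one tensor per rank and inspects it. In PyTorch the rank is exposed as the number of dimensions (`ndim`), and the size along each dimension as `shape`. The variable names are illustrative only.

```python
import torch

scalar = torch.tensor(0.)                             # rank 0
vector = torch.tensor([0., 1., 2.])                   # rank 1
matrix = torch.tensor([[0., 1., 2.], [3., 4., 5.]])   # rank 2
cube = torch.arange(18.).reshape(2, 3, 3)             # rank 3, values 0..17

# Print the rank (number of dimensions) and shape of each tensor
for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    print(f"{name}: rank {t.ndim}, shape {tuple(t.shape)}")
```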

There are [many ways to conveniently create tensors](https://pytorch.org/docs/stable/torch.html#creation-ops) from existing data, with specific initializations, or of specific shapes:

In [23]:
zeros = torch.zeros(2, 2, 2)    # Rank 3 tensor filled with zeros
ones = torch.ones_like(zeros)   # Tensor of the same shape but filled with ones
eye = torch.eye(4)              # A rank 2 tensor representing an identity matrix

print("A rank 3 tensor filled with zeros: \n", zeros)
print("A tensor of the same shape but filled with ones: \n", ones)
print("A rank 2 tensor representing an identity matrix:\n", eye)

A rank 3 tensor filled with zeros: 
 tensor([[[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]]])
A tensor of the same shape but filled with ones: 
 tensor([[[1., 1.],
         [1., 1.]],

        [[1., 1.],
         [1., 1.]]])
A rank 2 tensor representing an identity matrix:
 tensor([[1., 0., 0., 0.],
        [0., 1., 0., 0.],
        [0., 0., 1., 0.],
        [0., 0., 0., 1.]])
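
The cell above covers specific initializations and shapes. As a complementary sketch, the following shows a few other creation functions, including building a tensor from existing Python data. The variable names are hypothetical, and the selection of functions is only a sample of what is available.

```python
import torch

data = [[1, 2], [3, 4]]

# From existing data: the dtype is inferred from the values (integers here)
from_list = torch.tensor(data)

# Same data, but with an explicitly requested floating-point dtype
as_float = torch.tensor(data, dtype=torch.float32)

# A tensor of a given shape filled with a constant value
sevens = torch.full((2, 3), 7.0)

# Evenly spaced values: start (inclusive), end (exclusive), step
steps = torch.arange(0, 10, 2)

print(from_list.dtype, as_float.dtype)
print(sevens)
print(steps)
```

Specifying `dtype` explicitly can be worthwhile because many neural network operations expect floating-point inputs.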


We can perform calculations with tensors just as we would expect:

In [31]:
a = torch.tensor([1, 1, 1])
b = torch.tensor([2, 2, 2])

print(f'{a = }, {b = }')
print(f'Addition: {a + b = }')
print(f'Element-wise product: {a * b = }')
print(f'Element-wise division: {a / b = }')
print(f'Dot product: {a @ b = }')

a = tensor([1, 1, 1]), b = tensor([2, 2, 2])
Addition: a + b = tensor([3, 3, 3])
Element-wise product: a * b = tensor([2, 2, 2])
Element-wise division: a / b = tensor([0.5000, 0.5000, 0.5000])
Dot product: a @ b = tensor(6)
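
Since neural networks rely heavily on matrix products, it may help to see those as well. The sketch below uses PyTorch's `@` operator (equivalent to `torch.matmul`) for a matrix-vector and a matrix-matrix product, and `torch.dot` for the dot product of two vectors. The tensors `M` and `v` are made-up example values.

```python
import torch

M = torch.tensor([[1., 2.],
                  [3., 4.]])
v = torch.tensor([1., 1.])

mv = M @ v              # matrix-vector product, shape (2,)
mm = M @ M              # matrix-matrix product, shape (2, 2)
dot = torch.dot(v, v)   # dot product of two rank 1 tensors, a rank 0 tensor

print(mv)
print(mm)
print(dot)
```

Note the difference from `*`, which multiplies element by element rather than computing a matrix product.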
