# Python Crash Course

## Setting Up Your Environment
Setting up your environment can be the most difficult part of learning a new tool. Luckily, Python makes it easy with its bundled package manager, `pip`. Other package managers exist as well, such as the widely used `conda`.

It is good practice to keep **an isolated environment** for each project, in order to avoid conflicts with the base system environment. In the first part of this tutorial we will see how to do this for:
- Google Colab
- a local environment on your workstation
- basic setup for a SLURM computing environment (e.g. DRAC / Compute Canada)

In Google Colab, we do not need to choose which package manager to use: it comes preconfigured with `pip`, and starts each session with a fresh environment (this is a blessing and a curse, since anything you install must be reinstalled in the next session).

For a local environment on your workstation, `conda` is recommended: it can manage non-Python dependencies as well as Python packages, and makes it easy to use different versions of Python for different projects. Inside a conda environment, both `conda install` and `pip install` can be used.

In a SLURM environment, it is generally recommended to create your environment with `virtualenv` rather than `conda`, as it is more lightweight. Packages can then be installed with `pip`. DRAC / Compute Canada provides excellent instructions for [setting up a Python environment on their cluster](https://docs.alliancecan.ca/wiki/Python).

### Getting Set Up in Google Colab
Conveniently, Google Colab comes with most of the libraries that we need for machine learning and data science pre-installed. Sometimes, however, we need code libraries or datasets which are not pre-installed. Colab makes it easy to install them, but the method is slightly different from how we install libraries locally.

Google Colab is based on Jupyter Notebooks, which have special syntax for executing system commands. We can execute shell commands using the `!command` syntax. For example, we can list the packages currently installed:

In [None]:
!pip list

Package                            Version
---------------------------------- ------------------
absl-py                            1.4.0
accelerate                         1.3.0
aiohappyeyeballs                   2.4.4
aiohttp                            3.11.12
aiosignal                          1.3.2
alabaster                          1.0.0
albucore                           0.0.23
albumentations                     2.0.3
ale-py                             0.10.1
altair                             5.5.0
annotated-types                    0.7.0
anyio                              3.7.1
argon2-cffi                        23.1.0
argon2-cffi-bindings               21.2.0
array_record                       0.6.0
arviz                              0.20.0
astropy                            7.0.0
astropy-iers-data                  0.2025.2.3.0.32.42
astunparse                         1.6.3
atpublic                           4.1.0
attrs                              25.1.0
audioread            

We can also install and manage packages with `!pip` in the same way, but it is better to use the special `%pip` magic command, which helps ensure that packages are installed into the environment used by the running notebook kernel.

In [None]:
# We can install PyTorch Geometric which will be used during the GNN tutorial
%pip install torch_geometric

Collecting torch_geometric
  Downloading torch_geometric-2.6.1-py3-none-any.whl.metadata (63 kB)
Downloading torch_geometric-2.6.1-py3-none-any.whl (1.1 MB)
Installing collected packages: torch_geometric
Successfully installed torch_geometric-2.6.1


Often we are not interested in seeing the logging output of the install process, as it just clutters up our notebooks. We can use the `%%capture` cell "magic" command, placed on the first line of the cell, to suppress the output:

In [None]:
%%capture
%pip install torch_geometric

### Getting Set Up on Your Local Workstation
We will now see how to set up an environment on your local workstation. First, we will install `conda`. If you do not already have a version of `conda` installed, follow the [official instructions](https://docs.anaconda.com/miniconda/install/) to install Miniconda3. **NOTE** To avoid interfering with the Colab or Jupyter environment, the following code snippets are not executable.

For example, to install on Linux, we can copy the quick-install command provided by Miniconda:
```bash
mkdir -p ~/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
rm ~/miniconda3/miniconda.sh
```

Then we can activate the environment for the first time, and configure it to activate automatically in future terminal sessions:
```bash
source ~/miniconda3/bin/activate
conda init --all
```

Next we can create a new environment for this project:
```bash
conda create -n gnn python=3.11
```
This will create a new conda environment called `gnn` using Python 3.11. We can then activate the environment and install any packages without interfering with other projects.
```bash
conda activate gnn
```




It is good practice to keep track of the packages required by your code using a `requirements.txt` file. This file lists each required pip package on a separate line, optionally with a version constraint (e.g. `numpy>=1.26`). For example, we can define a requirements file for the GNN tutorial like this:

```
# requirements.txt
emmet
matplotlib
mp-api
networkx
numpy
pymatgen
scikit-learn
torch
torch_geometric
```

Then, we can install in our conda environment (after running `conda activate gnn`) with the following command:
```bash
pip install -r requirements.txt
```

## Python Basics

### Importing Modules
One of the hallmarks of Python is its module system. Python makes it very easy to load and use code libraries. Some libraries are built-in, while others need to be installed from a third party. Furthermore, it is possible to break your code up into multiple files, and "import" code from another file.


In [None]:
import numpy as np  # here we import the third party library "numpy" under the alias "np"
import torch

# math is a built-in library
from math import sin # we can import specific functions using "from ... import ..."

np.sum([0, 1, 2]), sin(np.pi / 2), torch.rand(1)  # we can now use functionality from numpy, torch, or the "sin" function from "math"

(3, 1.0, tensor([0.6108]))

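When resolving an import, Python first looks for modules compiled into the interpreter (such as `sys`), then searches the directories listed in `sys.path` in order: the directory of the running script (or the current directory in an interactive session) comes first, followed by the standard library and the installed packages. As a result, a local file that shares its name with a library (e.g. `random.py`) can shadow that library.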
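
To see where a module is actually loaded from, we can ask the import system directly. The sketch below uses the standard `importlib.util.find_spec` function; the helper name `locate` is our own and exists only for illustration.

```python
import importlib.util
import sys

def locate(module_name):
  """Return where a top-level module would be loaded from, or None if it is not found."""
  spec = importlib.util.find_spec(module_name)
  if spec is None:
    return None
  return spec.origin  # a file path, or a marker such as "built-in"

print(locate("sys"))   # compiled into the interpreter
print(locate("json"))  # a file inside the standard library
print(sys.path[:3])    # the first few directories searched for file-based modules
```

The exact paths printed depend on your installation, so they will differ between Colab, a conda environment, and a cluster.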



### Variables and Data Types
Like other programming languages, Python allows us to create and store data in **variables**. It is important to think of a variable as a **container** that holds a reference to the data, not the data itself. The data can exist even without the variable, or be referenced by multiple containers at once. Understanding this concept will help you avoid many common bugs in your code.
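
A short sketch of what "referenced by multiple containers" means in practice. The variable names are arbitrary; the point is that assignment never copies a list.

```python
a = [1, 2, 3]
b = a           # b refers to the same list object as a (no copy is made)
b.append(4)     # mutating through b is visible through a as well

c = list(a)     # list(...) builds a new list with the same items (a shallow copy)
c.append(5)     # only c changes

print(a)        # [1, 2, 3, 4]
print(a is b)   # True
print(a is c)   # False

del b           # removes the name b; the list still exists and a still refers to it
print(a)        # [1, 2, 3, 4]
```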

In Python, everything is an _object_ which can store data and contain methods which act on the data. This means that there are **no primitive types** in Python. Moreover, Python uses the concept of **duck typing** ("if it quacks like a duck then it's a duck"). This means that programs in general should not depend on receiving specific types, but rather depend on objects implementing specific methods. We will see that many built-in functions in Python follow this idea.
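
As a small illustration of duck typing, the function below never checks the type of its argument. It only assumes that the argument can be iterated over and that each item supports `len()`. The name `total_length` is made up for this example.

```python
def total_length(items):
  # Works for any iterable whose elements support len()
  return sum(len(item) for item in items)

print(total_length(["ab", "cde"]))             # 5, a list of strings
print(total_length([[1, 2], (3,), {"k": 0}]))  # 4, a list, a tuple and a dict
print(total_length({"ab": 1, "cde": 2}))       # 5, iterating a dict yields its keys
```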

We can create variables to hold some of the common data types in python as follows:

In [None]:
a = 5  # Number, specifically an integer
b = 3.14  # Number, specifically a float
c = "hello world"  # String
d = 'a'  # Also a string, Python does not have "char"
e = ["a", "list", "of", "strings"]  # Lists use square brackets [ ]
f = [1, 3.14, "string"]  # lists can mix different types (each entry is just a "container")
g = True  # A boolean takes the values True / False
h = None  # Used similarly to "null" in other languages
i = {"Drew": 28, "Jama": None}  # Dictionary, maps keys to values using curly braces { }
type(a), type(b), type(c), type(d), type(e), type(f), type(g), type(h), type(i)

(int, float, str, str, list, list, bool, NoneType, dict)

By convention we use `snake_case` for variable names in Python.

Observe that we never "declared" the type of our variables. This is because Python is dynamically typed: the type belongs to the object, not to the variable name. We can even assign different types to the same variable throughout the program. Sometimes this is useful, but often it harms the readability of our programs.

In [None]:
a = "yikes"  # now the variable `a` holds a string instead of an integer
type(a)

str

Python 3.5 introduced type hints, and since Python 3.6 they can also be written on variables. They improve readability and help the IDE with autocomplete, but they are not (usually) enforced at runtime.

In [None]:
a: int = 5
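
Type hints are most often seen on function signatures. The following sketch (the function `scale` is hypothetical) shows both the syntax and the fact that the interpreter does not check the hints; a separate static checker such as `mypy` would be needed for that.

```python
def scale(value: float, factor: float = 2.0) -> float:
  return value * factor

print(scale(1.5))            # 3.0
print(scale("ab", 2))        # abab (no error: the hints are not checked at runtime)
print(scale.__annotations__) # the hints are stored on the function for tools to read
```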


### Basic Operations
Python supports basic arithmetic operations on numbers.

In [None]:
a = 1
b = 3
c = a + b  # adding integers produces an integer
c, type(c)

(4, int)

In [None]:
a = 1.5
b = 4
c = b - a # operations with floats produce a float
c, type(c)

(2.5, float)

In [None]:
a = 7
b = 3
c = a / b  # division of integers naturally produces a float
c, type(c)

(2.3333333333333335, float)

In [None]:
a = 7
b = 3
c = a // b # we can do floor division (rounding down) using the // symbol
c, type(c)

(2, int)

In [None]:
a = 6
b = 8
c = a * b  # multiplication uses the * symbol
c, type(c)

(48, int)

In [None]:
a = 2
b = 4
c = a ** b # exponentiation uses the ** symbol
c

16

**NOTE** Python also implements a `^` operator, but this performs bit-wise XOR, not exponentiation.

In [None]:
a ^ b

6

The mod / remainder operator is implemented using the `%` symbol:

In [None]:
a = 10
b = 3
c = a % b  # 10 mod 3
c

1
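
One detail worth knowing when negative numbers are involved: `//` rounds toward negative infinity, and the result of `%` takes the sign of the divisor. A quick sketch:

```python
q_pos, r_pos = 7 // 2, 7 % 2
q_neg, r_neg = -7 // 2, -7 % 2

print(q_pos, r_pos)     # 3 1
print(q_neg, r_neg)     # -4 1
print(divmod(7, 2))     # (3, 1), quotient and remainder in one call
print(int(-7 / 2))      # -3, int() truncates toward zero instead
```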

Python also allows us to define numbers in scientific notation, hexadecimal, or binary:

In [None]:
a = 1880  # standard notation
a = 1_880 # we can separate thousands with an _ for readability
b = 1.88e3  # scientific notation (this always produces a float)
c = 0b11101011000 # in binary
d = 0x758 # in hexadecimal
a == b == c == d

True


### Flow Control and Loops
Python provides several structures for flow control, but perhaps not as many as some other languages.

The most basic structure is conditional execution of code with `if` statements.

In [None]:
# Absolute value
x = 32
if x >= 0:
  y = x
else:
  y = -x
y

32

In [None]:
# Absolute value
x = -16
if x >= 0:
  y = x
else:
  y = -x
y

16

In [None]:
animal = 'dog'
if animal == 'cat':
  print('Meow!')
elif animal == 'dog':
  print('Bark!')
elif animal == 'frog':
  print('Ribbit!')
else:
  print('Grrr...')

Bark!


Note that with the `if-elif-else` construction, _exactly_ one block is executed.

Loops are another useful flow control structure. Python has "foreach"-style `for` loops and `while` loops.

In [None]:
names = ["drew", "jama", "ameer", "mehdi",]
for name in names:  # iterate over every item in the list
  print("hello", name.capitalize())

hello Drew
hello Jama
hello Ameer
hello Mehdi


Python has no C-style `for (int i = 0; i < 6; i++)` loop. Instead, the idiomatic way to loop over a range of integers is:

In [None]:
for i in range(6):
  print(i)

0
1
2
3
4
5
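
`range` also accepts a start value and a step, which covers most of what the C-style loop is used for. A brief sketch:

```python
evens = list(range(0, 10, 2))      # start, stop (exclusive), step
countdown = list(range(5, 0, -1))  # a negative step counts down

print(list(range(2, 6)))  # [2, 3, 4, 5]
print(evens)              # [0, 2, 4, 6, 8]
print(countdown)          # [5, 4, 3, 2, 1]
```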


When we need both the index and the item, the `enumerate` function is even better:

In [None]:
for i, name in enumerate(names):
  print("Hello", name.capitalize(), "you're number", i)

Hello Drew you're number 0
Hello Jama you're number 1
Hello Ameer you're number 2
Hello Mehdi you're number 3
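
Two related tools are worth a quick look: `enumerate` takes an optional `start` argument, and `zip` iterates over several sequences in parallel. The `scores` list below contains made-up values purely for illustration.

```python
names = ["drew", "jama", "ameer", "mehdi"]
scores = [3, 7, 5, 9]  # made-up values, for illustration only

for position, name in enumerate(names, start=1):  # count from 1 instead of 0
  print(position, name.capitalize())

for name, score in zip(names, scores):  # pair up items from both lists
  print(name.capitalize(), "scored", score)
```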


We can also execute a block of code repeatedly as long as a condition is true:

In [None]:
names = ["drew", "jama", "ameer", "mehdi",]
while len(names) > 0:
  print(names.pop(), len(names))  # remove and return the last element of `names`

mehdi 3
ameer 2
jama 1
drew 0



### Defining Functions

Often, we want to reuse code. We can do this by defining functions which accept some parameters, and return some result.

In [None]:
def square(x):
  return x * x

for i in range(10):
  print(i, '->', square(i))

0 -> 0
1 -> 1
2 -> 4
3 -> 9
4 -> 16
5 -> 25
6 -> 36
7 -> 49
8 -> 64
9 -> 81


Functions can take both positional and keyword arguments:

In [None]:
def power(x, p=1):
  return x ** p

x = 2
for i in range(4):
  print(power(x, p=i))

1
2
4
8
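
The same function can be called in several ways. This sketch repeats the `power` definition so that it runs on its own:

```python
def power(x, p=1):
  return x ** p

print(power(3))         # 3, p falls back to its default value of 1
print(power(3, 2))      # 9, both arguments passed by position
print(power(p=2, x=3))  # 9, keyword arguments can be given in any order
```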


It is also possible to define anonymous functions called "lambda functions":

In [None]:
x = [-3, 4, 12, -2, 8, 0, 1, 7]
sorted(x, key=lambda x: abs(x)) # sort by absolute value

[0, 1, -2, -3, 4, 7, 8, 12]
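
A lambda is just a function without a name. The sketch below shows that a regular `def` (here the hypothetical `magnitude`) or an existing function such as `abs` can be passed as the `key` in exactly the same way.

```python
x = [-3, 4, 12, -2, 8, 0, 1, 7]

def magnitude(value):  # a named function equivalent to the lambda
  return abs(value)

by_lambda = sorted(x, key=lambda value: abs(value))
by_def = sorted(x, key=magnitude)
by_builtin = sorted(x, key=abs)  # abs is already a function, so it can be passed directly

print(by_lambda == by_def == by_builtin)  # True
```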


### Defining Classes
Classes define custom types of objects. They group together data and the methods which operate on that data, and are an excellent way to improve the readability and reusability of your code. We will use custom classes all the time in PyTorch to define new models:

In [None]:
from torch import nn

class CustomModel(nn.Module):  # we "inherit" from Module, meaning we reuse its functionality

  def __init__(self, in_size, out_size): # the __init__ method serves to initialize a new instance, like a constructor
    # "super" allows us to call methods from the base implementation, in this case nn.Module
    super().__init__()

    # self is a special parameter which references the current instance, here
    # we use it to store custom data on the instance
    self.in_size = in_size
    self.out_size = out_size

    self.fc = nn.Linear(in_size, out_size)
    self.act = nn.ReLU()

  # "forward" is a method which is expected by nn.Module
  def forward(self, x):
    # We can access the data or methods of the module with `self.`
    x = self.fc(x)
    x = self.act(x)

    return x
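
The same ideas (`__init__`, `self`, inheritance and `super()`) can be tried without PyTorch. Below is a minimal sketch using two made-up classes, `Shape` and `Square`, which are not part of any library. Much like `nn.Module` expects a `forward` method, the base class here expects subclasses to provide `area`.

```python
class Shape:
  def __init__(self, name):
    self.name = name  # data stored on the instance

  def describe(self):
    return f"{self.name} with area {self.area()}"

  def area(self):
    raise NotImplementedError  # subclasses are expected to provide this


class Square(Shape):  # Square inherits describe() from Shape
  def __init__(self, side):
    super().__init__("square")  # run the base class initializer first
    self.side = side

  def area(self):
    return self.side ** 2


sq = Square(3)
print(sq.describe())          # square with area 9
print(isinstance(sq, Shape))  # True
```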




### Some Other Features

As in most languages, operations can raise errors when they encounter unexpected inputs or other issues. We can wrap code which might raise errors in a `try` clause and then catch them with `except`. This allows us to write code which handles errors gracefully.

In [None]:
try:
  x = 10 / 0
except ZeroDivisionError:
  print("Oops, can't divide by 0...")
  x = 0
except Exception:  # any other error (a bare `except:` would also swallow KeyboardInterrupt)
  print("something went wrong")
  x = -1
x  # the value of x is now 0

Oops, can't divide by 0...


0
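
A `try` statement can also carry `else` and `finally` clauses. The sketch below, built around a hypothetical helper `safe_divide`, shows when each one runs.

```python
def safe_divide(a, b):
  try:
    result = a / b
  except ZeroDivisionError as err:
    print("caught:", err)
    return None
  else:
    return result  # runs only when no exception was raised
  finally:
    print("done")  # always runs, whether or not an error occurred

print(safe_divide(6, 3))  # prints "done", then 2.0
print(safe_divide(1, 0))  # prints the "caught: ..." message, "done", then None
```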

Python natively supports the idea of "context managers." This allows us to execute some specific code at the beginning of a block of code, and again when the program leaves the block of code (regardless of any errors thrown):

In [None]:
with open('/content/sample_data/README.md') as f:  # some code is run here, and the result is provided in the variable "f"
  txt = f.readline()

# some exit logic is run implicitly here

print(txt)

This directory includes a few sample datasets to get you started.



In this case, upon entering the `with` context, we open the specified file and return a reference to the file object in the variable `f`. When we leave the `with` context, `f.close()` will be called automatically.

We will see this `with` pattern commonly in PyTorch as:
```python
with torch.no_grad():  # gradient computation is disabled, which saves memory
  model.eval()
  ... # evaluate our model
# gradient computation is now re-enabled (if it was enabled before entering the block)
```

Internally, the `with` context works by calling the `__enter__` and `__exit__` methods on the object passed to it. This means that you can define your own custom context managers by implementing these methods on your class.
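
As an illustration, here is a minimal sketch of a custom context manager. The `Timer` class is made up for this example; it records the time on entry and computes the elapsed time on exit.

```python
import time

class Timer:
  """Measures how long the body of a `with` block takes."""

  def __enter__(self):
    self.start = time.perf_counter()
    return self  # this is the value bound by `as`

  def __exit__(self, exc_type, exc_value, traceback):
    self.elapsed = time.perf_counter() - self.start
    return False  # do not suppress exceptions raised inside the block

with Timer() as t:
  total = sum(range(100_000))

print(total, f"computed in {t.elapsed:.6f} s")
```

Returning `False` from `__exit__` lets any exception from the block propagate as usual, which is normally what we want.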



### Numpy and PyTorch
Numpy and PyTorch are two very useful libraries used in machine learning.

Numpy implements "arrays", which are a more efficient, multi-dimensional form of list whose elements all share one data type. In addition, it defines many useful math operations.
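
A small sketch of how arrays differ from lists in practice: arithmetic on an array is applied element by element, whereas the same operator on a list means something else entirely.

```python
import numpy as np

values = [1, 2, 3]
arr = np.array(values)

print(values * 2)    # [1, 2, 3, 1, 2, 3], a list is repeated
print(arr * 2)       # [2 4 6], an array is multiplied element by element
print(arr + arr)     # [2 4 6]
print(np.sqrt(arr))  # element-wise square root
print(arr.dtype)     # a single integer type shared by every element
```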


In [None]:
# First we make sure numpy is imported
import numpy as np

In [None]:
x = np.zeros((6, 6)) # we can create a 6x6 matrix of zeros
x

array([[0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.]])

In [None]:
x.shape, x.dtype # the shape is 6 rows, by 6 columns, the data type is 64-bit float by default

((6, 6), dtype('float64'))

In [None]:
x = np.ones((3, 8, 8)) # we are not limited to 2 dimensional arrays...
x

array([[[1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.]],

       [[1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.]],

       [[1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1., 1., 1.]]])

In [None]:
x = np.random.rand(3, 4, 4) # creates a 3x4x4 array of random numbers drawn uniformly from [0, 1)
x

array([[[0.45332707, 0.87913782, 0.7525274 , 0.53696492],
        [0.34877662, 0.15144237, 0.579431  , 0.90932915],
        [0.73254163, 0.43978424, 0.98839531, 0.32208989],
        [0.33828115, 0.14894458, 0.19692988, 0.17602081]],

       [[0.16872818, 0.97280371, 0.53619607, 0.77044525],
        [0.33664915, 0.01134832, 0.87576268, 0.76654076],
        [0.61105245, 0.99021261, 0.85370909, 0.55973741],
        [0.9660159 , 0.55608984, 0.29727331, 0.06005138]],

       [[0.09227107, 0.09683557, 0.3737085 , 0.81047916],
        [0.09499326, 0.58141306, 0.28238224, 0.12204229],
        [0.77462464, 0.47719523, 0.23991498, 0.17722404],
        [0.09535425, 0.12413666, 0.95909546, 0.5533829 ]]])

In [None]:
y = x.sum(axis=0) # sum over the first axis (length 3), leaving a 4x4 array
y.shape, y

((4, 4),
 array([[0.71432631, 1.9487771 , 1.66243197, 2.11788933],
        [0.78041903, 0.74420376, 1.73757591, 1.79791219],
        [2.11821872, 1.90719208, 2.08201938, 1.05905133],
        [1.39965131, 0.82917109, 1.45329865, 0.7894551 ]]))

In [None]:
x.mean() # return the average value over all elements, we'd expect this to be close to 0.5

0.482116526002723

Both Numpy and PyTorch support advanced indexing for arrays / tensors:

In [None]:
x[:, 0] # select index 0 along the 2nd dimension, giving a 3x4 array

array([[0.45332707, 0.87913782, 0.7525274 , 0.53696492],
       [0.16872818, 0.97280371, 0.53619607, 0.77044525],
       [0.09227107, 0.09683557, 0.3737085 , 0.81047916]])

In [None]:
x[..., :3]  # return only the first 3 elements along the last dimension

array([[[0.45332707, 0.87913782, 0.7525274 ],
        [0.34877662, 0.15144237, 0.579431  ],
        [0.73254163, 0.43978424, 0.98839531],
        [0.33828115, 0.14894458, 0.19692988]],

       [[0.16872818, 0.97280371, 0.53619607],
        [0.33664915, 0.01134832, 0.87576268],
        [0.61105245, 0.99021261, 0.85370909],
        [0.9660159 , 0.55608984, 0.29727331]],

       [[0.09227107, 0.09683557, 0.3737085 ],
        [0.09499326, 0.58141306, 0.28238224],
        [0.77462464, 0.47719523, 0.23991498],
        [0.09535425, 0.12413666, 0.95909546]]])

In [None]:
x[[True, False, False]] # Boolean indexing: keep only the entries of the first dimension marked True

array([[[0.45332707, 0.87913782, 0.7525274 , 0.53696492],
        [0.34877662, 0.15144237, 0.579431  , 0.90932915],
        [0.73254163, 0.43978424, 0.98839531, 0.32208989],
        [0.33828115, 0.14894458, 0.19692988, 0.17602081]]])

In [None]:
x[:, [1, 2, 0], :]  # integer indexing: pick rows 1, 2, 0 (in that order) along the 2nd dimension

array([[[0.34877662, 0.15144237, 0.579431  , 0.90932915],
        [0.73254163, 0.43978424, 0.98839531, 0.32208989],
        [0.45332707, 0.87913782, 0.7525274 , 0.53696492]],

       [[0.33664915, 0.01134832, 0.87576268, 0.76654076],
        [0.61105245, 0.99021261, 0.85370909, 0.55973741],
        [0.16872818, 0.97280371, 0.53619607, 0.77044525]],

       [[0.09499326, 0.58141306, 0.28238224, 0.12204229],
        [0.77462464, 0.47719523, 0.23991498, 0.17722404],
        [0.09227107, 0.09683557, 0.3737085 , 0.81047916]]])
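
The effect of each indexing style is easiest to follow by looking at the resulting shapes. The sketch below uses a small array of consecutive integers (not the random array from the cells above) so the results can be checked by eye.

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)  # values 0..23 arranged as 2 x 3 x 4

print(x[:, 0].shape)             # (2, 4), an integer index removes that dimension
print(x[..., :3].shape)          # (2, 3, 3), a slice keeps the dimension
print(x[[True, False]].shape)    # (1, 3, 4), Boolean mask over the first dimension
print(x[:, [1, 2, 0], :].shape)  # (2, 3, 4), same shape with rows reordered
print(x[x > 20])                 # [21 22 23], a mask built from a condition gives a 1-D result
```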

PyTorch supports many of the same operations as Numpy with a very similar interface. However, PyTorch provides some additional functionality which is very helpful for machine learning.

In PyTorch, what Numpy calls an "array" is called a "tensor". Tensors can live on different devices, namely the CPU or the GPU (`cuda`). This allows us to leverage the massive parallelism of GPUs for our tensor operations.

In [None]:
import torch


# NOTE: to use "cuda" we need a GPU available. This can be done by changing the
#       runtime in Google Colab, by going to Runtime -> Change Runtime Type
device = "cuda" if torch.cuda.is_available() else "cpu"

x = torch.rand((3, 4, 4))
x

tensor([[[0.5352, 0.0947, 0.7733, 0.0189],
         [0.3561, 0.2553, 0.3326, 0.0998],
         [0.7704, 0.8500, 0.9524, 0.7296],
         [0.7963, 0.3886, 0.5943, 0.7973]],

        [[0.4594, 0.5803, 0.7034, 0.1211],
         [0.4443, 0.5238, 0.9516, 0.2728],
         [0.2328, 0.7155, 0.5388, 0.1192],
         [0.2916, 0.7087, 0.3766, 0.9178]],

        [[0.6944, 0.1613, 0.5245, 0.5382],
         [0.4946, 0.5127, 0.8687, 0.2482],
         [0.2628, 0.6253, 0.0133, 0.6206],
         [0.2462, 0.4264, 0.2256, 0.2895]]])

In [None]:
x = x.to(device) # we can move the tensor to a different device, if it's available
x

tensor([[[0.5352, 0.0947, 0.7733, 0.0189],
         [0.3561, 0.2553, 0.3326, 0.0998],
         [0.7704, 0.8500, 0.9524, 0.7296],
         [0.7963, 0.3886, 0.5943, 0.7973]],

        [[0.4594, 0.5803, 0.7034, 0.1211],
         [0.4443, 0.5238, 0.9516, 0.2728],
         [0.2328, 0.7155, 0.5388, 0.1192],
         [0.2916, 0.7087, 0.3766, 0.9178]],

        [[0.6944, 0.1613, 0.5245, 0.5382],
         [0.4946, 0.5127, 0.8687, 0.2482],
         [0.2628, 0.6253, 0.0133, 0.6206],
         [0.2462, 0.4264, 0.2256, 0.2895]]], device='cuda:0')

In [None]:
x.sum(dim=1)  # "axis" is now called "dim" in pytorch

tensor([[2.4579, 1.5886, 2.6526, 1.6456],
        [1.4280, 2.5284, 2.5705, 1.4308],
        [1.6981, 1.7257, 1.6321, 1.6965]], device='cuda:0')

Moreover, PyTorch supports **automatic differentiation** (autograd) to automatically compute gradients, and includes many useful classes and functions for deep learning. The autograd functionality is enabled by setting `requires_grad=True` on a tensor. We will then be able to **back-propagate** to compute the gradient of some function with respect to this tensor. The modules in `torch.nn` all set this automatically on their parameters:

In [None]:
x = torch.randn((3, 5, 5), requires_grad=True)  # normal distribution mean=0, std=1
loss = (x**2).mean()
loss.backward() # compute the gradients with respect to "x"
x.grad

tensor([[[ 0.0891,  0.0070, -0.0238, -0.0108,  0.0372],
         [ 0.0246, -0.0149, -0.0119, -0.0057,  0.0094],
         [-0.0005, -0.0100,  0.0006,  0.0565, -0.0198],
         [-0.0279, -0.0247,  0.0073, -0.0178, -0.0113],
         [-0.0294,  0.0098, -0.0086,  0.0331,  0.0269]],

        [[ 0.0407,  0.0506,  0.0296, -0.0104,  0.0140],
         [ 0.0007, -0.0209, -0.0080,  0.0552,  0.0011],
         [-0.0251, -0.0099,  0.0094,  0.0114, -0.0690],
         [ 0.0130,  0.0164,  0.0124, -0.0065, -0.0140],
         [ 0.0186,  0.0462,  0.0169, -0.0002,  0.0350]],

        [[ 0.0341,  0.0105, -0.0002,  0.0191, -0.0138],
         [-0.0165,  0.0001, -0.0308, -0.0374, -0.0460],
         [-0.0775,  0.0375, -0.0406, -0.0025, -0.0117],
         [-0.0118, -0.0352,  0.0542, -0.0090,  0.0196],
         [ 0.0214,  0.0136, -0.0044, -0.0396,  0.0471]]])

Autograd allows us to build advanced model architectures, and train them with **gradient descent** without worrying about manually calculating gradients.

Various optimizers (e.g. `SGD`, `Adam`, etc.) are included in the `torch.optim` module and can be used to optimize a list of `nn.Parameter` objects belonging to a model (an `nn.Parameter` wraps a tensor so that PyTorch knows we want to optimize it).
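
To make this concrete, here is a minimal sketch of a gradient descent loop. The objective (the sum of squares of a two-element parameter `w`) and the learning rate are toy choices for illustration, not something taken from a real model.

```python
import torch
from torch import nn

w = nn.Parameter(torch.tensor([5.0, -3.0]))  # the values we want to optimize
optimizer = torch.optim.SGD([w], lr=0.1)

for step in range(50):
  optimizer.zero_grad()  # clear gradients left over from the previous step
  loss = (w ** 2).sum()  # minimized when every entry of w is 0
  loss.backward()        # autograd fills in w.grad
  optimizer.step()       # plain SGD update: w <- w - lr * w.grad

print(w.data, loss.item())  # both end up very close to zero
```

With a real model, the list `[w]` would typically be replaced by `model.parameters()`, and the loss would be computed from the model's predictions on a batch of data.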

The most useful modules in PyTorch are:

- `torch.nn`: Implements many different neural network layers and helpful functions. `nn.Module` forms the base of all neural networks we design.
- `torch.optim`: Implements various gradient descent variants.
- `torch.utils.data`: Implements a standard interface for `Dataset`, and provides `DataLoader` to iterate over the data in batches.

Some other helpful libraries include:
- `torch_geometric`: Implements graph neural networks. We will use this in the GNN workshop!
- `torchvision`: Useful for computer vision tasks
- `torchaudio`: Useful for audio and signal processing tasks
- `matplotlib` / `seaborn`: Common plotting libraries
- `pandas`: Work with tabular data
- `dask`: Big data, parallel processing on CPU (with support for various HPC clusters)
