# Deep Learning - Part 0

This notebook explains how to install all the prerequisites and libraries that you will need to run the following tutorials. If you can execute all the following cells, you are good to go.

## Environment configuration

### Install **conda**

There are two major package managers in *Python*: **pip** and **conda**. For this tutorial we will be using *conda* which, besides being a package manager, also works as an environment and version manager. There are two main ways to install *conda*: **Anaconda** and **Miniconda**. Either will work for this course; just follow the instructions for your operating system here:

https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html#regular-installation

### Create an environment with all the *Anaconda* libraries

`$ conda create --name deeplearning python=3.7 anaconda`

Don't forget to activate the new environment...

`$ conda activate deeplearning`

### Install *PyTorch*

This year we will be using [PyTorch](https://pytorch.org/) as the library to build and train the deep learning models. It is a little less abstract than alternatives such as [Keras](https://www.tensorflow.org/guide/keras), but it gives the user more control, which in turn allows more customization.

To install **PyTorch** we recommend following the [official documentation](https://pytorch.org/get-started/locally/). On your local machine, you will install the version that only has *CPU* support (i.e. no *CUDA* version), but on *Nabucodonosor* you need to install the version with *GPU* support.

#### CPU

Install *PyTorch* for CPU:

`(deeplearning) $ conda install pytorch torchvision cpuonly -c pytorch`

Then check that the installed version is $\geq 1.7.0$.

In [1]:
import torch

torch.__version__

'1.9.1+cu102'
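
The comparison against 1.7.0 can also be done programmatically. The sketch below is only an illustration: `version_tuple` and `MIN_VERSION` are hypothetical names, not part of PyTorch, and the parsing assumes the usual `major.minor.patch+build` layout seen in the output above.

```python
import re

MIN_VERSION = (1, 7, 0)  # minimum version required by this notebook


def version_tuple(version):
    """Turn a string such as '1.9.1+cu102' into (1, 9, 1).

    Hypothetical helper: it drops the local build tag after '+' and
    keeps only the leading digits of the first three components.
    """
    release = version.split("+")[0]
    parts = []
    for piece in release.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    # Pad short versions such as '1.7' to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


# With PyTorch installed, pass torch.__version__ instead of the literal.
installed = version_tuple("1.9.1+cu102")
print(installed >= MIN_VERSION)
```

Comparing tuples of integers avoids the pitfall of comparing version strings directly, where `'1.10.0'` would sort before `'1.7.0'`.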

#### GPU

The **GPU** build of *PyTorch* depends on the *CUDA* version installed. *Nabucodonosor* has several *CUDA* installations under the `/opt/cuda` directory. You need to add `nvcc` to your `$PATH`. For example, to set up *CUDA 10.2*, do the following:

`(deeplearning) $ export PATH=/opt/cuda/10.2/bin:$PATH`

That needs to be done every time you log in to *Nabucodonosor*. To avoid that step, add it to your `.bashrc` file (the single quotes keep `$PATH` from being expanded at the moment the line is written):

`(deeplearning) $ echo 'export PATH=/opt/cuda/10.2/bin:$PATH' >> $HOME/.bashrc`

Then, install the *PyTorch* library:

`(deeplearning) $ conda install pytorch torchvision cudatoolkit=10.2 -c pytorch`

Check if this is working by running the following cell:

In [2]:
torch.cuda.is_available()

True
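
Once the check passes, a common pattern is to choose the device once and reuse it when creating tensors and models. This is a minimal sketch, not part of the course code: `pick_device` is a hypothetical helper, and the import is guarded only so the cell also runs before PyTorch is installed.

```python
try:
    import torch
except ImportError:  # PyTorch not installed yet
    torch = None


def pick_device():
    """Return 'cuda' when PyTorch can see a GPU, otherwise 'cpu'."""
    if torch is not None and torch.cuda.is_available():
        return "cuda"
    return "cpu"


device = pick_device()
print(device)

if torch is not None:
    # Tensors (and models, via .to(device)) are placed on the chosen device.
    x = torch.zeros(2, 3, device=device)
    print(x.device)
```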

#### Google Colab

It is also possible to install *PyTorch* on *Google Colab*, but first you need to check which version of `nvcc` is available. To do that, run the following:

In [3]:
!nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Fri_Feb__8_19:08:17_PST_2019
Cuda compilation tools, release 10.1, V10.1.105


Depending on the version the previous cell reports, you will need to install the matching *PyTorch* build, using `pip` instead of *conda*. Please refer to the [getting started](https://pytorch.org/get-started/locally/) page for the exact command.
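
To read the release number without inspecting the output by eye, it can be parsed with a regular expression. This is a sketch under the assumption that the output follows the `release X.Y` pattern shown above; `cuda_release` is a hypothetical helper, not a PyTorch or CUDA function.

```python
import re
import shutil
import subprocess


def cuda_release(nvcc_output):
    """Extract the 'X.Y' release from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


# Sample line taken from the cell output above.
sample = "Cuda compilation tools, release 10.1, V10.1.105"
print(cuda_release(sample))  # 10.1

# On a machine where nvcc is on the PATH, query it directly.
if shutil.which("nvcc"):
    result = subprocess.run(
        ["nvcc", "--version"], capture_output=True, text=True, check=False
    )
    print(cuda_release(result.stdout))
```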

### Install other libraries

We need the `gensim` library to deal with *word embeddings*, the `mlflow` tool to keep track of experiments, and `tqdm`, a handy progress bar to keep track of different processes.

`(deeplearning) $ conda install gensim mlflow tqdm -c conda-forge`

If you have problems importing `gensim` and get this error:

`ImportError: cannot import name 'open' from 'smart_open' (C:\ProgramData\Anaconda3\lib\site-packages\smart_open\__init__.py)`

Then try updating `smart_open`:

`(deeplearning) $ conda update smart_open`
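
A quick way to confirm that the environment contains everything installed so far is to ask Python whether each module can be located. This is a sketch: `REQUIRED` and `missing_modules` are hypothetical names, and `find_spec` only locates a module without importing it, so it will not surface import-time errors such as the `smart_open` one above.

```python
import importlib.util

# Import names of the libraries installed in this section and above.
REQUIRED = ["torch", "torchvision", "gensim", "mlflow", "tqdm"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All libraries found.")
```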

## Download *embeddings* and dataset

### CIFAR10

The dataset we will use (**CIFAR10**) is part of the `torchvision` package, which makes it fairly easy to download. You can learn more details on it [here](https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html#loading-and-normalizing-cifar10):

In [4]:
import torchvision

torchvision.datasets.CIFAR10(root='data', download=True);

Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to data/cifar-10-python.tar.gz


  0%|          | 0/170498071 [00:00<?, ?it/s]

Extracting data/cifar-10-python.tar.gz to data


### *Glove Embeddings* and *IMDB Reviews* Dataset

Some examples that we will run for *text classification* using **Convolutional Neural Networks** require the *Glove Embeddings* as well as the *IMDB Reviews* dataset:

In [5]:
!curl -L https://cs.famaf.unc.edu.ar/\~ccardellino/resources/diplodatos/glove.6B.50d.txt.gz -o data/glove.6B.50d.txt.gz
!curl -L https://cs.famaf.unc.edu.ar/\~ccardellino/resources/diplodatos/imdb_reviews.csv.gz -o data/imdb_reviews.csv.gz

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 65.9M  100 65.9M    0     0  32.7M      0  0:00:02  0:00:02 --:--:-- 32.7M
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 25.3M  100 25.3M    0     0  30.2M      0 --:--:-- --:--:-- --:--:-- 30.1M
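
To verify the embeddings download, the file can be read with the standard library alone. The sketch assumes the file keeps the standard GloVe text layout (one token per line followed by its space-separated vector components); `load_glove` and its `limit` parameter are hypothetical, and the later tutorials may load the vectors differently.

```python
import gzip
import os


def load_glove(path, limit=None):
    """Read a gzipped GloVe text file into a {token: [floats]} dict.

    Assumes one token per line followed by space-separated floats.
    `limit` (hypothetical convenience) stops after that many lines.
    """
    vectors = {}
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for index, line in enumerate(handle):
            if limit is not None and index >= limit:
                break
            token, *values = line.rstrip().split(" ")
            vectors[token] = [float(value) for value in values]
    return vectors


path = "data/glove.6B.50d.txt.gz"  # written by the curl command above
if os.path.exists(path):
    glove = load_glove(path, limit=1000)
    # Number of tokens read and the dimension of one vector.
    print(len(glove), len(next(iter(glove.values()))))
```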


### MeLi Challenge Dataset

For the course project, we will be using a dataset based on the **2019 MeLi Challenge** dataset, for automatic classification of product categories:

In [6]:
!curl -L https://cs.famaf.unc.edu.ar/\~ccardellino/resources/diplodatos/meli-challenge-2019.tar.bz2 -o data/MeLi_Challenge.tar.bz2
!tar jxvf data/MeLi_Challenge.tar.bz2 -C data/

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  945M  100  945M    0     0  34.9M      0  0:00:27  0:00:27 --:--:-- 35.0M
meli-challenge-2019/
meli-challenge-2019/spanish.test.jsonl.gz
meli-challenge-2019/portuguese.validation.jsonl.gz
meli-challenge-2019/portuguese.train.jsonl.gz
meli-challenge-2019/spanish.train.jsonl.gz
meli-challenge-2019/spanish_token_to_index.json.gz
meli-challenge-2019/portuguese_token_to_index.json.gz
meli-challenge-2019/spanish.validation.jsonl.gz
meli-challenge-2019/portuguese.test.jsonl.gz
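
The extracted `.jsonl.gz` files can be inspected without decompressing them to disk. This is a sketch that assumes each line holds one JSON value, as the `jsonl` extension suggests; `read_jsonl_gz` is a hypothetical helper, and the record fields are not assumed here, only printed.

```python
import gzip
import json
import os
from itertools import islice


def read_jsonl_gz(path):
    """Yield one parsed JSON value per non-empty line of a gzipped file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield json.loads(line)


path = "data/meli-challenge-2019/spanish.train.jsonl.gz"
if os.path.exists(path):
    # Peek at the first records to discover the available fields.
    for record in islice(read_jsonl_gz(path), 3):
        print(record)
```

Reading lazily with a generator keeps memory use low, which matters because the training files are large.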


## Using *Nabucodonosor*

### Tunneling and **ssh**

**How do you run a notebook on a remote machine?** You use an **ssh** connection with port forwarding. This way, anything you send to the chosen port on your *localhost* is forwarded to the same port on the server, where a program such as *jupyter notebook* is listening.

It is likely that everyone will be using the same default ports, so we recommend that you pick a random port number before connecting. The port in the **ssh** command must be the same one you use to start the notebook.

```
$ ssh -L PORT:localhost:PORT USER@SERVER
$ conda activate deeplearning
(deeplearning) $ jupyter notebook --port PORT --no-browser
```

Now you can open `http://localhost:PORT` in your browser and use the notebook as if it were running on your computer.
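
One way to pick the port is to draw a random number from the unprivileged range and confirm that nothing on your own machine is already listening on it. This is a sketch with hypothetical names and values (`random_free_port`, the 10000 to 60000 range); it only checks the local side, so the port may still be taken on the server.

```python
import random
import socket


def random_free_port(low=10000, high=60000, attempts=50):
    """Return a random port in [low, high] that is free on localhost."""
    for _ in range(attempts):
        port = random.randint(low, high)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind(("127.0.0.1", port))
            except OSError:
                continue  # port already in use, draw again
            return port
    raise RuntimeError("no free port found")


port = random_free_port()
# Print the two commands from above with PORT filled in.
print(f"ssh -L {port}:localhost:{port} USER@SERVER")
print(f"jupyter notebook --port {port} --no-browser")
```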

### Using **slurm**

The *Nabucodonosor* server uses a queue system called **slurm**, which grants exclusive access to the GPU resources. You should enqueue any job that takes more than 10 minutes.

#### Set-Up

1. Download the script: https://raw.githubusercontent.com/MIREL-UNC/mirel-scripts/master/run_scripts/submit_job_slurm.sh

2. Create a *logs* folder.

#### Enqueue things

To enqueue a job on **slurm**, first put your command in a file, for example `command.txt`. Then submit it with `sbatch`:

```
$ sbatch submit_job_slurm.sh command.txt
```

The queue will assign your job a number **JOBID**. All the output of your process will be redirected to `logs/JOBID.out` and `logs/JOBID.err`.

#### Controlling things

To see the state of the queue run `$ squeue`, and to cancel a job run `$ scancel JOBID`.

### Avoid using GPUs

If all the GPUs are being used, you can still force **PyTorch** to use the CPU. For simple models this is still a very good option.

The easiest way is to set the environment variable `CUDA_VISIBLE_DEVICES=""` when running your commands. For example:

```
(deeplearning) $ CUDA_VISIBLE_DEVICES="" jupyter notebook --no-browser
(deeplearning) $ CUDA_VISIBLE_DEVICES="" python exercise_1.py --experiment_name mlp_200
```
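
The same variable can be set from inside Python, for instance at the top of a notebook that was started without it. This is a sketch: it assumes the assignment runs before PyTorch initializes CUDA, which is safest done before the first `import torch` in the process. The import is guarded only so the cell also runs where PyTorch is absent.

```python
import os

# Hide all GPUs from this process; do this before CUDA is initialized.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

try:
    import torch
except ImportError:  # PyTorch not installed in this environment
    torch = None

if torch is not None:
    # Report whether PyTorch still sees a GPU after hiding the devices.
    print(torch.cuda.is_available())
    # Create a tensor explicitly on the CPU.
    device = torch.device("cpu")
    print(torch.ones(2, device=device))
```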