# Deep Learning - Part 0

This notebook explains how to install all the prerequisites and libraries you will need to run the following tutorials. If you can execute all the cells below, you are good to go.

## Environment configuration


### Install conda

There are two major package managers in Python: pip and conda. For this tutorial we will use conda which, besides being a package manager, is also useful as an environment and version manager. There are two main ways to install conda: Anaconda and Miniconda. Either will work for this course; just follow the instructions for your operating system:

https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html#regular-installation

### Create an environment with all the Anaconda libraries

    $ conda create --name deeplearning python=3.7 anaconda

Don't forget to activate the new environment:

    $ conda activate deeplearning    
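
To confirm that the environment is active, a minimal check can be run from Python. This is only a sketch: it assumes the environment name `deeplearning` used above and relies on the `CONDA_DEFAULT_ENV` variable that conda sets on activation. The helper name `describe_environment` is made up for this example.

```python
import os
import sys


def describe_environment():
    """Return the active conda environment name and the Python (major, minor) version."""
    env_name = os.environ.get("CONDA_DEFAULT_ENV")  # None when no conda env is active
    return env_name, tuple(sys.version_info[:2])


env_name, py_version = describe_environment()
print(f"conda env: {env_name}, python: {py_version[0]}.{py_version[1]}")
if env_name != "deeplearning":
    # Assumption: the environment was created with the name used in this notebook.
    print("Warning: the 'deeplearning' environment does not seem to be active")
```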

### Install TensorFlow

We will use the [TensorFlow](https://www.tensorflow.org/) library to build and train models. In particular, we will use its [Keras](https://www.tensorflow.org/guide/keras) module, which is simpler to use and understand, at the cost of losing some flexibility when defining architectures.

To install TensorFlow, we recommend following the [official documentation](https://www.tensorflow.org/install). On your local machine you will install the version with CPU support only, but on Nabucodonosor you need to install the version with [GPU support](https://www.tensorflow.org/install/gpu).

#### CPU

Upgrade `pip` to the latest version:

    (deeplearning) $ pip install --upgrade pip

Install tensorflow:

    (deeplearning) $ pip install --upgrade tensorflow
    
Then check that the installed version is 2.0:

In [1]:
import tensorflow as tf

tf.__version__

'2.0.0'
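
If you prefer a check that does not depend on reading the output by eye, a sketch like the following compares a version string against the 2.0 target. The helper name `is_expected_version` is hypothetical; in the notebook you would pass it `tf.__version__` instead of the literal strings used here.

```python
def is_expected_version(version, major=2, minor=0):
    """Return True if a version string such as '2.0.0' matches major.minor."""
    parts = version.split(".")
    return int(parts[0]) == major and int(parts[1]) == minor


# In the notebook, replace the literals with tf.__version__.
print(is_expected_version("2.0.0"))   # True
print(is_expected_version("1.14.0"))  # False
```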

    
#### GPU

The supported version of TensorFlow depends on the CUDA drivers installed on the machine. On Nabucodonosor, the CUDA and cuDNN libraries are located in the `/opt` directory. You can check that the system has CUDA 10.X and cuDNN >= 7.4.1 installed, which is enough to install TensorFlow 2.0.

    (deeplearning) $ pip install tensorflow-gpu

**WARNING**: the changes between TensorFlow 1.x and 2.0 (and the corresponding Keras versions) are not minor, and code written for the older version will break unless you migrate it. For example: https://www.tensorflow.org/beta/guide/effective_tf2

Now we need to tell TensorFlow where CUDA is installed by setting the environment variable `LD_LIBRARY_PATH`:

    $ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/cuda/10.0/lib64:/opt/cudnn/v7.6-cu10.0/
    $ export CUDA_HOME=/opt/cuda/10.0

It is convenient to add these statements to your `~/.bashrc` file, so they are executed every time you open a new console.
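
A quick way to verify that the variables are visible to Python is sketched below. The two directories are the ones from the `export` statements above; the helper name `missing_library_dirs` is made up for this example.

```python
import os


def missing_library_dirs(required_dirs, ld_library_path):
    """Return the required directories that are absent from an LD_LIBRARY_PATH value."""
    present = {entry.rstrip("/") for entry in ld_library_path.split(":") if entry}
    return [d for d in required_dirs if d.rstrip("/") not in present]


# Paths taken from the export statements above.
REQUIRED = ["/opt/cuda/10.0/lib64", "/opt/cudnn/v7.6-cu10.0/"]

missing = missing_library_dirs(REQUIRED, os.environ.get("LD_LIBRARY_PATH", ""))
print("missing from LD_LIBRARY_PATH:", missing)
print("CUDA_HOME:", os.environ.get("CUDA_HOME"))
```

An empty list means both directories are on the path. The check only looks at the variable's value; it does not verify that the directories exist on disk.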

To check that it works, execute the following cell:

In [2]:
import tensorflow as tf

tf.test.is_gpu_available()

True

### Install other libraries

We need the `gensim` library to work with word embeddings and the `mlflow` tool to keep track of experiments. To see a graphical representation of the Keras models, you also need `graphviz` and `pydot`.


```
(deeplearning) $ pip install gensim mlflow
(deeplearning) $ conda install graphviz python-graphviz pydot
```
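
To check that everything was installed into the active environment, a small sketch using only the standard library is shown below. The list of import names is an assumption based on the packages installed in this notebook, and `missing_modules` is a hypothetical helper.

```python
import importlib.util


def missing_modules(module_names):
    """Return the names that the import system cannot find."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Import names of the packages installed in this notebook.
REQUIRED_MODULES = ["tensorflow", "gensim", "mlflow", "graphviz", "pydot"]

missing = missing_modules(REQUIRED_MODULES)
print("All libraries found" if not missing else f"Missing: {missing}")
```

Note that this only checks the Python packages. It does not verify that the graphviz binaries themselves are available.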

## Download embeddings and dataset

### MNIST

The dataset we will use (MNIST) is downloaded automatically by Keras the first time you use it. To save time, you can download it now by running the next cell.

In [3]:
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


### PetFinder Dataset

For this course we will set up a Kaggle competition based on the same data used in the "Supervised Learning" course. You can access the competition through [this link](https://www.kaggle.com/t/8842af91604944a9974bd6d5a3e097c5) and download the dataset (use the **Download All** button).

Once you have downloaded the dataset to your machine, you can copy it to `nabucodonosor` with the following command (this assumes you are in the directory that contains the dataset):

    $ scp diplodatos-deeplearning-2019.zip USERNAME@nabucodonosor.ccad.unc.edu.ar:./

After that, log in to nabucodonosor (via ssh) and unzip it like so:

    $ unzip -d petfinder_dataset diplodatos-deeplearning-2019.zip
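
If you would rather extract the archive from Python (for example, from a notebook cell), the standard `zipfile` module can do the same thing. This is a sketch that reuses the file and directory names from the command above; `extract_archive` is a hypothetical helper.

```python
import zipfile
from pathlib import Path


def extract_archive(archive_path, target_dir):
    """Extract every file in a zip archive into target_dir and return the member names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
        return archive.namelist()


# Only runs when the archive is in the current directory.
if Path("diplodatos-deeplearning-2019.zip").exists():
    names = extract_archive("diplodatos-deeplearning-2019.zip", "petfinder_dataset")
    print(f"extracted {len(names)} files")
```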

## Using the server

### Tunneling and ssh

**How do you run a notebook on a remote machine?** You use an ssh connection with port forwarding. This way, whatever is served on the chosen port of the server (such as a Jupyter notebook) is also reachable on the same port of your localhost.

Everyone is likely to pick the same default ports, so we recommend choosing a random port number before connecting. The port in the ssh command must be the same one you use to start the notebook.

```
$ ssh -L PORT:localhost:PORT USER@SERVER
$ conda activate diplodatos
(diplodatos) $ jupyter notebook --port PORT --no-browser
```

Now you can use the notebook as if it were running on your computer.
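
One way to pick the random port is sketched below. The range 10000 to 60000 and the helper names are assumptions made for illustration. The check only confirms that nothing is listening on the port on the machine where the script runs, so the same port may still be taken on the server.

```python
import random
import socket


def is_free_locally(port):
    """Return True if nothing on this machine accepts connections on the port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # connect_ex returns 0 only when something is already listening.
        return sock.connect_ex(("127.0.0.1", port)) != 0


def pick_port(low=10000, high=60000, attempts=20):
    """Return a random port in [low, high] that is not in use locally."""
    for _ in range(attempts):
        port = random.randint(low, high)
        if is_free_locally(port):
            return port
    raise RuntimeError("no free port found")


port = pick_port()
# USER and SERVER are placeholders, as in the commands above.
print(f"ssh -L {port}:localhost:{port} USER@SERVER")
print(f"jupyter notebook --port {port} --no-browser")
```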

### Using slurm

The Nabucodonosor server uses a queue system called slurm, which grants exclusive access to the CPU resources. You should enqueue everything you do that takes more than 10 minutes!

#### Set up

1. Download the script https://raw.githubusercontent.com/MIREL-UNC/mirel-scripts/master/run_scripts/submit_job_slurm.sh

2. Create a logs folder

#### Enqueue things

To enqueue a job on slurm, first put your command in a file, for example `command.txt`, and then submit it:

```
$ sbatch submit_job_slurm.sh command.txt
```

The queue will assign your job a number, JOBID. All the output of your process will be redirected to `logs/JOBID.out` and `logs/JOBID.err`.
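
When submitting jobs from a script, the JOBID can be read from the message that `sbatch` prints on success (`Submitted batch job N`). The sketch below parses that message and builds the two log paths. It assumes the submit script writes to a `logs` folder as described above; the helper names and the job id 12345 are made up.

```python
import re
from pathlib import Path


def parse_job_id(sbatch_output):
    """Extract the numeric job id from sbatch's 'Submitted batch job N' message."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_output)
    if match is None:
        raise ValueError(f"unexpected sbatch output: {sbatch_output!r}")
    return match.group(1)


def log_paths(job_id, logs_dir="logs"):
    """Return the stdout and stderr log paths for a job id."""
    base = Path(logs_dir)
    return base / f"{job_id}.out", base / f"{job_id}.err"


# Example with a made-up job id; real output comes from running sbatch.
out_path, err_path = log_paths(parse_job_id("Submitted batch job 12345"))
print(out_path, err_path)
```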

#### Controlling things

To see the state of the queue, run `$ squeue`.

To cancel a job, run `$ scancel JOBID`.

### Avoid using GPUs

If all the GPUs are being used, you can still force Keras to use the CPU. For simple models this is still a very good option.

The easiest way is to set the environment variable `CUDA_VISIBLE_DEVICES=""` when running your commands. For example:

```
(diplodatos) $ CUDA_VISIBLE_DEVICES="" jupyter notebook --no-browser
(diplodatos) $ CUDA_VISIBLE_DEVICES="" python exercise_1.py --experiment_name mlp_200
```
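
The same effect can be obtained from inside a script or notebook by setting the variable in Python. This is a sketch, not the only way to do it: the variable has to be set before TensorFlow initializes the GPUs, so the safest place is before the import. The `try`/`except` is there only so the snippet also runs where TensorFlow is not installed.

```python
import os

# Hide all GPUs from CUDA; set this before TensorFlow is imported to be safe.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

try:
    import tensorflow as tf
except ImportError:
    print("TensorFlow is not installed in this environment")
else:
    # An empty list means TensorFlow sees no GPU and will run on the CPU.
    print(tf.config.experimental.list_physical_devices("GPU"))
```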