# Using a GPU server for running your Machine Learning models

This notebook explains how to install all the prerequisites and libraries that you will need to run your ML models on a GPU server with Keras. If you can execute all the following cells, you are good to go.

## <a id='0'>Content</a>

- <a href='#1'>Connecting to the GPU server</a>  
- <a href='#2'>Changing your default password</a>  
- <a href='#3'>Environment configuration</a>
    - <a href='#31'>Option 1: Conda</a>
    - <a href='#32'>Option 2: Virtualenv and Virtualenvwrapper</a>
- <a href='#4'>Optional libraries</a>
- <a href='#5'>Using the server</a>  
    - <a href='#51'>Tunneling and ssh</a>  
    - <a href='#52'>Using slurm</a>    
- <a href='#6'>Using Keras with GPUs </a>  
    - <a href='#61'>GPU usage monitoring </a>   
    - <a href='#62'>Avoid using GPUs</a>   
- <a href='#7'>Using TensorFlow on demand</a>  
    - <a href='#71'>If you use *TF* with *Keras*</a>   
    - <a href='#72'>If you use *Keras* that brings *TF*</a>  
    - <a href='#73'>If *TF* is used alone without *Keras*</a>  

---

# <a id="1">Connecting to the GPU server</a> 

First, connect to the server via ssh:

```
$ ssh <user>@<remote_server>
```

Enter your password when prompted, and that's all, you're in!

---

# <a id="2">Changing your default password</a>  

After you log in to your server with SSH:

1. Enter the command:

```
$ passwd
```

2. When prompted for your current UNIX password, enter your SSH password, then press Enter.

3. Type your new password, then press Enter.

4. Retype your new password and press Enter. If successful, you will see the output:

```
passwd: all authentication tokens updated successfully
```

---

# <a id="3">Environment configuration</a>  

## <a id="31"> Option 1: Conda</a>  

There are two major package managers in Python: `pip` and `conda`. Besides being a package manager, `conda` is also useful as a version manager. There are two main ways to install conda: [Anaconda](https://conda.io/docs/install/quick.html) and [Miniconda](https://conda.io/miniconda.html).

To install tensorflow, we recommend following the [official documentation](https://www.tensorflow.org/install/install_linux#installing_with_anaconda). In particular, for the `conda` installation, it advises using `pip` instead of `conda`, as the only available Anaconda package for `tensorflow` is not actively maintained.

All the available tensorflow versions (for both Python 2 and 3, with CPU or GPU support) can be found [in this link](https://www.tensorflow.org/install/install_linux#top_of_page). There are CPU (https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.10.0-cp35-cp35m-linux_x86_64.whl) and GPU (https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.10.0-cp35-cp35m-linux_x86_64.whl) versions of tensorflow. As we are using a GPU server, we recommend the GPU version (otherwise, what's the point?).


The commands to setup the environment are the following:

```
$ wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
$ bash Miniconda3-latest-Linux-x86_64.sh
$ conda create --name <your_env> python=3.5
$ source activate <your_env>
(<your_env>) $ conda install numpy scipy scikit-learn jupyter nb_conda keras
(<your_env>) $ pip install --ignore-installed --upgrade YOUR_TENSORFLOW_URL
(<your_env>) $ jupyter notebook
```

__Note:__ It is important to install `keras` before `tensorflow`: installing `keras` overwrites the `tensorflow` version, so running the `pip` command last leaves the version you chose in place.
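
Once the environment is set up, it can help to confirm which versions actually ended up installed. The following is a minimal sketch, not part of the official instructions: the helper name `installed_versions` is hypothetical, and it only assumes that each package exposes a `__version__` attribute.

```python
import importlib


def installed_versions(names=("numpy", "scipy", "sklearn", "keras", "tensorflow")):
    """Map each module name to its version string, or None if it cannot be imported."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The package is not installed in the active environment.
            versions[name] = None
        else:
            # Not every module defines __version__, so fall back to "unknown".
            versions[name] = getattr(module, "__version__", "unknown")
    return versions
```

Calling `installed_versions()` inside the activated environment returns a dictionary; a `None` entry for `tensorflow` or `keras` suggests that the corresponding install step did not succeed.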

## <a id="32">Option 2: Virtualenv and Virtualenvwrapper</a>  

```
$ pip install --user virtualenv virtualenvwrapper
```

Then add the following lines to the bottom of your `~/.bashrc` file:

```
# virtualenv and virtualenvwrapper settings
export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3
[[ -s "$HOME/.local/bin/virtualenvwrapper.sh" ]] && source "$HOME/.local/bin/virtualenvwrapper.sh"
```

Reload the file to apply the changes:

```
$ source ~/.bashrc
```

You should see an output like this:

```
virtualenvwrapper.user_scripts creating /home/<username>/.virtualenvs/premkproject
virtualenvwrapper.user_scripts creating /home/<username>/.virtualenvs/postmkproject
virtualenvwrapper.user_scripts creating /home/<username>/.virtualenvs/initialize
virtualenvwrapper.user_scripts creating /home/<username>/.virtualenvs/premkvirtualenv
virtualenvwrapper.user_scripts creating /home/<username>/.virtualenvs/postmkvirtualenv
virtualenvwrapper.user_scripts creating /home/<username>/.virtualenvs/prermvirtualenv
virtualenvwrapper.user_scripts creating /home/<username>/.virtualenvs/postrmvirtualenv
virtualenvwrapper.user_scripts creating /home/<username>/.virtualenvs/predeactivate
virtualenvwrapper.user_scripts creating /home/<username>/.virtualenvs/postdeactivate
virtualenvwrapper.user_scripts creating /home/<username>/.virtualenvs/preactivate
virtualenvwrapper.user_scripts creating /home/<username>/.virtualenvs/postactivate
virtualenvwrapper.user_scripts creating /home/<username>/.virtualenvs/get_env_details
```

Create virtualenv:

```
$ mkvirtualenv <name> --python=/usr/bin/python3
```

Activate virtualenv:

```
$ source /users/<your_user>/.virtualenvs/<your_env>/bin/activate
```

Deactivate virtualenv:

```
$ deactivate
```
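
To check from Python whether a virtualenv is currently active, one option is to compare the interpreter prefixes. This is a small sketch under the assumption that the environment was created with `virtualenv` or `venv`; the function name `in_virtualenv` is hypothetical, and the check does not detect conda environments.

```python
import sys


def in_virtualenv():
    """Return True if the interpreter runs inside a virtualenv or venv."""
    # Older virtualenv releases set sys.real_prefix; newer ones and venv
    # make sys.base_prefix differ from sys.prefix instead.
    base = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    return base != sys.prefix
```

Running `print(in_virtualenv())` after activating the environment should print `True`, and `False` after `deactivate`.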

Additional Python libs:

Install other Python libraries with: `$ pip install -r requirements.txt`

---

# <a id="4">Optional libraries</a>  

These are some optional libraries that enable some visualizations. They take a while to download, so if you don't have a good Internet connection or are short on time, you can skip them.

To visualize keras graphs:
```
(<your_env>) $ pip install pydot pydotplus
(<your_env>) $ conda install graphviz matplotlib
```

---

# <a id="5">Using the server</a>  

## <a id="51">Tunneling and ssh</a>  

**How do you run a notebook on a remote machine?** You use an ssh connection with port forwarding. This way, everything served on that port of the server machine (such as a jupyter notebook) is also reachable on your localhost.

It is likely that everyone will be using the same default ports, so we recommend picking a random port number before connecting. The port passed to ssh must be the same one you use to start the notebook.

```
$ ssh -L PORT:localhost:PORT USER@SERVER
$ source activate <your_env>
(<your_env>) $ jupyter notebook --port PORT --no-browser
```

Now you can use the notebook as if it were running on your computer. Simply go to:

`http://localhost:PORT/?token=<token>`
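
As a rough illustration of picking a random port and keeping it consistent across the commands, here is a minimal Python sketch. The helper names `pick_free_port` and `tunnel_commands` and the port range are assumptions made for this example. Note that the check only tells you the port is free on the machine where the script runs, so another user may still hold the same port on the server.

```python
import random
import socket


def pick_free_port(low=10000, high=60000, attempts=50):
    """Return a random port in [low, high] that is free on this machine."""
    for _ in range(attempts):
        port = random.randint(low, high)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            try:
                s.bind(("127.0.0.1", port))
            except OSError:
                continue  # port already in use, try another one
            return port
    raise RuntimeError("no free port found")


def tunnel_commands(port, user, server):
    """Build the ssh and jupyter commands that share the same port."""
    return (
        "ssh -L {0}:localhost:{0} {1}@{2}".format(port, user, server),
        "jupyter notebook --port {0} --no-browser".format(port),
    )
```

For example, `tunnel_commands(pick_free_port(), "USER", "SERVER")` returns the two command lines with the same port filled in.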

## <a id="52">Using slurm</a>  

The server may use a queue system called `slurm`, which grants exclusive access to the CPU resources. If it does, you should enqueue everything you do that takes more than 10 minutes!

#### Set up

1. Download the script https://raw.githubusercontent.com/MIREL-UNC/mirel-scripts/master/run_scripts/submit_job_slurm.sh

2. Create a logs folder

#### Enqueue things

To enqueue a job on slurm, first put your command in a file, for example `command.txt`, and then run `$ sbatch submit_job_slurm.sh command.txt`

The queue will assign your job a number `JOBID`. All the output of your process will be redirected to `logs/JOBID.out` and `logs/JOBID.err`.
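
The set-up and enqueue steps can also be scripted. The sketch below is only an illustration: the helper names `prepare_job` and `submit` are hypothetical, and it assumes that the command file holds a single command on one line, which should be checked against what `submit_job_slurm.sh` actually expects.

```python
import os
import shutil
import subprocess


def prepare_job(command, command_file="command.txt", logs_dir="logs"):
    """Create the logs folder, write the command file, and return the sbatch argv."""
    os.makedirs(logs_dir, exist_ok=True)
    with open(command_file, "w") as f:
        f.write(command + "\n")
    return ["sbatch", "submit_job_slurm.sh", command_file]


def submit(argv):
    """Run the submission command; return its output, or None if it is not installed."""
    if shutil.which(argv[0]) is None:
        return None
    return subprocess.check_output(argv).decode().strip()
```

A typical call would be `submit(prepare_job("python train.py"))`, where `train.py` stands in for your own script.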

#### Controlling things 

To see the state of the queue run `$ squeue`

To cancel a job run `$ scancel JOBID`

---

# <a id="6">Using Keras with GPUs </a>  

If you installed tensorflow with GPU support, now is a good time to check whether it actually detects your devices.

In [4]:
import tensorflow
print(tensorflow.__version__)

1.10.1


In [5]:
from tensorflow.python.client import device_lib

def get_available_gpus():
    local_device_protos = device_lib.list_local_devices()
    return [x.name for x in local_device_protos if x.device_type == 'GPU']
get_available_gpus()

[]

If the above gives an error or returns an empty list, try setting the following environment variables. Variables exported in a shell only last for that session, so add these lines to your `.bashrc` to make them permanent.

In [6]:
%%bash
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/cuda/9.0/extras/CUPTI/lib64/:/opt/cuda/9.0/lib64:/opt/cudnn/v7.0/
export CUDA_HOME=/opt/cuda/9.0
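
To see whether the variables are visible to the Python process, you can inspect `LD_LIBRARY_PATH` directly. This is a minimal sketch with a hypothetical helper name, `missing_cuda_paths`; the directories passed to it are the ones from the cell above and may differ on your server.

```python
import os


def missing_cuda_paths(required, env=None):
    """Return the directories from `required` that are absent from LD_LIBRARY_PATH."""
    env = os.environ if env is None else env
    # Split the variable on ":" and ignore trailing slashes when comparing.
    current = [p.rstrip("/") for p in env.get("LD_LIBRARY_PATH", "").split(":") if p]
    return [p for p in required if p.rstrip("/") not in current]
```

For example, `missing_cuda_paths(["/opt/cuda/9.0/lib64", "/opt/cuda/9.0/extras/CUPTI/lib64"])` returns an empty list when both directories are already on the path.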

## <a id="61">GPU usage monitoring</a>  

For Nvidia GPUs, the `nvidia-smi` tool shows memory usage, GPU utilization, and GPU temperature.


The Watch command:

`watch` is used to run any designated command at regular intervals. The time interval between reports can be changed from its default two seconds by using the `-n` option followed by an integer which represents the desired number of seconds. So, we can use the `watch` command together with `nvidia-smi`, in the following way:

```
$ watch -n 1 nvidia-smi
```
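
If you prefer to read the same numbers from Python, `nvidia-smi` can print them as CSV through its `--query-gpu` and `--format` options. The following is a sketch rather than a complete monitoring tool: the helper names and the choice of fields are assumptions, and all values are kept as strings because some GPUs report fields as unsupported.

```python
import shutil
import subprocess

FIELDS = ("index", "memory.used", "memory.total", "utilization.gpu", "temperature.gpu")


def parse_gpu_csv(text):
    """Turn the CSV output of nvidia-smi into a list of dicts, one per GPU."""
    rows = []
    for line in text.strip().splitlines():
        values = [v.strip() for v in line.split(",")]
        rows.append(dict(zip(FIELDS, values)))
    return rows


def query_gpus():
    """Query nvidia-smi once; return an empty list if the tool is not installed."""
    if shutil.which("nvidia-smi") is None:
        return []
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=" + ",".join(FIELDS), "--format=csv,noheader,nounits"]
    )
    return parse_gpu_csv(out.decode())
```

Each dictionary returned by `query_gpus()` describes one GPU, with memory in MiB, utilization in percent, and temperature in degrees Celsius.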

## <a id="62">Avoid using GPUs</a>  

If all the GPUs are being used, you can still force Keras to use the CPU. For simple models this is still a very good option.

The easiest way is to run your command with `CUDA_VISIBLE_DEVICES=""`. For example:

```
(<your_env>) $ CUDA_VISIBLE_DEVICES="" jupyter notebook --no-browser
(<your_env>) $ CUDA_VISIBLE_DEVICES="" python exercise.py --experiment_name exp_000
```
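
The same variable can also be set from inside a script or notebook. This is a minimal sketch with a hypothetical helper, `set_visible_gpus`; it assumes the variable is set before TensorFlow initializes CUDA, which in practice means before importing `tensorflow` or `keras`.

```python
import os


def set_visible_gpus(gpu_ids=()):
    """Restrict the GPUs that CUDA exposes to this process.

    An empty sequence hides every GPU, which forces execution on the CPU.
    """
    value = ",".join(str(i) for i in gpu_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Hide all GPUs; run this before importing TensorFlow or Keras.
set_visible_gpus()
```

Passing a list such as `set_visible_gpus([0])` instead exposes only the first GPU to the process.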

Now all you have to do is have a little fun and take advantage of the benefits of GPUs.

---

# <a id="7">Using TensorFlow on demand </a>  

## <a id="71">If you use *TF* with *Keras*</a> 

In [None]:
import tensorflow as tf
import keras
from keras import backend as K

# TensorFlow wizardry
config = tf.ConfigProto()
 
# Don't pre-allocate memory; allocate as-needed
config.gpu_options.allow_growth = True
 
# Only allow a total of half the GPU memory to be allocated
#config.gpu_options.per_process_gpu_memory_fraction = 0.5
 
# Create a session with the above options specified.
K.tensorflow_backend.set_session(tf.Session(config=config))


## <a id="72">If you use *Keras* that brings *TF*</a> 

In [None]:
import tensorflow as tf
from tensorflow.keras import backend as K

# TensorFlow wizardry
config = tf.ConfigProto()
 
# Don't pre-allocate memory; allocate as-needed
config.gpu_options.allow_growth = True
 
# Only allow a total of half the GPU memory to be allocated
#config.gpu_options.per_process_gpu_memory_fraction = 0.5
 
# Create a session with the above options specified.
K.set_session(tf.Session(config=config))


## <a id="73">If *TF* is used alone without *Keras*</a> 

In [None]:
import tensorflow as tf

# TensorFlow wizardry
config = tf.ConfigProto()
 
# Don't pre-allocate memory; allocate as-needed
config.gpu_options.allow_growth = True
 
# Only allow a total of half the GPU memory to be allocated
#config.gpu_options.per_process_gpu_memory_fraction = 0.5
 
# Create a session with the above options specified.
with tf.Session(config=config) as sess:
    # Run your graph operations here with sess.run(...)
    pass

This work is based on [this blog post](https://michaelblogscode.wordpress.com/2017/10/10/reducing-and-profiling-gpu-memory-usage-in-keras-with-tensorflow-backend/).