# How to configure a laptop's NVIDIA GPU for Deep Learning
> Use your own Graphics Card to speed up TensorFlow

- toc: true 
- badges: true
- comments: true
- categories: [GPU, tensorflow, torch]
- hide: false

# Introduction

Anybody who has trained a Deep Learning model beyond a simple Multi Layer Perceptron on a few observations knows that a GPU is practically a must. Many platforms and services now offer GPUs for free (I'm looking at you, Kaggle and Google Colab), but if you're anything like me and happen to have a good NVIDIA graphics card in your computer, you've surely felt like giving it a go yourself.

And here is where the complications start. Configuring an NVIDIA GPU so that your favourite Deep Learning framework can find it is way more complicated than it should be, especially if you are installing on Windows, as I am. Chances are you did not purchase a high-end NVIDIA card exclusively for Deep Learning, but rather bought the laptop as a gaming station (this is basically what happened to me) and then thought of exploring what its GPU can do.

I've gone through the process a few times, and what follows are my own personal guidelines for replicating the setup should I ever have to do a hard reset on my laptop (God forbid).

# Let's Get to it

Basically we need to install a few things, and most of the complexity comes from knowing what they are and which version of each we should get.

First of all, we are going to download the [latest driver](https://www.nvidia.es/Download/index.aspx) for our NVIDIA card. In my case, it's an RTX 2060 running on Windows 11. For the driver you want the latest version. That won't be the case for the things we install afterwards, as we will have constraints such as which version of CUDA is supported by our card model.
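Once the driver is installed, a quick way to confirm that the system sees the card is to ask `nvidia-smi`, the command line tool that ships with the driver. The sketch below is only an illustration: the helper name `query_driver` is made up for this post, and it assumes `nvidia-smi` is on your `PATH`.

```python
import shutil
import subprocess


def query_driver():
    """Return one 'GPU name, driver version' string per GPU, or None if unavailable."""
    # nvidia-smi is installed with the NVIDIA driver; if it is missing,
    # the driver is probably not installed (or not on the PATH).
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip().splitlines()


info = query_driver()
print(info if info else "No NVIDIA driver found")
```

If this prints the name of your card together with a driver version, the first step is done.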

Another thing you will need is the latest **Microsoft Visual C++** development kit.

Install [CUDA toolkit 11](https://developer.nvidia.com/cuda-downloads) at least, and ideally the latest one available.

We will also need **cuDNN**. Here we differ from the guidelines offered by Professor Heaton in the tutorial listed in the references section. The reason is that, if we are willing to go with versions of TensorFlow or other frameworks that are not the very latest, we can let **conda** take care of installing the remaining pieces. We will use conda to create a brand new environment containing versions of cudnn and cudatoolkit that our graphics card supports, plus a compatible combination of Python and TensorFlow versions.

In my case this happens to be:

```bash
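# Create the environment with the CUDA pieces that conda should manage.
# Note: TensorFlow 2.3's tested configuration is CUDA 10.1 with cuDNN 7.6,
# so switch to cudatoolkit=10.1 if conda cannot resolve the pins below.
conda create -n tensorflow-cuda cudnn=7.6 cudatoolkit=11.1 python=3.7
conda activate tensorflow-cuda
# Jupyter integration for conda environments
conda install nb_conda
pip install tensorflow==2.3.0
# Pull in any extra libraries listed in environment.yml
conda env update --file environment.yml
# Register the environment as a Jupyter kernel
python -m ipykernel install --user --name tensorflow-cuda --display-name "Python (tensorflow-cuda)"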
```

I typically use `environment.yml` as a means to install any additional library I may need, updating the environment on the fly each time. Say I want to use scikit-learn and it isn't installed: I just add it to the `environment.yml` file and update the environment.

# Check Configuration

After this, we can quickly test whether our installation of TensorFlow is working properly. I will use the Jupyter kernel running behind this post, as it is easy and convenient for me. A simple matrix multiplication will do. I mean, that's **why we use GPUs to begin with**.

```python
import tensorflow as tf

print(tf.test.is_gpu_available())
print(tf.test.is_built_with_cuda())
```

```text
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
True
True
```
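The deprecation notice in that output points to `tf.config.list_physical_devices('GPU')` as the replacement. A minimal sketch of the same check using that call could look like this. The helper name `list_gpus` is made up here, and it returns an empty list when TensorFlow is not installed, which is a simplification.

```python
def list_gpus():
    """Return the names of the GPUs TensorFlow can see (empty list if none)."""
    try:
        import tensorflow as tf
    except ImportError:
        # Simplification: a missing TensorFlow is treated like "no GPU".
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = list_gpus()
print(f"GPUs visible to TensorFlow: {gpus or 'none'}")
```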


If I switch to the jupyter kernel without GPU enabled:

```python
%%time

import tensorflow as tf
import warnings
warnings.filterwarnings('ignore')

tf.debugging.set_log_device_placement(True)
a = tf.constant([1.22582], shape=[10000, 10000], name='a')
b = tf.constant([1.1125], shape=[10000, 10000], name='b')
c = tf.matmul(a, b)
```

```text
Executing op Fill in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op Fill in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op MatMul in device /job:localhost/replica:0/task:0/device:CPU:0
Wall time: 6.45 s
```


And with the GPU:

```python
%%time

import tensorflow as tf
import warnings
warnings.filterwarnings('ignore')

tf.debugging.set_log_device_placement(True)
a = tf.constant([1.22582], shape=[10000, 10000], name='a')
b = tf.constant([1.1125], shape=[10000, 10000], name='b')
c = tf.matmul(a, b)
```

```text
Executing op Fill in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Fill in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0
Wall time: 608 ms
```


As we can see, with the GPU the matrix multiplication runs roughly 10 times faster (608 ms versus 6.45 s) :smiley:.
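The timings above include the TensorFlow import and any one-off start-up cost, so they are only a rough guide. If you want to compare both devices from a single kernel, something along these lines might work. It is a sketch under a few assumptions: the helper names `best_of`, `matmul_on` and `compare_devices` are made up, `tf.fill` stands in for the constants used above, and the result is reduced to a scalar so that the GPU has to finish before the clock stops.

```python
import time

try:
    import tensorflow as tf
except ImportError:  # lets the helpers load even without TensorFlow
    tf = None


def best_of(fn, repeats=3):
    """Run fn several times and return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


def matmul_on(device, size):
    """Return a function that multiplies two constant matrices on `device`."""
    def run():
        with tf.device(device):
            a = tf.fill([size, size], 1.22582)
            b = tf.fill([size, size], 1.1125)
            # Reading back a scalar forces the computation to complete.
            tf.reduce_sum(tf.matmul(a, b)).numpy()
    return run


def compare_devices(size=10000):
    """Time the same matmul on the CPU and, if present, on the first GPU."""
    if tf is None:
        return {}
    results = {"/CPU:0": best_of(matmul_on("/CPU:0", size))}
    if tf.config.list_physical_devices("GPU"):
        results["/GPU:0"] = best_of(matmul_on("/GPU:0", size))
    return results
```

Calling `compare_devices()` returns a dictionary of seconds per device. Taking the best of a few runs keeps the first-call warm-up from skewing the comparison.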

# References

* [TensorFlow Documentation](https://www.tensorflow.org/install/gpu)
* [Jeff Heaton's Video Tutorial](https://www.youtube.com/watch?v=OEFKlRSd8Ic)