### Using GPUs Can Speed Up Our Models

We will look at how to speed up our models by using a **GPU**. We will also see how to split the computations across multiple devices, including the **CPU** and multiple **GPU** devices.

Thanks to GPUs, instead of waiting days or weeks for a training algorithm to complete, we may end up waiting just a few minutes or hours. Not only does this save an enormous amount of time, it also means we can experiment with various models much more easily and retrain our models on fresh data more frequently.

![image.png](attachment:image.png)

We can often get a significant performance boost by merely adding **GPU cards** to a single machine. In fact, in many cases, this will suffice; we won’t need to use multiple machines at all. 
* We can typically train a neural network just as fast using four GPUs on a single machine as using eight GPUs across multiple machines, due to the extra delay imposed by network communications in a distributed setup.
* Similarly, using a single powerful GPU is often preferable to using multiple slower GPUs.
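To make the trade-off in the two bullets above concrete, here is a toy cost model. It is only a sketch: the `step_time` helper and every number in it are made-up assumptions chosen for illustration, not measurements of any real hardware.

```python
def step_time(compute_seconds, n_gpus, sync_seconds):
    """Time for one training step in a toy model: the compute is split
    evenly across the GPUs, then all of them wait sync_seconds to
    exchange gradients."""
    return compute_seconds / n_gpus + sync_seconds


# Illustrative, made-up numbers (assumptions, not benchmarks):
#   0.8 s of compute per step on one GPU,
#   0.02 s to synchronize GPUs inside a single machine,
#   0.12 s to synchronize GPUs over the network between machines.
single_machine = step_time(0.8, n_gpus=4, sync_seconds=0.02)
multi_machine = step_time(0.8, n_gpus=8, sync_seconds=0.12)

print(f"4 GPUs, one machine:       {single_machine:.2f} s/step")
print(f"8 GPUs, several machines:  {multi_machine:.2f} s/step")
```

Under these assumed numbers both setups take about 0.22 s per step: the extra synchronization delay cancels out the benefit of doubling the number of GPUs.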

### Using a GPU-Equipped Virtual Machine

All major cloud platforms now offer **GPU VMs**, some preconfigured with all the drivers and libraries we need (including TensorFlow).
* **Google Cloud Platform (GCP)** enforces various GPU quotas, both worldwide and per region: we cannot just create thousands of **GPU VMs** without prior authorization from Google.
* By default, the worldwide GPU quota is zero, so we cannot use any **GPU VMs**. Therefore, the very first thing we need to do is request a higher worldwide quota.

In the **GCP Console**:
1. Open the navigation menu and go to **IAM & admin → Quotas**.
2. Click **Metric**.
3. Click **None** to uncheck all metrics, then search for **GPU** and select **GPUs** to see the corresponding quota.
4. If this quota's value is zero, check the box next to it and click **Edit quotas**.
5. Fill in the requested information, then click **Submit request**.

It may take a few hours (or up to a few days) for our quota request to be processed and accepted. By default, there is also a quota of one GPU per region and per GPU type. We can request to increase these quotas too:
1. Click **Metric**.
2. Select **None** to uncheck all metrics, search for **GPU**, and choose the type of **GPU** we want (e.g., **NVIDIA P4 GPUs**).
3. Click the **Location** drop-down menu.
4. Click **None** to uncheck all locations, select the location we want, and check the boxes next to the quota(s) we want to change.
5. Click **Edit quotas** to file a request.

Once our **GPU** quota requests are approved, we can quickly create a **VM** equipped with **one or more GPUs** by using **Google Cloud AI Platform's Deep Learning VM images**:
1. Go to https://homl.info/dlvm.
2. Click **View Console**, then click **Launch on Compute Engine** and fill in the VM configuration form.

Note that some locations do not have all types of GPUs, and some have no GPUs at all (change the location to see the types of GPUs available, if any).

Make sure to select TensorFlow 2.0 as the framework, and check **Install NVIDIA GPU driver automatically on first startup**. It is also a good idea to check **Enable access to JupyterLab via URL instead of SSH**:
* This will make it very easy to start a **Jupyter notebook** running on this **GPU VM**, powered by **JupyterLab** (an alternative web interface for running **Jupyter notebooks**).
* Once the Notebook instance appears in the list (this may take a few minutes, so click Refresh once in a while until it appears), click its **Open JupyterLab** link.
* This will run **JupyterLab** on the **VM** and connect our browser to it. We can create notebooks and run any code we want on this **VM**, and benefit from its **GPUs**.
* If we just want to run some quick tests or easily share notebooks with our colleagues, we should try Colaboratory instead.
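Once a notebook is running on the GPU VM, a quick sanity check is to ask the NVIDIA driver which GPUs it can see. The sketch below assumes the driver's `nvidia-smi` command-line tool is on the `PATH` (it normally ships with the NVIDIA driver); the `gpu_summary` helper is a hypothetical name introduced here for illustration.

```python
import shutil
import subprocess


def gpu_summary():
    """Return nvidia-smi's list of GPUs as text, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None  # the NVIDIA driver tools are not installed or not on PATH
    result = subprocess.run(
        ["nvidia-smi", "--list-gpus"], capture_output=True, text=True
    )
    # A non-zero exit code usually means the driver could not be reached.
    return result.stdout.strip() if result.returncode == 0 else None


summary = gpu_summary()
print(summary if summary else "No NVIDIA GPU detected on this machine.")
```

If this prints no GPU on a VM that should have one, the driver may still be installing after the first startup.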

### Google Colaboratory — Free GPU

The simplest and cheapest way to access a **GPU VM** is to use **Colaboratory (or Colab, for short)**. It's free! Just go to **Google Colab** and create a new **Python 3 notebook**: this will create a Jupyter notebook on our **Google Drive**.

Colab's user interface is similar to Jupyter's, except we can share and use the notebooks like regular Google Docs. There are a few other minor differences (e.g., we can create handy widgets using special comments in our code).

When we open a Colab notebook, it runs on a free Google VM dedicated to us, called a **Colab Runtime**. By default, the Runtime is **CPU-only**, but we can change this by going to **Runtime → Change runtime type**, selecting **GPU** in the **Hardware accelerator** drop-down menu, then clicking **Save**.
* We could even choose **TPU**!
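After switching the runtime type, it is worth confirming that TensorFlow actually sees the GPU. The following is a minimal sketch: the `visible_gpus` helper is a hypothetical name, and it assumes TensorFlow 2.x (it falls back to the `tf.config.experimental` namespace used by TensorFlow 2.0, and returns an empty list if TensorFlow is not installed).

```python
def visible_gpus():
    """Return the names of the GPUs TensorFlow can see (empty list if none)."""
    try:
        import tensorflow as tf
    except ImportError:
        return []  # TensorFlow is not installed in this environment
    # TF 2.1+ exposes list_physical_devices directly on tf.config;
    # TF 2.0 only had it under tf.config.experimental.
    list_devices = getattr(tf.config, "list_physical_devices", None)
    if list_devices is None:
        list_devices = tf.config.experimental.list_physical_devices
    return [device.name for device in list_devices("GPU")]


gpus = visible_gpus()
print(f"{len(gpus)} GPU(s) visible to TensorFlow: {gpus}")
```

On a CPU-only Runtime the list is empty; after selecting **GPU** and reconnecting, it should contain at least one entry.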

If we run multiple Colab notebooks using the same runtime type, they will use the same Colab Runtime. So if one writes to a file, the others will be able to read that file. It's essential to understand the security implications of this.
* If we run an untrusted Colab notebook written by a malicious hacker, it may read private data produced by the other notebooks and then leak it back to the hacker.
* If this includes private access keys for some resources, the hacker will gain access to those resources.
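The risk described above comes down to a shared filesystem. The sketch below simulates it locally: a temporary directory stands in for the Runtime's disk, and the two functions stand in for two separate notebooks. The file name and the key value are hypothetical placeholders.

```python
import os
import tempfile

# Stand-in for the disk of a shared Colab Runtime (an assumption for this demo).
shared_dir = tempfile.mkdtemp()
secret_path = os.path.join(shared_dir, "api_key.txt")


def trusted_notebook():
    """Our own notebook saves a (hypothetical) access key to disk."""
    with open(secret_path, "w") as f:
        f.write("hypothetical-secret-key")


def untrusted_notebook():
    """Another notebook on the same Runtime can simply read that file."""
    with open(secret_path) as f:
        return f.read()


trusted_notebook()
leaked = untrusted_notebook()
print("Untrusted notebook read:", leaked)
```

Nothing in the second function needs special privileges, which is why secrets should not be left on disk in a Runtime that may also run untrusted notebooks.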

Moreover, if we install a library in the Colab Runtime, the other notebooks will also have that library. Depending on what we want to do, this might be great or annoying (e.g., it means we cannot easily use different versions of the same library in different Colab notebooks).
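Because installed libraries are shared, it can help to check which version is actually present before relying on it. Here is a small sketch using the standard library's `importlib.metadata` (Python 3.8+); the `installed_version` helper is a hypothetical name, and `tensorflow` is just an example package to look up.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Example lookup; any distribution name can be used here.
print("tensorflow version in this runtime:", installed_version("tensorflow"))
```

Running this at the top of a notebook makes it obvious when another notebook on the same Runtime has upgraded or downgraded a dependency.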

Colab does have some restrictions: as the FAQ states, **Colaboratory is intended for interactive use. Long-running background computations, particularly on GPUs, may be stopped. Please do not use Colaboratory for cryptocurrency mining**.
* The web interface will automatically disconnect from the Colab Runtime if we leave it unattended for a while (~30 minutes).

When we reconnect to the Colab Runtime, it may have been reset, so make sure we always download any data we care about.
* Even if we never disconnect, the Colab Runtime will automatically shut down after 12 hours, as it is not meant for long-running computations. Despite these limitations, it's a fantastic tool to run tests quickly, get quick results, and collaborate with our colleagues.
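Since the Runtime can be reset at any time, one simple habit is to bundle the output files into a single archive and download it before leaving. The sketch below is one possible approach, not the only one: the `archive_outputs` helper, the directory layout, and the file contents are hypothetical, and `google.colab.files.download` is only available inside Colab, so the code falls back to a message elsewhere.

```python
import os
import shutil
import tempfile


def archive_outputs(output_dir, archive_base):
    """Zip everything in output_dir into archive_base + '.zip'; return its path."""
    return shutil.make_archive(archive_base, "zip", root_dir=output_dir)


# Demo with a throwaway directory standing in for our real results folder.
work_dir = tempfile.mkdtemp()
outputs = os.path.join(work_dir, "outputs")
os.makedirs(outputs)
with open(os.path.join(outputs, "metrics.txt"), "w") as f:
    f.write("placeholder results\n")

archive_path = archive_outputs(outputs, os.path.join(work_dir, "outputs_backup"))

try:
    from google.colab import files  # importable only inside a Colab Runtime
    files.download(archive_path)  # triggers a browser download of the archive
except ImportError:
    print("Not running in Colab; archive left at", archive_path)
```

For longer sessions, saving such an archive periodically limits how much work is lost if the Runtime shuts down.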