# Working with GPUs in OpenVINO™

This tutorial provides a high-level overview of working with GPUs in OpenVINO. It shows how to use Query Device to list the GPUs in a system and check their properties, and it explains some of the key properties. It also shows how to compile a model on a GPU with performance hints and how to run on multiple GPUs with MULTI or CUMULATIVE_THROUGHPUT.

The tutorial shows example commands for benchmark_app that users can run to compare GPU performance in different configurations. It also provides code for a basic end-to-end application that compiles a model on GPU and uses it to run inference.

## Introduction

1. Background and context on how GPUs are used to speed up inference
2. Introduce OpenVINO’s ability to run inference with GPUs
3. How to configure OpenVINO to work with GPUs (link to Configuration for GPU with OpenVINO page)

## Checking GPUs with Query Device

1. List GPUs with core.get_available_devices
2. Check properties with core.get_property
3. Brief descriptions of key properties

### List GPUs with core.get_available_devices


First, to use GPUs, we must make sure that the system detects them correctly.
Running the following cell should output the list of devices available to OpenVINO, in which our Intel GPUs should appear.

In [1]:
from openvino.runtime import Core

core = Core()
core.available_devices

['CPU']

If the GPUs are installed in the system but do not appear in the list (as in the output above, where only the CPU was detected), we should follow the steps in the [GPU configuration guide](https://docs.openvino.ai/latest/openvino_docs_install_guides_configurations_for_intel_gpu.html) and try again. Once OpenVINO detects the GPUs, we can proceed to the next sections.
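The check described above can also be done in code. The following is a minimal sketch, not part of the original notebook: the helper names `list_gpus` and `query_gpus` are hypothetical, and the sketch assumes that OpenVINO reports a single GPU as `GPU` and several GPUs as `GPU.0`, `GPU.1`, and so on.

```python
def list_gpus(available_devices):
    """Return only the GPU entries from a list of OpenVINO device names."""
    return [
        name for name in available_devices
        if name == "GPU" or name.startswith("GPU.")
    ]


def query_gpus():
    """Ask the OpenVINO runtime for its devices and keep the GPUs."""
    # Imported lazily so that list_gpus can be used without OpenVINO installed.
    from openvino.runtime import Core

    return list_gpus(Core().available_devices)
```

An empty list from `query_gpus()` means that no GPU was detected, which is the situation shown in the cell output above.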

### Check properties with core.get_property

To get information about our GPUs and customize their behavior, we can use device properties. Devices in OpenVINO, such as CPUs and GPUs, have two types of properties: read-only and read-write. The former mainly expose information about the hardware itself, such as the device name or the supported data types, while the latter let us tweak how the model is compiled, for instance to reduce latency or increase throughput.

To get the value of a property, such as the device name, we can use the `core.get_property` method as follows:

In [2]:
device = "CPU"
core.get_property(device, "FULL_DEVICE_NAME")

'Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz'

Each device also has a special property, `SUPPORTED_PROPERTIES`, that lists all the properties available on the device (including `SUPPORTED_PROPERTIES` itself). To read it, we call `core.get_property` again with that property name:

In [3]:
core.get_property(device, "SUPPORTED_PROPERTIES")

{'SUPPORTED_PROPERTIES': 'RO',
 'AVAILABLE_DEVICES': 'RO',
 'RANGE_FOR_ASYNC_INFER_REQUESTS': 'RO',
 'RANGE_FOR_STREAMS': 'RO',
 'FULL_DEVICE_NAME': 'RO',
 'OPTIMIZATION_CAPABILITIES': 'RO',
 'CACHING_PROPERTIES': 'RO',
 'CACHE_DIR': 'RO',
 'NUM_STREAMS': 'RW',
 'AFFINITY': 'RW',
 'INFERENCE_NUM_THREADS': 'RW',
 'PERF_COUNT': 'RW',
 'INFERENCE_PRECISION_HINT': 'RW',
 'PERFORMANCE_HINT': 'RW',
 'PERFORMANCE_HINT_NUM_REQUESTS': 'RW'}

Note that the value for each property is either "RO" or "RW", which corresponds to the two types mentioned previously: "**R**ead-**O**nly" and "**R**ead-**W**rite" respectively.
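To see at a glance which properties can be changed, the two groups can be separated. This is a small sketch under one assumption: that `SUPPORTED_PROPERTIES` comes back as a mapping from property name to `"RO"` or `"RW"`, as in the output above. The helper name `split_by_access` is hypothetical.

```python
def split_by_access(supported_properties):
    """Split a {name: "RO" | "RW"} mapping into two sorted lists of names."""
    read_only = sorted(
        name for name, access in supported_properties.items() if access == "RO"
    )
    read_write = sorted(
        name for name, access in supported_properties.items() if access == "RW"
    )
    return read_only, read_write
```

In the notebook it could be called as `split_by_access(core.get_property(device, "SUPPORTED_PROPERTIES"))`.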

### Brief descriptions of key properties

#### PERFORMANCE_HINT
#### INFERENCE_PRECISION_HINT

#### Current values

In [4]:
for prop in core.get_property(device, "SUPPORTED_PROPERTIES"):
    if prop != "SUPPORTED_PROPERTIES":
        print(f"{prop:>30}: {core.get_property(device, prop)}")

             AVAILABLE_DEVICES: ['']
RANGE_FOR_ASYNC_INFER_REQUESTS: (1, 1, 1)
             RANGE_FOR_STREAMS: (1, 12)
              FULL_DEVICE_NAME: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
     OPTIMIZATION_CAPABILITIES: ['FP32', 'FP16', 'INT8', 'BIN', 'EXPORT_IMPORT']
            CACHING_PROPERTIES: {}
                     CACHE_DIR: 
                   NUM_STREAMS: 1
                      AFFINITY: Affinity.NONE
         INFERENCE_NUM_THREADS: 0
                    PERF_COUNT: False
      INFERENCE_PRECISION_HINT: <Type: 'float32'>
              PERFORMANCE_HINT: PerformanceMode.UNDEFINED
 PERFORMANCE_HINT_NUM_REQUESTS: 0


## Compiling a Model on GPU

1. Compile with default configuration (core.compile_model(model, "GPU"))
2. Throughput and latency performance hints
3. Using multiple GPUs with multi-device and cumulative throughput

### Compile with default configuration (core.compile_model(model, "GPU"))

In [5]:
model = core.read_model(model="../001-hello-world/model/v3-small_224_1.0_float.xml")
compiled_model = core.compile_model(model, "GPU")

RuntimeError: Device with "GPU" name is not registered in the OpenVINO Runtime

In [6]:
for prop in compiled_model.get_property("SUPPORTED_PROPERTIES"):
    if prop != "SUPPORTED_PROPERTIES":
        print(f"{prop:>30}: {compiled_model.get_property(prop)}")

NameError: name 'compiled_model' is not defined
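Both errors above have the same cause: no GPU is registered on the machine that ran these cells, so `compile_model` fails and `compiled_model` is never defined. One way to keep a notebook runnable on such machines is to choose the device from the list that OpenVINO reports. The sketch below is illustrative only; `choose_device` is a hypothetical helper, and it assumes GPU names of the form `GPU` or `GPU.<index>`.

```python
def choose_device(available_devices, preferred="GPU", fallback="CPU"):
    """Return the first device matching `preferred`, or `fallback` if none does."""
    for name in available_devices:
        # Matches "GPU" as well as indexed names such as "GPU.0".
        if name == preferred or name.startswith(preferred + "."):
            return name
    return fallback
```

With the objects from the cells above, this would be used as `core.compile_model(model, choose_device(core.available_devices))`.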

### Throughput and latency performance hints

#### Throughput

In [7]:
compiled_model = core.compile_model(model, "CPU", {"PERFORMANCE_HINT": "THROUGHPUT"})

In [8]:
compiled_model.get_property("PERFORMANCE_HINT")

<PerformanceMode.THROUGHPUT: 2>

#### Latency

In [9]:
compiled_model = core.compile_model(model, "CPU", {"PERFORMANCE_HINT": "LATENCY"})

In [10]:
compiled_model.get_property("PERFORMANCE_HINT")

<PerformanceMode.LATENCY: 1>
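To see what a hint changes in practice, we can compile the model once per hint and read back a property of the compiled model. The sketch below is an illustration, not the notebook's own code: the helper names are hypothetical, and it assumes that the compiled model exposes the `OPTIMAL_NUMBER_OF_INFER_REQUESTS` property on the chosen device.

```python
VALID_HINTS = ("LATENCY", "THROUGHPUT")


def hint_config(hint):
    """Build a compile_model config dict for a performance hint (case-insensitive)."""
    normalized = hint.upper()
    if normalized not in VALID_HINTS:
        raise ValueError(f"Unknown performance hint: {hint!r}")
    return {"PERFORMANCE_HINT": normalized}


def optimal_requests_per_hint(model_path, device="CPU"):
    """Compile the model once per hint and collect the suggested request count."""
    # Imported lazily so that hint_config can be used without OpenVINO installed.
    from openvino.runtime import Core

    core = Core()
    model = core.read_model(model=model_path)
    results = {}
    for hint in VALID_HINTS:
        compiled = core.compile_model(model, device, hint_config(hint))
        results[hint] = compiled.get_property("OPTIMAL_NUMBER_OF_INFER_REQUESTS")
    return results
```

The values depend on the device and the model, so no expected output is shown here.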

### Using multiple GPUs with multi-device and cumulative throughput
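As a starting point for this section, the sketch below builds the device string and configuration for the two approaches named in the heading. It assumes that the system exposes its GPUs as `GPU.0`, `GPU.1`, and so on, that MULTI is selected with a `MULTI:` device string, and that cumulative throughput is selected by passing the `CUMULATIVE_THROUGHPUT` performance hint to the `AUTO` device. The helper `multi_gpu_target` is hypothetical.

```python
def multi_gpu_target(gpu_names, mode="MULTI"):
    """Return a (device_string, config) pair for running on several GPUs."""
    if not gpu_names:
        raise ValueError("At least one GPU name is required")
    device_list = ",".join(gpu_names)
    if mode == "MULTI":
        # The MULTI device spreads inference requests over the listed devices.
        return f"MULTI:{device_list}", {}
    if mode == "CUMULATIVE_THROUGHPUT":
        # The AUTO device with this hint uses all listed devices together.
        return f"AUTO:{device_list}", {"PERFORMANCE_HINT": "CUMULATIVE_THROUGHPUT"}
    raise ValueError(f"Unknown mode: {mode!r}")
```

The result would be passed on as `device, config = multi_gpu_target(["GPU.0", "GPU.1"])` followed by `core.compile_model(model, device, config)`.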

## Performance Comparison with benchmark_app

1. Commands showing users how to run benchmark_app on GPU with various performance hints
2. Show performance results with a basic model (person-detection-0303, perhaps)

For further details, see the benchmark tool documentation: https://docs.openvino.ai/latest/openvino_inference_engine_tools_benchmark_tool_README.html#benchmark-python-tool

### Commands showing users how to run benchmark_app on GPU with various performance hints

In [11]:
!benchmark_app -m notebooks/001-hello-world/model/v3-small_224_1.0_float.xml -hint latency -d GPU

[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2022.3.0-9052-9752fafe8eb-releases/2022/3
[ INFO ] 
[ INFO ] Device info:
[ ERROR ] Device with "GPU" name is not registered in the OpenVINO Runtime
Traceback (most recent call last):
  File "/Users/juliomorero/Documents/bdti/openvino_env/lib/python3.10/site-packages/openvino/tools/benchmark/main.py", line 104, in main
    benchmark.print_version_info()
  File "/Users/juliomorero/Documents/bdti/openvino_env/lib/python3.10/site-packages/openvino/tools/benchmark/benchmark.py", line 48, in print_version_info
    for device, version in self.core.get_versions(self.device).items():
RuntimeError: Device with "GPU" name is not registered in the OpenVINO Runtime


In [None]:
!benchmark_app -m notebooks/001-hello-world/model/v3-small_224_1.0_float.xml -hint latency

In [None]:
!benchmark_app -m notebooks/001-hello-world/model/v3-small_224_1.0_float.xml -hint throughput

In [None]:
!benchmark_app -m notebooks/001-hello-world/model/v3-small_224_1.0_float.xml -hint cumulative_throughput
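When several configurations are compared, it can help to generate the commands rather than type each one. The following sketch only assembles the argument list from the `-m`, `-hint`, and `-d` flags used in the cells above; it does not run anything, and the helper name `benchmark_command` is hypothetical.

```python
import shlex


def benchmark_command(model_path, hint, device=None):
    """Build a benchmark_app argument list for one hint and an optional device."""
    args = ["benchmark_app", "-m", model_path, "-hint", hint]
    if device is not None:
        args += ["-d", device]
    return args


def print_commands(model_path, hints, device=None):
    """Print one shell-quoted command line per hint."""
    for hint in hints:
        print(shlex.join(benchmark_command(model_path, hint, device)))
```

Each list can be handed to `subprocess.run` if the commands should be executed from Python instead of from a notebook cell.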

### Show performance results with a basic model (person-detection-0303, perhaps)

## Basic Application Using GPUs

1. Provide end-to-end sample code for running inference on GPU in a basic application
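A minimal sketch of such an application is given below. It is an illustration under stated assumptions, not a finished sample: it assumes a model with a single static-shape float32 input and a single output of class scores, it feeds random data in place of a real preprocessed image, and the helper names `top_k_indices` and `run_inference` are hypothetical.

```python
def top_k_indices(scores, k=5):
    """Return the indices of the k largest scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def run_inference(model_path, device="GPU"):
    """Compile the model on `device` and run one inference on random input."""
    # Imported lazily so that top_k_indices works without OpenVINO or NumPy.
    import numpy as np
    from openvino.runtime import Core

    core = Core()
    model = core.read_model(model=model_path)
    compiled_model = core.compile_model(model, device)

    input_layer = compiled_model.input(0)
    output_layer = compiled_model.output(0)

    # Random data stands in for a real, preprocessed image.
    # Assumption: the input shape is static and the input type is float32.
    data = np.random.rand(*list(input_layer.shape)).astype(np.float32)

    # Calling the compiled model runs synchronous inference.
    result = compiled_model([data])[output_layer]
    return top_k_indices(result.flatten().tolist())
```

On a machine where the GPU is detected, `run_inference("../001-hello-world/model/v3-small_224_1.0_float.xml")` would return the indices of the five highest-scoring outputs; on a machine without a GPU, pass `device="CPU"` instead.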

## Conclusion

1. GPUs are easy to use with OpenVINO and can considerably boost inference performance
2. Links to OpenVINO documentation where readers can learn more