# Managing IPU resources from notebooks

Because of the way IPUs and notebooks execute code, experimenting with different models
can leave you holding hardware in an idle state, preventing other users from using it, or
can cause your own experiments to fail because too few IPUs are free.
Releasing hardware is particularly important in notebooks because the long-lived
underlying `ipython` kernel can keep a lock on IPUs long after you have finished interacting
with the hardware.

The Graphcore frameworks use a computational architecture of 1 model = 1 IPU device (a
device may span several IPUs): each model attaches to specific IPUs and only releases them
when that model goes out of scope or when resources are explicitly released.

In this notebook you will learn:

- to monitor how many IPUs your notebook is currently using
- to release IPUs by detaching a model
- to reattach a model to IPUs, to continue using a model after a period of inactivity.

For more information on the basics of IPU computational architecture you may want to read
the [IPU Programmer's Guide](https://docs.graphcore.ai/projects/ipu-programmers-guide/en/latest/ipu_introduction.html).

## Setup

To run this demo you will need [🤗 Optimum Graphcore](https://github.com/huggingface/optimum-graphcore).

First, let's make sure your environment has a compatible version (0.4.x) installed.

In [1]:
%pip install "optimum-graphcore>=0.4, <0.5"

Collecting optimum-graphcore<0.5,>=0.4
  Downloading optimum_graphcore-0.4.3-py3-none-any.whl (180 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 180.6/180.6 kB 10.4 MB/s eta 0:00:00
Installing collected packages: optimum-graphcore
  Attempting uninstall: optimum-graphcore
    Found existing installation: optimum-graphcore 0.5.0
    Uninstalling optimum-graphcore-0.5.0:
      Successfully uninstalled optimum-graphcore-0.5.0
Successfully installed optimum-graphcore-0.4.3
Note: you may need to restart the kernel to use updated packages.


In [2]:
import os
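
To confirm which version ended up installed, a small check along these lines may help. This is a minimal sketch: `version_tuple` and `matches_pin` are hypothetical helper names, and the comparison is a simplified numeric one that ignores pre-release suffixes.

```python
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Turn '0.4.3' into (0, 4, 3); only the leading numeric parts are kept."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def matches_pin(version: str) -> bool:
    """True when version satisfies the pin used above: >=0.4, <0.5."""
    return (0, 4) <= version_tuple(version) < (0, 5)


try:
    installed = metadata.version("optimum-graphcore")
    print(installed, "matches pin:", matches_pin(installed))
except metadata.PackageNotFoundError:
    print("optimum-graphcore is not installed in this environment")
```

If the reported version is outside the pinned range, restarting the kernel after the install, as the `pip` note suggests, is worth trying first.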

## Monitoring resources

Graphcore provides the `gc-monitor` utility for inspecting the number of available IPUs and their usage:

In [3]:
!gc-monitor

+---------------+---------------------------------------------------------------------------------+
|  gc-monitor   |         Partition: lr17-1-poplar-19 [active] has 4 reconfigurable IPUs          |
+-------------+--------------------+--------+--------------+----------+-------+----+------+-------+
|    IPU-M    |       Serial       |IPU-M SW|Server version|  ICU FW  | Type  | ID | IPU# |Routing|
+-------------+--------------------+--------+--------------+----------+-------+----+------+-------+
|  10.5.18.3  | 0055.0002.8204821  | 2.6.0  |    1.11.0    |  2.5.9   | M2000 | 0  |  3   |  DNC  |
|  10.5.18.3  | 0055.0002.8204821  | 2.6.0  |    1.11.0    |  2.5.9   | M2000 | 1  |  2   |  DNC  |
|  10.5.18.3  | 0055.0001.8204821  | 2.6.0  |    1.11.0    |  2.5.9   | M2000 | 2  |  1   |  DNC  |
|  10.5.18.3  | 0055.0001.8204821  | 2.6.0  |    1.11.0    |  2.5.9   | M2000 | 3  |  0   |  DNC  |
+-------------+--------------------+--------+--------------+----------+-------+----+------+-------+


In a notebook, we can run this shell command by prefixing it with `!` in a regular code cell. It provides detailed information on the IPUs that exist in the current partition.
The first section of the output is the `card-info`: generic information about the IP addresses and serial numbers of all the cards visible to the process.
The second section of the output shows usage information for the IPUs: the user, host and PID attached to each IPU.

When monitoring IPUs it can be useful to run `gc-monitor` without displaying the static IPU information:

In [4]:
!gc-monitor --no-card-info

+--------------------------------------------------------------------------------------------------+
|                       No attached processes in partition lr17-1-poplar-19                        |
+--------------------------------------------------------------------------------------------------+


Finally, we can write a command that monitors only the IPUs attached from this specific notebook, by displaying only the rows that mention the PID of the notebook's kernel:

In [5]:
!gc-monitor --no-card-info | grep ${os.getpid()}

Since we've not attached to any IPUs yet, there is no output.
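
The same filtering can be sketched in Python, which makes it possible to match the PID column exactly rather than as a substring. The helper names `rows_for_pid` and `attached_rows` are hypothetical, and the sketch assumes that the PID is the first column of each usage row, as the rows printed later in this notebook suggest.

```python
import os
import subprocess


def rows_for_pid(monitor_output: str, pid: int) -> list:
    """Return the gc-monitor table rows whose first column equals pid."""
    rows = []
    for line in monitor_output.splitlines():
        if not line.startswith("|"):
            continue  # skip borders such as +------+
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Assumption: the first cell of a usage row is the PID.
        if cells and cells[0] == str(pid):
            rows.append(line)
    return rows


def attached_rows(pid=None) -> list:
    """Run gc-monitor and keep only the rows for pid (default: this process)."""
    result = subprocess.run(
        ["gc-monitor", "--no-card-info"],
        capture_output=True, text=True, check=True,
    )
    return rows_for_pid(result.stdout, os.getpid() if pid is None else pid)
```

On a machine with the Poplar SDK installed, `attached_rows()` should return an empty list at this point, since nothing is attached yet.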

Beyond `gc-monitor`, Graphcore also provides a library for monitoring usage called `gcipuinfo` which can be used in Python. This library is not covered in this tutorial but [examples are available in the documentation](https://docs.graphcore.ai/projects/gcipuinfo/en/latest/examples.html).

### Creating models

Now let's create some models and attach them to IPUs. The simplest way to create a small model is using the inference `pipeline` provided by the `optimum-graphcore` library.

In [6]:
from optimum.graphcore import pipelines
sentiment_pipeline = pipelines.pipeline("sentiment-analysis")
sentiment_pipeline(["IPUs are great!", "Notebooks are easy to program in"])

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


Downloading:   0%|          | 0.00/629 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/255M [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/699 [00:00<?, ?B/s]



Downloading:   0%|          | 0.00/48.0 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/629 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/226k [00:00<?, ?B/s]

No padding arguments specified, so pad to 128 by default. Inputs longer than 128 will be truncated. To change this behaviour, pass the `padding='max_length'` and`max_length=<your desired input length>` arguments to the pipeline function
Graph compilation: 100%|██████████| 100/100 [00:29<00:00]


[{'label': 'POSITIVE', 'score': 0.9998313188552856},
 {'label': 'POSITIVE', 'score': 0.972362220287323}]

In [22]:
sentiment_pipeline(8*["poor effect.", "Notebooks are easy to program in"])

[{'label': 'NEGATIVE', 'score': 0.9997872710227966},
 {'label': 'POSITIVE', 'score': 0.972362220287323},
 {'label': 'NEGATIVE', 'score': 0.9997872710227966},
 {'label': 'POSITIVE', 'score': 0.972362220287323},
 {'label': 'NEGATIVE', 'score': 0.9997872710227966},
 {'label': 'POSITIVE', 'score': 0.972362220287323},
 {'label': 'NEGATIVE', 'score': 0.9997872710227966},
 {'label': 'POSITIVE', 'score': 0.972362220287323},
 {'label': 'NEGATIVE', 'score': 0.9997872710227966},
 {'label': 'POSITIVE', 'score': 0.972362220287323},
 {'label': 'NEGATIVE', 'score': 0.9997872710227966},
 {'label': 'POSITIVE', 'score': 0.972362220287323},
 {'label': 'NEGATIVE', 'score': 0.9997872710227966},
 {'label': 'POSITIVE', 'score': 0.972362220287323},
 {'label': 'NEGATIVE', 'score': 0.9997872710227966},
 {'label': 'POSITIVE', 'score': 0.972362220287323}]

Now let's check how many IPUs are in use:

In [23]:
!gc-monitor 

+---------------+---------------------------------------------------------------------------------+
|  gc-monitor   |         Partition: lr17-1-poplar-19 [active] has 4 reconfigurable IPUs          |
+-------------+--------------------+--------+--------------+----------+-------+----+------+-------+
|    IPU-M    |       Serial       |IPU-M SW|Server version|  ICU FW  | Type  | ID | IPU# |Routing|
+-------------+--------------------+--------+--------------+----------+-------+----+------+-------+
|  10.5.18.3  | 0055.0002.8204821  | 2.6.0  |    1.11.0    |  2.5.9   | M2000 | 0  |  3   |  DNC  |
|  10.5.18.3  | 0055.0002.8204821  | 2.6.0  |    1.11.0    |  2.5.9   | M2000 | 1  |  2   |  DNC  |
|  10.5.18.3  | 0055.0001.8204821  | 2.6.0  |    1.11.0    |  2.5.9   | M2000 | 2  |  1   |  DNC  |
|  10.5.18.3  | 0055.0001.8204821  | 2.6.0  |    1.11.0    |  2.5.9   | M2000 | 3  |  0   |  DNC  |
+-------------+--------------------+--------+--------------+----------+-------+----+------+-------+


In [21]:
!gc-monitor --no-card-info | grep ${os.getpid()}

|  1119  |...3| 5m55s  |    root    | 0  | 1330MHz  | 36.2 C | 28.0 C |153.4 W |
|  1119  |...3| 5m55s  |    root    | 1  | 1330MHz  | 33.3 C |        |        |


These IPUs will be associated with the model in the pipeline until:

- The `sentiment_pipeline` object goes out of scope or
- The model is explicitly detached from the IPU.

Because the model stays attached, it can respond quickly to new prompts:

In [29]:
%%timeit
sentiment_pipeline(32*["IPUs are fast once the pipeline is attached", "and Notebooks are easy to program in"])

86.3 ms ± 105 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
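
As a rough worked example, the timing above can be converted into a throughput figure. The batch holds 32 copies of a two-sentence list, and the mean time per call is taken from the `%%timeit` output; the variable names are only illustrative.

```python
batch_size = 32 * 2          # 32 copies of a two-sentence list
seconds_per_call = 86.3e-3   # mean per-call time reported by %%timeit above

throughput = batch_size / seconds_per_call
print(f"{throughput:.0f} sentences per second")  # prints: 742 sentences per second
```

This figure only holds while the pipeline stays attached; it says nothing about the cost of the first call after a detach.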


If you are testing different models you might have multiple pipelines using IPUs:

In [30]:
sentiment_pipeline_2 = pipelines.pipeline("text-classification")
sentiment_pipeline_2(["IPUs are great!", "Notebooks are easy to program in"])

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
No padding arguments specified, so pad to 128 by default. Inputs longer than 128 will be truncated. To change this behaviour, pass the `padding='max_length'` and`max_length=<your desired input length>` arguments to the pipeline function
Graph compilation: 100%|██████████| 100/100 [00:01<00:00]


[{'label': 'POSITIVE', 'score': 0.9998313188552856},
 {'label': 'POSITIVE', 'score': 0.972362220287323}]

Checking the IPU usage we can see that we are now using four IPUs:

In [31]:
!gc-monitor --no-card-info | grep ${os.getpid()}

|  1119  |...3| 9m45s  |    root    | 0  | 1330MHz  | 36.6 C | 28.0 C |153.2 W |
|  1119  |...3| 9m45s  |    root    | 1  | 1330MHz  | 33.6 C |        |        |
|  1119  |...3| 9m45s  |    root    | 2  | 1330MHz  | 39.1 C |        |        |
|  1119  |...3| 9m45s  |    root    | 3  | 1330MHz  | 36.2 C |        |        |
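
Counting the attached IPUs can also be done programmatically. The sketch below parses rows like the ones just printed; `attached_ipu_ids` is a hypothetical helper, and the column order (PID first, IPU ID fifth) is inferred from this output rather than taken from the `gc-monitor` documentation.

```python
def attached_ipu_ids(monitor_output: str, pid: int) -> list:
    """Collect the IPU IDs listed for pid in gc-monitor's usage table."""
    ids = []
    for line in monitor_output.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Assumed layout: PID | command | time | user | IPU ID | clock | ...
        if len(cells) >= 5 and cells[0] == str(pid) and cells[4].isdigit():
            ids.append(int(cells[4]))
    return sorted(ids)


sample = """\
|  1119  |...3| 9m45s  |    root    | 0  | 1330MHz  | 36.6 C | 28.0 C |153.2 W |
|  1119  |...3| 9m45s  |    root    | 1  | 1330MHz  | 33.6 C |        |        |
|  1119  |...3| 9m45s  |    root    | 2  | 1330MHz  | 39.1 C |        |        |
|  1119  |...3| 9m45s  |    root    | 3  | 1330MHz  | 36.2 C |        |        |
"""
ids = attached_ipu_ids(sample, 1119)
print(len(ids), "IPUs attached:", ids)  # prints: 4 IPUs attached: [0, 1, 2, 3]
```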


## Managing resources

The `gc-monitor` output shows that this process is using four IPUs, two per active pipeline. While it may make sense to keep both pipelines active if we are testing both at the same time, we may need to free up resources to continue experimenting with more models.

To do that we can call the `detachFromDevice` method on the model:

In [35]:
sentiment_pipeline.model.detachFromDevice()

Error: 'poptorch_py_error': Device is not attached

In [38]:
sentiment_pipeline_2.model.detachFromDevice()

In [40]:
!gc-monitor --no-card-info | grep ${os.getpid()}

Detaching frees the IPU resources while keeping the pipeline object available, so we can reattach the same pipeline to the IPUs simply by calling it:

In [42]:
simple_test_data=["I love you.", "I hate you!"]

In [None]:
%%time
sentiment_pipeline(simple_test_data)

# sentiment_pipeline.model.detachFromDevice()

CPU times: user 1.69 s, sys: 1.41 s, total: 3.1 s
Wall time: 4.57 s


[{'label': 'POSITIVE', 'score': 0.9998711347579956},
 {'label': 'NEGATIVE', 'score': 0.9987004995346069}]

In [56]:
del sentiment_pipeline

NameError: name 'sentiment_pipeline' is not defined

In [57]:
!gc-monitor --no-card-info | grep ${os.getpid()}

|  1119  |...3| 15m22s |    root    | 0  | 1330MHz  | 36.2 C | 27.9 C |153.3 W |
|  1119  |...3| 15m22s |    root    | 1  | 1330MHz  | 33.0 C |        |        |


The first call after detaching is slow because the model has to be loaded back onto the accelerator (about 4.6 s of wall time in the `%%time` output above), but subsequent calls will be fast:

In [58]:
%%time
sentiment_pipeline(simple_test_data)

NameError: name 'sentiment_pipeline' is not defined
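
To compare the first call after reattaching with later calls in a single cell, a small timing loop can be used. This is a minimal sketch: `time_calls` is a hypothetical helper, and `_SlowFirstCall` is a made-up stand-in that imitates a detached pipeline so the example runs without IPUs.

```python
import time


def time_calls(fn, data, repeats: int = 3) -> list:
    """Call fn(data) repeats times and return each call's wall time in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(data)
        durations.append(time.perf_counter() - start)
    return durations


class _SlowFirstCall:
    """Hypothetical stand-in: only the first call pays a setup cost."""

    def __init__(self):
        self.ready = False

    def __call__(self, data):
        if not self.ready:
            time.sleep(0.05)  # pretend to load the model onto the device
            self.ready = True
        return [len(item) for item in data]


durations = time_calls(_SlowFirstCall(), ["I love you.", "I hate you!"])
print([f"{d * 1000:.1f} ms" for d in durations])
```

With a real pipeline, `time_calls(sentiment_pipeline, simple_test_data)` would be the equivalent call; the first duration should dominate when the model has just been detached.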

The other way to release resources is to let the object behind the `sentiment_pipeline` Python variable go out of scope.
There are two main ways to do that:

1. If you want to use the resources for another pipeline, you can rebind the name to another object:

In [None]:
sentiment_pipeline = sentiment_pipeline_2

In [None]:
!gc-monitor --no-card-info | grep ${os.getpid()}

2. Explicitly use `del` to delete the variables:

In [59]:
# Note that after the assignment sentiment_pipeline and sentiment_pipeline_2
# refer to the same object so both symbols must be deleted to release the resources
del sentiment_pipeline
del sentiment_pipeline_2

NameError: name 'sentiment_pipeline' is not defined
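
The point made in the comment above, that an object is only released once every name referring to it is gone, can be illustrated without any hardware. In this sketch `_Resource` is a hypothetical stand-in for a pipeline, and `weakref.finalize` records the moment the object is actually destroyed.

```python
import gc
import weakref


class _Resource:
    """Hypothetical stand-in for an object that holds hardware until destroyed."""


released = []

pipeline_a = _Resource()
weakref.finalize(pipeline_a, released.append, "released")

pipeline_b = pipeline_a        # second name, same object
del pipeline_a
gc.collect()
print(released)                # [] : pipeline_b still keeps the object alive
still_held = list(released)

del pipeline_b
gc.collect()
print(released)                # ['released'] : the last reference is gone
```

The same reasoning applies to `sentiment_pipeline` and `sentiment_pipeline_2` after the assignment: deleting only one of the two names leaves the IPUs attached.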

In [60]:
!gc-monitor --no-card-info | grep ${os.getpid()}

|  1119  |...3| 15m34s |    root    | 0  | 1330MHz  | 36.1 C | 27.9 C |153.3 W |
|  1119  |...3| 15m34s |    root    | 1  | 1330MHz  | 32.9 C |        |        |


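Once both names have been deleted, the process should no longer hold any IPUs and the command prints nothing. In the output recorded above, the `del` cell stopped at the `NameError` on its first line, so `del sentiment_pipeline_2` never ran and two IPUs were still attached.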

Alternatively, all IPUs will be released when the notebook kernel is restarted. This can be done from the Notebook graphical user interface by clicking on `Kernel > Restart`:

![Restart ipykernel](images/restart_kernel.png)


## Conclusion

In this simple tutorial we saw how to manage IPU resources from a notebook to make sure that we do not try to use more IPUs than are available on a single system.

For more information on using IPUs and the Poplar SDK through Jupyter notebooks, please see our [dedicated guide](https://github.com/graphcore/tutorials/tree/master/tutorials/standard_tools/using_jupyter).