In [3]:
import dryml

# DRYML Tutorial 3 - Compute `Context`s

A second major issue with ML platforms is the difficulty of configuring compute resources. By default, most platforms (like TensorFlow) simply allocate all of the memory on a GPU. DRYML attempts to remedy this with a `context` system, which provides a way to specify a computational resource requirement and to indicate which resources have been claimed. `context` can then check whether the user has authorized the current thread to use the available resources. If not, it can either fail or launch a Python sub-process to contain the compute operations, which allows device memory to be released when the method completes. In addition, `Object` supports a 'compute' mode, in which the user can contain the allocation of any objects that may require device memory.

## DRYML `ResourceRequest`s

DRYML implements the `ResourceRequest` to let the user tell the library what types of compute resources are required for a given code section or method. Resources are specified with a dictionary whose keys name the framework you need: `'tf'` for TensorFlow, `'torch'` for PyTorch, and `'default'` for the default. For example, if we want one GPU available for TensorFlow use, the resource request would look like this:

In [4]:
ctx_reqs = {
    'default': {'num_gpus': 0},
    'tf': {'num_gpus': 1},
    'torch': {'num_gpus': 0},
}

A resource request is a dictionary with a few keys that signal a request for specific resources. Right now we can ask for a specific number of CPUs or GPUs with `num_cpus` and `num_gpus`. We can also ask for specific CPUs and GPUs with `cpu/<i>` and `gpu/<i>`, each given a float value between `0.` and `1.`. If you request a fraction of a GPU, DRYML will configure the corresponding framework accordingly when possible.

Thus, the `ctx_reqs` request above asks for TensorFlow with one GPU and PyTorch with no GPUs.
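
As a minimal sketch of the fractional form, the request below asks TensorFlow for half of GPU 0 using the `gpu/<i>` key pattern. The `check_request` helper is hypothetical and not part of DRYML; it only encodes the shape rules stated in this section (integer counts for `num_cpus`/`num_gpus`, fractions between `0.` and `1.` for specific devices), and whether DRYML accepts this exact combination of keys is an assumption.

```python
import re

# Request half of GPU 0 for TensorFlow, following the `gpu/<i>` key pattern.
fractional_reqs = {
    'tf': {'gpu/0': 0.5},
}

_DEVICE_KEY = re.compile(r'^(cpu|gpu)/\d+$')


def check_request(reqs):
    """Hypothetical helper (not a DRYML function): validate a request's shape."""
    for framework, resources in reqs.items():
        for key, value in resources.items():
            if key in ('num_cpus', 'num_gpus'):
                # Counts must be non-negative integers (bool is excluded).
                if isinstance(value, bool) or not isinstance(value, int) or value < 0:
                    raise ValueError(f"{framework}: {key} must be a non-negative integer")
            elif _DEVICE_KEY.match(key):
                # Specific devices take a fraction of the device.
                if not 0.0 <= float(value) <= 1.0:
                    raise ValueError(f"{framework}: {key} must be between 0. and 1.")
            else:
                raise ValueError(f"{framework}: unrecognized key {key!r}")
    return True


check_request(fractional_reqs)
```

Validating a request before handing it to a library is optional; the point here is only to make the two key styles concrete.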

With the `ctx_reqs` dictionary, DRYML will create a `ContextManager` which will attempt to create appropriate contexts with the correct resources. If successful, the user will have access to the necessary GPUs, and the corresponding libraries will be configured for the requested devices (if possible).

> Be aware that most frameworks currently have no way of enforcing limits on GPU memory consumption. This means the user is trusted to adhere to the memory requirements, which DRYML makes available at all times through the `dryml.get_context()` method that returns the current `ContextManager`.

If the user wants their objects to avoid allocating memory on a device, they can simply not set a context; DRYML will then throw an exception whenever a context is required.

If there is code you suspect may require device memory, DRYML provides the `context_check` method to trigger a check for whether the current context satisfies some resource constraints. Let's check if the current context has two GPUs allocated to tensorflow. (This should fail!)

In [5]:
dryml.context.context_check({'tf': {'num_gpus': 2}})

NoContextError: 

## `compute_context` decorator

We can create a context in the current process; however, for most frameworks it is currently not possible to remove a context once it has been created. If we instead avoid creating a context in the current process, we keep the ability to change how resources are distributed depending on the model. For this, DRYML provides the `dryml.compute_context` decorator generator, which gives a wrapped method the power to inspect existing compute contexts or to launch itself in a new process with an appropriate context. This makes it easy to interleave code requiring a context with manager code that may need to run a variety of models with conflicting context requirements.
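
The idea of containing work in a separate process can be illustrated with the standard library alone. The sketch below is not how DRYML launches its processes; `run_isolated` is a hypothetical helper that runs a snippet in a fresh Python interpreter, so that everything the snippet allocates is released when the child exits.

```python
import subprocess
import sys


def run_isolated(snippet):
    """Run a Python snippet in a fresh interpreter and return its stdout.

    Anything the snippet allocates (including device memory claimed by an
    ML framework) is released when the child process exits.
    """
    result = subprocess.run(
        [sys.executable, '-c', snippet],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the child fails
    )
    return result.stdout.strip()


# The child builds a large list; the parent only ever sees the printed summary.
print(run_isolated("data = list(range(1_000_000)); print(len(data))"))
```

The parent process never holds the child's allocations, which is the property that makes per-model resource configurations possible.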

Now, `dryml.compute_context` is actually a decorator generator, meaning you need to call it to create the decorator you want to use. This lets the user customize how a given method gets wrapped and how compute contexts are spawned. DRYML also provides the `dryml.compute` decorator, which is just a shortcut for `compute_context()` when generic behavior is fine.

`compute_context` has several important arguments which can be specified when the decorator is created (when calling `compute_context`) and overridden when actually calling the function.
* `ctx_context_reqs`: Probably the most important, specifies a specific set of `context_reqs` to use when checking for an existing context or launching a new context. Override at call time with `call_context_reqs`.
* `ctx_use_existing_context` (Default `True`): When `True`, DRYML should try to use an existing context if available. If the existing context doesn't satisfy the given requirements, it will raise a `WrongContextError` exception rather than create a new context. Override at call time with `call_use_existing_context`.
* `ctx_dont_create_context` (Default `False`): When `True`, DRYML won't ever try to create a new context. If no context exists, it'll throw a `NoContextError` exception, and if the existing context doesn't satisfy the given requirements, it will raise a `WrongContextError` exception. Override at call time with `call_dont_create_context`.
* `ctx_update_objs` (Default `False`): When `True`, DRYML will update objects in the current process with the state of the corresponding objects in the remote process upon completion. Override at call time with `call_update_objs`.
* `ctx_verbose` (Default `False`): When `True`, DRYML will print some diagnostic information about the whole compute procedure. Override at call time with `call_verbose`.
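
The `ctx_*`/`call_*` pairing is an ordinary decorator-generator pattern. The toy below is not DRYML's implementation: `toy_compute_context` and `train` are hypothetical names, and only two of the arguments are modeled. It shows defaults fixed when the decorator is created and then overridden for a single call. The wrapper returns the effective requirements alongside the result purely to make the override visible.

```python
import functools


def toy_compute_context(ctx_context_reqs=None, ctx_verbose=False):
    """Toy decorator generator (not DRYML's implementation)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, call_context_reqs=None, call_verbose=None, **kwargs):
            # Call-time values win; otherwise fall back to decoration-time defaults.
            reqs = ctx_context_reqs if call_context_reqs is None else call_context_reqs
            verbose = ctx_verbose if call_verbose is None else call_verbose
            if verbose:
                print(f"Running {func.__name__} with requirements {reqs}")
            # A real implementation would check for or create a context here.
            return func(*args, **kwargs), reqs
        return wrapper
    return decorator


@toy_compute_context(ctx_context_reqs={'tf': {'num_gpus': 1}})
def train(epochs):
    return f"trained for {epochs} epochs"


print(train(3))                                             # decoration-time requirements
print(train(3, call_context_reqs={'tf': {'num_gpus': 0}}))  # call-time override
```

Because the `call_*` names are consumed by the wrapper, they never reach the wrapped function, which keeps its own signature unchanged.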

Let's create a test function which will check whether an appropriate context is in place. We can run the method in the current thread, where it will fail. We will then wrap that function in a `compute_context` wrapper to ensure it gets launched with the right context, and see that it succeeds.

In [9]:
def check_ctx():
    # A simple method which checks the current context
    import dryml
    # Check whether any tensorflow context is available
    dryml.context.context_check({'tf': {}})

In [10]:
check_ctx()

NoContextError: 

In [14]:
# Wrap that method in a compute context with a tensorflow resource request
compute_check_ctx = dryml.context.compute_context(ctx_context_reqs=ctx_reqs)(check_ctx)

In [15]:
compute_check_ctx()

2023-03-20 17:31:59.258225: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 23 MB memory:  -> device: 0, name: NVIDIA GeForce GTX 1080, pci bus id: 0000:02:00.0, compute capability: 6.1


### Setting a 'current' context

If we're sure what kind of context we'll need throughout the program, we can also set the context directly with the `context.set_context` method, which takes a resource request. Once we set the current context, the function which failed earlier will succeed.

In [16]:
dryml.context.set_context(ctx_reqs)

2023-03-20 17:32:04.777831: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 23 MB memory:  -> device: 0, name: NVIDIA GeForce GTX 1080, pci bus id: 0000:02:00.0, compute capability: 6.1


In [17]:
# Now the original function will work
check_ctx()

We can also do a context check with a set of requirements we know exceeds what our current context provides, which will result in an error:

In [18]:
dryml.context.context_check({'tf': {'num_gpus': 2}})

ContextIncompatibilityError: Context doesn't satisfy requirements {'tf': {'num_gpus': 2}}

And we'll just double check that the current context satisfies the requirements we set out earlier:

In [19]:
dryml.context.context_check(ctx_reqs)
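
The three outcomes seen in this tutorial (no context, an insufficient context, and a sufficient context) can be mimicked with a toy model. This is not DRYML code: `toy_context_check` and both exception classes are hypothetical stand-ins, and comparing only `num_gpus` per framework is a simplifying assumption about what "satisfies" means.

```python
class ToyNoContextError(Exception):
    """Stand-in for the 'no context is set' failure."""


class ToyIncompatibleError(Exception):
    """Stand-in for the 'context does not satisfy the request' failure."""


def toy_context_check(current, required):
    """Toy model (not DRYML code): compare GPU counts per framework."""
    if current is None:
        raise ToyNoContextError()
    for framework, resources in required.items():
        have = current.get(framework, {}).get('num_gpus', 0)
        need = resources.get('num_gpus', 0)
        # Fail if the framework has no context at all or has too few GPUs.
        if framework not in current or have < need:
            raise ToyIncompatibleError(
                f"Context doesn't satisfy requirements {required}")


current = {'default': {'num_gpus': 0}, 'tf': {'num_gpus': 1}, 'torch': {'num_gpus': 0}}
toy_context_check(current, {'tf': {}})  # passes: a tf context exists
toy_context_check(current, current)     # passes: same requirements
try:
    toy_context_check(current, {'tf': {'num_gpus': 2}})
except ToyIncompatibleError as err:
    print(err)
```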

## Wrap-up

Like other components of DRYML, `Dataset`s and `dryml.context` can be used outside of DRYML. `dryml.context` is very useful for automatically configuring an ML framework's device settings, and `Dataset` is great for inspecting and bridging existing datasets between different frameworks.