# Notebook 1: Setting up the Environment

## Sharing is caring

Assuming you will be using GravyFlow on a shared GPU compute cluster, it is important to select the appropriate GPU(s) before executing any programs. This introductory notebook will guide you in configuring GravyFlow to automatically select a GPU with available memory. It will also ensure your environment is properly set up for running GravyFlow.

## Notebook imports

First, we will import two built-in modules (`os`, `typing`) along with a key dependency, TensorFlow. These imports are used in various sections of this notebook.

In [1]:
# Importing the os module, which provides functions for interacting with the operating system.
# This can be used for file and directory operations, environment variable management, etc.
import os

# Importing List from the typing module. Type hints show the reader of the code
# the expected type of a variable, in this case, a list.
from typing import List

# Importing TensorFlow, a powerful library for machine learning and neural networks.
# Renamed as 'tf' for ease of use in the code.
import tensorflow as tf

2024-02-29 13:45:20.297826: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-02-29 13:45:20.297891: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-02-29 13:45:20.299445: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-02-29 13:45:20.307229: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


## Import GravyFlow

Next, we will import the GravyFlow module.

In [2]:
# Import the GravyFlow module and rename it as 'gf' for ease of use in the code.
import gravyflow as gf

## GPU Setup

For ease of use, GravyFlow includes a function, `gf.env`, designed to automatically configure the GPU environment.

The `gf.env` function has the following optional arguments:

- `min_gpu_memory_mb` : int = 4000
	> Sets the minimum GPU memory (in megabytes) required for a GPU to be considered available for GravyFlow.
- `max_gpu_utilization_percentage` : float = 80
	> Specifies the maximum GPU utilization (as a percentage) before GravyFlow disallows its use. The default is 80%.
- `num_gpus_to_request` : int = 1
	> Determines the number of GPUs GravyFlow will try to find and use. The default is 1.
- `memory_to_allocate_tf` : int = 2000
	> Sets the amount of GPU memory (in megabytes) that GravyFlow allocates per GPU. This value is fixed to ensure that CUDA functions run efficiently. The default is 2000 MB.
- `gpus` : Union[str, int, List[Union[int, str]], None] = None
	> Allows manual allocation on specified GPUs. This is not recommended unless necessary. If set to None, GravyFlow automatically selects free GPUs.

The function returns the following object:

- `strategy` : tf.distribute.Strategy
	> Enables multi-GPU usage in subsequent TensorFlow operations.
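
The `gpus` argument listed above accepts a string, an integer, a mixed list, or `None`. As a rough illustration of what those forms could mean, the sketch below reduces each of them to a `CUDA_VISIBLE_DEVICES`-style string. The helper `normalize_gpu_spec` is hypothetical and written only for this notebook; it is not part of GravyFlow, and the way `gf.env` actually interprets `gpus` may differ.

```python
from typing import List, Optional, Union

# Same type as the `gpus` argument described above.
GpuSpec = Union[str, int, List[Union[int, str]], None]


def normalize_gpu_spec(gpus: GpuSpec) -> Optional[str]:
    """Hypothetical helper: convert a `gpus`-style value into a
    comma-separated device string. Returns None when no manual
    selection was made (the automatic-selection case)."""
    if gpus is None:
        return None

    # Treat a single int or str as a one-element list.
    if isinstance(gpus, (int, str)):
        gpus = [gpus]

    ids: List[str] = []
    for entry in gpus:
        # Allow comma-separated strings such as "0, 1".
        for part in str(entry).split(","):
            part = part.strip()
            if part:
                ids.append(part)

    return ",".join(ids)


print(normalize_gpu_spec(2))         # 2
print(normalize_gpu_spec("0, 1"))    # 0,1
print(normalize_gpu_spec([3, "4"]))  # 3,4
print(normalize_gpu_spec(None))      # None
```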

### Function Operations

When called, `gf.env` performs several operations:

1. Identifies available GPUs with free memory exceeding `min_gpu_memory_mb`.
2. Allocates a specified number of GPUs (`num_gpus_to_request`) for GravyFlow and cuPhenom, setting the `CUDA_VISIBLE_DEVICES` environment variable accordingly.
3. Checks for compatibility between the CUDA and TensorFlow versions.
4. Allocates `memory_to_allocate_tf` MB of GPU memory to TensorFlow per requested GPU.
5. Sets up and returns a `tf.distribute.Strategy` object for multi-GPU operations in TensorFlow.
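
Steps 1 and 2 above can be sketched with the standard library alone. The example below is an assumption about how such a selection might work, not GravyFlow's actual implementation: it filters the CSV output of `nvidia-smi` by free memory and utilization, then builds the string that would be written to `CUDA_VISIBLE_DEVICES`. The function names `select_gpus` and `query_gpus` are hypothetical, and the sample text stands in for real `nvidia-smi` output so the example runs on a machine without a GPU.

```python
import subprocess
from typing import List

# nvidia-smi query that prints one "index, free memory (MB), utilization (%)" line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.free,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def query_gpus() -> str:
    """Run nvidia-smi and return its CSV output (requires an NVIDIA driver)."""
    return subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout


def select_gpus(
    csv_text: str,
    min_gpu_memory_mb: int = 4000,
    max_gpu_utilization_percentage: float = 80,
    num_gpus_to_request: int = 1,
) -> List[int]:
    """Hypothetical selection rule: keep GPUs whose free memory exceeds the
    minimum and whose utilization is within the limit, preferring the GPUs
    with the most free memory."""
    candidates = []
    for line in csv_text.strip().splitlines():
        index, free_mb, utilization = (field.strip() for field in line.split(","))
        if (
            int(free_mb) > min_gpu_memory_mb
            and float(utilization) <= max_gpu_utilization_percentage
        ):
            candidates.append((int(free_mb), int(index)))

    candidates.sort(reverse=True)
    return [index for _, index in candidates[:num_gpus_to_request]]


# Made-up sample standing in for query_gpus() output.
sample = """0, 1200, 97
1, 15800, 0
2, 16100, 2
3, 9000, 85"""

chosen = select_gpus(sample, num_gpus_to_request=2)
visible = ",".join(str(index) for index in chosen)
print(chosen)   # [2, 1]
print(visible)  # 2,1
```

In a real session the resulting string would need to be assigned to `os.environ["CUDA_VISIBLE_DEVICES"]` before TensorFlow touches the GPUs, since the variable is read when the CUDA runtime initializes.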

### Example Usage

Below is an example of how to use `gf.env`:

In [3]:
# Set up the environment using gf.env() and return a tf.distribute.Strategy object.
env : tf.distribute.Strategy = gf.env()

with env:
    # All code within this block will be executed under the TensorFlow strategy scope.

    # Printing the CUDA_VISIBLE_DEVICES environment variable after gf.env() setup.
    # This shows which GPUs are allocated for TensorFlow operations.
    cuda_visible_devices : str = os.environ.get('CUDA_VISIBLE_DEVICES', 'Not set.')
    print(f'CUDA_VISIBLE_DEVICES after environment setup: {cuda_visible_devices}')

    # Printing the list of GPUs visible to TensorFlow after gf.env() has been executed.
    # This confirms the successful allocation and visibility of GPUs to TensorFlow.
    gpus : List = tf.config.list_physical_devices('GPU')
    print(f"GPUs visible to TensorFlow after environment setup: {gpus}")

INFO:root:TensorFlow version: 2.15.0, CUDA version: 12.2
2024-02-29 13:45:42.438165: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:47] Overriding orig_value setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2024-02-29 13:45:42.438595: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1929] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 2000 MB memory:  -> device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:0a:00.0, compute capability: 7.0
INFO:root:[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]


CUDA_VISIBLE_DEVICES after environment setup: 2
GPUs visible to TensorFlow after environment setup: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
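
Note that the output above reports `CUDA_VISIBLE_DEVICES` as `2` while TensorFlow lists `GPU:0`. This is expected: CUDA renumbers the visible devices from zero, in the order they appear in the variable. The small helper below, `logical_to_physical`, is a hypothetical illustration of that mapping and is not part of GravyFlow or TensorFlow.

```python
from typing import Dict


def logical_to_physical(cuda_visible_devices: str) -> Dict[str, str]:
    """Map the logical device names seen by TensorFlow to the entries
    listed in CUDA_VISIBLE_DEVICES (illustrative helper only)."""
    ids = [part.strip() for part in cuda_visible_devices.split(",") if part.strip()]
    return {f"GPU:{logical}": physical for logical, physical in enumerate(ids)}


print(logical_to_physical("2"))    # {'GPU:0': '2'}
print(logical_to_physical("3,4"))  # {'GPU:0': '3', 'GPU:1': '4'}
```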


## Important Notes on Using `gf.env`

- Running any TensorFlow functionality before initializing the environment will cause TensorFlow to initialize on all visible GPUs by default.
- You cannot set up the environment more than once per Python kernel session. To change the GPU configuration, restart the kernel.

In [4]:
# Attempt to run gf.env again, this time requesting 2 GPUs, without restarting the kernel.
env : tf.distribute.Strategy = gf.env(num_gpus_to_request=2)

# gf.env will prevent the creation of a new scope, as TensorFlow does not allow this,
# and will return the same environment that was set up initially.
with env:
    # Fetching and printing the CUDA_VISIBLE_DEVICES environment variable.
    # This check is to see if the environment variable has changed.
    cuda_visible_devices : str = os.environ.get('CUDA_VISIBLE_DEVICES', 'Not set.')
    print(f'CUDA_VISIBLE_DEVICES after environment setup: {cuda_visible_devices}. This has changed correctly.')

    # Fetching and printing the list of GPUs visible to TensorFlow.
    # This is to verify if the visible GPUs to TensorFlow have changed.
    gpus : List = tf.config.list_physical_devices('GPU')
    print(f"GPUs visible to TensorFlow after environment setup: {gpus}. This has not changed.")

INFO:root:TensorFlow version: 2.15.0, CUDA version: 12.2
INFO:root:[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]


CUDA_VISIBLE_DEVICES after environment setup: 3,4. This has changed correctly.
GPUs visible to TensorFlow after environment setup: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]. This has not changed.
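
The behavior seen above, where a second call hands back the environment created by the first, resembles a simple "set up once" guard. The sketch below shows that pattern in plain Python. It is an assumption made for illustration, not GravyFlow's code: the names `setup_once` and `_ENVIRONMENT` are hypothetical, and the dictionaries stand in for the real environment object.

```python
from typing import Any, Callable, Optional

# Module-level cache: holds the environment for the lifetime of the kernel session.
_ENVIRONMENT: Optional[Any] = None


def setup_once(factory: Callable[[], Any]) -> Any:
    """Create the environment on the first call only; later calls
    ignore their factory and return the cached object."""
    global _ENVIRONMENT
    if _ENVIRONMENT is None:
        _ENVIRONMENT = factory()
    return _ENVIRONMENT


first = setup_once(lambda: {"num_gpus": 1})
second = setup_once(lambda: {"num_gpus": 2})  # request is ignored

print(second is first)  # True
print(second)           # {'num_gpus': 1}
```

Restarting the kernel clears the module-level cache, which is why a restart is the way to obtain a different GPU configuration.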


Now that we have seen how to automatically set up the environment, let's move on to acquiring our first interferometer noise background dataset.