# Setting up the Environment

GravyFlow is a GPU-based module. It relies heavily on vectorised GPU functions to speed up calculations, and many parts of it (primarily cuPhenom) do not currently have a CPU alternative. Most of its functions are built from TensorFlow functions, which can run in CPU-only mode, albeit less efficiently; doing so will lose much of the advantage this module provides over the alternatives. You're welcome to try running GravyFlow in a CPU-only environment, but I would not count on it working.

Much of the environment setup can be handled automatically by running the `setup.sh` bash script in the base directory. Please note that GravyFlow is a very young module, early in its development, and there are likely to be teething problems, especially with the automated helper scripts. If the setup script does not work, try to ensure that you are running GravyFlow in an environment in which TensorFlow can see available GPUs.
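
A quick way to check that last point is to ask TensorFlow which GPUs it can see. The sketch below is only a diagnostic aid and is not part of GravyFlow; the helper name `count_visible_gpus` is made up for this example.

```python
def count_visible_gpus():
    """Return the number of GPUs TensorFlow can see, or None if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment at all.
        return None

    # Listing physical devices only queries the hardware TensorFlow has detected.
    return len(tf.config.list_physical_devices("GPU"))


num_gpus = count_visible_gpus()
if num_gpus is None:
    print("TensorFlow is not installed in this environment.")
elif num_gpus == 0:
    print("TensorFlow is installed but cannot see any GPUs.")
else:
    print(f"TensorFlow can see {num_gpus} GPU(s).")
```

If this reports zero GPUs on a machine that has them, the usual suspects are a CPU-only TensorFlow build or a mismatch between the installed CUDA libraries and the TensorFlow version.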

## Sharing is caring

Presumably, you will be running this on a GPU compute cluster which you share with other people. This means that you have to select which GPU(s) you are going to run on before you execute any programs. This first notebook will teach you how to ensure that GravyFlow automatically selects a GPU with available memory, and that your environment is set up to run GravyFlow.

## Notebook imports

First we will import two built-in packages (`os` and `sys`) and one dependency, TensorFlow. These are used by some of the cells in this notebook.

In [None]:
import os
import sys

import tensorflow as tf

## Import GravyFlow

Next we will import the GravyFlow module. Currently GravyFlow is not set up to be installable, so to ensure that this notebook can import it, a relative path to the grandparent directory of this notebook is used. If you have moved this notebook, this path will need to be adjusted.

In [1]:
# Get the absolute path of the directory two levels above this notebook
parent_dir = os.path.abspath('../../')
print(parent_dir)

# Add that directory to sys.path so that gravyflow can be imported
if parent_dir not in sys.path:
    sys.path.append(parent_dir)

# To import gravyflow simply use:
import gravyflow as gf

2023-11-02 13:59:39.267249: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


/home/michael.norman/data_ad_infinitum



SWIGLAL standard output/error redirection is enabled in IPython.
This may lead to performance penalties. To disable locally, use:

with lal.no_swig_redirect_standard_output_error():
    ...

To disable globally, use:

lal.swig_redirect_standard_output_error(True)

Note however that this will likely lead to error messages from
LAL functions being either misdirected or lost when called from
Jupyter notebooks.




# GPU Setup

For ease of use, GravyFlow comes with a function to automatically sort out the GPU environment: `gf.env`.

It has the following optional arguments:

- `min_gpu_memory_mb`: int = 4000
  - This sets the minimum free GPU memory, in megabytes, that a GPU must have to be considered free and allocated for use by GravyFlow. By default, this is 4000 MB.
- `num_gpus_to_request`: int = 1
  - This determines the number of GPUs that GravyFlow will attempt to find and subsequently run on. The default is 1.
- `memory_to_allocate_tf`: int = 2000
  - This determines the amount of GPU memory, in megabytes, that GravyFlow will allocate per GPU. This value is fixed, rather than allowing growth, to ensure that CUDA functions run by GravyFlow have memory to run. If you are using cuPhenom for CBC generation, ensure that there is room left on the GPU for it to run when you set this value. It defaults to 2000 MB.
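
To illustrate what a fixed allocation (as opposed to memory growth) looks like in plain TensorFlow, here is a minimal sketch. It is not GravyFlow's own implementation, which may differ; the helper name `reserve_fixed_gpu_memory` is hypothetical, and only the `tf.config` calls are real TensorFlow API.

```python
def reserve_fixed_gpu_memory(memory_to_allocate_tf=2000):
    """Cap TensorFlow at a fixed number of megabytes on every visible GPU.

    Returns the number of GPUs configured, or None if TensorFlow is not installed.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None

    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        # One logical device per GPU with a hard memory limit (in MB).
        # This must be called before the GPU is first used, otherwise
        # TensorFlow raises a RuntimeError.
        tf.config.set_logical_device_configuration(
            gpu,
            [tf.config.LogicalDeviceConfiguration(memory_limit=memory_to_allocate_tf)],
        )
    return len(gpus)


configured = reserve_fixed_gpu_memory(memory_to_allocate_tf=2000)
print(f"GPUs configured with a fixed memory limit: {configured}")
```

With the defaults above, a GPU that passed the 4000 MB free-memory check keeps roughly 2000 MB outside TensorFlow's reach, which is the room that other CUDA code such as cuPhenom would need.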

## Return Object

The function returns the following object:

- `strategy`: tf.distribute.Strategy
  - This object will allow for multi-GPU usage in subsequent TensorFlow operations.

## Function Operations

Calling this function will do a number of things:

1. Determine which available GPUs have more free memory than the requested `min_gpu_memory_mb`.
2. Allocate a number of free GPUs equal to `num_gpus_to_request` to GravyFlow (TensorFlow) and cuPhenom (CUDA) by setting the `CUDA_VISIBLE_DEVICES` environment variable.
3. Attempt to check that your CUDA version is compatible with your TensorFlow version.
4. Allocate `memory_to_allocate_tf` MB of GPU memory to GravyFlow (TensorFlow) per requested GPU.
5. Set up and return a `tf.distribute.Strategy` object, which will allow for multi-GPU usage in subsequent TensorFlow operations.
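
Steps 1 and 2 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the code `gf.env` actually runs: the function names are made up, free memory is read from `nvidia-smi` (which reports MiB), and preferring the GPUs with the most free memory is a choice made for this example.

```python
import subprocess


def parse_free_memory(csv_text):
    """Parse 'index, free_memory' lines into a {gpu_index: free_memory} dict."""
    free_memory = {}
    for line in csv_text.strip().splitlines():
        index, memory = line.split(",")
        free_memory[int(index)] = int(memory)
    return free_memory


def query_free_memory():
    """Ask nvidia-smi for the free memory on every GPU (requires an NVIDIA driver)."""
    output = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=index,memory.free",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    return parse_free_memory(output)


def select_gpus(free_memory, min_gpu_memory_mb=4000, num_gpus_to_request=1):
    """Pick the requested number of GPUs with more free memory than the threshold."""
    candidates = [i for i, mem in free_memory.items() if mem > min_gpu_memory_mb]
    # Assumption for this sketch: prefer the GPUs with the most free memory.
    candidates.sort(key=lambda i: free_memory[i], reverse=True)
    if len(candidates) < num_gpus_to_request:
        raise RuntimeError(
            f"Requested {num_gpus_to_request} GPU(s), "
            f"but only {len(candidates)} have enough free memory."
        )
    return candidates[:num_gpus_to_request]


# Sample nvidia-smi output, so the example runs on a machine without a GPU.
sample_output = "0, 1200\n1, 11019\n2, 8000\n"
chosen = select_gpus(parse_free_memory(sample_output), num_gpus_to_request=2)
visible = ",".join(str(index) for index in chosen)
print(visible)  # prints "1,2"

# To apply the selection, this value would be written to the environment
# before TensorFlow first touches the GPU:
# os.environ["CUDA_VISIBLE_DEVICES"] = visible
```

On a real cluster, `query_free_memory()` would replace the sample string. The selection has to happen before any TensorFlow operation runs, because CUDA reads `CUDA_VISIBLE_DEVICES` only when it initialises.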

## Example Usage

Below is an example of its operation:

In [1]:
# Set up the environment and return a tf.distribute strategy object.
env = gf.env()

with env:

    # All code placed here will be in the scope of the TensorFlow strategy.

    # Print CUDA_VISIBLE_DEVICES after gf.env() has run:
    cuda_visible_devices = os.environ.get('CUDA_VISIBLE_DEVICES', 'Not set')
    print(f'CUDA_VISIBLE_DEVICES after environment setup: {cuda_visible_devices}')

    # Print devices visible to TensorFlow after gf.env() has run:
    gpus = tf.config.list_physical_devices('GPU')
    print(f"GPUs visible to TensorFlow after environment setup: {gpus}")


## Important Notes on using `gf.env`

- Running any TensorFlow functionality before initialising the environment will cause it to run on all GPUs by default.
- You cannot set up the environment more than once per Python kernel.

Now that we have seen how to set up the environment automatically, let us move on to acquiring our first interferometer noise background dataset.