# Using FABRIC GPUs

Your compute nodes can include GPUs. These devices are made available as FABRIC components and can be added to your nodes like any other component.

This example notebook will demonstrate how to reserve and use GPU devices on FABRIC.

## Configure the Environment

In [None]:
import os
from fabrictestbed.slice_manager import SliceManager, Status
import json

In [None]:
ssh_key_file_priv="/home/fabric/.ssh/id_rsa"
ssh_key_file_pub="/home/fabric/.ssh/id_rsa.pub"

# Read the public key that will be installed on the nodes
with open(ssh_key_file_pub, "r") as myfile:
    ssh_key_pub = myfile.read().strip()

In [None]:
credmgr_host = os.environ['FABRIC_CREDMGR_HOST']
orchestrator_host = os.environ['FABRIC_ORCHESTRATOR_HOST']
print(f"CM Host: {credmgr_host} Orchestrator Host: {orchestrator_host}")
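
If either variable is unset, `os.environ[...]` fails with a bare `KeyError`. The sketch below shows one way to produce a clearer message. The helper name `require_env` is hypothetical and is not part of the FABRIC libraries.

```python
import os

def require_env(name, environ=os.environ):
    """Return the value of an environment variable, or raise a descriptive error."""
    value = environ.get(name)
    if not value:
        # Covers both a missing variable and one set to an empty string
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

With this helper, the cell above could read `credmgr_host = require_env('FABRIC_CREDMGR_HOST')`.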

## Create Slice Manager Object

In [None]:
slice_manager = SliceManager(oc_host=orchestrator_host, cm_host=credmgr_host, project_name='all', scope='all')

# Initialize the slice manager
slice_manager.initialize()

## Create a Node

The cells below create a slice that contains two nodes. Each node includes one GPU component: an RTX6000 on the first node and a Tesla T4 on the second.

### Set the Slice Name and FABRIC Site

In [None]:
slice_name="GPUTest"
site_name="LBNL"
node1_name='rtx6000-node'
node2_name='tesla-node'

In [None]:
from fabrictestbed.slice_editor import ExperimentTopology, Capacities, ComponentType
# Create topology
t = ExperimentTopology()

# Add node
n1 = t.add_node(name=node1_name, site=site_name)

# Set capacities
cap = Capacities()
cap.set_fields(core=2, ram=8, disk=50)

# Set Properties
n1.set_properties(capacities=cap, image_type='qcow2', image_ref='default_ubuntu_20')

# Add the GPU component
n1.add_component(ctype=ComponentType.GPU, model='RTX6000', name='n1_gpu1')

# Add node
n2 = t.add_node(name=node2_name, site=site_name)

# Set capacities
cap = Capacities()
cap.set_fields(core=2, ram=8, disk=50)

# Set Properties
n2.set_properties(capacities=cap, image_type='qcow2', image_ref='default_ubuntu_20')

# Add the GPU component
n2.add_component(ctype=ComponentType.GPU, model='Tesla T4', name='n2_gpu1')

# Generate Slice Graph
slice_graph = t.serialize()

# Request slice from Orchestrator
status, reservations = slice_manager.create(slice_name=slice_name, slice_graph=slice_graph, ssh_key=ssh_key_pub)

print("Response Status {}".format(status))
print("Reservations created {}".format(reservations))

In [None]:
# Set the Slice ID from output of the above command
slice_id=reservations[0].slice_id

## Get the Nodes

Retrieve the node information, check the state of each node, and save the management IP addresses.

Re-run this cell until the status of both nodes is Active.

In [None]:
status, slivers = slice_manager.slivers(slice_id=slice_id)

# Check the return value of the slivers call and stop if it failed
if status != Status.OK:
    raise RuntimeError("Failed to get slivers")

# Get the node named node1_name
node1 = list(filter(lambda sliver: sliver.name == node1_name, slivers))[0]

node1_status=node1.reservation_state
node1_management_ip=node1.management_ip

print("node1_status: {}".format(node1_status))
print("node1_management_ip: {}".format(node1_management_ip))


# Get the node named node2_name
node2 = list(filter(lambda sliver: sliver.name == node2_name, slivers))[0]

node2_status=node2.reservation_state
node2_management_ip=node2.management_ip

print("node2_status: {}".format(node2_status))
print("node2_management_ip: {}".format(node2_management_ip))
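
Instead of re-running the cell by hand, the check can be wrapped in a polling loop. The following is a minimal sketch, not part of the FABRIC API: the helper `wait_for_states` is hypothetical, and it assumes the reservation state converts to the string `Active` once a node is ready.

```python
import time

def wait_for_states(get_states, target="Active", timeout=600, interval=10,
                    sleep=time.sleep):
    """Poll get_states() until every reported state equals target.

    get_states is a callable returning a dict that maps node name to state.
    Returns the final dict, or raises TimeoutError after roughly `timeout` seconds.
    """
    waited = 0
    while True:
        states = get_states()
        # An empty dict (for example, a failed query) never counts as ready
        if states and all(str(state) == target for state in states.values()):
            return states
        if waited >= timeout:
            raise TimeoutError(f"Nodes not {target} after {timeout}s: {states}")
        sleep(interval)
        waited += interval
```

In this notebook, `get_states` could call `slice_manager.slivers(slice_id=slice_id)` and return `{s.name: s.reservation_state for s in slivers}` when the status is `Status.OK`, or an empty dict otherwise.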


## Set Up SSH Connections for Commands

Set up <code>paramiko</code> to send commands to the nodes using <code>ssh</code>.

In [None]:
import paramiko

# Set up the connection to node1
key = paramiko.RSAKey.from_private_key_file(ssh_key_file_priv)
client1 = paramiko.SSHClient()
client1.load_system_host_keys()
# Accept the host key of the newly created node automatically
client1.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client1.connect(node1_management_ip, username='ubuntu', pkey=key)


# Set up the connection to node2, reusing the private key loaded above
client2 = paramiko.SSHClient()
client2.load_system_host_keys()
# Accept the host key of the newly created node automatically
client2.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client2.connect(node2_management_ip, username='ubuntu', pkey=key)
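
Commands sent over these connections can fail without any visible error, so it helps to collect the exit status along with the output. The sketch below is one possible wrapper. The name `run_command` is hypothetical; it relies only on `exec_command` and on `recv_exit_status` of the channel attached to the returned stdout, and it is intended for commands with modest output.

```python
def run_command(client, command):
    """Run a command on an SSH client and return (exit_status, stdout, stderr)."""
    stdin, stdout, stderr = client.exec_command(command)
    # Read the output first, then ask for the exit status of the finished command
    out = stdout.read().decode('utf-8')
    err = stderr.read().decode('utf-8')
    exit_status = stdout.channel.recv_exit_status()
    return exit_status, out, err
```

For example, `run_command(client1, 'uname -a')` returns the exit status together with the decoded standard output and standard error.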

## Configure the GPUs


### GPU PCI Device

Run <code>lspci</code> to list the PCI devices on each node, including the GPU. The GPU appears as a raw PCI device that is not yet configured for use. After you configure it, you can use it as you would any other GPU.

An example of using the GPUs is coming soon.

View node1's GPU

In [None]:
stdin, stdout, stderr = client1.exec_command('lspci')

print(stdout.read().decode('utf-8'))

View node2's GPU

In [None]:
stdin, stdout, stderr = client2.exec_command('lspci')

print(stdout.read().decode('utf-8'))
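
The full <code>lspci</code> listing includes every PCI device on the node, so the GPU entry can be hard to spot. The sketch below filters the output down to the GPU lines. It assumes the GPUs are reported with the vendor string "NVIDIA", which is typical for RTX6000 and Tesla T4 devices but is not guaranteed by this notebook; `find_gpu_lines` is a hypothetical helper name.

```python
def find_gpu_lines(lspci_output, vendor="NVIDIA"):
    """Return the lspci lines that mention the given vendor (case-insensitive)."""
    return [line.strip()
            for line in lspci_output.splitlines()
            if vendor.lower() in line.lower()]
```

Pass the decoded output of the <code>lspci</code> command to `find_gpu_lines` and print the returned lines. An empty list suggests that the GPU is not attached or that it is reported under a different vendor string.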

## Clean Up Your Experiment

In [None]:
status, result = slice_manager.delete(slice_id=slice_id)

print("Response Status {}".format(status))
print("Response received {}".format(result))