# 02 : GPU Check

This notebook runs a simple MLRun job that checks whether a GPU is visible and reporting correctly from inside a cluster pod.

- https://stackoverflow.com/questions/76581229/is-it-possible-to-check-if-gpu-is-available-without-using-deep-learning-packages
- https://docs.mlrun.org/en/v1.7.2/runtimes/configuring-job-resources.html
- https://docs.k3s.io/advanced#nvidia-container-runtime

In [6]:
import mlrun

In [7]:
# Show the API server URL
mlrun.get_run_db()

HTTPRunDB('http://dragon:30070')

In [8]:
# Set the base project name
project_name = "mlrun-demo"

# Initialize the MLRun project object
project = mlrun.get_or_create_project(
    name=project_name, 
    context="./",
    user_project=True)

# Display the current project name
project_name = project.metadata.name
print(f'Full project name: {project_name}')

> 2025-07-15 14:38:23,915 [info] Project loaded successfully: {"project_name":"mlrun-demo-johannes"}
Full project name: mlrun-demo-johannes


## Get GPU Function

In [9]:
%%writefile 02_get_gpu_info.py

import GPUtil

def get_gpu_info(context):    
    gpus = GPUtil.getGPUs()
    gpu_info = []
    for gpu in gpus:
        gpu_info.append({
            'id': gpu.id,
            'name': gpu.name,
            'load': gpu.load,
            'memory_total': gpu.memoryTotal,
            'memory_free': gpu.memoryFree,
            'memory_used': gpu.memoryUsed,
        })

    context.logger.info(f"GPU Info: {gpu_info}")
    return gpu_info

Overwriting 02_get_gpu_info.py
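The first link above asks whether a GPU can be detected without a deep-learning package. As a rough sketch of that idea, the following queries `nvidia-smi` directly and returns dictionaries similar in shape to the ones built in `get_gpu_info`. The function names `parse_nvidia_smi_csv` and `query_gpus_via_nvidia_smi` are hypothetical helpers, not part of MLRun or GPUtil, and the sketch assumes `nvidia-smi` is on the `PATH` of the container when a GPU is present.

```python
import shutil
import subprocess


def parse_nvidia_smi_csv(output):
    """Parse 'index, name, memory.total, memory.used' CSV rows into dicts."""
    gpus = []
    for line in output.strip().splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 4:
            continue  # skip rows that do not match the expected layout
        index, name, mem_total, mem_used = fields
        try:
            gpus.append({
                "id": int(index),
                "name": name,
                "memory_total": float(mem_total),  # MiB, because of 'nounits'
                "memory_used": float(mem_used),
            })
        except ValueError:
            continue  # e.g. '[N/A]' values on unsupported devices
    return gpus


def query_gpus_via_nvidia_smi():
    """Return a list of GPU dicts, or an empty list if no GPU is reachable."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi",
             "--query-gpu=index,name,memory.total,memory.used",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=False, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []
    return parse_nvidia_smi_csv(result.stdout)
```

An empty list here means either that the binary is missing or that the driver is not reachable from the container, which is the same condition this notebook is trying to detect.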


## MLRun Function

In [10]:
fn_gpu_check = project.set_function(
    func="02_get_gpu_info.py",
    name="gpu-check",
    tag="latest",
    kind="job",
    image="mlrun/mlrun-gpu",
    handler="get_gpu_info",
    requirements=["GPUtil==1.4.0"])

# Then set the GPU resources on the function's spec
fn_gpu_check.with_requests(mem="1G", cpu=1)  # lower bound
fn_gpu_check.with_limits(mem="2G", cpu=2, gpus=1)  # upper bound
# fn_gpu_check.spec.resources = {
#     "limits": {"nvidia.com/gpu": 1},
#     "requests": {"nvidia.com/gpu": 1}
# }
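To make the relationship between the `with_requests` / `with_limits` calls and the commented-out `spec.resources` dictionary more concrete, here is a small sketch of the Kubernetes-style resources dictionary those values correspond to. `build_resources` is a hypothetical helper written for illustration only; it is not MLRun's implementation, and the exact structure MLRun stores internally may differ. Note that in Kubernetes an extended resource such as `nvidia.com/gpu` is specified under `limits`; when only the limit is given, the request defaults to the same value.

```python
def build_resources(mem_request, cpu_request, mem_limit, cpu_limit,
                    gpus=0, gpu_type="nvidia.com/gpu"):
    """Build a Kubernetes-style resources dict (illustrative helper only)."""
    resources = {
        "requests": {"memory": mem_request, "cpu": cpu_request},
        "limits": {"memory": mem_limit, "cpu": cpu_limit},
    }
    if gpus:
        # GPUs are an extended resource and are set under limits
        resources["limits"][gpu_type] = gpus
    return resources


# Same values as used for fn_gpu_check in this notebook
resources = build_resources("1G", 1, "2G", 2, gpus=1)
print(resources)
```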

In [None]:
# Run the function as a job on the cluster (local=False), building the image if needed
fn_gpu_check.run(
    local=False,
    handler="get_gpu_info",
    auto_build=True
)

> 2025-07-15 14:38:24,028 [info] Storing function: {"db":"http://dragon:30070","name":"gpu-check-get-gpu-info","uid":"4594045ec4614852a69c47daaff94f8a"}
> 2025-07-15 14:38:24,123 [info] Job is running in the background, pod: gpu-check-get-gpu-info-fczk2
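Once the job finishes, the list returned by `get_gpu_info` has to be read to decide whether the check passed. A minimal sketch of turning that list into a readable summary follows. `summarize_gpu_info` is a hypothetical helper, and the sketch assumes the field names used in `02_get_gpu_info.py`, with `load` as a fraction between 0 and 1 and memory values in MB as reported by GPUtil.

```python
def summarize_gpu_info(gpu_info):
    """Format the list of GPU dicts produced by get_gpu_info as text."""
    if not gpu_info:
        return "No GPU detected"
    lines = []
    for gpu in gpu_info:
        lines.append(
            f"GPU {gpu['id']}: {gpu['name']}, "
            f"load {gpu['load'] * 100:.0f}%, "
            f"{gpu['memory_used']:.0f}/{gpu['memory_total']:.0f} MB used"
        )
    return "\n".join(lines)
```

An empty list is the interesting failure case: the pod was scheduled and the code ran, but no GPU was visible inside the container.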
