<a href="https://colab.research.google.com/github/octavioeac/deep-learning-gpu-benchmarks/blob/feature%2Fcuda-experiments/lab-01-cuda-colab-hello-world.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [11]:
!apt-get update
!apt-get install -y cuda-toolkit-12-4

Hit:1 https://cli.github.com/packages stable InRelease
Hit:2 https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/ InRelease
Hit:3 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64  InRelease
Hit:4 https://r2u.stat.illinois.edu/ubuntu jammy InRelease
Hit:5 http://archive.ubuntu.com/ubuntu jammy InRelease
Hit:6 http://security.ubuntu.com/ubuntu jammy-security InRelease
Hit:7 http://archive.ubuntu.com/ubuntu jammy-updates InRelease
Hit:8 http://archive.ubuntu.com/ubuntu jammy-backports InRelease
Hit:9 https://ppa.launchpadcontent.net/deadsnakes/ppa/ubuntu jammy InRelease
Hit:10 https://ppa.launchpadcontent.net/graphics-drivers/ppa/ubuntu jammy InRelease
Hit:11 https://ppa.launchpadcontent.net/ubuntugis/ppa/ubuntu jammy InRelease
Reading package lists... Done
W: Skipping acq

# Verify the CUDA installation in Colab

In [2]:
!nvcc --version


nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Jun__6_02:18:23_PDT_2024
Cuda compilation tools, release 12.5, V12.5.82
Build cuda_12.5.r12.5/compiler.34385749_0


In [14]:
%%writefile hello_debug.cu
#include <cstdio>
#include <cuda_runtime.h>

__global__ void hello() {
    printf("Hello from GPU -> block %d, thread %d\n", blockIdx.x, threadIdx.x);
}

int main() {
    hello<<<1, 5>>>();
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaDeviceSynchronize error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("Done.\n");
    return 0;
}


Overwriting hello_debug.cu
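
The program above checks only the result of `cudaDeviceSynchronize()`. A kernel launch can also fail before any thread runs (for example, with an invalid launch configuration), and that error is reported by `cudaGetLastError()`. A minimal sketch of checking both, assuming a helper macro named `CUDA_CHECK` that is not part of the notebook:

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not in the notebook): print the failing call with
// file and line, then exit, whenever a CUDA runtime call returns an error.
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            fprintf(stderr, "%s:%d: %s failed: %s\n", __FILE__, __LINE__, \
                    #call, cudaGetErrorString(err_));                     \
            exit(EXIT_FAILURE);                                           \
        }                                                                 \
    } while (0)

__global__ void hello() {
    printf("Hello from GPU -> block %d, thread %d\n", blockIdx.x, threadIdx.x);
}

int main() {
    hello<<<1, 5>>>();
    CUDA_CHECK(cudaGetLastError());       // errors detected at launch time
    CUDA_CHECK(cudaDeviceSynchronize());  // errors raised while the kernel ran
    printf("Done.\n");
    return 0;
}
```

Wrapping every runtime call in one macro keeps the checks short enough that they are less likely to be skipped as the program grows.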


# Check the GPU and compile the CUDA code

In [13]:
!nvidia-smi

Sun Sep  7 18:31:48 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   42C    P8              9W /   70W |       0MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
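
`nvidia-smi` reports the GPU model but not its compute capability, which the `-arch` and `-code` flags in the compile step depend on. A small sketch that queries it through the CUDA runtime; the helper name `arch_flags` is illustrative and not part of the notebook:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative helper: format nvcc flags for a compute capability,
// e.g. major=7, minor=5 gives "-arch=compute_75 -code=sm_75".
void arch_flags(int major, int minor, char *buf, size_t size) {
    snprintf(buf, size, "-arch=compute_%d%d -code=sm_%d%d",
             major, minor, major, minor);
}

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            fprintf(stderr, "cudaGetDeviceProperties error: %s\n", cudaGetErrorString(err));
            return 1;
        }
        char flags[64];
        arch_flags(prop.major, prop.minor, flags, sizeof(flags));
        printf("Device %d: %s, compute capability %d.%d\n",
               dev, prop.name, prop.major, prop.minor);
        printf("  nvcc flags: %s\n", flags);
    }
    return 0;
}
```

To try it in Colab, the code could be saved with `%%writefile` under any `.cu` name and compiled with `nvcc` without architecture flags, since it contains no kernels.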
                                                

In [16]:
!nvcc -arch=compute_75 -code=sm_75 hello_debug.cu -o hello




In [17]:
!./hello

Hello from GPU -> block 0, thread 0
Hello from GPU -> block 0, thread 1
Hello from GPU -> block 0, thread 2
Hello from GPU -> block 0, thread 3
Hello from GPU -> block 0, thread 4
Done.
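
With a single block, `threadIdx.x` alone identifies each thread. When a launch uses several blocks, a common way to get a unique index is `blockIdx.x * blockDim.x + threadIdx.x`. A minimal sketch, assuming a hypothetical kernel `hello_global` and a launch of 2 blocks with 3 threads each, neither of which appears in the notebook; the order of the printed lines across blocks is not guaranteed:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread derives a launch-wide index from
// its block index, the block size, and its index within the block.
__global__ void hello_global() {
    int global_id = blockIdx.x * blockDim.x + threadIdx.x;
    printf("block %d, thread %d -> global id %d\n",
           blockIdx.x, threadIdx.x, global_id);
}

int main() {
    hello_global<<<2, 3>>>();  // 2 blocks of 3 threads: global ids 0 through 5
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaDeviceSynchronize error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```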


# Checklist: Run CUDA Code in Google Colab

1. Enable GPU in Colab
   - Menu → Runtime → Change runtime type → Hardware accelerator: **GPU**.
   - Confirm with:
     ```bash
     !nvidia-smi
     ```

2. Install CUDA Toolkit (nvcc compiler)
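   - Colab provides the NVIDIA driver; install the toolkit to get `nvcc`, then confirm the version. The install cell at the top of this notebook uses NVIDIA's `cuda-toolkit-12-4` package; Ubuntu's `nvidia-cuda-toolkit` is an alternative: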
     ```bash
     !apt-get update
     !apt-get install -y nvidia-cuda-toolkit
     !nvcc --version
     ```

3. Write your CUDA program
   - Use `%%writefile` to create a `.cu` file:
     ```cpp
     %%writefile hello.cu
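     #include <stdio.h>

     // Kernel: each GPU thread prints its own index within the block.
     __global__ void hello() {
         printf("Hello from GPU thread %d\n", threadIdx.x);
     }

     int main() {
         hello<<<1, 5>>>();        // launch 1 block of 5 threads
         cudaDeviceSynchronize();  // wait for the kernel so its printf output is flushed
         return 0;
     }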
     ```

4. Compile with nvcc
   - For Tesla T4 (Compute Capability 7.5):
     ```bash
     !nvcc -arch=compute_75 -code=sm_75 hello.cu -o hello
     ```

5. Run the program
   ```bash
   !./hello
   ```
