## Exercise: Annotate Execution Spaces

The notion of execution space is a foundational concept of accelerated computing.
In this exercise you will verify your expectation of *where* any given code is executed.

Replace each `???` with `CPU` or `GPU`, based on where you think that line of code executes. The `ach::where_am_I` helper function checks each answer against where the call actually runs.

After making all the changes, run the next cell to verify your expectations.

In [1]:
import os

if os.getenv("COLAB_RELEASE_TAG"): # If running in Google Colab:
  !mkdir -p Sources
  !wget https://raw.githubusercontent.com/NVIDIA/accelerated-computing-hub/refs/heads/main/tutorials/cuda-cpp/notebooks/01.02-Execution-Spaces/Sources/ach.h -nv -O Sources/ach.h

2025-12-26 18:11:27 URL:https://raw.githubusercontent.com/NVIDIA/accelerated-computing-hub/refs/heads/main/tutorials/cuda-cpp/notebooks/01.02-Execution-Spaces/Sources/ach.h [2893/2893] -> "Sources/ach.h" [1]


In [2]:
%%writefile Sources/no-magic-execution-space-changes.cpp
#include "ach.h"

int main() {
  // TODO: Replace ??? with CPU or GPU
  ach::where_am_I("CPU");

  thrust::universal_vector<int> vec{1};
  thrust::for_each(thrust::device, vec.begin(), vec.end(),
                   [] __host__ __device__(int) { ach::where_am_I("GPU"); });

  thrust::for_each(thrust::host, vec.begin(), vec.end(),
                   [] __host__ __device__(int) { ach::where_am_I("CPU"); });

  ach::where_am_I("CPU");
}

Writing Sources/no-magic-execution-space-changes.cpp


In [3]:
!nvcc -o /tmp/a.out --extended-lambda Sources/no-magic-execution-space-changes.cpp -x cu -arch=native # build executable
!/tmp/a.out # run executable

Correct! The function is invoked on CPU
Correct! The function is invoked on GPU
Correct! The function is invoked on CPU
Correct! The function is invoked on CPU


If you’re unsure how to proceed, expand the section below for guidance. Use the hints only after giving the problem a genuine attempt.

<details>
  <summary>Hints</summary>
  
  - For invocations in the `main` function, consult the [heterogeneous programming model](01.02.01-Execution-Spaces.ipynb#Heterogeneous-Programming-Model) section.
  - For invocations in lambdas, consult the [execution policy](01.02.01-Execution-Spaces.ipynb#Execution-Policy) section.
</details>

Open this section only after you’ve made a serious attempt at solving the problem. Once you’ve completed your solution, compare it with the reference provided here to evaluate your approach and identify any potential improvements.

<details>
  <summary>Solution</summary>

  Key points:
  - The `main` function always runs on the CPU.
  - The first `thrust::for_each` call is given the `thrust::device` execution policy, so it invokes the lambda on the GPU.
  - The second `thrust::for_each` call is given the `thrust::host` execution policy, so it invokes the lambda on the CPU.

  Solution:
  ```c++
  ach::where_am_I("CPU");

  thrust::universal_vector<int> vec{1};
  thrust::for_each(thrust::device, vec.begin(), vec.end(),
                   [] __host__ __device__(int) { ach::where_am_I("GPU"); });

  thrust::for_each(thrust::host, vec.begin(), vec.end(),
                   [] __host__ __device__(int) { ach::where_am_I("CPU"); });

  ach::where_am_I("CPU");
  ```

  You can find the full solution [here](Solutions/no-magic-execution-space-changes.cpp).
</details>


---
Congratulations! You are now familiar with the concept of execution spaces in CUDA.
Proceed to the [next exercise](01.02.03-Exercise-Changing-Execution-Space.ipynb).