# Exercise: Computing Median Particle Velocity with GPU Acceleration

In our particle simulation work, we often need to calculate statistical properties of the system, such as the median velocity. In this exercise, we'll port a CPU-based median calculation to run on the GPU.


## The Task

We have a particle simulation that updates particle positions and velocities at each time step. To track the overall behavior of the system, we want the median velocity magnitude (speed) after every step. The particle update already runs on the GPU, but the median is still computed on the CPU, so the goal is to port the median calculation to the GPU as well.

Here's the starting code, which uses the CPU for the median calculation:

In [3]:
# Add the directory that contains nvcc (the NVIDIA CUDA compiler) to PATH so that notebook cells can invoke it.
import os
os.environ['PATH'] = "/packages/apps/spack/21/opt/spack/linux-rocky8-zen3/gcc-12.1.0/cuda-12.6.1-cf4xlcbcfpwchqwo5bktxyhjagryzcx6/bin:" + os.environ['PATH']

In [1]:
%%writefile codes/particle_median_cpu.cu
#include <thrust/universal_vector.h>
#include <thrust/transform.h>
#include <thrust/execution_policy.h>
#include <algorithm>
#include <cmath>
#include <cstdio>

struct Particle {
    float x, y;    // position
    float vx, vy;  // velocity
};

// Calculate magnitude of velocity vector
float velocity_magnitude(float vx, float vy) {
    return std::sqrt(vx * vx + vy * vy);
}

// CPU implementation of median calculation.
// The particles are taken by const reference to avoid copying them on every call.
float median_velocity(const thrust::universal_vector<Particle>& particles) 
{
    // Extract velocity magnitudes
    thrust::universal_vector<float> velocities(particles.size());
    for (size_t i = 0; i < particles.size(); i++) {
        velocities[i] = velocity_magnitude(particles[i].vx, particles[i].vy);
    }
    
    // Use standard C++ algorithm to sort on CPU
    std::sort(velocities.begin(), velocities.end());
    
    // Return the median value
    return velocities[velocities.size() / 2];
}

int main() 
{
    // Simulation parameters
    float dt = 0.1f;  // time step
    float damping = 0.98f;  // velocity damping factor
    
    // Initial particle states
    thrust::universal_vector<Particle> particles{
        {0.0f, 0.0f, 1.0f, 0.5f},    // Particle 1
        {1.0f, 2.0f, -0.5f, 0.2f},   // Particle 2
        {-1.0f, -1.0f, 0.3f, 0.7f},  // Particle 3
        {2.0f, -2.0f, -0.1f, -0.8f}, // Particle 4
        {3.0f, 1.0f, -0.4f, 0.6f}    // Particle 5
    };
    
    // Update function (runs on GPU)
    auto update_particle = [=] __host__ __device__ (Particle p) { 
        // Update position based on velocity
        p.x += p.vx * dt;
        p.y += p.vy * dt;
        
        // Apply damping to velocities
        p.vx *= damping;
        p.vy *= damping;
        
        return p;
    };
    
    // Simulation loop
    std::printf("step  median_velocity\n");
    for (int step = 0; step < 3; step++) {
        // Update particles on GPU
        thrust::transform(thrust::device, 
                         particles.begin(), particles.end(), 
                         particles.begin(), 
                         update_particle);
        
        // Calculate median velocity on CPU
        float median_vel = median_velocity(particles);
        
        std::printf("%d     %.4f\n", step, median_vel);
    }
}

Overwriting codes/particle_median_cpu.cu
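
As a side note on the host-side version: a median does not strictly require a full sort. The sketch below swaps in `std::nth_element`, a selection algorithm from the C++ standard library, in place of `std::sort`. The function name `median_by_selection` and the use of `std::vector` are illustrative assumptions and are not part of the exercise code.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: median via selection instead of a full sort.
// The vector is taken by value because std::nth_element reorders its input.
float median_by_selection(std::vector<float> values) {
    assert(!values.empty());
    auto mid = values.begin() + values.size() / 2;
    // After this call, *mid is the element a full sort would place at index size() / 2.
    std::nth_element(values.begin(), mid, values.end());
    return *mid;
}
```

For the five particles used here the difference is negligible; the exercise itself keeps the sort-based approach because that is what gets ported to `thrust::sort`.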


Let's compile and run this CPU version:


In [6]:
%%bash
nvcc -o codes/particle_median_cpu --extended-lambda codes/particle_median_cpu.cu
./codes/particle_median_cpu

step  median_velocity
0     0.7463
1     0.7314
2     0.7168
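
The printed values can be checked by hand. Damping scales both velocity components by the same factor, so every speed shrinks by 0.98 per step and the particle holding the median stays the same. Below is a minimal host-only sketch of that check; the function name `expected_median_speed` is made up for illustration, and the velocities are copied from the particle list in `main`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical reference: median speed of the five initial particles
// after `steps` damping updates. Positions do not affect the speed.
float expected_median_speed(int steps, float damping = 0.98f) {
    assert(steps >= 0);
    // Initial velocity components, in the same order as the particle list.
    const float vx[5] = {1.0f, -0.5f, 0.3f, -0.1f, -0.4f};
    const float vy[5] = {0.5f, 0.2f, 0.7f, -0.8f, 0.6f};

    std::vector<float> speed(5);
    for (int i = 0; i < 5; ++i) {
        speed[i] = std::sqrt(vx[i] * vx[i] + vy[i] * vy[i]);
    }
    std::sort(speed.begin(), speed.end());

    // All speeds are scaled by damping^steps, so the median is scaled the same way.
    return speed[speed.size() / 2] * std::pow(damping, static_cast<float>(steps));
}
```

Because the median is taken after each update, `expected_median_speed(1)`, `expected_median_speed(2)`, and `expected_median_speed(3)` correspond to the printed steps 0, 1, and 2, and should agree with the output above to about four decimal places. The same values serve as a reference once the GPU port is running.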


## Your Challenge: Port to GPU
Now modify the `median_velocity` function so that it runs on the GPU instead of the CPU. The main changes you'll need to make are:

1. Use `thrust::transform` to calculate the velocity magnitudes on the GPU

2. Replace `std::sort` with `thrust::sort` and run it on the GPU

3. Keep every step on the GPU: no host-side loop over `velocities` and no `std::` algorithms

Here's the template for your modified code:

In [7]:
%%writefile codes/particle_median_gpu.cu
#include <thrust/universal_vector.h>
#include <thrust/transform.h>
#include <thrust/sort.h>
#include <thrust/execution_policy.h>
#include <cstdio>

struct Particle {
    float x, y;    // position
    float vx, vy;  // velocity
};

// TODO: Modify this function to use GPU for all operations
float median_velocity(const thrust::universal_vector<Particle>& particles) 
{
    // Extract velocity magnitudes (should use GPU)
    thrust::universal_vector<float> velocities(particles.size());
    
    // TODO: Replace the CPU loop with a GPU operation
    
    // TODO: Replace std::sort with GPU-accelerated version
    
    // Return the median value
    return velocities[velocities.size() / 2];
}

int main() 
{
    // Simulation parameters
    float dt = 0.1f;  // time step
    float damping = 0.98f;  // velocity damping factor
    
    // Initial particle states
    thrust::universal_vector<Particle> particles{
        {0.0f, 0.0f, 1.0f, 0.5f},    // Particle 1
        {1.0f, 2.0f, -0.5f, 0.2f},   // Particle 2
        {-1.0f, -1.0f, 0.3f, 0.7f},  // Particle 3
        {2.0f, -2.0f, -0.1f, -0.8f}, // Particle 4
        {3.0f, 1.0f, -0.4f, 0.6f}    // Particle 5
    };
    
    // Update function (runs on GPU)
    auto update_particle = [=] __host__ __device__ (Particle p) { 
        // Update position based on velocity
        p.x += p.vx * dt;
        p.y += p.vy * dt;
        
        // Apply damping to velocities
        p.vx *= damping;
        p.vy *= damping;
        
        return p;
    };
    
    // Simulation loop
    std::printf("step  median_velocity\n");
    for (int step = 0; step < 3; step++) {
        // Update particles on GPU
        thrust::transform(thrust::device, 
                         particles.begin(), particles.end(), 
                         particles.begin(), 
                         update_particle);
        
        // Calculate median velocity (should use GPU now)
        float median_vel = median_velocity(particles);
        
        std::printf("%d     %.4f\n", step, median_vel);
    }
}

Writing codes/particle_median_gpu.cu


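After implementing your solution, compile and run it. The unmodified template prints `0.0000` at every step, because `velocities` is zero-initialized and never filled in; a correct port should reproduce the CPU output above (`0.7463`, `0.7314`, `0.7168`).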


In [8]:
%%bash
nvcc -o codes/particle_median_gpu --extended-lambda codes/particle_median_gpu.cu
./codes/particle_median_gpu

step  median_velocity
0     0.0000
1     0.0000
2     0.0000


# Particle Median Velocity - Hints and Solution

## Hints

<details>
<summary>👉 Hint 1: Velocity Magnitude Calculation</summary>

- You need to transform each Particle into its velocity magnitude
- The velocity magnitude is `sqrt(vx*vx + vy*vy)`
- Consider using `thrust::transform` with a lambda function
- The lambda should take a `Particle` and return a `float`

</details>

<details>
<summary>👉 Hint 2: Sorting on GPU</summary>

- Instead of `std::sort`, use `thrust::sort`
- `thrust::sort` accepts the iterators of a `thrust::universal_vector` directly, just as it does for a `thrust::device_vector`
- Don't forget to specify the execution policy (`thrust::device`)

</details>

<details>
<summary>👉 Hint 3: Overall Structure</summary>

Your solution should:
1. Extract velocities using transform
2. Sort the velocities on GPU
3. Return the middle element for the median

</details>
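
One detail about "the middle element": the index `size() / 2` is the exact median only when the number of particles is odd, as it is here with five particles. For an even count it picks the upper of the two middle values. The sketch below shows one common convention, averaging the two middle values; this convention and the helper name `median_of_sorted` are assumptions for illustration and are not required by the exercise.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: median of an already sorted, non-empty list of speeds.
float median_of_sorted(const std::vector<float>& sorted) {
    assert(!sorted.empty());
    assert(std::is_sorted(sorted.begin(), sorted.end()));

    const std::size_t mid = sorted.size() / 2;
    if (sorted.size() % 2 == 1) {
        return sorted[mid];  // odd count: a single middle element
    }
    // Even count: mean of the two middle elements.
    return 0.5f * (sorted[mid - 1] + sorted[mid]);
}
```

With an odd particle count, as in this exercise, the helper returns the same value as `velocities[velocities.size() / 2]`.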

## Solution

<details>
<summary>👉 Click to see complete solution</summary>

```cpp
float median_velocity(const thrust::universal_vector<Particle>& particles) 
{
    // Extract velocity magnitudes using GPU
    thrust::universal_vector<float> velocities(particles.size());
    
    thrust::transform(
        thrust::device,
        particles.begin(), 
        particles.end(),
        velocities.begin(),
        [] __device__ (const Particle& p) {
            return sqrt(p.vx * p.vx + p.vy * p.vy);
        }
    );
    
    // Sort velocities on GPU
    thrust::sort(
        thrust::device,
        velocities.begin(), 
        velocities.end()
    );
    
    // Return the median value
    return velocities[velocities.size() / 2];
}
```

</details>
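
As an alternative to the extended lambda in the solution, the magnitude calculation can be written as a named function object. The sketch below is one possible way to do that, under a few assumptions: the `HOST_DEVICE` macro and the names `VelocityMagnitude` and `speeds_on_host` are invented for this example. The macro expands to `__host__ __device__` under nvcc and to nothing otherwise, so the same functor also compiles with a plain C++ compiler for quick host-side testing.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// CUDA qualifiers under nvcc, nothing under a host-only compiler.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

struct Particle {
    float x, y;    // position
    float vx, vy;  // velocity
};

// Hypothetical functor: maps a Particle to its velocity magnitude.
struct VelocityMagnitude {
    HOST_DEVICE float operator()(const Particle& p) const {
        return std::sqrt(p.vx * p.vx + p.vy * p.vy);
    }
};

// Host-side check of the functor using std::transform.
void speeds_on_host(const std::vector<Particle>& in, std::vector<float>& out) {
    assert(in.size() == out.size());
    std::transform(in.begin(), in.end(), out.begin(), VelocityMagnitude{});
}
```

In the GPU version, `VelocityMagnitude{}` would take the place of the lambda in the `thrust::transform` call. A named functor does not depend on the `--extended-lambda` flag for this particular call, although the lambda in `main` still needs it.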

