## RDF
The radial distribution function (RDF), denoted $g(r)$ in equations, describes the probability of finding a particle at a distance $r$ from a tagged reference particle. The RDF depends strongly on the state of matter, so it varies greatly between solids, liquids, and gases.
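
For reference, a common way to write this definition, stated here as an assumption about the convention rather than something taken from the course code, normalizes the average number of neighbours in a thin shell by the count expected for an ideal gas of the same density:

$$
g(r) = \frac{\langle n(r) \rangle}{4 \pi r^{2} \, \Delta r \, \rho}, \qquad \rho = \frac{N}{V}
$$

Here $\langle n(r) \rangle$ is the average number of particles found in the shell between $r$ and $r + \Delta r$ around a tagged particle, $N$ is the number of particles, and $V$ is the volume of the simulation box. With this normalization $g(r) \to 1$ at large $r$ in a liquid or gas, while a crystalline solid shows sharp peaks at its lattice distances.
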
<img src="../images/rdf.png" width="40%" height="40%">

# **My version**

In [13]:
%%bash
#Compile the code
make clean && make

rm -f *.o rdf
nvc++  -acc -fast -Minfo=accel -o rdf rdf.cpp  -I/apps/2025/manual_install/nvhpc/24.11/Linux_aarch64/24.11/cuda/include 


main:
     91, Generating copy(h_g2[:nbin]) [if not already present]
         Generating copyin(h_x[:numatm*nconf],h_z[:numatm*nconf],h_y[:numatm*nconf]) [if not already present]
round(float):
    167, Generating implicit acc routine seq
         Generating acc routine seq
         Generating NVIDIA GPU code
pair_gpu(double const*, double const*, double const*, unsigned long long*, int, int, double, double, double, int):
    181, Generating present(d_g2[:],d_x[:],d_z[:],d_y[:])
         Generating implicit firstprivate(numatm,nconf)
         Generating NVIDIA GPU code
        183, #pragma acc loop gang, vector(128) collapse(3) /* blockIdx.x threadIdx.x */
        185,   /* blockIdx.x threadIdx.x collapsed */
        187,   /* blockIdx.x threadIdx.x collapsed */
    187, Generating implicit firstprivate(del,dx,dy,r,xbox,zbox,ybox,ig2,cut,dz)


As you might have observed, the computational complexity of the algorithm is of order $N^{2}$. Let us look at the sequential code in detail. **Understand and analyze** the code present at:

[RDF Serial Code](rdf.cpp)

[File Reader](dcdread.h)

[Makefile](Makefile)
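
The $N^{2}$ scaling comes from visiting every ordered pair of atoms in each frame. The sketch below is a minimal, single-frame illustration and not the course's `rdf.cpp`: the function name `pair_histogram`, the cubic box of side `box`, and the choice of half the box length as the cutoff are assumptions made for the example.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from rdf.cpp): histogram of pair distances for
// one frame in a cubic periodic box of side `box`.
std::vector<unsigned long long> pair_histogram(const std::vector<double>& x,
                                               const std::vector<double>& y,
                                               const std::vector<double>& z,
                                               double box, int nbin) {
    const double cut = 0.5 * box;   // assumed cutoff: half the box length
    const double del = cut / nbin;  // bin width
    std::vector<unsigned long long> g2(nbin, 0);
    const std::size_t n = x.size();

    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {  // N * N iterations per frame
            if (i == j) continue;              // skip self-pairs

            // Minimum-image convention: use the nearest periodic copy of j.
            double dx = x[i] - x[j];
            double dy = y[i] - y[j];
            double dz = z[i] - z[j];
            dx -= box * std::round(dx / box);
            dy -= box * std::round(dy / box);
            dz -= box * std::round(dz / box);

            const double r = std::sqrt(dx * dx + dy * dy + dz * dz);
            const int ig2 = static_cast<int>(r / del);
            if (r < cut && ig2 < nbin) {
                ++g2[ig2];                     // count the pair in its bin
            }
        }
    }
    return g2;
}
```

Every iteration of the two inner loops is independent except for the increment of the shared histogram, which is why the loop nest is a natural target for parallelization once that update is protected.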

Open the downloaded file for inspection, make the changes, and add the OpenACC directives to parallelize the code. Then compile the code with the `make clean && make` cell.


__To pass the assessment, you need to use the data directives and explicitly manage the memory rather than using the `managed` memory flag in the `Makefile`.__
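
One way to manage memory explicitly is a structured `data` region around the kernel call, with a `present` clause on the kernel itself. The sketch below follows the clause shapes visible in the compiler feedback (`copyin` for the coordinates, `copy` for the histogram), but the function names `pair_count` and `rdf_histogram`, the cubic box, and the loop body are assumptions for illustration, not the reference solution.

```cpp
#include <cmath>

// Hypothetical kernel: accumulates pair-distance counts over all frames.
// Coordinates are stored frame by frame: index = frame * numatm + atom.
void pair_count(const double* x, const double* y, const double* z,
                unsigned long long* g2, int numatm, int nconf,
                double box, int nbin) {
    const double cut = 0.5 * box;   // assumed cutoff: half the box length
    const double del = cut / nbin;  // bin width
    const int n = numatm * nconf;

    // present(...) states that the arrays are already on the device, so no
    // transfer happens here; the enclosing data region owns the copies.
    #pragma acc parallel loop collapse(3) present(x[0:n], y[0:n], z[0:n], g2[0:nbin])
    for (int frame = 0; frame < nconf; ++frame) {
        for (int i = 0; i < numatm; ++i) {
            for (int j = 0; j < numatm; ++j) {
                if (i != j) {
                    const int a = frame * numatm + i;
                    const int b = frame * numatm + j;
                    double dx = x[a] - x[b];
                    double dy = y[a] - y[b];
                    double dz = z[a] - z[b];
                    dx -= box * std::round(dx / box);  // minimum image
                    dy -= box * std::round(dy / box);
                    dz -= box * std::round(dz / box);
                    const double r = std::sqrt(dx * dx + dy * dy + dz * dz);
                    const int ig2 = static_cast<int>(r / del);
                    if (r < cut && ig2 < nbin) {
                        // Many threads may hit the same bin at once.
                        #pragma acc atomic update
                        g2[ig2] += 1;
                    }
                }
            }
        }
    }
}

// Hypothetical driver: coordinates go to the device once (copyin); the
// histogram is copied in and back out (copy) when the region ends.
void rdf_histogram(const double* h_x, const double* h_y, const double* h_z,
                   unsigned long long* h_g2, int numatm, int nconf,
                   double box, int nbin) {
    const int n = numatm * nconf;
    #pragma acc data copyin(h_x[0:n], h_y[0:n], h_z[0:n]) copy(h_g2[0:nbin])
    {
        pair_count(h_x, h_y, h_z, h_g2, numatm, nconf, box, nbin);
    }
}
```

Compiled without `-acc`, the pragmas are ignored and the code runs serially, which is convenient for checking results on the host before profiling on the GPU.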

Let's run the executable and validate the output first. Then, profile the code.

In [6]:
%%bash
#Run the GPU code and check the output
./rdf && cat Pair_entropy.dat

Dcd file has 6720 atoms and 10001 frames
Calculating RDF for 10 frames
Reading of input file is completed
#Freeing Host memory
#Number of atoms processed: 6720

#Number of confs processed: 10

s2 value is -2.43191
s2bond value is -3.87014


The output should be the following:

```
s2 value is -2.43191
s2bond value is -3.87014
```

In [11]:
!nsys profile -t openacc --stats=true --force-overwrite true -o rdf ./rdf



Collecting data...
Dcd file has 6720 atoms and 10001 frames
Calculating RDF for 10 frames
Reading of input file is completed
#Freeing Host memory
#Number of atoms processed: 6720

#Number of confs processed: 10

Generating '/tmp/nsys-report-7e32.qdstrm'
[3/7] Executing 'cuda_api_sum' stats report

 Time (%)  Total Time (ns)  Num Calls  Avg (ns)   Med (ns)  Min (ns)  Max (ns)  StdDev (ns)          Name        
 --------  ---------------  ---------  ---------  --------  --------  --------  -----------  --------------------
     95.3         14072288          4  3518072.0    1888.0      1472  14067040    7032645.3  cuStreamSynchronize 
      1.8           260384          5    52076.8    4032.0      2176    126624      67473.4  cuMemAlloc_v2       
      1.2           183136          1   183136.0  183136.0    183136    183136          0.0  cuMemAllocHost_v2   
      1.1           158784          1   158784.0  158784.0    158784    158784          0.0  cuModuleLoadDataEx  
      0.4        

To view the profiler report, download and save the report file by holding down <mark>Shift</mark> and <mark>right-clicking</mark> [Here](rdf.qdrep) (choose *Save link as*), then open it in the Nsight Systems GUI.

Once you are ready, run the cell below to assess your code.

In [1]:
!./run_assess

.
./run_assess: line 28: 1917701 Segmentation fault      (core dumped) nvc++ -acc -Minfo=accel -o rdf rdf.cpp -L/opt/softwares/cuda/cuda-12.6/lib64 -lnvToolsExt
main:
     91, Generating copy(h_g2[:nbin]) [if not already present]
         Generating copyin(h_x[:numatm*nconf],h_z[:numatm*nconf],h_y[:numatm*nconf]) [if not already present]
round(float):
    167, Generating implicit acc routine seq
         Generating acc routine seq
         Generating NVIDIA GPU code
pair_gpu(double const*, double const*, double const*, unsigned long long*, int, int, double, double, double, int):
    181, Generating present(d_g2[:],d_x[:],d_z[:],d_y[:])
         Generating implicit firstprivate(nconf,numatm)
         Generating NVIDIA GPU code
        183, #pragma acc loop gang, vector(128) collapse(3) /* blockIdx.x threadIdx.x */
        185,   /* blockIdx.x threadIdx.x collapsed */
        187,   /* blockIdx.x threadIdx.x collapsed */
    187, Generating implicit firstprivate(del,dx,dy,r,xbox,zbox,ybo

## Get Credit for Your Work

After successfully completing your work, revisit the web page where you launched this coding environment and click the "ASSESS TASK" button. After doing so, you will get instructions for generating a *Certificate of Competency* for the course.

![get_credit](../images/run_the_assessment.png)

-----


# Links and Resources
<!--[OpenACC API guide](https://www.openacc.org/sites/default/files/inline-files/OpenACC%20API%202.6%20Reference%20Guide.pdf)-->

[NVIDIA Nsight Systems](https://docs.nvidia.com/nsight-systems/)

<!--[NVIDIA Nsight Compute](https://developer.nvidia.com/nsight-compute)-->

<!--[CUDA Toolkit Download](https://developer.nvidia.com/cuda-downloads)-->

[Profiling timelines with NVTX](https://devblogs.nvidia.com/cuda-pro-tip-generate-custom-application-profile-timelines-nvtx/)

**NOTE**: To view the Nsight Systems profiler output, download the latest version of Nsight Systems from [here](https://developer.nvidia.com/nsight-systems).

Don't forget to check out additional [OpenACC Resources](https://www.openacc.org/resources) and join our [OpenACC Slack Channel](https://www.openacc.org/community#slack) to share your experience and get more help from the community.

--- 

## Licensing 

This material is released by NVIDIA Corporation under the Creative Commons Attribution 4.0 International (CC BY 4.0). 