<div align="center"><h1>N-Body Problem</h1></div>

In physics and astronomy, an N-body simulation is a simulation of a dynamical system of particles, usually under the influence of physical forces such as gravity. Direct N-body simulations are used to study the dynamical evolution of star clusters. Beyond gravitational masses, a variety of physical systems can be modeled by the interaction of N particles; for example, atoms or ions interacting through electrostatic and van der Waals forces give rise to molecular dynamics. The integral formulation of problems modeled by elliptic partial differential equations also leads to numerical integration that has the same computational form as an N-body interaction. In this way, N-body algorithms are applicable to acoustics, electromagnetics, and fluid dynamics. Adding to this diversity of applications, radiosity algorithms for global illumination problems in computer graphics also benefit from N-body methods.
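To make the direct (all-pairs) formulation concrete, here is a minimal C++ sketch of a gravitational N-body step. It is not the code in `Barnes-Hut-Cpp.cpp` or the CUDA sources used below. The names `Vec3`, `Body`, `directAccelerations`, `centerOfMass`, and `step`, the softening constant `kSoftening`, and the choice of a semi-implicit Euler integrator are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration; the notebook's sources may differ.
struct Vec3 { double x, y, z; };

struct Body {
    Vec3 pos;
    Vec3 vel;
    double mass;
};

constexpr double kG = 6.674e-11;     // gravitational constant (SI units)
constexpr double kSoftening = 1e-9;  // assumed softening length; avoids division by zero

// Direct summation: every body feels every other body, so the cost is O(N^2).
std::vector<Vec3> directAccelerations(const std::vector<Body>& bodies) {
    std::vector<Vec3> acc(bodies.size(), Vec3{0.0, 0.0, 0.0});
    for (std::size_t i = 0; i < bodies.size(); ++i) {
        for (std::size_t j = 0; j < bodies.size(); ++j) {
            if (i == j) continue;
            const double dx = bodies[j].pos.x - bodies[i].pos.x;
            const double dy = bodies[j].pos.y - bodies[i].pos.y;
            const double dz = bodies[j].pos.z - bodies[i].pos.z;
            const double r2 = dx * dx + dy * dy + dz * dz + kSoftening * kSoftening;
            // a_i += G * m_j * (r_j - r_i) / |r_j - r_i|^3
            const double s = kG * bodies[j].mass / (r2 * std::sqrt(r2));
            acc[i].x += s * dx;
            acc[i].y += s * dy;
            acc[i].z += s * dz;
        }
    }
    return acc;
}

// Mass-weighted mean position of all bodies.
Vec3 centerOfMass(const std::vector<Body>& bodies) {
    assert(!bodies.empty());
    double total = 0.0;
    Vec3 c{0.0, 0.0, 0.0};
    for (const Body& b : bodies) {
        total += b.mass;
        c.x += b.mass * b.pos.x;
        c.y += b.mass * b.pos.y;
        c.z += b.mass * b.pos.z;
    }
    assert(total > 0.0);
    return Vec3{c.x / total, c.y / total, c.z / total};
}

// One semi-implicit Euler step: update velocities first, then positions.
void step(std::vector<Body>& bodies, double dt) {
    const std::vector<Vec3> acc = directAccelerations(bodies);
    for (std::size_t i = 0; i < bodies.size(); ++i) {
        bodies[i].vel.x += acc[i].x * dt;
        bodies[i].vel.y += acc[i].y * dt;
        bodies[i].vel.z += acc[i].z * dt;
        bodies[i].pos.x += bodies[i].vel.x * dt;
        bodies[i].pos.y += bodies[i].vel.y * dt;
        bodies[i].pos.z += bodies[i].vel.z * dt;
    }
}
```

Calling `step(bodies, dt)` once per time step and then printing `centerOfMass(bodies)` would give a per-step report similar in spirit to the "Center of Mass" lines in the runs below. The Barnes-Hut variants below replace the inner all-pairs loop with an octree traversal.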

In [11]:
!nvidia-smi

Thu Dec  3 22:52:35 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|   0  Tesla V100-SXM2...  Off  | 00000000:00:1E.0 Off |                    0 |
| N/A   34C    P0    36W / 300W |      0MiB / 16160MiB |      5%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage    

# Serial Barnes-Hut

In [12]:
!nvcc -o SerialBarnesHut Barnes-Hut-Cpp.cpp -lm

In [13]:
!./SerialBarnesHut

configuration: 100 bodies, 1 time steps

1.377000 ms
Timestep 0 Center of Mass = (-1.069650e-01,-2.987980e-02,1.083692e-01)


In [14]:
!nsys profile --stats=true --force-overwrite true -o barnesHutProfile ./SerialBarnesHut

Collecting data...
The target application terminated. One or more process it created re-parented.
Waiting for termination of re-parented processes.
Use the `--wait` option to modify this behavior.
1.396000 ms
Timestep 0 Center of Mass = (-1.069650e-01,-2.987980e-02,1.083692e-01)
configuration: 100 bodies, 1 time steps

Processing events...
Capturing symbol files...
Saving temporary "/tmp/nsys-report-a735-4ad1-166c-2a01.qdstrm" file to disk...
Creating final output files...

Saved report file to "/tmp/nsys-report-a735-4ad1-166c-2a01.qdrep"

Exported successfully to
/tmp/nsys-report-a735-4ad1-166c-2a01.sqlite

Generating CUDA API Statistics...
CUDA API Statistics (nanoseconds)




CUDA trace data was not collected.


Generating Operating System Runtime API Statistics...
Operating System Runtime API Statistics (nanoseconds)

Time(%)      Total Time       Calls         Average         Minimum         Maximum  Name                                                                            


In [15]:
!nvprof ./SerialBarnesHut

configuration: 100 bodies, 1 time steps

1.401000 ms
Timestep 0 Center of Mass = (-1.069650e-01,-2.987980e-02,1.083692e-01)


# Naive Barnes-Hut

In [16]:
!(cd DynamicPar_NlogN/ && ls)

Constants.h  build.bash        cudaBhtree.cu  cudanlogn.cu
a.out	     createVideo.bash  cudaOctant.cu  deleteImgs.bash


In [17]:
!(cd DynamicPar_NlogN/ && chmod u+x cudanlogn.cu)

In [24]:
!(cd DynamicPar_NlogN/ && nvcc -o NaiveParallel -std=c++11 -rdc=true -arch=sm_70 cudanlogn.cu -run)

Nbodies: 100

Each Particle weight: 3e+28
______________________________

Beginning timestep: 1
calculating force from star...
Building octree...
calculating interactions...
updating particle positions...
-------Done------- timestep: 1

Beginning timestep: 2
calculating force from star...
Building octree...
calculating interactions...
updating particle positions...
-------Done------- timestep: 2

Beginning timestep: 3
calculating force from star...
Building octree...
calculating interactions...
updating particle positions...
-------Done------- timestep: 3

Beginning timestep: 4
calculating force from star...
Building octree...
calculating interactions...
updating particle positions...
-------Done------- timestep: 4

we made it


In [26]:
!(cd DynamicPar_NlogN/ && nvprof ./NaiveParallel)

Nbodies: 100
==1143== NVPROF is profiling process 1143, command: ./NaiveParallel

Each Particle weight: 3e+28
______________________________

Beginning timestep: 1
calculating force from star...
Building octree...
calculating interactions...
updating particle positions...
-------Done------- timestep: 1

Beginning timestep: 2
calculating force from star...
Building octree...
calculating interactions...
updating particle positions...
-------Done------- timestep: 2

Beginning timestep: 3
calculating force from star...
Building octree...
calculating interactions...
updating particle positions...
-------Done------- timestep: 3

Beginning timestep: 4
calculating force from star...
Building octree...
calculating interactions...
updating particle positions...
-------Done------- timestep: 4

we made it
==1143== Profiling application: ./NaiveParallel
==1143== Profiling result:
            Type  Time(%)      Time     Calls       Avg       Min       Max  Name
 GPU activities:   78.59%  8.9280us   

In [27]:
!(cd DynamicPar_NlogN/ && nsys profile --stats=true --force-overwrite true -o NaiveParallelProfile ./NaiveParallel)


Collecting data...
Nbodies: 100

Each Particle weight: 3e+28
______________________________

Beginning timestep: 1
calculating force from star...
Building octree...
calculating interactions...
updating particle positions...
-------Done------- timestep: 1

Beginning timestep: 2
calculating force from star...
Building octree...
calculating interactions...
updating particle positions...
-------Done------- timestep: 2

Beginning timestep: 3
calculating force from star...
Building octree...
calculating interactions...
updating particle positions...
-------Done------- timestep: 3

Beginning timestep: 4
calculating force from star...
Building octree...
calculating interactions...
updating particle positions...
-------Done------- timestep: 4

we made it
Processing events...
Capturing symbol files...
Saving temporary "/tmp/nsys-report-a18d-207f-ad90-9406.qdstrm" file to disk...
Creating final output files...

Saved report file to "/tmp/nsys-report-a18d-207f-ad90-9406.qdrep"

Exported successful

# Barnes-Hut Algorithm Optimized With Shared Memory

In [19]:
!(cd BarnesHut-SharedMemory/ && nvcc -arch=sm_70 -o barnesHut main.cu )

In [20]:
!(cd BarnesHut-SharedMemory/ && ./barnesHut 10000 100 0 )

Device count: 1configuration: 10000 bodies, 100 time steps
Start of timestep 
TIMESTEP = 0 
TIMESTEP = 1 
TIMESTEP = 2 
TIMESTEP = 3 
TIMESTEP = 4 
TIMESTEP = 5 
TIMESTEP = 6 
TIMESTEP = 7 
TIMESTEP = 8 
TIMESTEP = 9 
TIMESTEP = 10 
TIMESTEP = 11 
TIMESTEP = 12 
TIMESTEP = 13 
TIMESTEP = 14 
TIMESTEP = 15 
TIMESTEP = 16 
TIMESTEP = 17 
TIMESTEP = 18 
TIMESTEP = 19 
TIMESTEP = 20 
TIMESTEP = 21 
TIMESTEP = 22 
TIMESTEP = 23 
TIMESTEP = 24 
TIMESTEP = 25 
TIMESTEP = 26 
TIMESTEP = 27 
TIMESTEP = 28 
TIMESTEP = 29 
TIMESTEP = 30 
TIMESTEP = 31 
TIMESTEP = 32 
TIMESTEP = 33 
TIMESTEP = 34 
TIMESTEP = 35 
TIMESTEP = 36 
TIMESTEP = 37 
TIMESTEP = 38 
TIMESTEP = 39 
TIMESTEP = 40 
TIMESTEP = 41 
TIMESTEP = 42 
TIMESTEP = 43 
TIMESTEP = 44 
TIMESTEP = 45 
TIMESTEP = 46 
TIMESTEP = 47 
TIMESTEP = 48 
TIMESTEP = 49 
TIMESTEP = 50 
TIMESTEP = 51 
TIMESTEP = 52 
TIMESTEP = 53 
TIMESTEP = 54 
TIMESTEP = 55 
TIMESTEP = 56 
TIMESTEP = 57 
TIMESTEP = 58 
TIMESTEP = 59 
TIMESTEP = 60 
TIMESTEP = 61 
TI

In [21]:
!(cd BarnesHut-SharedMemory/ && nvprof ./barnesHut 10000 100 0 )


==947== NVPROF is profiling process 947, command: ./barnesHut 10000 100 0
Device count: 1configuration: 10000 bodies, 100 time steps
Start of timestep 
TIMESTEP = 0 
TIMESTEP = 1 
TIMESTEP = 2 
TIMESTEP = 3 
TIMESTEP = 4 
TIMESTEP = 5 
TIMESTEP = 6 
TIMESTEP = 7 
TIMESTEP = 8 
TIMESTEP = 9 
TIMESTEP = 10 
TIMESTEP = 11 
TIMESTEP = 12 
TIMESTEP = 13 
TIMESTEP = 14 
TIMESTEP = 15 
TIMESTEP = 16 
TIMESTEP = 17 
TIMESTEP = 18 
TIMESTEP = 19 
TIMESTEP = 20 
TIMESTEP = 21 
TIMESTEP = 22 
TIMESTEP = 23 
TIMESTEP = 24 
TIMESTEP = 25 
TIMESTEP = 26 
TIMESTEP = 27 
TIMESTEP = 28 
TIMESTEP = 29 
TIMESTEP = 30 
TIMESTEP = 31 
TIMESTEP = 32 
TIMESTEP = 33 
TIMESTEP = 34 
TIMESTEP = 35 
TIMESTEP = 36 
TIMESTEP = 37 
TIMESTEP = 38 
TIMESTEP = 39 
TIMESTEP = 40 
TIMESTEP = 41 
TIMESTEP = 42 
TIMESTEP = 43 
TIMESTEP = 44 
TIMESTEP = 45 
TIMESTEP = 46 
TIMESTEP = 47 
TIMESTEP = 48 
TIMESTEP = 49 
TIMESTEP = 50 
TIMESTEP = 51 
TIMESTEP = 52 
TIMESTEP = 53 
TIMESTEP = 54 
TIMESTEP = 55 
TIMESTEP = 56 
TIM

In [22]:
!(cd BarnesHut-SharedMemory/ && nsys profile --stats=true --force-overwrite true -o barnesHutProfile ./barnesHut 10000 100 0)


Collecting data...
Device count: 1configuration: 10000 bodies, 100 time steps
Start of timestep 
TIMESTEP = 0 
TIMESTEP = 1 
TIMESTEP = 2 
TIMESTEP = 3 
TIMESTEP = 4 
TIMESTEP = 5 
TIMESTEP = 6 
TIMESTEP = 7 
TIMESTEP = 8 
TIMESTEP = 9 
TIMESTEP = 10 
TIMESTEP = 11 
TIMESTEP = 12 
TIMESTEP = 13 
TIMESTEP = 14 
TIMESTEP = 15 
TIMESTEP = 16 
TIMESTEP = 17 
TIMESTEP = 18 
TIMESTEP = 19 
TIMESTEP = 20 
TIMESTEP = 21 
TIMESTEP = 22 
TIMESTEP = 23 
TIMESTEP = 24 
TIMESTEP = 25 
TIMESTEP = 26 
TIMESTEP = 27 
TIMESTEP = 28 
TIMESTEP = 29 
TIMESTEP = 30 
TIMESTEP = 31 
TIMESTEP = 32 
TIMESTEP = 33 
TIMESTEP = 34 
TIMESTEP = 35 
TIMESTEP = 36 
TIMESTEP = 37 
TIMESTEP = 38 
TIMESTEP = 39 
TIMESTEP = 40 
TIMESTEP = 41 
TIMESTEP = 42 
TIMESTEP = 43 
TIMESTEP = 44 
TIMESTEP = 45 
TIMESTEP = 46 
TIMESTEP = 47 
TIMESTEP = 48 
TIMESTEP = 49 
TIMESTEP = 50 
TIMESTEP = 51 
TIMESTEP = 52 
TIMESTEP = 53 
TIMESTEP = 54 
TIMESTEP = 55 
TIMESTEP = 56 
TIMESTEP = 57 
TIMESTEP = 58 
TIMESTEP = 59 
TIMESTEP = 60