# Results From Conway's Game of Life Simulations Using CUDA


By: Tanmay Grandhisiri

## Directions for your instructor


To test part 1, change `cc` to `gcc` in the makefile; for part 2, change `cc` to `nvcc`. Then run the following commands:
1. make
2. make clean
3. make test

## Serial Experiment

I compiled the serial code on `dev-amd20-v100` using the following commands:

```
make

make clean

make test
```

The command below was used for part 1 testing, with 100 replaced in turn by 5, 10, 15, 20, and 25:

```
time echo "0 100" | ./gol
```

I ran the code on `dev-amd20-v100` and obtained the following results. The plot below shows the time taken to run the simulation for different iteration counts, with and without the compiler optimization flag `-O3`.

```
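import numpy as np

iterations = [5, 10, 15, 20, 25]  # values used in place of 100 in the command
times = np.array([62.92, 91.569, 121.26, 141.977, 172.66])  # without -O3
times_optimized = np.array([30.252, 31.667, 33.966, 35.949, 38.586])  # with -O3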
```

![image-2.png](attachment:image-2.png)
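The serial timings can also be summarized as a speedup factor for each iteration count. The snippet below is a minimal sketch that divides the unoptimized time by the optimized time. It assumes, as the variable names suggest, that `times` was measured without `-O3` and `times_optimized` with it; the `speedup` name is introduced here for illustration only.

```python
import numpy as np

iterations = [5, 10, 15, 20, 25]
times = np.array([62.92, 91.569, 121.26, 141.977, 172.66])
times_optimized = np.array([30.252, 31.667, 33.966, 35.949, 38.586])

# Speedup from -O3: unoptimized time divided by optimized time (assumed mapping).
speedup = times / times_optimized

for n, s in zip(iterations, speedup):
    print(f"{n:>2} iterations: {s:.2f}x")
```

Because the division is element-wise, each entry of `speedup` lines up with the matching entry of `iterations`.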

## CUDA Speedup

I used CUDA to spread the workload over many threads, with each thread responsible for updating a single cell in the grid. In my setup, I defined a grid of thread blocks such that each block contains a small 2D array of threads (16×16). This way, each block handles a portion of the overall grid, and the total number of blocks covers all cells. Each thread reads the current state of its corresponding cell and its neighbors, applies the Game of Life rules, and writes the result back to the correct location in memory. By assigning one thread to each cell, I can process the entire grid in parallel, significantly speeding up the computation compared to a CPU-only implementation.
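The one-thread-per-cell mapping described above can be illustrated without a GPU. The sketch below is not the CUDA kernel from this project: it is a plain Python emulation that loops serially over every (block, thread) pair and uses the same index arithmetic a kernel would (`blockIdx * blockDim + threadIdx`). The names `blocks_needed`, `update_cell`, and `step` are hypothetical, and the sketch assumes cells beyond the border count as dead and that results go to a separate output array, neither of which is confirmed by the report.

```python
import numpy as np

BLOCK = 16  # threads per block edge, matching the 16x16 blocks described above


def blocks_needed(n, block=BLOCK):
    """Ceiling division: blocks needed to cover n cells along one axis."""
    return (n + block - 1) // block


def update_cell(src, dst, row, col):
    """Work done by one emulated thread: update a single cell."""
    rows, cols = src.shape
    if row >= rows or col >= cols:
        return  # thread lies outside the grid (partial block at the edge)
    live = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            r, c = row + dr, col + dc
            # Assumption: cells beyond the border count as dead.
            if 0 <= r < rows and 0 <= c < cols:
                live += src[r, c]
    if src[row, col] == 1:
        dst[row, col] = 1 if live in (2, 3) else 0  # survival rule
    else:
        dst[row, col] = 1 if live == 3 else 0  # birth rule


def step(src):
    """One generation, visiting every (block, thread) pair serially."""
    rows, cols = src.shape
    dst = np.zeros_like(src)  # assumed separate output buffer
    for by in range(blocks_needed(rows)):
        for bx in range(blocks_needed(cols)):
            for ty in range(BLOCK):
                for tx in range(BLOCK):
                    # Same arithmetic as blockIdx * blockDim + threadIdx.
                    update_cell(src, dst, by * BLOCK + ty, bx * BLOCK + tx)
    return dst


# A blinker: three live cells in a row flip between horizontal and vertical.
grid = np.zeros((20, 20), dtype=np.int32)
grid[10, 9:12] = 1
next_grid = step(grid)
```

On the GPU the four nested loops disappear, since every (block, thread) pair runs concurrently. The bounds check in `update_cell` matters because a 20×20 grid needs 2×2 blocks of 16×16 threads, so some threads fall outside the grid.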


## Concluding Discussion

Describe what you learned during this process, what worked, what didn't work, where did you have trouble and what you learned

----