Parallelizing 2D Fast-Fourier Transform Implemented with Cooley-Tukey Algorithm

The main branch keeps the non-optimized version of code. For optimized versions with different methods, just switch to respective branches by git checkout <the branch>

Datasets

Decision Making in Parallelizing an Algorithm

flowchart TD
	N[Rewrite the Algorithm] --> A
	A --> |Def NO| N
	subgraph code [KNOW UR CODE]
	N
	A
	end
	A{is Parallelizable?}
	Y[["Distribution of Work
	Memory Access Pattern
	Occupancy/Device Usage"]]
	A -->|Independency-YES| Y
	A --> |Dependencies-Maybe NO?| M{{"Requires
	 Synchronizations
	 Barriers"}}
	subgraph machine [KNOW UR MACHINE]
	Y
	M
	end

    style Y fill:#f6f,stroke:#333,stroke-width:4px
	style M fill:#f6f,stroke:#333,stroke-width:4px
	style A fill:#99d,stroke:#f66,stroke-width:2px

File Structure

Each folder has (and will have) its respective readme describing what type of modules should be placed inside

Make sure you have the following folders

Data Files should be placed in ./FFT/Data/
Output results by methods from /FFT/eval/ will be output to /FFT/Data/Results

Prerequisite

FFTW C/C++ library for validating the results (not necessary)
- MacOS installation: brew install fftw

Compile with `FFTW` library

Change the FFTW library path as needed

Compile with `makefile` (recommended)

In the ./FFT/ folder use the bash command make

If a fresh build is desired, simply run make clean before running make or make all

Run

./It_CT.out or the correponding SLURM script if present

TO-DOS

1D FFT

Consider larger dataset

2D FFT

Consider larger dataset

`OpenMP` v.s. `MPI` v.s. `CUDA`

Keep Codes in Separate Branches
makefile for each
- ~~OpenMP~~
- MPI
- ~~CUDA~~
SLURM script for each
- OpenMP
- MPI
- ~~CUDA~~

Progress Table

	OpenMP	MPI	CUDA
1D FFT	✔️	Not Considering	✅
2D FFT	✔️	❌	❌
Performance Eval Subroutines	✔️	✔️	✅

Name		Name	Last commit message	Last commit date
Latest commit History 68 Commits
EEGmethods		EEGmethods
FFT		FFT
py-plot		py-plot
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Parallelizing 2D Fast-Fourier Transform Implemented with Cooley-Tukey Algorithm

Datasets

Decision Making in Parallelizing an Algorithm

File Structure

Prerequisite

Compile with `FFTW` library

Compile with `makefile` (recommended)

Run

TO-DOS

1D FFT

2D FFT

`OpenMP` v.s. `MPI` v.s. `CUDA`

Progress Table

About

Releases

Contributors 2

Languages

Xemin0/HPC_Proj

Folders and files

Latest commit

History

Repository files navigation

Parallelizing 2D Fast-Fourier Transform Implemented with Cooley-Tukey Algorithm

Datasets

Decision Making in Parallelizing an Algorithm

File Structure

Prerequisite

Compile with FFTW library

Compile with makefile (recommended)

Run

TO-DOS

1D FFT

2D FFT

OpenMP v.s. MPI v.s. CUDA

Progress Table

About

Topics

Resources

Stars

Watchers

Forks

Releases

Contributors 2

Languages

Compile with `FFTW` library

Compile with `makefile` (recommended)

`OpenMP` v.s. `MPI` v.s. `CUDA`