Merge r2c into master #61

Merged
merged 72 commits into from
Aug 12, 2017
Conversation

mtazzari
Owner

We fully implemented usage of the Real to Complex (R2C) Fourier transform, fundamentally addressing issue #39.
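As a minimal illustration of why an R2C transform helps (a sketch in plain Python, not the code used in this branch): the spectrum of a real signal is Hermitian-symmetric, so only `n // 2 + 1` coefficients need to be computed and stored. The function names `dft`, `r2c` and `restore_full` are hypothetical and exist only for this example.

```python
import cmath

def dft(x):
    """Naive O(n^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]

def r2c(x):
    """Keep only the non-redundant half of the spectrum of a real signal."""
    return dft(x)[: len(x) // 2 + 1]

def restore_full(half, n):
    """Rebuild all n coefficients using the symmetry X[k] = conj(X[n - k])."""
    full = list(half)
    for k in range(len(half), n):
        full.append(half[n - k].conjugate())
    return full

signal = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.0, 1.5]
half = r2c(signal)                       # 5 coefficients instead of 8
full = restore_full(half, len(signal))   # matches dft(signal) up to rounding
```

In two dimensions the same symmetry roughly halves the last axis of the transformed image, which is where the (nx, nx/2) matrices mentioned in the commits below come from.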

In doing so, we also:

The commits that resolve each issue are listed in the last comment of that issue; the issues will be closed by merging r2c into master.

mtazzari and others added 30 commits July 18, 2017 00:33
…e test test_shift_axis0. All 46 CPU tests pass.

The shift_axis0_core function and its related helper functions perform the vertical swap (along axis 0) of a rectangular
matrix of size (nx, nx/2), equivalent to np.fft.fftshift(data, axes=0).

I added the test test_shift_axis0.
I renamed the test test_shift to test_shift_axes01 to make explicit that it performs a swap on both axes.
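The vertical swap described in this commit can be sketched in plain Python as follows. This is an illustration only, assuming an even number of rows (the case in which `np.fft.fftshift(data, axes=0)` reduces to swapping the two halves); the function below is a hypothetical stand-in, not the `shift_axis0_core` implementation.

```python
def shift_axis0(matrix):
    """Swap the upper and lower halves of the rows (even row count assumed)."""
    nrows = len(matrix)
    if nrows % 2:
        raise ValueError("this sketch assumes an even number of rows")
    half = nrows // 2
    # Rows half..nrows-1 move to the top, rows 0..half-1 move to the bottom.
    return matrix[half:] + matrix[:half]

# A 4 x 2 matrix, i.e. shape (nx, nx/2) with nx = 4.
data = [[0, 1], [10, 11], [20, 21], [30, 31]]
shifted = shift_axis0(data)
# shifted == [[20, 21], [30, 31], [0, 1], [10, 11]]
```

Applying the swap twice returns the original matrix, which is a convenient property to check in a unit test.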
Only keep atomic operations. The burden of maintaining combinations is too high
The updated test passes, 19/25 tests on CPU fail and require more updates to reflect
the new data format.

No work done on GPU yet
Many tests fail; interpolate leads to a segfault as it expects a square image

python_package/tests/test_galario.py::test_uv_idx_R2C[SP] FAILED
python_package/tests/test_galario.py::test_uv_idx_R2C[DP] FAILED
python_package/tests/test_galario.py::test_sample_R2C[DP_par1] FAILED
python_package/tests/test_galario.py::test_sample_R2C[DP_par2] FAILED
python_package/tests/test_galario.py::test_sample_R2C[DP_par3] FAILED
python_package/tests/test_galario.py::test_uv_idx[SP] PASSED
python_package/tests/test_galario.py::test_uv_idx[DP] PASSED
python_package/tests/test_galario.py::test_interpolate[SP] PASSED
python_package/tests/test_galario.py::test_interpolate[DP] PASSED
python_package/tests/test_galario.py::test_FFT[SP] PASSED
python_package/tests/test_galario.py::test_FFT[DP] PASSED
python_package/tests/test_galario.py::test_shift_axes01[SP] PASSED
python_package/tests/test_galario.py::test_shift_axes01[DP] PASSED
python_package/tests/test_galario.py::test_shift_axis0[SP] PASSED
python_package/tests/test_galario.py::test_shift_axis0[DP] PASSED
python_package/tests/test_galario.py::test_apply_phase_2d[SP_par1] PASSED
python_package/tests/test_galario.py::test_apply_phase_2d[DP_par1] PASSED
python_package/tests/test_galario.py::test_apply_phase_2d[SP_par2] PASSED
python_package/tests/test_galario.py::test_apply_phase_2d[DP_par2] PASSED
python_package/tests/test_galario.py::test_apply_phase_2d[SP_par3] PASSED
python_package/tests/test_galario.py::test_apply_phase_2d[DP_par3] PASSED
python_package/tests/test_galario.py::test_apply_phase_sampled[SP_par1] PASSED
python_package/tests/test_galario.py::test_apply_phase_sampled[DP_par1] PASSED
python_package/tests/test_galario.py::test_apply_phase_sampled[SP_par2] PASSED
python_package/tests/test_galario.py::test_apply_phase_sampled[DP_par2] PASSED
python_package/tests/test_galario.py::test_apply_phase_sampled[SP_par3] PASSED
python_package/tests/test_galario.py::test_apply_phase_sampled[DP_par3] PASSED
python_package/tests/test_galario.py::test_reduce_chi2[SP] PASSED
python_package/tests/test_galario.py::test_reduce_chi2[DP] PASSED
python_package/tests/test_galario.py::test_loss[SP_par1] FAILED
python_package/tests/test_galario.py::test_loss[DP_par1] FAILED
python_package/tests/test_galario.py::test_sample[SP_par1] python/py.test.sh: line 36: 30519 Segmentation fault      (core dumped) /home/beaujean/.local/miniconda3/envs/galario3/bin/python -m pytest "$@"
…tions of a few functions; Py3 xrange

The CUDA version of the new shift_axis0 had not been tested yet; a few bugfixes were necessary.

The functions sample_d and copy_real_to_device were each defined twice.
Removed the duplicate definitions, which were causing an 'already defined' error.

Converted a few xrange to range in utils.py

44 CPU and GPU tests pass.
test_uv_idx_r2c SP and DP do not pass.
`_acc_` is an old relic, now it's gone everywhere
GPU mostly untouched. Most tests even on CPU fail

python_package/tests/test_galario.py::test_uv_idx_R2C[SP] PASSED
python_package/tests/test_galario.py::test_uv_idx_R2C[DP] PASSED
python_package/tests/test_galario.py::test_sample_R2C[DP_par1] FAILED
python_package/tests/test_galario.py::test_sample_R2C[DP_par2] FAILED
python_package/tests/test_galario.py::test_sample_R2C[DP_par3] FAILED
python_package/tests/test_galario.py::test_uv_idx[SP] PASSED
python_package/tests/test_galario.py::test_uv_idx[DP] PASSED
python_package/tests/test_galario.py::test_interpolate[SP] PASSED
python_package/tests/test_galario.py::test_interpolate[DP] PASSED
python_package/tests/test_galario.py::test_FFT[SP] PASSED
python_package/tests/test_galario.py::test_FFT[DP] PASSED
python_package/tests/test_galario.py::test_shift_axes01[SP] PASSED
python_package/tests/test_galario.py::test_shift_axes01[DP] PASSED
python_package/tests/test_galario.py::test_shift_axis0[SP] PASSED
python_package/tests/test_galario.py::test_shift_axis0[DP] PASSED
python_package/tests/test_galario.py::test_apply_phase_2d[SP_par1] PASSED
python_package/tests/test_galario.py::test_apply_phase_2d[DP_par1] PASSED
python_package/tests/test_galario.py::test_apply_phase_2d[SP_par2] PASSED
python_package/tests/test_galario.py::test_apply_phase_2d[DP_par2] PASSED
python_package/tests/test_galario.py::test_apply_phase_2d[SP_par3] PASSED
python_package/tests/test_galario.py::test_apply_phase_2d[DP_par3] PASSED
python_package/tests/test_galario.py::test_apply_phase_sampled[SP_par1] PASSED
python_package/tests/test_galario.py::test_apply_phase_sampled[DP_par1] PASSED
python_package/tests/test_galario.py::test_apply_phase_sampled[SP_par2] PASSED
python_package/tests/test_galario.py::test_apply_phase_sampled[DP_par2] PASSED
python_package/tests/test_galario.py::test_apply_phase_sampled[SP_par3] PASSED
python_package/tests/test_galario.py::test_apply_phase_sampled[DP_par3] PASSED
python_package/tests/test_galario.py::test_reduce_chi2[SP] PASSED
python_package/tests/test_galario.py::test_reduce_chi2[DP] PASSED
python_package/tests/test_galario.py::test_loss[SP_par1] FAILED
python_package/tests/test_galario.py::test_loss[DP_par1] FAILED
python_package/tests/test_galario.py::test_sample[SP_par1] python/py.test.sh: line 36: 11053 Segmentation fault      /usr/bin/python -m pytest "$@"
Some open issues remain, see #57
So far CPU code was built w/o optimization :(
It's an arbitrary matrix, this should help reduce the confusion of
what `nx, ny` mean. In this case, it's just the size of the matrix,
`ny` is not the size of the original real array
Abstraction layer for whether memory is allocated by `fftw_malloc` or ordinary `malloc`
* openmp included even with CUDACC: there still needs to be host code
* fftw only w/o CUDACC
* indent definitions for better readability
Copies the real input to a complex buffer with padding for FFTW
Not completed on GPU yet
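A rough sketch of the padded copy mentioned above, assuming the row layout that the FFTW documentation describes for in-place real-to-complex transforms: each row of `ny` real values is stored in `2 * (ny // 2 + 1)` real slots. The function name and the flat row-major list are assumptions for illustration, not the code from this commit.

```python
def copy_real_to_padded(image, nx, ny):
    """Copy a row-major nx * ny real image into a buffer with padded rows."""
    ncomplex = ny // 2 + 1      # complex values per row after the R2C transform
    row_len = 2 * ncomplex      # real slots per padded row
    buffer = [0.0] * (nx * row_len)
    for i in range(nx):
        for j in range(ny):
            buffer[i * row_len + j] = image[i * ny + j]
    return buffer, row_len

nx, ny = 3, 4
image = [float(v) for v in range(nx * ny)]
padded, row_len = copy_real_to_padded(image, nx, ny)
# Each padded row holds ny = 4 image values followed by 2 unused slots.
```

The padding slots are left at zero here; after an in-place transform they would be overwritten by the last complex coefficient of each row.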
mtazzari and others added 18 commits August 10, 2017 23:52
The phase is always applied to the sampled points; there is no longer a use case
for applying the phase via a 2d product of two matrices.
left over from duv and nrow renamings
Introduced this change as profiling showed that 1000s of calls to
cudaMemcpy were much slower than the computation
32^2=1024 was the maximum. The CUDA manual recommends, as a rule of thumb, starting
with 16^2=256 threads and tuning from there

http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#thread-hierarchy
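To make the arithmetic behind this choice concrete, here is a small sketch that computes how many square thread blocks are needed to cover a 2D array. The helper `launch_dims` is hypothetical, and the 1024-thread limit is the figure quoted in the commit message above; actual kernels in the repository may size their grids differently.

```python
def launch_dims(nx, ny, block_edge=16, max_threads=1024):
    """Return ((blocks_x, blocks_y), (block_edge, block_edge)) covering nx * ny."""
    if block_edge * block_edge > max_threads:
        raise ValueError("block exceeds the per-block thread limit")
    # Ceiling division so that partially filled blocks at the edges are included.
    blocks_x = (nx + block_edge - 1) // block_edge
    blocks_y = (ny + block_edge - 1) // block_edge
    return (blocks_x, blocks_y), (block_edge, block_edge)

grid, block = launch_dims(1000, 501)   # e.g. a 1000 x 501 half-plane matrix
```

With 16 x 16 blocks the example needs a 63 x 32 grid; threads that fall outside the array bounds would have to be masked out inside the kernel.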
@mtazzari mtazzari added this to the 1.o milestone Aug 11, 2017
@mtazzari mtazzari requested a review from fredRos August 11, 2017 17:36
@mtazzari
Owner Author

I would merge r2c to master before implementing the new feature of computing the image on the GPU.

Frederik Beaujean added 5 commits August 12, 2017 23:38
Profiling is done via `speed_benchmark.py` these days. The failing
test should be displayed on travis CI
Now all tests should pass on travis
@fredRos fredRos merged commit 9c3c357 into master Aug 12, 2017
@fredRos
Collaborator

fredRos commented Aug 12, 2017

I unified the three speed_benchmark*.py files into one (the previous speed_benchmark_dp.py) and activated the unit tests on travis. There are already too many commits in this branch; time to merge.

Development

Successfully merging this pull request may close these issues.

Interpolate threadblock function definitions w/o cuda Optimize memory creation and access on GPU