Allow better scaling of the deposition with number of threads #82

Merged: 2 commits into fbpic:cpuprange, Jul 16, 2017

RemiLehe
Member

This pull request introduces two changes, in order to have better scaling of the deposition routines with respect to the number of threads:

  • The dimensions of the global arrays (which duplicate data for each thread) are swapped, so that the thread index is the slowest-varying index
  • The allocation of arrays that are local to each thread is removed

It turns out that on a Haswell 32-core CPU, this improves the scaling with threads quite dramatically.
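The two changes listed above can be illustrated with a small, pure-Python sketch. This is not the code from this pull request: the sizes, function names, and the flat-buffer layout are hypothetical, and a serial loop stands in for the parallel loop over threads. It only shows that, with the thread index slowest in a row-major array, each thread's cells form one contiguous block, and that threads can write straight into a buffer allocated once rather than allocating their own local arrays. Why this helps on a given CPU is not stated in the description; reduced cache-line sharing between threads is one plausible reading, stated here as an assumption.

```python
# Hypothetical grid and thread counts, chosen only for illustration
N_THREADS, NZ, NR = 4, 8, 5
CELLS = NZ * NR


def offset_thread_fastest(iz, ir, t):
    """Flat offset in a row-major array of shape (NZ, NR, N_THREADS)."""
    return (iz * NR + ir) * N_THREADS + t


def offset_thread_slowest(t, iz, ir):
    """Flat offset in a row-major array of shape (N_THREADS, NZ, NR)."""
    return (t * NZ + iz) * NR + ir


def owned_offsets(t, thread_slowest):
    """Sorted flat offsets of every cell that thread `t` may write to."""
    cells = [(iz, ir) for iz in range(NZ) for ir in range(NR)]
    if thread_slowest:
        return sorted(offset_thread_slowest(t, iz, ir) for iz, ir in cells)
    return sorted(offset_thread_fastest(iz, ir, t) for iz, ir in cells)


def is_contiguous(offsets):
    """True if the sorted offsets form one unbroken block of memory."""
    return all(b - a == 1 for a, b in zip(offsets, offsets[1:]))


def deposit(particles):
    """Deposit (iz, ir, weight) particles, then reduce over threads.

    The duplicated buffer is allocated once; no per-thread array is created.
    """
    global_buf = [0] * (N_THREADS * CELLS)
    chunk = -(-len(particles) // N_THREADS)  # ceiling division
    for t in range(N_THREADS):  # serial stand-in for a parallel thread loop
        for iz, ir, w in particles[t * chunk:(t + 1) * chunk]:
            global_buf[offset_thread_slowest(t, iz, ir)] += w
    # Reduction: sum the per-thread blocks into the final grid
    grid = [0] * CELLS
    for t in range(N_THREADS):
        for c in range(CELLS):
            grid[c] += global_buf[t * CELLS + c]
    return grid
```

With the thread index slowest, `owned_offsets(t, True)` is a single contiguous run of `NZ * NR` elements, whereas with the thread index fastest the same thread's cells are interleaved with those of every other thread, `N_THREADS` elements apart.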

@MKirchen MKirchen merged commit f8536d7 into fbpic:cpuprange Jul 16, 2017
@RemiLehe RemiLehe deleted the better_thread_scaling branch July 16, 2017 18:54
2 participants