
Ctrls performance#739

Draft
JPRichings wants to merge 2 commits into QuEST-Kit:devel from JPRichings:ctrls_performance


@JPRichings
Contributor

Initial pass at a performance fix: remove the thrust device_vector that is used to move ctrls to the device. It hurts performance because a thrust device_vector copies its data from host to device on construction.
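A minimal sketch of the idea, not the code in this PR: the control indices are copied into a preallocated constant-memory buffer with cudaMemcpyToSymbol instead of constructing a thrust device_vector for each gate. The names `MAX_CTRLS`, `ctrls_const`, `countCtrlMatches` and `countMatches` are hypothetical, and the toy kernel only counts matching basis states rather than applying a gate.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical capacity for this sketch; the draft in this PR uses 30.
constexpr int MAX_CTRLS = 64;

// Control-qubit indices live in constant memory (illustrative name).
__constant__ int ctrls_const[MAX_CTRLS];

// Toy kernel: counts the basis-state indices whose control bits are all 1.
__global__ void countCtrlMatches(long long numAmps, int numCtrls, unsigned int* count) {
    long long i = blockIdx.x * (long long) blockDim.x + threadIdx.x;
    if (i >= numAmps) return;
    for (int c = 0; c < numCtrls; ++c)
        if (((i >> ctrls_const[c]) & 1LL) == 0) return;
    atomicAdd(count, 1u);
}

// Host wrapper: copies ctrls into the constant buffer, then launches.
// Returns the number of matching indices, or -1 on failure.
long long countMatches(const std::vector<int>& ctrls, int numQubits) {
    if (ctrls.size() > (size_t) MAX_CTRLS) return -1;

    // One small host-to-device copy into an existing symbol, in place of
    // building a thrust::device_vector for every gate.
    if (!ctrls.empty() &&
        cudaMemcpyToSymbol(ctrls_const, ctrls.data(),
                           ctrls.size() * sizeof(int)) != cudaSuccess)
        return -1;

    // The counter allocation below exists only for this toy example.
    unsigned int* d_count = nullptr;
    if (cudaMalloc(&d_count, sizeof(unsigned int)) != cudaSuccess) return -1;
    cudaMemset(d_count, 0, sizeof(unsigned int));

    const long long numAmps = 1LL << numQubits;
    const int threads = 128;
    const int blocks = (int) ((numAmps + threads - 1) / threads);
    countCtrlMatches<<<blocks, threads>>>(numAmps, (int) ctrls.size(), d_count);

    unsigned int h_count = 0;
    cudaError_t err = cudaMemcpy(&h_count, d_count, sizeof(h_count),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_count);
    return (err == cudaSuccess) ? (long long) h_count : -1;
}
```

For example, `countMatches({0, 3}, 10)` counts the 10-qubit basis states with bits 0 and 3 both set.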

@JPRichings
Contributor Author

Todo:

  • Confirm no horrible race condition is introduced by cudaMemcpyToSymbol
  • Set the ctrls buffer size programmatically or to some sensible limit, say 50? (There is at least 64KB of constant memory on the device, so no worries about giving ourselves some extra room.)
  • Apply fix to all kernels in QuEST

Assumptions in this code:

  • Only a single operation is applied to a quantum register at a time, so the ctrls buffer can safely be overwritten before each subsequent gate application.
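A small sketch of the single-stream case behind the race-condition item, under the assumption that every copy and kernel launch goes through the default stream. As I understand the CUDA stream model, work issued to one stream executes in issue order, so the second copy cannot overwrite the buffer while the first kernel is still reading it. This does not settle the Todo item: with several streams the overwrite would need explicit synchronisation. All names here (`ctrls_const`, `readFirstCtrl`, `backToBackGatesSeeOwnCtrls`) are hypothetical.

```cuda
#include <cuda_runtime.h>

constexpr int MAX_CTRLS = 64;             // hypothetical capacity
__constant__ int ctrls_const[MAX_CTRLS];  // illustrative name

// Toy "gate": records the first control index it sees in constant memory.
__global__ void readFirstCtrl(int* out) {
    *out = ctrls_const[0];
}

// Issues two toy gates back to back on the default stream, overwriting the
// ctrls buffer in between. Returns true if each kernel saw its own ctrl.
bool backToBackGatesSeeOwnCtrls() {
    int* d_out = nullptr;
    if (cudaMalloc(&d_out, 2 * sizeof(int)) != cudaSuccess) return false;

    const int ctrlA = 3;
    const int ctrlB = 7;

    // Gate A: copy its ctrl, then launch.
    cudaMemcpyToSymbol(ctrls_const, &ctrlA, sizeof(int));
    readFirstCtrl<<<1, 1>>>(d_out);

    // Gate B: overwrite the same buffer, then launch. The copy is issued
    // after kernel A on the same stream, so it is ordered behind it.
    cudaMemcpyToSymbol(ctrls_const, &ctrlB, sizeof(int));
    readFirstCtrl<<<1, 1>>>(d_out + 1);

    int h_out[2] = {-1, -1};
    cudaError_t err = cudaMemcpy(h_out, d_out, sizeof(h_out),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return err == cudaSuccess && h_out[0] == ctrlA && h_out[1] == ctrlB;
}
```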

@otbrown
Collaborator

otbrown commented May 3, 2026

Allocation size: 64 qubits. Add this as a macro somewhere if not done already, e.g. MAX_QUREG_SIZE.

const int NUM_THREADS_PER_BLOCK = 128;
const int NUM_THREADS_PER_BLOCK =128;

__device__ __constant__ int ctrl_device[30];
Member


TODO: use a constant such as MAX_NUM_QUBITS = 64, defined in constants.hpp
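One possible shape for this, as a sketch only: the buffer is sized from a shared constant instead of the literal 30, with a compile-time check against the 64KB constant-memory figure quoted earlier in this thread. `MAX_NUM_QUBITS` follows the suggestion above; `CONSTANT_MEM_BYTES` and `ctrlsFitInBuffer` are hypothetical names.

```cuda
#include <cstddef>

// Hypothetical shared constant, e.g. for constants.hpp.
constexpr int MAX_NUM_QUBITS = 64;

// Buffer sized from the shared constant rather than a literal.
__device__ __constant__ int ctrl_device[MAX_NUM_QUBITS];

// 64 KB is the constant-memory figure quoted in this thread.
constexpr std::size_t CONSTANT_MEM_BYTES = 64 * 1024;
static_assert(MAX_NUM_QUBITS * sizeof(int) <= CONSTANT_MEM_BYTES,
              "ctrl buffer must fit in constant memory");

// Host-side guard a dispatcher could call before copying ctrls.
inline bool ctrlsFitInBuffer(int numCtrls) {
    return numCtrls >= 0 && numCtrls <= MAX_NUM_QUBITS;
}
```

At 64 ints the buffer takes 256 bytes, a small fraction of that 64KB.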

@TysonRayJones
Member

Reminder of other points from the meeting:

  • also store targets in constant memory
  • keep passing pointers to kernels; the dispatcher passes the constant-memory pointer
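A rough sketch of how the two points above could fit together, assuming the dispatcher looks up the device address of each constant symbol with cudaGetSymbolAddress and hands it to a kernel whose signature still takes plain pointers. The names `ctrls_const`, `targs_const`, `buildMasks` and `dispatchBuildMasks` are hypothetical, and the toy kernel only builds bit masks. One caveat I am not sure about and would profile: loads through a plain pointer may not be served by the constant cache.

```cuda
#include <cuda_runtime.h>

constexpr int MAX_NUM_QUBITS = 64;  // hypothetical shared constant

// Control and target indices both live in constant memory.
__constant__ int ctrls_const[MAX_NUM_QUBITS];
__constant__ int targs_const[MAX_NUM_QUBITS];

// Kernel keeps a pointer-based signature; it does not know where the
// index lists are stored.
__global__ void buildMasks(const int* ctrls, int numCtrls,
                           const int* targs, int numTargs,
                           unsigned long long* masks) {
    unsigned long long c = 0, t = 0;
    for (int i = 0; i < numCtrls; ++i) c |= 1ULL << ctrls[i];
    for (int i = 0; i < numTargs; ++i) t |= 1ULL << targs[i];
    masks[0] = c;
    masks[1] = t;
}

// Hypothetical dispatcher: fills the constant buffers, then passes their
// device addresses to the kernel. Returns false on any failure.
bool dispatchBuildMasks(const int* ctrls, int numCtrls,
                        const int* targs, int numTargs,
                        unsigned long long* ctrlMask,
                        unsigned long long* targMask) {
    if (numCtrls < 0 || numCtrls > MAX_NUM_QUBITS) return false;
    if (numTargs < 0 || numTargs > MAX_NUM_QUBITS) return false;

    if (numCtrls > 0 && cudaMemcpyToSymbol(ctrls_const, ctrls,
            numCtrls * sizeof(int)) != cudaSuccess) return false;
    if (numTargs > 0 && cudaMemcpyToSymbol(targs_const, targs,
            numTargs * sizeof(int)) != cudaSuccess) return false;

    // Look up the device addresses of the constant symbols.
    void* d_ctrls = nullptr;
    void* d_targs = nullptr;
    if (cudaGetSymbolAddress(&d_ctrls, ctrls_const) != cudaSuccess) return false;
    if (cudaGetSymbolAddress(&d_targs, targs_const) != cudaSuccess) return false;

    unsigned long long* d_masks = nullptr;
    if (cudaMalloc(&d_masks, 2 * sizeof(unsigned long long)) != cudaSuccess)
        return false;

    buildMasks<<<1, 1>>>((const int*) d_ctrls, numCtrls,
                         (const int*) d_targs, numTargs, d_masks);

    unsigned long long h_masks[2] = {0, 0};
    cudaError_t err = cudaMemcpy(h_masks, d_masks, sizeof(h_masks),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_masks);
    if (err != cudaSuccess) return false;

    *ctrlMask = h_masks[0];
    *targMask = h_masks[1];
    return true;
}
```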

3 participants