CircuitToEinsum for QFT: batched_amplitudes slower than qsim full statevector simulation #23

Answered by yangcal
rht asked this question in Q&A

Hello, thanks for reaching out.

First, regarding the performance statement in our tech blog for cuQuantum beta: the benchmark data shown there was measured with cuStateVec. Both cuQuantum libraries (cuStateVec and cuTensorNet) have improved significantly since the blog was written. In general, compared with state vector simulation, tensor network methods trade extra computation for memory savings, so some performance overhead is to be expected.

Going back to your benchmark, there are a couple of things to note.

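First, when measuring the performance of GPU functions, you should not use CPU timing utilities: GPU kernels are launched asynchronously, so a CPU timer can stop before the GPU work has actually finished. You can try either cupyx.profiler.benchmark() or use cupy.cuda.get_elapsed…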
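As a rough illustration of the first suggestion, here is a minimal sketch of timing a GPU function with `cupyx.profiler.benchmark()`. The helper names (`mean_ms`, `time_gpu_function`, `matmul_workload`) and the matrix-multiply workload are hypothetical stand-ins, not the code from the original benchmark. The sketch assumes CuPy is installed and skips the GPU timing if no CUDA device is found.

```python
import statistics

try:
    import cupy as cp
    from cupyx.profiler import benchmark
    HAVE_GPU = cp.cuda.runtime.getDeviceCount() > 0
except Exception:  # CuPy missing or no usable CUDA device
    HAVE_GPU = False


def mean_ms(times_in_seconds):
    """Convert a sequence of timings in seconds to a mean in milliseconds."""
    return 1e3 * statistics.fmean(float(t) for t in times_in_seconds)


def time_gpu_function(fn, args=(), n_repeat=20, n_warmup=3):
    """Time fn with cupyx.profiler.benchmark.

    Returns (mean CPU ms, mean GPU ms), or None when no GPU is available.
    """
    if not HAVE_GPU:
        return None
    result = benchmark(fn, args, n_repeat=n_repeat, n_warmup=n_warmup)
    # cpu_times: host-side time per call; gpu_times[0]: device time on GPU 0
    return mean_ms(result.cpu_times), mean_ms(result.gpu_times[0])


def matmul_workload(n):
    # Hypothetical stand-in for the contraction being benchmarked
    a = cp.ones((n, n), dtype=cp.float32)
    return a @ a


if __name__ == "__main__":
    timings = time_gpu_function(matmul_workload, (1024,))
    if timings is None:
        print("No CUDA device available; skipping GPU timing")
    else:
        print("CPU: %.3f ms, GPU: %.3f ms" % timings)
```

The warm-up runs matter here: the first calls typically include one-time costs such as kernel compilation and memory pool setup, which would otherwise distort the average.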

Answer selected by leofang


This discussion was converted from issue #22 on December 09, 2022 19:08.