
v3.7.0#387

Merged
TysonRayJones merged 25 commits into master from develop
Sep 22, 2023


@TysonRayJones TysonRayJones commented Sep 21, 2023

Overview

This release integrates a cuQuantum backend, optimises distributed communication, and improves the unit tests.

New features

  • QuEST gained a new backend which integrates cuQuantum and Thrust for optimised simulation on modern NVIDIA GPUs. This is compiled with cmake argument -DUSE_CUQUANTUM=1, as detailed in the compile doc. Unlike QuEST's other backends, this does require prior installation of cuQuantum, outlined here. This deployment mode should run much faster than QuEST's custom GPU backend, and will soon enable multi-GPU simulation. The entirety of QuEST's API is supported! 🎉

Other changes

  • QuEST's distributed communication has been optimised when exchanging states via many maximum-size messages, thanks to the work of Jakub Adamski as per this manuscript.
  • Functions like multiQubitUnitary() and mixMultiQubitKrausMap() have relaxed the precision of their unitarity and CPTP checks, so they will complain less about user matrices. Now, for example, a unitary matrix U is deemed valid only if every element of U*dagger(U) has a Euclidean distance of at most REAL_EPS from its expected identity-matrix element.
  • Unit tests now check that their initial register states are as expected before testing an operator. This ensures that some tests do not accidentally pass when they should be failing (like when run with an incorrectly specified GPU compute capability) due to an unexpected all-zero initial state.
  • Unit tests now use an improved and numerically stable function for generating random unitaries and Kraus maps, so should trigger fewer precision errors and false test failures.
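The relaxed unitarity check described in the list above can be sketched in plain C. This is a minimal illustration rather than QuEST's actual validation code: the `Amp` struct, the function name `is_approx_unitary`, and the row-major layout are hypothetical, and the tolerance is passed as a parameter instead of using QuEST's REAL_EPS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical complex type for this sketch (not QuEST's Complex). */
typedef struct { double re, im; } Amp;

/* Returns true if every element of U * dagger(U) lies within a Euclidean
 * distance eps of the matching identity-matrix element.
 * u holds a dim x dim matrix in row-major order. */
bool is_approx_unitary(const Amp *u, size_t dim, double eps) {
    assert(u != NULL);
    for (size_t r = 0; r < dim; r++) {
        for (size_t c = 0; c < dim; c++) {
            /* (U U^dagger)[r][c] = sum_k U[r][k] * conj(U[c][k]) */
            double re = 0.0, im = 0.0;
            for (size_t k = 0; k < dim; k++) {
                Amp a = u[r * dim + k];
                Amp b = u[c * dim + k];
                re += a.re * b.re + a.im * b.im;
                im += a.im * b.re - a.re * b.im;
            }
            /* Distance from the expected element (1 on the diagonal, else 0),
             * compared in squared form to avoid a square root. */
            double dre = re - ((r == c) ? 1.0 : 0.0);
            if (dre * dre + im * im > eps * eps)
                return false;
        }
    }
    return true;
}
```

Passing a Hadamard matrix with a tolerance of 1e-13 should return true, while a matrix whose rows are not normalised should return false.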

TysonRayJones and others added 25 commits August 17, 2023 18:43
Note that this forces our cuQuantum backend to require its users to have a stream-ordered memory pool compatible GPU (which seems fair enough)
for converting between QuEST's interface and backend types (like Complex, ComplexMatrixN, and bitmasks) and cuQuantum's
Added all operators (like unitaries and sub-diagonal gates) which can be directly mapped to a cuQuantum call.

The cuQuantum calls are:
- custatevecApplyMatrix
- custatevecApplyPauliRotation
- custatevecSwapIndexBits
- custatevecApplyGeneralizedPermutationMatrix

It appears that the remainder of QuEST's operators (decoherence channels, full-state diagonals, and phase functions) will need bespoke kernels
Before each unit test, the initial state of the registers (assumed to be in the result of initDebugState) is now explicitly checked.

This prevents passing tests when initDebugState() itself is failing, and (for example) yielding an all-zero state which sneakily satisfies some unit tests.

This will likely noticeably increase the total unit-test runtime, but will guarantee that tests fail visibly and immediately when (for example) the GPU configuration is wrong and produces all-zero states
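The pre-test check described above can be sketched as follows. This is an illustration under stated assumptions, not the code in QuEST's test suite: it assumes the debug state sets the n-th amplitude to 2n/10 + i(2n+1)/10, as QuEST's documentation of initDebugState describes, and the function name `matches_debug_state` and the separate real and imaginary arrays are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Absolute difference without needing the maths library. */
static double abs_diff(double a, double b) {
    return (a > b) ? a - b : b - a;
}

/* Returns true if the amplitudes match the assumed debug-state pattern,
 * where amplitude n equals 2n/10 + i(2n+1)/10, to within eps. */
bool matches_debug_state(const double *re, const double *im,
                         size_t num_amps, double eps) {
    assert(re != NULL && im != NULL);
    for (size_t n = 0; n < num_amps; n++) {
        double want_re = (2.0 * n) / 10.0;
        double want_im = (2.0 * n + 1.0) / 10.0;
        if (abs_diff(re[n], want_re) > eps || abs_diff(im[n], want_im) > eps)
            return false;
    }
    return true;
}
```

An all-zero state fails this check at the very first amplitude, whose imaginary part is expected to be 0.1, so a silently failing initialisation is caught before the operator under test runs.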
since documentation is now generated by Github Actions and published on Github Pages without repo caching
which were unavailable in the API, and were not used internally nor in tests. Furthermore, some of them (`initStateOfSingleQubit`) did something *very* different to what its comments suggested - and inefficiently!
although we are missing imports to avoid git conflict:

# include <thrust/sequence.h>
# include <thrust/iterator/zip_iterator.h>
# include <thrust/for_each.h>
Added all decoherence channels which can be directly mapped (without unacceptable performance damage) to a cuQuantum call.

The cuQuantum calls are:
- custatevecApplyMatrix
- custatevecApplyGeneralizedPermutationMatrix

and are called with matrices (some, diagonal) describing the channel superoperators.

The remaining decoherence channels require linearly combining device vectors (may use Thrust), bespoke GPU kernels, or a clever decomposition of the channel (e.g. 2 qubit depolarising) into a sequence of cuStateVec calls
Changed several operators represented by diagonal matrices but previously effected as one-qubit general matrices, to instead be effected as diagonals (duh)
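To illustrate why effecting these operators as diagonals is cheaper, here is a CPU-side sketch in C. It is not the cuQuantum backend code: the `Amp` struct and the function name `apply_one_qubit_diagonal` are hypothetical. A diagonal gate scales each amplitude independently by one of its two diagonal elements, whereas a general one-qubit matrix must read and mix pairs of amplitudes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical complex type for this sketch (not QuEST's Complex). */
typedef struct { double re, im; } Amp;

/* Applies the one-qubit diagonal gate diag(d[0], d[1]) to the target qubit
 * of a statevector holding 2^num_qubits amplitudes. */
void apply_one_qubit_diagonal(Amp *amps, size_t num_qubits,
                              size_t target, const Amp d[2]) {
    assert(amps != NULL && target < num_qubits);
    size_t num_amps = (size_t)1 << num_qubits;
    for (size_t i = 0; i < num_amps; i++) {
        /* The target qubit's bit in the basis-state index selects the element. */
        Amp f = d[(i >> target) & 1];
        Amp a = amps[i];
        amps[i].re = a.re * f.re - a.im * f.im;
        amps[i].im = a.re * f.im + a.im * f.re;
    }
}
```

For example, applying diag(1, i) to qubit 0 of a two-qubit state multiplies only the amplitudes at indices 1 and 3 by i and leaves the others unchanged.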
integrated a new cuQuantum and Thrust GPU backend
Previously, an ad-hoc measure of distance from unitarity (or CPTP) was used.

Now, a unitary U is deemed valid only if every element of U*dagger(U) has a Euclidean distance of at most REAL_EPS from the corresponding identity-matrix element.

A similar scheme is used for CPTP Kraus channels.

This effectively loosens the precision required of the unitaries and Kraus maps passed to functions like multiQubitUnitary and mixMultiQubitKrausMap
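A comparable element-wise check for Kraus maps might look like the sketch below. The commit message only says the scheme is similar, so the details here are assumptions: the sketch tests that the sum over operators of dagger(K)*K is within eps of the identity, element by element. The `Amp` struct, the function name `is_approx_cptp`, and the flat row-major layout are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical complex type for this sketch (not QuEST's Complex). */
typedef struct { double re, im; } Amp;

/* Returns true if sum_n dagger(K_n) * K_n is within a Euclidean distance eps
 * of the identity in every element. ops holds num_ops matrices of size
 * dim x dim, stored one after another in row-major order. */
bool is_approx_cptp(const Amp *ops, size_t num_ops, size_t dim, double eps) {
    assert(ops != NULL);
    for (size_t r = 0; r < dim; r++) {
        for (size_t c = 0; c < dim; c++) {
            double re = 0.0, im = 0.0;
            for (size_t n = 0; n < num_ops; n++) {
                const Amp *k = ops + n * dim * dim;
                /* (K^dagger K)[r][c] = sum_j conj(K[j][r]) * K[j][c] */
                for (size_t j = 0; j < dim; j++) {
                    Amp a = k[j * dim + r];
                    Amp b = k[j * dim + c];
                    re += a.re * b.re + a.im * b.im;
                    im += a.re * b.im - a.im * b.re;
                }
            }
            double dre = re - ((r == c) ? 1.0 : 0.0);
            if (dre * dre + im * im > eps * eps)
                return false;
        }
    }
    return true;
}
```

A one-qubit dephasing channel with operators sqrt(0.75)*I and 0.5*Z should pass with a tolerance of 1e-13, while replacing the first operator with the unscaled identity should fail.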
@TysonRayJones TysonRayJones merged commit d4f75f7 into master Sep 22, 2023