Add Rust-based SparsePauliOp.to_matrix and Miri tests (#11388)
* Add Rust-based `SparsePauliOp.to_matrix`

This rewrites the numerical version of `SparsePauliOp.to_matrix` in
parallelised Rust, building the matrices up row by row rather than
converting each contained operator to a matrix individually and summing
them.

The new algorithms are completely row-based, which is embarrassingly
parallel for dense matrices, and parallelisable with additional copies
and cumulative sums in the CSR case.

The two new algorithms are an asymptotic complexity improvement for both
dense and sparse matrices over the "sum the individual matrices"
version.  In the dense case, the cost goes from

        O(4 ** num_qubits * num_ops)

to

        O(4 ** num_qubits + (2 ** num_qubits) * reduced_num_ops)

where the first term is from the zeroing of the matrix, and the second
is from updating the elements in-place.  `reduced_num_ops` is the number
of explicitly stored operations after Pauli-operator uniqueness
compaction, so is upper-bounded by `4 ** num_qubits`.  (The Pauli
compaction goes as `O(num_ops)`, so is irrelevant to the complexity
discussion.) The CSR form of the algorithm goes as

        O(2 ** num_qubits * reduced_num_ops * lg(reduced_num_ops))

which (I think! - I didn't fully calculate it) is asymptotically the
same as before, but in practice the constant factors and intermediate
memory use are *dramatically* reduced, and the new algorithm is
threadable with an additional `O(2 ** num_qubits * reduced_num_ops)`
intermediate memory overhead (the serial form has only
`O(reduced_num_ops)` memory overhead).

The `object`-coefficients form is left as-is to avoid exploding the
complexity in Rust space; these objects are already slow and unsuited
for high-performance code, so the effects should be minimal.
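The row-based dense construction can be sketched in NumPy. This is purely illustrative (the real implementation is the parallelised Rust described above, and the function names here are hypothetical); it exploits the fact that every Pauli string has exactly one non-zero entry per row, so each row costs `O(reduced_num_ops)` work, matching the `(2 ** num_qubits) * reduced_num_ops` update term in the complexity above.

```python
import numpy as np

def pauli_row(label, row, num_qubits):
    """Column index and value of the single non-zero entry that the Pauli
    string `label` (leftmost character = highest qubit) has in row `row`."""
    col = row
    val = 1.0 + 0.0j
    for q, p in enumerate(reversed(label)):  # qubit 0 is the rightmost character
        bit = (row >> q) & 1
        if p == "X":
            col ^= 1 << q
        elif p == "Y":
            col ^= 1 << q
            val *= -1j if bit == 0 else 1j
        elif p == "Z":
            val *= 1 if bit == 0 else -1
    return col, val

def to_matrix_dense(labels, coeffs):
    """Row-by-row dense sum of coeff * Pauli.  Each row is written
    independently, which is what makes the dense case embarrassingly
    parallel."""
    num_qubits = len(labels[0])
    dim = 1 << num_qubits
    out = np.zeros((dim, dim), dtype=complex)
    for row in range(dim):  # in the Rust version, rows are distributed over threads
        for label, coeff in zip(labels, coeffs):
            col, val = pauli_row(label, row, num_qubits)
            out[row, col] += coeff * val
    return out
```

The zeroing of `out` is the `O(4 ** num_qubits)` term; the double loop is the update term.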

* Add non-blocking Miri to CI

As we begin to include more `unsafe` code in the Rust-accelerated
components, it is becoming more important for us to test these in an
undefined-behaviour sanitiser.  This is done in a separate CI job
because:

- we do not yet know how stable Miri will be, so we don't want to block
  on it.

- some dependencies need their version-resolution patching to
  Miri-compatible versions, but we want to run our regular test suites
  with the same versions of packages we will be building against.

* Parallelise cumulative nnz sum

This parallelises the previously serial-only cumulative sum over the
`indptr` array of per-row non-zero counts at the end.  In practice, I
didn't actually see any change in performance from this, but
philosophically it feels like the right thing to do.
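For a concrete picture of that final step, a tiny NumPy sketch (illustrative only, not the patch's Rust; the counts are hypothetical) of turning per-row non-zero counts into CSR `indptr` offsets via a prefix sum:

```python
import numpy as np

# Hypothetical per-row non-zero counts, as produced by the row-building pass.
nnz_per_row = np.array([2, 0, 3, 1])

# CSR `indptr` is the prefix sum with a leading zero: row i's entries live at
# indices[indptr[i]:indptr[i + 1]].
indptr = np.concatenate(([0], np.cumsum(nnz_per_row)))
```

The serial `np.cumsum` here stands in for the parallelised cumulative sum in the Rust code.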

* Update Miri pin to later version of crossbeam-epoch

* Improve error handling and messages

* Simplify unnecessary match

* Add link to environment variable configuration

* Add link to Rayon plumbing README

* Add explicit test of serial and parallel modes
jakelishman committed Apr 26, 2024
1 parent 1802732 commit d084aeb
Showing 9 changed files with 983 additions and 14 deletions.
46 changes: 46 additions & 0 deletions .github/workflows/miri.yml
@@ -0,0 +1,46 @@
name: Miri
on:
  push:
  pull_request:
concurrency:
  group: ${{ github.repository }}-${{ github.ref }}-${{ github.head_ref }}-${{ github.workflow }}
  # Only cancel in PR mode. In push mode, don't cancel so we don't see spurious test "failures",
  # and we get coverage reports on Coveralls for every push.
  cancel-in-progress: ${{ github.event_name == 'pull_request' }}

jobs:
  miri:
    if: github.repository_owner == 'Qiskit'
    name: Miri
    runs-on: ubuntu-latest
    env:
      RUSTUP_TOOLCHAIN: nightly

    steps:
      - uses: actions/checkout@v4

      - name: Install Rust toolchain
        uses: dtolnay/rust-toolchain@nightly
        with:
          components: miri

      - name: Prepare Miri
        run: |
          set -e
          # Some of our dependencies aren't Miri-safe with their current release versions.  These
          # need overriding with known-good versions to run against until the Miri-safe versions
          # are released and updated in our Cargo.lock.
          cat >>Cargo.toml <<EOF
          [patch.crates-io]
          crossbeam-epoch = { git = "https://github.com/crossbeam-rs/crossbeam", rev = "9e859610" }
          EOF
          cargo miri setup
      - name: Run Miri
        run: cargo miri test
        env:
          # - `tree-borrows` is required for crossbeam components.
          # - `symbolic-alignment-check` is extra checking.
          # - `strict-provenance` is extra checking.
          MIRIFLAGS: '-Zmiri-tree-borrows -Zmiri-symbolic-alignment-check -Zmiri-strict-provenance'
52 changes: 52 additions & 0 deletions CONTRIBUTING.md
@@ -556,6 +556,58 @@ you just need to update the reference images as follows:

Note: If you have run `test/ipynb/mpl_tester.ipynb` locally it is possible some file metadata has changed, **please do not commit and push changes to this file unless they were intentional**.


### Testing Rust components

Rust-accelerated functions are generally tested from Python space, but where
there are Rust-specific internal details to test, `#[test]` functions can be
included inline.  Typically it is most convenient to place these in a separate
inline module that is only conditionally compiled in, such as
```rust
#[cfg(test)]
mod tests {
    #[test]
    fn my_first_test() {
        assert_eq!(2, 1 + 1);
    }
}
```

To run the Rust-space tests, do

```bash
cargo test --no-default-features
```

Our Rust-space components are configured such that setting the
`--no-default-features` flag will compile the test runner, but not attempt to
build a linked CPython extension module, which would cause linker failures.

### Unsafe code and Miri

Any `unsafe` code added to the Rust logic should be exercised by Rust-space
tests, in addition to the more complete Python test suite. In CI, we run the
Rust test suite under [Miri](https://github.com/rust-lang/miri) as an
undefined-behavior sanitizer.

Miri is currently only available on `nightly` Rust channels, so to run it
locally you will need to ensure you have that channel available, such as by
```bash
rustup install nightly --component miri
```

After this, you can run the Miri test suite with
```bash
MIRIFLAGS="<flags go here>" cargo +nightly miri test
```

For the current set of `MIRIFLAGS` used by Qiskit's CI, see the
[`miri.yml`](https://github.com/Qiskit/qiskit/blob/main/.github/workflows/miri.yml)
GitHub Action file. This same file may also include patches to dependencies to
make them compatible with Miri, which you would need to temporarily apply as
well.

## Style and lint

Qiskit uses three tools to verify code formatting and lint checking. The
4 changes: 4 additions & 0 deletions crates/accelerate/src/lib.rs
@@ -31,6 +31,10 @@ pub mod two_qubit_decompose;
pub mod utils;
pub mod vf2_layout;

mod rayon_ext;
#[cfg(test)]
mod test;

#[inline]
pub fn getenv_use_multiple_threads() -> bool {
    let parallel_context = env::var("QISKIT_IN_PARALLEL")
171 changes: 171 additions & 0 deletions crates/accelerate/src/rayon_ext.rs
@@ -0,0 +1,171 @@
// This code is part of Qiskit.
//
// (C) Copyright IBM 2023
//
// This code is licensed under the Apache License, Version 2.0. You may
// obtain a copy of this license in the LICENSE.txt file in the root directory
// of this source tree or at http://www.apache.org/licenses/LICENSE-2.0.
//
// Any modifications or derivative works of this code must retain this
// copyright notice, and modified files need to carry a notice indicating
// that they have been altered from the originals.

//! Extension structs for use with Rayon parallelism.

// See https://github.com/rayon-rs/rayon/blob/v1.10.0/src/iter/plumbing/README.md (or a newer
// version) for more of an explanation of how Rayon's plumbing works.

use rayon::iter::plumbing::*;
use rayon::prelude::*;

pub trait ParallelSliceMutExt<T: Send>: ParallelSliceMut<T> {
    /// Create a parallel iterator over mutable chunks of uneven lengths for this iterator.
    ///
    /// # Panics
    ///
    /// Panics if the sums of the given lengths do not add up to the length of the slice.
    #[track_caller]
    fn par_uneven_chunks_mut<'len, 'data>(
        &'data mut self,
        chunk_lengths: &'len [usize],
    ) -> ParUnevenChunksMut<'len, 'data, T> {
        let mut_slice = self.as_parallel_slice_mut();
        let chunk_sum = chunk_lengths.iter().sum::<usize>();
        let slice_len = mut_slice.len();
        if chunk_sum != slice_len {
            panic!("given slices of total size {chunk_sum} for a chunk of length {slice_len}");
        }
        ParUnevenChunksMut {
            chunk_lengths,
            data: mut_slice,
        }
    }
}

impl<T: Send, S: ?Sized> ParallelSliceMutExt<T> for S where S: ParallelSliceMut<T> {}

/// Very similar to Rayon's [rayon::slice::ChunksMut], except that the lengths of the individual
/// chunks are arbitrary, provided they sum to the total length of the slice.
#[derive(Debug)]
pub struct ParUnevenChunksMut<'len, 'data, T> {
    chunk_lengths: &'len [usize],
    data: &'data mut [T],
}

impl<'len, 'data, T: Send + 'data> ParallelIterator for ParUnevenChunksMut<'len, 'data, T> {
    type Item = &'data mut [T];

    #[track_caller]
    fn drive_unindexed<C: UnindexedConsumer<Self::Item>>(self, consumer: C) -> C::Result {
        bridge(self, consumer)
    }
}

impl<'len, 'data, T: Send + 'data> IndexedParallelIterator for ParUnevenChunksMut<'len, 'data, T> {
    #[track_caller]
    fn drive<C: Consumer<Self::Item>>(self, consumer: C) -> C::Result {
        bridge(self, consumer)
    }

    fn len(&self) -> usize {
        self.chunk_lengths.len()
    }

    #[track_caller]
    fn with_producer<CB: ProducerCallback<Self::Item>>(self, callback: CB) -> CB::Output {
        callback.callback(UnevenChunksMutProducer {
            chunk_lengths: self.chunk_lengths,
            data: self.data,
        })
    }
}

struct UnevenChunksMutProducer<'len, 'data, T: Send> {
    chunk_lengths: &'len [usize],
    data: &'data mut [T],
}

impl<'len, 'data, T: Send + 'data> Producer for UnevenChunksMutProducer<'len, 'data, T> {
    type Item = &'data mut [T];
    type IntoIter = UnevenChunksMutIter<'len, 'data, T>;

    fn into_iter(self) -> Self::IntoIter {
        Self::IntoIter::new(self.chunk_lengths, self.data)
    }

    #[track_caller]
    fn split_at(self, index: usize) -> (Self, Self) {
        // Technically quadratic for a full-depth split, but let's worry about that later if needed.
        let data_mid = self.chunk_lengths[..index].iter().sum();
        let (chunks_left, chunks_right) = self.chunk_lengths.split_at(index);
        let (data_left, data_right) = self.data.split_at_mut(data_mid);
        (
            Self {
                chunk_lengths: chunks_left,
                data: data_left,
            },
            Self {
                chunk_lengths: chunks_right,
                data: data_right,
            },
        )
    }
}

#[must_use = "iterators do nothing unless consumed"]
struct UnevenChunksMutIter<'len, 'data, T> {
    chunk_lengths: &'len [usize],
    // The extra `Option` wrapper here is to satisfy the borrow checker while we're splitting the
    // `data` reference.  We need to consume `self`'s reference during the split before replacing
    // it, which means we need to temporarily set the `data` ref to some unowned value.
    // `Option<&mut [T]>` means we can replace it temporarily with the null reference, ensuring the
    // mutable aliasing rules are always upheld.
    data: Option<&'data mut [T]>,
}

impl<'len, 'data, T> UnevenChunksMutIter<'len, 'data, T> {
    fn new(chunk_lengths: &'len [usize], data: &'data mut [T]) -> Self {
        Self {
            chunk_lengths,
            data: Some(data),
        }
    }
}

impl<'len, 'data, T> Iterator for UnevenChunksMutIter<'len, 'data, T> {
    type Item = &'data mut [T];

    #[track_caller]
    fn next(&mut self) -> Option<Self::Item> {
        if self.chunk_lengths.is_empty() {
            return None;
        }
        let (out_data, rem_data) = self
            .data
            .take()
            .unwrap()
            .split_at_mut(self.chunk_lengths[0]);
        self.chunk_lengths = &self.chunk_lengths[1..];
        self.data = Some(rem_data);
        Some(out_data)
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        (self.chunk_lengths.len(), Some(self.chunk_lengths.len()))
    }
}

impl<'len, 'data, T> ExactSizeIterator for UnevenChunksMutIter<'len, 'data, T> {}

impl<'len, 'data, T> DoubleEndedIterator for UnevenChunksMutIter<'len, 'data, T> {
    #[track_caller]
    fn next_back(&mut self) -> Option<Self::Item> {
        if self.chunk_lengths.is_empty() {
            return None;
        }
        let pos = self.chunk_lengths.len() - 1;
        let data_pos = self.data.as_ref().map(|x| x.len()).unwrap() - self.chunk_lengths[pos];
        let (rem_data, out_data) = self.data.take().unwrap().split_at_mut(data_pos);
        self.chunk_lengths = &self.chunk_lengths[..pos];
        self.data = Some(rem_data);
        Some(out_data)
    }
}
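For reference, the slice arithmetic that the producer and its iterator implement can be sketched serially with only the standard library. This is an illustrative analogue, not part of the patch: each chunk length peels a prefix off the remaining slice, exactly as `UnevenChunksMutIter::next` does, and `split_at_mut` keeps the resulting mutable borrows disjoint.

```rust
/// Serial analogue of `par_uneven_chunks_mut`: split `data` into consecutive
/// mutable chunks whose lengths are given by `chunk_lengths`.
///
/// Panics if the lengths do not sum to the slice length, mirroring the
/// parallel version's check.
fn uneven_chunks_mut<'a, T>(mut data: &'a mut [T], chunk_lengths: &[usize]) -> Vec<&'a mut [T]> {
    assert_eq!(chunk_lengths.iter().sum::<usize>(), data.len());
    let mut out = Vec::with_capacity(chunk_lengths.len());
    for &len in chunk_lengths {
        // Peel the next chunk off the front of the remaining slice.
        let (head, tail) = data.split_at_mut(len);
        out.push(head);
        data = tail;
    }
    out
}
```

The parallel version does the same splitting, but lazily and from both ends, so Rayon can hand disjoint chunk ranges to different worker threads.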
