Allow users to construct decoders with sparse parity check matrix#440
Allow users to construct decoders with sparse parity check matrix#440bmhowe23 wants to merge 11 commits into
Conversation
Signed-off-by: Ben Howe <bhowe@nvidia.com>
Signed-off-by: Ben Howe <bhowe@nvidia.com>
…nitializations Signed-off-by: Ben Howe <bhowe@nvidia.com>
Signed-off-by: Ben Howe <bhowe@nvidia.com>
Signed-off-by: Ben Howe <bhowe@nvidia.com>
This reverts commit 9df0571. Signed-off-by: Ben Howe <bhowe@nvidia.com>
I, Ben Howe <bhowe@nvidia.com>, hereby add my Signed-off-by to this commit: 9df0571 Signed-off-by: Ben Howe <bhowe@nvidia.com>
Signed-off-by: Ben Howe <bhowe@nvidia.com>
| // ============================================================================ | ||
|
|
||
| trt_decoder::trt_decoder(const cudaqx::tensor<uint8_t> &H, | ||
| trt_decoder::trt_decoder(const cudaq::qec::sparse_binary_matrix &H, |
There was a problem hiding this comment.
will this change break downstream users? Since inputs now need to be explicitly wrapped, this could introduce friction. If backward compatibility is important, it might be worth adding a temporary overload that accepts a tensor and converts internally.
| /// @param col_ptrs Column pointer array (length num_cols + 1); column j has | ||
| /// indices in \p row_indices[col_ptrs[j] .. col_ptrs[j+1]-1]. | ||
| /// @param row_indices Row indices of non-zeros (length nnz). | ||
| static sparse_binary_matrix from_csc(index_type num_rows, index_type num_cols, |
There was a problem hiding this comment.
Looks like from_csc / from_csr don’t validate inputs currently. It might be worth checking things like pointer array sizes (col_ptrs, row_ptrs), index bounds, and monotonicity of the pointer arrays. If not malformed inputs could lead to subtle bugs or out-of-bounds access later.
| for (std::size_t c = 0; c < block_size; c++) | ||
| syndrome ^= (c != qErr) && H.at({r, c}); | ||
| for (std::uint32_t c : H_e2d[qErr]) | ||
| syndrome ^= 1; |
There was a problem hiding this comment.
Looks like the logic changed from the old code. The old version XORed all columns except qErr, while the new one only toggles non-zeros in column qErr. The new version seems more aligned with a standard LUT but just wanted to confirm.
Continuation of [NVIDIA#440](NVIDIA#440). Closes [NVIDIA#379](NVIDIA#379). Introduces `cudaq::qec::sparse_binary_matrix` (CSC/CSR, `uint32_t` indices) as the decoder PCM type, so large PCMs no longer have to be materialized as a dense `cudaqx::tensor`. The sparse path is wired through PCM generation, round slicing, decoder construction, and Python bindings. `sliding_window` canonicalizes once and reads `H.ptr() / H.indices()` directly per window. ## Breaking changes - **Decoder constructor** now takes `const sparse_binary_matrix &H`. Dense `cudaqx::tensor<uint8_t>` callers still compile via implicit conversion, but the extension-point template changed — **out-of-tree plugins must be recompiled**. - **`generate_random_pcm` throws above `k_max_dense_pcm_elements` (400M)**. Use `generate_random_pcm_sparse` for larger matrices. - **`single_error_lut_example`** now uses column `qErr` of H directly instead of XOR-of-other-columns (the old form was only correct when every row of H had even weight). Example plugin only; production `single_error_lut` was already correct. ## Follow-up PRs - Sparse-aware PCM utilities: migrate sort_pcm_columns, simplify_pcm, reorder_pcm_columns, shuffle_pcm_columns, pcm_extend_to_n_rounds, pcm_to_sparse_*, pcm_from_sparse_*, and detector_error_model::canonicalize_for_rounds to sparse_binary_matrix so the full NVIDIA#379 scale pipeline avoids any dense cudaqx::tensor<uint8_t> allocation. - Docs: Sphinx entries for the new type/functions and the 400M dense cap. Note to reviewers: Quick context on why it's larger than the ~1000-line guideline: it's a foundational type-system change rather than an additive feature. The new sparse_binary_matrix type, the decoder constructor breaking change, all the plugin updates, the sparse generator, and the Python bindings all have to land together for the NVIDIA#379 fix to actually be usable end-to-end. Splitting it would create intermediate states where the breaking change has landed but users still can't construct a >50k-mechanism PCM which would mean shipping the API churn without the benefit. --------- Signed-off-by: Ben Howe <bhowe@nvidia.com> Signed-off-by: vedika-saravanan <vsaravanan@nvidia.com> Signed-off-by: Angela Burton <angelab@nvidia.com> Co-authored-by: Ben Howe <bhowe@nvidia.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: GitHub Action <action@github.com> Co-authored-by: Angela Burton <angelab@nvidia.com>
|
Superceded by #550. |
Related: #379
Note: before merging, consider breaking change effects further and prepare a separate docs update PR.