refactor: redesign pipeline groups 05/07/08, expand to 24 pipelines#9
Merged
Conversation
Fix three bugs preventing GPU execution across all pipeline variants: 1. SQD eigensolver (krylov/circuits/sqd.py): gpu_eigsh() was called with invalid kwargs (which, use_gpu) and a torch tensor instead of numpy. Every SQD batch silently fell back to CPU. Now converts to numpy first and removes unsupported kwargs. 2. IBM SQD API (methods/nqs/hi_nqs_sqd.py, hi_nqs_skqd.py): Updated recover_configurations() and solve_fermion() calls to match qiskit-addon-sqd >= 0.12 API (new probabilities arg, split num_elec_a/num_elec_b, SCIState return type). Default use_ibm_solver set to False since SCIState.amplitudes shape doesn't align with input configs for NQS teacher training. 3. Dockerfile.gpu: Add configs extra to pip install for YAML support. Also adds: - 23 pipeline scripts (experiments/pipelines/) covering 8 groups × 3 diag modes (classical Krylov, quantum Krylov, SQD) - Pipeline runner (run_all_pipelines.py) with comparison table - Quantum circuit method scripts and configs - quantum_circuit subpackage (methods/quantum_circuit/) All 23 pipelines verified passing on GPU in both WSL2 and Docker.
…elines - Group 05 (HF-only): add hf_sqd.py SQD variant, completing the 3-mode set - Group 07: rewrite from "DCI seed + separate NQS loop" to "NF training + DCI merge → iterative NQS refinement" (same as Group 02 stages 1-2 feeding into Group 06 stage 3) - Group 08: rewrite from "DCI+PT2 seed + separate NQS loop" to "NF training + DCI merge + PT2 expansion → iterative NQS refinement" (same as Group 03 stages 1-2.5 feeding into Group 06 stage 3) - Delete redundant experiments/methods/ (11 scripts) and experiments/configs/ (11 YAML files), replaced by experiments/pipelines/ - Update run_all_pipelines.py from 23 to 24 pipelines - Update all documentation: README.md, AGENTS.md, experiments_guide.md, pipelines.rst, yaml_configs.rst, h2/beh2 tutorials, ADR-001 All 24 pipelines verified passing on GPU in both WSL2 and Docker.
1. CRITICAL: vectorized_dedup returns np.ndarray — wrap with torch.from_numpy before .to(device) to prevent AttributeError crash (hi_nqs_sqd.py:267) 2. MEDIUM: _compute_orbital_occupancies device regression — move coeffs to configs.device (keep GPU) instead of forcing configs to CPU (sqd.py:889-891) 3. MEDIUM: Replace `mol_info.get(key) or fallback` with `mol_info.get(key, fallback)` to avoid falsy-zero trap, plus getattr guard for hamiltonian.integrals on non-molecular Hamiltonians (hi_nqs_sqd.py:219-221, hi_nqs_skqd.py:225-227) 4. LOW: Remove redundant ground_state.to(H_work.device) since both branches already place it there (sqd.py:786) Signed-off-by: thc1006 <84045975+thc1006@users.noreply.github.com>
1. run_all_pipelines.py: refactor --skip-iterative filter from fragile and/or chain to clear str.startswith(tuple) form 2. 04_nf_only/nf_sqd.py: add missing 5th tier (<=5000) in get_training_params to match all other pipeline scripts 3. 07/08 groups: fix YAML key mismatch — scripts read "n_iterations" but YAML defines "max_iterations". Standardize to "max_iterations" 4. 05_hf_only.yaml: add missing SQD config keys (sqd_noise_rate, sqd_num_batches, sqd_self_consistent_iters) Signed-off-by: thc1006 <84045975+thc1006@users.noreply.github.com>
2 tasks
thc1006
reviewed
Mar 26, 2026
Member
thc1006
left a comment
There was a problem hiding this comment.
Code Review Summary
Deep review of all 60 files. Found 12 issues (1 critical, 6 medium, 5 low).
Critical (will crash at runtime)
hi_nqs_sqd.py:267 — vectorized_dedup() returns np.ndarray, but .to(device) is called on it. This raises AttributeError whenever cumulative_basis.shape[0] > 0.
Medium
sqd.py:889—_compute_orbital_occupanciesmoves configs to CPU (coeffs' device) instead of moving coeffs to GPU. Performance regression.hi_nqs_sqd.py:219,hi_nqs_skqd.py:225—mol_info.get(key) or fallbackusesorwhich treats integer 0 as falsy. Should use.get(key, default).- Groups 07/08 (6 pipelines) — NF+DCI basis is computed in stages 1-2 but never passed to
run_hi_nqs_skqdin stage 3. These pipelines are functionally identical to Group 06. Tracked in #10. run_all_pipelines.py—--skip-iterativefilter has confusingand/orprecedence (works by coincidence).04_nf_only/nf_sqd.py—get_training_paramshas 4 tiers (missing <=5000), all other files have 5.- Groups 07/08 YAML — YAML defines
max_iterationsbut scripts readn_iterations— value silently ignored.
Low
05_hf_only.yamlmissing SQD keys- No
hamiltonian.integralsguard for non-molecular Hamiltonians - Redundant
.to(H_work.device)in sqd.py compare.pyincomplete config validation
Fix PR
All fixes except #4 (requires API change, tracked in #10) are in PR #11.
Member
|
可以直接 merge PR #11 到 PR #9 branch,然後再 merge PR #9 到 main @George930502 |
When neither mol_info nor hamiltonian.integrals provides n_orbitals, n_alpha, or n_beta, the old fallback to 0 would cause division-by-zero in occupancy computation and misconfigure AutoregressiveTransformer. Now raises ValueError with a clear message listing which values are missing. Addresses Copilot review on PR #11. Signed-off-by: thc1006 <84045975+thc1006@users.noreply.github.com>
fix: PR #9 code review findings — 10 bug fixes
3 tasks
thc1006
added a commit
to thc1006/qvartools
that referenced
this pull request
Apr 1, 2026
Critical fixes: - #1/#2: Use persistent prev_coeffs/prev_batch_configs for PT2 scoring (was using best_coeffs before it was defined in current iteration) - QuantumNoLab#5: Eviction checks cumulative_basis size (was checking best_batch_configs) - QuantumNoLab#7: Convert numpy indices to torch.LongTensor before indexing CUDA tensor - QuantumNoLab#8: Default energy_weight=0.0, entropy_weight=0.0 (was 0.1/0.05, silently changing existing behavior) Lower priority fixes: - QuantumNoLab#6: CIPSI docstring mentions build_sparse_hamiltonian requirement - QuantumNoLab#9: Warn when energy_weight>0 but hamiltonian=None Not applicable (won't fix): - QuantumNoLab#3/QuantumNoLab#4: config_integer_hash returns tuple for n_sites>=64; our CAS max is 58Q (<64), so int cast is safe Co-authored-by: leo07010 <leo07010@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
hf_sqd.pySQD variant, completing the 3-mode set (classical Krylov, quantum Krylov, SQD)experiments/methods/(11 scripts) andexperiments/configs/(11 YAML files), fully replaced byexperiments/pipelines/All 8 groups now follow a consistent pattern: each group has exactly 3 diag variants (Classical Krylov, Quantum Krylov, SQD) = 24 pipelines total.
Test plan
run_all_pipelines.py h2 --device cuda→ 24/24 OK)docker run --gpus all→ 24/24 OK)make html)ruff check)🤖 Generated with Claude Code