
separated Diffusion Model resampler for STM PR from PR #1794 #1798

Merged
oksuzian merged 4 commits into Mu2e:main from YongyiBWu:diffusion-model
Apr 25, 2026
Conversation

@YongyiBWu
Contributor

@YongyiBWu YongyiBWu commented Apr 11, 2026

Sub-PR with only the diffusion model resampler update, separated from PR #1794.
Suggested changes from PR #1794 have been applied; see PR #1794 for response details.

=================================
This pull request introduces a new score-based diffusion model machine learning tool, integrates it into the build system, and adds new modules for virtual detector resampling using this model. Additionally, it updates the generator ID enumeration to support a new generator and defines transformation defaults for the VDResampler. The changes are grouped below by theme.

Machine Learning Model Integration:

  • Added a new ScoreBasedDiffusionModel class for training and sampling using score-based diffusion models, including its header and implementation. This model supports configurable neural network architectures, optimizer and noise schedule selection, and model persistence. (MachineLearningTools/inc/ScoreBasedDiffusionModel.hh, MachineLearningTools/CMakeLists.txt, MachineLearningTools/src/SConscript) [1] [2] [3]
  • Integrated the new machine learning library into the build system and made it available to other modules. (MachineLearningTools/CMakeLists.txt, MachineLearningTools/src/SConscript, STMMC/src/SConscript) [1] [2] [3]

Virtual Detector Resampling Modules:

  • Added four new art modules for VDResampler: configuration, training, generation from model, and generation with mixing. These modules utilize the new machine learning tools and are registered in the build system. (STMMC/CMakeLists.txt)

Generator ID Updates:

  • Extended the GenId enumeration and its string mapping to include a new generator, STMDownStreamGenTool. (MCDataProducts/inc/GenId.hh) [1] [2]

VDResampler Transformation Defaults:

  • Added a header defining default transformation constants for the VDResampler. (STMMC/inc/VDResamplerTransformDefaults.hh)

Copilot AI review requested due to automatic review settings April 11, 2026 22:46
@FNALbuild
Collaborator

Hi @YongyiBWu,
You have proposed changes to files in these packages:

  • MCDataProducts
  • STMMC
  • MachineLearningTools

which require these tests: build.

@Mu2e/fnalbuild-users, @Mu2e/write have access to CI actions on main.

⌛ The following tests have been triggered for bbe08a4: build (Build queue - API unavailable)

About FNALbuild. Code review on Mu2e/Offline.

Contributor

Copilot AI left a comment


Pull request overview

This PR adds a score-based diffusion model implementation to MachineLearningTools and introduces new STMMC modules that configure, train, and sample virtual-detector resampling models (including a “mix across sources” generator). It also extends GenId to tag the new downstream STM generator.

Changes:

  • Added ScoreBasedDiffusionModel (training, sampling, save/load) and wired it into the build.
  • Added VD resampler modules: configure (summary + auto-generated training FHiCL), train, generate-from-model, and generate-mix.
  • Extended GenId with STMDownStreamGenTool for generated particles from these modules.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
MachineLearningTools/inc/ScoreBasedDiffusionModel.hh Public API for score-based diffusion model training/sampling and CSV persistence.
MachineLearningTools/src/ScoreBasedDiffusionModel.cc Core diffusion model implementation (network, optimizer, training loop, save/load, sampling).
MachineLearningTools/CMakeLists.txt Builds/installs the new MachineLearningTools library.
MachineLearningTools/src/SConscript SCons build integration for the new package.
STMMC/inc/VDResamplerTransformDefaults.hh Centralizes default transform constants shared by training/generation.
STMMC/src/VDResamplerConfigure_module.cc Scans VD hits, writes hit summary CSV, generates per-PDG training FHiCL.
STMMC/src/VDResamplerTrain_module.cc Collects transformed training samples and trains 1-stage or 2-stage diffusion models.
STMMC/src/VDResamplerGenerateFromModel_module.cc Generates GenParticles from a single model (1-stage or 2-stage).
STMMC/src/VDResamplerGenerateMix_module.cc Mixes across multiple summary sources and particle ratios; loads appropriate models and generates GenParticles.
STMMC/CMakeLists.txt Registers the new STMMC art plugins and links required libraries.
STMMC/src/SConscript Adds MachineLearningTools + SeedService dependencies for SCons plugin builds.
MCDataProducts/inc/GenId.hh Adds STMDownStreamGenTool enum value and corresponding string name.


Comment thread MachineLearningTools/src/ScoreBasedDiffusionModel.cc Outdated
Comment thread STMMC/src/VDResamplerConfigure_module.cc Outdated
Comment thread STMMC/src/VDResamplerTrain_module.cc Outdated
Comment thread STMMC/src/VDResamplerTrain_module.cc Outdated
@FNALbuild
Collaborator

☀️ The build tests passed at bbe08a4.

Test Result Details
test with Command did not list any other PRs to include
merge Merged bbe08a4 at 2cc428c
build (prof) Log file. Build time: 04 min 09 sec
ceSimReco Log file.
g4test_03MT Log file.
transportOnly Log file.
POT Log file.
g4study Log file.
cosmicSimReco Log file.
cosmicOffSpill Log file.
ceSteps Log file.
ceDigi Log file.
muDauSteps Log file.
ceMix Log file.
rootOverlaps Log file.
g4surfaceCheck Log file.
trigger Log file.
check_cmake Log file.
FIXME, TODO TODO (0) FIXME (0) in 8 files
clang-tidy ➡️ 8 errors 27 warnings
whitespace check no whitespace errors found

N.B. These results were obtained from a build of this Pull Request at bbe08a4 after being merged into the base branch at 2cc428c.

For more information, please check the job page here.
Build artifacts are deleted after 5 days. If this is not desired, select Keep this build forever on the job page.

@oksuzian
Collaborator

I'll review this pull request by examining the details and the code changes. Let me fetch the changed files directly from the PR's head branch to review them. Now let me also check the existing review comments on this PR. Here is my review of PR #1798. This is a substantial PR (~2,967 additions) introducing a custom score-based diffusion model ML framework and four new art modules for virtual detector resampling. I've categorized the issues by severity.


🔴 Bugs / Correctness Issues

1. HepLorentzVector constructor argument order is wrong

In both VDResamplerGenerateFromModel_module.cc and VDResamplerGenerateMix_module.cc, the HepLorentzVector is constructed as:

const CLHEP::HepLorentzVector fourMomParticle(eTotal, momParticle);

CLHEP's HepLorentzVector(double, Hep3Vector) interprets the first argument as the x-component, not energy. The correct constructor for (energy, 3-momentum) is HepLorentzVector(Hep3Vector, double):

const CLHEP::HepLorentzVector fourMomParticle(momParticle, eTotal);

This will produce completely wrong 4-vectors as written.

2. loadModel() double-initializes weights wastefully and risks inconsistency

In ScoreBasedDiffusionModel::loadModel(), the constructor is called (which randomly initializes all weights), and then immediately overwritten:

ScoreBasedDiffusionModel model(randFlat, randGaussQ, dim, ...);
// Overwrite initialized weights with loaded values
for (size_t l = 0; l < model.network_.size(); ++l) {
    model.network_[l].W = loadedNetwork[l].W;
    model.network_[l].b = loadedNetwork[l].b;
}

This wastes random engine state (consuming random numbers during construction), which subtly changes reproducibility compared to a model that was never loaded. Consider adding a private constructor that skips random initialization.
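A minimal sketch of the suggested fix, with invented names (this is not the PR's actual API): a constructor flag that skips the random weight draws entirely, so a load path never touches the shared engine's state.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical sketch: a model whose constructor draws initial weights from a
// shared engine, plus a flag that skips the draws so a load path does not
// perturb the engine's state.
class TinyModel {
public:
  TinyModel(std::mt19937& engine, std::size_t nWeights,
            bool initializeRandomWeights = true)
      : weights_(nWeights, 0.0) {
    if (initializeRandomWeights) {
      std::normal_distribution<double> gauss(0.0, 1.0);
      for (auto& w : weights_) w = gauss(engine);  // consumes engine state
    }
    // else: weights stay zero; the loader overwrites them with saved values
  }
  const std::vector<double>& weights() const { return weights_; }
private:
  std::vector<double> weights_;
};
```

Constructing with the flag set to false leaves the engine exactly where it was, so runs that load a model reproduce runs that never did.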

3. trainingSampleSize_ assigned from size_t to int — truncation

In train(), trainingSampleSize_ = N; where N is size_t but trainingSampleSize_ is int. For very large datasets (>2^31 samples), this silently overflows.

4. VDResamplerConfigure's ttree is uninitialized when doROOTDump is false

The member TTree* ttree; is never initialized to nullptr. If doROOTDump is false, ttree is a dangling pointer. While it's not dereferenced in that path currently, this is a latent bug.


🟡 Robustness / Defensive Programming Issues

5. No validation of VDr > 0 in training or generation modules

As also noted in an existing review comment, if VDr is configured as 0 or negative, the division rho = r / VDr produces inf/NaN. VDResamplerTrain, VDResamplerGenerateFromModel, and VDResamplerGenerateMix should all validate VDr > 0 in their constructors.

6. Silent fallback for unknown optimizer/noise schedule strings

In VDResamplerTrain_module.cc, unrecognized SBDMoptimizer or SBDMnoiseSchedule values silently default to ADAM/COSINE. A typo like "Aadam" would silently use the wrong optimizer. Add validation or throw on unknown values.
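One way to make the configuration strict, sketched with an invented enum and parser (not the module's actual code): map strings to an enum and throw on anything unrecognized, so a typo like "Aadam" cannot silently select ADAM.

```cpp
#include <stdexcept>
#include <string>

// Hedged sketch: strict string-to-enum parsing for an optimizer setting.
enum class Optimizer { ADAM, SGD };

inline Optimizer parseOptimizer(const std::string& s) {
  if (s == "ADAM") return Optimizer::ADAM;
  if (s == "SGD")  return Optimizer::SGD;
  throw std::invalid_argument("unknown optimizer: " + s);
}
```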

7. Fisher-Yates shuffle has off-by-one potential

size_t j = static_cast<size_t>(randFlat_.fire() * (i + 1));

RandFlat::fire() returns [0, 1), so j ranges over [0, i] which is correct. However, if the engine ever returns exactly 1.0 (which some implementations may do), j == i+1 would be out of bounds. A safer approach: j = std::min(j, i).
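The clamped step can be sketched as follows; `uniform01` stands in for RandFlat::fire(), and the clamp guards against an engine that ever returns exactly 1.0.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Sketch of a Fisher-Yates shuffle with the defensive clamp discussed above.
template <typename Rng01>
void fisherYatesShuffle(std::vector<double>& v, Rng01 uniform01) {
  for (std::size_t i = v.size(); i-- > 1; ) {
    std::size_t j = static_cast<std::size_t>(uniform01() * (i + 1));
    j = std::min(j, i);  // keeps j in [0, i] even if uniform01() == 1.0
    std::swap(v[i], v[j]);
  }
}
```

With a degenerate generator that always returns 1.0, every j clamps to i and the vector is left untouched instead of reading out of bounds.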

8. computeLoss() is declared but never called

The member function computeLoss() is implemented but the loss is actually computed inline in the training loop. This is dead code that could become stale.


🟢 Style / Maintenance Issues

9. Inconsistent coding style

The codebase mixes different formatting conventions throughout:

  • Sometimes for(size_t i=0;i<z.size();++i) (no spaces), other times for (size_t i = 0; i < out; ++i) (with spaces)
  • Inconsistent brace placement ({ on same line vs. next line)
  • The Mu2e Offline codebase generally uses consistent formatting — this should be unified.

10. typedef instead of using

typedef cet::map_vector_key key_type;
typedef unsigned long VolumeId_type;

Modern C++ prefers using declarations. Also, key_type is defined but never used in some modules.

11. Member variables used as scratch space in analyze()

In VDResamplerConfigure and VDResamplerTrain, variables like x, y, z, px, py, pz, pdgId are stored as class members but are only used as local variables within analyze(). They should be local variables instead — storing them as members wastes memory and makes the class harder to reason about thread-safety.

12. Duplicated inverse-transform code in Generate modules

The coordinate inverse transform (arctanh, polar→Cartesian, momentum recovery) is duplicated verbatim between VDResamplerGenerateFromModel_module.cc and VDResamplerGenerateMix_module.cc. This should be extracted into a shared utility function (e.g., in a header under STMMC/inc/).
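A sketch of what the shared header could look like; the names and exact formulas here are illustrative, not lifted from the PR, but they show a forward/inverse pair (atanh stretch of the normalized radius, polar to Cartesian) living in one place so the inverse is never duplicated.

```cpp
#include <cmath>

// Hedged sketch of a shared transform header for the resampler modules.
struct TransformedSample { double rho; double phi; };

// forward: Cartesian -> (atanh-stretched radius normalized by VDr, angle)
inline TransformedSample forwardTransform(double x, double y, double VDr) {
  const double r = std::hypot(x, y);
  return { std::atanh(r / VDr), std::atan2(y, x) };
}

// inverse: undo the stretch, then polar -> Cartesian
inline void inverseTransform(const TransformedSample& s, double VDr,
                             double& x, double& y) {
  const double r = std::tanh(s.rho) * VDr;
  x = r * std::cos(s.phi);
  y = r * std::sin(s.phi);
}
```

Keeping both directions next to each other also makes the round-trip property trivially testable.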

13. Magic numbers

Several magic numbers appear without constants:

  • 100000 threshold for two-stage training in VDResamplerConfigure
  • 1e-6 for radius safety check in multiple files
  • 0.1 for minimum time in t_safe = std::max(time, 0.1)

These should be named constants or configurable parameters.

14. No #include <memory> in modules using std::unique_ptr

Several module source files use std::unique_ptr without explicitly including <memory>. This compiles because other headers transitively include it, but it's fragile.


Summary

Severity Count Key Items
🔴 Bug 4 Wrong HepLorentzVector arg order, uninitialized ttree, int truncation, RNG state pollution
🟡 Robustness 4 No VDr validation, silent optimizer fallback, shuffle edge case, dead code
🟢 Style 6 Duplicated transforms, member-as-local, magic numbers, inconsistent formatting

The most critical fix needed is the HepLorentzVector constructor argument order — this will produce wrong physics results silently.

@oksuzian oksuzian requested review from brownd1978 and corrodis April 12, 2026 02:11
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@YongyiBWu
Contributor Author

  1. Geant4 10 and 11 both have HepLorentzVector(double, Hep3Vector) constructors according to their documentation, and some existing modules use that too. But to make things more consistent throughout the repo, everything has been changed to the HepLorentzVector(Hep3Vector, double) constructor now.
  2. Since the loaded model is never retrained and is only used for generating particles, and the model size remains the same, reproducibility is not a problem. Random states may be wasted, but that is not a large problem. Still, I added an optional flag to the constructor that turns off the RNG initialization when loading from a saved model.
  3. trainingSampleSize_ has been changed to size_t. Comment added to remind use of std::stoull if parsing from a saved file.
  4. Added a nullptr initialization.
  5. VDr, VDz0 values are now validated in all 3 modules.
  6. Now unrecognized settings prompt a warning message before the values fall back to default.
  7. This does not really matter. RandFlat::fire() returns values in [0, 1), so j = i + 1 should not occur. But I guess it doesn't hurt to add a line j = std::min(j, i);
  8. Now the per-dimension loss calculation calls the computeLoss function instead of using its own implementation. Both loss and gradient calculations are now unified on a per-dimension basis, and additional dimension checks were added in computeLoss.
  9. ... Fine... spacing conventions unified.
  10. key_type was removed. "typedef unsigned long VolumeId_type;" is also used in multiple other files and was kept for consistency.
  11. These members store variables that are potentially written into ROOT dumps, so it does not make sense to move them to local variables.
  12. Transform methods along with relevant constants are now collected in a single header file VDResamplerTransforms.hh
  13. Added to the corresponding header file or declared as constants.
  14. Fixed as suggested.

Introduce VDResamplerTransforms (new header) and remove legacy VDResamplerTransformDefaults; centralize forward/inverse sample transforms and numeric safety constants. Harden ScoreBasedDiffusionModel by: adding initializeRandomWeights ctor flag, using size_t for trainingSampleSize_, initializing layer buffers safely, guarding weight initialization, adding dimension checks in computeLoss, improving gradient/Adam math/formatting, fixing gradient clipping early-return, validating and reconstructing loaded network shapes (loadModel uses initializeRandomWeights=false), and various style/robustness fixes. Update STMMC modules (Configure/GenerateFromModel/GenerateMix/Train) to use the new transforms, add runtime validation for VDr/VDz0, improve config parsing with warnings, initialize pointers, replace inlined transform code with calls into VDResampler, and minor API/constructor fixes (HepLorentzVector usage). Misc: remove unused typedefs, add small constants (kTwoStageTrainingHitThreshold), includes, and other small cleanups to improve stability and correctness.
@YongyiBWu
Contributor Author

@FNALbuild run build test

@FNALbuild
Collaborator

⌛ The following tests have been triggered for a774f0a: build (Build queue - API unavailable)

@FNALbuild
Collaborator

☀️ The build tests passed at a774f0a.

Test Result Details
test with Command did not list any other PRs to include
merge Merged a774f0a at 1d1b9f0
build (prof) Log file. Build time: 08 min 37 sec
ceSimReco Log file.
g4test_03MT Log file.
transportOnly Log file.
POT Log file.
g4study Log file.
cosmicSimReco Log file.
cosmicOffSpill Log file.
ceSteps Log file.
ceDigi Log file.
muDauSteps Log file.
ceMix Log file.
rootOverlaps Log file.
g4surfaceCheck Log file.
trigger Log file. Return Code 1.
check_cmake Log file.
FIXME, TODO TODO (0) FIXME (0) in 8 files
clang-tidy ➡️ 8 errors 24 warnings
whitespace check no whitespace errors found

N.B. These results were obtained from a build of this Pull Request at a774f0a after being merged into the base branch at 1d1b9f0.

For more information, please check the job page here.
Build artifacts are deleted after 5 days. If this is not desired, select Keep this build forever on the job page.

Collaborator

@brownd1978 brownd1978 left a comment


I can't evaluate this code technically. Since it's a pure addition I think it's OK.

@oksuzian oksuzian merged commit bb64680 into Mu2e:main Apr 25, 2026
13 of 14 checks passed
