Proof of concept: Initialise all models as ThreeFlavor #335

JostMigenda · 2024-05-15T18:28:24Z

For demonstration purposes, following the discussion in #309.

The changes are fairly small and well contained.

For most models (presn and ccsn), we already had code to map the columns in the input files (usually NU_E, NU_E_BAR and NU_X, i.e. a “OnePointFiveFlavor scheme”) to the TwoFlavor scheme; I’ve simply updated these existing mappings. With this, all tests run by pytest -m 'not snowglobes' succeed.

Of course, the FlavorTransformations are currently hardcoded to the TwoFlavor scheme; so any code that uses these transformations (mainly via SupernovaModel.get_transformed_flux) cannot currently switch to ThreeFlavor scheme. This is already changing in #308; so for the purposes of this proof of concept, I’ve marked those as xfails.

This is a proof of concept with minimal changes. With this, `pytest -m 'not snowglobes'` should succeed.

Odrzywolek_2010 didn’t require any changes ☺️

Sheshuk

Thank you for preparing this example.
It shows that really the transitioning all the model loaders to ThreeFlavor is not that hard, yes.
However

I wouldn't call this compact:

Already for CCSN models you have to make changes in:

PinchedModel
GarchingArchiveModel
Kuroda_2020
Fornax_2019 (several places)
Fornax_2021
Fornax_2022

Of course it's mostly repeating code, with slight variations. And of course it's there because the Fornax_2019 code itself is quite large. But with these changes it becomes even larger.

These changes are harder to test:

Any code is error prone and needs to be tested.

To validate these changes you need to test each of these models - that they still produce the same result, as before, when you convert the results back to TwoFlavor (and for this you'll have to write another conversion by hand, since you ditch ThreeFlavor>>TwoFlavor matrices).

With conversion matrices you can make proper unit testing of the matrices and their multiplication with objects - and be sure that all the conversions are correct.

In summary

I agree that this is doable and not too hard.
But what is the benefit of this approach? What do we gain?
Does it make the code cleaner, easier to maintain or to test?
Does it make any difference to the final user, or implementing FlavorTransformations, or further development?

python/snewpy/models/ccsn_loaders.py

jpkneller · 2024-05-15T19:28:58Z

Jost, some models distinguish NU_X and NU_X_BAR.

…tion in Fornax_2019

JostMigenda · 2024-05-16T21:18:54Z

@jpkneller: Good catch, thanks! Fixed now.

@Sheshuk:

I wouldn't call this compact:

Already for CCSN models you have to make changes in:

PinchedModel

GarchingArchiveModel

Kuroda_2020

Fornax_2019 (several places)

Fornax_2021

Fornax_2022

Yes, though that is a fundamental fact of SNEWPY’s current implementation: It is not possible to make such a change solely in the SupernovaModel abstract base class, because __init__ and get_initial_spectra both need to be overridden in child classes.
(Changing get_flux solely in the base class is possible; but as discussed in #309, that would leave a situation where a user would receive a mix of ThreeFlavor and TwoFlavor fluxes from a single model; so to avoid those, we’d again need to modify the __init__ and/or get_initial_spectra functions of all these child classes.)

Of course it's mostly repeating code, with slight variations. And of course it's there because the Fornax_2019 code itself is quite large. But with these changes it becomes even larger.

The reason for the repetition in my Fornax_2019 changes is that the Fornax_2019 code itself already was repetitive there; and I wanted to keep the changes minimal and obvious for this proof of concept.
In the new commit, I’ve made a slightly larger change so that the self._flavorkeys dict is only defined once; after these changes, Fornax_2019.__init__ is actually four lines shorter than before—and the total PR is five lines longer (of which four lines are xfails for the tests).

Regarding testing
I agree that testing is important to ensure we can trust any changes we make.
Right now, we already have tests that verify that our models can be initialised and that get_initial_spectra returns something in the expected format; those tests all succeed for this branch. ✅
We also have integration tests, which check the calculated event rates for a few scenarios. Unfortunately, these include calls to get_transformed_spectra; so they cannot run on our two draft PRs (#324 and this one). ⚠️

In principle, I think this combination of tests would normally be sufficient; in practice, while the integration tests don’t run, some replacement would be highly desirable. I’ve therefore spent the evening extending the existing initialisation tests to

initialise each model/progenitor with the snewpy v1.5 release
get_initial_spectra (i.e. TwoFlavor) and write those to file
initialise each model/progenitor with the branch from this PR here
run get_initial_spectra (with ThreeFlavor)
read the TwoFlavor file and compare numerical values of TwoFlavor.NU_X with ThreeFlavor.NU_{MU,TAU}, and of TwoFlavor.NU_X_BAR with ThreeFlavor.NU_{MU,TAU}_BAR

The result: In all cases the numerical values are identical. ✅

The test code is a bit hacky, unfortunately; but if you want, I can write it up and share it on Tuesday. (I’m on leave Fri/Mon.)
But to be clear: This is intended as a one-off test. My suggestion is that we transform all models into ThreeFlavor at initialisation and don’t use TwoFlavor at all, so keeping a test for converting back to TwoFlavor would not make sense—just like we haven’t had any tests that convert models from TwoFlavor back to “OnePointFiveFlavor” to test that we’re adding the NU_X_BAR component correctly.

In summary

I agree that this is doable and not too hard. But what is the benefit of this approach? What do we gain? Does it make the code cleaner, easier to maintain or to test?

In my opinion, yes.
Cleaner, because it would allow us to completely delete most TwoFlavor code, including special cases for handling NU_X(_BAR). (Maybe even all TwoFlavor code; but as you wrote earlier, maybe for some FlavorTransformations it’s easier to write them down in TwoFlavor and convert to ThreeFlavor with a matrix internally.)
Easier to maintain, because we can be certain that after model initialisation everything is (at least) ThreeFlavor. If we leave the models TwoFlavor after initialisation and only transform to ThreeFlavor at a later point, we’d always have to think about whether a model might still be TwoFlavor in the part of the code we’re currently editing, and whether we should use NU_X or NU_MU/TAU …
Easier to test, because—see above—we could delete a bunch of TwoFlavor-specific code; so that would simplify our testing, too.

Sheshuk

This looks good. We'll just have to remember to remove the xfails

…tion

JostMigenda added 3 commits May 15, 2024 18:25

switch all ccsn SupernovaModel classes to ThreeFlavor

de3c9d6

This is a proof of concept with minimal changes. With this, `pytest -m 'not snowglobes'` should succeed.

switch all presn SupernovaModel classes to ThreeFlavor

04d61ce

Odrzywolek_2010 didn’t require any changes ☺️

mark tests that use get_transformed_flux as xfail

9c995c3

JostMigenda mentioned this pull request May 15, 2024

Comparing Flavors #309

Open

jpkneller marked this pull request as ready for review May 15, 2024 18:36

Sheshuk reviewed May 15, 2024

View reviewed changes

python/snewpy/models/ccsn_loaders.py Outdated Show resolved Hide resolved

JostMigenda added 2 commits May 16, 2024 17:55

use NU_X_BAR values if available; remove duplicate _flavorkeys defini…

990d413

…tion in Fornax_2019

remove last usages of TwoFlavor

e3982e2

JostMigenda mentioned this pull request Jun 7, 2024

Adding the implementation of the FlavorScheme and FlavorMatrix #324

Merged

6 tasks

Sheshuk changed the base branch from Sheshuk/Flavor_Matrix_implementation to release_v2.0 June 7, 2024 15:48

Sheshuk previously approved these changes Jun 7, 2024

View reviewed changes

Merge branch 'release_v2.0' into JostMigenda/Flavor_Matrix_implementa…

d428a6e

…tion

Sheshuk dismissed their stale review via d428a6e June 7, 2024 15:59

Sheshuk approved these changes Jun 7, 2024

View reviewed changes

Sheshuk merged commit 690063a into release_v2.0 Jun 7, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proof of concept: Initialise all models as ThreeFlavor #335

Proof of concept: Initialise all models as ThreeFlavor #335

JostMigenda commented May 15, 2024

Sheshuk left a comment

jpkneller commented May 15, 2024

JostMigenda commented May 16, 2024

I wouldn't call this compact:

In summary

Sheshuk left a comment

Proof of concept: Initialise all models as ThreeFlavor #335

Proof of concept: Initialise all models as ThreeFlavor #335

Conversation

JostMigenda commented May 15, 2024

Sheshuk left a comment

Choose a reason for hiding this comment

I wouldn't call this compact:

These changes are harder to test:

In summary

jpkneller commented May 15, 2024

JostMigenda commented May 16, 2024

I wouldn't call this compact:

In summary

Sheshuk left a comment

Choose a reason for hiding this comment