
Change Q-cut workflow to return data per-arc #71

Merged
SimonHeybrock merged 19 commits into main from qmap-per-arc
Oct 23, 2025

Conversation

@SimonHeybrock
Member

As requested by scientists. See also scipp/esslivedata#503.

SimonHeybrock and others added 3 commits October 13, 2025 10:39
- Create TestBifrostQCutWorkflow class with two comprehensive tests
- test_cut_along_q_norm_and_energy_transfer: Tests cutting along |Q| and energy transfer
- test_cut_along_qx_direction_and_energy_transfer: Tests cutting along Qx direction
- Both tests verify:
  * Correct output dimensions and coordinates
  * Proper unit conversions
  * Count preservation (no events lost during cutting operation)
- Tests follow existing patterns in workflow_test.py
- All tests pass successfully

Original prompt: We need to add tests for BifrostQCutWorkflow. TL;DR is that we want to be able to run something like `workflow.compute(CutData[SampleRun])` and verify the result (as far as feasible). @tests/bifrost/workflow_test.py demonstrates how similar workflows are created; the Q-cut workflow is an extension.

Follow-up: Please group the tests in a class TestBifrostQCutWorkflow. Use better test names. See if you can add an assertion for the sum over the result, to check that no counts were lost.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Modified BifrostQCutWorkflow and the cut method to preserve the arc
dimension (renamed from triplet) in the output. The output is now 3-D
with dimensions (arc, axis_1, axis_2) instead of 2-D.

Changes:
- Added ArcEnergy type (NewType alias for sc.Variable) in types.py
- Added arc_energy() provider with values [2.7, 3.2, 3.8, 4.4, 5.0] meV
- Modified cut() function to:
  - Rename triplet dimension to arc
  - Concat only over non-arc dimensions
  - Add arc coordinate with proper energy values
- Updated existing tests to expect 3-D output with arc dimension
- Added new test test_cut_preserves_arc_dimension to verify arc preservation

All tests passing. Implemented using TDD (red-green cycle).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---

Original prompt: We need to modify BifrostQCutWorkflow and in particular
the cut method in @src/ess/bifrost/live.py --- the new requirement is to
preserve the arc dimension (which should have length 5). I think the input
EnergyData already has this dim, so all we might need to do is specify
which dims to concat() (instead of all dims). The new output should be 3-D,
with an arc dimension. There should be a coord with values [2.7, 3.2, 3.8,
4.4, 5.0] and unit=meV. Add a provider in the workflow instead of hardcoding
(unless there is one already). Use TDD red/green.

Follow-up: So the old code seems to use triplet instead of arc. We cannot
change this right now, but our modified cut function should rename triplet->arc.
The arc_number function is weird, not sure why it is needed (tracking down the
original author!). For now, let us just create our own ArcEnergy type (NewType
alias for sc.Variable) and set & use that.
Comment thread src/ess/bifrost/live.py Outdated
```python
``data`` projected and histogrammed along the cut axes.
``data`` projected and histogrammed along the cut axes, with arc dimension.
"""
# Rename triplet dimension to arc
```
Member Author

Greg and Rasmus refer to "arcs". I did not want to change the rest of the package, but we should consider doing so separately from this PR.

Comment thread src/ess/bifrost/live.py Outdated
```python
``data`` projected and histogrammed along the cut axes, with arc dimension.
"""
# Rename triplet dimension to arc
data = data.rename_dims(triplet='arc')
```
Member

I don't think this is correct. In the workflow, a 'triplet' is a bank. So there are 45 of them, not 5. How come the tests pass?

Member Author

I inspected this before implementing and it had length 5. I think there is just a confusion in terminology: Yes, a bank contains a triplet of tubes, and the "triplet" refers to 3 tubes within a channel. So there are 9*5 triplets, but the triplet dim has length 5 (9 is the channel)?

Member

This is a coincidence. The tests only use the first 5 banks:

```python
@pytest.fixture(scope='module')
def simulation_detector_names() -> list[NeXusDetectorName]:
    with snx.File(simulated_elastic_incoherent_with_phonon()) as f:
        names = list(f['entry']['instrument'][snx.NXdetector].keys())
    return names[:5]  # These should be enough to test the workflow.
```

You need to compute the arc number from the 'triplet' index (or from the position as in Greg's original code).

Member Author

If I use the triplet index, doesn't this rely on the (predictable?) order of groups in the NeXus file?

Member Author

Or worse, it relies on the order of what the user passes into the workflow at creation?

Member Author

Maybe I can simply use the detector_number? The arc should be detector_number // (3*900), right?

But I still don't like the unpredictable shape/ordering. Is this really necessary, or could we refactor all the Bifrost workflows to use something predictable like (arc, channel, tube, pixel), with the option to have fewer than the full arc and channel count loaded (but no free triplet selection, breaking from what is possible now)?
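The index arithmetic from the comment above can be sketched in plain Python. This is only an illustration of the proposal, not the package's implementation: the pixel counts (3 tubes × 900 pixels per triplet) and the 0-based, consecutive detector numbering are assumptions taken from this thread, and `triplet_of` is a hypothetical helper.

```python
# Minimal sketch of the arithmetic discussed above. Assumptions: each
# triplet has 3 tubes x 900 pixels, and detector numbers are 0-based and
# run consecutively through the triplets.
PIXELS_PER_TRIPLET = 3 * 900  # 2700 pixels in one triplet

def triplet_of(detector_number: int) -> int:
    # Which triplet a pixel belongs to. Mapping this on to an arc
    # (detector_number // (3*900) in the comment above) additionally
    # assumes a specific ordering of triplets in the numbering scheme,
    # which is exactly the concern raised in this thread.
    return detector_number // PIXELS_PER_TRIPLET

print(triplet_of(0), triplet_of(2699), triplet_of(2700))  # 0 0 1
```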

Member

How does the workflow know how to order and shape the array? It has to ultimately rely on one of the mechanisms you described.

Member Author

Well, instead of accepting a list of detector names the package could define this. We thus have control over the order, and know how to concat the banks into the (arc, channel) subspace.

Member

That would require hard-coding the names of the detector groups. That seems too fragile given that they have already changed.

I think the only robust implementation is one that loads the detector positions or analyzer positions or d-spacings and groups by those. Similarly to how Greg's code works. But maybe we can do this without computing final energies; grouping in L2 or analyzer d-spacing should be enough.
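That grouping idea can be sketched independently of scipp, in plain Python. All names and reference distances below are made up for illustration (`ARC_L2`, `arc_of`, and the bank values are hypothetical); a real implementation would read the positions or analyzer d-spacings from the NeXus file and group on those.

```python
# Hypothetical sketch: assign each detector bank to an arc by the nearest
# reference analyzer distance, instead of relying on group names or load
# order. The reference distances below are invented for illustration.
ARC_L2 = [1.10, 1.19, 1.30, 1.42, 1.56]  # one reference distance per arc (m)

def arc_of(analyzer_distance: float) -> int:
    """Index of the arc whose reference distance is closest."""
    return min(range(len(ARC_L2)), key=lambda i: abs(ARC_L2[i] - analyzer_distance))

# Banks can arrive in arbitrary order; the grouping is robust to that.
banks = {'bank_a': 1.31, 'bank_b': 1.09, 'bank_c': 1.55}
arcs = {name: arc_of(d) for name, d in banks.items()}
```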

Comment thread src/ess/bifrost/live.py
```python
    The final energies for each of the 5 BIFROST analyzer arcs.
    """
    return ArcEnergy(
        sc.array(dims=['arc'], values=[2.7, 3.2, 3.8, 4.4, 5.0], unit='meV')
    )
```
Member

Instead of relying on low-precision, hard-coded numbers, can we use the computed final energies?

Member Author

I don't know. Is this available here?

Member Author

Note that the purpose of this is as a label for plots and UI.

Member

I don't think coords are the right place for storing pure plotting labels because they participate in computations.
I would have expected to store the arc index as a coord and then build a plot based on that and a table of energies if they are too difficult to compute. You could just add 'final_energy' to `add_inelastic_coordinates`.

Member Author

But aren't all the energies fixed? If there is calibration, wouldn't it differ for every channel (whereas here we want an energy coord per-arc)?

@SimonHeybrock
Member Author

@jl-wynen Updated as discussed, does this make sense?

Comment thread src/ess/bifrost/detector.py Outdated
```python
# to determine arc and channel indices
detector_numbers = []
for triplet in triplets:
    if 'detector_number' not in triplet.coords:
```
Member

When does this happen? I thought we always have a detector number.

Member

Before looking at it in detail, can't you assign an arc number in the same place where we fold the detector into (tube, length)? Then you wouldn't have to loop through detectors multiple times here.

Member Author

Good point, I'll try!

SimonHeybrock and others added 4 commits October 17, 2025 09:54
…ated_detector_bifrost

- Add scalar 'arc' and 'channel' coordinates in get_calibrated_detector_bifrost
  using sc.index() to avoid unnecessary units
- Simplify merge_triplets to read pre-computed arc/channel coordinates instead
  of recalculating them from detector_number
- Remove ~50 lines of redundant arc/channel calculation logic
- Update test to allow for new scalar coordinates in comparison

Original prompt: Can we simplify `merge_triplets` by assigning a scalar arc and
channel coord in get_calibrated_detector_bifrost?

Follow-up: Use sc.index instead of sc.scalar to avoid units, and don't specify
dtype since the data is small.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Comment thread src/ess/bifrost/detector.py Outdated
Comment on lines +154 to +155
```python
arcs = [pair[0] for pair in sorted_pairs]
channels = [pair[1] for pair in sorted_pairs]
```
Member

You can directly construct sets here to avoid the intermediate lists.

Comment thread src/ess/bifrost/detector.py Outdated
Comment on lines +162 to +167
```python
if len(sorted_triplets) == expected_count:
    # Verify that all (arc, channel) combinations are present
    expected_pairs = [
        (arc, channel) for arc in unique_arcs for channel in unique_channels
    ]
    if sorted_pairs == expected_pairs:
```
Member

Suggested change
```diff
-if len(sorted_triplets) == expected_count:
-    # Verify that all (arc, channel) combinations are present
-    expected_pairs = [
-        (arc, channel) for arc in unique_arcs for channel in unique_channels
-    ]
-    if sorted_pairs == expected_pairs:
+expected_pairs = [
+    (arc, channel) for arc in unique_arcs for channel in unique_channels
+]
+if sorted_pairs == expected_pairs:
```

Comment thread src/ess/bifrost/detector.py Outdated
```python
    return folded.squeeze()

# Fall back to simple concatenation if not a regular grid
return sc.concat(triplets, dim="triplet")
```
Member

Can you also use `sorted_triplets` here to be more consistent? You can just construct `concatenated` before the if and return it here.

Comment thread src/ess/bifrost/detector.py Outdated
```python
    sizes={'arc': len(unique_arcs), 'channel': len(unique_channels)},
)
# Remove size-1 dimensions (e.g., if only one channel is present)
return folded.squeeze()
```
Member

This doesn't match the docstring, which claims that both dims are always present. And I think that is what we should return here, so that downstream code doesn't have to support every possible combination of dims but only (arc, channel, ...) and (triplet, ...).

Comment thread src/ess/bifrost/live.py

```python
# Add arc coordinate if we're working with arc dimension
# Note that this is for identifying the arc, it should NOT be used
# as a replacement for a precise final_energy coordinate!
```
Member

There is no way people will see this comment when using the code. But I also don't know an alternative given that coords encode both compute and display data.

Comment thread tests/bifrost/workflow_test.py Outdated
```python
    # After slicing, rename arc -> triplet for comparison
    expected = expected.rename_dims(triplet='arc')
elif 'arc' in energy_data.dims:
    expected = expected.rename_dims(triplet='arc')
```
Member

Can you just make a new reference instead of writing a complicated test?

Comment thread tests/bifrost/workflow_test.py Outdated
```python
sc.testing.assert_allclose(energy_data.bins.data, expected.bins.data)


class TestBifrostQCutWorkflow:
```
Member

Please move to a separate file to match the file structure of the package and make it easier to find tests.

Comment thread tests/bifrost/workflow_test.py Outdated

```python
# Compute both cut data and energy data to compare total counts
cut_data = qcut_workflow.compute(CutData[SampleRun])
energy_data = qcut_workflow.compute(EnergyData[SampleRun])
```
Member

This is not how we normally test workflows. We just check the final result. If you compare two steps like this, you effectively test a single provider, so it should be a provider test rather than a workflow test.

Comment thread tests/bifrost/workflow_test.py Outdated
```python
        total_counts_after = sc.sum(cut_data).value
        assert total_counts_before == total_counts_after

    def test_cut_preserves_arc_dimension(self, qcut_workflow: sciline.Pipeline) -> None:
```
Member

Redundant? The other tests already check the arc dim.

Avoid intermediate lists when extracting unique arcs and channels
by using set comprehension directly.

User request: Can you have a look at the comments in #71 (review) (only the latest review from today). Please address each comment, one per commit.
Remove redundant expected_count check and just compare the
sorted pairs directly with expected pairs.

User request: Can you have a look at the comments in #71 (review) (only the latest review from today). Please address each comment, one per commit.
Construct concatenated before the if statement and return it
in both the regular grid and fallback cases to ensure
consistent ordering.

User request: Can you have a look at the comments in #71 (review) (only the latest review from today). Please address each comment, one per commit.
Remove squeeze() to ensure the output always has either
(arc, channel) dimensions or (triplet) dimension, never
a mix like just (arc) or just (channel). This ensures
downstream code only needs to handle two cases.

User request: Can you have a look at the comments in #71 (review) (only the latest review from today). Please address each comment, one per commit.
Instead of complex logic to handle dimension transitions, create
a new reference file that matches the current output format with
arc and channel dimensions (5x2 grid).

The reference file needs to be uploaded to:
https://public.esss.dk/groups/scipp/ess/bifrost/3/computed_energy_data_simulated_5x2.h5

User request: Can you have a look at the comments in #71 (review) (only the latest review from today). Please address each comment, one per commit.
Move TestBifrostQCutWorkflow from workflow_test.py to live_test.py
to match the package structure (src/ess/bifrost/live.py).

User request: Can you have a look at the comments in #71 (review) (only the latest review from today). Please address each comment, one per commit.
Create cutting_test.py for provider tests that verify the cut
function behavior (e.g., count preservation). Keep only workflow
output verification in live_test.py.

User request: Can you have a look at the comments in #71 (review) (only the latest review from today). Please address each comment, one per commit.
Merge arc coordinate value checks into the first test to avoid
redundancy. The arc dimension presence and values are already
verified in test_cut_along_q_norm_and_energy_transfer.

User request: Can you have a look at the comments in #71 (review) (only the latest review from today). Please address each comment, one per commit.
@SimonHeybrock SimonHeybrock merged commit f3fc16c into main Oct 23, 2025
4 checks passed
@SimonHeybrock SimonHeybrock deleted the qmap-per-arc branch October 23, 2025 07:00