Change Q-cut workflow to return data per-arc #71
Conversation
- Create TestBifrostQCutWorkflow class with two comprehensive tests:
  - test_cut_along_q_norm_and_energy_transfer: tests cutting along |Q| and energy transfer
  - test_cut_along_qx_direction_and_energy_transfer: tests cutting along Qx direction
- Both tests verify:
  - correct output dimensions and coordinates
  - proper unit conversions
  - count preservation (no events lost during the cutting operation)
- Tests follow existing patterns in workflow_test.py
- All tests pass successfully

Original prompt: We need to add tests for BifrostQCutWorkflow. TL;DR is that we want to be able to run something like `workflow.compute(CutData[SampleRun])` and verify the result (as far as feasible). @tests/bifrost/workflow_test.py demonstrates how similar workflows are created; the Q-cut workflow is an extension.

Follow-up: Please group the tests in a class TestBifrostQCutWorkflow. Use better test names. Try if you can add an assertion for the sum over the result, to see if no counts were lost.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Modified BifrostQCutWorkflow and the cut method to preserve the arc dimension (renamed from triplet) in the output. The output is now 3-D with dimensions (arc, axis_1, axis_2) instead of 2-D.

Changes:
- Added ArcEnergy type (NewType alias for sc.Variable) in types.py
- Added arc_energy() provider with values [2.7, 3.2, 3.8, 4.4, 5.0] meV
- Modified cut() function to:
  - rename the triplet dimension to arc
  - concat only over non-arc dimensions
  - add an arc coordinate with proper energy values
- Updated existing tests to expect 3-D output with arc dimension
- Added new test test_cut_preserves_arc_dimension to verify arc preservation

All tests passing. Implemented using TDD (red-green cycle).

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

---

Original prompt: We need to modify BifrostQCutWorkflow and in particular the cut method in @src/ess/bifrost/live.py --- the new requirement is to preserve the arc dimension (which should have length 5). I think the input EnergyData already has this dim, so all we might need to do is specify which dims to concat() (instead of all dims). The new output should be 3-D, with an arc dimension. There should be a coord with values [2.7, 3.2, 3.8, 4.4, 5.0] and unit=meV. Add a provider in the workflow instead of hardcoding (unless there is one already). Use TDD red/green.

Follow-up: So the old code seems to use triplet instead of arc. We cannot change this right now, but our modified cut function should rename triplet->arc. The arc_number function is weird, not sure why it is needed (tracking down the original author!). For now, let us just create our own ArcEnergy type (NewType alias for sc.Variable) and set & use that.
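A minimal sketch of the described cut() change, assuming binned event data with a 'triplet' dim; the projection/histogram step and the real signature are elided, and all names besides rename_dims/bins.concat are illustrative:

```python
import scipp as sc

def cut(data: sc.DataArray, arc_energy: sc.Variable) -> sc.DataArray:
    # Preserve the arc dimension: rename triplet -> arc, then merge
    # events only over the remaining (non-arc) dims.
    data = data.rename_dims(triplet='arc')
    for dim in [d for d in data.dims if d != 'arc']:
        data = data.bins.concat(dim)
    # Attach the per-arc nominal final energies as a label coord.
    return data.assign_coords(arc=arc_energy)
```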
|         ``data`` projected and histogrammed along the cut axes.
|         ``data`` projected and histogrammed along the cut axes, with arc dimension.
|     """
|     # Rename triplet dimension to arc
Greg and Rasmus refer to "arcs". I did not want to change the rest of the package, but we should consider doing so separately from this PR.
|         ``data`` projected and histogrammed along the cut axes, with arc dimension.
|     """
|     # Rename triplet dimension to arc
|     data = data.rename_dims(triplet='arc')
I don't think this is correct. In the workflow, a 'triplet' is a bank. So there are 45 of them, not 5. How come the tests pass?
I inspected this before implementing and it had length 5. I think there is just a confusion in terminology: Yes, a bank contains a triplet of tubes, and the "triplet" refers to 3 tubes within a channel. So there are 9*5 triplets, but the triplet dim has length 5 (9 is the channel)?
This is a coincidence. The tests only use the first 5 banks:
```python
@pytest.fixture(scope='module')
def simulation_detector_names() -> list[NeXusDetectorName]:
    with snx.File(simulated_elastic_incoherent_with_phonon()) as f:
        names = list(f['entry']['instrument'][snx.NXdetector].keys())
    return names[:5]  # These should be enough to test the workflow.
```
You need to compute the arc number from the 'triplet' index (or from the position as in Greg's original code).
If I use the triplet index, this relies on the (predictable???) order of groups in the NeXus file?
Or worse, it relies on the order of what the user passes into the workflow at creation?
Maybe I can simply use the detector_number? The arc should be detector_number // (3*900), right?
But I still don't like the unpredictable shape/ordering. Is this really necessary, or could we refactor all the Bifrost workflows to use something predictable like (arc, channel, tube, pixel), with the option to have fewer than the full arc and channel count loaded (but no free triplet selection, breaking from what is possible now)?
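A rough sketch of that arithmetic, with loudly-flagged assumptions: 0-based contiguous detector numbers, 3 tubes of 900 pixels per triplet, and arc-major ordering with 9 channels per arc (none of which is verified in this thread):

```python
import scipp as sc

# Toy stand-in for the real detector_number coord.
detector_number = sc.arange('detector', 0, 45 * 3 * 900)
pixels_per_triplet = 3 * 900  # assumption from the comment above
triplet = detector_number // pixels_per_triplet  # 0..44
# Assuming arc-major ordering; the real numbering scheme may differ.
arc = triplet // 9
channel = triplet - arc * 9
```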
How does the workflow know how to order and shape the array? It has to ultimately rely on one of the mechanisms you described.
Well, instead of accepting a list of detector names the package could define this. We thus have control over the order, and know how to concat the banks into the (arc, channel) subspace.
That would require hard-coding the names of the detector groups. That seems too fragile given that they have already changed.
I think the only robust implementation is one that loads the detector positions or analyzer positions or d-spacings and groups by those. Similarly to how Greg's code works. But maybe we can do this without computing final energies; grouping in L2 or analyzer d-spacing should be enough.
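To make that concrete, a hedged sketch of grouping banks by a per-bank physical quantity; the coord name 'analyzer_distance' is hypothetical, and real code might group by analyzer d-spacing instead:

```python
import scipp as sc

def arc_indices(banks: list[sc.DataArray]) -> list[int]:
    # One representative distance per bank; rounding gives a crude
    # tolerance when grouping floating-point values.
    distances = [round(bank.coords['analyzer_distance'].value, 3) for bank in banks]
    arcs = sorted(set(distances))  # one distinct value per arc
    return [arcs.index(d) for d in distances]
```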
|     The final energies for each of the 5 BIFROST analyzer arcs.
|     """
|     return ArcEnergy(
|         sc.array(dims=['arc'], values=[2.7, 3.2, 3.8, 4.4, 5.0], unit='meV')
Instead of relying on low-precision, hard-coded numbers, can we use the computed final energies?
I don't know. Is this available here?
Note that the purpose of this is as a label for plots and UI.
I don't think coords are the right place for storing pure plotting labels because they participate in computations.
I would have expected the arc index to be stored as a coord, with the plot then built from that plus a table of energies, if the energies are too difficult to compute. You could just add 'final_energy' to
But aren't all the energies fixed? If there is calibration, wouldn't it differ for every channel (whereas here we want an energy coord per-arc)?
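For reference, a self-contained sketch of the provider under discussion, reconstructed from the diff and commit message above (the values are nominal labels, per the preceding comments):

```python
from typing import NewType

import scipp as sc

# NewType alias for sc.Variable, as introduced in this PR.
ArcEnergy = NewType('ArcEnergy', sc.Variable)

def arc_energy() -> ArcEnergy:
    """The final energies for each of the 5 BIFROST analyzer arcs."""
    return ArcEnergy(
        sc.array(dims=['arc'], values=[2.7, 3.2, 3.8, 4.4, 5.0], unit='meV')
    )
```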
@jl-wynen Updated as discussed, does this make sense?
|     # to determine arc and channel indices
|     detector_numbers = []
|     for triplet in triplets:
|         if 'detector_number' not in triplet.coords:
When does this happen? I thought we always have a detector number.
Before looking at it in detail, can't you assign an arc number in the same place where we fold the detector into (tube, length)? Then you wouldn't have to loop through detectors multiple times here.
Good point, I'll try!
…ated_detector_bifrost

- Add scalar 'arc' and 'channel' coordinates in get_calibrated_detector_bifrost using sc.index() to avoid unnecessary units
- Simplify merge_triplets to read pre-computed arc/channel coordinates instead of recalculating them from detector_number
- Remove ~50 lines of redundant arc/channel calculation logic
- Update test to allow for new scalar coordinates in comparison

Original prompt: Can we simplify `merge_triplets` by assigning a scalar arc and channel coord in get_calibrated_detector_bifrost?

Follow-up: Use sc.index instead of sc.scalar to avoid units, and don't specify dtype since the data is small.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
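A sketch of what that assignment might look like; the function name and call site are illustrative, only the sc.index() usage is the point:

```python
import scipp as sc

def with_arc_and_channel(detector: sc.DataArray, arc: int, channel: int) -> sc.DataArray:
    # sc.index() creates a unit-less scalar, unlike sc.scalar(), which
    # attaches a dimensionless unit by default.
    return detector.assign_coords(arc=sc.index(arc), channel=sc.index(channel))
```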
|     arcs = [pair[0] for pair in sorted_pairs]
|     channels = [pair[1] for pair in sorted_pairs]
You can directly construct sets here to avoid the intermediate lists.
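That is, roughly (wrapped in sorted() wherever iteration order matters):

```python
arcs = {pair[0] for pair in sorted_pairs}
channels = {pair[1] for pair in sorted_pairs}
```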
|     if len(sorted_triplets) == expected_count:
|         # Verify that all (arc, channel) combinations are present
|         expected_pairs = [
|             (arc, channel) for arc in unique_arcs for channel in unique_channels
|         ]
|         if sorted_pairs == expected_pairs:
Suggested change:
-    if len(sorted_triplets) == expected_count:
-        # Verify that all (arc, channel) combinations are present
-        expected_pairs = [
-            (arc, channel) for arc in unique_arcs for channel in unique_channels
-        ]
-        if sorted_pairs == expected_pairs:
+    expected_pairs = [
+        (arc, channel) for arc in unique_arcs for channel in unique_channels
+    ]
+    if sorted_pairs == expected_pairs:
|         return folded.squeeze()
|
|     # Fall back to simple concatenation if not a regular grid
|     return sc.concat(triplets, dim="triplet")
Can you also use sorted_triplets here to be more consistent? You can just construct concatenated before the if and return it here.
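A sketch of the suggested restructuring; variable names follow the quoted snippets, surrounding context is assumed:

```python
concatenated = sc.concat(sorted_triplets, dim='triplet')
if sorted_pairs == expected_pairs:
    # Regular grid: fold the triplet dim into (arc, channel).
    return concatenated.fold(
        dim='triplet',
        sizes={'arc': len(unique_arcs), 'channel': len(unique_channels)},
    )
# Fall back to simple concatenation if not a regular grid.
return concatenated
```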
|             sizes={'arc': len(unique_arcs), 'channel': len(unique_channels)},
|         )
|         # Remove size-1 dimensions (e.g., if only one channel is present)
|         return folded.squeeze()
This doesn't match the docstring, which claims that both dims are always present. And I think that is what we should return here, so that downstream code doesn't have to support every possible combination of dims, only (arc, channel, ...) and (triplet, ...).
|     # Add arc coordinate if we're working with arc dimension
|     # Note that this is for identifying the arc, it should NOT be used
|     # as a replacement for a precise final_energy coordinate!
There is no way people will see this comment when using the code. But I also don't know an alternative given that coords encode both compute and display data.
|         # After slicing, rename arc -> triplet for comparison
|         expected = expected.rename_dims(triplet='arc')
|     elif 'arc' in energy_data.dims:
|         expected = expected.rename_dims(triplet='arc')
Can you just make a new reference instead of writing a complicated test?
|     sc.testing.assert_allclose(energy_data.bins.data, expected.bins.data)

| class TestBifrostQCutWorkflow:
Please move to a separate file to match the file structure of the package and make it easier to find tests.
|         # Compute both cut data and energy data to compare total counts
|         cut_data = qcut_workflow.compute(CutData[SampleRun])
|         energy_data = qcut_workflow.compute(EnergyData[SampleRun])
This is not how we normally test workflows. We just check the final result. If you compare two steps like this, you effectively test a single provider, so it should be a provider test, not a workflow test.
|         total_counts_after = sc.sum(cut_data).value
|         assert total_counts_before == total_counts_after
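The provider-level style suggested above, illustrated with a self-contained toy using scipp's example data (the real cut provider's signature is not shown in this thread, so it is not called here):

```python
import math

import scipp as sc

def test_histogramming_preserves_total_counts() -> None:
    # Toy binned data standing in for EnergyData; histogramming along x
    # mirrors the count-preservation assertion quoted above.
    data = sc.data.binned_x(nevent=1000, nbin=7)
    before = data.bins.sum().sum().value
    hist = data.hist(x=13)
    assert math.isclose(sc.sum(hist).value, before, rel_tol=1e-6)
```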
|     def test_cut_preserves_arc_dimension(self, qcut_workflow: sciline.Pipeline) -> None:
Redundant? The other tests already check the arc dim.
Commits addressing the review:
- Avoid intermediate lists when extracting unique arcs and channels by using set comprehensions directly.
- Remove the redundant expected_count check and just compare the sorted pairs directly with the expected pairs.
- Construct concatenated before the if statement and return it in both the regular-grid and fallback cases to ensure consistent ordering.
- Remove squeeze() to ensure the output always has either (arc, channel) dimensions or the (triplet) dimension, never a mix like just (arc) or just (channel). This ensures downstream code only needs to handle two cases.
- Instead of complex logic to handle dimension transitions, create a new reference file that matches the current output format with arc and channel dimensions (5x2 grid). The reference file needs to be uploaded to https://public.esss.dk/groups/scipp/ess/bifrost/3/computed_energy_data_simulated_5x2.h5.
- Move TestBifrostQCutWorkflow from workflow_test.py to live_test.py to match the package structure (src/ess/bifrost/live.py).
- Create cutting_test.py for provider tests that verify the cut function behavior (e.g., count preservation). Keep only workflow output verification in live_test.py.
- Merge arc coordinate value checks into the first test to avoid redundancy. The arc dimension presence and values are already verified in test_cut_along_q_norm_and_energy_transfer.

User request (repeated for each commit above): Can you have a look at the comments in #71 (review) (only the latest review from today). Please address each comment, one per commit.
As requested by scientists. See also scipp/esslivedata#503.