Support 2d sources and sources away from the origin #106
Conversation
We keep the `pulse_profile.py` around for backward compatibility (the `ess_pulse` submodule needs it, and it is needed in some places in `essreduce`). But in principle, we don't need it anymore.
```diff
-        mask = ~one_mask(self.data.masks)
-        mask.unit = ""
-        return (self.data.coords[self.dim] * mask).min()
+        da = self.data.copy(deep=False)
```
Fixed a bug here: the min value was always returning 0, because anything multiplied by a `False` mask value gives 0, and 0 then becomes the min. Instead, the masked values need to be ignored.
```diff
         if wmax is None:
             wmax = p_wav.coords[w_dim].max()

-        time_interpolator = interp1d(p_time, dim=t_dim, fill_value="extrapolate")
```
Removed the internal linear interpolation and the somewhat unusual sampling arg.
We now require users to provide a distribution with enough data points.
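For reference, the removed behaviour amounted to linear interpolation between the profile's data points; roughly the following numpy analogue (all values are made up):

```python
import numpy as np

# A coarse 3-point time profile, as users could previously supply.
t_coarse = np.array([0.0, 1000.0, 2000.0])  # e.g. microseconds
p_coarse = np.array([0.0, 1.0, 0.0])

# The old code effectively resampled it onto a finer grid:
t_fine = np.linspace(0.0, 2000.0, 501)
p_fine = np.interp(t_fine, t_coarse, p_coarse)  # linear interpolation
```

Users should now pass a finely sampled profile directly instead of relying on this resampling step.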
Reviewed with Claude:

## Review of PR #106: Support 2D Sources and Sources Away from Origin

**Reviewer:** Claude

### Summary

This PR adds important scientific capability: 2D probability distributions for neutron sources and support for sources positioned away from the origin. The implementation is well-designed with solid mathematics and good test coverage for the main use cases.

**Key Changes:**
**Overall Assessment:** The core feature is sound and ready, but there are several edge cases that should be addressed before merging.

### Issues to Address

#### High Priority

**1. Division by Zero with Single-Point Distributions**

Location:

```python
dt = 0.5 * (tmax - tmin).value / (p.sizes[t_dim] - 1)
dw = 0.5 * (wmax - wmin).value / (p.sizes[w_dim] - 1)
```

If a distribution has only 1 point in a dimension, this divides by zero, producing NaN. The rejection sampling loop then runs indefinitely. Test case:

```python
p_time = sc.DataArray(
    data=sc.array(dims=['birth_time'], values=[1.0]),
    coords={'birth_time': sc.array(dims=['birth_time'], values=[1000.0], unit='us')}
)
# Result: dt = nan, infinite loop
```

Suggested fix:

```python
if p.sizes[t_dim] < 2 or p.sizes[w_dim] < 2:
    raise ValueError(
        f"Distribution must have at least 2 points in each dimension. "
        f"Got {p.sizes[t_dim]} in {t_dim}, {p.sizes[w_dim]} in {w_dim}"
    )
```

**2. IndexError with Empty Components List**

Location:

```python
components = sorted(...)
if components[0].distance < self.source.distance:  # IndexError if empty
    raise ValueError(...)
```

Creating a model with no choppers or detectors causes an IndexError. Test case:

```python
model = tof.Model(source=source, choppers=[], detectors=[])
result = model.run()  # IndexError: list index out of range
```

Suggested fix:

```python
if components and components[0].distance < self.source.distance:
    raise ValueError(...)
```

Or validate in the constructor:

```python
if not self.choppers and not self.detectors:
    raise ValueError("Model must have at least one chopper or detector")
```

**3. Uninitialized Distance Attribute**

Location: constructing a `Source` directly with `facility=None` can leave the distance attribute uninitialized. Suggested fix:

```python
if self._facility is None:
    # Should only be created via class methods
    if not hasattr(self, '_distance'):
        raise ValueError(
            "Cannot create Source with facility=None directly. "
            "Use Source.from_neutrons() or Source.from_distribution()"
        )
```

#### Medium Priority

**4. Zero or Negative Wavelengths**

No validation prevents wavelength ≤ 0, which produces infinite speed via the physics formula v = h/(m·λ). This doesn't crash but gives physically incorrect results. Test case:

```python
wavelengths = sc.array(dims=['event'], values=[0.0, 5.0], unit='angstrom')
source = Source.from_neutrons(birth_times=times, wavelengths=wavelengths)
# Result: speed = [inf, 791.207] m/s
```

Suggested fix: Add validation in both `from_neutrons` and `_make_pulses`:

```python
if sc.any(wavelengths <= sc.scalar(0.0, unit=wavelengths.unit)):
    raise ValueError("All wavelengths must be positive")
```

**5. Zero Distribution Values**

If a user provides an all-zero distribution (or negative values), the normalization fails with an unhelpful error. Suggested fix: Add a helpful error before normalization:

```python
prob_sum = p_flat.data.sum()
if prob_sum <= 0:
    raise ValueError(
        "Distribution must have at least one positive probability value. "
        f"Sum of probabilities is {prob_sum.value}"
    )
```

**6. Error Message Clarity**

The error message says "At least one component is located before the source", but the check only looks at the first component. A clearer version:

```python
if components and components[0].distance < self.source.distance:
    raise ValueError(
        f"The first component ({components[0].name} at {components[0].distance}) "
        f"is located before the source (at {self.source.distance})"
    )
```

#### Low Priority / Polish

**7. Network Dependency**

The new pooch-based data fetching introduces a network dependency:

Suggestions:

**8. Better Error for Unknown Source**

Location:

```python
def get_source_path(name: str) -> str:
    return _source_registry.fetch(_source_library[name]["path"])  # KeyError
```

Suggested improvement:

```python
def get_source_path(name: str) -> str:
    if name not in _source_library:
        available = ', '.join(_source_library.keys())
        raise ValueError(
            f"Unknown source '{name}'. Available sources: {available}"
        )
    return _source_registry.fetch(_source_library[name]["path"])
```

**9. Fully Masked Data in Reading**

If all neutrons are blocked by choppers (fully masked data), `min` has no unmasked values left to work with. Suggested fix:

```python
def min(self):
    if self.data.sum().value == 0:
        return sc.scalar(float('nan'), unit=self.data.coords[self.dim].unit)
    da = self.data.copy(deep=False)
    da.data = da.coords[self.dim]
    return da.min().data
```

### Verified Correct ✓

I did deep mathematical verification on the core algorithms:

The physics is correct, the math is solid, and the implementation is clean.

### Code Quality Observations

Strengths:

Suggestions:

### Test Coverage Gaps

Missing tests for edge cases:

Well covered:

### Documentation

The documentation updates are excellent:

Minor suggestion: Add explanation of the difference between

### Recommendation

**APPROVE after addressing high-priority issues**

The 2D source feature is well-designed and the core implementation is mathematically sound. The main concerns are edge cases that could cause crashes or confusing errors. I'd recommend:

Before merge:

Can be follow-up:

These are all straightforward fixes that will make the feature more robust without changing the core design.

### Summary of Changes Needed

```python
# In src/tof/source.py:_make_pulses, after line 78:
if p.sizes[t_dim] < 2 or p.sizes[w_dim] < 2:
    raise ValueError(...)

# In src/tof/source.py:_make_pulses, after line 111:
if p_flat.data.sum() <= 0:
    raise ValueError(...)

# In src/tof/source.py:from_neutrons and _make_pulses result:
if sc.any(wavelength <= 0):
    raise ValueError("All wavelengths must be positive")

# In src/tof/model.py:276:
if components and components[0].distance < self.source.distance:
    raise ValueError(...)

# In src/tof/source.py:__init__, at end:
if self._facility is None and self._data is None:
    raise ValueError(...)
```

All of these are quick validation additions that will prevent confusing errors.
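The review refers to a rejection sampling loop. As background, here is a minimal standalone sketch of rejection sampling from a 2D probability table (illustrative only; the names, shapes, and grid handling are assumptions, not the actual `tof` implementation):

```python
import numpy as np

rng = np.random.default_rng(1234)

def rejection_sample_2d(p, t, w, n):
    """Draw n (time, wavelength) pairs from a 2D probability table p
    defined on grid points t and w."""
    p = np.asarray(p, dtype=float)
    pmax = p.max()
    if pmax <= 0:
        # Guard against all-zero tables, mirroring issue 5 above:
        # without it the accept condition below never fires.
        raise ValueError("Distribution must have at least one positive value")
    times, wavs = [], []
    while len(times) < n:
        it = rng.integers(0, len(t))            # candidate time index
        iw = rng.integers(0, len(w))            # candidate wavelength index
        if rng.uniform(0.0, pmax) < p[it, iw]:  # accept with probability p/pmax
            times.append(t[it])
            wavs.append(w[iw])
    return np.array(times), np.array(wavs)
```

With a degenerate table (all zeros, or NaN thresholds from a single-point grid) the loop would otherwise spin forever, which is exactly the failure mode the high-priority issues describe.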
```python
    unit=WAV_UNIT,
).fold(dim=dim, sizes={"pulse": pulses, dim: neutrons})
wavelengths = np.concatenate(wavs)
if np.any(wavelengths < 0):
```
I won't fix this one. The same goes for the frequency.
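As a sanity check on the numbers quoted in the review, the wavelength-to-speed conversion v = h/(m·λ) can be reproduced in a few lines (standard physical constants; the function name is made up):

```python
import numpy as np

H = 6.62607015e-34       # Planck constant, J*s
M_N = 1.67492749804e-27  # neutron mass, kg

def neutron_speed(wavelength_angstrom):
    """Neutron speed in m/s via v = h / (m * lambda)."""
    lam = np.asarray(wavelength_angstrom, dtype=float) * 1e-10  # angstrom -> m
    with np.errstate(divide="ignore"):
        return H / (M_N * lam)

speeds = neutron_speed([0.0, 5.0])  # zero wavelength gives infinite speed
```

A 5 angstrom neutron comes out at roughly 791.2 m/s, matching the review's example, while λ = 0 yields inf, which is why a positivity check is worthwhile.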
In this PR, we add support for sources that have a 2d probability distribution.

Previously, a source was defined by a 1d time distribution and a 1d wavelength distribution.
This meant that intrinsically the distribution was square in the (time, wavelength) parameter space.
We now generalize this to a 2d `p` distribution that can represent sources such as the one for Odin, taken some meters away from the moderator.

This last parameter of the source (2.5 m away from the origin) meant that we also needed to modify slightly how the model works to handle sources that are not located at 0 m.
A library of sources was created and the data is saved in https://github.com/scipp/tof-sources.
Note that the old 1D `p_time` and `p_wav` still work; a 2d broadcast of the profiles is performed internally.

We also removed a behind-the-scenes linear interpolation between the data points given by the source profiles, which was getting expensive when working in 2d. We now require users to supply a profile with enough resolution to satisfy their needs (note the changes in the docs, where `linspace` was added instead of using only 3 data points in the profiles).
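A rough numpy sketch of both points above: a finely sampled profile built with `linspace`, and the "square" 2d distribution that a broadcast of two 1d profiles produces (all numbers and pulse shapes are made up):

```python
import numpy as np

# Finely sampled 1d profiles instead of a handful of points.
t = np.linspace(0.0, 5000.0, 200)  # birth time, e.g. microseconds
w = np.linspace(0.5, 10.0, 150)    # wavelength, e.g. angstrom
p_time = np.exp(-((t - 2000.0) / 800.0) ** 2)
p_wav = np.exp(-((w - 4.0) / 2.0) ** 2)

# Broadcasting the 1d profiles yields a separable ("square") 2d
# distribution in (time, wavelength) space; a genuinely 2d source,
# like the Odin one, cannot be factorized this way.
p_2d = p_time[:, None] * p_wav[None, :]
```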