Rechunk derived #6516
Codecov Report
@@           Coverage Diff           @@
##             main    #6516    +/-  ##
========================================
+ Coverage   89.80%   89.87%   +0.07%
========================================
  Files          90       90
  Lines       23752    23927    +175
  Branches     4418     4463     +45
========================================
+ Hits        21331    21505    +174
+ Misses       1672     1670      -2
- Partials      749      752      +3
Update 2025-06-18: Thanks @trexfeathers @stephenworsley for offline discussions on this. I'm happy that the tests now give full code coverage, so I'm marking this ready for review.
First pass through, overall looks good. For now, just a question about the test coverage.
lazy_deps = [
    # Note: no attempt to make clever chunking choices here. If needed it
    # should get fixed later. Plus, single chunks keeps graph overhead small.
    dep if is_lazy_data(dep) else da.from_array(dep, chunks=-1)
It looks like this only guarantees a single chunk if the data was initially non-lazy. From what I can tell of the tests, it seems like you're only testing the case where a single chunk is given. I think it would be worth making sure there is testing for the case where lazy_deps contains chunked arrays.
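For reference, a quick standalone check of that line's behaviour (with `is_lazy_data` approximated here by a simple isinstance test, not the Iris helper itself) shows that an already-chunked lazy dep passes through with its chunking untouched:

```python
import dask.array as da
import numpy as np


def is_lazy_data(data):
    # Stand-in for the Iris helper; taken here to mean "is a dask array".
    return isinstance(data, da.Array)


real_dep = np.ones((70, 1000))                       # real (numpy) dependency
chunked_dep = da.ones((70, 1000), chunks=(10, 500))  # already-lazy, multi-chunk

lazy_deps = [
    dep if is_lazy_data(dep) else da.from_array(dep, chunks=-1)
    for dep in (real_dep, chunked_dep)
]
print(lazy_deps[0].chunks)  # ((70,), (1000,))         -- wrapped as one chunk
print(lazy_deps[1].chunks)  # ((10,) * 7, (500, 500))  -- existing chunks kept
```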
OK, I think the comment is really the problem here:
this "single chunks" statement really only applies to the real arrays which we wrap as lazy.
I will try and fix this ...
Background:
The initial calculation is supposed to produce a result that we can simply use, if its chunksize is OK, but we need it to be definitely lazy so that we can pre-check the chunksize before committing to do the calculation.
So we need to ensure that the initial 'test' calculation is lazy.
I did consider ensuring that just the first, or smallest, term was lazy, but I realised that in the calculation dask itself would then wrap any other real terms, using "auto" chunking by default, which is probably sub-optimal for our purposes.
If we were making our best single effort at producing a usable result array, we might logically use our "optimal_chunksize" scheme here when wrapping the real terms.
But in fact that is not a good approach, because the whole point is that you need to consider the terms (and especially their chunking) in alignment with all dimensions of the calculated result, not just in their own individual dimensions. That's effectively the whole problem here.
So, I chose to first wrap all real terms as single chunks, and then assess the chunksize of the calculated result.
Only if that simplistic approach produces a chunksize which is too large does the code then make a bigger effort to reconsider the chunking across all the terms, and rechunk everything in certain dimensions.
I thought it was probably "safer" not to do that co-optimisation unless it is clearly needed, as the results might be a bit sub-optimal.
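Just to illustrate that intended control flow in isolation, here is a standalone sketch (not the actual aux_factory code: `derive_lazily`, the 200 MiB threshold and the "one chunk per leading index" rechunk policy are all placeholder choices):

```python
import dask.array as da
import numpy as np


def derive_lazily(deps, calc, max_chunk_bytes=200 * 1024 * 1024):
    # 1) Ensure every dependency is lazy; wrap real arrays as single chunks.
    lazy_deps = [
        dep if isinstance(dep, da.Array) else da.from_array(dep, chunks=-1)
        for dep in deps
    ]
    # 2) Do the 'test' calculation lazily and measure its largest chunk.
    result = calc(*lazy_deps)
    chunk_bytes = np.prod([max(c) for c in result.chunks]) * result.dtype.itemsize
    if chunk_bytes <= max_chunk_bytes:
        # Simple case: the naive result chunking is acceptable, so use it as-is.
        return result
    # 3) Otherwise, reconsider chunking across *all* the terms and redo the
    #    calculation.  (Crude placeholder policy: one chunk per leading index.)
    rechunked_deps = [dep.rechunk({0: 1}) for dep in lazy_deps]
    return calc(*rechunked_deps)


# Example: three small single-chunk terms whose broadcast product is huge.
A = np.ones((100, 70, 1, 1), dtype=np.float32)
B = np.ones((1, 70, 1000, 1), dtype=np.float32)
C = np.ones((1, 1, 1000, 500), dtype=np.float32)
derived = derive_lazily([A, B, C], lambda a, b, c: a * b * c)
print(derived.chunks[0])  # 100 chunks of size 1 along T, not one ~14 GB chunk
```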
pts = np.ones(dims, dtype=np.int32)
bds = np.stack([pts - 0.5, pts + 0.5], axis=-1)
# Make them lazy with a single chunk in both cases
pts, bds = (da.from_array(x, chunks=-1) for x in (pts, bds))
As mentioned above, it looks like you're only testing the case where you're deriving from single chunked arrays. It may be worth checking what happens in the multi-chunk case.
OK I'll look into this.
Watch this space ...
In fact, for a proper stress test you could try introducing a chunked coordinate, which would end up getting rechunked (by line 154 of aux_factory.py). And for good measure, you could set it up so that the chunks were also of uneven size (say, due to slicing), so that the rechunked chunks don't quite line up with the original chunks.
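For reference, a standalone sketch of one cheap way to get such unevenly-sized chunks via slicing (plain dask/numpy, not tied to the Iris test code):

```python
import dask.array as da
import numpy as np

# A regularly-chunked lazy coordinate array ...
full = da.from_array(np.arange(12, dtype=np.float32), chunks=5)
print(full.chunks)    # ((5, 5, 2),)

# ... sliced, so the chunk boundaries are no longer evenly spaced.
uneven = full[1:]
print(uneven.chunks)  # ((4, 5, 2),)

# Rechunking to a regular size then produces chunks that don't line up
# with the originals, exercising the rechunk path discussed above.
print(uneven.rechunk(6).chunks)  # ((6, 5),)
```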
A couple more comments now that I've read through this more thoroughly. I think test coverage is still basically my main concern here.
# Create simple points + bounds arrays
pts = np.ones(dims, dtype=np.int32)
bds = np.stack([pts - 0.5, pts + 0.5], axis=-1)
# Make them lazy with a single chunk in both cases
It's not clear what "both cases" means here, since there are 3 coordinates being made.
if new_chunks != dep_chunks:
    # When dep chunksize needs to change, produce a rechunked version.
    if is_lazy_data(original_dep):
        dep = original_dep.rechunk(new_chunks)
It might be worthwhile ensuring that this line gets test coverage.
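As a standalone illustration of what that branch does (not the actual unit test; the shapes and `new_chunks` value here are arbitrary), an already-lazy dependency would get rechunked rather than re-wrapped:

```python
import dask.array as da
import numpy as np

# An already-lazy dependency, carrying its own single-chunk chunking.
original_dep = da.from_array(np.ones((70, 100, 50), dtype=np.float32), chunks=-1)
new_chunks = (10, 100, 50)  # some different chunking decided for the result

if new_chunks != original_dep.chunksize:
    # The branch under discussion: a lazy dep is rechunked, not re-wrapped.
    dep = original_dep.rechunk(new_chunks)

print(dep.chunks)  # ((10,) * 7, (100,), (50,))
```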
Closes #6404
Automatic rechunking of derived coordinates
Investigation of the #6404 problem reveals that the points/bounds arrays of our derived (aka factory) coords are mostly single chunks, which can thus be very large.
This is because they tend to be a broadcast product of several simple low-dimensional coords (dim or aux), each spanning only a dim or two, which are themselves quite small and so tend to all be single chunks.
When these are broadcast together, the result then tends to be one massive chunk, which can blow memory.
For example:
a result formed like A * B * C,
where A, B and C each span only one or two of the dims (T, Z, Y, X),
and are all relatively small, so can each be a single chunk.
Say NT, NZ, NY, NX = 100, 70, 1000, 500.
Then the result has 100 * 70 * 1000 * 500 = 3.5G points.
If the element size is a typical 4 bytes and the Dask chunksize a typical 200 MiB, then we expect chunks of ~50M array elements.
An array of this size, loaded from an input netCDF file, might get chunked as (1, 70, 1000, 500), i.e. ~35M elements or ~140 MB per chunk.
But our derived coord will have the whole array, 3,500M elements --> ~14 GB, in a single chunk.
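To make that arithmetic concrete, here is a small standalone dask sketch with the example numbers above; the particular dims assigned to A, B and C are illustrative assumptions, not taken from a real factory:

```python
import dask.array as da
import numpy as np

NT, NZ, NY, NX = 100, 70, 1000, 500

# Three small terms, each spanning only some of the dims, each a single chunk.
A = da.from_array(np.ones((NT, NZ, 1, 1), dtype=np.float32), chunks=-1)
B = da.from_array(np.ones((1, NZ, NY, 1), dtype=np.float32), chunks=-1)
C = da.from_array(np.ones((1, 1, NY, NX), dtype=np.float32), chunks=-1)

result = A * B * C
print(result.shape)         # (100, 70, 1000, 500)              : 3.5G points
print(result.chunks)        # ((100,), (70,), (1000,), (500,))  : one single chunk
print(result.nbytes / 1e9)  # ~14.0 GB, all in that single chunk
```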
It seems likely that this problem has been noticed more recently because, since #5369, we now have derived coordinates which are time-dependent, and that now multiplies up the total size where before it did not.
However, even before this we were potentially multiplying e.g. the size of a field by the number of model levels, which already led to single-chunk arrays larger than ideal. Typical numbers: 70 * 1024 * 768 * 4 bytes ≈ 220 MB, already reaching the standard Dask chunksize of 200 MiB (so hi-res or double-resolution fields will clearly exceed it).
Todo: