
Include padding in max slices and number of blocks calculations #680

Merged
yousefmoazzam merged 7 commits into main from padding-aware-block-splitter on Feb 11, 2026

Conversation

@yousefmoazzam
Collaborator

@yousefmoazzam yousefmoazzam commented Feb 2, 2026

Fixes #453
Fixes IMGDA-726

As mentioned in 3936d41, part of this change is essentially reverting the preview extension implemented in #670.

The two main changes are:

  • the `BlockSplitter` is aware of the padding that the blocks it gets from the `DataSetSource` have
  • the max slices calculation is aware of the padding applied to the chunk that is associated with a section (a sketch of how these interact follows below)
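
As a loose illustration of how these two changes interact, the sketch below budgets the padding slices out of the max slices value before dividing the chunk into blocks. The function name and signature are hypothetical, not HTTomo's actual API:

def num_blocks(chunk_len: int, padding: tuple[int, int], max_slices: int) -> int:
    # Hypothetical sketch, not HTTomo's actual API. `chunk_len` is the
    # unpadded chunk length along the slicing dim; `max_slices` is the
    # padding-aware value from the max slices calculation.
    core_per_block = max_slices - padding[0] - padding[1]
    if core_per_block < 1:
        # Not even one core slice plus its padding fits on the GPU
        # (the edge case discussed later in this thread).
        raise ValueError("padding does not fit in GPU memory")
    # Ceiling division over the unpadded chunk length.
    return -(-chunk_len // core_per_block)

For example, num_blocks(160, (5, 5), 21) gives 15, since each block can carry 21 - 10 = 11 core slices.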

Checklist

  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have made corresponding changes to the documentation

@yousefmoazzam yousefmoazzam requested a review from dkazanc February 2, 2026 15:23
@yousefmoazzam yousefmoazzam marked this pull request as draft February 2, 2026 15:33
@yousefmoazzam yousefmoazzam removed the request for review from dkazanc February 2, 2026 15:33
@yousefmoazzam yousefmoazzam marked this pull request as ready for review February 2, 2026 16:03
@yousefmoazzam yousefmoazzam requested a review from dkazanc February 2, 2026 16:03
@dkazanc
Collaborator

dkazanc commented Feb 4, 2026

thanks @yousefmoazzam. So, if I understand correctly, this change removes the need to expand the previewer, and instead deals with how the data should be loaded, assuming that padding is needed?

I'm just curious about those 2 cases, can you help please:

  1. We have 1 slice in the preview, padding = (5, 5) and 4 processes. determine_max_slices tells us that we can fit more than 11 slices on the GPU. What happens in this case?
  2. Same conditions: 1 slice in the preview, padding = (5, 5) and 4 processes, but determine_max_slices says that we can fit only 5 slices, for instance.

Cheers!

@yousefmoazzam
Collaborator Author

this change removes the need to expand the previewer

Yeah, this change has removed all code that expands the preview based on methods' padding requirements

but rather deal with how the data should be loaded assuming that padding is needed?

This change doesn't do anything to data loading; that's the same as before, which is that the data needed for the next section is padded as required by either the loader or the dataset store. This change mainly deals with changing how the number of blocks is calculated, and how the max slices value is calculated.

I'm just curious about those 2 cases, can you help please:

  1. We have 1 slice in the preview, padding = (5, 5) and 4 processes. determine_max_slices tells us that we can fit more than 11 slices on the GPU. What happens in this case?
  2. Same conditions: 1 slice in the preview, padding = (5, 5) and 4 processes, but determine_max_slices says that we can fit only 5 slices, for instance.

For case 1, I think this shouldn't happen. In determine_max_slices(), the max slices value starts off as the length of the padded chunk:

max_slices = (
    data_shape[slicing_dim] + self.source.padding[0] + self.source.padding[1]
)

and as memory estimation of each method is done, the minimum of the current max slices and the method's max slices is taken:

max_slices_methods[idx] = min(max_slices, slices_estimated)

If the section is 1 slice unpadded and 11 slices padded, then determine_max_slices() should never calculate a max slices value above 11 slices.
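
Plugging the case 1 numbers into those two snippets (purely illustrative values):

padding = (5, 5)
chunk_core_len = 1                                     # 1 slice in the preview
max_slices = chunk_core_len + padding[0] + padding[1]  # 1 + 5 + 5 = 11
slices_estimated = 50                                  # estimator says "plenty fits"
max_slices = min(max_slices, slices_estimated)         # stays clamped at 11

so the reported max slices can never exceed the padded chunk length.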

For case 2, as things currently stand, I think this would break somewhere in the block splitter! This is an edge case where the calculated max slices is lower than 1 slice + padding (1 + 10 = 11 in this case), and there'd be a negative value somewhere in the block splitter which would likely cause problems. Thanks for pointing this out; this needs to be handled in some manner, otherwise some confusing/hard-to-debug errors will probably occur in such a case.

If that case does happen, it'd be that not even 1 slice can be processed due to padding requirements: would it be reasonable to throw an error in such a case? Being able to handle low-ish GPU resources is fair, but if the machine's GPU can't even process 1 slice, I'm not sure what else can be done to salvage the pipeline run?
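
For reference, the arithmetic that goes negative in case 2 looks roughly like the following; the exact spot in the block splitter isn't pinned down in this discussion, so this is only a sketch:

padding = (5, 5)
max_slices = 5                                         # estimator's answer for case 2
core_per_block = max_slices - padding[0] - padding[1]  # 5 - 10 = -5
# A negative core-slice count fed into block-splitting arithmetic
# (e.g. a ceiling division or a range) is what would break downstream.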

@dkazanc
Collaborator

dkazanc commented Feb 4, 2026

thanks @yousefmoazzam, I see now that
section.max_slices -= padding[0] + padding[1]
has been replaced with

data_shape = self.source.chunk_shape
max_slices = (
    data_shape[slicing_dim] + self.source.padding[0] + self.source.padding[1]
)

which takes into account padding rather than subtracting from the estimated slices as before.

Yes, the second case should probably throw a meaningful error; I guess the padded method establishes some kind of limit on the size of the chunk/block that can be processed. This case can happen easily when, for instance, iterative methods are running on small-memory GPU systems. One potential solution would be to go 2D when 3D padding doesn't fit in memory, but, as you say, it is an edge case for "laptop"-type processing. We might handle it properly later, if needed at all.

@dkazanc
Collaborator

dkazanc commented Feb 4, 2026

Also, I think this ticket will be fixed with this change. I'll check if that is the case.

@yousefmoazzam
Collaborator Author

yousefmoazzam commented Feb 5, 2026

I had a think about what was discussed in sprint planning regarding case 2 above, and about potentially reducing the padding so that the GPU could fit the core slice(s) + padding slices.

The main thing I thought would be an issue is that the data source object for a section is created before the max slices for that section is determined, meaning that the padding has already been determined and thus can't be changed by the time execution gets to determine_max_slices().

I think if the dataset store is backed by an hdf5 file, this isn't a problem, and nothing special needs to be done other than change the padding of the data source object.

But, if the dataset store is backed by RAM, then it is a problem, and something additional needs to be done other than simply changing the padding of the data source object. The issue is that the edges of the padded chunk array would have more padding than needed (for example, the original padding was (5, 5) but was reduced to (3, 3) to fit the data into GPU memory). So a shift of some sort would need to be introduced when reading blocks at the boundaries of the padded chunk array, to make the extra/unnecessary padding slices be ignored (see the sketch below).
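
A minimal sketch of that shift idea, assuming a NumPy-backed chunk array allocated with the original padding but read with the reduced padding; all names here are hypothetical, not HTTomo's actual code:

import numpy as np

original_pad = (5, 5)  # padding the chunk array was allocated with
reduced_pad = (3, 3)   # padding after reduction to fit GPU memory

# 1 core slice plus the original padding on both ends
padded_chunk = np.zeros((1 + sum(original_pad), 64, 64))

def read_block(start: int, core_len: int) -> np.ndarray:
    # Core index 0 lives at array index original_pad[0], so a read using
    # the reduced padding must skip the surplus padding slices at the edge.
    shift = original_pad[0] - reduced_pad[0]
    lo = shift + start
    hi = lo + reduced_pad[0] + core_len + reduced_pad[1]
    return padded_chunk[lo:hi]

block = read_block(0, 1)  # shape (7, 64, 64): 3 + 1 + 3 slices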

One option could be to keep determine_max_slices() after the data source object is created, and introduce the shifting mentioned above. Another option could be to move determine_max_slices() before the data source is created, somehow determine the padding that'd allow at least one slice (if not more) to be processed, and then create the data source object.

I think both of these options are non-trivial to implement, and given that this is an edge case for when GPU memory can hold barely any slices, I'm tempted to leave this for later and simply raise an error for now. What do you think?

@dkazanc
Collaborator

dkazanc commented Feb 6, 2026

thanks for looking into it @yousefmoazzam. Fair enough; as we suspected, it is not as simple as reducing the padding after the estimation and proceeding. Happy with leaving this for later! I guess we can catch this case and insert a sensible error? Maybe we should say that the chosen method (the method that requires the largest padding in the pipeline) cannot process the data due to GPU memory limitations, and that the user should remove that method from the pipeline or run it on a more powerful GPU device?
Can we do something like that?
The 3rd option is to enable 2D processing for 3D methods that need padding; then we don't need to do much in the framework. It is a bit of work though, and hopefully people won't be using 2-4GB GPUs to process the data :)

@yousefmoazzam
Collaborator Author

I guess we can catch this case and insert a sensible error? Maybe we should say that the chosen method (the method that requires the largest padding in the pipeline) cannot process the data due to GPU memory limitations, and that the user should remove that method from the pipeline or run it on a more powerful GPU device?
Can we do something like that?

I think this should be possible, I'll give it a go!
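
A minimal sketch of what such a check could look like; the function name, message wording, and exception type are all illustrative, not HTTomo's actual code:

def check_padding_fits(max_slices: int, padding: tuple[int, int], method_name: str) -> None:
    # Hypothetical check: fail early if not even one core slice plus its
    # padding fits in the GPU memory budget.
    if max_slices < 1 + padding[0] + padding[1]:
        raise ValueError(
            f"Method '{method_name}' requires padding {padding}, but the GPU "
            f"can only fit {max_slices} slices; not even one core slice plus "
            "its padding can be processed. Remove the method from the "
            "pipeline or run on a GPU with more memory."
        )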

Apart from some minor irrelevant details, this essentially reverts #670.
The shape of the data being transferred is in principle dependent on the
max slices calculation, which in turn is dependent on the GPU model
being used. As the GPU model can vary widely, the max slices and
therefore the data shape can vary widely, making this assertion
difficult to pass if run on different GPUs.

As the purpose of the small pipeline tests is not to check specific
data shapes but the functional behaviour of pipeline execution,
asserting that there's a line in the logfile corresponding to a data
transfer occurring for GPU 0 is sufficient; there's no need to be
specific about the shape of the data that was transferred.
See the following PR comments for a discussion on what could be done in
the future to try and salvage the pipeline run instead of raising an
error: #680.
@yousefmoazzam yousefmoazzam force-pushed the padding-aware-block-splitter branch from f535c19 to 5c0e26c on February 6, 2026 13:42
@yousefmoazzam yousefmoazzam requested review from dkazanc and removed request for dkazanc February 6, 2026 13:50
@yousefmoazzam yousefmoazzam merged commit c94fccd into main Feb 11, 2026
6 checks passed
@yousefmoazzam yousefmoazzam deleted the padding-aware-block-splitter branch February 11, 2026 11:25


Development

Successfully merging this pull request may close these issues.

Entire chunk that can fit in GPU memory will be split into two blocks if padding method present
