Separate DataSetBlock from DataSet #233
Merged
yousefmoazzam merged 17 commits into feature/transparent-file-store on Mar 5, 2024
Co-authored-by: Yousef Moazzam <yousefmoazzam@users.noreply.github.com>
Refactored DataSetBlock to be standalone and more robust
Many block reading tests that use the loader are now broken because the loader no longer uses `FullFileDataSet` to generate blocks, but some have been fixed in this change. Specifically, the tests that have been fixed are the ones that don't preview/crop the data to be loaded, and the ones that still need to be fixed are the ones that use previewing (and thus require some offsetting to do the block reading).
Refactor loader to create blocks directly, without using `FullFileDataSet`
Co-authored-by: Yousef Moazzam <yousefmoazzam@users.noreply.github.com>
…tasetblock Fix methods with new datasetblock (WIP)
In addition to getting this test to work with the refactored `DataSetBlock`, the test's behaviour has also been modified slightly:
- only one method is used to process the two blocks (having two methods in the section was unnecessary for this test's purpose of checking that processing the last block triggers the task runner to update its collection of side outputs)
- after the first of the two blocks has been processed, it's asserted that the task runner hasn't yet attempted to update its collection of side outputs
bd2c34b to 298e4f8
A consequence of parallel changes to how darks/flats are handled in the pipeline is that it will soon no longer be necessary to account for the darks/flats in the max slices estimation. Doing the removal here makes it simpler to handle darks/flats from the loader's perspective: now that the max slices estimation doesn't need the darks/flats, the task runner won't need to ask the loader for them. This means that the loader doesn't need the behaviour of translating empty darks/flats in the form of `None` into empty arrays (which is currently only in `DataSetBlock`). Put another way, removing the darks/flats input to the max slices calculation here avoids copying the logic in `DataSetBlock` that translates `None` darks/flats values to empty arrays. (Putting the logic into `AuxiliaryData`, to be usable by both the loader and the block class, was also an option, but it felt messy to give `AuxiliaryData` the slicing dimension and data shape in order to produce the appropriately shaped empty array in the detector y and detector x dimensions.) Now, only `DataSetBlock` needs to be able to translate `None` darks/flats values to appropriately shaped empty arrays.
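The `None`-to-empty-array translation discussed in this commit message can be sketched roughly as follows. This is a minimal illustration and not the httomo implementation: the function name `empty_aux_array` is hypothetical, and the shape convention (zero length along the slicing dimension, with the remaining dimensions copied from the data shape) is an assumption made only for the example.

```python
from typing import Optional, Tuple

import numpy as np


def empty_aux_array(
    aux: Optional[np.ndarray],
    data_shape: Tuple[int, int, int],
    slicing_dim: int,
    dtype=np.float32,
) -> np.ndarray:
    """Return `aux` unchanged, or an empty array if `aux` is None.

    Hypothetical helper. ASSUMPTION: the empty array has length 0 along the
    slicing dimension and matches `data_shape` in the other dimensions.
    """
    if aux is not None:
        return aux
    shape = list(data_shape)
    shape[slicing_dim] = 0  # no darks/flats available along this dimension
    return np.empty(shape, dtype=dtype)


# Data of shape (angles, detector y, detector x), sliced along dimension 0
darks = empty_aux_array(None, data_shape=(180, 128, 160), slicing_dim=0)
print(darks.shape)  # (0, 128, 160)
```

The sketch also shows why the helper needs both the slicing dimension and the data shape, which is the coupling that made placing this logic in `AuxiliaryData` feel messy.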
298e4f8 to f73ab5c
The original intention of this check was to protect against having fewer angles than projections. However, in a pipeline involving 360 data, the angles get truncated to half their original length, while the reconstruction shape in the 0th dimension remains (in general) much larger than the number of angles after truncation. This check could perhaps be moved to the loader, to run when the angles are being loaded. Alternatively, the numpy array containing the angles in `AuxiliaryData` could be locked (not writeable) by default, to provide a higher level of certainty that the angles are only modified in a few exempt situations, as a protection against the angles array becoming inconsistent with the data.
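The "locked by default" alternative can be illustrated with NumPy's writeable flag. This is only a sketch of the idea, not code from this PR: the variable names are hypothetical, and halving the angles is shown as one assumed example of an exempt situation.

```python
import numpy as np

# Hypothetical angles array as it might be held by the auxiliary data object
angles = np.linspace(0.0, np.pi, 180)
angles.setflags(write=False)  # lock the array against in-place modification

try:
    angles[0] = 1.0  # accidental in-place write
    write_blocked = False
except ValueError:
    # NumPy refuses assignment to a read-only array
    write_blocked = True

# An exempt situation replaces the array explicitly instead of mutating it:
# copy the first half (the copy is writeable), then lock the new array again.
truncated = angles[: angles.size // 2].copy()
truncated.setflags(write=False)
```

With this approach, any code path that changes the angles has to create and assign a new array deliberately, which makes the exempt situations easy to find.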
Changes
There are several ways to describe the main change at a high level:
- Refactor `DataSetBlock` such that it is fully functional without being a child class of `DataSet`
- Separate `DataSetBlock` from `DataSet`
- Create a block (instance of `DataSetBlock`) without a "base" (instance of `DataSet`)

At a lower level, the changes are:
- Refactor `DataSetBlock` to be fully functional (as required by methods and the task runner) without using `DataSet` as a base class
- Remove `DataSet` and `FullFileDataSet` from the data store writer + reader and task runner
- Remove `FullFileDataSet` from `StandardTomoLoader`
- Remove `DataSet` and `FullFileDataSet` completely

Motivation
This change was partly motivated by the difficulties seen in trying to use `FullFileDataSet` in `StandardTomoLoader` to create blocks (i.e., to create instances of `DataSetBlock`).

Another reason for this change was that, as development has continued, it was recognised that the main data objects that methods and the task runner work with are blocks rather than chunks. In other words, blocks are the fundamental data object in httomo, not chunks. On the other hand, the original dataset design involves a class hierarchy where a chunk (represented by `DataSet`) is the fundamental data object, from which other data objects (`DataSetBlock` and `FullFileDataSet`) derive to get certain behaviour. These two choices about which data object is the fundamental one clash, and consequences of this clash are likely manifested in some of the difficulties that have been encountered during development.
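To make the block-first design concrete, here is a minimal sketch of a block that is constructed directly from its data and auxiliary data, with no parent class and no "base" chunk object. The class names `AuxData` and `Block`, their attributes, and the `is_last_in_chunk` property are all hypothetical and chosen for illustration; they are not the signatures of httomo's `AuxiliaryData` or `DataSetBlock`.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

import numpy as np


@dataclass
class AuxData:
    """Hypothetical container for data shared by all blocks."""
    angles: np.ndarray
    darks: Optional[np.ndarray] = None
    flats: Optional[np.ndarray] = None


class Block:
    """Hypothetical standalone block: it inherits from nothing and is built
    directly from an array, rather than being derived from a chunk object."""

    def __init__(
        self,
        data: np.ndarray,
        aux: AuxData,
        slicing_dim: int,
        block_start: int,
        chunk_shape: Tuple[int, int, int],
    ):
        self.data = data
        self.aux = aux
        self.slicing_dim = slicing_dim
        self.block_start = block_start  # offset of the block within its chunk
        self.chunk_shape = chunk_shape

    @property
    def is_last_in_chunk(self) -> bool:
        """True if this block ends where the chunk ends along the slicing dim."""
        block_len = self.data.shape[self.slicing_dim]
        return self.block_start + block_len == self.chunk_shape[self.slicing_dim]


chunk = np.zeros((10, 4, 5), dtype=np.float32)
aux = AuxData(angles=np.linspace(0.0, np.pi, 10))
first = Block(chunk[0:6], aux, slicing_dim=0, block_start=0, chunk_shape=chunk.shape)
last = Block(chunk[6:10], aux, slicing_dim=0, block_start=6, chunk_shape=chunk.shape)
print(first.is_last_in_chunk, last.is_last_in_chunk)  # False True
```

In this sketch the block carries just enough positional information to answer questions such as "is this the last block?", without needing a chunk-level object to exist.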