Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data container reductions #1515

Merged
merged 43 commits into from
Nov 8, 2023
Merged

Data container reductions #1515

merged 43 commits into from
Nov 8, 2023

Conversation

hrobarts
Copy link
Contributor

@hrobarts hrobarts commented Oct 2, 2023

Describe your changes

Updated reduction functions on the DataContainer class to accept a direction argument as a string or tuple of strings which match values in dimension_labels
Added a function _directional_reduction_unary() which:

  • checks whether the direction argument is specified and whether the direction values exist in dimension_labels
  • checks and raises an error if direction argument and numpy argument axis are both specified
  • passes the arguments to a specified function
  • checks if the function returns a numpy array, and if so returns the output as a DataContainer with the correct dimension_labels after reduction

The _directional_rediction_unary function is called by DataContainer.mean(), DataContainer.min(), DataContainer.max() and DataContainer.sum()

Describe any testing you have performed

Added tests passing different direction arguments to DataContainer.mean(), DataContainer.min(), DataContainer.max() and DataContainer.sum()

Link relevant issues

#1507

Checklist when you are ready to request a review

  • I have performed a self-review of my code
  • I have added docstrings in line with the guidance in the developer guide
  • I have implemented unit tests that cover any new or modified functionality
  • CHANGELOG.md has been updated with any functionality change
  • Request review from all relevant developers
  • Change pull request label to 'Waiting for review'

Contribution Notes

Please read and adhere to the developer guide and local patterns and conventions.

  • The content of this Pull Request (the Contribution) is intentionally submitted for inclusion in CIL (the Work) under the terms and conditions of the Apache-2.0 License.
  • I confirm that the contribution does not violate any intellectual property rights of third parties

@gfardell
Copy link
Member

gfardell commented Oct 6, 2023

I had a chat with @paskino about this. We decided that:

  1. The accumulation should always happen at 64bit, we can remove this as an option for the user.
  2. The return data container dtype should be the same as the input dtype. Again this is not something we should expose as an argument.
  3. The reductions can return an AcquisitionData or ImageData when this is passed to out. In this instance we will check that the array is the expected size and dtype, we will not check or update the geometry. In practise this will mean doing the reduction, getting the numpy array and then using out.fill(), we cannot use out.aray directly as it will (probably) be the wrong dtype to accumulate in to.

@hrobarts hrobarts marked this pull request as ready for review October 6, 2023 15:49
@hrobarts
Copy link
Contributor Author

hrobarts commented Oct 9, 2023

Hello @gfardell and @paskino, just thinking about point 2:

  1. The return data container dtype should be the same as the input dtype. Again this is not something we should expose as an argument.

do you think this should also be the case for ints? Especially for mean that might not be intuitive

Copy link
Contributor

@paskino paskino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks almost good to me: small changes and additions to docstrings

hrobarts and others added 4 commits October 17, 2023 14:23
Co-authored-by: Edoardo Pasca <edo.paskino@gmail.com>
Signed-off-by: hrobarts <77114597+hrobarts@users.noreply.github.com>
Copy link
Member

@gfardell gfardell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's looking very neat now! A few suggestions for unit tests and some areas you can reuse code, but we can discuss in person too.

CHANGELOG.md Outdated Show resolved Hide resolved
Wrappers/Python/cil/framework/framework.py Outdated Show resolved Hide resolved
Wrappers/Python/cil/framework/framework.py Outdated Show resolved Hide resolved
Wrappers/Python/cil/framework/framework.py Outdated Show resolved Hide resolved
Wrappers/Python/cil/framework/framework.py Outdated Show resolved Hide resolved
Wrappers/Python/cil/framework/framework.py Outdated Show resolved Hide resolved
Wrappers/Python/test/test_DataContainer.py Outdated Show resolved Hide resolved
Wrappers/Python/test/test_DataContainer.py Outdated Show resolved Hide resolved
Wrappers/Python/test/test_DataContainer.py Outdated Show resolved Hide resolved
Co-authored-by: Gemma Fardell <47746591+gfardell@users.noreply.github.com>
Signed-off-by: Hannah Robarts <77114597+hrobarts@users.noreply.github.com>
@gfardell
Copy link
Member

gfardell commented Nov 6, 2023

image
The link hasn't rendered nicely in the documentation - but I'm not sure it's necessary or adds much. Maybe for min and max should enforce the same data type as the input. For mean I think you could just say the accumulated 64bit data will be cast to the output data type.

@hrobarts
Copy link
Contributor Author

hrobarts commented Nov 6, 2023

image The link hasn't rendered nicely in the documentation - but I'm not sure it's necessary or adds much. Maybe for min and max should enforce the same data type as the input. For mean I think you could just say the accumulated 64bit data will be cast to the output data type.

This should be rendering correctly now. I think it makes sense to keep the same behaviour for min and max because we might want to be able to pass a different data type to out, so we don't want to enforce the same data type as the input.

Copy link
Contributor

@paskino paskino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See suggestion, otherwise it looks good to me.

Wrappers/Python/cil/framework/framework.py Outdated Show resolved Hide resolved
Co-authored-by: Edoardo Pasca <edo.paskino@gmail.com>
Signed-off-by: Hannah Robarts <77114597+hrobarts@users.noreply.github.com>
@hrobarts hrobarts added the Merge pending jenkins Once jenkins is happy with can be merged label Nov 6, 2023
@hrobarts hrobarts dismissed lauramurgatroyd’s stale review November 8, 2023 09:58

This has been addressed

@hrobarts hrobarts requested a review from paskino November 8, 2023 09:59
Signed-off-by: Hannah Robarts <77114597+hrobarts@users.noreply.github.com>
@hrobarts hrobarts merged commit b7d05fa into master Nov 8, 2023
3 checks passed
@hrobarts hrobarts deleted the DataContainer_reductions branch November 8, 2023 11:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Merge pending jenkins Once jenkins is happy with can be merged Waiting for review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants