Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fits reading emits a warning when the data array is not actually memory mapped #6837

Closed
wants to merge 8 commits into from

Conversation

davnish
Copy link
Contributor

@davnish davnish commented Mar 12, 2023

This adds a warning to the read function in sunpy.io. This warning is emitted when memmap=True and the data array is not actually memory mapped.

Merging this PR:
Closes #6141

@davnish davnish requested a review from a team as a code owner March 12, 2023 00:59
@davnish davnish changed the title Fits Warning Fits reading emits a warning when the data array is not actually memory mapped Mar 12, 2023
@ayshih
Copy link
Member

ayshih commented Mar 12, 2023

Thanks for the PR! There are issues with how your use of metadata keywords, but it's moot because you should instead explicitly check whether the data array is memory mapped. Please see #6110 (comment) for one way to do so.

I'm going to switch this PR to draft status so that it will not be critically reviewed before it is ready (e.g., with unit tests). We continue to pay attention to all PRs as they are developed, even draft PRs.

@ayshih ayshih marked this pull request as draft March 12, 2023 02:14
@ayshih ayshih added the io Affects the io submodule label Mar 12, 2023
@davnish
Copy link
Contributor Author

davnish commented Mar 13, 2023

Thanks for helping me @ayshih. I have updated the approach. Will this approach work?

@davnish davnish marked this pull request as ready for review March 13, 2023 10:38
Copy link
Member

@ayshih ayshih left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to add one or more unit tests to verify that this PR works as intended. The code I wrote in #6110 (comment) uses recursion because in general you could be inspecting a view of a non-memory-mapped array, and then base would not be None, but that doesn't mean memory mapping is being used.

Also, the warning to the user shouldn't blindly suggest using disable_image_compression=True or do_not_scale_image_data=True. The suggestion should be tailored to why we believe the array was not memory mapped. That is, don't suggest to a user that they turn off image compression when the image wasn't actually compressed in the first place.

@nabobalis nabobalis added the No Backport A PR that isn't to be backported to any release branch. (To be used as a flag to other maintainers) label Mar 13, 2023
@davnish
Copy link
Contributor Author

davnish commented Mar 14, 2023

@ayshih Is this okay? I have added the isinstance function to check whether data.base is mmap.mmap object or not.
I have also tailored the warnings to be more specific.

Also, how should I approach testing on this? I did some test on TEST_AIA_IMAGE and TEST_RHESSI_IMAGE and the read function works properly on them.

@nabobalis nabobalis requested a review from ayshih March 20, 2023 17:10
@ayshih
Copy link
Member

ayshih commented Mar 22, 2023

Also, how should I approach testing on this?

In our sample data set (as opposed to our test data set), AIA_171_IMAGE is compressed and SWAP_LEVEL1_IMAGE is scaled, so neither should successfully memory map. We could add small test files to our test data set, but for now you can proceed with creating online-required unit tests.

Additionally, be aware that:

  • AIA_171_IMAGE has the compressed data as the second HDU, so your code will error as currently written because it assumes that the first HDU has a data array.
  • For seeing whether the data array has been rescaled, your check for the BSCALE keyword will not work because the BSCALE keyword is removed by Astropy when it rescales the data. You need to instead check whether data._orig_bzero !=0 or data._orig_bscale != 1 or data._blank is not None.
  • For seeing whether the data array needed to be uncompressed, your check for the BZERO keyword doesn't make sense because that is not a keyword relevant to compression. You should be able to instead simply check whether the HDU is of type CompImageHDU.

if memmap == True and not (isinstance(pairs[0].data.base, mmap.mmap) and pairs[0].data.base is not None):
try:
if pairs[0].header['BZERO'] != 0:
warn_user("Data array is not memory mapped. Use 'disable_image_compression=True' to preserve memory mapping.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can preserve the memmap, but it would be entirely useless as you wont be able to see the image any more? I don't think this is a good warning to throw.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This warning is raised by sunpy.io.read_file(), which could conceivably be used by a user with some other purpose for the file contents than to use it as a Map. Plus, note that it would only get triggered when a user explicitly specifies memmap=True, versus the default of memmap=None.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All that's true, but I don't think that this warning is helpful as written for someone who doesn't have a good understanding of FITS internals. I can just see someone doing that and then being like waaaat

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The easy "fix" would be to elaborate in the warning, e.g.:

Data array is not memory mapped because it is stored in the file using compression. Specify disable_image_compression=True to preserve memory mapping, but only if you do not need the data array to be decompressed.

@@ -99,6 +100,15 @@ def read(filepath, hdus=None, memmap=None, **kwargs):
message += repr(e)
warn_user(message)

if memmap == True and not (isinstance(pairs[0].data.base, mmap.mmap) and pairs[0].data.base is not None):
try:
if pairs[0].header['BZERO'] != 0:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this actually a reliable way to detect a compressed image header?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it isn't for two reasons. See #6837 (comment)

@nabobalis nabobalis marked this pull request as draft March 24, 2023 18:29
@davnish
Copy link
Contributor Author

davnish commented Mar 25, 2023

How are these changes?
Do I have to create a seperate PR for the tests?
I tested on AIA_171_IMAGE and SWAP_LEVEL1_IMAGE. Works fine on AIA_171_IMAGE it throws the warning, but when I read SWAP_LEVEL1_IMAGE and memmap = True it already throws a ValueError: Cannot load a memory-mapped image: BZERO/BSCALE/BLANK header keywords present. Set memmap=False.

@nabobalis
Copy link
Contributor

Do I have to create a seperate PR for the tests?

No, they should be in this PR.

@davnish
Copy link
Contributor Author

davnish commented Apr 10, 2023

@ayshih can you review this again?

@github-actions
Copy link

github-actions bot commented Sep 8, 2023

Hello 👋, Thanks for your contribution to sunpy!
I have marked this pull request as stale because there hasn't had any activity in five months. If you are still working on this, or if it's waiting on a maintainer to look at it then please let us know and we will keep it open. Please add a comment with: @sunpy/sunpy-developers to get someone's attention.
If nobody comments on this pull request for another month, it will be closed.

@github-actions github-actions bot added the Stale The bot will close this PR after 6 months label Sep 8, 2023
@github-actions
Copy link

github-actions bot commented Oct 9, 2023

Hello again 👋, We want to thank you again for your contribution to sunpy!
This pull request has had no activity since my last reminder, so I am going to close it. If at any time you want to come back to this please feel free to reopen it! If you want to discuss this, please add a comment with: @sunpy/sunpy-developers and someone will get back to you soon.

@github-actions github-actions bot closed this Oct 9, 2023
@nabobalis nabobalis added the Needs Adoption PRs that were abandoned but should be picked up again and merged in label Oct 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
io Affects the io submodule Needs Adoption PRs that were abandoned but should be picked up again and merged in No Backport A PR that isn't to be backported to any release branch. (To be used as a flag to other maintainers) Stale The bot will close this PR after 6 months
Projects
None yet
Development

Successfully merging this pull request may close these issues.

FITS reading should emit a warning when memory mapping is explicitly requested but not successful
4 participants