Use more explicit errors and warnings in data cleaning #79

Merged
merged 18 commits into from
Nov 8, 2021

Conversation

vandalt
Contributor

@vandalt vandalt commented Nov 6, 2021

When playing with simulated NIRISS data, I encountered a few corner cases during the cleaning due to the small size (80x80) of the images. The problems were not caught and resulted in errors later, when the image was re-used:

  • Using an isz that went out of bounds gave a non-square array and did not warn the user. After discussing with @DrSoulain, we chose to simply raise an error and suggest the maximum possible value to the user.
  • Using a sky subtraction (inner) radius that was outside the image was not caught and resulted in a NaN value for the whole background. Now there is a warning and the background subtraction is skipped (the same behaviour as before when an IndexError was caught).
  • I also added a warning if the outer sky subtraction radius is out of bounds, to let the user know that the outer radius does not restrict the background and that everything beyond r1 is used.
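
For readers who want to see the idea, here is a minimal sketch of the three checks listed above. It assumes a crop of half-width `isz // 2` around a centre `(xc, yc)` and a sky annulus running from `r1` to `r1 + dr`. The function names, the `dr` parameter, and the messages are hypothetical and are not taken from the AMICAL code base.

```python
import warnings

import numpy as np


def check_crop_size(img, isz, xc, yc):
    """Raise if an isz-wide window centred on (xc, yc) leaves the image.

    Hypothetical helper: assumes the crop is img[yc-h:yc+h, xc-h:xc+h]
    with h = isz // 2.
    """
    ny, nx = img.shape
    # Largest half-width that still fits on every side of the centre.
    max_half = min(xc, yc, nx - xc, ny - yc)
    if isz // 2 > max_half:
        raise ValueError(
            f"isz={isz} goes out of the image around ({xc}, {yc}); "
            f"the maximum possible value is isz={2 * max_half}"
        )
    return isz


def sky_background(img, r1, dr, xc, yc):
    """Return the mean sky level in the ring [r1, r1 + dr], or None if skipped."""
    yy, xx = np.indices(img.shape)
    dist = np.hypot(xx - xc, yy - yc)
    farthest = dist.max()  # distance from the centre to the farthest pixel
    if r1 > farthest:
        # An empty ring would give a NaN mean, so warn and skip instead.
        warnings.warn(
            "Inner sky radius r1 is outside the image: "
            "background subtraction is skipped",
            RuntimeWarning,
        )
        return None
    if r1 + dr > farthest:
        warnings.warn(
            "Outer sky radius is outside the image: "
            "every pixel beyond r1 is used for the background",
            RuntimeWarning,
        )
    ring = (dist >= r1) & (dist <= r1 + dr)
    return float(np.mean(img[ring]))
```

With an 80x80 image and a maximum at `(xc, yc) = (17, 57)`, this sketch rejects `isz=40` and suggests `isz=34`, because only 17 pixels are available to the left of the centre.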

I also added tests for these changes in two new files test_processing.py and test_tools.py.

@codecov

codecov bot commented Nov 6, 2021

Codecov Report

Merging #79 (9bcf979) into master (dca5e07) will increase coverage by 2.22%.
The diff coverage is 85.56%.

@@            Coverage Diff             @@
##           master      #79      +/-   ##
==========================================
+ Coverage   48.24%   50.46%   +2.22%     
==========================================
  Files          19       21       +2     
  Lines        3646     3725      +79     
==========================================
+ Hits         1759     1880     +121     
+ Misses       1887     1845      -42     
Flag Coverage Δ
unittests 50.46% <85.56%> (+2.22%) ⬆️

Impacted Files Coverage Δ
amical/data_processing.py 28.71% <69.56%> (+21.44%) ⬆️
amical/tests/test_processing.py 100.00% <100.00%> (ø)
amical/tests/test_tools.py 100.00% <100.00%> (ø)
amical/tools.py 33.08% <100.00%> (+2.41%) ⬆️
amical/tests/test_extraction.py 97.43% <0.00%> (-2.57%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update dca5e07...9bcf979.

@vandalt
Contributor Author

vandalt commented Nov 6, 2021

I also added x and y limits to the show_clean_params function, to restrict the plot to the actual image even if the background radii go out of bounds.

Collaborator

@neutrinoceros neutrinoceros left a comment

Mostly nits. In general I'm totally on board with this and I strongly believe clear warnings and errors are the bread and butter to build a satisfying UX.

Comment on lines 9 to 12
img_size = 80 # Same size as NIRISS images
img = np.ones((img_size, img_size))
xmax, ymax = 17, 57
img[ymax, xmax] = img.max() * 5 # Add off-centered max pixel
Collaborator

This looks like a very particular case; the test might be more robust if the image were random. You can then manually set the center to the lowest value, since it's important for this test that the max be off-centered.

Contributor Author

Do you mean making the whole image random instead of ones and one value > 1? Or do you mean making the max position random? I'm not sure what you mean by "manually set the center to the lowest value". Also, after looking into it a bit more, it should not matter that the location is off-centered, as long as the cropped size is bigger than the distance to the edges.

Contributor Author

I made the test random in the last commit. Should I use pytest-repeat to repeat the test a few times to make it more robust?
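
One way to get repeated yet reproducible random cases, sketched here only as an illustration, is to build the image from a seeded NumPy generator and loop over a handful of seeds. The helper name `make_random_image` and the choice of five seeds are assumptions; the actual test in this PR may be written differently.

```python
import numpy as np


def make_random_image(seed, img_size=80):
    """Build a random image with a single, randomly placed maximum.

    Hypothetical helper: img_size=80 mirrors the NIRISS image size used above.
    """
    rng = np.random.default_rng(seed)
    img = rng.random((img_size, img_size))  # uniform values in [0, 1)
    ymax, xmax = rng.integers(0, img_size, size=2)
    img[ymax, xmax] = img.max() * 5  # make this pixel the unique maximum
    return img, int(xmax), int(ymax)


# Looping over seeds repeats the check on different images
# while keeping any failure reproducible.
for seed in range(5):
    img, xmax, ymax = make_random_image(seed)
    found_y, found_x = np.unravel_index(np.argmax(img), img.shape)
    assert (found_x, found_y) == (xmax, ymax)
```

In a pytest suite, the same loop could be expressed with `pytest.mark.parametrize` over the seed, so that each seed is reported as a separate test case.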

amical/tests/test_tools.py (outdated, resolved)
amical/data_processing.py (outdated, resolved)
@vandalt vandalt changed the title from "Use more explicit errors and warnings for image boundary problems in cleaning" to "Use more explicit errors and warnings data cleaning" Nov 8, 2021
@vandalt
Contributor Author

vandalt commented Nov 8, 2021

While reviewing this, I added a test for the default behaviour of clean_data. Keeping most keyword arguments to their default None value resulted in many errors in functions called by clean_data. I edited clean_data to do the following:

  • If a keyword argument is None, the corresponding cleaning step is skipped.
  • If a cleaning step is set to True (e.g. apod or sky), but requires a keyword argument set to None, a warning is shown and the step is skipped. This could also raise an error, but I felt like warnings would be best suited for the default behaviour of a function.
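
The two rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the real `clean_data`: the function name `clean_image_sketch`, the `dr` and `window` parameters, and the simplified crop, sky, and apodisation steps are all hypothetical.

```python
import warnings

import numpy as np


def _distance_from_centre(shape):
    """Distance of every pixel from the central pixel of the array."""
    yy, xx = np.indices(shape)
    return np.hypot(xx - shape[1] // 2, yy - shape[0] // 2)


def clean_image_sketch(img, isz=None, r1=None, dr=None,
                       sky=True, apod=True, window=None):
    """Apply only the steps whose parameters are set; return image and steps."""
    cleaned = np.array(img, dtype=float)
    steps = []

    # Rule 1: a keyword left to None means the step is skipped silently.
    if isz is not None:
        cy, cx = cleaned.shape[0] // 2, cleaned.shape[1] // 2
        half = isz // 2
        cleaned = cleaned[cy - half:cy + half, cx - half:cx + half]
        steps.append("crop")

    # Rule 2: a step that is switched on but lacks its parameter
    # triggers a warning and is skipped.
    if sky:
        if r1 is None:
            warnings.warn("sky is True but r1 is None: skipping sky "
                          "subtraction", RuntimeWarning)
        else:
            dist = _distance_from_centre(cleaned.shape)
            outer = np.inf if dr is None else r1 + dr
            ring = (dist >= r1) & (dist <= outer)
            cleaned = cleaned - np.mean(cleaned[ring])
            steps.append("sky")

    if apod:
        if window is None:
            warnings.warn("apod is True but window is None: skipping "
                          "apodisation", RuntimeWarning)
        else:
            dist = _distance_from_centre(cleaned.shape)
            # Simple Gaussian taper, used here only as a stand-in.
            cleaned = cleaned * np.exp(-0.5 * (dist / window) ** 2)
            steps.append("apod")

    return cleaned, steps
```

Calling `clean_image_sketch(img)` with every keyword left at its default returns the image unchanged and emits two warnings, one for `sky` and one for `apod`, which is the kind of default behaviour a test can pin down.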

I decided to add these changes here directly because they are close to the other changes, which aim to make the cleaning step more transparent for users. Also, I felt like having a clearly defined (and tested) default behaviour will make future changes easier to develop and test, so I wanted to add the changes upstream as quickly as possible.

@vandalt vandalt changed the title from "Use more explicit errors and warnings data cleaning" to "Use more explicit errors and warnings in data cleaning" Nov 8, 2021
@DrSoulain DrSoulain merged commit 5b6e38e into SAIL-Labs:master Nov 8, 2021
DrSoulain added a commit that referenced this pull request Nov 8, 2021
@vandalt vandalt deleted the niriss-clean branch November 8, 2021 20:53
@neutrinoceros neutrinoceros mentioned this pull request Jul 5, 2022
3 participants