Use more explicit errors and warnings in data cleaning #79

vandalt · 2021-11-06T19:09:04Z

When playing with simulated NIRISS data, I encountered a few corner cases during cleaning due to the small size (80x80) of the images. The problems were not caught and resulted in errors later, when the image was re-used:

Using an isz that went out of bounds gave a non-square array and did not warn the user. After discussing with @DrSoulain, we chose to simply raise an error and suggest the maximum possible value to the user.
Using a sky subtraction (inner) radius that was outside the image was not caught and resulted in a NaN value for the whole background. Now, there is a warning and the background subtraction is skipped (the same behaviour as before when an IndexError was caught).
I also added a warning if the outer sky subtraction radius is out of bounds, to let the user know that the outer radius does not restrict the background and that everything beyond r1 is used.

I also added tests for these changes in two new files test_processing.py and test_tools.py.
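
As a rough illustration of the out-of-bounds check on isz described above, here is a minimal sketch. It is not the code from this PR: `crop_around_max` and `max_crop_size` are hypothetical names, the error message is made up, and the window convention (`isz // 2` pixels before the maximum) is an assumption.

```python
import numpy as np


def max_crop_size(center, n):
    """Largest window size that fits around `center` on an axis of length n."""
    even = 2 * min(center, n - center)
    odd = 2 * min(center, n - center - 1) + 1
    return max(even, odd)


def crop_around_max(img, isz):
    """Crop an isz x isz window around the brightest pixel (hypothetical helper)."""
    ny, nx = img.shape
    ymax, xmax = (int(i) for i in np.unravel_index(np.argmax(img), img.shape))
    y0, x0 = ymax - isz // 2, xmax - isz // 2
    # Refuse to return a truncated (non-square) array: raise and suggest a size
    if y0 < 0 or x0 < 0 or y0 + isz > ny or x0 + isz > nx:
        isz_max = min(max_crop_size(xmax, nx), max_crop_size(ymax, ny))
        raise ValueError(
            f"The cropped image size, {isz}, goes out of bounds; "
            f"the largest size that fits around the maximum is {isz_max}"
        )
    return img[y0 : y0 + isz, x0 : x0 + isz]


# Same setup as the test image quoted in the review: 80x80 with an off-centre max
img = np.ones((80, 80))
img[57, 17] = 5.0
cropped = crop_around_max(img, 34)
```

Raising here, rather than silently returning a smaller array, surfaces the problem at the cropping step instead of later when the image is re-used.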

codecov · 2021-11-06T19:10:37Z

Codecov Report

Merging #79 (9bcf979) into master (dca5e07) will increase coverage by 2.22%.
The diff coverage is 85.56%.

@@            Coverage Diff             @@
##           master      #79      +/-   ##
==========================================
+ Coverage   48.24%   50.46%   +2.22%     
==========================================
  Files          19       21       +2     
  Lines        3646     3725      +79     
==========================================
+ Hits         1759     1880     +121     
+ Misses       1887     1845      -42

| Flag | Coverage Δ | |
|---|---|---|
| unittests | `50.46% <85.56%> (+2.22%)` | ⬆️ |

| Impacted Files | Coverage Δ | |
|---|---|---|
| amical/data_processing.py | `28.71% <69.56%> (+21.44%)` | ⬆️ |
| amical/tests/test_processing.py | `100.00% <100.00%> (ø)` | |
| amical/tests/test_tools.py | `100.00% <100.00%> (ø)` | |
| amical/tools.py | `33.08% <100.00%> (+2.41%)` | ⬆️ |
| amical/tests/test_extraction.py | `97.43% <0.00%> (-2.57%)` | ⬇️ |

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
vandalt · 2021-11-06T19:44:26Z

I also added x and y limits to the show_clean_params function, to restrict the plot to the actual image even if the background radii go out of bounds.

neutrinoceros

Mostly nits. In general I'm totally on board with this and I strongly believe clear warnings and errors are the bread and butter to build a satisfying UX.

neutrinoceros · 2021-11-06T19:52:29Z

amical/tests/test_tools.py

+    img_size = 80  # Same size as NIRISS images
+    img = np.ones((img_size, img_size))
+    xmax, ymax = 17, 57
+    img[ymax, xmax] = img.max() * 5  # Add off-centered max pixel


this looks like a very particular case, maybe it'd make a more robust test if the image was random. You can then manually set the center to the lowest value since it's important for this test that the max be off-centered

Do you mean making the whole image random instead of ones and one value > 1? Or do you mean making the max position random? I'm not sure what you mean by "manually set the center to the lowest value". Also, after looking into it a bit more, it should not matter that the location is off-centered, as long as the cropped size is bigger than the distance to the edges.

I made the test random in the last commit. Should I use pytest-repeat to repeat the test a few times to make it more robust ?
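
For reference, one way to randomize such a test while keeping it reproducible is to loop (or parametrize) over a handful of fixed seeds, which avoids needing pytest-repeat. The sketch below is only an illustration: `make_test_image` is a hypothetical helper, not the function in test_tools.py.

```python
import numpy as np


def make_test_image(seed, img_size=80):
    """Random image with one dominant pixel at a random position (hypothetical)."""
    rng = np.random.default_rng(seed)  # fixed seed keeps failures reproducible
    img = rng.random((img_size, img_size))  # background values in [0, 1)
    xmax, ymax = (int(i) for i in rng.integers(0, img_size, size=2))
    img[ymax, xmax] = img.max() * 5  # make this pixel the unique maximum
    return img, (xmax, ymax)


# Repeat the check over a few seeds instead of relying on a single layout
for seed in range(5):
    img, (xmax, ymax) = make_test_image(seed)
    found_y, found_x = np.unravel_index(np.argmax(img), img.shape)
    assert (int(found_x), int(found_y)) == (xmax, ymax)
```

With pytest, the same loop could be written as a parametrized test over the seed values.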


vandalt · 2021-11-08T04:51:46Z

While reviewing this, I added a test for the default behaviour of clean_data. Leaving most keyword arguments at their default value of None resulted in many errors in the functions called by clean_data. I edited clean_data to do the following:

If a keyword argument is None, the corresponding cleaning step is skipped.
If a cleaning step is set to True (e.g. apod or sky), but requires a keyword argument set to None, a warning is shown and the step is skipped. This could also raise an error, but I felt like warnings would be best suited for the default behaviour of a function.

I decided to add these changes here directly because they are close to the other changes, which aim to make the cleaning step more transparent for users. Also, I felt like having a clearly defined (and tested) default behaviour will make future changes easier to develop and test, so I wanted to add the changes upstream as quickly as possible.
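
The skip-or-warn behaviour described in this comment can be sketched roughly as follows for the sky step alone. This is an assumption-laden illustration, not the actual clean_data code: `sky_step_sketch`, its signature, the warning category, and the warning messages are all hypothetical, and the background is taken as the plain mean of an annulus centred on the image.

```python
import warnings

import numpy as np


def sky_step_sketch(img, r1=None, dr=None, sky=True):
    """Subtract an annulus background only when it can be computed (hypothetical)."""
    if not sky:
        return img  # step disabled: nothing to do, no warning
    if r1 is None or dr is None:
        # Step requested but its parameters were left at None: warn and skip
        warnings.warn(
            "sky is True but r1 or dr is None, skipping sky correction",
            RuntimeWarning,
        )
        return img
    ny, nx = img.shape
    yy, xx = np.indices(img.shape)
    r = np.hypot(xx - nx // 2, yy - ny // 2)
    annulus = (r >= r1) & (r <= r1 + dr)
    if not annulus.any():
        # Inner radius lies outside the image: warn instead of producing NaN
        warnings.warn(
            "sky radius is out of bounds, skipping sky correction",
            RuntimeWarning,
        )
        return img
    return img - np.mean(img[annulus])


img = np.ones((80, 80))
cleaned = sky_step_sketch(img, r1=10, dr=5)  # background subtracted
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    untouched = sky_step_sketch(img)  # defaults: warns and returns img as-is
```

Using warnings here means a call with all defaults still returns an image, which is what makes the default behaviour easy to test.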

vandalt and others added 6 commits November 6, 2021 13:52

Add error when cropped image is out of bounds

6786dbc

Add test for crop_max

b35b32d

Add tests for sky subtraction boundary problems

dcfdd8e

Handle when sky radii out of bounds and show warnings

c3d6b46

Fix matplotlib vmin/vmax deprecation warnings

f01b4e2

[pre-commit.ci] auto fixes from pre-commit.com hooks

6e1decb

for more information, see https://pre-commit.ci

vandalt mentioned this pull request Nov 6, 2021

Enhancement: Enable more flexible background/sky subraction #80

Closed

vandalt added 2 commits November 6, 2021 15:40

Limit cleaning plot to image dimensions

7b95326

Merge branch 'niriss-clean' of github.com:vandalt/AMICAL into niriss-…

70d34c5

…clean

neutrinoceros approved these changes Nov 6, 2021

vandalt and others added 10 commits November 6, 2021 16:01

Remove dot in error message

cc79f2a

Co-authored-by: Clément Robert <cr52@protonmail.com>

Also remove dot from message in code (not just test)

7642d2b

Fix max size suggestion: equal number of points on each side

56b9b36

Keep verbose option in sky_correction

0ac9e9f

Fix indexng in isz_max calculation

b64edf8

Make max location random in test_crop_max

a8e62d8

[pre-commit.ci] auto fixes from pre-commit.com hooks

e9764bc

Add failing test for clean_data without kwargs

18a176e

Make clean_data work with default kwargs

4c86814

[pre-commit.ci] auto fixes from pre-commit.com hooks

9bcf979

vandalt changed the title ~~Use more explicit errors and warnings for image boundary problems in cleaning~~ Use more explicit errors and warnings data cleaning Nov 8, 2021

vandalt changed the title ~~Use more explicit errors and warnings data cleaning~~ Use more explicit errors and warnings in data cleaning Nov 8, 2021

DrSoulain merged commit 5b6e38e into SAIL-Labs:master Nov 8, 2021

DrSoulain added a commit that referenced this pull request Nov 8, 2021

Use maximum isz as implemented by #79

e2950be

vandalt deleted the niriss-clean branch November 8, 2021 20:53

neutrinoceros mentioned this pull request Jul 5, 2022

MNT: upgrade GHA #138

Merged
