Fix edge case bug where grayscale image has an alpha channel, minor rtd fix #1313

Merged
merged 17 commits into from
Jan 30, 2024

Conversation

mathieuboudreau
Contributor

@mathieuboudreau mathieuboudreau commented Jan 29, 2024

Checklist

GitHub

  • I've given this PR a concise, self-descriptive, and meaningful title
  • I've linked relevant issues in the PR body
  • I've applied the relevant labels to this PR
  • I've assigned a reviewer

PR contents

Description

Linked issues

#1312

@mathieuboudreau mathieuboudreau marked this pull request as draft January 29, 2024 15:54
@coveralls

coveralls commented Jan 29, 2024

Pull Request Test Coverage Report for Build 7714334803

  • 0 of 3 (100.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.009%) to 74.278%

Totals Coverage Status
Change from base Build 7185147916: 0.009%
Covered Lines: 4678
Relevant Lines: 6298

💛 - Coveralls

Member

@joshuacwnewton joshuacwnewton left a comment

If I understand correctly, there are the following cases:

  1. Grayscale (1-channel) -> [H, W] or [H, W, D]
  2. Grayscale + alpha (2-channel) -> [H, W, 2] or [H, W, D, 2]
  3. RGB (3-channel) -> [H, W, 3] or [H, W, D, 3]
  4. RGB + alpha (4-channel) -> [H, W, 4] or [H, W, D, 4]

(And any of these cases could have an additional batch axis with N images.)

The code assumed that we would only have to deal with cases 1, 3, 4:

# 2. the colorspace is one of: binary, gray, RGB, RGBA (not aliasing ones like YUV or CMYK)

But we neglected to consider case 2 (grayscale + alpha)? (Hence the value error mentioning (744,1154,2), i.e. the last dim denotes a 2-channel image.)
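To make the four cases concrete, here is a minimal sketch that names the colorspace from an array shape alone. The helper `describe_channels` and the `CHANNEL_NAMES` lookup are hypothetical names introduced for illustration, not the ivadomed code, and the sketch assumes a single 2D image (no depth or batch axis), since `[H, W, D]` cannot be told apart from `[H, W, C]` by shape alone.

```python
from typing import Tuple

# Hypothetical lookup for illustration only; not the ivadomed implementation.
CHANNEL_NAMES = {2: "gray+alpha", 3: "RGB", 4: "RGBA"}


def describe_channels(shape: Tuple[int, ...], channel_axis: int = 2) -> str:
    """Name the colorspace of a 2D image shaped [H, W] or [H, W, C]."""
    if len(shape) <= channel_axis:
        # No channel axis at all: plain grayscale (case 1).
        return "gray"
    n_channels = shape[channel_axis]
    if n_channels not in CHANNEL_NAMES:
        raise ValueError(f"Unexpected channel count {n_channels} for shape {shape}")
    return CHANNEL_NAMES[n_channels]


print(describe_channels((744, 1154)))     # gray
print(describe_channels((744, 1154, 2)))  # gray+alpha
```

The second call uses the shape from the reported value error, which falls into case 2.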

ivadomed/loader/segmentation_pair.py Show resolved Hide resolved
setup.py Outdated Show resolved Hide resolved
@mathieuboudreau mathieuboudreau changed the title Fix edge case bug where grayscale image has an alpha channel Fix edge case bug where grayscale image has an alpha channel, minor rtd fix Jan 29, 2024
@mathieuboudreau mathieuboudreau marked this pull request as ready for review January 29, 2024 21:23
@mathieuboudreau mathieuboudreau added bug category: fixes an error in the code documentation category: readthedocs or user doc labels Jan 29, 2024
@mathieuboudreau
Contributor Author

This PR is mostly ready for review, except for a test (and test image?) that we should add to cover line 377 which was the original bug source/fix.

Member

@joshuacwnewton joshuacwnewton left a comment

Just a few nitpicks for now :)

ivadomed/loader/segmentation_pair.py Outdated Show resolved Hide resolved
ivadomed/loader/segmentation_pair.py Outdated Show resolved Hide resolved
setup.py Outdated Show resolved Hide resolved
mathieuboudreau and others added 2 commits January 29, 2024 18:10
Co-authored-by: Joshua Newton <joshuacwnewton@gmail.com>
Co-authored-by: Joshua Newton <joshuacwnewton@gmail.com>
Contributor

@kanishk16 kanishk16 left a comment

Idk why but it currently feels like a script to me rather than a module. Hence the suggested changes... More or less, the assert statements would not be required once we follow a flat hierarchy something along the lines:

colorspace_idx = 2

if _img.ndim <= colorspace_idx:
    pass
elif _img.shape[colorspace_idx] == 2:
    # gray + alpha
    ...
elif _img.shape[colorspace_idx] == 3:
    # RGB
    ...

ivadomed/loader/segmentation_pair.py Outdated Show resolved Hide resolved
ivadomed/loader/segmentation_pair.py Outdated Show resolved Hide resolved
ivadomed/loader/segmentation_pair.py Outdated Show resolved Hide resolved
ivadomed/loader/segmentation_pair.py Outdated Show resolved Hide resolved
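As a rough, runnable take on the flat `if`/`elif` layout suggested in this review, the sketch below reduces a 2D image to a single channel. It is not the code that was merged: the function name `to_grayscale` is hypothetical, dropping the alpha channel and averaging the RGB channels are assumptions made for illustration, and NumPy is assumed to be installed.

```python
import numpy as np


def to_grayscale(img: np.ndarray) -> np.ndarray:
    """Reduce a 2D image shaped [H, W] or [H, W, C] to [H, W] (illustrative only)."""
    colorspace_idx = 2

    if img.ndim <= colorspace_idx:
        # Already single-channel grayscale.
        return img
    elif img.shape[colorspace_idx] == 2:
        # gray + alpha: keep the gray channel, drop alpha (assumed behaviour).
        return img[:, :, 0]
    elif img.shape[colorspace_idx] in (3, 4):
        # RGB or RGBA: plain mean of the colour channels, ignoring any alpha
        # (assumed behaviour; a weighted luminance formula is another option).
        return img[:, :, :3].mean(axis=colorspace_idx)
    else:
        raise ValueError(f"Unexpected image shape: {img.shape}")


print(to_grayscale(np.zeros((744, 1154, 2))).shape)  # (744, 1154)
```

Every branch returns or raises, so no trailing assertion about the internal state is needed in this layout.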
@mathieuboudreau
Contributor Author

Idk why but it currently feels like a script to me rather than a module.

@kanishk16 not quite sure what you mean by this, is it because multiple steps aren't combined anymore? Personally, I prefer clarity at the cost of a few more lines, but I would really like this PR merged ASAP to help out our user in #1310 start training their data.

Are the changes you suggested mostly style preferences, or do you think they'll impact the performance? If it's the first part, can we push this through so the code becomes functional again now, and then open a new issue to discuss these stylistic details for this block of code?

If this isn't a compromise you're willing to agree to, then I'd suggest we revert all the changes and use my single line bug fix for expediency: 340295d

@mathieuboudreau
Contributor Author

There's actually a second user currently wanting to do training using these datasets (the other one posted directly to ADS: axondeepseg/axondeepseg#783), so this PR is fairly high priority to get merged ASAP.

mathieuboudreau and others added 2 commits January 30, 2024 09:58
Co-authored-by: Kanishk Kalra <36276423+kanishk16@users.noreply.github.com>
@mathieuboudreau
Contributor Author

@kanishk16 @joshuacwnewton I just reverted to my initial solution 340295d, because as I was doing the refactoring requested after @kanishk16's point-by-point suggestions, I noticed that it was simply converging back to this old code. The only additional change I made was the omission of the pointless "is_batch*".

Let me know if this is good for you

Contributor

@kanishk16 kanishk16 left a comment

@mathieuboudreau I didn't know it was a high-priority PR... I'm going to pre-approve this PR irrespective of the solution you choose which can be discussed later...

Member

@joshuacwnewton joshuacwnewton left a comment

I'm good with reverting my suggestion! This has the benefit of making the diff of this PR smaller, making the changes easier to understand. :)

Pre-approving as well, apart from some nitpicks.

ivadomed/loader/segmentation_pair.py Outdated Show resolved Hide resolved
ivadomed/loader/segmentation_pair.py Outdated Show resolved Hide resolved
@joshuacwnewton
Member

More or less, the assert statements would not be required

Side note: I agree that the assertions definitely weren't required.

Thoughts on assertions

I like adding assertions as a way of clarifying the internal state of the code to anyone reading it. (Basically, testing the assumptions we're making about internal state. The assertions should be relevant mainly to developers, but generally should never be seen by users.) In other words, the assertions should never fire if our assumptions are correct? But, if our assumptions are incorrect (as they were with grayscale+alpha images), then the code will fail in an unambiguous way to make debugging easier.

As a specific example, over in SCT, we've started using assertions whenever there is a bare else: case:

# Anti-pattern: We assume that argument will be either option_1/2, but what if it isn't?
# Code will continue then probably crash at some unintuitive line of code
if argument == "option_1":
   pass
elif argument == "option_2":
   pass

# Slightly better: All cases are handled thanks to 'else:'
# However, the internal state is now ambiguous -- what is the value of 'argument' inside the 'else:'?
if argument == "option_1":
   pass
else:
   pass

# Preferred by SCT: All cases handled *and* assumptions + internal state are both clear
# If assumptions are correct, assertion statement will do nothing
# If assumptions are false, then the code will fail in a clear/unambiguous way
if argument == "option_1":
   pass
else:
   assert argument == "option_2", f"Unexpected value for argument: {argument}"
   pass
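
To show what "fail in a clear/unambiguous way" looks like in practice, here is a small runnable version of the last pattern above. The function `pick_mode` and its return values are hypothetical, introduced only for this illustration.

```python
def pick_mode(argument: str) -> str:
    """Hypothetical two-option dispatcher using the assert-in-else pattern."""
    if argument == "option_1":
        return "first"
    else:
        # States the assumption explicitly; fires only if the assumption is wrong.
        assert argument == "option_2", f"Unexpected value for argument: {argument}"
        return "second"


print(pick_mode("option_2"))  # second

try:
    pick_mode("option_3")
except AssertionError as err:
    print(f"AssertionError: {err}")  # AssertionError: Unexpected value for argument: option_3
```

The failure names the offending value at the point where the assumption broke, rather than at some later line.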

I think it might just be a personal style thing -- when I write code, much like my writing in general, I tend to be pretty verbose? 😅

@kanishk16
Contributor

Out of all the comments, maintaining a flat hierarchy is significant as it usually improves performance and in the case of training large data, this could become relevant...

mathieuboudreau and others added 2 commits January 30, 2024 10:30
Co-authored-by: Joshua Newton <joshuacwnewton@gmail.com>
Co-authored-by: Joshua Newton <joshuacwnewton@gmail.com>
@mathieuboudreau mathieuboudreau removed the request for review from hermancollin January 30, 2024 14:31
@mathieuboudreau
Contributor Author

Thanks guys for all the feedback and work put into this! We ended up walking in a circle a bit but a little bit of exercise is not always bad =) haha

@kanishk16
Contributor

I guess in all this I forgot to ask one important ques, do we want to add the test img to this PR?

@mathieuboudreau
Contributor Author

I guess in all this I forgot to ask one important ques, do we want to add the test img to this PR?

Yup! Just did in that repo, will restart the tests now to see if there is increased coverage

@kanishk16
Contributor

We'd have to cut a release for it and add the new release tag here:

"url": ["https://github.com/ivadomed/data-testing/archive/r20220328.zip"],

Let me know if you want me to cut a release!

@mathieuboudreau
Contributor Author

We'd have to cut a release for it and add the new release tag here:

"url": ["https://github.com/ivadomed/data-testing/archive/r20220328.zip"],

Let me know if you want me to cut a release!

Ah gotcha - right, that's the same workflow we have in ADS. I just did it, let's see how the tests go!

@mathieuboudreau
Contributor Author

Line 357 got hit per the coverage report

Screenshot 2024-01-30 at 12 57 53 PM

Merging - thanks for the help @joshuacwnewton and @kanishk16 !

@mathieuboudreau mathieuboudreau merged commit 881dc68 into master Jan 30, 2024
12 checks passed
@mathieuboudreau mathieuboudreau deleted the mb/broadcast_bug branch January 30, 2024 16:59
@joshuacwnewton
Member

Thank you for taking care of the PR, @mathieuboudreau & @kanishk16. ♥️

((And, my apologies for taking up time on this PR by proposing a rewrite! Thank you for your patience.))

@kanishk16
Contributor

Thank you @mathieuboudreau, for all the patience while I stalled the PR & @joshuacwnewton for helping along.


I'd learn smthg.

I like adding assertions as a way of clarifying the internal state of the code to anyone reading it. (Basically, testing the assumptions we're making about internal state. The assertions should be relevant mainly to developers, but generally should never be seen by users.) In other words, the assertions should never fire if our assumptions are correct? But, if our assumptions are incorrect (as they were with grayscale+alpha images), then the code will fail in an unambiguous way to make debugging easier.

I completely agree with the above idea regarding assertions, but ... let's tackle it as per the case:

# Anti-pattern: We assume that argument will be either option_1/2, but what if it isn't?
# Code will continue then probably crash at some unintuitive line of code
if argument == "option_1":
   pass
elif argument == "option_2":
   pass

On point! ✔️

# Slightly better: All cases are handled thanks to 'else:'
# However, the internal state is now ambiguous -- what is the value of 'argument' inside the 'else:'?
if argument == "option_1":
   pass
else:
   pass

✔️

# Preferred by SCT: All cases handled *and* assumptions + internal state are both clear
# If assumptions are correct, assertion statement will do nothing
# If assumptions are false, then the code will fail in a clear/unambiguous way
if argument == "option_1":
   pass
else:
   assert argument == "option_2", f"Unexpected value for argument: {argument}"
   pass

This is where I feel assertion isn't the right way to go... WDYT about this:

if argument == "option_1":
   pass
elif argument == "option_2":
   pass
else:
   raise ValueError(f"Unexpected value for argument: {argument}")

...
# some in-place operations on argument
...
# now use assert to clarify the internal state before using it for some other operations
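
One practical difference between the two styles: Python drops `assert` statements when it runs with optimization enabled (for example `python -O`), whereas an explicit `raise` always executes. The sketch below demonstrates this in-process through the `optimize` argument of the built-in `compile()`. The helper `runs_cleanly` and the two snippets are hypothetical, written only for this illustration.

```python
def runs_cleanly(source: str, optimize: int) -> bool:
    """Compile and run `source`; return False if it raises one of the two errors."""
    # optimize=0 keeps assert statements, optimize=1 strips them (like `python -O`).
    code = compile(source, "<demo>", "exec", optimize=optimize)
    try:
        exec(code, {})
    except (AssertionError, ValueError):
        return False
    return True


SNIPPETS = {
    "assert": 'assert False, "assumption violated"',
    "raise": 'raise ValueError("unexpected value")',
}

for label, source in SNIPPETS.items():
    print(label, runs_cleanly(source, 0), runs_cleanly(source, 1))
# assert False True
# raise False False
```

Under this reading, `raise ValueError` suits checks on values that can come from users or data files, while `assert` suits internal invariants that only developers are expected to hit.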

I think it might just be a personal style thing -- when I write code, much like my writing in general, I tend to be pretty verbose? 😅

Nah, I didn't mean in that sense... think about it this way... pretty much in ivadomed there were too few comments, and suddenly, there's code along with comments...

I like your writing style, and I'd like to believe being verbose is your strength that brings your thoughts alive ❤️

@joshuacwnewton joshuacwnewton added this to the new release milestone Mar 12, 2024
@joshuacwnewton joshuacwnewton removed the documentation category: readthedocs or user doc label Mar 12, 2024
Labels
bug category: fixes an error in the code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants