Skip to content
This repository has been archived by the owner on Feb 23, 2024. It is now read-only.

printing data frames with validation function information #115

Merged
merged 23 commits into from
Feb 12, 2024

Conversation

viktorpm
Copy link
Collaborator

@viktorpm viktorpm commented Jan 31, 2024

Description

Implementing a new way of printing validation function results

What is this PR

  • Bug fix
  • Addition of a new feature
  • Other

Why is this PR needed?

Currently, only the lists of valid and invalid atlases are printed. One atlas can be in both lists as it might pass one validation function but not the other.

What does this PR do?

To get more information on why an atlas is invalid and which validation function it did not pass additional information is stored and printed in data frames.

How has this PR been tested?

It was tested locally and on the HPC

Is this a breaking change?

No

Does this PR require an update to the documentation?

No

Checklist:

  • The code has been tested locally
  • Tests have been added to cover all new functionality (unit & integration)
  • The documentation has been updated to reflect any changes
  • The code has been formatted with pre-commit

Copy link

codecov bot commented Jan 31, 2024

Codecov Report

Attention: 30 lines in your changes are missing coverage. Please review.

Comparison is base (a4acdb4) 0.00% compared to head (56686e1) 0.00%.
Report is 3 commits behind head on main.

❗ Current head 56686e1 differs from pull request most recent head dea87e4. Consider uploading reports for the commit dea87e4 to get more accurate results

Files Patch % Lines
bg_atlasgen/validate_atlases.py 0.00% 30 Missing ⚠️
Additional details and impacted files
@@          Coverage Diff          @@
##            main    #115   +/-   ##
=====================================
  Coverage   0.00%   0.00%           
=====================================
  Files         24      24           
  Lines       1943    1952    +9     
=====================================
- Misses      1943    1952    +9     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@viktorpm viktorpm self-assigned this Jan 31, 2024
Copy link
Member

@alessandrofelder alessandrofelder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you motivate the reason for needing better printing more (in the "Why is this PR needed" section of the description)?

pass additional information is stored

I don't understand what extra information is stored? I think all the info required is in our output now (the outputs are dictionaries, not lists, now, I think) but it may not be displayed nicely when printing.

Furthermore, I am not sure adding pandas as an extra dependency is necessary what you'd like achieve here.

We may want to move away from printing to writing a structured text file as output anyway - what do you think?

@viktorpm
Copy link
Collaborator Author

Thanks @alessandrofelder for the quick review. I'm a bit confused. I'm on the main branch and as far as I understand our outputs at the moment are the valid_atlases and the invalid_atlases lists: print(valid_atlases) and print(invalid_atlases).
They were defined earlier in the code as valid_atlases = [] and invalid_atlases = []
We also have access to the successful_validations and failed_validations dictionaries but we are not printing those. I wanted to have two tabular outputs with the list of atlases and the functions they passed/failed. Something like this:

Atlas Function
allen_mouse_100um validate_atlas_files
allen_mouse_100um validate_mesh_matches_image_extents

I just made this branch to help me debug the test functions as I don't fully understand their behaviour and it would be nice to see in a lookup table what validation functions pass or fail on each atlas.

@viktorpm
Copy link
Collaborator Author

Also, writing a structured text file is a great idea! I'm fully on board 🙂

@alessandrofelder
Copy link
Member

alessandrofelder commented Jan 31, 2024

They were defined earlier in the code as valid_atlases = [] and invalid_atlases = []

Sorry, yes, you're right - they are lists (of dictionaries).

We also have access to the successful_validations and failed_validations dictionaries but we are not printing those.

We are, indirectly, right, because we append them to the aforementioned lists?

Maybe we actually want one object, containing all the necessary info to be able to generate:

Atlas Function Passed
allen_mouse_100um validate_atlas_files True
allen_mouse_100um validate_mesh_matches_image_extents False
... ... ...

viktorpm and others added 2 commits February 1, 2024 16:25
…y the name of the successful and failed functions (not the function object) to lists in validate_atlases function
@viktorpm
Copy link
Collaborator Author

viktorpm commented Feb 1, 2024

Looks like it works as expected.
successful_validations.json
failed_validations.json

@alessandrofelder, is there a convention on where to save output files? For now, they are in the bg-atlasgen folder.

@alessandrofelder
Copy link
Member

I think they should go in ~/.brainglobe/atlases/validation or something along those lines in the BrainGlobe user data folder?
This would be in line with our aspirations detailed in brainglobe/BrainGlobe#26

@viktorpm viktorpm marked this pull request as ready for review February 2, 2024 16:03
Copy link
Member

@alessandrofelder alessandrofelder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two tiny comments for you to consider:

  • discussion point for a possible refactoring we could add on here
  • I think we can delete some variables that we've stopped using?

Looks great otherwise!

bg_atlasgen/validate_atlases.py Outdated Show resolved Hide resolved
bg_atlasgen/validate_atlases.py Outdated Show resolved Hide resolved
Comment on lines 163 to 165
successful_validations[atlas_name].append(
validation_function.__name__
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
successful_validations[atlas_name].append(
validation_function.__name__
)
validations[atlas_name].append(
validation_function.__name__, None
)

(:arrow_up: Just a sketch... and just a discussion point)

Should we combine the two lists while we're refactoring this, and have them have the same format (str(error) if failed, and None if valid)? The advantage would be simplicity

  • just one file
  • shorter and easier to read code

But it's entirely possible there's value in keeping the things separate? What do you think?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this suggestion! I was also thinking about doing this but wanted to have a chat first, as I wasn't sure how to implement it. I think it would be much better to have only one file with all the necessary information.

Copy link
Collaborator Author

@viktorpm viktorpm Feb 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alessandrofelder, I implemented these changes and tested them locally and on the HPC. Could you check them, please? 🙂
I think it's ready to be merged if you approve the changes.

result file:
validation_results.json

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @viktorpm - As per our developer's guide around PRs, you have the liberty to merge if your reviewer approves with optional comments without asking for another round of review :) I have had another look though and looks great!

viktorpm and others added 3 commits February 5, 2024 14:27
Co-authored-by: Alessandro Felder <alessandrofelder@users.noreply.github.com>
removing unused variables

Co-authored-by: Alessandro Felder <alessandrofelder@users.noreply.github.com>
@alessandrofelder alessandrofelder merged commit 03392f3 into main Feb 12, 2024
7 checks passed
@alessandrofelder alessandrofelder deleted the cosmetics branch February 12, 2024 10:13
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants