preprocess labels in dataloading step to remove labels by area threshold #37

Open
rbavery opened this issue Mar 29, 2022 · 1 comment

rbavery commented Mar 29, 2022

Some labels are non-informative due to tiling, like this example, where a natural seep is identified in the upper left corner of the tile but appears only as a small dot with very little signal.

Screenshot from 2022-03-29 12-40-39
Screenshot from 2022-03-29 12-40-25


rbavery commented May 6, 2022

Now that we have the stats defined in #44, we can look at the distribution of area for each category and come up with an area threshold that we think will address most of the artifacts introduced by tiling. We could also potentially get fancier with this filter and drop small-area annotations that occur at the scene edges. @lillythomas assigning you for now, but we can discuss when you're back.
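As a starting point for picking a threshold, here is a minimal sketch of summarizing annotation area per category. It assumes the annotations are available as a standard COCO-style dict with `categories` and `annotations` (each annotation carrying `category_id` and `area`); the function name `area_percentiles_by_category` and the default percentiles are hypothetical and not taken from the stats in #44.

```python
from collections import defaultdict

import numpy as np


def area_percentiles_by_category(coco, percentiles=(5, 25, 50)):
    """Return {category_name: {percentile: area}} for a COCO-style dict.

    Hypothetical helper: assumes every annotation has "category_id" and "area".
    """
    names = {c["id"]: c["name"] for c in coco["categories"]}
    areas = defaultdict(list)
    for ann in coco["annotations"]:
        areas[names[ann["category_id"]]].append(ann["area"])
    return {
        name: dict(zip(percentiles, np.percentile(vals, percentiles).tolist()))
        for name, vals in areas.items()
    }
```

Low percentiles that sit far below the median for a category would be one hint that tiling has produced small fragments, and could suggest a candidate threshold for that category.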

I think we should apply this filter after the COCO dataset has been parsed by the icevision and fastai trainers. So the function that does the filtering should operate on a numpy array, with another function to handle the icevision record or fastai2 sample.
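A rough sketch of what the numpy-level function could look like, under some assumptions: instance masks arrive as a boolean array of shape `(N, H, W)` with a matching `labels` array of shape `(N,)`, area is measured in pixels, and "edge" means the border of the array that is passed in. The name `filter_masks_by_area` and the `only_at_edges` flag are hypothetical.

```python
import numpy as np


def filter_masks_by_area(masks, labels, min_area, only_at_edges=False):
    """Drop instance masks whose pixel area is below min_area.

    masks:  bool or 0/1 array of shape (N, H, W), one instance per slice.
    labels: array of shape (N,) with the class id of each instance.
    only_at_edges: if True, drop a small mask only when it touches the
        array border (assumed here to stand in for the tile/scene edge).
    Returns (kept_masks, kept_labels, keep), where keep is a bool index
    of shape (N,).
    """
    masks = np.asarray(masks).astype(bool)
    labels = np.asarray(labels)
    areas = masks.sum(axis=(1, 2))  # pixel count per instance
    drop = areas < min_area
    if only_at_edges:
        touches_edge = (
            masks[:, 0, :].any(axis=1)
            | masks[:, -1, :].any(axis=1)
            | masks[:, :, 0].any(axis=1)
            | masks[:, :, -1].any(axis=1)
        )
        drop &= touches_edge
    keep = ~drop
    return masks[keep], labels[keep], keep
```

Returning the `keep` index means a separate wrapper could use it to drop the matching boxes and labels from an icevision record or fastai2 sample. That wrapper is not sketched here, since it depends on how each library structures its records.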
