
Create a validation suite #17

Open · 1 of 2 tasks
iparask opened this issue Aug 29, 2018 · 17 comments
Labels: enhancement (New feature or request)

iparask (Member) commented Aug 29, 2018

The use case requires a validation suite that lets us run the pipelines and detect whether their outputs differ from a known reference.

  • Brad ToDo: Provide a link with images and the necessary output for validation.
  • Giannis ToDo: Create a script to run the pipeline and validate the output (see the sketch below).
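
A minimal sketch of what such a validation script could look like. The pipeline command, file paths, and CSV column names are hypothetical placeholders; only the idea of running the pipeline and comparing per-scene counts against a stored reference comes from this issue.

```python
"""Validation-suite sketch: run the pipeline, then compare per-scene
counts against stored reference counts. All paths, the pipeline command,
and the CSV layout (columns: scene, count) are hypothetical."""
import csv
import subprocess
import sys

TOLERANCE = 0.10  # assumed maximum relative difference in counts


def load_counts(path):
    """Read a CSV with 'scene' and 'count' columns into a dict."""
    with open(path) as f:
        return {row["scene"]: float(row["count"]) for row in csv.DictReader(f)}


# Run the pipeline on the validation inputs (hypothetical invocation).
subprocess.check_call(["python", "pipeline.py", "--test_dir", "validation/"])

reference = load_counts("validation/reference_counts.csv")
predicted = load_counts("validation/predicted_counts.csv")

failures = []
for scene, ref in reference.items():
    pred = predicted.get(scene, 0.0)
    rel_diff = abs(pred - ref) / max(ref, 1.0)  # guard against zero counts
    if rel_diff > TOLERANCE:
        failures.append((scene, ref, pred, rel_diff))

for scene, ref, pred, rel_diff in failures:
    print(f"FAIL {scene}: reference={ref}, predicted={pred}, rel_diff={rel_diff:.2f}")

sys.exit(1 if failures else 0)
```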
iparask added the enhancement label Aug 29, 2018
iparask added this to the "Execute last year's dataset" milestone Sep 6, 2018

bspitzbart (Contributor) commented

The predicted_shapefiles output needed for validation is usually found in a subdirectory of the input images directory (given by the --test_dir argument). It looks like:
$test_dir/WideResnetCount/predicted_shapefiles/
They should be at the same directory level as the CSV output.
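
A small sketch of resolving that layout in Python; the WideResnetCount/predicted_shapefiles path comes from the comment above, while the test_dir value is a placeholder.

```python
# Locate the predicted shapefiles given the same --test_dir the pipeline
# was run with. Subdirectory names are from the comment above; the
# test_dir value is a placeholder.
import glob
import os

test_dir = "/path/to/test_dir"  # the value passed via --test_dir
shp_dir = os.path.join(test_dir, "WideResnetCount", "predicted_shapefiles")
shapefiles = sorted(glob.glob(os.path.join(shp_dir, "*.shp")))
print(f"found {len(shapefiles)} predicted shapefiles in {shp_dir}")
```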

iparask (Member, Author) commented Sep 14, 2018

That is where I would look for them; do you have some precalculated outputs that I can use for validation?

iparask (Member, Author) commented Oct 19, 2018

There is a location on the project drive that has the validation input as well as the validation output for Seals.

iparask (Member, Author) commented Nov 1, 2018

This is blocked by #24 and #34.

iparask (Member, Author) commented Jan 4, 2019

We have to finalize this aspect of the project.

bspitzbart (Contributor) commented

I added 5 images on Bridges to use for validation:
/pylon5/mc3bggp/bspitz/Seals
WV03_20170301144508_104001002A465700_17MAR01144508-P1BS-501556087040_01_P001_u08rf3031.tif
WV03_20170204000736_104001002678E600_17FEB04000736-P1BS-501513717060_01_P001_u08rf3031.tif
WV03_20170203044432_1040010028C45900_17FEB03044432-P1BS-501172834040_01_P001_u08rf3031.tif
WV03_20170202221638_104001002983B700_17FEB02221638-P1BS-501174626090_01_P002_u08rf3031.tif
WV03_20170202190457_1040010028965400_17FEB02190457-P1BS-501172778010_01_P004_u08rf3031.tif

iparask (Member, Author) commented Jan 10, 2019

Can you move them to a folder named validation? I think this is the same folder as the one used for experiments, right?

Furthermore, I think we should find a set of unclassified images that we can use for validation. We will need to run them multiple times as we develop the ICEBERG framework.

bspitzbart (Contributor) commented

OK, moved to /pylon5/mc3bggp/bspitz/Validation.
What do you mean by unclassified?

iparask (Member, Author) commented Jan 10, 2019

Images that we can host on a public web server with open access.

Let's say we need the space those images occupy for another use case (I know it sounds extreme, but this is hypothetical) and we remove them. At the same time, I make a change in the EnTK scripts because the API changed. I execute the validation suite as the last level of testing and it fails. We would probably end up investigating a bug that does not exist, because the input was simply not present.

I know this is an extreme case, but I think it is going to happen at some point in the lifetime of the software.

bspitzbart (Contributor) commented

Yes, I see. Because of the DigitalGlobe licensing, such public images do not exist. But we should be able to create synthetic data that satisfies the validation criteria and can be public.
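
For what it's worth, a minimal sketch of writing a synthetic scene with rasterio. The size, resolution, origin, file name, and CRS (EPSG:3031, guessed from the u08rf3031 suffix in the scene names above) are all assumptions, and real synthetic data would also need to satisfy the validation criteria.

```python
# Sketch: generate a synthetic single-band GeoTIFF that could stand in
# for a licensed scene. Size, resolution, origin, CRS, and file name are
# illustrative assumptions only.
import numpy as np
import rasterio
from rasterio.transform import from_origin

data = np.random.randint(0, 255, size=(1024, 1024), dtype=np.uint8)
transform = from_origin(-2700000.0, 2700000.0, 0.31, 0.31)  # hypothetical origin/pixel size

with rasterio.open(
    "synthetic_validation_scene.tif", "w",
    driver="GTiff", height=data.shape[0], width=data.shape[1],
    count=1, dtype=str(data.dtype), crs="EPSG:3031", transform=transform,
) as dst:
    dst.write(data, 1)
```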

iparask (Member, Author) commented Jan 10, 2019

That would be perfect!

iparask removed this from the "Execute last year's dataset" milestone Jan 11, 2019
iparask (Member, Author) commented Jan 11, 2019

Hello Brad, I pushed a feature/validation_suite branch. Can you push the output there, or do you prefer to have it somewhere else?

bspitzbart (Contributor) commented

@iparask Thank you for the restructure. The scripts are in place. We just need to run merge_shapefiles.py on the output from the validation data, then test_sealnet.py to compare with the ground truths in test_shapefiles (no GPU needed).
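
To capture those two steps concretely, a sketch of driving them from Python; the script names come from the comment above, but their command-line arguments and the paths are assumptions about the actual interfaces.

```python
# Sketch of the two-step validation run described above. Script names are
# from this thread; argument names and paths are assumptions.
import subprocess

# Step 1: merge the shapefiles predicted on the validation data.
subprocess.check_call([
    "python", "merge_shapefiles.py",
    "--input_dir", "validation/WideResnetCount/predicted_shapefiles",
])

# Step 2: compare the merged predictions against the ground truths
# stored in test_shapefiles (no GPU needed for this step).
subprocess.check_call([
    "python", "test_sealnet.py",
    "--ground_truth", "validation_suite/test_shapefiles",
])
```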

iparask (Member, Author) commented Jan 17, 2019

Alright, I'm getting there!

iparask (Member, Author) commented Jan 29, 2019

May I ask what a good tolerance value would be? In addition, the shapefiles under validation_suite/test_shapefiles produce the following output:

| # | ground-truth_count | precision | predicted_count | recall | scene |
|---|--------------------|-----------|-----------------|--------|-------|
| 0 | 165.0 | 0.0 | 90.0 | 0.0 | WV03_20170301144508_104001002A465700_17MAR01144508-P1BS-501556087040_01_P001_u08rf3031.tif |
| 1 | 22.0 | 0.0 | 39.0 | 0.0 | WV03_20170204000736_104001002678E600_17FEB04000736-P1BS-501513717060_01_P001_u08rf3031.tif |
| 2 | 351.0 | 0.0 | 369.0 | 0.0 | WV03_20170203044432_1040010028C45900_17FEB03044432-P1BS-501172834040_01_P001_u08rf3031.tif |
| 3 | 16.0 | 0.0 | 18.0 | 0.0 | WV03_20170202221638_104001002983B700_17FEB02221638-P1BS-501174626090_01_P002_u08rf3031.tif |

Is this expected? Is there a chance that the test shapefiles in the repo are outdated?
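
On the tolerance question, a sketch of a count-based check using the numbers from the table above; the 20% relative tolerance is purely illustrative, since the right value is exactly what is being asked here. With that value, rows 0 and 1 would fail while rows 2 and 3 would pass (the all-zero precision/recall columns are a separate puzzle).

```python
# Sketch: apply a relative-count tolerance to the numbers in the table
# above. The 20% tolerance is an illustrative placeholder, not a project
# decision.
pairs = {  # table row -> (ground_truth_count, predicted_count)
    0: (165.0, 90.0),
    1: (22.0, 39.0),
    2: (351.0, 369.0),
    3: (16.0, 18.0),
}
TOLERANCE = 0.20  # assumed relative tolerance

for row, (gt, pred) in pairs.items():
    rel_diff = abs(pred - gt) / gt
    status = "ok" if rel_diff <= TOLERANCE else "FAIL"
    print(f"row {row}: gt={gt:.0f} pred={pred:.0f} rel_diff={rel_diff:.2f} {status}")
```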

iparask (Member, Author) commented Feb 8, 2019

@bentocg @bspitzbart Can you provide me with instructions on how to run the validation scripts? Also, can you update the test_shapefiles on git?

bspitzbart (Contributor) commented

@bentocg Are the correct validation inputs and shp files in the following folder? https://drive.google.com/drive/folders/1UQKjGcIjeYmci0i9CpsMm5Yia1cbaOZC
