Workflow of Housing Passports on Dominica #9

Open
piligab opened this issue Oct 25, 2023 · 6 comments

Comments

@piligab

piligab commented Oct 25, 2023

@developmentseed/data-team is going to annotate the building properties and building parts 🚀.

The specific classes and subclasses are 👇:

Building properties

| Classes | Subclasses |
| --- | --- |
| Building completeness | complete |
| | incomplete |
| Building material | brick_or_cement-concrete_block |
| | plaster |
| | wood_polished |
| | wood_crude_plank |
| | adobe |
| | corrugated_metal |
| | stone_with_mud_ashlar_with_lime_or_cement |
| | container_trailer |
| | mix_other_unclear |
| Building use | residential |
| | mixed |
| | commercial |
| | critical_infrastructure |
| Building security | unsecured |
| | secured |
| Building condition | fair |
| | poor |
| | good |

Building parts

Classes
window
door
garage
disaster_mitigation

Resources

cc. @srmsoumya

@piligab
Author

piligab commented Oct 26, 2023

@srmsoumya, we did the sample annotation: in total we reviewed 850 images and annotated 158 of them. The low number of labeled images is mainly due to images collected on roads where there are no buildings, or where the bases of the buildings are not visible.

The CVAT task links to review for the WB team are:

Here are some notes from the images and labeling of the sample subset:

  • As per the labeling guidelines from previous phases, we do not label images where the building base is not visible; we simply skip them. For instance, when barriers, walls, or other objects in front of the building obstruct the base, it is impossible to determine where to begin drawing the bounding box.
    That is why some images, like the one below, contain unlabeled buildings: only the base of the fence is visible, not the base of the building itself.
    CWTPpXVg

  • Tilted buildings: We are finding and labeling images where buildings are tilted, like the below examples. Will this be effective for machine learning?
    Selection_689
    Selection_690

  • Car roof in the images: At the bottom of the images, a part of the car is visible. Apparently, this happens in almost all the images.
    Selection_691

cc. @karitotp, @ediyes

@srmsoumya
Member

srmsoumya commented Oct 27, 2023

Thanks for the detailed notes @piligab

> Tilted buildings: We are finding and labeling images where buildings are tilted, like the below examples. Will this be effective for machine learning?

Yes, it is okay to label tilted buildings; they will be useful for model training. We will also see this in future data collection, so the model should learn to handle it.

> Car roof in the images: At the bottom of the images, a part of the car is visible. Apparently, this happens in almost all the images.

Ah okay, we shared this feedback with the WB team after looking at the first sample of the data collection. They made some adjustments to the camera to reduce the space occupied by the car top.
@nualacowan this is probably a lesson for the next phase of data collection.

@nualacowan

nualacowan commented Oct 30, 2023

Thanks all for the detailed notes. The car roof issue was noted early on and, despite much readjustment, this was the best we could do with the basic rig. We are looking at getting an extender for the camera mount, in the hope that it will keep the roof out of future images.

Question on the tilting. Was this to do with how the 360 images were split, or something to do with the capture angle of the camera? We would like to know for future image capture teams.

Note on the low number of buildings per image searched: as our camera was essentially "piggybacking" on an existing survey focused on road surface condition, we had to take what road stretches we could get, many of which were rural in nature. When we return to take more images, we plan to focus on more "built-up areas".
Will the low number of labels affect the model development/performance? Do you think you will have enough?

@piligab
Author

piligab commented Oct 31, 2023

> Question on the tilting. Was this to do with how the 360 images were split, or, something to do with the capture angle of the camera? Would like to know for future image capture teams.

It is related to the capture angle of the camera. We reviewed the original 360 image and the buildings are already tilted there, so they remain tilted after the split, as we can see in the examples below 👇.

original
original image

530662765915005_left
left image
530662765915005_right
right image

> Will the low number of labels affect the model development/performance? Do you think you will have enough?

From the sample, we reviewed 850 images and annotated 158, so the annotated images are only about 19% of the images reviewed. Extrapolating that rate to the full set of around 26k images, we would end up with approximately 5k annotated images.
@srmsoumya, will approximately 5k annotated images be enough for the model?
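
For reference, the extrapolation above can be reproduced with a few lines of Python. This is only a rough sketch: it assumes the 850-image sample is representative of the whole collection, and it treats "around 26k" as exactly 26,000. The variable names are illustrative, not taken from any project code.

```python
# Figures reported in this thread for the sample annotation.
reviewed_sample = 850
annotated_sample = 158
total_images = 26_000  # assumption: "around 26k" taken as exactly 26,000

# Share of reviewed images that ended up with at least one annotation.
annotation_rate = annotated_sample / reviewed_sample

# Naive linear extrapolation to the full image set.
expected_annotated = annotation_rate * total_images

print(f"annotation rate: {annotation_rate:.1%}")
print(f"expected annotated images: {expected_annotated:,.0f}")
```

The estimate is only as good as the sample; if the remaining road stretches have more or fewer buildings than the sampled ones, the final count will differ.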

cc. @nualacowan

@srmsoumya
Member

Thank you @piligab

@nualacowan the tilt in the images may be due to the inclination of the road.

> Will the low number of labels affect the model development/performance? Do you think you will have enough?

Having only 5k images may not be sufficient for training the model, as we need to further split them into train, test, and validation sets.
There are a few potential workarounds:

  • Using images from previous phases, but they may not be the same as the current panoramic 360 images, so the transfer may not be effective.
  • Collecting more data points in the surrounding area, particularly in areas with more built-up structures.
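
To make the concern about splitting concrete, here is a minimal sketch of how roughly 5k annotated images would be divided. The 70/15/15 ratios, the seed, and the `split_indices` helper are assumptions for illustration only; the thread does not say which ratios or tooling will actually be used.

```python
import random

def split_indices(n_images, train=0.70, val=0.15, seed=42):
    """Shuffle image indices and cut them into train/val/test lists.

    The ratios are hypothetical; whatever is left after train and val
    goes to the test set.
    """
    indices = list(range(n_images))
    random.Random(seed).shuffle(indices)
    n_train = round(n_images * train)
    n_val = round(n_images * val)
    return (
        indices[:n_train],
        indices[n_train:n_train + n_val],
        indices[n_train + n_val:],
    )

train_ids, val_ids, test_ids = split_indices(5_000)
print(len(train_ids), len(val_ids), len(test_ids))
```

With these assumed ratios only about 3.5k images would be left for training, which is why the workarounds above may be needed.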

@piligab
Author

piligab commented Nov 22, 2023

@srmsoumya, @nualacowan 👋.

We have finished the annotation 🚀. The deliverables are available on S3 --> s3://hp-deliverables-v2/.
@yunica, you can use those files for the next steps of postprocessing.

We have annotated a total of 9,186 images for building properties and 800 for building parts (we know that building parts are not a priority in this phase, but we made these annotations before it was decided that this phase would focus only on building properties).

Here are the stats in detail 👇:

Building properties

| Building properties | total annotated boxes | total annotated images |
| --- | --- | --- |
| left | 6,484 | 4,902 |
| right | 5,549 | 4,284 |
| total | 12,033 | 9,186 |

Stats per class

| building_completeness | total annotated boxes |
| --- | --- |
| complete | 10,077 |
| incomplete | 1,956 |

| building_material | total annotated boxes |
| --- | --- |
| brick_or_cement-concrete_block | 724 |
| plaster | 7,636 |
| wood_polished | 506 |
| wood_crude-plank | 52 |
| adobe | 0 |
| corrugated_metal | 132 |
| stone_with_mud-ashlar_with_lime_or_cement | 70 |
| container-trailer | 33 |
| mix-other-unclear | 2,880 |

| building_use | total annotated boxes |
| --- | --- |
| residential | 10,277 |
| mixed | 821 |
| commercial | 811 |
| critical_infrastructure | 124 |

| building_security | total annotated boxes |
| --- | --- |
| unsecured | 9,018 |
| secured | 3,015 |

| building_condition | total annotated boxes |
| --- | --- |
| fair | 7,506 |
| poor | 3,587 |
| good | 940 |
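
As a quick sanity check on the numbers above, the following sketch confirms that every property sums to the 12,033 total boxes and prints the share of the rarest subclass for each one. The counts are copied from the stats in this comment; the `stats` dictionary and the `class_shares` helper are illustrative names, not part of the deliverables.

```python
TOTAL_BOXES = 12_033  # total annotated boxes reported for building properties

# Counts copied from the per-class stats in this comment.
stats = {
    "building_completeness": {"complete": 10_077, "incomplete": 1_956},
    "building_material": {
        "brick_or_cement-concrete_block": 724,
        "plaster": 7_636,
        "wood_polished": 506,
        "wood_crude-plank": 52,
        "adobe": 0,
        "corrugated_metal": 132,
        "stone_with_mud-ashlar_with_lime_or_cement": 70,
        "container-trailer": 33,
        "mix-other-unclear": 2_880,
    },
    "building_use": {
        "residential": 10_277,
        "mixed": 821,
        "commercial": 811,
        "critical_infrastructure": 124,
    },
    "building_security": {"unsecured": 9_018, "secured": 3_015},
    "building_condition": {"fair": 7_506, "poor": 3_587, "good": 940},
}

def class_shares(counts):
    """Return each subclass count as a fraction of the property total."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}

for prop, counts in stats.items():
    # Every box carries one subclass per property, so totals should match.
    assert sum(counts.values()) == TOTAL_BOXES, prop
    shares = class_shares(counts)
    rarest = min(shares, key=shares.get)
    print(f"{prop}: rarest subclass is {rarest} ({shares[rarest]:.1%})")
```

The shares make the class imbalance easy to see, for example `adobe` has no boxes at all and several material subclasses fall below 1% of the total.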

Building parts

| Building parts | total annotated images |
| --- | --- |
| left | 800 |
| right | 0 |
| total | 800 |

Stats per class

| Building parts | total annotated boxes |
| --- | --- |
| disaster_mitigation | 20 |
| door | 639 |
| garage | 140 |
| window | 2,634 |
cc. @Rub21 @ediyes @karitotp
