Feature : Multimask Training #29

Merged
merged 10 commits into from
Sep 3, 2024

@kshitijrajsharma (Member) commented Mar 24, 2024

What does this PR do?

Previously we used two classes, building (1) and background (0), which gives binary masks. This works well when buildings are separated from each other, but it struggles when buildings are closely attached to each other. We often observe this in places such as slums and dense city areas. To solve this problem I came up with a multimask approach instead of binary masks, which teaches the model about the nature of building boundaries so that it can separate them better than before.

  • This PR introduces new multimask labels that are used in training for RAMP

Consideration

In these multimask labels we use the following classes:

"background" - 0
"buildings" - 1
"boundary" - 2
"close_contact" - 3

These classes are derived from the RAMP utils. This PR also introduces two new preprocessing parameters: input_contact_spacing and input_boundary_width.

Definition

input_contact_spacing : contact_spacing deals with the interaction between two separate building shapes. This concept uses a positive buffer, extending outward from the edges of a building's shape, to see if and where it intersects with the buffer of another building.

input_boundary_width: boundary_width refers to creating a specific type of border or margin around the original shape of the building. This is achieved by applying a "negative buffer" to the building's shape. A negative buffer essentially shrinks the original shape inward by a specified distance, creating a smaller shape within the original. The space between the original shape and this smaller, inwardly adjusted shape forms the boundary.
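
To make the two definitions concrete, here is a minimal raster sketch of the idea. It is not the RAMP implementation: the description above works on building shapes with positive and negative buffers, while this sketch approximates them with pixel dilation and erosion on a small grid. The helper names (rect_mask, dilate, erode, multimask) and the painting order (footprint, then boundary, then contact on top) are assumptions made for illustration.

```python
# Class values as listed in this PR.
BACKGROUND, FOOTPRINT, BOUNDARY, CONTACT = 0, 1, 2, 3


def rect_mask(rect, height, width):
    """Rasterise a half-open rectangle (x0, y0, x1, y1) into a 0/1 grid."""
    x0, y0, x1, y1 = rect
    return [[1 if (x0 <= x < x1 and y0 <= y < y1) else 0 for x in range(width)]
            for y in range(height)]


def dilate(mask, r):
    """Positive buffer: grow the shape by r pixels (square neighbourhood)."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for yy in range(max(0, y - r), min(h, y + r + 1)):
                    for xx in range(max(0, x - r), min(w, x + r + 1)):
                        out[yy][xx] = 1
    return out


def erode(mask, r):
    """Negative buffer: keep a pixel only if its whole r-neighbourhood is inside."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = int(all(
                0 <= yy < h and 0 <= xx < w and mask[yy][xx]
                for yy in range(y - r, y + r + 1)
                for xx in range(x - r, x + r + 1)
            ))
    return out


def multimask(rects, height, width, boundary_width, contact_spacing):
    """Combine footprint, boundary and contact into one label grid.

    Assumed painting order (not taken from RAMP): footprint, then boundary,
    then contact on top.
    """
    labels = [[BACKGROUND] * width for _ in range(height)]
    overlap = [[0] * width for _ in range(height)]
    for rect in rects:
        shape = rect_mask(rect, height, width)
        core = erode(shape, boundary_width)      # shape shrunk inward
        grown = dilate(shape, contact_spacing)   # shape extended outward
        for y in range(height):
            for x in range(width):
                if shape[y][x]:
                    # Pixels lost by the inward shrink form the boundary band.
                    labels[y][x] = FOOTPRINT if core[y][x] else BOUNDARY
                overlap[y][x] += grown[y][x]
    for y in range(height):
        for x in range(width):
            # Contact: the outward buffers of at least two buildings meet here.
            if overlap[y][x] >= 2:
                labels[y][x] = CONTACT
    return labels


# Two hypothetical buildings separated by a one-pixel gap (column 4).
buildings = [(1, 1, 4, 5), (5, 1, 8, 5)]
labels = multimask(buildings, height=6, width=9, boundary_width=1, contact_spacing=1)
for row in labels:
    print("".join(str(v) for v in row))
```

In this toy grid the gap column between the two buildings is labelled as contact, the outer ring of each building as boundary, and only the inner pixels keep the footprint value.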

Why these options?

There are two main concerns here. One: how to distinguish a building from the background with accurate tracing of its edges (boundary). Two: how to make sure buildings are delineated correctly when they are very close to each other (contact).

Visualization :

| Color | Band Value | Class Name |
|-------|------------|------------|
| Black | 0          | background |
| White | 1          | footprint  |
| Blue  | 2          | boundary   |
| Red   | 3          | contact    |

image

What is the pixel unit?

Technically, when working with rasters, the calculation has to be done in terms of pixels even when we specify units in meters. The real-world size of a pixel depends on the resolution of the image: each TMS has a different resolution, and the pixel width differs with the zoom level. That is exactly why the input parameters are given in pixels, to maintain consistency between different zoom levels.

Formula :

$$\text{Real-world width (meters)} = \text{Pixel width} \times \text{Resolution (meters per pixel)}$$
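
As a worked example of this formula, the sketch below converts between pixels and meters. It assumes Web Mercator imagery served as 256-pixel tiles, where the resolution follows the standard zoom-level relation; the PR itself only states the generic formula, so the function names and the tile-size default are illustrative assumptions.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis used by Web Mercator


def ground_resolution(zoom, latitude_deg=0.0, tile_size=256):
    """Meters per pixel for a Web Mercator tile at a given zoom and latitude."""
    equator_circumference = 2 * math.pi * EARTH_RADIUS_M
    scale = math.cos(math.radians(latitude_deg))
    return equator_circumference * scale / (tile_size * 2 ** zoom)


def pixels_to_meters(pixels, zoom, latitude_deg=0.0):
    """Real-world width = pixel width x resolution (meters per pixel)."""
    return pixels * ground_resolution(zoom, latitude_deg)


def meters_to_pixels(meters, zoom, latitude_deg=0.0):
    """Inverse conversion, e.g. to pick a pixel width for a target distance."""
    return meters / ground_resolution(zoom, latitude_deg)


# The same 3-pixel width covers a different distance at each zoom level.
for zoom in (18, 19, 20):
    print(zoom, round(pixels_to_meters(3, zoom), 3))
```

The loop shows why a fixed distance in meters does not map to a fixed number of pixels: each extra zoom level halves the distance covered by one pixel.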

Screenshot

Predictions with -

Binary masks training :

image

Multimasks training :
image

You can see a clear separation of buildings in the second screenshot. The model is now able to distinguish buildings more accurately than before because it has learned what building boundaries look like.

How to test?

Find related model here : https://fair-dev.hotosm.org/start-mapping/121

Publish two different trainings and compare outputs

  • Training with multimasks : Training 438
  • Training with binary masks : Training 425

What next?

I haven't worked on inference yet. You should see a difference with the binary inference method too: at this point the model outputs extra classes, yet it should still be able to distinguish building footprints as binary masks do (both approaches keep the same band value for buildings, which is 1, and the burn value is 255). It would be nice to have multimask prediction too, as it would help to compare the different classes.
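
One way to picture the relationship between the two outputs is the hypothetical helper below, which collapses a multimask label grid into a binary footprint raster using the band value 1 and burn value 255 mentioned above. Inference is not part of this PR, so this is only a sketch: the function name and the building_classes parameter are assumptions, and whether boundary pixels should also count as building is left as a choice for the caller.

```python
def to_binary(labels, building_classes=(1,), burn_value=255):
    """Burn the chosen classes to burn_value and everything else to 0."""
    return [[burn_value if value in building_classes else 0 for value in row]
            for row in labels]


# Tiny label grid using the class values from this PR:
# 0 background, 1 buildings, 2 boundary, 3 close_contact.
sample = [
    [0, 2, 2, 3],
    [0, 2, 1, 3],
]

footprint_only = to_binary(sample)
with_boundary = to_binary(sample, building_classes=(1, 2))
print(footprint_only)
print(with_boundary)
```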

With this multimask approach, future integration of models that support multimasks, such as YOLO, becomes easier. Check out the development going on in another open PR within this repo.

@kshitijrajsharma kshitijrajsharma added the enhancement New feature or request label Mar 24, 2024
@kshitijrajsharma kshitijrajsharma marked this pull request as ready for review March 25, 2024 06:14
@kshitijrajsharma kshitijrajsharma merged commit c876f9e into master Sep 3, 2024
1 check passed
@kshitijrajsharma kshitijrajsharma deleted the feature/multimasks branch September 3, 2024 09:10