Running SAM backbone on frontend #6019

Merged
merged 23 commits into from
May 11, 2023

Conversation

@bsekachev bsekachev commented Apr 13, 2023

Motivation and context

Resolved #5984
Resolved #6049
Resolved #6041

  • Compatible only with the sam_vit_h_4b8939.pth weights. To support other weights, re-export the ONNX mask decoder with the custom model changes shown below, or download a pre-exported decoder from the links below.
  • The serverless function must be redeployed because its interface has changed: ./deploy_gpu.sh pytorch/facebookresearch/sam/nuclio/

Decoders for other weights:
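sam_vit_l_0b3195.pth: Download (https://drive.google.com/file/d/1Nb5CJKQm_6s1n3xLSZYso6VNgljjfR-6/view?usp=sharing)
sam_vit_b_01ec64.pth: Download (https://drive.google.com/file/d/17cZAXBPaOABS170c9bcj9PdQsMziiBHw/view?usp=sharing)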

Changes done in ONNX part:

git diff scripts/export_onnx_model.py
diff --git a/scripts/export_onnx_model.py b/scripts/export_onnx_model.py
index 8441258..18d5be7 100644
--- a/scripts/export_onnx_model.py
+++ b/scripts/export_onnx_model.py
@@ -138,7 +138,7 @@ def run_export(

     _ = onnx_model(**dummy_inputs)

-    output_names = ["masks", "iou_predictions", "low_res_masks"]
+    output_names = ["masks", "iou_predictions", "low_res_masks", "xtl", "ytl", "xbr", "ybr"]

     with warnings.catch_warnings():
         warnings.filterwarnings("ignore", category=torch.jit.TracerWarning)
bsekachev@DESKTOP-OTBLK26:~/sam$ git diff segment_anything/utils/onnx.py
diff --git a/segment_anything/utils/onnx.py b/segment_anything/utils/onnx.py
index 3196bdf..85729c1 100644
--- a/segment_anything/utils/onnx.py
+++ b/segment_anything/utils/onnx.py
@@ -87,7 +87,15 @@ class SamOnnxModel(nn.Module):
         orig_im_size = orig_im_size.to(torch.int64)
         h, w = orig_im_size[0], orig_im_size[1]
         masks = F.interpolate(masks, size=(h, w), mode="bilinear", align_corners=False)
-        return masks
+        masks = torch.gt(masks, 0).to(torch.uint8)
+        nonzero = torch.nonzero(masks)
+        xindices = nonzero[:, 3:4]
+        yindices = nonzero[:, 2:3]
+        ytl = torch.min(yindices).to(torch.int64)
+        ybr = torch.max(yindices).to(torch.int64)
+        xtl = torch.min(xindices).to(torch.int64)
+        xbr = torch.max(xindices).to(torch.int64)
+        return masks[:, :, ytl:ybr + 1, xtl:xbr + 1], xtl, ytl, xbr, ybr

     def select_masks(
         self, masks: torch.Tensor, iou_preds: torch.Tensor, num_points: int
@@ -132,7 +140,7 @@ class SamOnnxModel(nn.Module):
         if self.return_single_mask:
             masks, scores = self.select_masks(masks, scores, point_coords.shape[1])

-        upscaled_masks = self.mask_postprocessing(masks, orig_im_size)
+        upscaled_masks, xtl, ytl, xbr, ybr = self.mask_postprocessing(masks, orig_im_size)

         if self.return_extra_metrics:
             stability_scores = calculate_stability_score(
@@ -141,4 +149,4 @@ class SamOnnxModel(nn.Module):
             areas = (upscaled_masks > self.model.mask_threshold).sum(-1).sum(-1)
             return upscaled_masks, scores, stability_scores, areas, masks

-        return upscaled_masks, scores, masks
+        return upscaled_masks, scores, masks, xtl, ytl, xbr, ybr
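
The effect of the `mask_postprocessing` change above can be illustrated with a minimal sketch in plain Python (no PyTorch, so it runs anywhere). It binarizes the mask at 0, finds the tight bounding box of the positive pixels, and returns only the cropped region together with `xtl`, `ytl`, `xbr`, `ybr`. The names `crop_to_bbox` and `paste_back` are hypothetical and not part of SAM or CVAT. The `paste_back` step is only an assumption about how a consumer might restore the full-size mask, and raising an error for an empty mask is a choice made for this sketch, since the diff shows no special handling for that case.

```python
from typing import List, Tuple

Mask = List[List[int]]


def crop_to_bbox(logits: List[List[float]]) -> Tuple[Mask, int, int, int, int]:
    """Binarize at 0 and crop to the tight bounding box of positive pixels.

    Mirrors the idea of the patched mask_postprocessing for a single 2D mask
    (hypothetical helper, for illustration only).
    """
    binary = [[1 if value > 0 else 0 for value in row] for row in logits]
    # Row indices and column indices of every positive pixel.
    ys = [y for y, row in enumerate(binary) if any(row)]
    xs = [x for row in binary for x, value in enumerate(row) if value]
    if not ys:
        # Assumption of this sketch: an empty mask is treated as an error.
        raise ValueError("mask has no positive pixels")
    xtl, ytl, xbr, ybr = min(xs), min(ys), max(xs), max(ys)
    # Both corners are inclusive, hence the +1 in the slices (as in the diff).
    cropped = [row[xtl:xbr + 1] for row in binary[ytl:ybr + 1]]
    return cropped, xtl, ytl, xbr, ybr


def paste_back(cropped: Mask, xtl: int, ytl: int, height: int, width: int) -> Mask:
    """Place a cropped mask back into a zero-filled full-size mask."""
    full = [[0] * width for _ in range(height)]
    for dy, row in enumerate(cropped):
        for dx, value in enumerate(row):
            full[ytl + dy][xtl + dx] = value
    return full


if __name__ == "__main__":
    logits = [
        [-1.0, -1.0, -1.0, -1.0],
        [-1.0, 2.0, 0.5, -1.0],
        [-1.0, -3.0, 1.0, -1.0],
    ]
    cropped, xtl, ytl, xbr, ybr = crop_to_bbox(logits)
    print(cropped, (xtl, ytl, xbr, ybr))
    print(paste_back(cropped, xtl, ytl, height=3, width=4))
```

Returning only the cropped region plus four coordinates keeps the decoder output small when the object covers a small part of the image, which is presumably why the extra outputs were added to `output_names`.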

How has this been tested?

Checklist

  • I submit my changes into the develop branch
  • I have added a description of my changes into the CHANGELOG file
  • I have updated the documentation accordingly
  • I have added tests to cover my changes
  • I have linked related issues (see GitHub docs)
  • I have increased versions of npm packages if it is necessary
    (cvat-canvas,
    cvat-core,
    cvat-data and
    cvat-ui)

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.

@bsekachev bsekachev marked this pull request as ready for review May 9, 2023 14:04
@bsekachev bsekachev changed the title [WIP] Running SAM backbone on frontend [Do not merge] Running SAM backbone on frontend May 9, 2023
@bsekachev bsekachev requested a review from azhavoro as a code owner May 10, 2023 08:35
@bsekachev bsekachev requested a review from mdacoca as a code owner May 11, 2023 07:54
@bsekachev bsekachev changed the title [Do not merge] Running SAM backbone on frontend Running SAM backbone on frontend May 11, 2023
@bsekachev bsekachev merged commit 0712d7d into develop May 11, 2023
@d710055071

Does this implement operations for point prompts, box prompts, point+box prompts, and masks?

@bsekachev bsekachev deleted the bs/sam_ui branch May 16, 2023 08:54
@azhavoro azhavoro mentioned this pull request May 18, 2023
nmanovic added a commit that referenced this pull request May 18, 2023
### Added
- Introduced a new configuration option for controlling the invocation of Nuclio functions.
  (<#6146>)

### Changed
- Moved the SAM mask decoder to run on the frontend.
  (<#6019>)
- Replaced the `person-reidentification-retail-0300` and `faster_rcnn_inception_v2_coco` Nuclio functions with `person-reidentification-retail-0277` and `faster_rcnn_inception_resnet_v2_atrous_coco` respectively.
  (<#6129>)
- Upgraded OpenVINO-based Nuclio functions to utilize the OpenVINO 2022.3 runtime.
  (<#6129>)

### Fixed
- Resolved issues with tracking multiple objects (30 and more) using the TransT tracker.
  (<#6073>)
- Addressed azure.core.exceptions.ResourceExistsError: The specified blob already exists.
  (<#6082>)
- Corrected image scaling issues when transitioning between images of different resolutions.
  (<#6081>)
- Fixed inaccurate reporting of completed job counts.
  (<#6098>)
- Allowed OpenVINO-based Nuclio functions to be deployed to Kubernetes.
  (<#6129>)
- Improved skeleton size checks after drawing.
  (<#6156>)
- Fixed HRNet CPU serverless function.
  (<#6150>)
- Prevented sending of empty list of events.
  (<#6154>)
mikhail-treskin pushed a commit to retailnext/cvat that referenced this pull request Jul 1, 2023
### Motivation and context
Resolved cvat-ai#5984 
Resolved cvat-ai#6049
Resolved cvat-ai#6041

- Compatible only with the ``sam_vit_h_4b8939.pth`` weights. To support
other weights, re-export the ONNX mask decoder with the custom model
changes shown below, or download a pre-exported decoder from the links
below.
- The serverless function must be redeployed because its interface has
changed.

Decoders for other weights:
sam_vit_l_0b3195.pth:
[Download](https://drive.google.com/file/d/1Nb5CJKQm_6s1n3xLSZYso6VNgljjfR-6/view?usp=sharing)
sam_vit_b_01ec64.pth:
[Download](https://drive.google.com/file/d/17cZAXBPaOABS170c9bcj9PdQsMziiBHw/view?usp=sharing)

Changes done in ONNX part:
```
git diff scripts/export_onnx_model.py
diff --git a/scripts/export_onnx_model.py b/scripts/export_onnx_model.py
index 8441258..18d5be7 100644
--- a/scripts/export_onnx_model.py
+++ b/scripts/export_onnx_model.py
@@ -138,7 +138,7 @@ def run_export(

     _ = onnx_model(**dummy_inputs)

-    output_names = ["masks", "iou_predictions", "low_res_masks"]
+    output_names = ["masks", "iou_predictions", "low_res_masks", "xtl", "ytl", "xbr", "ybr"]

     with warnings.catch_warnings():
         warnings.filterwarnings("ignore", category=torch.jit.TracerWarning)
bsekachev@DESKTOP-OTBLK26:~/sam$ git diff segment_anything/utils/onnx.py
diff --git a/segment_anything/utils/onnx.py b/segment_anything/utils/onnx.py
index 3196bdf..85729c1 100644
--- a/segment_anything/utils/onnx.py
+++ b/segment_anything/utils/onnx.py
@@ -87,7 +87,15 @@ class SamOnnxModel(nn.Module):
         orig_im_size = orig_im_size.to(torch.int64)
         h, w = orig_im_size[0], orig_im_size[1]
         masks = F.interpolate(masks, size=(h, w), mode="bilinear", align_corners=False)
-        return masks
+        masks = torch.gt(masks, 0).to(torch.uint8)
+        nonzero = torch.nonzero(masks)
+        xindices = nonzero[:, 3:4]
+        yindices = nonzero[:, 2:3]
+        ytl = torch.min(yindices).to(torch.int64)
+        ybr = torch.max(yindices).to(torch.int64)
+        xtl = torch.min(xindices).to(torch.int64)
+        xbr = torch.max(xindices).to(torch.int64)
+        return masks[:, :, ytl:ybr + 1, xtl:xbr + 1], xtl, ytl, xbr, ybr

     def select_masks(
         self, masks: torch.Tensor, iou_preds: torch.Tensor, num_points: int
@@ -132,7 +140,7 @@ class SamOnnxModel(nn.Module):
         if self.return_single_mask:
             masks, scores = self.select_masks(masks, scores, point_coords.shape[1])

-        upscaled_masks = self.mask_postprocessing(masks, orig_im_size)
+        upscaled_masks, xtl, ytl, xbr, ybr = self.mask_postprocessing(masks, orig_im_size)

         if self.return_extra_metrics:
             stability_scores = calculate_stability_score(
@@ -141,4 +149,4 @@ class SamOnnxModel(nn.Module):
             areas = (upscaled_masks > self.model.mask_threshold).sum(-1).sum(-1)
             return upscaled_masks, scores, stability_scores, areas, masks

-        return upscaled_masks, scores, masks
+        return upscaled_masks, scores, masks, xtl, ytl, xbr, ybr
```

### How has this been tested?

### Checklist
- [x] I submit my changes into the `develop` branch
- [x] I have added a description of my changes into the [CHANGELOG](https://github.com/opencv/cvat/blob/develop/CHANGELOG.md) file
- [ ] I have updated the documentation accordingly
- [ ] I have added tests to cover my changes
- [x] I have linked related issues (see [GitHub docs](https://help.github.com/en/github/managing-your-work-on-github/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword))
- [x] I have increased versions of npm packages if it is necessary
([cvat-canvas](https://github.com/opencv/cvat/tree/develop/cvat-canvas#versioning),
[cvat-core](https://github.com/opencv/cvat/tree/develop/cvat-core#versioning),
[cvat-data](https://github.com/opencv/cvat/tree/develop/cvat-data#versioning) and
[cvat-ui](https://github.com/opencv/cvat/tree/develop/cvat-ui#versioning))

### License

- [x] I submit _my code changes_ under the same [MIT License](https://github.com/opencv/cvat/blob/develop/LICENSE) that covers the project.
  Feel free to contact the maintainers if that's a concern.
3 participants