
NanoSAM C++

This repo provides a TensorRT C++ implementation of NVIDIA's NanoSAM, a distilled Segment Anything model (SAM), for real-time inference on the GPU.

Getting Started

  1. There are two ways to load engines:

    1. Load engines built by trtexec:

      #include "nanosam/nanosam.h"
      
      NanoSam nanosam(
         "resnet18_image_encoder.engine",
         "mobile_sam_mask_decoder.engine"
      );
    2. Build engines directly from ONNX files:

      NanoSam nanosam(
         "resnet18_image_encoder.onnx",
         "mobile_sam_mask_decoder.onnx"
      );
  2. Segment an object using a prompt point:

    Mat image = imread("assets/dog.jpg");
    // Foreground point
    vector<Point> points = { Point(1300, 900) };
    vector<float> labels = { 1 }; 
    
    Mat mask = nanosam.predict(image, points, labels);
  3. Create masks from bounding boxes:

    Mat image = imread("assets/dogs.jpg");
    // Bounding box top-left and bottom-right points
    vector<Point> points = { Point(100, 100), Point(750, 759) };
    vector<float> labels = { 2, 3 }; 
    
    Mat mask = nanosam.predict(image, points, labels);
    Notes: The supported point labels are:

    Point Label   Description
    0             Background point
    1             Foreground point
    2             Bounding box top-left
    3             Bounding box bottom-right

    A complete example combining both prompt types is sketched below.
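Putting the pieces together, here is a minimal sketch of a complete program, built from the API calls shown above. The engine and asset paths are the ones used in the earlier snippets; adjust them to your setup. How the returned mask is encoded is not specified here, so the output is written as-is:

    #include <opencv2/opencv.hpp>
    #include "nanosam/nanosam.h"
    
    using namespace cv;
    using namespace std;
    
    int main()
    {
        // Load prebuilt TensorRT engines (see Installation below)
        NanoSam nanosam(
            "resnet18_image_encoder.engine",
            "mobile_sam_mask_decoder.engine"
        );
    
        // Point prompt: a single foreground point (label 1)
        Mat image = imread("assets/dog.jpg");
        vector<Point> points = { Point(1300, 900) };
        vector<float> labels = { 1 };
        Mat pointMask = nanosam.predict(image, points, labels);
    
        // Box prompt: top-left (label 2) and bottom-right (label 3) corners
        Mat image2 = imread("assets/dogs.jpg");
        vector<Point> boxPoints = { Point(100, 100), Point(750, 759) };
        vector<float> boxLabels = { 2, 3 };
        Mat boxMask = nanosam.predict(image2, boxPoints, boxLabels);
    
        // Save the masks for inspection (value range depends on the
        // library's output format and may need rescaling for viewing)
        imwrite("mask_point.png", pointMask);
        imwrite("mask_box.png", boxMask);
        return 0;
    }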

Performance

The inference time includes the preprocessing and post-processing time:

Device    Image Shape (WxH)   Model Shape (WxH)   Inference Time (ms)
RTX4090   2048x1365           1024x1024           14
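To reproduce this measurement on your own hardware, one straightforward approach is to time predict() end to end with std::chrono. The sketch below assumes predict() blocks until the result is ready; warm-up iterations are included because the first few TensorRT invocations pay one-time initialization costs:

    #include <chrono>
    #include <iostream>
    #include <opencv2/opencv.hpp>
    #include "nanosam/nanosam.h"
    
    using namespace cv;
    using namespace std;
    
    int main()
    {
        NanoSam nanosam("resnet18_image_encoder.engine",
                        "mobile_sam_mask_decoder.engine");
    
        Mat image = imread("assets/dog.jpg");
        vector<Point> points = { Point(1300, 900) };
        vector<float> labels = { 1 };
    
        // Warm up so one-time CUDA/TensorRT setup does not skew the numbers
        for (int i = 0; i < 5; i++)
            nanosam.predict(image, points, labels);
    
        // Time a batch of runs and report the average latency
        const int runs = 50;
        auto start = chrono::high_resolution_clock::now();
        for (int i = 0; i < runs; i++)
            nanosam.predict(image, points, labels);
        auto end = chrono::high_resolution_clock::now();
    
        double totalMs = chrono::duration<double, milli>(end - start).count();
        cout << "Average inference time: " << totalMs / runs << " ms" << endl;
        return 0;
    }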

Installation

1. Download the image encoder: resnet18_image_encoder.onnx
2. Download the mask decoder: mobile_sam_mask_decoder.onnx
3. Download the TensorRT zip file that matches the Windows version you are using.
4. Choose where you want to install TensorRT. The zip file will install everything into a subdirectory called TensorRT-8.x.x.x. This new subdirectory will be referred to as <installpath> in the steps below.
5. Unzip the TensorRT-8.x.x.x.Windows10.x86_64.cuda-x.x.zip file to the location that you chose, where:
   • 8.x.x.x is your TensorRT version
   • cuda-x.x is CUDA version 11.8 or 12.0
6. Add the TensorRT library files to your system PATH. To do so, copy the DLL files from <installpath>/lib to your CUDA installation directory, for example, C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\vX.Y\bin, where vX.Y is your CUDA version. The CUDA installer should have already added the CUDA path to your system PATH.
7. Ensure that the following is present in your Visual Studio solution's project properties:
   • <installpath>/lib has been added to your PATH variable and is present under VC++ Directories > Executable Directories.
   • <installpath>/include is present under C/C++ > General > Additional Include Directories.
   • nvinfer.lib and any other LIB files that your project requires are present under Linker > Input > Additional Dependencies.
8. Download and install any recent OpenCV for Windows.
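Before building the samples, you can verify that the TensorRT and OpenCV paths are wired up correctly with a tiny sanity check. This is an optional sketch, assuming your project links against nvinfer.lib and an OpenCV library; if it compiles, runs, and prints both versions, the include paths, library paths, and DLLs are all in place:

    #include <iostream>
    #include <NvInfer.h>           // TensorRT core header from <installpath>/include
    #include <opencv2/core.hpp>    // OpenCV core header
    
    int main()
    {
        // getInferLibVersion() returns the linked TensorRT version as an
        // integer, e.g. 8601 for TensorRT 8.6.1
        std::cout << "TensorRT version: " << getInferLibVersion() << std::endl;
        std::cout << "OpenCV version: " << CV_VERSION << std::endl;
        return 0;
    }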

Acknowledgement

This project is based on the following projects:

• NanoSAM - The distilled Segment Anything model (SAM).
• TensorRTx - Implementation of popular deep learning networks with the TensorRT network definition API.
• TensorRT - TensorRT samples and API documentation.
• ChatGPT - some of the simple functions were generated by ChatGPT :D