# Week-6 Summary

PREVIEW

Total Hours Spent: 18 hours 🟩🟩🟩🟩🟩🟩
Commits: 127
Pull Requests: 1
Project Status: 50%
  • Next, I worked on demo.py, which loads the trained .onnx model and uses the pre_process, infer, and post_process modules from NanoDet.py to run inference on test images.

  • I also added webcam support to demo.py using cv2.VideoCapture(). The following modules were completed this week, before the mid-term evaluation, for object_detection_nanodet in the gsoc_nanodet branch. This branch includes only work pertaining to the NanoDet model.

    • NanoDet.py

      • Pre_process: This module in NanoDet.py pre-processes a given image to match the input parameters of Nanodet-m-plus-1.5x_416. It resizes the image to a resolution of 416 x 416 and then creates an input blob from the image data using the cv2.dnn.blobFromImage() function from the cv2.dnn framework. The module returns the blob of the inference image to the infer module.

      • Infer: This module passes the input blob to the ONNX model, which is loaded using cv2.dnn.readNetFromONNX() or cv2.dnn.readNet(), by calling net.setInput(). After setting the input, we execute the model's layers with net.forward(net.getUnconnectedOutLayersNames()), where net is the variable that holds the loaded model. This call from the cv2.dnn framework returns the predictions as a single multi-dimensional numpy array that includes scores, bounding-box coordinates, and class labels. This output needs to be processed so that we get the individual components (scores, bounding-box coordinates, and class labels) in a usable format. This is done by the post_process module.

      • Post_process: This module takes the output of net.forward(net.getUnconnectedOutLayersNames()) and returns the scores, bounding-box coordinates, and class labels of the predictions whose confidence value exceeds the detection threshold (by default prob_threshold=0.35, iou_threshold=0.6). It parses the output array, keeps only the predictions with confidence values higher than the threshold, and returns unscaled bounding-box coordinates along with the scores and class labels of the detections in the image.

    • Demo.py

      • Input: demo.py supports two types of input for model inference: an input image, or the user's webcam for video inference. We set up an argument parser to read the inputs for test inference, which include the input type, input data, confidence threshold, IOU threshold, and the option to save the inference result. We then convert the input image or webcam frame from BGR to RGB using cv2.cvtColor(image, cv2.COLOR_BGR2RGB), since the model requires input in this format. Finally, we run inference by calling the infer module, which pre-processes the image and returns the post-processed outputs.

      • Output_Visualization: This module in demo.py scales the bounding-box coordinates back to the image's resolution and then visualizes the predictions by drawing boxes with cv2.rectangle(). It is required so that the bounding boxes are placed correctly on each individual image, which allows the model to run inference on images with different resolutions.

  • I also worked on letting users save the results of the models they test to the local directory into which they cloned the repo. I used cv2.VideoWriter() for this and integrated it into NanoDet's repo in the model zoo.

WEEK6 TASKS

  • Complete the NanoDet pull request and complete model inference using OpenCV.
  • Add webcam support.
  • Add a save feature that allows users to save results.
  • Remove the class labels file from the directory and add the labels directly to demo.py for faster inference.
  • Complete GSoC contributor Mid-term evaluations.

BRANCH

PULL REQUEST

COMMITS

Note

🟩 - 3 hours of coding (working days: Monday - Saturday)