
On-premise near real-time video inference #1867

Merged 33 commits into master on Oct 27, 2022

Conversation

@agunapal (Collaborator) commented on Sep 21, 2022

Description

This PR includes the following:

  • Near real-time video inference with TorchServe batching
  • Near real-time video inference with client-side batching
  • An example showing how to send asynchronous HTTP requests to TorchServe in Python

Problem

Consider a use case where cameras are connected to edge devices, and these devices are connected to a compute cluster running TorchServe. Each edge device runs a computer vision pipeline that reads frames from the camera and performs tasks such as image classification, pose estimation, and activity recognition on those frames. To make efficient use of hardware resources, we may want to batch the frames for efficient inference.

This example shows how this can be achieved using TorchServe with the following two approaches:

  • TorchServe Batching
  • Client-Side Batching

The architecture diagrams for the two approaches are shown below.

[Architecture diagrams: TorchServe batching and client-side batching]

With TorchServe batching

On the client side, we have one thread that reads frames from a video source and another thread that sends each frame as an HTTP request to TorchServe for image classification inference. We use asynchronous HTTP requests because we want to take advantage of TorchServe batching.
We send one frame per request and let TorchServe handle the batching.
In this example, TorchServe is set up to process a batch size of 4.
TorchServe receives the individual requests, batches them into a single inference request, and sends an individual response for each request received.
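
For illustration, here is a minimal sketch of such a client using aiohttp and OpenCV. The endpoint URL, frame count, and JPEG encoding are assumptions for this sketch, not necessarily what request_ts_batching.py does; the PR's client also uses a dedicated frame-reading thread, which is collapsed into one coroutine here for brevity.

# Hypothetical sketch: one loop reads frames, and each frame is sent as
# its own asynchronous HTTP request so that TorchServe can batch them
# server-side.
import asyncio
import aiohttp
import cv2

URL = "http://localhost:8080/predictions/resnet-18"  # assumed endpoint

async def send_frame(session, frame):
    # Encode the frame as JPEG and POST the raw bytes.
    ok, buf = cv2.imencode(".jpg", frame)
    async with session.post(URL, data=buf.tobytes()) as resp:
        print(await resp.text())

async def main(video_source=0):
    cap = cv2.VideoCapture(video_source)
    async with aiohttp.ClientSession() as session:
        tasks = []
        for _ in range(20):  # send the first 20 frames
            ok, frame = cap.read()
            if not ok:
                break
            tasks.append(asyncio.create_task(send_frame(session, frame)))
        await asyncio.gather(*tasks)
    cap.release()

asyncio.run(main())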

Create a resnet-18 eager-mode model archive, register it on TorchServe, and run inference on a real-time video

Run the commands in the following steps from the parent directory of the repository root. For example, if you cloned the repository into /home/my_path/serve, run the steps from /home/my_path.

python examples/image_classifier/near_real_time_video/create_mar_file_batch.py

torchserve --start --model-store model_store --models resnet-18=resnet-18.mar --ts-config examples/image_classifier/near_real_time_video/config.properties
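
The --ts-config flag points TorchServe at a config.properties file that registers the model with a server-side batch size. The exact contents of the file shipped with this example are not reproduced here; a typical configuration along these lines (worker counts and maxBatchDelay are assumed values) would be:

# Hypothetical config.properties registering resnet-18 with batchSize 4
models={\
  "resnet-18": {\
    "1.0": {\
        "defaultVersion": true,\
        "marName": "resnet-18.mar",\
        "minWorkers": 1,\
        "maxWorkers": 1,\
        "batchSize": 4,\
        "maxBatchDelay": 100\
    }\
  }\
}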

python examples/image_classifier/near_real_time_video/request_ts_batching.py

The default batch size is 4.
On the client side, we should see the following output:

With Batch Size 4, FPS at frame number 20 is 24.7
{
  "tabby": 0.5186409950256348,
  "tiger_cat": 0.29040342569351196,
  "Egyptian_cat": 0.10797449946403503,
  "lynx": 0.01395314373075962,
  "bucket": 0.006002397276461124
}
{
  "tabby": 0.5186409950256348,
  "tiger_cat": 0.29040342569351196,
  "Egyptian_cat": 0.10797449946403503,
  "lynx": 0.01395314373075962,
  "bucket": 0.006002397276461124
}
{
  "tabby": 0.5186409950256348,
  "tiger_cat": 0.29040342569351196,
  "Egyptian_cat": 0.10797449946403503,
  "lynx": 0.01395314373075962,
  "bucket": 0.006002397276461124
}
{
  "tabby": 0.5186409950256348,
  "tiger_cat": 0.29040342569351196,
  "Egyptian_cat": 0.10797449946403503,
  "lynx": 0.01395314373075962,
  "bucket": 0.006002397276461124
}

With client-side batching

On the client side, we have one thread that reads frames from a video source and another thread that batches n frames together and sends them in a single request to TorchServe for image classification inference.
To send the batched data, we create a JSON payload containing the n frames.
On the TorchServe side, we read the JSON payload and preprocess the n frames. The postprocess function in the handler returns the output as a list of length 1.
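
As a rough sketch of the handler side, the preprocess step might look like the following, assuming the client base64-encodes each frame into an "images" field of the JSON body (the field name and encoding are illustrative assumptions, not necessarily what this example's handler uses):

# Hypothetical preprocess for a client-batched JSON payload of n frames.
import base64
import io
import json

import torch
from PIL import Image
from torchvision import transforms

transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def preprocess(data):
    # TorchServe hands the handler a list of requests; with client-side
    # batching there is a single request whose body carries all n frames.
    body = data[0].get("data") or data[0].get("body")
    if isinstance(body, (bytes, bytearray)):
        body = json.loads(body)
    tensors = []
    for encoded in body["images"]:  # assumed field name
        img = Image.open(io.BytesIO(base64.b64decode(encoded))).convert("RGB")
        tensors.append(transform(img))
    # Every frame now has the same shape, so they stack into one batch.
    return torch.stack(tensors)

Since TorchServe saw only one request, the handler's postprocess wraps the n results in a list of length 1, matching the description above.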

Create a resnet-18 eager-mode model archive, register it on TorchServe, and run inference on a real-time video

Run the commands in the following steps from the parent directory of the repository root. For example, if you cloned the repository into /home/my_path/serve, run the steps from /home/my_path.

python examples/image_classifier/near_real_time_video/create_mar_file.py --client-batching

torchserve --start --model-store model_store --models resnet-18=resnet-18.mar

python examples/image_classifier/near_real_time_video/request.py --client-batching

The default batch size is 4.
On the client side, we should see the following output:

With Batch Size 4, FPS at frame number 20 is 26.3
[
  {
    "tabby": 0.5211764574050903,
    "tiger_cat": 0.2896695137023926,
    "Egyptian_cat": 0.10781702399253845,
    "lynx": 0.013975325040519238,
    "bucket": 0.006072630640119314
  },
  {
    "tabby": 0.521255373954773,
    "tiger_cat": 0.28875237703323364,
    "Egyptian_cat": 0.10762253403663635,
    "lynx": 0.0139595502987504,
    "bucket": 0.005917856469750404
  },
  {
    "tabby": 0.5212978720664978,
    "tiger_cat": 0.28904619812965393,
    "Egyptian_cat": 0.10735585540533066,
    "lynx": 0.013928638771176338,
    "bucket": 0.005905763246119022
  },
  {
    "tabby": 0.521538496017456,
    "tiger_cat": 0.28848880529403687,
    "Egyptian_cat": 0.10753455013036728,
    "lynx": 0.013951676897704601,
    "bucket": 0.005931478925049305
  }
]

Fixes #(issue)

Type of change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)

Feature/Issue validation/testing

Please describe the Unit or Integration tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.

Checklist:

  • Did you have fun?
  • Have you added tests that prove your fix is effective or that this feature works?
  • Has code been commented, particularly in hard-to-understand areas?
  • Have you made corresponding changes to the documentation?

codecov bot commented on Sep 21, 2022

Codecov Report

Merging #1867 (fd0b993) into master (13f2e78) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #1867   +/-   ##
=======================================
  Coverage   44.95%   44.95%           
=======================================
  Files          63       63           
  Lines        2609     2609           
  Branches       56       56           
=======================================
  Hits         1173     1173           
  Misses       1436     1436           


@msaroufim (Member) left a comment

Thanks! This is awesome, I'd recommend writing a blog post about this application so you can do things like add a video stream screenshot

Left some minor nit feedback; the only main one is that we need a test to make sure this example doesn't break later.

The review comment below refers to this handler snippet:

# if the image is a list
image = torch.FloatTensor(image)

images.append(image)
@msaroufim (Member):

You can torch.stack images directly without an intermediate list

@agunapal (Collaborator, Author):

How do we do this? torch.stack needs multiple elements of the same size. I copied this from the image_classifier handler.
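
For context, torch.stack does require tensors of equal shape, which the resize/crop transforms in preprocessing would guarantee; a minimal illustration of the suggestion (shapes here are stand-ins, not taken from the PR):

import torch

# Stand-in for n preprocessed frames, all transformed to the same shape.
frames = [torch.rand(3, 224, 224) for _ in range(4)]
batch = torch.stack(frames)  # one batch tensor of shape (4, 3, 224, 224)
print(batch.shape)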

Review thread on ts_scripts/spellcheck_conf/spellcheck.yaml (outdated; resolved)
@msaroufim added the "p1 mid priority" label on Sep 30, 2022
@agunapal changed the title from "Streaming video inference with client-side batching" to "Streaming video inference" on Oct 6, 2022
@msaroufim (Member) left a comment

Left some minor feedback; please address before merge, otherwise LGTM!

Review thread on examples/image_classifier/streaming_video/README.md (outdated; resolved)
Review thread on test/pytest/conftest.py (resolved)
@agunapal changed the title from "Streaming video inference" to "Edge video inference" on Oct 24, 2022
@agunapal changed the title from "Edge video inference" to "On-premise near real-time video inference" on Oct 26, 2022
@lxning (Collaborator) commented on Oct 27, 2022

@agunapal could you please attach the regression test log to check pytest?

@agunapal (Collaborator, Author):

@lxning Updated with regression logs. The test is skipped in the regression suite to avoid installing additional client libraries, so that the binary size doesn't increase.

@msaroufim merged commit 942d0e8 into master on Oct 27, 2022
jagadeeshi2i pushed a commit to jagadeeshi2i/serve that referenced this pull request on Nov 1, 2022
jagadeeshi2i pushed a commit to jagadeeshi2i/serve that referenced this pull request on Nov 3, 2022
@agunapal deleted the examples/streaming_video branch on November 9, 2023

Labels: example, p1 mid priority
5 participants