Async Support for TensorFlow Backend #407

Closed · wants to merge 9 commits

Conversation

@shubhanshu02 (Contributor) commented May 20, 2021

Patch Set Description

This patchset is part of the deliverables for the GSoC project Async Support for TensorFlow Backend in FFmpeg.

Objective: Asynchronous Support for TensorFlow backend
Parts under this deliverable:
- Switch the execution mode to TFRequestItem-based inference.
- Implement a standard asynchronous inference module, DNNAsyncExecModule, for use across the TF and Native backends.
- Implement async mode in the TensorFlow backend.

Earlier Merged Patches in this patchset

The patches below move TaskItem and InferenceItem from the OpenVINO backend into dnn_backend_common and adjust them for shared use across the three backends. We then define TFRequestItem with its execution parameters and switch the TensorFlow backend's execution mechanism to TFRequestItem-based inference.

f5ab890 lavfi/dnn: Extract TaskItem and InferenceItem from OpenVino Backend
446b4f7 lavfi/dnn: Convert output_name to char** in TaskItem
9675ebb lavfi/dnn: Add nb_output to TaskItem
6b961f7 lavfi/dnn: Use uint8_t for async and do_ioproc in TaskItems
5509235 lavfi/dnn: Fill Task using Common Function
68cf14d lavfi/dnn_backend_tf: TaskItem Based Inference
a4de605 lavfi/dnn_backend_tf: Add TFInferRequest and TFRequestItem
08d8b3b lavfi/dnn_backend_tf: Request-based Execution
b849228 lavfi/dnn_backend_tf: Separate function for filling RequestItem
84e4e60 lavfi/dnn_backend_tf: Separate function for Completion Callback
6f9570a lavfi/dnn_backend_tf: Error Handling

Final Patches

The patches below implement DNNAsyncExecModule and use it in the TensorFlow backend to add the async mode. The idea behind DNNAsyncExecModule is to execute a number of TFRequestItems (the count can be set with the backend configuration parameter nireq) concurrently alongside the main FFmpeg execution thread, so that inference requests run asynchronously.

Each TFRequestItem has its own instance of DNNAsyncExecModule, which corresponds to a single thread. When TF_SessionRun returns, the thread routine returns with the relevant exit code and the TFRequestItem is pushed back to the request_queue. This exit status is checked the next time the same TFRequestItem is used for execution. If the previous execution failed, the error message has already been printed, so we cancel all further executions by returning DNN_ERROR.
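
As a rough illustration of this flow, here is a minimal sketch using plain pthreads; the struct layout and function names are illustrative and are not copied from the actual DNNAsyncExecModule code (the `(void *)-1` failure value is the DNN_ASYNC_FAIL convention described later in this thread):

```c
#include <pthread.h>
#include <stdio.h>

#define DNN_SUCCESS 0
#define DNN_ERROR   (-1)
#define DNN_ASYNC_FAIL ((void *)-1)   /* exit value of a failed inference thread */

typedef struct AsyncExecModule {
    int  (*start_inference)(void *args);  /* runs TF_SessionRun (or equivalent) */
    void (*callback)(void *args);         /* completion callback, fetches the output */
    void *args;                           /* per-request execution parameters */
    pthread_t thread;
    int thread_started;
} AsyncExecModule;

/* Thread routine: run the inference, then the completion callback on success. */
static void *async_thread_routine(void *arg)
{
    AsyncExecModule *mod = arg;

    if (mod->start_inference(mod->args) != DNN_SUCCESS)
        return DNN_ASYNC_FAIL;            /* completion callback is skipped on failure */
    mod->callback(mod->args);
    return NULL;
}

/* Join the previous run of this request (if any), check its exit status,
 * and start a new worker thread for the next inference. */
static int start_async_inference(AsyncExecModule *mod)
{
    void *status = NULL;

    if (mod->thread_started) {
        pthread_join(mod->thread, &status);
        mod->thread_started = 0;
        if (status == DNN_ASYNC_FAIL) {
            fprintf(stderr, "Previous async inference failed, cancelling further executions\n");
            return DNN_ERROR;
        }
    }
    if (pthread_create(&mod->thread, NULL, async_thread_routine, mod))
        return DNN_ERROR;
    mod->thread_started = 1;
    return DNN_SUCCESS;
}
```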

86f0a4f lavfi/dnn: Add Async Execution Mechanism and Documentation
c716578 lavfi/dnn: Common Function to Get Async Result in DNN Backends
e6ae8fc lavfi/dnn_backend_tf: TFInferRequest Execution and Documentation
0985e92 lavfi/dnn: Async Support for TensorFlow Backend
a3db9b5 lavfi/dnn_backend_tf: Error Handling for execute_model_tf
4d627ac lavfi/dnn_backend_tf: Add TF_Status to TFRequestItem
009b2e5 lavfi/dnn: Extract Common Parts from get_output functions
371e567 lavfi/dnn_backend_tf: Error Handling for tf_create_inference_request
2063745 lavfi/dnn: DNNAsyncExecModule Execution Failure Handling

@shubhanshu02 (Contributor, Author) left a comment:


@guoyejun I have two questions regarding these changes.

(Two review threads, now outdated/resolved: libavfilter/dnn/dnn_backend_common.h and libavfilter/dnn/dnn_backend_tf.c)
@guoyejun (Collaborator) commented:

This PR proposes the following changes:

  1. Extract TaskItem and InferenceItem from the OpenVINO backend.
  2. Change output_name to output_names in TaskItem (for use by other backends).
  3. Add a common function for filling tasks in the ff_dnn_execute_model_<backend> functions.
  4. In ff_dnn_free_model_ov, objects popped from inference_queue must have type InferenceItem *.
  5. RequestItem-based execution in the TensorFlow backend.

Description

  • Each RequestItem contains a pointer to a tf_infer_request instance. The infer_request holds the parameters needed to store the input and output of TF_SessionRun.
  • Initially, when the model is loaded in ff_dnn_load_model_tf, the request queue is created and a total of nireq requests are pushed to it. At this step, we also allocate the tf_infer_request for each request using the function tf_create_inference_request.
  • After TF_SessionRun completes, infer_completion_callback is called and the output frame is fetched. Before pushing the RequestItem back into the safe queue, the pointers inside tf_infer_request are freed while the structure itself stays allocated (so we don't need to reallocate it for every request).
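
A rough sketch of the structures and the free-and-reuse step described in these bullets; the field names are illustrative, not taken verbatim from the FFmpeg tree:

```c
#include <tensorflow/c/c_api.h>
#include "libavutil/mem.h"

/* Per-request parameters for TF_SessionRun (illustrative field names). */
typedef struct TFInferRequest {
    TF_Output *tf_outputs;        /* output operations passed to TF_SessionRun */
    TF_Tensor **output_tensors;   /* tensors produced by TF_SessionRun */
    TF_Output *tf_input;          /* input operation */
    TF_Tensor *input_tensor;      /* input tensor filled from the frame */
} TFInferRequest;

typedef struct TFRequestItem {
    TFInferRequest *infer_request; /* allocated once when the model is loaded */
    /* ... task bookkeeping, async execution state, etc. ... */
} TFRequestItem;

/* Free only the per-run pointers; the TFInferRequest itself stays allocated
 * so it can be reused the next time this request is popped from the queue.
 * Output tensors are assumed to have been deleted after their data was
 * copied into the output frame. */
static void tf_free_request_contents(TFInferRequest *request)
{
    if (!request)
        return;
    if (request->input_tensor) {
        TF_DeleteTensor(request->input_tensor);
        request->input_tensor = NULL;
    }
    av_freep(&request->tf_input);
    av_freep(&request->tf_outputs);
    av_freep(&request->output_tensors);
}
```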

It doesn't matter if you leave this blank, since there is a commit log in each change.
By the way, I have some other things to do recently, so my review will be a bit slow.

@shubhanshu02 (Contributor, Author) replied:

> By the way, I have some other things to do recently, so my review will be a bit slow.

Sure, no problem.

@shubhanshu02 force-pushed the ovtasks branch 7 times, most recently from 49e3116 to 3811dcf on May 26, 2021 18:02
@guoyejun (Collaborator) left a comment:


Typo in the commit log subject line and body: uint8_32 -> uint8_t

@guoyejun (Collaborator) commented:

> Typo in the commit log subject line and body: uint8_32 -> uint8_t

for the patch: lavfi/dnn: Use uint8_32 for async and do_ioproc in TaskItems

@shubhanshu02 force-pushed the ovtasks branch 3 times, most recently from a1d1624 to 026a4dd on May 27, 2021 20:52
@guoyejun (Collaborator) commented:

I don't have other comments. Please fix the two issues and send the 8 patches (one new patch) to the community. @Semmer2, please help verify the patches once they are on the mailing list by running sr/derain/dnn_processing/dnn_detect with the TF backend.

@shubhanshu02 force-pushed the ovtasks branch 5 times, most recently from 45b43ee to e62b359 on August 4, 2021 11:46
@shubhanshu02 (Contributor, Author) commented:

@guoyejun sir, I have added the async status handling patch (b2e78c1) to this pull request. Please have a look at it.

Basically, it checks the status of the previously running thread for the same request; if it is (void *)-1 (named DNN_ASYNC_FAIL here), the execution was not successful. The status is 3 if no such thread was running.
If execution fails, async_module->start_inference returns DNN_ERROR, cleans up the execution parameters, and pushes the request back to the queue. On receiving DNN_ERROR, the thread returns with the exit value above without calling the completion callback, so the potential memory leak is handled inside the start_inference function.

This commit adds an async execution mechanism for common use
in the TensorFlow and Native backends.
It also adds documentation for the typedefs and functions in
the async module used across the DNN backends.

Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This commit refactors the get-async-result function for common
use in all three backends.

Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This commit adds a function for executing a TFInferRequest, along with
documentation for the functions related to TFInferRequest.

Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This commit enables async execution in the TensorFlow backend
and adds a function to flush extra frames.

The async execution mechanism executes the TFInferRequests on
a separate thread, which is joined before the next execution of
the same TFRequestItem, or while freeing the model.

The following is a comparison of this mechanism with the existing
sync mechanism on the TensorFlow C API 2.5 CPU variant.

Async Mode: 4m32.846s
Sync Mode: 5m17.582s

The above was measured on the super resolution filter using the SRCNN model.

Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This patch adds error handling for the case where execute_model_tf
fails: it clears the used memory in the TFRequestItem and finally pushes
it back to the request queue.

Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
Since requests run in parallel, the execution status can become
inconsistent. To resolve this, we avoid using a mutex, as that
would allow only a single TF_Session run at a time. So, add a
TF_Status to the TFRequestItem.

Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
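
To illustrate the design choice in the commit above (one TF_Status owned by each request rather than a single shared status behind a mutex), a per-request session run might look like the following sketch; the function and parameter names are illustrative, not the actual backend code:

```c
#include <tensorflow/c/c_api.h>

/* Each request owns its own TF_Status, so parallel TF_SessionRun calls never
 * race on a shared status object and no mutex is needed around the session run. */
static int run_request(TF_Session *session,
                       TF_Output *input, TF_Tensor *input_tensor,
                       TF_Output *outputs, TF_Tensor **output_tensors, int nb_outputs,
                       TF_Status *status /* owned by this TFRequestItem */)
{
    TF_SessionRun(session, NULL,
                  input, &input_tensor, 1,
                  outputs, output_tensors, nb_outputs,
                  NULL, 0, NULL, status);
    return TF_GetCode(status) == TF_OK ? 0 : -1;
}
```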
Frame allocation and filling the TaskItem with execution
parameters are common to the three backends. This commit moves
this logic to dnn_backend_common.

Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>

This commit adds a check for the case where the newly created
TFInferRequest is NULL.

Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This commit handles the case where the asynchronous execution
of a request fails, by checking the thread's exit status when
joining before starting another execution. On failure, it also
performs the necessary cleanup.

Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
@guoyejun (Collaborator) commented Aug 8, 2021

Looks good to me. Any other change besides the last patch?

You may send the v3 patches together with this patch, since they naturally belong together.

@Semmer2 please help to verify the patches.

@shubhanshu02 (Contributor, Author) commented Aug 8, 2021

> Any other change besides the last patch?

I can't think of anything else that would improve the patchset for now; I think that's all for this patchset.
I'll send the v3 patchset in a few minutes. (Update: I have sent them.)
Thank you.

@shubhanshu02 (Contributor, Author) commented:

Thank you for reviewing the pull request, @guoyejun sir.
