Async Support for TensorFlow Backend #407

Closed · wants to merge 9 commits

Conversation

@shubhanshu02 (Contributor) commented May 20, 2021

Patch Set Description

This patchset is part of the deliverables for the GSoC project Async Support for TensorFlow Backend in FFmpeg.

Objective: Asynchronous Support for TensorFlow backend
Parts under this deliverable:
- Switch the execution mode to TFRequestItem-based inference.
- Implement a standard asynchronous inference module, DNNAsyncExecModule, for use across the TF and Native backends.
- Implement async mode in the TensorFlow backend.

Earlier Merged Patches in this patchset

The patches below move TaskItem and InferenceItem from the OpenVINO backend into dnn_backend_common and adjust them for shared use across the three backends. We then define TFRequestItem with its execution parameters and switch the TensorFlow backend's execution mechanism to TFRequestItem-based inference.

f5ab890 lavfi/dnn: Extract TaskItem and InferenceItem from OpenVino Backend
446b4f7 lavfi/dnn: Convert output_name to char** in TaskItem
9675ebb lavfi/dnn: Add nb_output to TaskItem
6b961f7 lavfi/dnn: Use uint8_t for async and do_ioproc in TaskItems
5509235 lavfi/dnn: Fill Task using Common Function
68cf14d lavfi/dnn_backend_tf: TaskItem Based Inference
a4de605 lavfi/dnn_backend_tf: Add TFInferRequest and TFRequestItem
08d8b3b lavfi/dnn_backend_tf: Request-based Execution
b849228 lavfi/dnn_backend_tf: Separate function for filling RequestItem
84e4e60 lavfi/dnn_backend_tf: Separate function for Completion Callback
6f9570a lavfi/dnn_backend_tf: Error Handling

Final Patches

The patches below implement DNNAsyncExecModule and use it in the TensorFlow backend to add the async mode. The idea behind DNNAsyncExecModule is to execute a number of TFRequestItems (the count can be set with the backend configuration parameter nireq) concurrently alongside the main FFmpeg execution thread, so that inference requests run asynchronously.

Each TFRequestItem has its own instance of DNNAsyncExecModule, which corresponds to a single thread. When TF_SessionRun returns, the thread routine returns with the relevant exit code and the TFRequestItem is pushed back to the request_queue. This exit status is checked the next time the same TFRequestItem is used for execution. If the previous execution failed, the error message has already been printed, so we cancel all further executions by returning DNN_ERROR.
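
As a rough illustration of this flow, here is a minimal sketch using plain pthreads; the struct layout and function names are illustrative and are not copied from the actual DNNAsyncExecModule code (the `(void *)-1` failure value is the DNN_ASYNC_FAIL convention described later in this thread):

```c
#include <pthread.h>
#include <stdio.h>

#define DNN_SUCCESS 0
#define DNN_ERROR   (-1)
#define DNN_ASYNC_FAIL ((void *)-1)   /* exit value of a failed inference thread */

typedef struct AsyncExecModule {
    int  (*start_inference)(void *args);  /* runs TF_SessionRun (or equivalent) */
    void (*callback)(void *args);         /* completion callback, fetches the output */
    void *args;                           /* per-request execution parameters */
    pthread_t thread;
    int thread_started;
} AsyncExecModule;

/* Thread routine: run the inference, then the completion callback on success. */
static void *async_thread_routine(void *arg)
{
    AsyncExecModule *mod = arg;

    if (mod->start_inference(mod->args) != DNN_SUCCESS)
        return DNN_ASYNC_FAIL;            /* completion callback is skipped on failure */
    mod->callback(mod->args);
    return NULL;
}

/* Join the previous run of this request (if any), check its exit status,
 * and start a new worker thread for the next inference. */
static int start_async_inference(AsyncExecModule *mod)
{
    void *status = NULL;

    if (mod->thread_started) {
        pthread_join(mod->thread, &status);
        mod->thread_started = 0;
        if (status == DNN_ASYNC_FAIL) {
            fprintf(stderr, "Previous async inference failed, cancelling further executions\n");
            return DNN_ERROR;
        }
    }
    if (pthread_create(&mod->thread, NULL, async_thread_routine, mod))
        return DNN_ERROR;
    mod->thread_started = 1;
    return DNN_SUCCESS;
}
```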

86f0a4f lavfi/dnn: Add Async Execution Mechanism and Documentation
c716578 lavfi/dnn: Common Function to Get Async Result in DNN Backends
e6ae8fc lavfi/dnn_backend_tf: TFInferRequest Execution and Documentation
0985e92 lavfi/dnn: Async Support for TensorFlow Backend
a3db9b5 lavfi/dnn_backend_tf: Error Handling for execute_model_tf
4d627ac lavfi/dnn_backend_tf: Add TF_Status to TFRequestItem
009b2e5 lavfi/dnn: Extract Common Parts from get_output functions
371e567 lavfi/dnn_backend_tf: Error Handling for tf_create_inference_request
2063745 lavfi/dnn: DNNAsyncExecModule Execution Failure Handling

@shubhanshu02 (Contributor, Author) left a comment:


@guoyejun I have two questions regarding these changes.

(Two review threads, now outdated/resolved: libavfilter/dnn/dnn_backend_common.h and libavfilter/dnn/dnn_backend_tf.c)
@guoyejun (Collaborator) commented:

This PR proposes the following changes:

  1. Extract TaskItem and InferenceItem from the OpenVINO backend.
  2. Change output_name to output_names in TaskItem (for use by other backends).
  3. Add a common function for filling tasks in the ff_dnn_execute_model_<backend> functions.
  4. In ff_dnn_free_model_ov, objects popped from inference_queue must have type InferenceItem *.
  5. RequestItem-based execution in the TensorFlow backend.

Description

  • Each RequestItem contains a pointer to a tf_infer_request instance. The infer_request holds the parameters needed to store the input and output of TF_SessionRun.
  • Initially, when the model is loaded in ff_dnn_load_model_tf, the request queue is created and a total of nireq requests are pushed to it. At this step, we also allocate the tf_infer_request for each request using the function tf_create_inference_request.
  • After TF_SessionRun completes, infer_completion_callback is called and the output frame is fetched. Before pushing the RequestItem back into the safe queue, the pointers inside tf_infer_request are freed while the structure itself stays allocated (so we don't need to reallocate it for every request).
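
A rough sketch of the structures and the free-and-reuse step described in these bullets; the field names are illustrative, not taken verbatim from the FFmpeg tree:

```c
#include <tensorflow/c/c_api.h>
#include "libavutil/mem.h"

/* Per-request parameters for TF_SessionRun (illustrative field names). */
typedef struct TFInferRequest {
    TF_Output *tf_outputs;        /* output operations passed to TF_SessionRun */
    TF_Tensor **output_tensors;   /* tensors produced by TF_SessionRun */
    TF_Output *tf_input;          /* input operation */
    TF_Tensor *input_tensor;      /* input tensor filled from the frame */
} TFInferRequest;

typedef struct TFRequestItem {
    TFInferRequest *infer_request; /* allocated once when the model is loaded */
    /* ... task bookkeeping, async execution state, etc. ... */
} TFRequestItem;

/* Free only the per-run pointers; the TFInferRequest itself stays allocated
 * so it can be reused the next time this request is popped from the queue.
 * Output tensors are assumed to have been deleted after their data was
 * copied into the output frame. */
static void tf_free_request_contents(TFInferRequest *request)
{
    if (!request)
        return;
    if (request->input_tensor) {
        TF_DeleteTensor(request->input_tensor);
        request->input_tensor = NULL;
    }
    av_freep(&request->tf_input);
    av_freep(&request->tf_outputs);
    av_freep(&request->output_tensors);
}
```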

It doesn't matter if you leave this blank, since there is a commit log in each change.
By the way, I have some other things to do recently, so my review will be a bit slow.

@shubhanshu02 (Contributor, Author) replied:

> By the way, I have some other things to do recently, so my review will be a bit slow.

Sure, no problem.

@shubhanshu02 force-pushed the ovtasks branch 7 times, most recently from 49e3116 to 3811dcf on May 26, 2021 18:02
@guoyejun (Collaborator) left a comment:


Typo in the commit log subject line and body: uint8_32 -> uint8_t

@guoyejun (Collaborator) commented:

> Typo in the commit log subject line and body: uint8_32 -> uint8_t

for the patch: lavfi/dnn: Use uint8_32 for async and do_ioproc in TaskItems

@shubhanshu02 force-pushed the ovtasks branch 3 times, most recently from a1d1624 to 026a4dd on May 27, 2021 20:52
@guoyejun (Collaborator) commented:

I don't have other comments. Please fix the two issues and send the 8 patches (one new patch) to the community. @Semmer2, please help verify the patches once they are on the mailing list by running sr/derain/dnn_processing/dnn_detect with the TF backend.

@shubhanshu02 force-pushed the ovtasks branch 5 times, most recently from 45b43ee to e62b359 on August 4, 2021 11:46
@shubhanshu02 (Contributor, Author) commented:

@guoyejun sir, I have added the async status handling patch (b2e78c1) to this pull request. Please have a look at it.

Basically, it checks the status of the previously running thread for the same request; if it is (void *)-1 (named DNN_ASYNC_FAIL here), the execution was not successful. The status is 3 if no such thread was running.
If execution fails, async_module->start_inference returns DNN_ERROR, cleans up the execution parameters, and pushes the request back to the queue. On receiving DNN_ERROR, the thread returns with the exit value above without calling the completion callback, so the potential memory leak is handled inside the start_inference function.

This commit adds an async execution mechanism for common use
in the TensorFlow and Native backends.
It also adds documentation for the typedefs and functions in
the async module used across the DNN backends.

Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This commit refactors the get-async-result function for common
use in all three backends.

Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This commit adds a function for executing a TFInferRequest, along with
documentation for the functions related to TFInferRequest.

Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This commit enables async execution in the TensorFlow backend
and adds a function to flush extra frames.

The async execution mechanism executes the TFInferRequests on
a separate thread, which is joined before the next execution of
the same TFRequestItem, or while freeing the model.

The following is a comparison of this mechanism with the existing
sync mechanism on the TensorFlow C API 2.5 CPU variant.

Async Mode: 4m32.846s
Sync Mode: 5m17.582s

The above was measured on the super resolution filter using the SRCNN model.

Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This patch adds error handling for the case where execute_model_tf
fails: it clears the used memory in the TFRequestItem and finally pushes
it back to the request queue.

Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
Since requests run in parallel, the execution status can become
inconsistent. To resolve this, we avoid using a mutex, as that
would allow only a single TF_Session run at a time. So, add a
TF_Status to the TFRequestItem.

Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
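
To illustrate the design choice in the commit above (one TF_Status owned by each request rather than a single shared status behind a mutex), a per-request session run might look like the following sketch; the function and parameter names are illustrative, not the actual backend code:

```c
#include <tensorflow/c/c_api.h>

/* Each request owns its own TF_Status, so parallel TF_SessionRun calls never
 * race on a shared status object and no mutex is needed around the session run. */
static int run_request(TF_Session *session,
                       TF_Output *input, TF_Tensor *input_tensor,
                       TF_Output *outputs, TF_Tensor **output_tensors, int nb_outputs,
                       TF_Status *status /* owned by this TFRequestItem */)
{
    TF_SessionRun(session, NULL,
                  input, &input_tensor, 1,
                  outputs, output_tensors, nb_outputs,
                  NULL, 0, NULL, status);
    return TF_GetCode(status) == TF_OK ? 0 : -1;
}
```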
Frame allocation and filling the TaskItem with execution
parameters are common to the three backends. This commit moves
this logic to dnn_backend_common.

Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>

This commit adds a check for the case where the newly created
TFInferRequest is NULL.

Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
This commit handles the case where the asynchronous execution
of a request fails, by checking the thread's exit status when
joining before starting another execution. On failure, it also
performs the necessary cleanup.

Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
@guoyejun (Collaborator) commented Aug 8, 2021

Looks good to me. Any other change besides the last patch?

You may send the v3 patches together with this patch, since they naturally belong together.

@Semmer2 please help to verify the patches.

@shubhanshu02 (Contributor, Author) commented Aug 8, 2021

> Any other change besides the last patch?

I can't think of anything else that would improve the patchset for now; I think that's all for this patchset.
I'll send the v3 patchset in a few minutes. (Update: I have sent them.)
Thank you.

@shubhanshu02 (Contributor, Author) commented:

Thank you for reviewing the pull request, @guoyejun sir.
