
Support for http generate request cancellation and segfault fix #6591

Merged
merged 22 commits into main on Nov 21, 2023

Conversation

@nnshah1 (Contributor) commented Nov 16, 2023

This change hooks the request free callback from libevhtp to support request cancellation.

From inspection, all request free operations have to go through:

https://github.com/triton-inference-server/third_party/blob/8af7da2ae3f4938a8b497728fcca6460797c402f/libevhtp/libevhtp/evhtp.c#L1233

And connection free frees the request first:

https://github.com/triton-inference-server/third_party/blob/8af7da2ae3f4938a8b497728fcca6460797c402f/libevhtp/libevhtp/evhtp.c#L5151

So when a client disconnects, we can trap the request free callback and cancel the request.

There are three objects with lifetimes to maintain:

  1. evhtp_req. The request object for libevhtp. It is created by libevhtp and freed by libevhtp when a connection is freed. We add the request_fini hook to get notified when this object is freed, and use that as an indication that the client has disconnected and the request should be cancelled. The req object cannot be used after the request free hook fires. We remove the hook if the response is complete.

  2. InferRequestClass. Created by the HTTP server to wrap the evhtp_req and TRITONSERVER_InferenceRequest objects; it is used in the response callbacks to send HTTP responses. Its lifetime spans from request creation to the final response for the request.

  3. TRITONSERVER_InferenceRequest. Created by the HTTP server and used to send inputs to the core; it is also used to cancel requests. Its lifetime extends from request creation until both the final response has been sent AND the request release callback (sent from the core) has run.

We use a shared_ptr to manage the lifetime of the InferenceRequest. The shared_ptr is held by the request_release_payload (freed in the request_release_callback) and by the InferRequestClass (freed in the final response callbacks). This enables us to use the request object to cancel requests after it has been released by the core but before the final response is received.

Some minor refactoring stores the inference request in the InferRequestClass. The callbacks were also moved into InferRequestClass, which makes the generate and infer paths similar and avoids having to make protected members public.
 

@nnshah1 nnshah1 marked this pull request as draft November 16, 2023 16:00
@nnshah1 nnshah1 marked this pull request as ready for review November 16, 2023 20:40
@nnshah1 nnshah1 changed the title updates with initial fix for connection gone Support for http generate request cancellation and segfault fix Nov 16, 2023
@Tabrizian (Member) left a comment:

Also, please make sure that L0_infer_valgrind tests pass after this change.

(Inline review threads on src/http_server.cc — all resolved.)
rmccorm4 previously approved these changes Nov 17, 2023

@rmccorm4 (Collaborator) left a comment:

Looks generally good to me. Left some comments for follow-ups; they don't need to block the cherry-pick.

tanmayv25 previously approved these changes Nov 20, 2023

@tanmayv25 (Contributor) left a comment:

This looks great! I will introduce the RequestReleasePayload class in the gRPC frontend too, which should fix some lifetime issues we are still observing.

rmccorm4 previously approved these changes Nov 20, 2023

@rmccorm4 (Collaborator) left a comment:

LGTM. Not sure whether the SageMaker server has any unique behavior, so we will want to make sure L0_sagemaker passes.

GuanLuo previously approved these changes Nov 20, 2023

@nnshah1 (Contributor, Author) commented Nov 21, 2023:

> Also, please make sure that L0_infer_valgrind tests pass after this change.

Tested; the expected tests pass.

@nnshah1 nnshah1 dismissed stale reviews from GuanLuo and rmccorm4 via 74c8ba5 November 21, 2023 15:45
@nnshah1 nnshah1 merged commit b876a90 into main Nov 21, 2023
3 checks passed

5 participants