Clean up ResponseFactory when a final complete flag is set #358

Closed · wants to merge 5 commits
1 change: 1 addition & 0 deletions src/pb_utils.h
@@ -225,6 +225,7 @@ struct ResponseSenderBase {
   bi::managed_external_buffer::handle_t error;
   intptr_t request_address;
   intptr_t response_factory_address;
+  bool is_response_factory_cleaned;
 };
 
 struct ResponseSendMessage : ResponseSenderBase {
9 changes: 9 additions & 0 deletions src/python_be.cc
@@ -1303,6 +1303,15 @@ ModelInstanceState::ResponseSendDecoupled(
     SetErrorForResponseSendMessage(
         send_message_payload, WrapTritonErrorInSharedPtr(error), error_message);
   }
+
+  if (send_message_payload->flags == TRITONSERVER_RESPONSE_COMPLETE_FINAL) {
+    // Delete response factory
+    std::unique_ptr<
+        TRITONBACKEND_ResponseFactory, backend::ResponseFactoryDeleter>
+        response_factory(reinterpret_cast<TRITONBACKEND_ResponseFactory*>(
+            send_message_payload->response_factory_address));
+    send_message_payload->is_response_factory_cleaned = true;
+  }
 }
 
 TRITONSERVER_Error*
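For context on the hunk above: the cleanup relies on RAII, so constructing the unique_ptr with a custom deleter and letting it fall out of scope is what actually deletes the factory. The following is a minimal sketch of that pattern, assuming only the public Triton C API (TRITONBACKEND_ResponseFactoryDelete, TRITONSERVER_ErrorDelete); ResponseFactoryDeleterSketch is a hypothetical name, not the backend's actual backend::ResponseFactoryDeleter.

#include <memory>

#include "triton/core/tritonbackend.h"

// Hypothetical stand-in for backend::ResponseFactoryDeleter: the factory is
// deleted in operator(), so constructing the unique_ptr and letting it go out
// of scope (as in the hunk above) is enough to free it.
struct ResponseFactoryDeleterSketch {
  void operator()(TRITONBACKEND_ResponseFactory* factory) const
  {
    if (factory != nullptr) {
      TRITONSERVER_Error* err = TRITONBACKEND_ResponseFactoryDelete(factory);
      if (err != nullptr) {
        // A real deleter would log the error; this sketch just releases it.
        TRITONSERVER_ErrorDelete(err);
      }
    }
  }
};

// Usage, mirroring the added block: wrap the raw address stored in shared
// memory so the factory is deleted exactly once when the scope exits.
// std::unique_ptr<TRITONBACKEND_ResponseFactory, ResponseFactoryDeleterSketch>
//     response_factory(reinterpret_cast<TRITONBACKEND_ResponseFactory*>(
//         send_message_payload->response_factory_address));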
18 changes: 12 additions & 6 deletions src/response_sender.cc
@@ -1,4 +1,4 @@
-// Copyright 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+// Copyright 2022-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 //
 // Redistribution and use in source and binary forms, with or without
 // modification, are permitted provided that the following conditions
@@ -41,16 +41,18 @@ ResponseSender::ResponseSender(
     const std::shared_ptr<PbCancel>& pb_cancel)
     : request_address_(request_address),
       response_factory_address_(response_factory_address), shm_pool_(shm_pool),
-      closed_(false), pb_cancel_(pb_cancel)
+      closed_(false), pb_cancel_(pb_cancel), is_response_factory_cleaned_(false)
 {
 }
 
 ResponseSender::~ResponseSender()
 {
-  std::unique_ptr<Stub>& stub = Stub::GetOrCreateInstance();
-  stub->EnqueueCleanupId(
-      reinterpret_cast<void*>(response_factory_address_),
-      PYTHONSTUB_DecoupledResponseFactoryCleanup);
+  if (!is_response_factory_cleaned_) {
+    std::unique_ptr<Stub>& stub = Stub::GetOrCreateInstance();
+    stub->EnqueueCleanupId(
+        reinterpret_cast<void*>(response_factory_address_),
+        PYTHONSTUB_DecoupledResponseFactoryCleanup);
+  }
 }
Comment on lines +52 to +55

Member: Could we remove EnqueueCleanupId (and the associated logic on the server side) now that the response factory is deleted when it sees the final flag?

Contributor Author: I think we would still need to keep it for rescheduled requests. It is possible that the response sender has not sent the final flag yet when the request gets rescheduled. A new response factory is created for each request, even a rescheduled one, so we would still need to clean up the response factory in the destructor.
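To make the handshake under discussion concrete, here is a self-contained sketch of the two cleanup paths, using simplified hypothetical types (the real payload lives in shared memory and is exchanged via IPCMessage): the backend deletes the factory when it sees the final flag and records that in the payload, while the stub-side destructor only enqueues the explicit cleanup message when that never happened, e.g. for a rescheduled request.

#include <cstdint>
#include <iostream>

// Simplified stand-ins; the real payload lives in shared memory.
constexpr uint32_t kCompleteFinal = 1;  // stand-in for TRITONSERVER_RESPONSE_COMPLETE_FINAL

struct SendPayloadSketch {
  uint32_t flags = 0;
  bool is_response_factory_cleaned = false;
};

// Backend side: on the final flag, delete the response factory and record it.
void BackendHandleSendSketch(SendPayloadSketch& payload)
{
  if (payload.flags == kCompleteFinal) {
    // ...delete the TRITONBACKEND_ResponseFactory here (see the hunk in
    // python_be.cc above)...
    payload.is_response_factory_cleaned = true;
  }
}

// Stub side: the destructor falls back to the explicit cleanup message only
// when the backend never reported the deletion (e.g. a rescheduled request
// that never sent the final flag).
void StubDestructorSketch(bool is_response_factory_cleaned)
{
  if (!is_response_factory_cleaned) {
    std::cout << "enqueue PYTHONSTUB_DecoupledResponseFactoryCleanup\n";
  }
}

int main()
{
  SendPayloadSketch payload;
  payload.flags = kCompleteFinal;
  BackendHandleSendSketch(payload);
  StubDestructorSketch(payload.is_response_factory_cleaned);  // no fallback needed
  return 0;
}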


 void
@@ -112,6 +114,7 @@ ResponseSender::Send(
   send_message_payload->has_error = false;
   send_message_payload->is_error_set = false;
   send_message_payload->flags = flags;
+  send_message_payload->is_response_factory_cleaned = false;
 
   std::unique_ptr<IPCMessage> ipc_message =
       IPCMessage::Create(shm_pool_, false /* inline_response */);
@@ -188,6 +191,9 @@ ResponseSender::Send(
     }
   }
 
+  is_response_factory_cleaned_ =
+      send_message_payload->is_response_factory_cleaned;
+
   if (send_message_payload->has_error) {
     if (send_message_payload->is_error_set) {
       std::unique_ptr<PbString> error = PbString::LoadFromSharedMemory(
11 changes: 10 additions & 1 deletion src/response_sender.h
@@ -1,4 +1,4 @@
-// Copyright 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+// Copyright 2022-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 //
 // Redistribution and use in source and binary forms, with or without
 // modification, are permitted provided that the following conditions
@@ -46,7 +46,16 @@ class ResponseSender {
   intptr_t request_address_;
   intptr_t response_factory_address_;
   std::unique_ptr<SharedMemoryManager>& shm_pool_;
+  // The flag to indicate if the response sender is closed. It is set to true
+  // once the TRITONSERVER_RESPONSE_COMPLETE_FINAL flag is set, meaning that the
+  // response_sender should not be used anymore. This flag is separate from the
+  // `is_response_factory_cleaned_` flag because errors might occur after
+  // complete_final flag is set but before the response_factory gets cleaned up.
Contributor: Can you elaborate on the errors that might occur after sending the complete_final flag but before deleting the response_factory? Is it the model holding on to the response_sender forever?

Contributor Author: When the complete final flag is set, it is still possible that some error occurs before the response actually gets sent to the backend (for example, here and here). In that case, the response_factory will not be automatically deleted in the backend, and can only be deleted in the destructor of the response_sender by sending the cleanup message to the backend.

The closed_ variable is just there to make sure users do not use the response sender after they set the final flag. It cannot be used to determine whether the response_factory has been cleaned up on the backend side.
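A condensed sketch of the distinction described above, using a hypothetical class rather than the real ResponseSender: closed_ flips as soon as the caller requests the final flag, but the factory cleanup is only confirmed by the backend afterwards, so an error in between leaves is_response_factory_cleaned_ false and the destructor fallback is still required.

#include <cstdint>
#include <stdexcept>

class ResponseSenderFlagsSketch {
 public:
  void Send(uint32_t flags)
  {
    if (closed_) {
      throw std::logic_error("response sender used after the final flag");
    }
    if (flags == kCompleteFinal) {
      closed_ = true;  // the caller must not call Send() again
    }
    DoIpcRoundTrip();  // may throw before the backend confirms anything
    // Mirrors the value reported back by the backend in the real code.
    is_response_factory_cleaned_ = (flags == kCompleteFinal);
  }

  ~ResponseSenderFlagsSketch()
  {
    if (!is_response_factory_cleaned_) {
      // fall back to the explicit cleanup message (EnqueueCleanupId)
    }
  }

 private:
  static constexpr uint32_t kCompleteFinal = 1;  // stand-in for the Triton flag
  void DoIpcRoundTrip() {}  // placeholder for the shared-memory send
  bool closed_ = false;
  bool is_response_factory_cleaned_ = false;
};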

   bool closed_;
   std::shared_ptr<PbCancel> pb_cancel_;
+  // The flag to indicate if the response_factory is already cleaned up in the
+  // python backend side. If not, the response_factory will be cleaned up in the
+  // destructor.
+  bool is_response_factory_cleaned_;
 };
 }}} // namespace triton::backend::python