Skip to content

[Hexagon] Fix RPC session close by adding shutdown PackedFunc#12960

Merged
mehrdadh merged 2 commits intoapache:mainfrom
mehrdadh:hexagon/fix_rpc_shutdown_issue
Oct 5, 2022
Merged

[Hexagon] Fix RPC session close by adding shutdown PackedFunc#12960
mehrdadh merged 2 commits intoapache:mainfrom
mehrdadh:hexagon/fix_rpc_shutdown_issue

Conversation

@mehrdadh
Copy link
Member

@mehrdadh mehrdadh commented Sep 30, 2022

This PR adds a shutdown packed function to RPC module so we can force close the RPC session from python side.

cc @areusch @Lunderberg

@mehrdadh mehrdadh force-pushed the hexagon/fix_rpc_shutdown_issue branch 3 times, most recently from d81753f to 8b05730 Compare October 4, 2022 18:37
@mehrdadh mehrdadh changed the title [Draft] Fix RPC session close [Hexagon] Fix RPC session close by adding shutdown PackedFunc Oct 4, 2022
@mehrdadh mehrdadh force-pushed the hexagon/fix_rpc_shutdown_issue branch from 8b05730 to 4acfafd Compare October 4, 2022 18:38
@mehrdadh mehrdadh force-pushed the hexagon/fix_rpc_shutdown_issue branch from 4acfafd to 701823c Compare October 4, 2022 18:40
@mehrdadh mehrdadh marked this pull request as ready for review October 4, 2022 18:41
@github-actions github-actions bot requested review from Lunderberg and areusch October 4, 2022 18:42
Copy link
Contributor

@Lunderberg Lunderberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, looks good, just a couple of changes to naming/documentation.

const char* type_key() const final { return "rpc"; }

PackedFunc GetFunction(const std::string& name, const ObjectPtr<Object>& sptr_to_self) final {
if (name == "Shutdown") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I'm reading it, I think we should rename the exposed function to something more descriptive, such as CloseRPCConnection. I know we initially named it Shutdown because that's what the RPCEndpoint function is called, but it could cause confusion at this level. (e.g. "Shutdown" could also mean to power off the remote device.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed it.

RPCCode RPCEndpoint::HandleUntilReturnEvent(bool client_mode, RPCSession::FEncodeReturn setreturn) {
RPCCode code = RPCCode::kCallFunc;

ICHECK(channel_ != nullptr) << "channel is already closed";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this would be the first error message seen by a user/developer who runs into it, we should include more context. Also, since it could result from usage rather than being an internal error, we should use CHECK instead of ICHECK.

CHECK(channel_) << "Expected connection to server " << name_ << " to be active, but the connection was previously closed";

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

/*!
* \brief The server shutdown cleanup function.
*/
void Shutdown();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should have a more descriptive docstring here, and in the destructor. For the destructor, we should say that it closes the connection, if the connection hasn't already been closed. For the Shutdown function, we should

  • Shutdown has no effect if the connection has already been shut down.
  • Shutdown will wait for all output currently queued from the RPC connection (i.e. The user doesn't need to wait for completion before calling Shutdown.)
  • Any further use of objects that depended on the endpoint (e.g. A tvm.nd.array allocated on the remote RPC session) may throw an exception when used.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

run_pytest ctypes python-contrib-hexagon tests/python/contrib/test_hexagon
else
run_pytest ctypes python-contrib-hexagon tests/python/contrib/test_hexagon --tx $num_of_devices*popen --dist=load
run_pytest ctypes python-contrib-hexagon tests/python/contrib/test_hexagon -n=$num_of_devices
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like an unrelated change. Should it be part of this PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, this will unblock the physical device (HDK) CI.

*/
static std::shared_ptr<RPCSession> Get(int table_index);

virtual void Shutdown() {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docstring needed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

@mehrdadh mehrdadh force-pushed the hexagon/fix_rpc_shutdown_issue branch from 698658a to 501d503 Compare October 5, 2022 18:07
Copy link
Contributor

@Lunderberg Lunderberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the changes, and looks good!

@mehrdadh mehrdadh merged commit 2860a50 into apache:main Oct 5, 2022
@mehrdadh mehrdadh deleted the hexagon/fix_rpc_shutdown_issue branch October 5, 2022 21:08
@mehrdadh
Copy link
Member Author

mehrdadh commented Oct 11, 2022

fixed #12779

xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 25, 2022
…#12960)

* Add shutdown function to RPC module

* address comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants