
[Runtime][Pipeline Executor] multiple threads management and the data forwarding notification mechanism. #10234

Merged: 10 commits merged into apache:main on Feb 24, 2022

Conversation

huajsj (Contributor) commented Feb 13, 2022

RFC: https://github.com/apache/tvm-rfcs/blob/main/rfcs/0014-pipeline-executor.md
Tracking issue: #8596

In this patch we create working threads, one for each runtime of the pipeline; the threads are terminated once the runtime class is destroyed.

We also add a notification mechanism, derived from the runtime's 'binding configuration', to forward data-ready notifications.
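For readers skimming the PR, here is a minimal sketch of the thread-management pattern the description refers to. The class and member names (BackendRuntimeSketch, WorkerLoop, exit_requested_) are hypothetical; the real implementation lives under src/runtime/pipeline/ and is discussed in the review threads below.

// Illustrative only: one worker thread per pipeline runtime, joined when
// the owning object is destroyed. Names are hypothetical, not the ones
// in src/runtime/pipeline/.
#include <atomic>
#include <thread>

class BackendRuntimeSketch {
 public:
  BackendRuntimeSketch() {
    // Spawn the worker that waits for data-ready notifications.
    worker_ = std::thread([this] { WorkerLoop(); });
  }
  ~BackendRuntimeSketch() {
    // Signal the worker to exit, then join it before members go away.
    exit_requested_.store(true);
    if (worker_.joinable()) worker_.join();
  }

 private:
  void WorkerLoop() {
    while (!exit_requested_.load()) {
      // 1. Wait for a parent runtime to forward a data-ready notification.
      // 2. Run this stage's module on the new input.
      // 3. Notify child runtimes according to the binding configuration.
    }
  }
  std::thread worker_;
  std::atomic<bool> exit_requested_{false};
};

How the exit flag interacts with the blocking wait in step 1 is exactly what the volatile/atomic discussion further down is about.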

huajsj (Contributor, Author) commented Feb 14, 2022

@comaniac, @masahi, please take a look.

masahi (Member) commented Feb 15, 2022

Thanks @huajsj. This is interesting and tricky code; I'll try to review it carefully. cc @comaniac

Where does this PR sit in the feature road map?

        count : int
            The times of pipeline running.
        """
        return self._get_pipe_execute_number()
Review comment (Member):

You are using the terms time or number throughout the code, but please use more specific terms. I think time reads more like a duration, and number more like an elapsed_time?

Please update the naming convention in all files; I won't point out each occurrence.

Reply (huajsj, Author):

fixed.

Review comment (Member):

I think num_executing_pipeline is better.

Reply (huajsj, Author):

fixed.

Resolved review threads (outdated): src/runtime/pipeline/pipeline_executor.cc, src/runtime/pipeline/pipeline_scheduler.cc
/*!\brief Whether a data is ready or not.*/
volatile bool data_ready_ = false;
/*!\brief Whether the thread should exit or not.*/
volatile bool exit_state_ = false;
Review comment (Member):

Use atomic

Reply (huajsj, Author):

It seems we don't need atomic here: 'data_ready_' is already protected by 'mutex_', and for 'exit_state_' there is only a single write in ExitNotify, so no race condition can happen. Although atomics are lightweight, they still carry some performance cost, so if it's not necessary we should avoid using one here.

Reply (Member):

> It seems we don't need atomic here: 'data_ready_' is already protected by 'mutex_', and for 'exit_state_' there is only a single write in ExitNotify, so no race condition can happen.

Then why do you need volatile? I'm not really familiar with volatile, except that it is generally considered bad practice.

Reply (huajsj, Author):

I agree that volatile is bad practice as an 'atomicity' guarantee, but in our case we only need to guarantee data 'visibility' across multiple threads, and for that requirement volatile should be OK.

About the 'atomicity' part: for the variable 'data_ready_', the existing 'mutex_' and condition variable already provide the atomicity guarantee. For the variable 'exit_state_', each notification has only one parent and one child; in such a single-producer, single-consumer scenario both the reads and writes of 'exit_state_' are naturally atomic, so no additional protection is needed.

Reply (Member):

I don't know, but all of the SO answers I looked at say volatile is not needed for threading purposes. For example: https://stackoverflow.com/questions/4557979/when-to-use-volatile-with-multi-threading

Any thoughts?

Reply (Member):

volatile has nothing to do with multi-threading or memory fences. In modern C++ it is only used to force reads from memory and to keep compilers from serving the value out of a register. Note that volatile is not required to introduce any memory fence, so it officially guarantees nothing about visibility. The rule of thumb is to never use volatile except for use cases like DMA.

Reply (Contributor):

@huajsj what do you think about using something like std::promise instead? While I think the use of volatile here is probably okay in a consumer-producer relation, I'm wondering if we can use a more established primitive, to enable more folks in the community to work on the code.
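For illustration, here is a sketch of the std::promise idea suggested above. It is not code from this PR, and all the names are hypothetical: the parent fulfills the promise exactly once, and the worker polls the matching future as its exit condition.

// Hypothetical sketch of a one-shot exit signal built on std::promise.
#include <chrono>
#include <future>
#include <thread>

int main() {
  std::promise<void> exit_signal;
  std::future<void> exit_requested = exit_signal.get_future();

  std::thread worker([f = std::move(exit_requested)] {
    // Keep working until the parent fulfills the promise.
    while (f.wait_for(std::chrono::milliseconds(10)) != std::future_status::ready) {
      // ... wait for a data notification and run the pipeline stage ...
    }
  });

  exit_signal.set_value();  // one-shot exit notification from the parent
  worker.join();
}

A promise can only be set once, which matches the single parent/child exit handshake described earlier, though it would not replace the data_ready_ condition variable.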

Reply (huajsj, Author):

@masahi, @junrushao1994, @areusch, thanks for the follow-up. I agree that promise is a better primitive, and the memory fence is another issue we need to handle correctly: after reviewing the code, I realized a memory barrier is still needed after setting 'exit_state_ = true' and before 'notify()' to guarantee ordering. For that I could either use std::atomic or 'promise + release barrier', and std::atomic seems the better solution, since it solves both issues with one mechanism.

In the latest change I removed volatile from 'data_ready_', because the mutex already provides the acquire/release memory barriers, and I changed 'exit_state_' into a std::atomic. Please check.
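A minimal sketch of the arrangement described in that last comment. The class and method names (DataNotifySketch, Wait) are assumed for illustration; the member names data_ready_, exit_state_, and mutex_ follow the snippet under review.

// Sketch of the final scheme: 'data_ready_' stays a plain bool because
// every access happens under 'mutex_'; 'exit_state_' is std::atomic so
// the store in ExitNotify() is ordered before the wake-up, no volatile.
#include <atomic>
#include <condition_variable>
#include <mutex>

class DataNotifySketch {
 public:
  // Producer: publish new data, then wake the waiting worker.
  void Notify() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      data_ready_ = true;  // mutex_ already gives acquire/release ordering
    }
    cv_.notify_one();
  }
  // Producer: ask the worker to exit. The store happens under the lock to
  // avoid a lost wake-up if the worker is between its predicate check and
  // its sleep.
  void ExitNotify() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      exit_state_.store(true, std::memory_order_release);
    }
    cv_.notify_one();
  }
  // Consumer: block until data arrives or exit is requested.
  // Returns false when asked to exit.
  bool Wait() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] {
      return data_ready_ || exit_state_.load(std::memory_order_acquire);
    });
    data_ready_ = false;
    return !exit_state_.load(std::memory_order_acquire);
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  bool data_ready_ = false;              // protected by mutex_
  std::atomic<bool> exit_state_{false};  // single writer, readable lock-free
};

The acquire/release pair on exit_state_ plays the role of the release barrier mentioned above.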

Resolved review threads (outdated): six threads on src/runtime/pipeline/pipeline_struct.h
huajsj (Contributor, Author) commented Feb 15, 2022

> Thanks @huajsj. This is interesting and tricky code; I'll try to review it carefully. cc @comaniac
>
> Where does this PR sit in the feature road map?

Thanks @masahi. This PR is part of RFC https://github.com/apache/tvm-rfcs/blob/main/rfcs/0014-pipeline-executor.md; the tracking issue is #8596.

huajsj requested a review from masahi on February 15, 2022 at 19:21
huajsj (Contributor, Author) commented Feb 17, 2022

@masahi, all review comments have been addressed; please take a look.

leeexyz (Contributor) left a comment:

Thanks for your efforts. Hope I can try this excellent feature very soon. :)

Resolved review threads (outdated): src/runtime/pipeline/pipeline_scheduler.cc (two threads), src/runtime/pipeline/pipeline_executor.h, src/runtime/pipeline/pipeline_struct.h (two threads)
huajsj requested a review from masahi on February 24, 2022 at 04:49
/*\brief A map including the runtime input index and the notification data structure.*/
std::unordered_map<int, std::shared_ptr<DataNotify>> parents_notify_;
/*\brief The times of using pipeline function. */
uint32_t statistic_pipeline_execute_times_ = 0;
Review comment (Member):

This variable name and its doc do not make a lot of sense ("the times of ..." is not really a word). Please improve them in the next PR.

Reply (huajsj, Author):

Sure, will do that. Thanks @masahi.

masahi merged commit 4102ebf into apache:main on Feb 24, 2022
pfk-beta pushed a commit to pfk-beta/tvm that referenced this pull request on Apr 11, 2022: "… forwarding notification mechanism. (apache#10234)"

* [Runtime][Pipeline Executor] multiple threads management and the
data forwarding notification mechanism.

In this patch we create working threads for each runtime of the pipeline;
the threads are terminated once the runtime class is destroyed.

We also add a notification mechanism, derived from the runtime's 'binding
configuration', to forward data-ready notifications.

* address review comments.

* address review comments.

* fix typo.

* fix typo.

* trigger build.

* address review comments.

* address review comments.

* address review comments.

* address review comments.