Fix bug where workunit completion was not reported correctly #10277

gshuflin · 2020-07-07T01:06:40Z

Problem

#10179 changed the architecture of the workunit store: when workunits are started and completed, the methods responsible push them onto an mpsc queue, and they are only added to the store when one of the methods that handles them pulls them off the queue. This is fine when the dynamic UI is running, because the heavy_hitters method runs constantly and therefore logs workunit completion messages. However, in the --no-dynamic-ui case, heavy_hitters doesn't run, so nothing handles the workunits on the queue and their completion messages are never logged.

Solution

The workunit store is already aware of whether the dynamic UI is running or not, so this commit has the workunit completion method check this flag. If it's set, we send workunits over the mpsc queue, and expect them to be added to the store (a locking operation) by heavy_hitters, which needs to grab the lock anyway. When the dynamic UI is not running, the workunit add method doesn't use the heavy hitters mpsc queue at all, and just emits the display message for the workunit completion. Additionally, a new integration test is added to test that these workunit completion log messages get printed.
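The flag check described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `SketchStore`, its `Workunit` type, `drain_queue`, and `logged` are hypothetical stand-ins, and log output is collected into a `Vec<String>` so the behaviour is easy to inspect.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Mutex;

/// Hypothetical, simplified stand-in for the real workunit type.
#[derive(Clone, Debug, PartialEq)]
pub struct Workunit {
    pub name: String,
}

/// Hypothetical stand-in for `WorkunitStore`; only the dispatch logic is modeled.
pub struct SketchStore {
    rendering_dynamic_ui: bool,
    msg_tx: Mutex<Sender<Workunit>>,
    msg_rx: Mutex<Receiver<Workunit>>,
    /// Collected log lines, standing in for real log output.
    log: Mutex<Vec<String>>,
}

impl SketchStore {
    pub fn new(rendering_dynamic_ui: bool) -> Self {
        let (tx, rx) = channel();
        SketchStore {
            rendering_dynamic_ui,
            msg_tx: Mutex::new(tx),
            msg_rx: Mutex::new(rx),
            log: Mutex::new(Vec::new()),
        }
    }

    pub fn complete_workunit(&self, workunit: Workunit) {
        if self.rendering_dynamic_ui {
            // Dynamic UI: enqueue and let the queue consumer handle it later.
            self.msg_tx.lock().unwrap().send(workunit).unwrap();
        } else {
            // No dynamic UI: nothing drains the queue, so log immediately.
            let line = format!("Completed: {}", workunit.name);
            self.log.lock().unwrap().push(line);
        }
    }

    /// Stand-in for the consumer side (the role heavy_hitters plays):
    /// pull every pending workunit off the queue without blocking.
    pub fn drain_queue(&self) -> Vec<Workunit> {
        self.msg_rx.lock().unwrap().try_iter().collect()
    }

    /// Snapshot of the log lines emitted so far.
    pub fn logged(&self) -> Vec<String> {
        self.log.lock().unwrap().clone()
    }
}
```

With `SketchStore::new(false)`, a completed workunit shows up in `logged()` right away and the queue stays empty; with `SketchStore::new(true)`, it only appears once something calls `drain_queue()`.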

Result

Fixes #10274

stuhood · 2020-07-07T21:32:51Z

src/rust/engine/workunit_store/src/lib.rs

@@ -454,30 +462,52 @@ impl WorkunitStore {
      started.log_workunit_state()
    }

-    let sender = self.heavy_hitters_data.msg_tx.lock();
-    sender.send(StoreMsg::Started(started.clone())).unwrap();
+    if self.rendering_dynamic_ui {


Could merge this with the above conditional I think?

stuhood · 2020-07-07T21:33:29Z

src/rust/engine/workunit_store/src/lib.rs

+      let mut inner = self.heavy_hitters_data.inner.lock();
+      HeavyHittersData::add_started_workunit_to_store(started.clone(), &mut inner);


Rather than adding this to the store, could the caller hold onto a copy of the workunit until it was completed?

I think that would make the logic overly complicated for no gain - the store already knows how to hold onto a started workunit, and then mark it as complete when the completed version of the workunit arrives.

Hm. I think that I disagree. Writing to the store here would seem to break the abstraction that we enqueue things to be consumed, but otherwise don't acquire the lock on the store. It adds complexity because it violates that abstraction.

Holding onto the workunit between let workunit = ...start_workunit(); ... complete_workunit(workunit) on the other hand seems to fit into this model just fine, and would likely be less complex?

The locks and queues are concerns internal to WorkunitStore; callers of WorkunitStore::start_workunit() and complete_workunit shouldn't care about what these methods are doing internally. This change would also require the signature of start_workunit and complete_workunit to effectively be different depending on whether the dynamic UI was active or not, if I'm understanding your suggestion correctly.

> This change would also require the signature of start_workunit and complete_workunit to effectively be different depending on whether the dynamic UI was active or not, if I'm understanding your suggestion correctly.

I think that you could do this in all cases, and it would remove the dict/map lookup to figure out "which" workunit you are finishing. It might even be more efficient because of that. But also, any caller using with_workunit (which we should recommend!) wouldn't have to worry about it at all.

And it would probably also fix #10249 ...?
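For readers following the thread, the alternative being proposed (the caller holds the started workunit and hands it back on completion) might look something like the sketch below. The types and signatures here are invented for illustration and are much simpler than the real `WorkunitStore` API; in particular, this `with_workunit` is a synchronous toy wrapper, not the engine's actual function.

```rust
use std::time::{Duration, Instant};

/// Hypothetical handle returned to the caller when a workunit starts.
#[derive(Debug)]
pub struct StartedWorkunit {
    pub name: String,
    started_at: Instant,
}

/// Hypothetical record produced when the handle is completed.
#[derive(Debug)]
pub struct CompletedWorkunit {
    pub name: String,
    pub elapsed: Duration,
}

pub fn start_workunit(name: &str) -> StartedWorkunit {
    StartedWorkunit {
        name: name.to_string(),
        started_at: Instant::now(),
    }
}

/// Consumes the handle, so no id-to-workunit map lookup is needed to
/// find out which workunit is finishing.
pub fn complete_workunit(started: StartedWorkunit) -> CompletedWorkunit {
    CompletedWorkunit {
        elapsed: started.started_at.elapsed(),
        name: started.name,
    }
}

/// Wrapper that pairs start and completion around a closure, so callers
/// never touch the handle themselves.
pub fn with_workunit<T>(name: &str, f: impl FnOnce() -> T) -> (T, CompletedWorkunit) {
    let started = start_workunit(name);
    let result = f();
    (result, complete_workunit(started))
}
```

Because `complete_workunit` takes the handle by value, completing a workunit that was never started is impossible in this sketch, which is one way the approach could avoid warnings like the one in #10249.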

stuhood

Thanks! This looks great.

stuhood · 2020-07-10T00:40:41Z

tests/python/pants_test/integration/BUILD

+    'src/python/pants/engine/internals:tests',
+    'src/python/pants/testutil:int-test',


Nit: Neither of these should be necessary anymore! Huzzah.

@Eric-Arellano: maybe this will be worth a warning at some point... EDIT: ... maybe not though. Complicated by sub-targets I suppose.

Eric-Arellano · 2020-07-10T00:43:51Z

tests/python/pants_test/integration/BUILD

@@ -70,3 +70,14 @@ python_integration_tests(
  sources = ['test_prelude_integration.py'],
  uses_pants_run=True,
 )
+
+python_tests(


This should be python_integration_tests(. Then remove the tag and remove 'src/python/pants/testutil:int-test'. Also, you should be able to remove 'src/python/pants/engine/internals:tests' due to dep inference.

Eric-Arellano · 2020-07-10T00:44:01Z

tests/python/pants_test/integration/BUILD

+
+python_tests(
+  name = 'log_output_integration',
+  sources = [ 'log_output_integration_test.py' ],


Nit

Suggested change

-  sources = [ 'log_output_integration_test.py' ],
+  sources = ['log_output_integration_test.py'],

Eric-Arellano · 2020-07-10T00:44:32Z

tests/python/pants_test/integration/log_output_integration_test.py

+
+        return tmpdir_relative
+
+    def test_completed_log_output(self):


Suggested change

-    def test_completed_log_output(self):
+    def test_completed_log_output(self) -> None:

gshuflin requested a review from stuhood July 7, 2020 01:06

gshuflin force-pushed the fix_workunit_start_bug branch from 4789b6a to 74c9f7a Compare July 7, 2020 20:10

stuhood reviewed Jul 7, 2020

gshuflin force-pushed the fix_workunit_start_bug branch 2 times, most recently from 4fade71 to 1e130f9 Compare July 10, 2020 00:35

stuhood approved these changes Jul 10, 2020

Eric-Arellano reviewed Jul 10, 2020

gshuflin force-pushed the fix_workunit_start_bug branch 6 times, most recently from 88db7a9 to caeac0d Compare July 14, 2020 01:56

gshuflin added 5 commits July 14, 2020 10:49

More Workunit refactoring, removing unnecessary pub

c21799c

add integration test

Fix nits

4c33ed4

Up type length limit

eb8f613

Bump timeout

fca20a6

Add check for invalid concrete time dates

48bfe03

gshuflin force-pushed the fix_workunit_start_bug branch from caeac0d to 48bfe03 Compare July 14, 2020 17:51

gshuflin merged commit ee0cba3 into pantsbuild:master Jul 14, 2020

gshuflin mentioned this pull request Sep 11, 2020

[v2] " No previously-started workunit found for id" warnings show up when running pants #10249

Closed
