[BEAM-2718] Add test to fix partial writeouts after a bundle retry by mariapython · Pull Request #3833 · apache/beam

mariapython · 2017-09-11T06:29:48Z

No description provided.

mariapython · 2017-09-12T23:11:33Z

coveralls · 2017-09-13T00:39:15Z

Coverage decreased (-0.01%) to 69.503% when pulling af94b7b on mariapython:retry_tests into 8e391d9 on apache:master.

charlesccychen

Thanks!

charlesccychen · 2017-09-15T15:01:19Z

sdks/python/apache_beam/pipeline_test.py

+
+      def start_bundle(self):
+        self.step_context = self._execution_context.get_step_context()
+        self.step_context.clear_partial_states()


I think it would be better to have a reset() method on _ExecutionContext that can be called to clear the step context and remove clear_partial_states(). We can then call this in attempt_call().
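For readers following the thread, here is a minimal sketch of the suggested flow. It does not use the real Beam classes: `StepContext`, `ExecutionContext`, `attempt_call`, and `make_flaky_bundle` are simplified, hypothetical stand-ins. It assumes a retry loop that calls `reset()` before each attempt, so partial state from a failed attempt is never visible to the next one.

```python
import copy


class StepContext:
    """Hypothetical stand-in for the direct runner's step context."""

    def __init__(self, existing_keyed_state):
        self.existing_keyed_state = existing_keyed_state  # committed state
        self.partial_keyed_state = {}  # uncommitted state for this attempt

    def get_keyed_state(self, key):
        # Work on a copy so a failed attempt never mutates committed state.
        if key not in self.partial_keyed_state:
            self.partial_keyed_state[key] = copy.deepcopy(
                self.existing_keyed_state.get(key, []))
        return self.partial_keyed_state[key]


class ExecutionContext:
    """Hypothetical stand-in for _ExecutionContext."""

    def __init__(self, existing_keyed_state):
        self._existing_keyed_state = existing_keyed_state
        self._step_context = None

    def get_step_context(self):
        # Lazily create the step context on first use.
        if self._step_context is None:
            self._step_context = StepContext(self._existing_keyed_state)
        return self._step_context

    def reset(self):
        # Drop the step context, which may contain partial state.
        self._step_context = None


def attempt_call(execution_context, process_bundle, max_attempts=4):
    """Assumed retry loop: reset the context before every attempt."""
    for attempt in range(max_attempts):
        execution_context.reset()
        try:
            return process_bundle(execution_context.get_step_context())
        except RuntimeError:
            if attempt == max_attempts - 1:
                raise


def make_flaky_bundle():
    """Returns a bundle function that fails midway on its first call only."""
    calls = {"n": 0}

    def process_bundle(step_context):
        state = step_context.get_keyed_state(None)
        state.append("a")
        calls["n"] += 1
        if calls["n"] == 1:
            raise RuntimeError("simulated failure mid-bundle")
        state.append("b")
        return list(state)

    return process_bundle


result = attempt_call(ExecutionContext({}), make_flaky_bundle())
```

In this sketch the retried attempt starts from a fresh step context, so the `"a"` written by the failed first attempt is not carried over. Without the `reset()` call, the second attempt would reuse the same partial state and see that element twice.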

charlesccychen · 2017-09-15T15:01:19Z

sdks/python/apache_beam/pipeline_test.py

+    from collections import defaultdict
+    from apache_beam.runners.direct.transform_evaluator import _TransformEvaluator
+    from apache_beam.runners.direct.transform_evaluator import _GroupByKeyOnlyEvaluator
+    from apache_beam.runners.direct.evaluation_context import _ExecutionContext


These can be at the top of the file (underscore imports are ok in tests).

charlesccychen · 2017-09-15T15:01:19Z

sdks/python/apache_beam/runners/direct/evaluation_context.py

      self.partial_keyed_state[key] = self.existing_keyed_state[key].clone()
    return self.partial_keyed_state[key]
+
+  def clear_partial_states(self):


I think it would be better to have a reset() method on _ExecutionContext that can be called to clear the step context and remove clear_partial_states(). We can then call this in attempt_call().

mariapython

PTAL

mariapython · 2017-09-18T05:28:55Z

sdks/python/apache_beam/pipeline_test.py

+    from collections import defaultdict
+    from apache_beam.runners.direct.transform_evaluator import _TransformEvaluator
+    from apache_beam.runners.direct.transform_evaluator import _GroupByKeyOnlyEvaluator
+    from apache_beam.runners.direct.evaluation_context import _ExecutionContext


charlesccychen wrote:
These can be at the top of the file (underscore imports are ok in tests).

Done.

mariapython · 2017-09-18T05:28:55Z

sdks/python/apache_beam/runners/direct/evaluation_context.py

      self.partial_keyed_state[key] = self.existing_keyed_state[key].clone()
    return self.partial_keyed_state[key]
+
+  def clear_partial_states(self):


charlesccychen wrote:
I think it would be better to have a reset() method on _ExecutionContext that can be called to clear the step context and remove clear_partial_states(). We can then call this in attempt_call().

Done.

mariapython · 2017-09-18T05:28:55Z

sdks/python/apache_beam/pipeline_test.py

+
+      def start_bundle(self):
+        self.step_context = self._execution_context.get_step_context()
+        self.step_context.clear_partial_states()


charlesccychen wrote:
I think it would be better to have a reset() method on _ExecutionContext that can be called to clear the step context and remove clear_partial_states(). We can then call this in attempt_call().

Done.

mariapython · 2017-09-18T18:28:23Z

Retest this please

mariapython · 2017-09-19T22:40:35Z

Retest this please

coveralls · 2017-09-20T01:02:13Z

Coverage increased (+0.02%) to 69.543% when pulling 3000042 on mariapython:retry_tests into bd0facc on apache:master.

charlesccychen

Thanks!

At sdks/python/apache_beam/runners/direct/evaluation_context.py:344:

          self.existing_keyed_state[key].copy())

This can fit on just one line.

charlesccychen · 2017-09-22T20:55:23Z

sdks/python/apache_beam/runners/direct/executor.py

        side_input_values, scoped_metrics_container)
+    evaluator._execution_context.reset()
+    if hasattr(evaluator, 'step_context'):
+      evaluator.global_state = evaluator.step_context.get_keyed_state(None)


Why do we set the global_state property here? The evaluator should be responsible for doing this.
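One way to read this comment, sketched with hypothetical stand-in classes rather than the real Beam ones: the executor only resets the execution context, and the evaluator looks up its own `step_context` and `global_state` in `start_bundle()`. The names `FakeStepContext`, `FakeExecutionContext`, `GroupingEvaluator`, and `run_attempt` are assumptions made for illustration.

```python
class FakeStepContext:
    """Hypothetical step context holding per-key state dictionaries."""

    def __init__(self):
        self._keyed_state = {}

    def get_keyed_state(self, key):
        return self._keyed_state.setdefault(key, {})


class FakeExecutionContext:
    """Hypothetical execution context with a lazily created step context."""

    def __init__(self):
        self._step_context = None

    def get_step_context(self):
        if self._step_context is None:
            self._step_context = FakeStepContext()
        return self._step_context

    def reset(self):
        self._step_context = None


class GroupingEvaluator:
    """Hypothetical evaluator that owns its state lookup."""

    def __init__(self, execution_context):
        self._execution_context = execution_context

    def start_bundle(self):
        # The evaluator fetches its own state at the start of every attempt,
        # so the executor needs no hasattr() check or attribute assignment.
        self.step_context = self._execution_context.get_step_context()
        self.global_state = self.step_context.get_keyed_state(None)


def run_attempt(evaluator):
    """Assumed executor-side logic: reset, then let the evaluator set up."""
    evaluator._execution_context.reset()
    evaluator.start_bundle()


evaluator = GroupingEvaluator(FakeExecutionContext())
run_attempt(evaluator)
first_state = evaluator.global_state
first_state["partial"] = 1  # simulate a partial write in a failed attempt
run_attempt(evaluator)      # the retry sees fresh state
```

Keeping the lookup inside the evaluator means evaluators without a step context need no special casing on the executor side.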

charlesccychen · 2017-09-22T20:55:23Z

sdks/python/apache_beam/runners/direct/evaluation_context.py

    return self._step_context

+  def reset(self):
+    if self._step_context:


This method can be simplified to:

    # Reset step context, which may contain partial state.
    self._step_context = None
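A small sketch of why the guard can be dropped, assuming (as the diff context suggests) that `get_step_context()` recreates the step context lazily. `ExecutionContextSketch` is a hypothetical stand-in that uses a plain dict for the step context.

```python
class ExecutionContextSketch:
    """Hypothetical stand-in; the step context is modelled as a plain dict."""

    def __init__(self):
        self._step_context = None

    def get_step_context(self):
        # Recreated on demand after a reset.
        if self._step_context is None:
            self._step_context = {"partial_keyed_state": {}}
        return self._step_context

    def reset_guarded(self):
        # Original shape: only clear when a step context exists.
        if self._step_context:
            self._step_context = None

    def reset(self):
        # Reset step context, which may contain partial state.
        self._step_context = None


guarded = ExecutionContextSketch()
simple = ExecutionContextSketch()
for ctx, do_reset in ((guarded, guarded.reset_guarded), (simple, simple.reset)):
    do_reset()  # safe even before any step context exists
    ctx.get_step_context()["partial_keyed_state"]["k"] = "stale"
    do_reset()  # discards the partial state
```

Assigning `None` when the attribute is already `None` is a no-op, so both variants leave the context in the same state.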

mariapython

At sdks/python/apache_beam/runners/direct/evaluation_context.py:344:

          self.existing_keyed_state[key].copy())

charlesccychen wrote:
This can fit on just one line.

Done.

mariapython · 2017-09-22T23:00:07Z

sdks/python/apache_beam/runners/direct/executor.py

        side_input_values, scoped_metrics_container)
+    evaluator._execution_context.reset()
+    if hasattr(evaluator, 'step_context'):
+      evaluator.global_state = evaluator.step_context.get_keyed_state(None)


charlesccychen wrote:
Why do we set the global_state property here? The evaluator should be responsible for doing this.

Done.

mariapython · 2017-09-22T23:00:07Z

sdks/python/apache_beam/runners/direct/evaluation_context.py

    return self._step_context

+  def reset(self):
+    if self._step_context:


charlesccychen wrote:
This method can be simplified to:

    # Reset step context, which may contain partial state.
    self._step_context = None

Done.

coveralls · 2017-09-23T00:34:11Z

Changes Unknown when pulling 2bc3b24 on mariapython:retry_tests into apache:master.

charlesccychen

Thanks!

charlesccychen · 2017-09-25T17:50:57Z

sdks/python/apache_beam/runners/direct/evaluation_context.py

    if not self.partial_keyed_state.get(key):
-      self.partial_keyed_state[key] = (
-          self.existing_keyed_state[key].copy())
+      self.partial_keyed_state[key] = (self.existing_keyed_state[key].copy())


No need for extra parentheses.

charlesccychen · 2017-09-25T17:50:57Z

sdks/python/apache_beam/runners/direct/evaluation_context.py

    return self._step_context

+  def reset(self):
+    if self._step_context:


You don't need the if here.

charlesccychen · 2017-09-25T17:50:58Z

sdks/python/apache_beam/pipeline_test.py

+
+      def __init__(self):
+        self._execution_context = _ExecutionContext(None, {})
+        self._execution_context.get_step_context().get_keyed_state(None)


Why do we do this here? Can you clarify or comment?

charlesccychen · 2017-09-25T17:50:59Z

sdks/python/apache_beam/pipeline_test.py

+    from collections import defaultdict
+    from apache_beam.runners.direct.transform_evaluator import _TransformEvaluator
+    from apache_beam.runners.direct.transform_evaluator import _GroupByKeyOnlyEvaluator
+    from apache_beam.runners.direct.evaluation_context import _ExecutionContext


mariapython wrote:
Done.

I don't see the change here.

mariapython

PTAL

mariapython · 2017-09-25T23:26:15Z

sdks/python/apache_beam/runners/direct/evaluation_context.py

    if not self.partial_keyed_state.get(key):
-      self.partial_keyed_state[key] = (
-          self.existing_keyed_state[key].copy())
+      self.partial_keyed_state[key] = (self.existing_keyed_state[key].copy())


charlesccychen wrote:
No need for extra parentheses.

Done.

mariapython · 2017-09-25T23:26:15Z

sdks/python/apache_beam/runners/direct/evaluation_context.py

    return self._step_context

+  def reset(self):
+    if self._step_context:


charlesccychen wrote:
You don't need the if here.

Done.

mariapython · 2017-09-25T23:26:16Z

sdks/python/apache_beam/pipeline_test.py

+
+      def __init__(self):
+        self._execution_context = _ExecutionContext(None, {})
+        self._execution_context.get_step_context().get_keyed_state(None)


charlesccychen wrote:
Why do we do this here? Can you clarify or comment?

Done.

mariapython · 2017-09-25T23:26:17Z

sdks/python/apache_beam/pipeline_test.py

+    from collections import defaultdict
+    from apache_beam.runners.direct.transform_evaluator import _TransformEvaluator
+    from apache_beam.runners.direct.transform_evaluator import _GroupByKeyOnlyEvaluator
+    from apache_beam.runners.direct.evaluation_context import _ExecutionContext


charlesccychen wrote:
I don't see the change here.

Done.

coveralls · 2017-09-26T01:03:36Z

Coverage increased (+0.02%) to 69.545% when pulling be5e425 on mariapython:retry_tests into 352f106 on apache:master.

charlesccychen

Thanks! R: @chamikaramj for merge.

mariapython wrote:
PTAL

Done.

mariapython force-pushed the retry_tests branch from 438e576 to af94b7b Compare September 12, 2017 23:10

mariapython changed the title ~~Add test to fix partial writeouts after a bundle retry~~ [BEAM-2718] Add test to fix partial writeouts after a bundle retry Sep 12, 2017

charlesccychen reviewed Sep 15, 2017

mariapython force-pushed the retry_tests branch from af94b7b to 3000042 Compare September 18, 2017 05:28

mariapython commented Sep 18, 2017

charlesccychen reviewed Sep 22, 2017

mariapython force-pushed the retry_tests branch from 3000042 to 2bc3b24 Compare September 22, 2017 22:59

mariapython commented Sep 22, 2017

charlesccychen reviewed Sep 25, 2017

mariapython added 5 commits September 25, 2017 15:57

Add test to fix partial writouts after a bundle retry

097fe58

Clear partial keyed states for the Direct Runner

7b9c877

Add review comments

399e6ea

Address review comments

62b9dd3

Add review comments

be5e425

mariapython force-pushed the retry_tests branch from 2bc3b24 to be5e425 Compare September 25, 2017 23:24

mariapython commented Sep 25, 2017

charlesccychen approved these changes Sep 26, 2017

asfgit closed this in 2aa3b5c Sep 26, 2017
