Skip to content

[BEAM-4268] Improving the separation between Metrics API and Execution#5323

Merged
pabloem merged 3 commits into
apache:masterfrom
pabloem:fix-ss-metrics
May 11, 2018
Merged

[BEAM-4268] Improving the separation between Metrics API and Execution#5323
pabloem merged 3 commits into
apache:masterfrom
pabloem:fix-ss-metrics

Conversation

@pabloem
Copy link
Copy Markdown
Member

@pabloem pabloem commented May 9, 2018

Runners do not import metrics immediately, thus cleaning up the dependency between them.

@pabloem pabloem closed this May 9, 2018
@pabloem pabloem reopened this May 9, 2018
@pabloem
Copy link
Copy Markdown
Member Author

pabloem commented May 10, 2018

Run Python PostCommit

@pabloem pabloem changed the title Improving the separation between Metrics API and Execution [BEAM-4268] Improving the separation between Metrics API and Execution May 10, 2018
@pabloem
Copy link
Copy Markdown
Member Author

pabloem commented May 10, 2018

@pabloem
Copy link
Copy Markdown
Member Author

pabloem commented May 10, 2018

r: @aaltay
This is a refactor of #5304, after working out a few issues.
This is a 2.5.0 blocker, and should also be in the next google import.

Copy link
Copy Markdown
Member

@aaltay aaltay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Could you try running py streaming wordcournt with direct runner before merging. The following fails for me at head (before this PR):

python -m apache_beam.examples.streaming_wordcount
--input_topic projects/$PROJECT/topics/wordstream-$USER-topic-1
--output_topic projects/$PROJECT/topics/wordstream-$USER-topic-2
--streaming

gcloud alpha pubsub topics publish wordstream-$USER-topic-1 'a message'

with:

  File "apache_beam/metrics/metric.py", line 100, in inc
    container = MetricsEnvironment.current_container()
  File "apache_beam/metrics/execution.py", line 143, in current_container
    sampler = statesampler.get_current_tracker()
NameError: global name 'statesampler' is not defined [while running 'split']

result = DataflowPipelineResult(
self.dataflow_client.create_job(self.job), self)

from apache_beam.runners.dataflow.dataflow_metrics import DataflowMetrics
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment about imports that are not on top, explaining the reasoning (a JIRA issue, a pylint comment). Anything that would help us to fix the issue in the future will be useful.

@pabloem
Copy link
Copy Markdown
Member Author

pabloem commented May 11, 2018

I've verified that the job runs, and added todo items for the imports. I'll merge the change as soon as precommits pass.

@pabloem pabloem merged commit 2a4d2ea into apache:master May 11, 2018
@pabloem pabloem deleted the fix-ss-metrics branch May 11, 2018 05:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants