[BEAM-4268] Improving the separation between Metrics API and Execution#5323
Conversation
|
Run Python PostCommit |
aaltay
left a comment
There was a problem hiding this comment.
LGTM. Could you try running py streaming wordcournt with direct runner before merging. The following fails for me at head (before this PR):
python -m apache_beam.examples.streaming_wordcount
--input_topic projects/$PROJECT/topics/wordstream-$USER-topic-1
--output_topic projects/$PROJECT/topics/wordstream-$USER-topic-2
--streaming
gcloud alpha pubsub topics publish wordstream-$USER-topic-1 'a message'
with:
File "apache_beam/metrics/metric.py", line 100, in inc
container = MetricsEnvironment.current_container()
File "apache_beam/metrics/execution.py", line 143, in current_container
sampler = statesampler.get_current_tracker()
NameError: global name 'statesampler' is not defined [while running 'split']
| result = DataflowPipelineResult( | ||
| self.dataflow_client.create_job(self.job), self) | ||
|
|
||
| from apache_beam.runners.dataflow.dataflow_metrics import DataflowMetrics |
There was a problem hiding this comment.
Could you add a comment about imports that are not on top, explaining the reasoning (a JIRA issue, a pylint comment). Anything that would help us to fix the issue in the future will be useful.
|
I've verified that the job runs, and added todo items for the imports. I'll merge the change as soon as precommits pass. |
Runners do not import metrics immediately, thus cleaning up the dependency between them.