[BEAM-9524] Fix for ib.show() executing indefinitely #11128
R: @pabloem
retest this please
sdks/python/apache_beam/runners/interactive/caching/streaming_cache.py
…estart Change-Id: I53aa32a75645086efffa091a53880a076c3a689d
Change-Id: I1ab6e7036172d7e2d07c774778a50e165df6bdca
@staticmethod
def from_str(r):
  split = r.split('|')
Check length of the list?
I think a better way is probably to unpack the split directly, which raises a ValueError if the number of fields is wrong:
var, version, producer_version, pipeline_id = r.split('|')
return CacheKey(var, version, producer_version, pipeline_id)
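To illustrate the suggestion above, here is a minimal, self-contained sketch. The `CacheKey` class below is a hypothetical stand-in built on `typing.NamedTuple`, not the class from this PR; only the four field names and the `'|'` separator are taken from the review comment, and `to_str` is an assumed helper added for the round trip.

```python
from typing import NamedTuple


class CacheKey(NamedTuple):
  """Illustrative stand-in for the cache key (not Beam's actual class)."""
  var: str
  version: str
  producer_version: str
  pipeline_id: str

  @staticmethod
  def from_str(r: str) -> "CacheKey":
    # Tuple unpacking raises ValueError unless the split yields exactly
    # four fields, so no separate length check is needed.
    var, version, producer_version, pipeline_id = r.split('|')
    return CacheKey(var, version, producer_version, pipeline_id)

  def to_str(self) -> str:
    # Hypothetical inverse of from_str: join the fields with '|'.
    return '|'.join(self)
```

With this shape, a malformed string such as `'a|b|c'` fails loudly at parse time instead of producing an IndexError or a silently truncated key later on.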
Change-Id: I247f37cd7acffb6ad796ce0fa8b54b0feff400d1
The problem was that the StreamingCache needed a way to communicate with the BackgroundCachingJob to find out when it was done, so that the cache knew when to stop tailing the file. This was previously done by capturing the user_pipeline when creating the StreamingCache. However, the StreamingCache is used as a global cache for all pipelines. Thus, when the pipeline was redefined, the StreamingCache held a stale reference to the previous pipeline and would loop forever.
The fix is to make the cache key of a PCollection pipeline-specific. In this PR, the pipeline id is part of the cache key, which tells the StreamingCache which pipeline it is associated with. The StreamingCache can't be created with the user_pipeline because the pipeline_instrument object is dependent on the cache which creates the user pipeline.
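The idea can be sketched roughly as follows. Everything in this block is hypothetical and only illustrates the pattern: `GlobalCache`, `BackgroundJob`, `register_job`, `is_done`, and `make_key` are invented names, and using `str(id(pipeline))` as the pipeline id is an assumption, not something this description states.

```python
class BackgroundJob:
  """Hypothetical stand-in for a background caching job."""
  def __init__(self):
    self.done = False


class GlobalCache:
  """Hypothetical cache shared by every pipeline in the session."""
  def __init__(self):
    # Jobs are looked up by pipeline id at read time, rather than
    # capturing a single pipeline when the cache is constructed.
    self._jobs = {}

  def register_job(self, pipeline, job):
    self._jobs[str(id(pipeline))] = job

  def is_done(self, cache_key):
    # The pipeline id is assumed to be the last '|'-separated field.
    pipeline_id = cache_key.split('|')[-1]
    job = self._jobs.get(pipeline_id)
    # Assumption: with no registered job there is nothing left to tail.
    return job is None or job.done


def make_key(var, version, producer_version, pipeline):
  # Build a pipeline-specific key in the 'a|b|c|d' layout.
  return '|'.join([var, version, producer_version, str(id(pipeline))])
```

Because the lookup happens through the key, redefining the pipeline yields a different pipeline id, so a reader never consults the job of a previous pipeline.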