-
Notifications
You must be signed in to change notification settings - Fork 4.5k
[BEAM-11188] Add wrappers to Go Xlang examples, and adjust front-end. #13255
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Adds wrappers around each test xlang transform used in the examples, which show how xlang transforms are best implemented. Also adjusts the xlang fronte-end, specifically removing the misleadingly named Sink and Source versions of the methods, and adjusting the Input/Output tag naming to match and adding some helpers.
|
Actually don't merge this yet. I ran into an issue where adding a scope (i.e. putting the external transform in a composite) causes all the examples to fail. It seems to be because of a built-in assumption in the cross-language code that the root transform of the protobuf external transform is the external transform itself, but this is false when it's in a composite, because the root transform is the composite instead. I'm still trying to fix that, because this is an issue that would give users trouble immediately and needs to be fixed ASAP. That said, if the rest of the PR looks good I might just comment out the Scope calls with a TODO and a Jira, since that will work. |
This avoids some bugs that were occurring because the existing implementation didn't account for composite transforms.
|
I worked around the scope problem by essentially removing any composite transforms from the proto representation of the external transform, before continuing on with the existing logic. I spent a while trying to more properly fix it so it would work with the composite transform in there, but it seems like there are many places in the existing logic where composites weren't accounted for, and at the end of the day I don't think there's anything gained at the moment by changing the logic to work with the composite in there. @lostluck let me know what you think. Any issues we might hit with this workaround in the future? |
|
We certainly should fix the problem in the future as the composite/hypergraph structure of beam pipelines is very useful for runners like Dataflow, WRT visualizations. Totally reasonable to put a TODO and punt for now. I'd need to take a look myself to have more specific advice. |
lostluck
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. It's likely we might want to lightly figure out how to "re-nest" the returned expansion within whatever scope was provided by the user. When you get to running things on Dataflow, you'll see what I mean WRT the visualizations around scoping and composites, though you're probably right that we'll need to strip that before sending things to the expansion services.
|
|
||
| "github.com/apache/beam/sdks/go/pkg/beam/internal/errors" | ||
| jobpb "github.com/apache/beam/sdks/go/pkg/beam/model/jobmanagement_v1" | ||
| "github.com/apache/beam/sdks/go/pkg/beam/model/pipeline_v1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please abbreviate the pipeline_v1 package protos as pipepb.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
| "google.golang.org/grpc" | ||
| ) | ||
|
|
||
| // Expand queries the expansion service to resolve the ExpansionRequest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want to expand the description here, since the expansion request is being created "in house".
// Expand submits the pipeline components and root transform to be expanded by the given expansion service.
//
// Most should call beam.CrossLanguage to access foreign transforms rather than calling this function directly.
The latter segment is probably unnecessary since it won't add things to the pipeline properly.. but we don't currently have too much documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
I just checked the pipeline graph that gets sent to the runner, after external transforms are re-expanded (at universal.go:100), and it seems to still have the composite transform like we'd like. I strip out the composites only from the expansion request right now. When it external transforms get replaced by the expanded versions later it seems to all happen inside the composite transform and that seems to be preserved. Although I'd like to check it in the actual Dataflow visualization at some point just to make sure there aren't any subtle issues. |
|
Ah that makes sense! LGTM for now then, and we can re-evaluate when we see how runner visualizations treat it. |
Adds wrappers around each test xlang transform used in the examples, which show how xlang transforms are best implemented. Also adjusts the xlang fronte-end, specifically removing the misleadingly named Sink and Source versions of the methods, and adjusting the Input/Output tag naming to match and adding some helpers.
Note: Currently the CoGroupByKey, Partition, and MultiInputOutput examples are failing when run manually, however that seems unrelated to this PR. They fail identically if I run them in a clean repo. Still debugging that issue.
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
R: @username).[BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replaceBEAM-XXXwith the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.CHANGES.mdwith noteworthy changes.See the Contributor Guide for more tips on how to make review process smoother.
Post-Commit Tests Status (on master branch)
Pre-Commit Tests Status (on master branch)
See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI.