Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[New Feature][Go SDK]: Support user use of the Beam Iterable Coder. #24339

Closed
lostluck opened this issue Nov 23, 2022 · 0 comments · Fixed by #24346
Closed

[New Feature][Go SDK]: Support user use of the Beam Iterable Coder. #24339

lostluck opened this issue Nov 23, 2022 · 0 comments · Fixed by #24346
Assignees
Labels
done & done Issue has been reviewed after it was closed for verification, followups, etc. go new feature P2

Comments

@lostluck
Copy link
Contributor

What happened?

There's no supported way for users to generate types that use an iterable coder as the outer most coder type in the Go SDK.
This prevents being able to use certain cross language transforms. In particular, the python Run Inference transform requires using a Beam Standard Iterable coder coded values as input.

There are essentially two options for resolving this.

Option 1: Change slices to use iterable coder instead of the custom coder it currently uses.
The current custom coder adds an extra length prefix to the wrapping, and is also non-portable. Making this change would constitute a breaking change due to the change in coder semantics. However, it's unlikely to cause significant issues unless a pipeline has already started using the State API with slices. This change would also prevent pipeline updates across the versions if a slice coder is being used.

Option 2: Revamp the coder system to enable better access to "standard" coders and similar, special casing them. This means users who wish to use the Iterable coder would need to learn/know/discover this.

This author prefers option 1, because it's much cleaner longer term, and is more obvious a solution. Option 2 would take longer to develop, and impact a much larger surface area of the SDK. Further, option 2 would be less obvious and less natural to Go programmers, who use slices more casually.

Option 1 also requires relaxing the automatic inference that a "KV<k, Iter>" coder is a CoGBK<K, V> coder. This approach was used because previously the only source of Iterable coders was from a side input or the result of a CoGBK. This can be handled at execution time by validating the consumer of a datasource is expecting a CoGBK or a slice.

Making this change arguably opens up using an iterable function instead of a slice for that concrete value (previously this was the other way around for side input values), but that would be a separate extension.

Issue Priority

Priority: 2

Issue Component

Component: sdk-go

@github-actions github-actions bot added the P2 label Nov 23, 2022
@lostluck lostluck self-assigned this Nov 24, 2022
riteshghorse pushed a commit that referenced this issue Nov 29, 2022
)

* Improve debugging strings.

* Switch slices to use iterable coder.

* Handle coderrefs

* tests for new translate code

* fulltype testing

* Add Statebacked iterable case.

* Add unlifted combines.

* Comment catches.

Co-authored-by: lostluck <13907733+lostluck@users.noreply.github.com>
@github-actions github-actions bot added this to the 2.44.0 Release milestone Nov 29, 2022
@apilloud apilloud added the done & done Issue has been reviewed after it was closed for verification, followups, etc. label Nov 29, 2022
lostluck added a commit that referenced this issue Nov 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
done & done Issue has been reviewed after it was closed for verification, followups, etc. go new feature P2
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants