New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BEAM-6526] Add ReadFiles transform for AvroIO #7672
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had some questions and NeedsRunner tests are failing. Besides that it looks nice, thanks!
* to be used by SQL and by the schema-transform library. | ||
*/ | ||
@Experimental(Kind.SCHEMAS) | ||
public ReadFiles<T> withBeamSchemas(boolean withBeamSchemas) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 I like that. It seems to be possible to use it in ParquetIO in the same, right (asking for the sake of future PRs)? People seem to wish it frequently
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it makes sense for ParquetIO too. Worth to fill a JIRA IMO. Notice that I just created that method to be consistent with the existing read()
and readAll
signatures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok. fyi, there already is a JIRA: https://issues.apache.org/jira/browse/BEAM-4812.
// 64MB is a reasonable value that allows to amortize the cost of opening files, | ||
// but is not so large as to exhaust a typical runner's maximum amount of output per | ||
// ProcessElement call. | ||
.setDesiredBundleSizeBytes(64 * 1024 * 1024L) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think it makes sense to extract a constant with the size and give the comment above it? Currently, multiple places use the same value (not only from here) but it's documented only in two of them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes good idea, I will extract the constant and move the doc there.
This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@beam.apache.org list. Thank you for your contributions. |
This pull request has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time. |
* Update google-api-core dependency * Release 0.32.1
R: @lgajowy