[BEAM-6526] Add ReadFiles transform for AvroIO #7672

Closed
wants to merge 1 commit into from

iemejia
Member

@iemejia iemejia commented Jan 30, 2019

@iemejia iemejia requested a review from lgajowy January 30, 2019 11:00
Contributor

@lgajowy lgajowy left a comment

I have some questions, and the NeedsRunner tests are failing. Besides that, it looks nice, thanks!

* to be used by SQL and by the schema-transform library.
*/
@Experimental(Kind.SCHEMAS)
public ReadFiles<T> withBeamSchemas(boolean withBeamSchemas) {
Contributor

👍 I like that. It seems possible to do the same in ParquetIO, right (asking for the sake of future PRs)? People seem to ask for it frequently.

Member Author

Yes, it makes sense for ParquetIO too. It's worth filing a JIRA, IMO. Note that I only created that method to be consistent with the existing read() and readAll signatures.

Contributor

OK. FYI, there is already a JIRA: https://issues.apache.org/jira/browse/BEAM-4812.

// 64MB is a reasonable value that allows to amortize the cost of opening files,
// but is not so large as to exhaust a typical runner's maximum amount of output per
// ProcessElement call.
.setDesiredBundleSizeBytes(64 * 1024 * 1024L)
Contributor

Do you think it makes sense to extract a constant for the size and put the comment above it? Currently, multiple places use the same value (not only here), but it's documented in only two of them.

Member Author

Yes, good idea. I will extract the constant and move the doc there.
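For readers following the thread, the refactoring discussed here could look roughly like the sketch below. It is only an illustration: `BundleSizeDefaults`, `DEFAULT_BUNDLE_SIZE_BYTES`, and `ReadConfig` are hypothetical names made up for this example and are not taken from the PR or from AvroIO. The idea is that the 64MB value and its rationale live in one documented constant, and every call site refers to it instead of repeating the literal.

```java
/** Hypothetical holder for the shared default; not an actual Beam class. */
final class BundleSizeDefaults {
  /**
   * 64MB is a reasonable value that allows amortizing the cost of opening files, but is not so
   * large as to exhaust a typical runner's maximum amount of output per ProcessElement call.
   */
  static final long DEFAULT_BUNDLE_SIZE_BYTES = 64 * 1024 * 1024L;

  private BundleSizeDefaults() {}
}

/** Hypothetical stand-in for a transform configuration that carries a bundle size. */
final class ReadConfig {
  private final long desiredBundleSizeBytes;

  private ReadConfig(long desiredBundleSizeBytes) {
    this.desiredBundleSizeBytes = desiredBundleSizeBytes;
  }

  /** Call sites use the named constant instead of repeating the literal. */
  static ReadConfig withDefaults() {
    return new ReadConfig(BundleSizeDefaults.DEFAULT_BUNDLE_SIZE_BYTES);
  }

  /** Returns a copy with an explicit bundle size, leaving the default untouched. */
  ReadConfig withDesiredBundleSizeBytes(long bytes) {
    return new ReadConfig(bytes);
  }

  long getDesiredBundleSizeBytes() {
    return desiredBundleSizeBytes;
  }
}
```

With this shape, a later change to the default (or to its documentation) happens in one place, which addresses the concern that the value is used in several places but documented in only two.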


stale bot commented Mar 31, 2019

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@beam.apache.org list. Thank you for your contributions.

@stale stale bot added the stale label Mar 31, 2019

stale bot commented Apr 7, 2019

This pull request has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.

@stale stale bot closed this Apr 7, 2019
pl04351820 pushed a commit to pl04351820/beam that referenced this pull request Dec 20, 2023
* Update google-api-core dependency
* Release 0.32.1

2 participants