Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add schema aware stream writer #2048

Merged
merged 3 commits into from Apr 8, 2023

Conversation

arturowczarek
Copy link
Contributor

@arturowczarek arturowczarek commented Mar 20, 2023

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #2039 ☕️

If you write sample code, please follow the samples format.

@arturowczarek arturowczarek requested a review from a team as a code owner March 20, 2023 10:12
@product-auto-label product-auto-label bot added size: xl Pull request size is extra large. api: bigquerystorage Issues related to the googleapis/java-bigquerystorage API. labels Mar 20, 2023
@arturowczarek
Copy link
Contributor Author

arturowczarek commented Mar 20, 2023

This is a work in progress. Please let me know what you think.

@arturowczarek arturowczarek force-pushed the generic_stream_writer branch 4 times, most recently from b2c069c to 2601bd4 Compare March 20, 2023 12:24
@Neenu1995
Copy link
Contributor

@yirutang PTAL

* order of minutes).
*/
public class SchemaAwareStreamWriter<T> implements AutoCloseable {
private static String streamPatternString =
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fields like this should be final and CAPITALIZED, but I leave it this way. Otherwise I'd have to change too much.

* table schema is updated, users will be able to ingest data on the new schema after some time (in
* order of minutes).
*/
public class SchemaAwareStreamWriter<T> implements AutoCloseable {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class is intended to be a generic version of JsonStreamWriter

private Descriptor descriptor;
private TableSchema tableSchema;
private boolean ignoreUnknownFields = false;
private boolean reconnectAfter10M = false;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem to be necessary. Can we remove it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please.

}

/**
* @Deprecated Setter for a reconnectAfter10M, temporaily workaround for omg/48020. Fix for the
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do I find it? Can we remove it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove this.

@arturowczarek
Copy link
Contributor Author

@yirutang @Neenu1995 Is there a chance for any feedback to this pull request and/or the created issues?

@shollyman shollyman requested a review from yirutang March 30, 2023 15:48
Copy link
Contributor

@yirutang yirutang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the addition!

private Descriptor descriptor;
private TableSchema tableSchema;
private boolean ignoreUnknownFields = false;
private boolean reconnectAfter10M = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please.

}

/**
* @Deprecated Setter for a reconnectAfter10M, temporaily workaround for omg/48020. Fix for the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove this.

@generated-files-bot
Copy link

Warning: This pull request is touching the following templated files:

  • google-cloud-bigquerystorage/src/main/java/com/google/cloud/bigquery/storage/v1/package-info.java

@arturowczarek arturowczarek force-pushed the generic_stream_writer branch 9 times, most recently from ede37e8 to 2258a2a Compare April 4, 2023 09:12
@arturowczarek
Copy link
Contributor Author

I think I've changed everything. The tests for JsonStreamWriter all pass. You can run the CI tests and do the review.

@arturowczarek
Copy link
Contributor Author

arturowczarek commented Apr 5, 2023

Thank you @yirutang for running the tests again.
I've fixed the formatting. Sorry about that. I didn't notice kokoro requires additional environmental variable JOB_TYPE to know what to do.
I'm not sure what to do with clirr errors. It notices that I've introduced some changes to public interfaces: https://github.com/googleapis/java-bigquerystorage/actions/runs/4606032456/jobs/8148652677#step:5:3513
This includes renaming one of the class, removal of reconnectAfter10M, etc. I'm not sure what is the workflow and how do deal with it properly. Can you instruct me what to do? I've noticed @Neenu1995 commit e67e913, where she leaves the old, wrongly named class but leaves in the state that doesn't break the contract. When and how is it going to be removed? Should I also leave the classes and methods and mark them somehow that they are going to be removed?
I've committed 259391b to solve clirr errors. Please have a look if it is ok.
Now the tests on my laptop pass.

@arturowczarek
Copy link
Contributor Author

@Neenu1995 Why didn't you finally use @Deprecated annotation in #2037?

@yirutang yirutang added the owlbot:run Add this label to trigger the Owlbot post processor. label Apr 7, 2023
@gcf-owl-bot gcf-owl-bot bot removed the owlbot:run Add this label to trigger the Owlbot post processor. label Apr 7, 2023
@yirutang yirutang added the kokoro:run Add this label to force Kokoro to re-run the tests. label Apr 7, 2023
@yoshi-kokoro yoshi-kokoro removed the kokoro:run Add this label to force Kokoro to re-run the tests. label Apr 7, 2023
@gcf-owl-bot gcf-owl-bot bot requested a review from a team as a code owner April 7, 2023 00:38
@yirutang yirutang added the kokoro:run Add this label to force Kokoro to re-run the tests. label Apr 7, 2023
@yoshi-kokoro yoshi-kokoro removed the kokoro:run Add this label to force Kokoro to re-run the tests. label Apr 7, 2023
@yirutang yirutang merged commit ad136b9 into googleapis:main Apr 8, 2023
18 checks passed
@arturowczarek
Copy link
Contributor Author

How does releasing of this library work? When does the releasing bot activate and release a new version?

gcf-merge-on-green bot pushed a commit that referenced this pull request Apr 13, 2023
🤖 I have created a release *beep* *boop*
---


## [2.35.0](https://togithub.com/googleapis/java-bigquerystorage/compare/v2.34.2...v2.35.0) (2023-04-13)


### Features

* Add public api to stream writer to set the maximum wait time ([#2066](https://togithub.com/googleapis/java-bigquerystorage/issues/2066)) ([1e9a8ca](https://togithub.com/googleapis/java-bigquerystorage/commit/1e9a8cac19c3748515ebff7990d02fd576c7dd23))
* Add sample about processing permanent writer failure ([#2057](https://togithub.com/googleapis/java-bigquerystorage/issues/2057)) ([8eda934](https://togithub.com/googleapis/java-bigquerystorage/commit/8eda9347a90f59ddcf99501f8b71ba17c5f3a143))
* Add schema aware stream writer ([#2048](https://togithub.com/googleapis/java-bigquerystorage/issues/2048)) ([ad136b9](https://togithub.com/googleapis/java-bigquerystorage/commit/ad136b9fa25e774a33d02fc3a82a76fb1152b5c5))


### Dependencies

* Update dependency com.google.cloud:google-cloud-bigquery to v2.24.4 ([#2070](https://togithub.com/googleapis/java-bigquerystorage/issues/2070)) ([ce9e962](https://togithub.com/googleapis/java-bigquerystorage/commit/ce9e96209cbafd5a4daa981c5e5252272dc9811a))
* Update dependency com.google.cloud:google-cloud-shared-dependencies to v3.7.0 ([#2077](https://togithub.com/googleapis/java-bigquerystorage/issues/2077)) ([b5ea788](https://togithub.com/googleapis/java-bigquerystorage/commit/b5ea788df26122dcdf3c7104658cc8fd35a38f72))

---
This PR was generated with [Release Please](https://togithub.com/googleapis/release-please). See [documentation](https://togithub.com/googleapis/release-please#release-please).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquerystorage Issues related to the googleapis/java-bigquerystorage API. size: xl Pull request size is extra large.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Stream writer with custom mapper
4 participants