Skip to content

Comments

[BEAM-1301] Support for BigQuery table description#1821

Merged
asfgit merged 2 commits intoapache:masterfrom
ravwojdyla:tbl_desc
Feb 1, 2017
Merged

[BEAM-1301] Support for BigQuery table description#1821
asfgit merged 2 commits intoapache:masterfrom
ravwojdyla:tbl_desc

Conversation

@ravwojdyla
Copy link
Contributor

@ravwojdyla ravwojdyla commented Jan 23, 2017

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

  • Make sure the PR title is formatted like:
    [BEAM-<Jira issue #>] Description of pull request
  • Make sure tests pass via mvn clean verify. (Even better, enable
    Travis-CI on your fork and ensure the whole test matrix passes).
  • Replace <Jira issue #> in the title with the actual Jira issue
    number, if there is one.
  • If this contribution is large, please file an Apache
    Individual Contributor License Agreement.

@asfbot
Copy link

asfbot commented Jan 23, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/6750/

Build result: FAILURE

[...truncated 6594 lines...] at hudson.remoting.UserRequest.perform(UserRequest.java:153) at hudson.remoting.UserRequest.perform(UserRequest.java:50) at hudson.remoting.Request$2.run(Request.java:332) at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745)Caused by: org.apache.maven.plugin.MojoFailureException: You have 1 Checkstyle violation. at org.apache.maven.plugin.checkstyle.CheckstyleViolationCheckMojo.execute(CheckstyleViolationCheckMojo.java:588) at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134) at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208) ... 31 more2017-01-23T23:36:59.624 [ERROR] 2017-01-23T23:36:59.624 [ERROR] Re-run Maven using the -X switch to enable full debug logging.2017-01-23T23:36:59.624 [ERROR] 2017-01-23T23:36:59.624 [ERROR] For more information about the errors and possible solutions, please read the following articles:2017-01-23T23:36:59.624 [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException2017-01-23T23:36:59.624 [ERROR] 2017-01-23T23:36:59.625 [ERROR] After correcting the problems, you can resume the build with the command2017-01-23T23:36:59.625 [ERROR] mvn -rf :beam-sdks-java-io-google-cloud-platformchannel stoppedSetting status of 6950238 to FAILURE with url https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/6750/ and message: 'Build finished. 'Using context: Jenkins: Maven clean install
--none--

@ravwojdyla ravwojdyla force-pushed the tbl_desc branch 2 times, most recently from 649a690 to 4dcf59e Compare January 24, 2017 05:26
@asfbot
Copy link

asfbot commented Jan 24, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/6772/
--none--

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling 4dcf59e on ravwojdyla:tbl_desc into ** on apache:master**.

@asfbot
Copy link

asfbot commented Jan 24, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/6773/
--none--

@dhalperi
Copy link
Contributor

R: @dhalperi

Sorry, did not see this. Thanks Rav! Will take a look tomorrow.

null /* jsonSchema */,
CreateDisposition.CREATE_IF_NEEDED,
WriteDisposition.WRITE_EMPTY,
null /*tableDescription */,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: /* table (with space)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

.withLabel("Validation Enabled"), true)
.addIfNotNull(DisplayData.item("tableDescription", tableDescription)
.withLabel("Table description"));

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

drop blank line?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

switch (jobStatus) {
case SUCCEEDED:
if (tableDescription != null) {
datasetService.patchTableDescription(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self and reviewers: confirmed that you can't set the table description from the load job itself: https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#configuration.load

ref.getDatasetId(),
ref.getTableId(),
tableDescription);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you probably want this to run after copy job too, because for a very large table we load to temp tables then copy to the final table.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

"Unable to patch table description: %s, aborting after %d retries.",
tableId, MAX_RPC_RETRIES),
Sleeper.DEFAULT,
backoff,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you want this feature to interact with per-window tables created in streaming mode? I think the new table description is not applied in those cases.

Copy link
Contributor Author

@ravwojdyla ravwojdyla Jan 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. Would it make sense to patch description of each table we are streaming to, precisely right after createdTables.add(tableSpec) ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably do it here: https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L2724 based on if this worker successfully created the table (return value != null).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, that's not the return type of the createTable API.

Maybe move description into the createTable API proper, where it actually can be embedded in the request we're already sending? https://cloud.google.com/bigquery/docs/reference/rest/v2/tables#description

Copy link
Contributor Author

@ravwojdyla ravwojdyla Jan 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed commit (49f125a) with suggestion above - please let me know what do you think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry missed the comments above. Yes the createTable sounds like best idea! Will go ahead and add that.

@coveralls
Copy link

Coverage Status

Coverage remained the same at 69.631% when pulling e011e9a on ravwojdyla:tbl_desc into b80aac5 on apache:master.

@asfbot
Copy link

asfbot commented Jan 31, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/6928/
--none--

@ravwojdyla ravwojdyla force-pushed the tbl_desc branch 2 times, most recently from 49f125a to 89419aa Compare January 31, 2017 23:23
@asfbot
Copy link

asfbot commented Jan 31, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/6940/

Build result: FAILURE

[...truncated 5107 lines...] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745)Caused by: org.apache.maven.plugin.compiler.CompilationFailureException: Compilation failure/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_MavenInstall/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIOTest.java:[1605,38] constructor StreamingWriteFn in class org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.StreamingWriteFn cannot be applied to given types; required: org.apache.beam.sdk.options.ValueProvider<com.google.api.services.bigquery.model.TableSchema>,org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.CreateDisposition,java.lang.String,org.apache.beam.sdk.io.gcp.bigquery.BigQueryServices found: ,org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.CreateDisposition,org.apache.beam.sdk.io.gcp.bigquery.BigQueryIOTest.FakeBigQueryServices reason: actual and formal argument lists differ in length at org.apache.maven.plugin.compiler.AbstractCompilerMojo.execute(AbstractCompilerMojo.java:972) at org.apache.maven.plugin.compiler.TestCompilerMojo.execute(TestCompilerMojo.java:153) at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134) at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208) ... 31 more2017-01-31T23:28:04.283 [ERROR] 2017-01-31T23:28:04.283 [ERROR] Re-run Maven using the -X switch to enable full debug logging.2017-01-31T23:28:04.283 [ERROR] 2017-01-31T23:28:04.283 [ERROR] For more information about the errors and possible solutions, please read the following articles:2017-01-31T23:28:04.283 [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException2017-01-31T23:28:04.283 [ERROR] 2017-01-31T23:28:04.283 [ERROR] After correcting the problems, you can resume the build with the command2017-01-31T23:28:04.283 [ERROR] mvn -rf :beam-sdks-java-io-google-cloud-platformchannel stoppedSetting status of 49f125a to FAILURE with url https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/6940/ and message: 'Build finished. 'Using context: Jenkins: Maven clean install
--none--

@asfbot
Copy link

asfbot commented Jan 31, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/6942/

Build result: FAILURE

[...truncated 5108 lines...] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745)Caused by: org.apache.maven.plugin.compiler.CompilationFailureException: Compilation failure/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_MavenInstall/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIOTest.java:[1605,38] constructor StreamingWriteFn in class org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.StreamingWriteFn cannot be applied to given types; required: org.apache.beam.sdk.options.ValueProvider<com.google.api.services.bigquery.model.TableSchema>,org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.CreateDisposition,java.lang.String,org.apache.beam.sdk.io.gcp.bigquery.BigQueryServices found: ,org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.CreateDisposition,org.apache.beam.sdk.io.gcp.bigquery.BigQueryIOTest.FakeBigQueryServices reason: actual and formal argument lists differ in length at org.apache.maven.plugin.compiler.AbstractCompilerMojo.execute(AbstractCompilerMojo.java:972) at org.apache.maven.plugin.compiler.TestCompilerMojo.execute(TestCompilerMojo.java:153) at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134) at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208) ... 31 more2017-01-31T23:34:42.919 [ERROR] 2017-01-31T23:34:42.919 [ERROR] Re-run Maven using the -X switch to enable full debug logging.2017-01-31T23:34:42.919 [ERROR] 2017-01-31T23:34:42.919 [ERROR] For more information about the errors and possible solutions, please read the following articles:2017-01-31T23:34:42.919 [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException2017-01-31T23:34:42.919 [ERROR] 2017-01-31T23:34:42.919 [ERROR] After correcting the problems, you can resume the build with the command2017-01-31T23:34:42.919 [ERROR] mvn -rf :beam-sdks-java-io-google-cloud-platformchannel stoppedSetting status of 89419aa to FAILURE with url https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/6942/ and message: 'Build finished. 'Using context: Jenkins: Maven clean install
--none--

@coveralls
Copy link

Coverage Status

Coverage remained the same at 69.631% when pulling af01f1c on ravwojdyla:tbl_desc into b80aac5 on apache:master.

@asfbot
Copy link

asfbot commented Feb 1, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/6944/
--none--

@dhalperi
Copy link
Contributor

dhalperi commented Feb 1, 2017

Looks great to me! Just a few minor nits.

I can also fix those in the merge so feel free to leave as-is; I will fixup and merge tomorrow if you haven't made the fixups.

@ravwojdyla
Copy link
Contributor Author

ravwojdyla commented Feb 1, 2017

@dhalperi cleaned up a bit:

  • refactored patchTableDescription method - now simply accepts TableReference instead of project/dataset/table
  • added @Nullable where applicable
  • fixed one typo

I am happy to fix anything that is still not right.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.01%) to 69.63% when pulling 4444158 on ravwojdyla:tbl_desc into 92021c5 on apache:master.

@asfbot
Copy link

asfbot commented Feb 1, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/6959/
--none--

Copy link
Contributor

@dhalperi dhalperi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently I did not successfully comment the nits last night. GitHub UI issues... sorry!

/** TableSchema in JSON. Use String to make the class Serializable. */
@Nullable private final ValueProvider<String> jsonTableSchema;

private final String tableDescription;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: please annotate member variable @Nullable :)

@Nullable ValueProvider<TableSchema> tableSchema,
Write.CreateDisposition createDisposition,
@Nullable String tableDescription,
BigQueryServices bqServices) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: in Beam, we don't align parameters usually. but ok.

BigQueryServices bqServices) {
Write.CreateDisposition createDisposition,
@Nullable String tableDescription,
BigQueryServices bqServices) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: in Beam, we don't align parameters usually. but ok.

createDisposition,
tableDescription,
bqServices
)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: move to previous line.

@ravwojdyla
Copy link
Contributor Author

@dhalperi there definitely was sth going on with GH yesterday. anyway, no problem. fixed your comments.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.003%) to 69.644% when pulling 8a0e10a on ravwojdyla:tbl_desc into 92021c5 on apache:master.

@asfbot
Copy link

asfbot commented Feb 1, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/6966/
--none--

@coveralls
Copy link

Coverage Status

Coverage increased (+0.003%) to 69.644% when pulling 8a0e10a on ravwojdyla:tbl_desc into 92021c5 on apache:master.

@asfbot
Copy link

asfbot commented Feb 1, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/6967/
--none--

@asfgit asfgit merged commit 8a0e10a into apache:master Feb 1, 2017
asfgit pushed a commit that referenced this pull request Feb 1, 2017
@ravwojdyla ravwojdyla deleted the tbl_desc branch February 1, 2017 18:39
@dhalperi
Copy link
Contributor

dhalperi commented Feb 1, 2017

Thanks Rav!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants