Skip to content

Implement missing features and fix found bugs for BigQuery Client Library#2446

Merged
vam-google merged 7 commits intogoogleapis:masterfrom
vam-google:master
Oct 3, 2017
Merged

Implement missing features and fix found bugs for BigQuery Client Library#2446
vam-google merged 7 commits intogoogleapis:masterfrom
vam-google:master

Conversation

@vam-google
Copy link
Contributor

@vam-google vam-google commented Sep 15, 2017

  1. Access query results row cells by name. Now row.get(1) and get("firstColumnName") are supported (currently supported only for results returned by BigQuery.query() and not supported by results returned BigQuery.listTableData() because there is no schema and it would require additional request to the server to get schema (but still can be added in the future easily with cost of an extra network call)). Same syntax is supported for record-type fields (i.e. nested fields can be accessed by name too).
  2. Refactor BigQuery.query() to use jobs.insert/jobs.getQueryResults combination instead of jobs.query. QueryRequest class was removed, now query() method directly accepts QueryJobConfiguration instead. The remaining properties of removed QueryRequest class now are passed in a form of QueryOption vararg argument. QueryOption combines query and waiting options (waiting options are necessary because query() is now waiting for query to complete).
  3. Make BigQuery.query() call synchronous (waits for query completion before return result), with jittered exponential backoff for polling (based on gax.retrying package)
  4. Add client side job id generation (JobId.of())
  5. Replace WaitForOption with RetryOption (based on RetrySettings and is consistent with the rest of the codebase). This affected compute and spanner clients too.
  6. Rewrite Job.wait() to use jittered exponentiall backoff polling (based ongax.retrying package). Polling is performed differently depending on the type of job: for query jobs on each poll getQueryResults() is called, for all other job types getJob() is used instead.
  7. Use standard SQL as default for all queries
  8. Fix wrong query results iteration samples code used all over the place in documentation.
  9. Various smaller changes/refactorings

…rary:

0) Access query results row cells by name. Now both row.get(1) and get("firstColumnName") are supported (currently supported only for results returned by BigQuery.query() and not supported by results returned BigQuery.listTableData(), because there is not schema and it would require additional request to the server to get schema (still can be added in the future easily)).
1) Rewrite BigQuery.query() to use jobs.insert/jobs.getQueryResults combination instead of jobs.query
2) BigQuery.query() call is synchronous, with jittered exponential backoff for polling (based on gax.retrying package)
3) Client side job id generation (JobId.of())
4) Replace WaitForOption with RetryOption (based on RetrySettings and is consistent with the rest of the codebase)
5) Rewrite Job.wait() to use jittered exponentiall backoff polling (based on gax.retrying package)
6) Use standard SQL as default for all queries
7) Various smaller changes/refactorings
@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Sep 15, 2017
@vam-google
Copy link
Contributor Author

@vkedia please review spanner's Operation.waitFor(), as it was rewrittent as part of this PR.

Copy link
Contributor

@pongad pongad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a little big for me to properly review, but I don't see anything wrong apart from a couple of nits noted.

return option instanceof RetryOption ? (RetryOption) option : null;
}

public static QueryResultsOption[] filterQueryResutlsOptions(QueryOption... options) {

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

String cursor = result.x();
return new PageImpl<>(new TableDataPageFetcher(tableId, serviceOptions, cursor, optionsMap),
cursor, transformTableData(result.y()));
cursor, transformTableData(result.y(), null));

This comment was marked as spam.

This comment was marked as spam.

response = bigquery.getQueryResults(response.getJobId());
}
QueryJobConfiguration queryConfig =
QueryJobConfiguration.newBuilder(query).setUseLegacySql(true).build();

This comment was marked as spam.

This comment was marked as spam.

// [START runQueryWithParameters]
QueryRequest request = QueryRequest.newBuilder(query)
QueryJobConfiguration queryConfig = QueryJobConfiguration.newBuilder(query)
.setUseLegacySql(false) // standard SQL is required to use query parameters

This comment was marked as spam.

This comment was marked as spam.

.setPageSize(1000L)
.build();
QueryJobConfiguration queryConfig =
QueryJobConfiguration.newBuilder("SELECT * FROM my_dataset_id.my_table_id").build();

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

if (!(prevThrowable instanceof SpannerException)) {
return ((BaseServiceException) prevThrowable).isRetryable();
}
return ((SpannerException) prevThrowable).getErrorCode() != ErrorCode.NOT_FOUND;

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

this.type = checkNotNull(type);
return this;
public Builder setType(LegacySQLTypeName type, Field... subFields) {
return setType(type, subFields.length > 0 ? Fields.of(subFields) : null);

This comment was marked as spam.

This comment was marked as spam.

ImmutableMap.Builder<String, Integer> nameIndexBuilder = ImmutableMap.builder();
int index = 0;
for (Field field : fields) {
nameIndexBuilder.put(field.getName(), index++);

This comment was marked as spam.

This comment was marked as spam.

@Test
public void testFromPb() {
FieldValue value = FieldValue.fromPb(BOOLEAN_FIELD);
FieldValue value = FieldValue.fromPb(BOOLEAN_FIELD, null);

This comment was marked as spam.

This comment was marked as spam.

return !prevResponse.getDone();
}
if (prevThrowable instanceof BaseServiceException) {
if (!(prevThrowable instanceof SpannerException)) {

This comment was marked as spam.

This comment was marked as spam.

The logic was simplified and modified to match exactly the original one. Note, this does not allow to retry on any retriable BaseServiceExceptions, if they are not SpannerException (on practice here each BaseServiceException is always also a SpannerException).
@vam-google
Copy link
Contributor Author

Addressed review feedback. PTAL.

@vkedia I fixed the retry logic (sorry for for breaking it). Hopefully now it is correct (matches exactly the original one in functionality). Please note that it theoretically may refuse to retry a retriable BaseServiceException, which is retriable but is not a SpannerException. This is only a theoretical concern, since all BaseServiceExceptions in this context will always be SpanneExceptions too.
Also, please note that with this change all unknown unchecked exceptions will also be wrapped in SpannerException (different from the original code). I think it makes sense here. Please let me know if that should be changed too.

return option instanceof RetryOption ? (RetryOption) option : null;
}

public static QueryResultsOption[] filterQueryResutlsOptions(QueryOption... options) {

This comment was marked as spam.

This comment was marked as spam.

* String query = "SELECT distinct(corpus) FROM `bigquery-public-data.samples.shakespeare`";
* QueryJobConfiguration queryConfig = QueryJobConfiguration.of(query);
*
* // To run the legacy syntax queries use the following code instead:

This comment was marked as spam.

This comment was marked as spam.

* may or may not contain related schema. If schema is not provided, the individual cells of the row
* will still be accessible by index but not by name.
*/
public class FieldValues extends AbstractList<FieldValue> implements Serializable {

This comment was marked as spam.

This comment was marked as spam.

* Google BigQuery Table schema fields (columns). Each field has a unique name and index. Fields
* with duplicate names are not allowed in BigQuery schema.
*/
public final class Fields extends AbstractList<Field> implements Serializable {

This comment was marked as spam.

This comment was marked as spam.

@garrettjonesgoogle
Copy link
Member

LGTM

@vam-google vam-google merged commit 1fefc4c into googleapis:master Oct 3, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla: yes This human has signed the Contributor License Agreement.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants