Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bigqueryNullFloat64 add +Inf, -Inf and NaN support #2574

Closed
wants to merge 2 commits into from

Conversation

deankarn
Copy link
Contributor

@deankarn deankarn commented Jul 9, 2020

The following is a PR to fix issue #2575 to enable support for +Inf, -Inf and Nan which BigQuery supports but Go's json package does not.

This is related to PR #2408 but implements the fix slightly differently supporting short form +Inf and -Inf so less bytes have to go across the wire.

@deankarn deankarn requested a review from shollyman as a code owner July 9, 2020 18:06
@googlebot
Copy link

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

@googlebot googlebot added the cla: no This human has *not* signed the Contributor License Agreement. label Jul 9, 2020
@deankarn
Copy link
Contributor Author

deankarn commented Jul 9, 2020

@googlebot I signed it!

@deankarn
Copy link
Contributor Author

deankarn commented Jul 9, 2020

@googlebot I signed it!

@deankarn
Copy link
Contributor Author

deankarn commented Jul 9, 2020

@shollyman I corrected the original commit not having the proper commit email and ensured the email is also registered with my GitHub account. Not sure why it's still saying CLA is not signed, perhaps needs to be force ran again?

Please let me know if anything else is required that I've missed so I can correct it :)

Copy link
Contributor

@shollyman shollyman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting this together, apologies on my slowness getting to it.

I left a discussion comment in the change itself. Assuming you agree, we could simplify your list of literals and remove some of the ambiguous matching.

@@ -181,6 +196,12 @@ func (n *NullFloat64) UnmarshalJSON(b []byte) error {
n.Float64 = 0
if bytes.Equal(b, jsonNull) {
return nil
} else if bytes.HasPrefix(b, posInf) || bytes.HasPrefix(b, posInf2) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was initially confused by the HasPrefix, but I'm guessing it's because you're trying to anticipate both the short and long forms (Inf / Infinity)?

Confusingly, BQ returns the string-quoted value "Infinity", "-Infinity", and "Nan" in the api responses for the related float values within row data, but documents the short (and uncapitalized) cases in the sql reference, which wouldn't be captured with this prefix logic: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#floating_point_types

In order to avoid getting into parsing confusion and anticipating all the corner cases, my thinking is that this change should only codify the values that are safe for round-tripping as both input and outputs, e.g. the ones that come out of the row data. You're defining the contract for how NullFloat64 is encoded and decoded, but users can still do their own encoding with a custom type/marshaler if they so choose (and want to rely on BigQuery converting as appropriate).

Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @shollyman

I agree that HasPrefix might lead to corner cases and makes sense to change that to equality checks instead :) will do that shortly.

I was also using HasPrefix for two other reasons:

  1. When Go itself prints values for Infinity, -Infinity it does so using the short form +Inf and -Inf (which the BigQuery api accepts happily)
  2. The source of some data I'm parsing was also using the short form that was produced by Go.

To avoid a heap of customization in my, and others, code base' I was hoping to keep the unmarshaling short form support.

What do you think about :

  • Changing this to equality checks only, no HasPrefix to avoid any corned cases.
  • Change to always Marshal the long form of these values Infinity, -Infinity and NaN
  • Keep the Unmarshal support for the short form equality checks as well as the long form. Doing this will avoid people having to create a custom type to support the way Go itself, and other languages, output this kind of data.

I can't see any harm also supporting the short form for unmarshaling purposes only, thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems reasonable. It's definitely the case that BQ is following the robustness principle in terms of parsing the variation, so it's reasonable we include that when unmarshalling.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks will make the changes and ping you in the PR once done.

- Updated PR for equality checks instead of HasPefix.
- Added dedicated NullFloat64 tests.
@google-cla
Copy link

google-cla bot commented Jul 16, 2020

CLAs look good, thanks!

ℹ️ Googlers: Go here for more info.

@google-cla google-cla bot added cla: yes This human has signed the Contributor License Agreement. and removed cla: no This human has *not* signed the Contributor License Agreement. labels Jul 16, 2020
@deankarn
Copy link
Contributor Author

@shollyman I updated the PR based on our conversation and added dedicated tests for NullFloat64

please let me know if there's anything else I can do to get this merged :)

Copy link
Contributor

@shollyman shollyman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting this together. I'll go ahead and get this submitted.

@shollyman shollyman added automerge Merge the pull request once unit tests and other checks pass. kokoro:force-run Add this label to force Kokoro to re-run the tests. labels Jul 20, 2020
@kokoro-team kokoro-team removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Jul 20, 2020
@shollyman
Copy link
Contributor

One last item: can you edit your PR to allow edits by maintainers? Neither I nor our automation bots can update the branch to merge it currently.

@deankarn
Copy link
Contributor Author

@shollyman I had to create a new PR because GitHub doesn't allow setting edits by maintainers on organization forks, only user forks. (may want to mention this in your contributing guidelines)

Will close this one and the new PR is here just merged them to my user fork #2618

@deankarn deankarn closed this Jul 20, 2020
shollyman added a commit to shollyman/google-cloud-go that referenced this pull request Jul 20, 2020
Adds an explicit mention to enable "Allow edits by maintainers" when
setting up the pull request.  In PR googleapis#2574 (external contribution) it was
discovered this cannot be added after the fact for org-based forks, only
user-based.

As there's not a great reference link for this in the github docs that
doesn't descend into a bunch of organization permissions management, I'm
not calling this specific circumstance explicitly.
gcf-merge-on-green bot pushed a commit that referenced this pull request Jul 20, 2020
This is a new PR for closed PR #2574 because you can only allow edits for maintainers on user-forks and NOT organization forks.
@buyology
Copy link

@shollyman can you please explain why this gets reviewed and merged and not #2408?

I first submitted this change a year ago (https://code-review.googlesource.com/c/gocloud/+/44750) and even then left with no review — and was the first to submit anything when you migrated to GitHub.

Feels pretty demotivating to come up with any further changes if they just get ignored!

shollyman added a commit that referenced this pull request Jul 21, 2020
Adds an explicit mention to enable "Allow edits by maintainers" when
setting up the pull request.  In PR #2574 (external contribution) it was
discovered this cannot be added after the fact for org-based forks, only
user-based.

As there's not a great reference link for this in the github docs that
doesn't descend into a bunch of organization permissions management, I'm
not calling this specific circumstance explicitly.
@shollyman
Copy link
Contributor

@buyology My apologies for overlooking your PR (twice, sadly in this case). I recently modified how I consume github notifications, hence my attention on a newer PR. If I've overlooked other feedback/contributions from you, feel free to reach out via email (same username @google.com) to discuss.

tritone pushed a commit to tritone/google-cloud-go that referenced this pull request Aug 25, 2020
This is a new PR for closed PR googleapis#2574 because you can only allow edits for maintainers on user-forks and NOT organization forks.
tritone pushed a commit to tritone/google-cloud-go that referenced this pull request Aug 25, 2020
Adds an explicit mention to enable "Allow edits by maintainers" when
setting up the pull request.  In PR googleapis#2574 (external contribution) it was
discovered this cannot be added after the fact for org-based forks, only
user-based.

As there's not a great reference link for this in the github docs that
doesn't descend into a bunch of organization permissions management, I'm
not calling this specific circumstance explicitly.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
automerge Merge the pull request once unit tests and other checks pass. cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants