Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Dataflow SDK to 1.7.0 #252

Closed
ravwojdyla opened this issue Sep 14, 2016 · 8 comments
Closed

Update Dataflow SDK to 1.7.0 #252

ravwojdyla opened this issue Sep 14, 2016 · 8 comments
Assignees
Labels
enhancement New feature or request

Comments

@ravwojdyla
Copy link
Contributor

New Dataflow SDK release.

@ravwojdyla ravwojdyla added the enhancement New feature or request label Sep 14, 2016
@ravwojdyla
Copy link
Contributor Author

Plain bump PR -> #253

@ravwojdyla
Copy link
Contributor Author

ravwojdyla commented Sep 15, 2016

There seems to be some change in the way BQIO handles Integers/Long. For tests using BQIO, expected values are encoded via TableRowJsonCoder - which codes TableRow as Json (Long becomes Integer). This was fine in 1.6.1 as actual (coming from pipelines) values in TableRows were Integer (at least during the matching) as well, after bumping to 1.7.0 this is no longer the case. TableRows coming from pipeline results use Long while TableRows in expected values use Integer (even if input values are Long - the conversion Long -> Integer happens after TableRowJsonCoder encode/decode cycle) thus mismatch. I'm working on fixing this.

@ravwojdyla
Copy link
Contributor Author

ravwojdyla commented Sep 15, 2016

Pushed the change that is fixing tests - by forcing the use of Int on the actual, still not sure where the difference is coming from.

@nevillelyh
Copy link
Contributor

We were using Ints in expected before. Maybe should change them to Long instead?

I suspect they changed equality behavior of table row fields?

@nevillelyh
Copy link
Contributor

You're right TableRowJsonCoder did something funky with expected. I'll leave this alone.

@ravwojdyla
Copy link
Contributor Author

ravwojdyla commented Sep 16, 2016

If I understand things correctly - I don't like how DataflowAssert does encode/decode cycle on expected values (code) but NOT on actual - thus in case of TableRow we end up with this inconsistency (apart from TableRow/Jackson Int/Long "optimization").

I think we should (and will submit PR shortly) enforce encode/decode on actual too to keep it consistent and that would also solve the problem of how TableRow/Jackson handles Int/Long.

Interestingly there is DeserializationFeature.USE_LONG_FOR_INTS maybe TableRow Coder should enforce it - given that in BQ there is no "Integer", only "Long". But that is probably up to DF/BQ.

@ravwojdyla
Copy link
Contributor Author

Updated PR -> #253 with serde cycle.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants