fix serialization struggles in SMB transform API #3342

Merged
nevillelyh merged 4 commits into master from claire/smb_transform_ser on Sep 23, 2020

clairemcginty
Contributor

I fixed some of the serialization issues people have been seeing with the transform API when they extract the Read into a non-lazy variable. I did this by eagerly converting the Read into a BucketedInput right away, so we're no longer passing through non-serializable members of Read such as Schema and CodecFactory.
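For readers unfamiliar with this failure mode, here is a minimal Java sketch of the general idea, not the code in this PR. `FakeSchema`, `FakeRead`, and `FakeBucketedInput` are hypothetical stand-ins invented for illustration; they only mimic the shape of the problem (a builder-like object holding a non-serializable member) and the shape of the fix (converting up front to an object that carries only serializable state).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class EagerConversionSketch {

    // Hypothetical stand-in for a non-serializable member such as Schema or CodecFactory.
    static final class FakeSchema {
        final String json;
        FakeSchema(String json) { this.json = json; }
    }

    // Hypothetical stand-in for Read: it holds a field that Java serialization cannot write.
    static final class FakeRead implements Serializable {
        final String path;
        final FakeSchema schema;
        FakeRead(String path, FakeSchema schema) {
            this.path = path;
            this.schema = schema;
        }

        // Eager conversion: copy only serializable state into the new object.
        FakeBucketedInput toBucketedInput() {
            return new FakeBucketedInput(path, schema.json);
        }
    }

    // Hypothetical stand-in for BucketedInput: every field is serializable.
    static final class FakeBucketedInput implements Serializable {
        final String path;
        final String schemaJson;
        FakeBucketedInput(String path, String schemaJson) {
            this.path = path;
            this.schemaJson = schemaJson;
        }
    }

    // Returns true if Java serialization can write the object, false if it hits
    // a non-serializable member.
    static boolean isSerializable(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        FakeRead read = new FakeRead("/hypothetical/input", new FakeSchema("{}"));
        System.out.println("read serializable: " + isSerializable(read));
        System.out.println("converted serializable: " + isSerializable(read.toBucketedInput()));
    }
}
```

The point of converting eagerly is that whatever later gets captured and shipped to workers is the converted object, so the non-serializable field never enters the serialized graph.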

Tested it out in a unit test, DirectRunner & DataflowRunner, and the new API passes serialization each time 👍

I also updated the transform unit test to use the same assertions that the Sink does -- i.e. checking that the elements have all been correctly rehashed into the new bucketing scheme.
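As a rough illustration of what a "correctly rehashed" assertion checks, here is a small Java sketch. It is not the test in this PR: the bucketing rule (`String.hashCode` modulo the bucket count) and all names below are assumptions made for the example, and the real hash function used by the SMB code may differ.

```java
import java.util.ArrayList;
import java.util.List;

class RehashCheckSketch {

    // Hypothetical bucketing rule, assumed for this example only.
    static int bucketId(String key, int numBuckets) {
        return Math.floorMod(key.hashCode(), numBuckets);
    }

    // Distributes keys into numBuckets buckets according to bucketId.
    static List<List<String>> rebucket(List<String> keys, int numBuckets) {
        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < numBuckets; i++) {
            buckets.add(new ArrayList<>());
        }
        for (String key : keys) {
            buckets.get(bucketId(key, numBuckets)).add(key);
        }
        return buckets;
    }

    // The assertion itself: every key must sit in the bucket its hash maps to
    // under the new bucket count.
    static boolean allInCorrectBucket(List<List<String>> buckets) {
        int numBuckets = buckets.size();
        for (int i = 0; i < numBuckets; i++) {
            for (String key : buckets.get(i)) {
                if (bucketId(key, numBuckets) != i) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("a", "b", "c", "d", "e");
        // Simulate a transform that re-buckets the data into 4 buckets.
        List<List<String>> output = rebucket(keys, 4);
        System.out.println("all in correct bucket: " + allInCorrectBucket(output));
    }
}
```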


codecov bot commented Sep 23, 2020

Codecov Report

Merging #3342 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #3342   +/-   ##
=======================================
  Coverage   72.70%   72.70%           
=======================================
  Files         234      234           
  Lines        7708     7708           
  Branches      351      345    -6     
=======================================
  Hits         5604     5604           
  Misses       2104     2104           

| Impacted Files | Coverage Δ |
|---|---|
| .../smb/syntax/SortMergeBucketScioContextSyntax.scala | 31.52% <ø> (ø) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 552aa13...e652a8a.

nevillelyh merged commit 1cd4f85 into master on Sep 23, 2020
@nevillelyh nevillelyh deleted the claire/smb_transform_ser branch September 23, 2020 20:07