Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix SingletonCoder serialization #4691

Merged
merged 1 commit into from
Feb 3, 2023
Merged

Conversation

RustedBones
Copy link
Contributor

If singleton in SingletonCoder was initialized, the coder could not be deserialized.

java.lang.IllegalArgumentException: unable to deserialize Custom DoFn With Execution Info
...
Cause: java.io.InvalidClassException: com.example.MyObject$; no valid constructor

@codecov
Copy link

codecov bot commented Feb 2, 2023

Codecov Report

Merging #4691 (94a35b2) into main (470c80f) will increase coverage by 0.00%.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #4691   +/-   ##
=======================================
  Coverage   60.64%   60.65%           
=======================================
  Files         286      286           
  Lines       10453    10453           
  Branches      694      694           
=======================================
+ Hits         6339     6340    +1     
+ Misses       4114     4113    -1     
Impacted Files Coverage Δ
...src/main/scala/com/spotify/scio/coders/Coder.scala 85.84% <ø> (+0.44%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@shnapz
Copy link
Contributor

shnapz commented Feb 2, 2023

Interesting, good catch on LazyCoder!
But what is the use case? I thought Singleton is used only for scala Object

@RustedBones
Copy link
Contributor Author

@shnapz The problem comes when users have object in their model.
This can happen with ADTs, eg:

sealed trait Color
case object Red extends Color
case object Green extends Color
case object Blue extends Color

The disjunction coder for Color will delegate to singleton coders.
The previous code was working in most cases as case objects are serializable. However, it does not when:

  • the object is not a case object
  • the case object extends a class with parameter as explained here

The trick here is to only serialize the supplier in the coder.

In practice, Coder are serialized before being used. So the lazy val won't be initialized, (serialized as null).
But in some test cases, beam serializes the coder after usage, hence the observed failure.

@RustedBones RustedBones merged commit ba79bcd into main Feb 3, 2023
@RustedBones RustedBones deleted the singleton-coder-serialization branch February 3, 2023 09:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants