Create sequential serializer failed #1325

Closed
1 of 2 tasks
andyczerwonka opened this issue Jan 9, 2024 · 25 comments · Fixed by #1333
Labels
bug Something isn't working

Comments

@andyczerwonka

andyczerwonka commented Jan 9, 2024

Search before asking

  • I had searched in the issues and found no similar issues.

It may be a duplicate of #1176, but the trace suggests it might be a different cause.

Version

0.4.1

Component(s)

Java

Minimal reproduce step

Run the serializer against our object graph. The failure is intermittent, so it could be a threading issue.

What did you expect to see?

No runtime exception.

What did you see instead?

17:02:55  Exception in thread "fury-jit-compiler-2" java.lang.RuntimeException: Create sequential serializer failed, 
class: class io.citrine.lolo.bags.RegressionBaggerTrainingResult
17:02:55  	at io.fury.serializer.CodegenSerializer.loadCodegenSerializer(CodegenSerializer.java:48)
17:02:55  	at io.fury.resolver.ClassResolver.lambda$getObjectSerializerClass$2(ClassResolver.java:963)
17:02:55  	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
17:02:55  	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:76)
17:02:55  	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
17:02:55  	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
17:02:55  	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
17:02:55  	at java.base/java.lang.Thread.run(Thread.java:829)
17:02:55  Caused by: java.lang.IllegalArgumentException: Expected AbstractCollectionSerializer but got io.fury.serializer.Serializer
17:02:55  	at io.fury.util.Preconditions.checkArgument(Preconditions.java:78)
17:02:55  	at io.fury.builder.BaseObjectCodecBuilder.deserializeForCollection(BaseObjectCodecBuilder.java:1210)
17:02:55  	at io.fury.builder.BaseObjectCodecBuilder.deserializeForNotNull(BaseObjectCodecBuilder.java:1162)
17:02:55  	at io.fury.builder.BaseObjectCodecBuilder.lambda$readContainerElement$12(BaseObjectCodecBuilder.java:1401)
17:02:55  	at io.fury.builder.BaseObjectCodecBuilder.readRef(BaseObjectCodecBuilder.java:1093)
17:02:55  	at io.fury.builder.BaseObjectCodecBuilder.readContainerElement(BaseObjectCodecBuilder.java:1398)
17:02:55  	at io.fury.builder.BaseObjectCodecBuilder.lambda$readContainerElements$2223955c$1(BaseObjectCodecBuilder.java:1360)
17:02:55  	at io.fury.codegen.Expression$ForLoop.doGenCode(Expression.java:2363)
17:02:55  	at io.fury.codegen.Expression.genCode(Expression.java:102)
17:02:55  	at io.fury.codegen.Expression$ListExpression.doGenCode(Expression.java:181)
17:02:55  	at io.fury.codegen.Expression.genCode(Expression.java:102)
17:02:55  	at io.fury.codegen.ExpressionOptimizer.invokeGenerated(ExpressionOptimizer.java:120)
17:02:55  	at io.fury.codegen.ExpressionOptimizer.invokeGenerated(ExpressionOptimizer.java:68)
17:02:55  	at io.fury.builder.BaseObjectCodecBuilder.readCollectionCodegen(BaseObjectCodecBuilder.java:1298)
17:02:55  	at io.fury.builder.BaseObjectCodecBuilder.deserializeForCollection(BaseObjectCodecBuilder.java:1220)
17:02:55  	at io.fury.builder.BaseObjectCodecBuilder.deserializeForNotNull(BaseObjectCodecBuilder.java:1162)
17:02:55  	at io.fury.builder.BaseObjectCodecBuilder.deserializeForNotNull(BaseObjectCodecBuilder.java:1122)
17:02:55  	at io.fury.builder.BaseObjectCodecBuilder.lambda$deserializeFor$7(BaseObjectCodecBuilder.java:1073)
17:02:55  	at io.fury.builder.BaseObjectCodecBuilder.readRef(BaseObjectCodecBuilder.java:1093)
17:02:55  	at io.fury.builder.BaseObjectCodecBuilder.deserializeFor(BaseObjectCodecBuilder.java:1073)
17:02:55  	at io.fury.builder.BaseObjectCodecBuilder.deserializeFor(BaseObjectCodecBuilder.java:1057)
17:02:55  	at io.fury.builder.ObjectCodecBuilder.lambda$deserializeGroup$37fcf467$1(ObjectCodecBuilder.java:522)
17:02:55  	at io.fury.codegen.ExpressionOptimizer.invokeGenerated(ExpressionOptimizer.java:48)
17:02:55  	at io.fury.builder.ObjectCodecOptimizer.invokeGenerated(ObjectCodecOptimizer.java:134)
17:02:55  	at io.fury.builder.ObjectCodecBuilder.deserializeGroup(ObjectCodecBuilder.java:538)
17:02:55  	at io.fury.builder.ObjectCodecBuilder.buildDecodeExpression(ObjectCodecBuilder.java:439)
17:02:55  	at io.fury.builder.BaseObjectCodecBuilder.genCode(BaseObjectCodecBuilder.java:204)
17:02:55  	at io.fury.codegen.CompileUnit.getCode(CompileUnit.java:54)
17:02:55  	at io.fury.codegen.JaninoUtils.toBytecode(JaninoUtils.java:72)
17:02:55  	at io.fury.codegen.JaninoUtils.toBytecode(JaninoUtils.java:64)
17:02:55  	at io.fury.codegen.CodeGenerator.compile(CodeGenerator.java:144)
17:02:55  	at io.fury.builder.CodecUtils.loadOrGenCodecClass(CodecUtils.java:91)
17:02:55  	at io.fury.builder.CodecUtils.loadOrGenObjectCodecClass(CodecUtils.java:42)
17:02:55  	at io.fury.serializer.CodegenSerializer.loadCodegenSerializer(CodegenSerializer.java:45)
17:02:55  	... 7 more

Anything Else?

I might be doing something wrong, but I would like to help triage.

Are you willing to submit a PR?

  • I'm willing to submit a PR!
andyczerwonka added the bug label Jan 9, 2024
andyczerwonka changed the title from "IllegalArgumentException: Expected AbstractCollectionSerializer but got io.fury.serializer.Serializer" to "Create sequential serializer failed" Jan 9, 2024
@chaokunyang
Collaborator

chaokunyang commented Jan 10, 2024

Hi @andyczerwonka, thanks for reporting this bug. Can it be reproduced locally?

@andyczerwonka
Author

> Hi @andyczerwonka, thanks for reporting this bug. Can this bug be reproduced locally?

No, we’re seeing it intermittently.

@chaokunyang
Collaborator

@andyczerwonka Could you try the latest snapshot jar? Does it work for you?

@chaokunyang
Collaborator

@andyczerwonka I tried to fix it in #1333, but I'm not sure whether it is the same issue. Could you try this branch in your environment?

@andyczerwonka
Author

> @andyczerwonka I tried to fix it in #1333, but I'm not sure whether it is the same issue. Could you try this branch in your environment?

Which branch?

@chaokunyang
Collaborator

https://github.com/chaokunyang/fury/tree/refine_collection_serializer_cast

@chaokunyang
Collaborator

chaokunyang commented Jan 12, 2024

Does it address your issue, @andyczerwonka?

@andyczerwonka
Author

andyczerwonka commented Jan 12, 2024 via email

chaokunyang added a commit that referenced this issue Jan 14, 2024
Fix nested collection cast for Scala; we previously missed the nested
Scala collection/map check.

Closes #1325
@andyczerwonka
Author

I’m still working on reproduction steps.

chaokunyang reopened this Jan 14, 2024
@chaokunyang
Collaborator

Hi @andyczerwonka, has this issue been addressed?

@andyczerwonka
Author

@chaokunyang I have been able to reproduce it. I now need instructions on how to build the jar with your fix so that I can validate it. Or you can attach the jar here if that's easier.

@chaokunyang
Collaborator

chaokunyang commented Jan 16, 2024

@andyczerwonka You can just use the latest snapshot:

<repositories>
  <repository>
    <id>apache</id>
    <url>https://repository.apache.org/snapshots/</url>
    <releases>
      <enabled>false</enabled>
    </releases>
    <snapshots>
      <enabled>true</enabled>
    </snapshots>
  </repository>
</repositories>
<dependency>
  <groupId>org.apache.fury</groupId>
  <artifactId>fury-core</artifactId>
  <version>0.5.0-SNAPSHOT</version>
</dependency>
<!-- row/arrow format support -->
<!-- <dependency>
  <groupId>org.apache.fury</groupId>
  <artifactId>fury-format</artifactId>
  <version>0.5.0-SNAPSHOT</version>
</dependency> -->

@chaokunyang
Collaborator

Can your reproduction code be converted into a unit test? It would be great if we could include it in Fury's unit tests.

@andyczerwonka
Author

I have tested it with the 0.5.0-SNAPSHOT version and it continues to fail.

Exception in thread "fury-jit-compiler-10" java.lang.RuntimeException: Create sequential serializer failed,
class: class io.citrine.lolo.bags.RegressionBaggerTrainingResult
	at io.fury.serializer.CodegenSerializer.loadCodegenSerializer(CodegenSerializer.java:51)
	at io.fury.resolver.ClassResolver.lambda$getObjectSerializerClass$2(ClassResolver.java:966)
	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:75)
	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.IllegalArgumentException: Expected AbstractCollectionSerializer but got io.fury.serializer.Serializer
	at io.fury.util.Preconditions.checkArgument(Preconditions.java:81)
	at io.fury.builder.BaseObjectCodecBuilder.deserializeForCollection(BaseObjectCodecBuilder.java:1210)
	at io.fury.builder.BaseObjectCodecBuilder.deserializeForNotNull(BaseObjectCodecBuilder.java:1162)
	at io.fury.builder.BaseObjectCodecBuilder.lambda$readContainerElement$12(BaseObjectCodecBuilder.java:1400)
	at io.fury.builder.BaseObjectCodecBuilder.readRef(BaseObjectCodecBuilder.java:1093)
	at io.fury.builder.BaseObjectCodecBuilder.readContainerElement(BaseObjectCodecBuilder.java:1397)
	at io.fury.builder.BaseObjectCodecBuilder.lambda$readContainerElements$2223955c$1(BaseObjectCodecBuilder.java:1359)
	at io.fury.codegen.Expression$ForLoop.doGenCode(Expression.java:2366)
	at io.fury.codegen.Expression.genCode(Expression.java:105)
	at io.fury.codegen.Expression$ListExpression.doGenCode(Expression.java:184)
	at io.fury.codegen.Expression.genCode(Expression.java:105)
	at io.fury.codegen.ExpressionOptimizer.invokeGenerated(ExpressionOptimizer.java:123)
	at io.fury.codegen.ExpressionOptimizer.invokeGenerated(ExpressionOptimizer.java:71)
	at io.fury.builder.BaseObjectCodecBuilder.readCollectionCodegen(BaseObjectCodecBuilder.java:1297)
	at io.fury.builder.BaseObjectCodecBuilder.deserializeForCollection(BaseObjectCodecBuilder.java:1220)
	at io.fury.builder.BaseObjectCodecBuilder.deserializeForNotNull(BaseObjectCodecBuilder.java:1162)
	at io.fury.builder.BaseObjectCodecBuilder.deserializeForNotNull(BaseObjectCodecBuilder.java:1122)
	at io.fury.builder.BaseObjectCodecBuilder.lambda$deserializeFor$7(BaseObjectCodecBuilder.java:1073)
	at io.fury.builder.BaseObjectCodecBuilder.readRef(BaseObjectCodecBuilder.java:1093)
	at io.fury.builder.BaseObjectCodecBuilder.deserializeFor(BaseObjectCodecBuilder.java:1073)
	at io.fury.builder.BaseObjectCodecBuilder.deserializeFor(BaseObjectCodecBuilder.java:1057)
	at io.fury.builder.ObjectCodecBuilder.lambda$deserializeGroup$37fcf467$1(ObjectCodecBuilder.java:525)
	at io.fury.codegen.ExpressionOptimizer.invokeGenerated(ExpressionOptimizer.java:51)
	at io.fury.builder.ObjectCodecOptimizer.invokeGenerated(ObjectCodecOptimizer.java:137)
	at io.fury.builder.ObjectCodecBuilder.deserializeGroup(ObjectCodecBuilder.java:541)
	at io.fury.builder.ObjectCodecBuilder.buildDecodeExpression(ObjectCodecBuilder.java:442)
	at io.fury.builder.BaseObjectCodecBuilder.genCode(BaseObjectCodecBuilder.java:208)
	at io.fury.codegen.CompileUnit.getCode(CompileUnit.java:57)
	at io.fury.codegen.JaninoUtils.toBytecode(JaninoUtils.java:75)
	at io.fury.codegen.JaninoUtils.toBytecode(JaninoUtils.java:67)
	at io.fury.codegen.CodeGenerator.compile(CodeGenerator.java:147)
	at io.fury.builder.CodecUtils.loadOrGenCodecClass(CodecUtils.java:94)
	at io.fury.builder.CodecUtils.loadOrGenObjectCodecClass(CodecUtils.java:45)
	at io.fury.serializer.CodegenSerializer.loadCodegenSerializer(CodegenSerializer.java:48)
	... 7 more

@andyczerwonka
Author

It's late here, so first thing tomorrow I will try to share the code that reproduces the error.

@andyczerwonka
Author

@chaokunyang
Collaborator

Hi @andyczerwonka, I tested with your code, and Fury serializes it successfully:

package org.apache.fury.serializer
import org.apache.fury.Fury
import org.apache.fury.config.Language

import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.util.Base64
import java.util.zip.GZIPInputStream
import java.util.zip.GZIPOutputStream
import _root_.scala

import _root_.scala.util.{Success, Try}
import _root_.scala.concurrent.Future
import _root_.scala.concurrent.ExecutionContext
import java.util.concurrent.ThreadLocalRandom
import _root_.scala.concurrent.Await
import org.apache.fury.ThreadLocalFury
import org.scalatest.matchers.should.Matchers
import org.scalatest.wordspec.AnyWordSpec

// What makes this fail is the nested collection in the case class. If you change it to
// a 1-dimensional collection, we no longer see the exception.
case class SampleData(label: String, data: Seq[Seq[Int]])

class SerdeThreadingTest extends AnyWordSpec with Matchers {
  def threadLocalFury =
    new ThreadLocalFury(classLoader => {
      Fury
        .builder()
        .withLanguage(Language.JAVA)
        .requireClassRegistration(false)
        .withScalaOptimizationEnabled(true)
        .withRefTracking(true)
        .withStringCompressed(true)
        .withLongCompressed(true)
        .withIntCompressed(true)
        .withAsyncCompilation(true)
        .withClassLoader(classLoader)
        .build()
    })

  private val fury = Fury
    .builder()
    .withLanguage(Language.JAVA)
    .requireClassRegistration(false)
    .withScalaOptimizationEnabled(true)
    .withRefTracking(true)
    .withStringCompressed(true)
    .withLongCompressed(true)
    .withIntCompressed(true)
    .withAsyncCompilation(true)
    .buildThreadSafeFury()

  def encode(sampleData: SampleData) = {
    val raw = fury.serialize(sampleData)
    val bos = new ByteArrayOutputStream(raw.length)
    val zos = new GZIPOutputStream(bos)
    zos.write(raw)
    zos.flush()
    zos.close()
    bos.close()
    sleepBetween(500, 1000)
    Base64.getEncoder.encodeToString(bos.toByteArray)
  }

  def decode(encoded: String) =
    Try {
      val bis = new ByteArrayInputStream(Base64.getDecoder.decode(encoded))
      val zis = new GZIPInputStream(bis)
      val uncompressed = zis.readAllBytes()
      val result = fury.deserialize(uncompressed).asInstanceOf[SampleData]
      zis.close()
      bis.close()
      sleepBetween(500, 1000)
      result
    }

  def sleepBetween(min: Int, max: Int) = {
    val sleepTime = ThreadLocalRandom.current().nextInt(min, max)
    Thread.sleep(sleepTime.toLong)
  }
  "fury scala object support" should {
    "testNonThreadedSerde" in {
      val data = SampleData("single sample", Seq.empty)
      val encoded = encode(data)
      val decoded = decode(encoded)
      println(decoded)
      assert(decoded == Success(data))
    }

    "testNestedCollectionThreadedSerde" in {
      import scala.concurrent.duration._
      implicit val ec = ExecutionContext.global

      val tasks = for (i <- 1 to 1) yield Future {
        val data = SampleData(i.toString, Seq.empty)
        val encoded = encode(data)
        encoded
      }

      val decodedFuture = for {
        f <- Future.sequence(tasks)
      } yield for {
        encoded <- f
      } yield {
        val Success(decoded) = decode(encoded)
        decoded
      }

      val result = Await.result(decodedFuture, 20.seconds)
      println(result)
      assert(result.size == 1)
    }
  }
}

@chaokunyang
Collaborator

I guess you didn't use the latest snapshot version of Fury. The package name has changed from io.fury to org.apache.fury, and nested collection serialization in case class SampleData(label: String, data: Seq[Seq[Int]]) was fixed in #1333.
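
One quick way to check which Fury build is actually being loaded is to probe the classpath for both package names. This is only a sketch: `FuryPackageCheck` and `detect` are hypothetical names, and `io.fury.Fury` is assumed to be the entry class of the pre-rename artifact (the stack traces above still show `io.fury` frames, which suggests the older jar was the one in use).

```java
import java.util.ArrayList;
import java.util.List;

public class FuryPackageCheck {
  // Entry classes after and before the rename. "io.fury.Fury" is an assumption
  // inferred from the io.fury frames in the stack traces.
  private static final String[] CANDIDATES = {"org.apache.fury.Fury", "io.fury.Fury"};

  /** Returns the candidate class names that the given class loader can resolve. */
  public static List<String> detect(ClassLoader loader) {
    List<String> found = new ArrayList<>();
    for (String name : CANDIDATES) {
      try {
        Class.forName(name, false, loader); // false: resolve without initializing
        found.add(name);
      } catch (ClassNotFoundException | LinkageError e) {
        // Not loadable from this class loader; skip it.
      }
    }
    return found;
  }

  public static void main(String[] args) {
    System.out.println(detect(FuryPackageCheck.class.getClassLoader()));
  }
}
```

If the result contains `io.fury.Fury`, the old artifact is still on the classpath and the `org.apache.fury` snapshot may not have been pulled in.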

@andyczerwonka
Author

andyczerwonka commented Jan 17, 2024

Is the nested collection issue fixed in the latest snapshot? Also, can you change the code to run 10-20 threads?

for (i <- 1 to 10) yield Future
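
A sketch of how the multi-threaded run requested here might be driven from plain Java. Nothing below is Fury API: `ConcurrentRoundTrip` and `countFailures` are hypothetical names, and the round trip is passed in as a function, so the serializer call (for example `fury.deserialize(fury.serialize(x))`, as in the test above) is something the caller would supply. Note that the traces in this thread come from a `fury-jit-compiler` thread, so a compilation failure may show up only in the log and not as a failed round trip.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.UnaryOperator;

public class ConcurrentRoundTrip {
  /**
   * Runs roundTrip on `threads` workers at the same time and returns how many
   * of them threw, timed out, or produced a value not equal to the sample.
   */
  public static <T> int countFailures(int threads, T sample, UnaryOperator<T> roundTrip) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    CountDownLatch start = new CountDownLatch(1);
    List<Future<Boolean>> results = new ArrayList<>();
    try {
      for (int i = 0; i < threads; i++) {
        results.add(pool.submit(() -> {
          start.await(); // hold every worker until all tasks are queued
          return sample.equals(roundTrip.apply(sample));
        }));
      }
      start.countDown(); // release all workers together
      int failures = 0;
      for (Future<Boolean> f : results) {
        try {
          if (!f.get(30, TimeUnit.SECONDS)) {
            failures++;
          }
        } catch (ExecutionException | TimeoutException e) {
          failures++;
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          failures++;
        }
      }
      return failures;
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    // Identity stands in for a real serialize/deserialize round trip.
    List<List<Integer>> nested = List.of(List.of(1, 2), List.of(3));
    System.out.println(countFailures(10, nested, x -> x));
  }
}
```

Releasing all workers from a latch is a design choice to increase contention on first use, which is when lazily generated serializers would be compiled; the sample should contain a non-empty nested collection to exercise the path discussed in this issue.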

@chaokunyang
Collaborator

> Is the nested collection issue fixed in the latest snapshot?

Yep, it has been fixed in the latest snapshot.

Please note that the latest snapshot is released under the ASF, and the package name has been renamed from io.fury to org.apache.fury.

@andyczerwonka
Author

> Please note that the latest snapshot is released under the ASF, and the package name has been renamed from io.fury to org.apache.fury.

Which is probably why it didn’t get pulled in. When can we expect a non-snapshot release?

@chaokunyang
Collaborator

You need to add the Apache snapshot repository:

<repositories>
  <repository>
    <id>apache</id>
    <url>https://repository.apache.org/snapshots/</url>
    <releases>
      <enabled>false</enabled>
    </releases>
    <snapshots>
      <enabled>true</enabled>
    </snapshots>
  </repository>
</repositories>
<dependency>
  <groupId>org.apache.fury</groupId>
  <artifactId>fury-core</artifactId>
  <version>0.5.0-SNAPSHOT</version>
</dependency>

The non-snapshot release may take about a month.

@andyczerwonka
Author

@chaokunyang I have confirmed that the issue has been resolved in 0.5.0.

Is there a work-around for this issue? Can it be back-ported to something that can be released earlier? We are in a non-working state and can't move forward.

@chaokunyang
Collaborator

chaokunyang commented Jan 18, 2024

Could you build a jar yourself and use that? We may need some time before we release a new jar. Some issues, such as licensing and the release blog post, need to be addressed before we can release a new version.

@andyczerwonka
Author

Defect is addressed in the latest snapshot.

chaokunyang added a commit that referenced this issue May 6, 2024
## What does this PR do?
Some collection serializers may override the write/read methods, in which
case the step that clears the element serializer may not be invoked.

This PR clears the serializer for collections/maps so that a container does
not use the wrong serializer for nested elements.

## Related issues

#1558, #1455, #1325 and #1176.

## Does this PR introduce any user-facing change?


- [ ] Does this PR introduce any public API change?
- [ ] Does this PR introduce any binary protocol compatibility change?


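
The failure mode that PR description outlines (a per-call element serializer that is not cleared when a subclass overrides write/read) can be illustrated with a toy model. Everything below is hypothetical and is not Fury code: `ListEncoder`, `ShortcutListEncoder`, `setElementHint`, and `clearElementHint` are invented names, and the "hint" merely stands in for an element serializer set on a container serializer before a write.

```java
import java.util.List;
import java.util.function.Function;

public class StaleHintDemo {
  /** Toy container encoder with a one-shot element hint (hypothetical, not a Fury class). */
  static class ListEncoder {
    protected Function<Object, String> elementHint; // meant for the next write only

    void setElementHint(Function<Object, String> hint) {
      this.elementHint = hint;
    }

    /** Explicit reset, so callers do not depend on write() consuming the hint. */
    void clearElementHint() {
      this.elementHint = null;
    }

    String write(List<?> list) {
      Function<Object, String> enc = elementHint != null ? elementHint : this::encodeAny;
      elementHint = null; // the base implementation consumes the hint here
      StringBuilder sb = new StringBuilder("[");
      for (Object o : list) {
        sb.append(enc.apply(o)).append(';');
      }
      return sb.append(']').toString();
    }

    String encodeAny(Object o) {
      if (o instanceof List) {
        return write((List<?>) o); // nested lists are encoded recursively
      }
      return "i" + o;
    }
  }

  /** Subclass that overrides write() and returns early without consuming the hint. */
  static class ShortcutListEncoder extends ListEncoder {
    @Override
    String write(List<?> list) {
      if (list.isEmpty()) {
        return "[]"; // early return: elementHint is left set
      }
      return super.write(list);
    }
  }

  // Element encoder that is only valid for Integer elements.
  static final Function<Object, String> INT_ONLY = o -> "i" + (Integer) o;

  /** Returns true when the stale hint breaks the second, nested write. */
  public static boolean staleHintFails() {
    ListEncoder enc = new ShortcutListEncoder();
    enc.setElementHint(INT_ONLY);
    enc.write(List.of()); // the hint survives this call
    try {
      enc.write(List.of(List.of(1, 2))); // stale INT_ONLY is applied to a List element
      return false;
    } catch (ClassCastException e) {
      return true;
    }
  }

  /** Same sequence, but the hint is reset unconditionally after the first write. */
  public static String clearedHintWorks() {
    ListEncoder enc = new ShortcutListEncoder();
    enc.setElementHint(INT_ONLY);
    enc.write(List.of());
    enc.clearElementHint();
    return enc.write(List.of(List.of(1, 2)));
  }

  public static void main(String[] args) {
    System.out.println(staleHintFails());
    System.out.println(clearedHintWorks());
  }
}
```

The point of the sketch is only the ordering problem: state that is set for one call must be cleared on every path, including paths added by overriding subclasses, or it leaks into a later call on nested data.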