
[SPARK-37556][SQL] Deser void class fail with Java serialization #34816

Closed
wants to merge 1 commit

Conversation

Contributor

@daijyc daijyc commented Dec 6, 2021

What changes were proposed in this pull request?
Change the deserialization mapping for primitive type void.

Why are the changes needed?
The `void` primitive type in Scala should be `classOf[Unit]`, not `classOf[Void]`. Spark erroneously [maps it](https://github.com/apache/spark/blob/v3.2.0/core/src/main/scala/org/apache/spark/serializer/JavaSerializer.scala#L80) differently from all the other primitive types. Here is the code:

```
private object JavaDeserializationStream {
  val primitiveMappings = Map[String, Class[_]](
    "boolean" -> classOf[Boolean],
    "byte" -> classOf[Byte],
    "char" -> classOf[Char],
    "short" -> classOf[Short],
    "int" -> classOf[Int],
    "long" -> classOf[Long],
    "float" -> classOf[Float],
    "double" -> classOf[Double],
    "void" -> classOf[Void]
  )
}
```

Here is a demonstration in the Scala REPL:

```
scala> classOf[Long]
val res0: Class[Long] = long

scala> classOf[Double]
val res1: Class[Double] = double

scala> classOf[Byte]
val res2: Class[Byte] = byte

scala> classOf[Void]
val res3: Class[Void] = class java.lang.Void  <--- this is wrong

scala> classOf[Unit]
val res4: Class[Unit] = void <---- this is right
```

This results in a Spark deserialization error whenever the serialized data contains the `void` primitive type:
`java.io.InvalidClassException: java.lang.Void; local class name incompatible with stream class name "void"`
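The root cause can be checked without Spark at all: the primitive void class's runtime name is `"void"`, which is what Java serialization writes into the stream descriptor, and only `classOf[Unit]` resolves back to it. A minimal standalone sketch of the corrected mapping (illustrative code, not the actual Spark source):

```scala
// Standalone sketch mirroring the fixed JavaDeserializationStream mapping.
object VoidMappingCheck extends App {
  val primitiveMappings = Map[String, Class[_]](
    "boolean" -> classOf[Boolean],
    "byte" -> classOf[Byte],
    "char" -> classOf[Char],
    "short" -> classOf[Short],
    "int" -> classOf[Int],
    "long" -> classOf[Long],
    "float" -> classOf[Float],
    "double" -> classOf[Double],
    "void" -> classOf[Unit] // the fix: was classOf[Void], i.e. java.lang.Void
  )

  // Java serialization records the primitive class name "void" in the stream;
  // resolving it must yield the primitive void class, not the boxed wrapper.
  assert(classOf[Unit].getName == "void")
  assert(classOf[Void].getName == "java.lang.Void")
  assert(primitiveMappings("void") == classOf[Unit])
  println("ok")
}
```

With the old mapping, resolving the stream name `"void"` to `java.lang.Void` made the local class descriptor disagree with the stream descriptor, producing the `InvalidClassException` above.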

Does this PR introduce any user-facing change?
no

How was this patch tested?
Changed an existing test; also verified end to end that code which previously failed with this deserialization error now passes.

@github-actions github-actions bot added the CORE label Dec 6, 2021
@HyukjinKwon HyukjinKwon changed the title SPARK-37556: Deser void class fail with Java serialization [SPARK-37556][SQL] Deser void class fail with Java serialization Dec 6, 2021
@HyukjinKwon
Member

Can you fill in the JIRA description, please? See also https://github.com/apache/spark/blob/master/.github/PULL_REQUEST_TEMPLATE

@srowen
Member

srowen commented Dec 6, 2021

Jenkins test this please

@SparkQA

SparkQA commented Dec 6, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50430/

@SparkQA

SparkQA commented Dec 6, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50430/

@SparkQA

SparkQA commented Dec 6, 2021

Test build #145956 has finished for PR 34816 at commit 7fb333a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Can one of the admins verify this patch?

@srowen srowen closed this in fb40c0e Dec 7, 2021
srowen pushed a commit that referenced this pull request Dec 7, 2021
Closes #34816 from daijyc/voidtype.

Authored-by: Daniel Dai <jdai@pinterest.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
(cherry picked from commit fb40c0e)
Signed-off-by: Sean Owen <srowen@gmail.com>
srowen pushed a commit that referenced this pull request Dec 7, 2021
srowen pushed a commit that referenced this pull request Dec 7, 2021
@srowen
Member

srowen commented Dec 7, 2021

Merged to master/3.2/3.1/3.0

fishcus pushed a commit to fishcus/spark that referenced this pull request Jan 12, 2022
catalinii pushed a commit to lyft/spark that referenced this pull request Feb 22, 2022
catalinii pushed a commit to lyft/spark that referenced this pull request Mar 4, 2022
kazuyukitanimura pushed a commit to kazuyukitanimura/spark that referenced this pull request Aug 10, 2022
wangyum pushed a commit that referenced this pull request May 26, 2023