SparkSha2 is not compliant with Spark and does not support Int32 type #16336

Closed
@rishvin

Description

Describe the bug

This ticket is related to #1820 from Comet.

We are working on using DataFusion's Sha2 (SparkSha2) implementation in Comet. However, while making the changes, we found two issues (see here for full details):

  • SparkSha2 is not fully compliant with Apache Spark's Sha2 output. Apache Spark returns the Sha2 hex string in lowercase, whereas SparkSha2 returns it in uppercase. E.g.,
DataFusion SparkSha2 response - 2C83E9E8A39D60F7FCD3CFEC29C154260AA069F91CD40C972756F9354C64594E
Spark sha2 response - 2c83e9e8a39d60f7fcd3cfec29c154260aa069f91cd40c972756f9354c64594e
  • SparkSha2's pattern matching expects the Sha2 bit-length argument to be of type UInt32, but Spark (and Comet) do not support UInt32. This is a limitation of the JVM, which has no unsigned 32-bit integer type. As a result, Comet cannot send a UInt32 bit-length, and the pattern match fails.
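
To illustrate the first mismatch, here is a small Python check (not DataFusion or Comet code) using the two strings quoted above. It shows that both decode to the same 32-byte digest and differ only in letter case, which is why an exact string comparison against Spark's result fails.

```python
# The two responses quoted in this issue.
datafusion_hex = "2C83E9E8A39D60F7FCD3CFEC29C154260AA069F91CD40C972756F9354C64594E"
spark_hex = "2c83e9e8a39d60f7fcd3cfec29c154260aa069f91cd40c972756f9354c64594e"

# bytes.fromhex accepts both cases, so this compares the raw digests.
same_digest = bytes.fromhex(datafusion_hex) == bytes.fromhex(spark_hex)

# An exact string comparison is case-sensitive.
exact_match = datafusion_hex == spark_hex

print(f"same digest bytes: {same_digest}, exact string match: {exact_match}")
```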

This ticket covers two fixes:

  • Make the SparkSha2 response compliant with Spark (lowercase hex output).
  • Add support for an Int32 bit-length argument.
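
As a rough reference for the behavior these two fixes aim at, the sketch below models Spark's documented `sha2` semantics in Python. It is not the DataFusion implementation, and `spark_sha2` is a hypothetical helper name used only here. It assumes Spark's documented rules: the bit length is a signed integer, `0` is treated as `256`, and an unsupported bit length yields NULL (modeled as `None`).

```python
import hashlib
from typing import Optional

# Bit lengths Spark's sha2 accepts; 0 is documented as equivalent to 256.
_ALGORITHMS = {
    0: hashlib.sha256,
    224: hashlib.sha224,
    256: hashlib.sha256,
    384: hashlib.sha384,
    512: hashlib.sha512,
}


def spark_sha2(data: Optional[bytes], bit_length: Optional[int]) -> Optional[str]:
    """Hypothetical model of Spark's sha2(expr, bitLength).

    bit_length is a plain signed integer, so negative values are
    representable and simply fall through to the unsupported case.
    """
    if data is None or bit_length is None:
        return None  # NULL in, NULL out
    algorithm = _ALGORITHMS.get(bit_length)
    if algorithm is None:
        return None  # unsupported bit length -> NULL
    # hexdigest() returns lowercase hex, matching Spark's output.
    return algorithm(data).hexdigest()


print(spark_sha2(b"abc", 256))
print(spark_sha2(b"abc", -1))
```

A conformance test on the Comet side could compare results like these against the native function's output for the same inputs, including negative and unsupported bit lengths.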

To Reproduce

No response

Expected behavior

No response

Additional context

No response

Labels: bug
