Describe the bug
Casting string to decimal on the GPU does not return null or throw for out-of-range float-formatted values; it returns the value as-is.
For example, 1.23E+21 cast to decimal(15,-5) returns 1.2300000000000000E+21 on the GPU, while on the CPU it returns null (ANSI off) or throws SparkArithmeticException: Decimal(expanded, 1.23E+21, 3, -19) cannot be represented as Decimal(15, -5) (ANSI on).
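For context on why 1.23E+21 is out of range: decimal(15,-5) stores an unscaled value of at most 15 digits scaled by 10^5, so its maximum magnitude is (10^15 - 1) * 10^5, just under 10^20. A minimal sketch of that bound check using plain java.math.BigDecimal (fitsInDecimal is a hypothetical helper for illustration, not Spark's actual Decimal.changePrecision logic):

import java.math.{BigDecimal => JBigDecimal, RoundingMode}

// decimal(p, s) holds unscaledValue * 10^(-s) with at most p digits in
// unscaledValue; HALF_UP matches the rounding Spark uses when casting
// to decimal.
def fitsInDecimal(str: String, precision: Int, scale: Int): Boolean = {
  val rescaled = new JBigDecimal(str).setScale(scale, RoundingMode.HALF_UP)
  rescaled.precision <= precision
}

fitsInDecimal("1.23E+21", 15, -5)  // false: needs 17 digits at scale -5
fitsInDecimal("1.23E+19", 15, -5)  // true: 15 digits fit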
Steps/Code to reproduce bug
scala> val df = Seq("1.23E+21").toDF("col")
df: org.apache.spark.sql.DataFrame = [col: string]
scala> df.write.mode("OVERWRITE").parquet("TEMP")
24/05/27 10:35:34 WARN GpuOverrides:
*Exec <DataWritingCommandExec> will run on GPU
*Output <InsertIntoHadoopFsRelationCommand> will run on GPU
! <LocalTableScanExec> cannot run on GPU because GPU does not currently support the operator class org.apache.spark.sql.execution.LocalTableScanExec
@Expression <AttributeReference> col#4 could run on GPU
scala> import org.apache.spark.sql.types._
import org.apache.spark.sql.types._
scala> spark.conf.set("spark.sql.legacy.allowNegativeScaleOfDecimal", "true")
scala>
scala> spark.conf.set("spark.sql.ansi.enabled", "true")
scala> val decType = DataTypes.createDecimalType(15, -5)
decType: org.apache.spark.sql.types.DecimalType = DecimalType(15,-5)
scala> spark.read.parquet("TEMP").select(col("col").cast(decType)).show(false)
24/05/27 10:36:42 WARN GpuOverrides:
!Exec <CollectLimitExec> cannot run on GPU because the Exec CollectLimitExec has been disabled, and is disabled by default because Collect Limit replacement can be slower on the GPU, if huge number of rows in a batch it could help by limiting the number of rows transferred from GPU to CPU. Set spark.rapids.sql.exec.CollectLimitExec to true if you wish to enable it
@Partitioning <SinglePartition$> could run on GPU
*Exec <ProjectExec> will run on GPU
*Expression <Alias> cast(cast(col#7 as decimal(15,-5)) as string) AS col#13 will run on GPU
*Expression <Cast> cast(cast(col#7 as decimal(15,-5)) as string) will run on GPU
*Expression <Cast> cast(col#7 as decimal(15,-5)) will run on GPU
*Exec <FileSourceScanExec> will run on GPU
+----------------------+
|col |
+----------------------+
|1.2300000000000000E+21|
+----------------------+
scala> spark.conf.set("spark.rapids.sql.enabled", "false")
scala> spark.read.parquet("TEMP").select(col("col").cast(decType)).show(false)
java.lang.AssertionError: assertion failed:
Decimal$DecimalIsFractional
while compiling: <console>
during phase: globalPhase=terminal, enteringPhase=jvm
library version: version 2.12.15
compiler version: version 2.12.15
reconstructed args: -classpath /home/haoyangl/spark-rapids/dist/target/rapids-4-spark_2.12-24.08.0-SNAPSHOT-cuda11.jar -Yrepl-class-based -Yrepl-outdir /tmp/spark-026c13e7-fb35-46ff-a225-22ac468cd6e1/repl-47525fc2-08a4-4b4b-b850-a00c64aa2c3b
last tree to typer: TypeTree(class Byte)
tree position: line 6 of <console>
tree tpe: Byte
symbol: (final abstract) class Byte in package scala
symbol definition: final abstract class Byte extends (a ClassSymbol)
symbol package: scala
symbol owners: class Byte
call site: constructor $eval in object $eval in package $line22
== Source file context for tree position ==
3
4 object $eval {
5 lazy val $result = $line22.$read.INSTANCE.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.res5
6 lazy val $print: _root_.java.lang.String = {
7 $line22.$read.INSTANCE.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw
8
9 ""
at scala.reflect.internal.SymbolTable.throwAssertionError(SymbolTable.scala:185)
at scala.reflect.internal.Symbols$Symbol.completeInfo(Symbols.scala:1525)
at scala.reflect.internal.Symbols$Symbol.info(Symbols.scala:1514)
at scala.reflect.internal.Symbols$Symbol.flatOwnerInfo(Symbols.scala:2353)
at scala.reflect.internal.Symbols$ClassSymbol.companionModule0(Symbols.scala:3346)
at scala.reflect.internal.Symbols$ClassSymbol.companionModule(Symbols.scala:3348)
at scala.reflect.internal.Symbols$ModuleClassSymbol.sourceModule(Symbols.scala:3487)
at scala.reflect.internal.Symbols.$anonfun$forEachRelevantSymbols$1$adapted(Symbols.scala:3802)
at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:38)
at scala.reflect.internal.Symbols.markFlagsCompleted(Symbols.scala:3799)
at scala.reflect.internal.Symbols.markFlagsCompleted$(Symbols.scala:3805)
at scala.reflect.internal.SymbolTable.markFlagsCompleted(SymbolTable.scala:28)
at scala.reflect.internal.pickling.UnPickler$Scan.finishSym$1(UnPickler.scala:324)
at scala.reflect.internal.pickling.UnPickler$Scan.readSymbol(UnPickler.scala:342)
at scala.reflect.internal.pickling.UnPickler$Scan.readSymbolRef(UnPickler.scala:645)
at scala.reflect.internal.pickling.UnPickler$Scan.readType(UnPickler.scala:413)
at scala.reflect.internal.pickling.UnPickler$Scan.$anonfun$readSymbol$10(UnPickler.scala:357)
at scala.reflect.internal.pickling.UnPickler$Scan.at(UnPickler.scala:188)
at scala.reflect.internal.pickling.UnPickler$Scan.readSymbol(UnPickler.scala:357)
at scala.reflect.internal.pickling.UnPickler$Scan.$anonfun$run$1(UnPickler.scala:96)
at scala.reflect.internal.pickling.UnPickler$Scan.run(UnPickler.scala:88)
at scala.reflect.internal.pickling.UnPickler.unpickle(UnPickler.scala:47)
at scala.tools.nsc.symtab.classfile.ClassfileParser.unpickleOrParseInnerClasses(ClassfileParser.scala:1186)
at scala.tools.nsc.symtab.classfile.ClassfileParser.parseClass(ClassfileParser.scala:468)
at scala.tools.nsc.symtab.classfile.ClassfileParser.$anonfun$parse$2(ClassfileParser.scala:161)
at scala.tools.nsc.symtab.classfile.ClassfileParser.$anonfun$parse$1(ClassfileParser.scala:147)
at scala.tools.nsc.symtab.classfile.ClassfileParser.parse(ClassfileParser.scala:130)
at scala.tools.nsc.symtab.SymbolLoaders$ClassfileLoader.doComplete(SymbolLoaders.scala:343)
at scala.tools.nsc.symtab.SymbolLoaders$SymbolLoader.complete(SymbolLoaders.scala:250)
at scala.tools.nsc.symtab.SymbolLoaders$SymbolLoader.load(SymbolLoaders.scala:269)
at scala.reflect.internal.Symbols$Symbol.exists(Symbols.scala:1104)
at scala.reflect.internal.Symbols$Symbol.toOption(Symbols.scala:2609)
at scala.tools.nsc.interpreter.IMain.translateSimpleResource(IMain.scala:340)
at scala.tools.nsc.interpreter.IMain$TranslatingClassLoader.findAbstractFile(IMain.scala:354)
at scala.reflect.internal.util.AbstractFileClassLoader.findResource(AbstractFileClassLoader.scala:76)
at java.lang.ClassLoader.getResource(ClassLoader.java:1089)
at java.lang.ClassLoader.getResourceAsStream(ClassLoader.java:1300)
at scala.reflect.internal.util.RichClassLoader$.classAsStream$extension(ScalaClassLoader.scala:89)
at scala.reflect.internal.util.RichClassLoader$.classBytes$extension(ScalaClassLoader.scala:81)
at scala.reflect.internal.util.ScalaClassLoader.classBytes(ScalaClassLoader.scala:131)
at scala.reflect.internal.util.ScalaClassLoader.classBytes$(ScalaClassLoader.scala:131)
at scala.reflect.internal.util.AbstractFileClassLoader.classBytes(AbstractFileClassLoader.scala:41)
at scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:70)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at java.lang.ClassLoader.loadClass(ClassLoader.java:405)
at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.java:40)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.codehaus.janino.ClassLoaderIClassLoader.findIClass(ClassLoaderIClassLoader.java:89)
at org.codehaus.janino.IClassLoader.loadIClass(IClassLoader.java:317)
at org.codehaus.janino.UnitCompiler.findTypeByName(UnitCompiler.java:8618)
at org.codehaus.janino.UnitCompiler.reclassifyName(UnitCompiler.java:8838)
at org.codehaus.janino.UnitCompiler.reclassifyName(UnitCompiler.java:8529)
at org.codehaus.janino.UnitCompiler.reclassify(UnitCompiler.java:8388)
at org.codehaus.janino.UnitCompiler.getType2(UnitCompiler.java:6900)
at org.codehaus.janino.UnitCompiler.access$14600(UnitCompiler.java:226)
at org.codehaus.janino.UnitCompiler$22$2$1.visitAmbiguousName(UnitCompiler.java:6518)
at org.codehaus.janino.UnitCompiler$22$2$1.visitAmbiguousName(UnitCompiler.java:6515)
at org.codehaus.janino.Java$AmbiguousName.accept(Java.java:4429)
at org.codehaus.janino.UnitCompiler$22$2.visitLvalue(UnitCompiler.java:6515)
at org.codehaus.janino.UnitCompiler$22$2.visitLvalue(UnitCompiler.java:6511)
at org.codehaus.janino.Java$Lvalue.accept(Java.java:4353)
at org.codehaus.janino.UnitCompiler$22.visitRvalue(UnitCompiler.java:6511)
at org.codehaus.janino.UnitCompiler$22.visitRvalue(UnitCompiler.java:6490)
at org.codehaus.janino.Java$Rvalue.accept(Java.java:4321)
at org.codehaus.janino.UnitCompiler.getType(UnitCompiler.java:6490)
at org.codehaus.janino.UnitCompiler.findIMethod(UnitCompiler.java:9110)
at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:5055)
at org.codehaus.janino.UnitCompiler.access$9100(UnitCompiler.java:226)
at org.codehaus.janino.UnitCompiler$16.visitMethodInvocation(UnitCompiler.java:4482)
at org.codehaus.janino.UnitCompiler$16.visitMethodInvocation(UnitCompiler.java:4455)
at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:5286)
at org.codehaus.janino.UnitCompiler.compileGet(UnitCompiler.java:4455)
at org.codehaus.janino.UnitCompiler.compileGetValue(UnitCompiler.java:5683)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:2581)
at org.codehaus.janino.UnitCompiler.access$2700(UnitCompiler.java:226)
at org.codehaus.janino.UnitCompiler$6.visitLocalVariableDeclarationStatement(UnitCompiler.java:1506)
at org.codehaus.janino.UnitCompiler$6.visitLocalVariableDeclarationStatement(UnitCompiler.java:1490)
at org.codehaus.janino.Java$LocalVariableDeclarationStatement.accept(Java.java:3712)
at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1490)
at org.codehaus.janino.UnitCompiler.compileStatements(UnitCompiler.java:1573)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:1559)
at org.codehaus.janino.UnitCompiler.access$1700(UnitCompiler.java:226)
at org.codehaus.janino.UnitCompiler$6.visitBlock(UnitCompiler.java:1496)
at org.codehaus.janino.UnitCompiler$6.visitBlock(UnitCompiler.java:1490)
at org.codehaus.janino.Java$Block.accept(Java.java:2969)
at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1490)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:2486)
at org.codehaus.janino.UnitCompiler.access$1900(UnitCompiler.java:226)
at org.codehaus.janino.UnitCompiler$6.visitIfStatement(UnitCompiler.java:1498)
at org.codehaus.janino.UnitCompiler$6.visitIfStatement(UnitCompiler.java:1490)
at org.codehaus.janino.Java$IfStatement.accept(Java.java:3140)
at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1490)
at org.codehaus.janino.UnitCompiler.compileStatements(UnitCompiler.java:1573)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:1559)
at org.codehaus.janino.UnitCompiler.access$1700(UnitCompiler.java:226)
at org.codehaus.janino.UnitCompiler$6.visitBlock(UnitCompiler.java:1496)
at org.codehaus.janino.UnitCompiler$6.visitBlock(UnitCompiler.java:1490)
at org.codehaus.janino.Java$Block.accept(Java.java:2969)
at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1490)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:1661)
at org.codehaus.janino.UnitCompiler.access$2000(UnitCompiler.java:226)
at org.codehaus.janino.UnitCompiler$6.visitForStatement(UnitCompiler.java:1499)
at org.codehaus.janino.UnitCompiler$6.visitForStatement(UnitCompiler.java:1490)
at org.codehaus.janino.Java$ForStatement.accept(Java.java:3187)
at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1490)
at org.codehaus.janino.UnitCompiler.compileStatements(UnitCompiler.java:1573)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:1559)
at org.codehaus.janino.UnitCompiler.access$1700(UnitCompiler.java:226)
at org.codehaus.janino.UnitCompiler$6.visitBlock(UnitCompiler.java:1496)
at org.codehaus.janino.UnitCompiler$6.visitBlock(UnitCompiler.java:1490)
at org.codehaus.janino.Java$Block.accept(Java.java:2969)
at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1490)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:1848)
at org.codehaus.janino.UnitCompiler.access$2200(UnitCompiler.java:226)
at org.codehaus.janino.UnitCompiler$6.visitWhileStatement(UnitCompiler.java:1501)
at org.codehaus.janino.UnitCompiler$6.visitWhileStatement(UnitCompiler.java:1490)
at org.codehaus.janino.Java$WhileStatement.accept(Java.java:3245)
at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1490)
at org.codehaus.janino.UnitCompiler.compileStatements(UnitCompiler.java:1573)
at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:3420)
at org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1362)
at org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1335)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:807)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:975)
at org.codehaus.janino.UnitCompiler.access$700(UnitCompiler.java:226)
at org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:392)
at org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:384)
at org.codehaus.janino.Java$MemberClassDeclaration.accept(Java.java:1445)
at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:384)
at org.codehaus.janino.UnitCompiler.compileDeclaredMemberTypes(UnitCompiler.java:1312)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:833)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:410)
at org.codehaus.janino.UnitCompiler.access$400(UnitCompiler.java:226)
at org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:389)
at org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:384)
at org.codehaus.janino.Java$PackageMemberClassDeclaration.accept(Java.java:1594)
at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:384)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:362)
at org.codehaus.janino.UnitCompiler.access$000(UnitCompiler.java:226)
at org.codehaus.janino.UnitCompiler$1.visitCompilationUnit(UnitCompiler.java:336)
at org.codehaus.janino.UnitCompiler$1.visitCompilationUnit(UnitCompiler.java:333)
at org.codehaus.janino.Java$CompilationUnit.accept(Java.java:363)
at org.codehaus.janino.UnitCompiler.compileUnit(UnitCompiler.java:333)
at org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:235)
at org.codehaus.janino.SimpleCompiler.compileToClassLoader(SimpleCompiler.java:464)
at org.codehaus.janino.ClassBodyEvaluator.compileToClass(ClassBodyEvaluator.java:314)
at org.codehaus.janino.ClassBodyEvaluator.cook(ClassBodyEvaluator.java:237)
at org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:205)
at org.codehaus.commons.compiler.Cookable.cook(Cookable.java:80)
at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.org$apache$spark$sql$catalyst$expressions$codegen$CodeGenerator$$doCompile(CodeGenerator.scala:1490)
at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:1587)
at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:1584)
at org.sparkproject.guava.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
at org.sparkproject.guava.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
at org.sparkproject.guava.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
at org.sparkproject.guava.cache.LocalCache$Segment.get(LocalCache.java:2257)
at org.sparkproject.guava.cache.LocalCache.get(LocalCache.java:4000)
at org.sparkproject.guava.cache.LocalCache.getOrLoad(LocalCache.java:4004)
at org.sparkproject.guava.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.compile(CodeGenerator.scala:1437)
at org.apache.spark.sql.execution.WholeStageCodegenExec.liftedTree1$1(WholeStageCodegenExec.scala:726)
at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:725)
at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:194)
at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:232)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:229)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:190)
at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:340)
at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:473)
at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:459)
at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:48)
at org.apache.spark.sql.Dataset.collectFromPlan(Dataset.scala:3868)
at org.apache.spark.sql.Dataset.$anonfun$head$1(Dataset.scala:2863)
at org.apache.spark.sql.Dataset.$anonfun$withAction$2(Dataset.scala:3858)
at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:510)
at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3856)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:109)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:169)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:95)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3856)
at org.apache.spark.sql.Dataset.head(Dataset.scala:2863)
at org.apache.spark.sql.Dataset.take(Dataset.scala:3084)
at org.apache.spark.sql.Dataset.getRows(Dataset.scala:288)
at org.apache.spark.sql.Dataset.showString(Dataset.scala:327)
at org.apache.spark.sql.Dataset.show(Dataset.scala:810)
at org.apache.spark.sql.Dataset.show(Dataset.scala:787)
at $line22.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:27)
at $line22.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:31)
at $line22.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:33)
at $line22.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:35)
at $line22.$read$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:37)
at $line22.$read$$iw$$iw$$iw$$iw$$iw.<init>(<console>:39)
at $line22.$read$$iw$$iw$$iw$$iw.<init>(<console>:41)
at $line22.$read$$iw$$iw$$iw.<init>(<console>:43)
at $line22.$read$$iw$$iw.<init>(<console>:45)
at $line22.$read$$iw.<init>(<console>:47)
at $line22.$read.<init>(<console>:49)
at $line22.$read$.<init>(<console>:53)
at $line22.$read$.<clinit>(<console>)
at $line22.$eval$.$print$lzycompute(<console>:7)
at $line22.$eval$.$print(<console>:6)
at $line22.$eval.$print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:747)
at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1020)
at scala.tools.nsc.interpreter.IMain.$anonfun$interpret$1(IMain.scala:568)
at scala.reflect.internal.util.ScalaClassLoader.asContext(ScalaClassLoader.scala:36)
at scala.reflect.internal.util.ScalaClassLoader.asContext$(ScalaClassLoader.scala:116)
at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:41)
at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:567)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:594)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:564)
at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:865)
at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:733)
at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:435)
at scala.tools.nsc.interpreter.ILoop.loop(ILoop.scala:456)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:239)
at org.apache.spark.repl.Main$.doMain(Main.scala:78)
at org.apache.spark.repl.Main$.main(Main.scala:58)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
error: error while loading Decimal, class file '/home/haoyangl/spark-3.3.0-bin-hadoop3.2/jars/spark-catalyst_2.12-3.3.0.jar(org/apache/spark/sql/types/Decimal.class)' is broken
(class java.lang.RuntimeException/error reading Scala signature of Decimal.class: assertion failed:
Decimal$DecimalIsFractional
while compiling: <console>
during phase: globalPhase=terminal, enteringPhase=jvm
library version: version 2.12.15
compiler version: version 2.12.15
reconstructed args: -classpath /home/haoyangl/spark-rapids/dist/target/rapids-4-spark_2.12-24.08.0-SNAPSHOT-cuda11.jar -Yrepl-class-based -Yrepl-outdir /tmp/spark-026c13e7-fb35-46ff-a225-22ac468cd6e1/repl-47525fc2-08a4-4b4b-b850-a00c64aa2c3b
last tree to typer: TypeTree(class Byte)
tree position: line 6 of <console>
tree tpe: Byte
symbol: (final abstract) class Byte in package scala
symbol definition: final abstract class Byte extends (a ClassSymbol)
symbol package: scala
symbol owners: class Byte
call site: constructor $eval in object $eval in package $line22
== Source file context for tree position ==
3
4 object $eval {
5 lazy val $result = res5
6 lazy val $print: _root_.java.lang.String = {
7 $iw
8
9 "" )
24/05/27 10:36:56 ERROR Executor: Exception in task 0.0 in stage 5.0 (TID 5)
org.apache.spark.SparkArithmeticException: Decimal(expanded, 1.23E+21, 3, -19) cannot be represented as Decimal(15, -5). If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
at org.apache.spark.sql.errors.QueryExecutionErrors$.cannotChangeDecimalPrecisionError(QueryExecutionErrors.scala:108)
at org.apache.spark.sql.errors.QueryExecutionErrors.cannotChangeDecimalPrecisionError(QueryExecutionErrors.scala)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:760)
at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:364)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:890)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:890)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:136)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
24/05/27 10:36:56 WARN TaskSetManager: Lost task 0.0 in stage 5.0 (TID 5) (spark-haoyang executor driver): org.apache.spark.SparkArithmeticException: Decimal(expanded, 1.23E+21, 3, -19) cannot be represented as Decimal(15, -5). If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
at org.apache.spark.sql.errors.QueryExecutionErrors$.cannotChangeDecimalPrecisionError(QueryExecutionErrors.scala:108)
at org.apache.spark.sql.errors.QueryExecutionErrors.cannotChangeDecimalPrecisionError(QueryExecutionErrors.scala)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:760)
at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:364)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:890)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:890)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:136)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
24/05/27 10:36:56 ERROR TaskSetManager: Task 0 in stage 5.0 failed 1 times; aborting job
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 5.0 failed 1 times, most recent failure: Lost task 0.0 in stage 5.0 (TID 5) (spark-haoyang executor driver): org.apache.spark.SparkArithmeticException: Decimal(expanded, 1.23E+21, 3, -19) cannot be represented as Decimal(15, -5). If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
at org.apache.spark.sql.errors.QueryExecutionErrors$.cannotChangeDecimalPrecisionError(QueryExecutionErrors.scala:108)
at org.apache.spark.sql.errors.QueryExecutionErrors.cannotChangeDecimalPrecisionError(QueryExecutionErrors.scala)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:760)
at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:364)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:890)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:890)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:136)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2672)
at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2608)
at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2607)
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2607)
at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1182)
at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1182)
at scala.Option.foreach(Option.scala:407)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1182)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2860)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2802)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2791)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:952)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2228)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2249)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2268)
at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:506)
at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:459)
at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:48)
at org.apache.spark.sql.Dataset.collectFromPlan(Dataset.scala:3868)
at org.apache.spark.sql.Dataset.$anonfun$head$1(Dataset.scala:2863)
at org.apache.spark.sql.Dataset.$anonfun$withAction$2(Dataset.scala:3858)
at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:510)
at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3856)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:109)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:169)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:95)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3856)
at org.apache.spark.sql.Dataset.head(Dataset.scala:2863)
at org.apache.spark.sql.Dataset.take(Dataset.scala:3084)
at org.apache.spark.sql.Dataset.getRows(Dataset.scala:288)
at org.apache.spark.sql.Dataset.showString(Dataset.scala:327)
at org.apache.spark.sql.Dataset.show(Dataset.scala:810)
at org.apache.spark.sql.Dataset.show(Dataset.scala:787)
... 49 elided
Caused by: org.apache.spark.SparkArithmeticException: Decimal(expanded, 1.23E+21, 3, -19) cannot be represented as Decimal(15, -5). If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
at org.apache.spark.sql.errors.QueryExecutionErrors$.cannotChangeDecimalPrecisionError(QueryExecutionErrors.scala:108)
at org.apache.spark.sql.errors.QueryExecutionErrors.cannotChangeDecimalPrecisionError(QueryExecutionErrors.scala)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:760)
at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:364)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:890)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:890)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:136)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Expected behavior
Results on CPU and GPU should match.
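Concretely, with ANSI off the query above should yield null on both CPU and GPU; a sketch reusing TEMP and decType from the repro:

spark.conf.set("spark.sql.ansi.enabled", "false")
spark.read.parquet("TEMP").select(col("col").cast(decType)).show(false)
// Expected on both CPU and GPU:
// +----+
// |col |
// +----+
// |null|
// +----+

With ANSI on, both sides should instead throw SparkArithmeticException rather than returning 1.2300000000000000E+21.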