[SPARK-30262][SQL] Avoid NumberFormatException when totalSize is empty#26892

Closed
southernriver wants to merge 1 commit into apache:master from southernriverchen:SPARK-30262

@southernriver
Contributor

@southernriver southernriver commented Dec 14, 2019

What changes were proposed in this pull request?

We can read the partition statistics, but in some special cases values such as totalSize, rawDataSize, and rowCount may be empty. When we then run a DDL command such as desc formatted on a partition, a NumberFormatException is thrown, as shown below:

spark-sql> desc formatted table1 partition(year='2019', month='10', day='17', hour='23');
19/10/19 00:02:40 ERROR SparkSQLDriver: Failed in [desc formatted table1 partition(year='2019', month='10', day='17', hour='23')]
java.lang.NumberFormatException: Zero length BigInteger
at java.math.BigInteger.<init>(BigInteger.java:411)
at java.math.BigInteger.<init>(BigInteger.java:597)
at scala.math.BigInt$.apply(BigInt.scala:77)
at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$31.apply(HiveClientImpl.scala:1056)

Although we can use 'analyze table partition' to update totalSize, rawDataSize, or rowCount, it is unreasonable for an ordinary SQL statement to throw a NumberFormatException because totalSize is empty. We should handle the empty case in readHiveStats.
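The failure does not depend on Hive itself and can be reproduced with plain Scala. The following is a minimal sketch, not code from this patch: the object name EmptyBigIntRepro and the helper parseUnguarded are hypothetical and only illustrate what happens when an empty statistic string reaches BigInt.

```scala
import scala.util.Try

object EmptyBigIntRepro {
  // Hypothetical helper: parse a raw statistic string with no guard,
  // capturing any exception in a Try instead of letting it propagate.
  def parseUnguarded(raw: String): Try[BigInt] = Try(BigInt(raw))

  def main(args: Array[String]): Unit = {
    // A numeric string parses normally.
    println(parseUnguarded("1024"))
    // An empty string fails inside java.math.BigInteger with
    // NumberFormatException; print the captured exception's class name.
    println(parseUnguarded("").failed.map(_.getClass.getName))
  }
}
```

Wrapping the call in Try is only a convenience for the demonstration; the patch itself avoids the parse rather than catching the exception.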

Why are the changes needed?

This is related to the robustness of the code: an empty statistic may lead to an unexpected exception in situations that are hard to predict. Here is the case:
(image)

Does this PR introduce any user-facing change?

No

How was this patch tested?

manual

@dongjoon-hyun
Member

Thank you for your first contribution, @southernriver .

@dongjoon-hyun
Member

ok to test

@SparkQA

SparkQA commented Dec 14, 2019

Test build #115338 has finished for PR 26892 at commit aff4a33.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun dongjoon-hyun changed the title from "[SPARK-30262][SQL] fix NumberFormatException when totalSize is empty" to "[SPARK-30262][SQL] Fix NumberFormatException when totalSize is empty" on Dec 15, 2019
val totalSize = properties.get(StatsSetupConst.TOTAL_SIZE).filter(_.nonEmpty).map(BigInt(_))
val rawDataSize = properties.get(StatsSetupConst.RAW_DATA_SIZE).filter(_.nonEmpty)
  .map(BigInt(_))
val rowCount = properties.get(StatsSetupConst.ROW_COUNT).filter(_.nonEmpty).map(BigInt(_))
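The guarded lookups above can be tried outside Spark with a plain Map standing in for the partition properties. This is a sketch under stated assumptions: the object GuardedStats, the helper readStat, and the key strings below are illustrative stand-ins for the StatsSetupConst constants, not part of the patch.

```scala
object GuardedStats {
  // Assumed string values for the StatsSetupConst keys; illustrative only.
  val TotalSize = "totalSize"
  val RawDataSize = "rawDataSize"
  val RowCount = "numRows"

  // Hypothetical helper mirroring the guarded pattern in the diff:
  // returns None when the key is missing or its value is empty,
  // otherwise parses the value as a BigInt.
  def readStat(properties: Map[String, String], key: String): Option[BigInt] =
    properties.get(key).filter(_.nonEmpty).map(BigInt(_))

  def main(args: Array[String]): Unit = {
    // totalSize is present but empty, rawDataSize is set, numRows is absent.
    val properties = Map(TotalSize -> "", RawDataSize -> "2048")
    println(readStat(properties, TotalSize))
    println(readStat(properties, RawDataSize))
    println(readStat(properties, RowCount))
  }
}
```

The filter only covers the empty string. A non-empty value that is not a number would still reach BigInt and throw, which matches the narrow scope of the change discussed here.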
Member

This is a logical patch, but there is no evidence of when this happens.
Could you give me a reproducible procedure for the case you hit?

Member

I'm wondering if you are hitting another system's bug. If so, it would be better to fix the root cause.

This is related to the robustness of the code: an empty statistic may lead to an unexpected exception in situations that are hard to predict. Here is the case:

Member

Also, can you add tests for that?

Member

@dongjoon-hyun dongjoon-hyun left a comment

Okay. Since this is a trivial safeguard, I'll merge this to master. Thank you, @southernriver, @srowen, @maropu.

@dongjoon-hyun dongjoon-hyun changed the title from "[SPARK-30262][SQL] Fix NumberFormatException when totalSize is empty" to "[SPARK-30262][SQL] Avoid NumberFormatException when totalSize is empty" on Dec 18, 2019
@dongjoon-hyun
Member

@southernriver, you have been added to the Apache Spark contributor group and SPARK-30262 is assigned to you. Thank you again.
