[SPARK-42251][SQL] Forbid decimal type if precision less than 1 #39822

Closed
ulysses-you wants to merge 2 commits into apache:master from ulysses-you:SPARK-42251

@ulysses-you
Contributor

What changes were proposed in this pull request?

Throw an exception if the decimal precision is less than 1.
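As a rough illustration of the proposed rule (not the actual Spark change), the check amounts to rejecting the type at definition time. The function name `check_decimal_type` and the error message below are hypothetical and exist only for this sketch:

```python
def check_decimal_type(precision: int, scale: int) -> None:
    """Hypothetical validator: reject decimal(precision, scale) when precision < 1."""
    if precision < 1:
        # Fail when the type is declared, instead of later in some writer or catalog.
        raise ValueError(
            f"decimal({precision}, {scale}) is not allowed: precision must be at least 1"
        )


check_decimal_type(10, 2)  # accepted, returns None

try:
    check_decimal_type(0, 0)
except ValueError as err:
    print(err)
```

The point of validating up front is that every entry point (DDL, cast, user-specified schema) sees the same error rather than a backend-specific one.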

Why are the changes needed?

Spark does not actually support the decimal type with precision 0. For example:

-- works with in-memory catalog
create table t (c decimal(0, 0)) using parquet;

-- fails with parquet
-- java.lang.IllegalArgumentException: Invalid DECIMAL precision: 0
-- at org.apache.parquet.Preconditions.checkArgument(Preconditions.java:57)
insert into table t values(0);

-- fails with hive catalog
-- Caused by: java.lang.IllegalArgumentException: Decimal precision out of allowed range [1,38]
-- at org.apache.hadoop.hive.serde2.typeinfo.HiveDecimalUtils.validateParameter(HiveDecimalUtils.java:44)
create table t (c decimal(0, 0)) using parquet;

Does this PR introduce any user-facing change?

Yes. The main behavior change is for SELECT cast(0 as decimal(0, 0)):

  • before: returns 0
  • after: throws an exception

How was this patch tested?

Added a test.

@ulysses-you
Contributor Author

I'm not sure this is worth a legacy config. cc @cloud-fan @viirya @gengliangwang

@cloud-fan
Contributor

how does cast(0 as decimal(0, 0)) work?

@cloud-fan
Contributor

Another place to check is user-specified schema when reading data sources, e.g. spark.read.schema("c1: decimal(0, 0)").json(...)

@ulysses-you
Contributor Author

ulysses-you commented Jan 31, 2023

how does cast(0 as decimal(0, 0)) work?

The value is first wrapped in Decimal, which internally uses longVal to represent the decimal value, so it won't fail. The changePrecision method only fails with decimal(0, 0) if it uses decimalVal (BigDecimal). That's the tricky part: it may hide bugs, since some operators do not fail.
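To make the difference between the two representations concrete, here is a simplified model, not Spark's code. It assumes the long-backed path compares the unscaled value against a power of ten, while the BigDecimal-backed path compares the number of digits against the precision; the function names `fits_compact` and `fits_bigdecimal` are made up for this sketch:

```python
from decimal import Decimal


def fits_compact(unscaled: int, precision: int) -> bool:
    """Modelled long-backed check: the unscaled value must lie within +/- 10**precision."""
    bound = 10 ** precision
    return -bound < unscaled < bound


def fits_bigdecimal(value: Decimal, precision: int) -> bool:
    """Modelled BigDecimal-backed check: the digit count must not exceed the precision."""
    return len(value.as_tuple().digits) <= precision


# With precision 0 the bound is 10**0 == 1, so the value 0 slips through.
print(fits_compact(0, 0))              # True
# Zero still has one digit, which is more than a precision of 0 allows.
print(fits_bigdecimal(Decimal(0), 0))  # False
```

Under these assumptions the same value and the same target type give different answers depending on the internal representation, which matches the observation that some operators fail and others do not.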

Another place to check is user-specified schema when reading data source

This case should be covered. It's similar to the create table ... DDL. I added a new test for it.

@github-actions github-actions bot added the DOCS label Jan 31, 2023
@github-actions

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label May 12, 2023
@github-actions github-actions bot closed this May 13, 2023