
Division result exceeds numeric precision constraint #3731

Closed
aprokopczyk opened this issue Feb 13, 2023 · 5 comments · Fixed by #3737

Comments


aprokopczyk commented Feb 13, 2023

The problem

We use version 2.1.214, and it fails on the current master as well.
To reproduce the problem, please run this SQL script:

create table TEST (
    ID INT,
    VALUE_SUM INTEGER,
    VALUE_COUNT NUMERIC(100000, 0)
);

insert into TEST VALUES(1, 23116, 11069533);

select sum(VALUE_SUM) *100000 / min(VALUE_COUNT) from TEST;

The last line triggers the following error:

[22001][22001] Value too long for column "NUMERIC": "208.8254310276684662306892260043851895107047424674554924765118817568907378477484... (100003)"; SQL statement: select sum(VALUE_SUM) *100000 / min(VALUE_COUNT) from TEST [22001-214]

I debugged org.h2.expression.BinaryOperation#optimizeNumeric; it creates a type info with both precision and scale set to 100000. The produced result sticks to the defined scale, but the operation fails because the division result is greater than one, so it needs integer digits on top of the 100000 fractional digits.
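The same overflow can be illustrated outside H2 with plain java.math.BigDecimal (a sketch using the values from the script above, where the dividend is SUM(VALUE_SUM) * 100000 and the divisor is MIN(VALUE_COUNT); H2's internal code path may differ):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DivideScaleDemo {
    public static void main(String[] args) {
        // SUM(VALUE_SUM) * 100000 from the script above
        BigDecimal dividend = BigDecimal.valueOf(23116L * 100000L);
        // MIN(VALUE_COUNT)
        BigDecimal divisor = new BigDecimal("11069533");
        // Divide while keeping the full scale of 100000, as the inferred type does
        BigDecimal quotient = dividend.divide(divisor, 100000, RoundingMode.HALF_UP);
        // The quotient is about 208.825..., i.e. three integer digits plus
        // 100000 fractional digits = 100003 significant digits in total,
        // which no longer fits into a precision of 100000
        System.out.println(quotient.precision()); // 100003
    }
}
```

This matches the "(100003)" shown at the end of the truncated value in the error message.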

I see that performing additional rounding in org.h2.value.ValueNumeric#divide could help, but I wonder whether that is in line with the desired behavior.
Please let me know if I can help with the bug fix; I am eager to contribute.

I am unsure how relevant this is, but PostgreSQL can handle this script after the precision is changed to its accepted maximum of 1000.

Code

Code to reproduce the issue:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;
import java.util.logging.Level;
import java.util.logging.Logger;

public class PrecisionBug {

  public static final Logger LOGGER = Logger.getLogger(PrecisionBug.class.getName());

  private static final String QUERY = "select sum(VALUE_SUM) *100000 / min(VALUE_COUNT) from TEST;";
  private static final String CREATE = "create table TEST (ID INT, VALUE_SUM INTEGER, VALUE_COUNT NUMERIC(100000, 0));";
  private static final String INSERT = "insert into TEST VALUES(1, 23116, 11069533)";

  public static void main(String... args) throws Exception {
    Class.forName("org.h2.Driver");

    try (Connection connection = DriverManager.getConnection("jdbc:h2:mem:test");
        Statement st = connection.createStatement()) {
      st.execute(CREATE);
      st.execute(INSERT);
      st.executeQuery(QUERY);
    } catch (Exception e) {
      LOGGER.log(Level.SEVERE, "Query error.", e);
    }
  }

}
aprokopczyk (Author) commented

@katzyn, @grandinj could you shed some light on desired behavior in this case?

katzyn (Contributor) commented Feb 16, 2023

It should be much safer to use numeric data types with sane precision and scale; 100000 is too large for normal use cases.

The SQL Standard doesn't have any requirements for the quotient's precision and scale, so database systems may choose any precision and scale for it.

But I think that H2 can try to handle this corner case better; I'll take a look at it this week.

aprokopczyk (Author) commented

Hi @katzyn,
thank you for your reply.

It should be much safer to use numeric data types with sane precision and scale; 100000 is too large for normal use cases.

I checked once again. We converted our database with a migration script from version 1.4.199. Initially, the column had the DECIMAL(65535) type. After migration to version 2.1.214, it is defined as NUMERIC with no precision and scale constraints, so I assume some defaults kicked in.
If you change the column definition in the provided test to NUMERIC, the test also fails.

But I think that H2 can try to handle this corner case better; I'll take a look at it this week.

Thank you. Please let me know if there is something I can do to help you.

katzyn (Contributor) commented Feb 18, 2023

SUM(VALUE_SUM) * 100000 produces a BIGINT value.
MIN(VALUE_COUNT) produces a NUMERIC(100000) value.

Due to this, H2 converts the dividend to NUMERIC(19, 0) and then tries to find the optimal data type for the quotient.

H2 determines that NUMERIC(100019, 100000) is good enough, but the precision is clamped to the limit, so the invalid type NUMERIC(100000, 100000) is produced: the scale equals the maximum precision, leaving no room for integer digits.

Here we may instead adjust both precision and scale, producing NUMERIC(100000, 99981), because the precision and scale of a quotient are implementation-defined.

(We may not do the same for addition, subtraction, and multiplication; the SQL Standard requires a specific scale for those operations.)
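The adjustment described above can be sketched as follows (a hypothetical helper, not H2's actual code; MAX_NUMERIC_PRECISION stands in for H2's 100000 limit). When the computed precision exceeds the limit, the scale is reduced by the same amount, preserving room for the integer digits:

```java
public class QuotientTypeDemo {
    static final int MAX_NUMERIC_PRECISION = 100000; // H2's upper limit

    /** Clamp precision to the limit and shrink scale by the same excess. */
    static int[] quotientType(int precision, int scale) {
        if (precision > MAX_NUMERIC_PRECISION) {
            int excess = precision - MAX_NUMERIC_PRECISION;
            precision = MAX_NUMERIC_PRECISION;
            scale = Math.max(0, scale - excess);
        }
        return new int[] { precision, scale };
    }

    public static void main(String[] args) {
        // NUMERIC(100019, 100000) from the example becomes NUMERIC(100000, 99981)
        int[] t = quotientType(100019, 100000);
        System.out.println("NUMERIC(" + t[0] + ", " + t[1] + ")");
    }
}
```

With this adjustment the example quotient (about 208.8 with up to 19 integer digits) still fits, because 19 integer digits plus 99981 fractional digits stay within the 100000-digit precision.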

katzyn (Contributor) commented Feb 18, 2023

Initially, the column had DECIMAL(65535) type. After migration to version 2.1.214 it is defined as Numeric

A column with "count" in its name should usually have the INTEGER or BIGINT data type; you can change it with

ALTER TABLE TEST ALTER COLUMN VALUE_COUNT SET DATA TYPE INTEGER;
