Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix MCOL 4632 and 4648 and further issues when casting to [U]BIGINT. #2874

Merged

Conversation

phoeinx
Copy link
Contributor

@phoeinx phoeinx commented Jun 22, 2023

This PR fixes MCOL-4632 and MCOL-4648 and some other casting issues that I found while investigating both issues.

Why?

We use uintX_t constants to mark NULL values in result rows. E.g. in storage/columnstore/columnstore/utils/funcexp/funcexp.cpp l.394:

case CalpontSystemCatalog::UBIGINT:
      {
        uint64_t val = expression[i]->getUintVal(row, isNull);

         if (isNull)
          row.setUintField<8>(UBIGINTNULL, expression[i]->outputIndex());
        else
          row.setUintField<8>(val, expression[i]->outputIndex());

        break;
      }

If isNull gets set by getUintVal() we write UBIGINTNULL into the result row. If however val equals UBIGINTNULL (or 2^64 - 2) the output row also gets written with UBIGINTNULL and later on wrongly displayed as NULL and not with the expected numeric value of 2^64 - 2.

At the core of the issue is that we cannot store the whole range of [U]BIGINTs, because we have blocked some values by constant markers. It is not possible to insert values which are assigned constant markers into a BIGINT column (this is covered by a Out of range value for column error), but it is possible that CAST operation results take on the blocked values and are then wrongly/unexpectedly displayed as NULL.

image

Besides the issues from casting DECIMAL->BIGINT and DECIMAL->UBIGINT, I've also been able to find the same issue for - - UBIGINT->BIGINT

  • TEXT/VARCHAR/etc. -> BIGINT/UBIGINT
  • DOUBLE-> BIGINT/UBIGINT.

(All cases are documented in the mcol_4632 and mcol_4648 files in mysql-test/columnstore/bugfixes.)

How?

Main changes are in func_cast.cpp where for Func_cast_signed::getIntVal() and Func_cast_unsigned::getUintVal() I checked and updated every switch case to include a check for unallowed return values which are now encoded as joblist::MIN_MCS_BIGINT and joblist::MAX_MCS_UBIGINT.

Update: The previously described approach would have been the preferred way to ensure consistency. (Float values were already handled that way, meaning that one in the current version cannot cast to neither [U]BIGINTEMPTYROW nor [U]BIGINTNULL. However, as the EMPTYROW constant gets displayed as the correct INT value, there are some regression tests expecting to be able to cast to the EMPTYROW constant. (And some tests actually expecting a cast to NULL, but we are going to fix these at least.)

Therefore the new approach will be to allow casting to EMPTYROW constants from all datatypes (including floats) and disallow casting to NULL constants. All of this again handled in Func_cast_signed::getIntVal() and Func_cast_unsigned::getUintVal() and the respective casting methods per type.

Testing?

All cases are tested with regression tests inmcol_4632 and mcol_4648 files in mysql-test/columnstore/bugfixes.

@phoeinx phoeinx force-pushed the mcol-4632-4648-int-casts branch 2 times, most recently from 26891f8 to 2bd1f98 Compare July 3, 2023 13:29
@drrtuy drrtuy self-requested a review July 3, 2023 14:41
Remove redundant cast.

As C-style casts with a type name in parantheses are interpreted as static_casts this literally just changes the interpretation around (and forces an implicit cast to match the return value of the function).

Switch UBIGINTNULL and UBIGINTEMPTYROW constants for consistency.

Make consistent with relation between BIGINTNULL and BIGINTEMPTYROW & make adapted cast behaviour due to NULL markers more intuitive. (After this change we can simply block the highest possible uint64_t value and if a cast results in it, print the next lower value (2^64 - 2). Previously, (2^64 - 1) was able to be printed, but (2^64 - 2) as being blocked by the UBIGINTNULL constant was not, making finding the appropiate replacement value to give out more confusing.

Introduce MAX_MCS_UBIGINT and MIN_MCS_BIGINT and adapt casts.

Adapt casting to BIGINT to remove NULL marker error.

Add bugfix regression test for MCOL 4632

Add regression test for mcol_4648

Revert "Switch UBIGINTNULL and UBIGINTEMPTYROW constants for consistency."

This reverts commit 83eac11.
Due to backwards compatability issues.

Refactor casting to MCS[U]Int to datatype functions.

Update regression tests to include other affected datatypes.

Apply formatting.

Refactor according to PR review

Remove redundant new constant, switch to using already existing constant.

Adapt nullstring casting to EMPTYROW markers for backwards compatability.

Adapt tests for backward compatability behaviour allowing text datatypes to be casted to EMPTYROW constant.

Adapt mcol641-functions test according to bug fix.

Update tests according to new expected behaviour.

Adapt tests to new understanding of issue.

Update comments/documentation for MCOL_4632 test.

Adapt to new cast limit logic.

Make bracketing consistent.

Adapt previous regression test to new expected behaviour.
@drrtuy drrtuy merged commit c97c8b6 into mariadb-corporation:develop Aug 11, 2023
1 check was pending
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants