[fix](orc) Round timestamp nanos to micros #64807

Open
xylaaaaa wants to merge 2 commits into apache:branch-4.1 from xylaaaaa:codex/orc-timestamp-prefix

@xylaaaaa

Contributor

Proposed changes

Fix ORC timestamp decoding to round nanoseconds to Doris microseconds instead of truncating them. This keeps CAST(timestamp AS VARCHAR) aligned with Hive/Trino prefix expectations for values like 2020-01-02 03:04:05.321.

The same decode path is used by nested timestamps in array/map/struct columns, so this also covers complex type projections.

Problem summary

ORC stores timestamp fractional seconds as nanoseconds, while Doris DATETIMEV2(6) keeps microseconds. The previous conversion truncated the nanoseconds with integer division (/ 1000), so an ORC value such as 320999999 ns became .320999 instead of .321000. Prefix predicates like:

CAST(ts AS VARCHAR) LIKE '2020-01-02 03:04:05.321%'

could therefore miss rows created by Hive/Trino ORC writers.
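
To make the arithmetic concrete, here is a minimal sketch contrasting the two conversions for the value quoted above. The function names are hypothetical and are not taken from the Doris source; the rounding shown is round-half-up and assumes a non-negative nanosecond field.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not the actual Doris code.

// Old behavior as described: integer division drops the sub-microsecond part.
constexpr int64_t truncate_nanos_to_micros(int64_t nanos) {
    return nanos / 1000;
}

// Rounded conversion: add half a microsecond (500 ns) before dividing.
// Assumes nanos >= 0, as for an ORC fractional-second field.
constexpr int64_t round_nanos_to_micros(int64_t nanos) {
    return (nanos + 500) / 1000;
}

// 320999999 ns truncates to .320999 but rounds to .321000.
static_assert(truncate_nanos_to_micros(320999999) == 320999, "truncation");
static_assert(round_nanos_to_micros(320999999) == 321000, "rounding");
```

With truncation the string form starts with 2020-01-02 03:04:05.320999, so the LIKE prefix ending in .321 does not match; with rounding it does.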

Solution

  • Round ORC nanoseconds to microseconds during timestamp decode.
  • Carry 999999500 ns and above into the next second, since those values round up to a full 1000000 microseconds.
  • Apply the same helper to TIMESTAMP and TIMESTAMP_INSTANT decode paths.
  • Add a BE unit test covering rounding and second carry.
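
The rounding and second-carry steps listed above could be sketched roughly as follows. This is an illustration under stated assumptions, not the helper added in this PR: SecondsAndMicros, nanos_to_rounded_micros, and round_orc_timestamp are hypothetical names, and the nanosecond input is assumed to lie in [0, 999999999].

```cpp
#include <cstdint>

// Hypothetical result type: whole seconds plus a microsecond field in [0, 999999].
struct SecondsAndMicros {
    int64_t seconds;
    int64_t micros;
};

constexpr int64_t kMicrosPerSecond = 1000000;

// Round-half-up conversion; assumes 0 <= nanos <= 999999999.
constexpr int64_t nanos_to_rounded_micros(int64_t nanos) {
    return (nanos + 500) / 1000;
}

// If rounding produces a full second's worth of microseconds,
// carry it into the seconds field and reset the fraction.
constexpr SecondsAndMicros round_orc_timestamp(int64_t seconds, int64_t nanos) {
    return nanos_to_rounded_micros(nanos) >= kMicrosPerSecond
                   ? SecondsAndMicros{seconds + 1,
                                      nanos_to_rounded_micros(nanos) - kMicrosPerSecond}
                   : SecondsAndMicros{seconds, nanos_to_rounded_micros(nanos)};
}

// 999999500 ns is the smallest fraction that carries into the next second.
static_assert(round_orc_timestamp(5, 999999500).seconds == 6, "carry");
static_assert(round_orc_timestamp(5, 999999500).micros == 0, "carry resets fraction");
static_assert(round_orc_timestamp(5, 999999499).micros == 999999, "no carry");
```

Writing the helper as a single function of (seconds, nanos) makes it straightforward to call from more than one decode path, which matches the intent of sharing it between TIMESTAMP and TIMESTAMP_INSTANT.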

Test plan

  • ninja -j 8 doris_be_test
  • ./be/ut_build_RELEASE/test/doris_be_test --gtest_filter='OrcReaderFillDataTest.TestTimestampNanosecondsRoundToMicroseconds'
  • ./be/ut_build_RELEASE/test/doris_be_test --gtest_filter='OrcReaderFillDataTest.*'
  • git diff --check

xylaaaaa added 2 commits June 5, 2026 17:48
Issue Number: close #xxx

Related PR: apache#60915

Problem Summary: Move iceberg_rest_on_hdfs out of the P0 external suite because it requires the dedicated iceberg-rest Docker environment, which is not started by the default community external pipeline.

None

- Test: Manual test
    - Verified the suite is declared only under external_table_p2 with group p2,external
    - Ran git diff --check
    - Regression test not run because the case requires the dedicated iceberg-rest Docker environment
- Behavior changed: No
- Does this need documentation: No
@xylaaaaa xylaaaaa requested a review from yiguolei as a code owner June 25, 2026 02:25
Copilot AI review requested due to automatic review settings June 25, 2026 02:25
@hello-stephen

Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

Copilot AI left a comment

Contributor

Copilot was unable to review this pull request because the user who requested the review has reached their quota limit.
