Company or project name
No response
Describe what's wrong
I have a Parquet file with a column of:
physical_type: INT64
logical_type: Timestamp(isAdjustedToUTC=false, timeUnit=milliseconds, is_from_converted_type=false, force_set_converted_type=false)
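The inferred column type differs depending on input_format_parquet_use_native_reader_v3: with the setting disabled the column is read as Nullable(DateTime64(3)), and with it enabled as Nullable(DateTime64(3, 'UTC')).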
input_format_parquet_use_native_reader_v3=0
DESCRIBE TABLE file('standalone_timestamp.parquet', 'Parquet')
SETTINGS input_format_parquet_use_native_reader_v3 = 0
   ┌─name─┬─type────────────────────┬─default_type─┬─default_expression─┬─comment─┬─codec_expression─┬─ttl_expression─┐
1. │ ts   │ Nullable(DateTime64(3)) │              │                    │         │                  │                │
   └──────┴─────────────────────────┴──────────────┴────────────────────┴─────────┴──────────────────┴────────────────┘
input_format_parquet_use_native_reader_v3=1
DESCRIBE TABLE file('standalone_timestamp.parquet', 'Parquet')
SETTINGS input_format_parquet_use_native_reader_v3 = 1
   ┌─name─┬─type───────────────────────────┬─default_type─┬─default_expression─┬─comment─┬─codec_expression─┬─ttl_expression─┐
1. │ ts   │ Nullable(DateTime64(3, 'UTC')) │              │                    │         │                  │                │
   └──────┴────────────────────────────────┴──────────────┴────────────────────┴─────────┴──────────────────┴────────────────┘
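For context, here is a minimal stdlib-only sketch of why the two inferred types are not interchangeable. It assumes the usual Parquet reading of isAdjustedToUTC=false as a wall-clock value with no time zone attached. The helper names and the +03:00 offset are illustrative assumptions and are not taken from ClickHouse or pyarrow.

```python
from datetime import datetime, timedelta, timezone

EPOCH_NAIVE = datetime(1970, 1, 1)
EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_millis(wall_clock: datetime) -> int:
    """Encode a naive wall-clock datetime as INT64 milliseconds (hypothetical helper)."""
    return (wall_clock - EPOCH_NAIVE) // timedelta(milliseconds=1)


def read_as_local(millis: int) -> datetime:
    """Interpret the stored value as a zone-less wall-clock time (isAdjustedToUTC=false)."""
    return EPOCH_NAIVE + timedelta(milliseconds=millis)


def read_as_utc(millis: int) -> datetime:
    """Interpret the same stored value as an instant on the UTC timeline."""
    return EPOCH_UTC + timedelta(milliseconds=millis)


# Same value as the first row of the repro file.
stored = to_millis(datetime(2023, 1, 1, 12, 0, 0))

# Illustrative session time zone, assumed for this sketch only.
session_tz = timezone(timedelta(hours=3))

print(read_as_local(stored))                       # wall-clock reading, no zone
print(read_as_utc(stored).astimezone(session_tz))  # UTC reading shown in +03:00
```

Under the UTC reading, the displayed wall-clock time shifts with the session time zone, while the zone-less reading stays at the value that was written.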
Does it reproduce on the most recent release?
Yes
How to reproduce
ClickHouse version: 25.8.4.13
Settings: input_format_parquet_use_native_reader_v3=1
I could not produce a repro file using ClickHouse alone, but the following pyarrow script generates one:
import datetime
import os

import pyarrow as pa
import pyarrow.parquet as pq

OUT_PATH = "test_decimal/standalone_timestamp.parquet"
os.makedirs(os.path.dirname(OUT_PATH), exist_ok=True)

# Millisecond timestamp with no time zone; this is the column reported above
# as INT64 with isAdjustedToUTC=false.
timestamp_type = pa.timestamp('ms', tz=None)
values = [datetime.datetime(2023, 1, 1, 12, 0, 0), None, datetime.datetime(2023, 1, 2, 13, 0, 0)]
arr = pa.array(values, type=timestamp_type)
table = pa.Table.from_arrays([arr], schema=pa.schema([pa.field("ts", timestamp_type, nullable=True)]))

# Write a single nullable "ts" column, uncompressed and without dictionary encoding.
pq.write_table(table, OUT_PATH, version="1.0", compression=None, use_dictionary=False)
Expected behavior
No response
Error message and/or stacktrace
No response
Additional context
No response