TIMESTAMP_MICROS handling #17222
Status: Open
Labels: from-jira, priority:critical (Production degraded; pipelines stalled), type:devtask (Development tasks and maintenance work)
Hi Guys!
I am not able to save timestamp columns with microsecond precision in Hudi.
I would like to keep microsecond granularity, but only millisecond precision is preserved.
I have set this:
--conf spark.sql.parquet.outputTimestampType=TIMESTAMP_MICROS
and also this Hudi write option:
"hoodie.parquet.outputtimestamptype": "TIMESTAMP_MICROS",
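For reference, a minimal sketch of how both settings would be applied together in a PySpark write (the table name and base path are hypothetical placeholders; the two config keys are the ones quoted above, not additional options):

```python
from pyspark.sql import SparkSession

# Session-level Parquet timestamp type, as set via --conf above.
spark = (
    SparkSession.builder
    .config("spark.sql.parquet.outputTimestampType", "TIMESTAMP_MICROS")
    .getOrCreate()
)

# Hudi writer options; table name and path below are placeholders.
hudi_options = {
    "hoodie.table.name": "my_table",
    "hoodie.parquet.outputtimestamptype": "TIMESTAMP_MICROS",
}

# `df` is assumed to be a DataFrame with a microsecond-precision "ts" column.
df.write.format("hudi").options(**hudi_options).mode("append").save("/tmp/my_table")
```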
but when I read it back (with PySpark, via the load API), it only has millisecond precision, and unfortunately I need microseconds in some cases, because without them I run into a Schrödinger's cat situation:
an entity has more than one state at the same time! Can someone enlighten me on what I should do?
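To illustrate why the truncation causes this "two states at the same time" effect, here is a small standalone Python sketch (plain `datetime`, no Spark or Hudi involved): two events one microsecond apart are distinct, but become indistinguishable once sub-millisecond precision is dropped.

```python
from datetime import datetime

def truncate_to_millis(ts: datetime) -> datetime:
    """Drop sub-millisecond precision, as a millisecond-only store would."""
    return ts.replace(microsecond=(ts.microsecond // 1000) * 1000)

# Two state changes of the same entity, 1 microsecond apart.
t1 = datetime(2022, 5, 12, 10, 0, 0, 123456)
t2 = datetime(2022, 5, 12, 10, 0, 0, 123457)

assert t1 != t2                                          # distinct at microsecond precision
assert truncate_to_millis(t1) == truncate_to_millis(t2)  # collide at millisecond precision
```

Both states now carry the same timestamp, so ordering them by "ts" alone is no longer possible.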
Before the save, everything is fine! (the "ts" column)
Darvi
SLACK Thread: [https://apache-hudi.slack.com/archives/C4D716NPQ/p1652347742173779]