This repository has been archived by the owner on Jan 11, 2021. It is now read-only.
I'm continuing with my adventures of writing CSV to Parquet, but I got stuck on how to write times/dates to Parquet.
Specifically, how do I declare the schema (assuming I'm using the text format message schema {})?
I read up on the logical types and their mapping to/from data types, so I tried using i64 for my schema, but I think I'm missing something because I don't know how to map the type to a TIMESTAMP.
I also tried Google, to look for the format of the schema, but with no luck (for timestamps). Is there some place that documents this?
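For anyone landing here with the same question, a minimal sketch of the value side of the mapping: per the Parquet logical types specification, a TIMESTAMP_MILLIS value is stored as an INT64 holding milliseconds since the Unix epoch. The helper name `to_timestamp_millis`, the CSV date format, and the assumption that the CSV times are already in UTC are all illustrative and not taken from this thread.

```python
from datetime import datetime, timezone

# Reference point for Parquet timestamps: 1970-01-01T00:00:00 UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_timestamp_millis(text: str, fmt: str = "%Y-%m-%d %H:%M:%S") -> int:
    """Convert a CSV date/time string to milliseconds since the Unix epoch.

    Hypothetical helper: assumes the string is in UTC and matches `fmt`.
    """
    parsed = datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
    delta = parsed - EPOCH
    # Integer arithmetic only, so no float rounding creeps into the i64 value.
    return (delta.days * 86_400 + delta.seconds) * 1000 + delta.microseconds // 1000


if __name__ == "__main__":
    # The resulting integer is what would go into the INT64 column.
    print(to_timestamp_millis("2019-01-01 00:00:00"))  # 1546300800000
```

The integer this returns is what gets written to the i64 column; the logical type annotation in the schema is what tells readers to interpret it as a timestamp.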
Thanks @sadikovi, I was confused by the UTC stuff on the timestamp logical type.
Writing a timestamp now works with message schema {REQUIRED INT64 MyField (TIMESTAMP_MILLIS)}, but I'm unable to read the parquet file back in Pandas or PySpark.
PySpark:
spark.read.parquet("file1.parquet").printSchema()
# this correctly shows the schema as below, but .show() throws an error