To reproduce:
Create a Parquet file in PySpark:
{code:python}
from pyspark.sql import functions as F

columns = ["col_A", "date_string"]
data = [("Java", "2020-01-30"), ("Python", "2020-01-31"), ("Scala", "2020-02-01")]
df = spark.createDataFrame(data).toDF(*columns)

# Cast as date and add as a new column
df2 = df.select(F.col("col_A"), F.col("date_string"), F.to_date(F.col("date_string"), "yyyy-MM-dd").alias("date_converted"))

# Save on HDFS
df2.write.save('date_for_testing_100643.parquet', format='parquet')
{code}
Load the Parquet file in H2O / Sparkling Water (SW):
{noformat}
# Load in Sparkling Water
import h2o  # needed for h2o.import_file below
from pysparkling import *

hc = H2OContext.getOrCreate()

# Load parquet as H2O frame
h2o.import_file('hdfs://mr-0xg10.0xdata.loc:8020/user/neema/date_for_testing_100643.parquet')
{noformat}
Returns:
!image-20211029-013413.png|width=867,height=119!
^ date_converted should be a date, not an int.
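For context, a minimal sketch of why an int might show up here. Parquet stores its DATE logical type as a 32-bit integer counting days since 1970-01-01, so a reader that ignores the logical type sees a plain int. This assumes the ints in the screenshot are those raw day counts, which the issue text does not confirm. The helper names `days_to_date` and `date_to_days` are hypothetical and are not part of H2O or PySpark.

```python
from datetime import date, timedelta

# Parquet's DATE logical type counts days from the Unix epoch.
EPOCH = date(1970, 1, 1)


def days_to_date(days: int) -> date:
    """Convert a days-since-epoch integer back to a calendar date."""
    return EPOCH + timedelta(days=days)


def date_to_days(d: date) -> int:
    """Convert a calendar date to days since the Unix epoch."""
    return (d - EPOCH).days


if __name__ == "__main__":
    # The three dates from the reproduction data above.
    for s in ["2020-01-30", "2020-01-31", "2020-02-01"]:
        n = date_to_days(date.fromisoformat(s))
        print(s, "->", n, "->", days_to_date(n).isoformat())
```

If the imported column does hold such day counts, a conversion like this could serve as a stopgap on the client side until the column is typed correctly on import.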
JIRA Issue Details
Jira Issue: PUBDEV-8397
Assignee: krasinski
Reporter: Neema Mashayekhi
State: Resolved
Fix Version: 3.34.0.5
Attachments: Available (Count: 2)
Development PRs: Available
Attachments From Jira
Attachment Name: date_for_testing_100643.parquet.zip
Attached By: Neema Mashayekhi
File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-8397/date_for_testing_100643.parquet.zip

Attachment Name: image-20211029-013413.png
Attached By: Neema Mashayekhi
File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-8397/image-20211029-013413.png
Linked PRs from JIRA
#5884