I am also facing this issue in Hudi version 0.11.1. The strange thing is that it occurs on some tables some of the time, but not on all of the Hudi tables I am managing.
Hi Hudi Team,
Is it possible to change the behaviour of Hudi when specifying the hoodie.datasource.write.partitionpath.field configuration for a table? I notice that the data is partitioned as expected; however, the dataset also contains the columns that were specified in the hoodie.datasource.write.partitionpath.field configuration. This behaviour differs from the native spark.write.partitionBy operation, which partitions the data on the specified columns and removes those columns from the dataset. Is there a way to match this behaviour? Here is an example of the behaviour I am referring to: https://stackoverflow.com/questions/36164914/prevent-dataframe-partitionby-from-removing-partitioned-columns-from-schema/47104251
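For reference, recent Hudi releases expose a write config, hoodie.datasource.write.drop.partition.columns, that is intended to produce the partitionBy-like behaviour described above. A minimal PySpark-style sketch (untested; the table name, record key, and partition columns are hypothetical placeholders):

```python
# Sketch: Hudi write options that ask Hudi not to persist the partition
# columns inside the data files, mirroring spark.write.partitionBy.
# Assumes a Hudi release where hoodie.datasource.write.drop.partition.columns
# is available; all field names below are illustrative only.
hudi_options = {
    "hoodie.table.name": "my_table",                          # hypothetical
    "hoodie.datasource.write.recordkey.field": "id",          # hypothetical
    "hoodie.datasource.write.partitionpath.field": "region",  # hypothetical
    # Drop the partition column(s) from the stored dataset:
    "hoodie.datasource.write.drop.partition.columns": "true",
}

# With a live SparkSession and Hudi bundle on the classpath, the write
# itself would look something like:
# df.write.format("hudi").options(**hudi_options).mode("append").save(base_path)
```

Note that the partition values then live only in the partition path, so readers must rely on Hudi/Spark re-deriving those columns at query time.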
Cheers,
Brandon Stanley