to_parquet when column with only null values #8797
Closed
Labels
dataframe, needs info (needs further information from the user), parquet
Description
I created a file named 'system_data.txt' with the following content:
SYSTEM_DATE,col1
Test1,Test1
Test2,Test2
,Test3
This code produces an error:
import dask.dataframe as dd
df = dd.read_csv('system_data.txt', dtype={"SYSTEM_DATE":'str', 'col1':'str'})
df = df.repartition(npartitions=3)
df.to_parquet('test')
What happened:
RuntimeError: AppendRowGroups requires equal schemas.
With the fastparquet engine, the same code runs without errors.
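As a rough illustration of where the unequal schemas might come from, the sketch below uses only the standard library to mimic the repartitioning. It assumes the three rows end up one per partition, which is an assumption about how `repartition(npartitions=3)` splits this file, and the helper `all_null_columns` is hypothetical. If that holds, the last partition contains only a missing value in SYSTEM_DATE, so an engine that infers types per partition could plausibly pick a null type there instead of a string type.

```python
import csv
import io

# Contents of system_data.txt from the report above.
SAMPLE = "SYSTEM_DATE,col1\nTest1,Test1\nTest2,Test2\n,Test3\n"


def all_null_columns(rows, columns):
    """Return the columns that are empty in every row of one partition."""
    return [c for c in columns if all(row[c] == "" for row in rows)]


reader = csv.DictReader(io.StringIO(SAMPLE))
rows = list(reader)
columns = reader.fieldnames

# Assumption: three rows spread over three partitions, one row each.
partitions = [rows[i:i + 1] for i in range(len(rows))]

# For each partition, list the columns that hold no values at all.
report = [all_null_columns(part, columns) for part in partitions]
print(report)
```

Only the third entry of `report` names a column, which matches the issue title: a partition in which one column holds nothing but nulls.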
What you expected to happen:
The parquet file should be created without errors.
Environment:
- Dask version: 2022.2.1
- Python version: 3.8
- PyArrow version: 7.0.0
- Operating System: Linux
- Install method: pip