[SPARK-29188][PYTHON] toPandas (without Arrow) gets wrong dtypes when applied on empty DF #26747
What changes were proposed in this pull request?
Converting an empty Spark DataFrame to a pandas DataFrame (without Arrow) did not produce the right column dtypes, because several type mappings were missing.
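The following is a minimal pandas-only sketch of the underlying issue, not the code from this patch. With no rows to inspect, pandas cannot infer dtypes and falls back to `object`, so the dtypes have to be derived from the schema instead. The mapping `SPARK_TO_PANDAS_DTYPE` and the helper `empty_frame_with_dtypes` are hypothetical names introduced here for illustration, and the schema is given as plain `(name, type name)` pairs rather than a real Spark schema.

```python
import pandas as pd

# Hypothetical mapping from Spark SQL type names to pandas/NumPy dtype strings.
# This is an illustrative assumption, not the mapping used inside PySpark.
SPARK_TO_PANDAS_DTYPE = {
    "tinyint": "int8",
    "smallint": "int16",
    "int": "int32",
    "bigint": "int64",
    "float": "float32",
    "double": "float64",
    "boolean": "bool",
}


def empty_frame_with_dtypes(schema):
    """Build an empty pandas DataFrame whose dtypes follow `schema`.

    `schema` is a list of (column_name, spark_type_name) pairs.
    Types without an entry in the mapping keep the pandas default.
    """
    names = [name for name, _ in schema]
    # No rows means no values to infer from, so every column starts as object.
    pdf = pd.DataFrame.from_records([], columns=names)
    corrections = {
        name: SPARK_TO_PANDAS_DTYPE[type_name]
        for name, type_name in schema
        if type_name in SPARK_TO_PANDAS_DTYPE
    }
    # Cast only the columns for which a mapping is known.
    return pdf.astype(corrections)


schema = [("id", "bigint"), ("score", "double"), ("ok", "boolean"), ("name", "string")]

naive = pd.DataFrame.from_records([], columns=[name for name, _ in schema])
fixed = empty_frame_with_dtypes(schema)

print(naive.dtypes)  # dtypes inferred from zero rows
print(fixed.dtypes)  # dtypes taken from the schema
```

Casting from the schema rather than from the data is what makes the result independent of whether the DataFrame happens to contain rows.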
Why are the changes needed?
Empty Spark DataFrames are useful in unit tests, where results are often verified by converting them to pandas first. That verification can fail when the resulting column dtypes are wrong.
Does this PR introduce any user-facing change?
Yes; the error reported in the JIRA issue should not happen anymore.
How was this patch tested?
Through unit tests in
@srowen This illustrates the current behaviour, where an empty Spark DataFrame with a column of type
When the dataframe is not empty, this is what you see: