Preserve Complex Data Types for to_csv #61157
base: main
@Jaspvr is there an existing issue that this PR addresses? If so, could you list it in the description? If not, please create an issue describing the bug or proposed enhancement so it can be reviewed by a team member.
Thanks for the PR! You may want to review our development documentation, namely this section:
https://pandas.pydata.org/pandas-docs/dev/development/contributing_codebase.html#writing-tests
@@ -3858,6 +3859,11 @@ def to_csv(
{storage_options}

preserve_complex : bool, default False
As commented in the issue, you can already use the dtype argument in read_csv to read complex values. I'm negative on this approach.
This PR introduces a useful feature for preserving complex data types like NumPy arrays during CSV serialization/deserialization via the [...] Two minor points for consideration: Serialization Logic ( [...]
This pull request addresses the issue outlined in #60895.
Complex data types like NumPy arrays can now be stored in CSV format and later recovered.
A new parameter, preserve_complex, is introduced. When it is set to True in a to_csv call, complex data types are preserved and can be recovered from the CSV.
This works by serializing NumPy arrays to JSON when preserve_complex=True. To read them back, set the same parameter in read_csv, and the original NumPy arrays are returned.
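For readers who want to see the JSON round trip in isolation, here is a minimal sketch that uses only existing pandas and NumPy calls. It does not use preserve_complex, which is the parameter proposed in this PR and is not assumed to exist in a released pandas. The helper names encode_arrays and decode_array are hypothetical and chosen for illustration only, and the PR's actual implementation may differ.

```python
import io
import json

import numpy as np
import pandas as pd


def encode_arrays(df):
    """Return a copy of df with ndarray cells replaced by JSON strings.

    Hypothetical helper: it mimics the serialization step described above.
    """
    def encode_cell(value):
        if isinstance(value, np.ndarray):
            return json.dumps(value.tolist())
        return value

    out = df.copy()
    for col in out.columns:
        # Only object columns can hold ndarray cells.
        if out[col].dtype == object:
            out[col] = out[col].map(encode_cell)
    return out


def decode_array(text):
    """Hypothetical helper: parse one JSON-encoded cell back into an ndarray."""
    return np.array(json.loads(text))


df = pd.DataFrame(
    {
        "id": [1, 2],
        "vec": pd.Series([np.array([1, 2, 3]), np.array([4.5, 6.0])], dtype=object),
    }
)

# Write: arrays become JSON strings, which the CSV writer quotes as needed.
buffer = io.StringIO()
encode_arrays(df).to_csv(buffer, index=False)

# Read: the converters argument of read_csv decodes the chosen column cell by cell.
buffer.seek(0)
restored = pd.read_csv(buffer, converters={"vec": decode_array})
```

Note that the reader has to know which columns hold encoded arrays. Here that knowledge is passed explicitly through converters, and the CSV file itself carries no marker for it.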
Please refer to tests in scripts/tests/test_csv.py to see how this is used.
Please refer to the original issue for more information on the problem definition.