File mode in `to_csv` is ignored when passing a file object instead of a path #19827
Comments
Isn't csv sort of fundamentally a text format? The python csv writer, which we call out to, is ultimately what's choking. Certainly could raise a better error.
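A minimal sketch of where the standard-library writer chokes, using only `csv` and `io`. Here `io.BytesIO` is an assumption standing in for a file object opened in binary mode; it is not the reporter's code.

```python
import csv
import io

# io.BytesIO stands in for a file object opened with mode "wb" (assumption).
binary_buffer = io.BytesIO()
writer = csv.writer(binary_buffer)

try:
    writer.writerow(["a", "b"])
    error = None
except TypeError as exc:
    # The csv module writes str, which a binary buffer rejects.
    error = exc

print(type(error).__name__, error)
```

Nothing reaches the buffer in this case, which is why the failure surfaces as a low-level `TypeError` rather than a pandas-specific message.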
|
It is, but sometimes you have a filesystem-sort-of-interface (like Azure's Data Lake Store one) that requires files to be written in binary mode, regardless of the format. |
Does passing a ADL's file-like object to |
Hey @TomAugspurger , just tried it. Same error:
yields
As of now the workaround is to create a temporary file and upload it explicitly, so it's not like I'm stuck, but it's just ugly. |
I think it works for s3fs because of this line: Line 326 in abc4ef9
Perhaps you can wrap your I'm not sure what the best way to solve this is generically. |
Perhaps just checking the buffer for a |
I can confirm that this does work:
Thanks for the suggestion |
I suppose we could add a mini-example like this to the docs (io.rst). It is a useful case I think. |
Can anyone think of a reason not to layer a TextIOWrapper over any buffer
opened in binary mode? I wonder why the stdlib CSV writer doesn't do this.
On the one hand, the user has explicitly given a binary-mode buffer to a
thing writing text, which is "wrong". But I think there's no ambiguity to
layering a TextIOWrapper on top of the binary buffer. We'll be using the same
default encoding as if the user had just passed a text-mode file.
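The layering idea can be sketched with the standard library alone. This is an illustration under assumptions, not what pandas does internally: `io.BytesIO` stands in for a binary-mode file, and the encoding is spelled out as UTF-8 rather than left to the platform default.

```python
import csv
import io

# A binary buffer stands in for a file object opened in "wb" mode (assumption).
binary_buffer = io.BytesIO()

# Layer a text wrapper over it so a text-oriented writer can use it.
# newline="" leaves line-ending handling to the csv module.
text_wrapper = io.TextIOWrapper(binary_buffer, encoding="utf-8", newline="")

writer = csv.writer(text_wrapper)
writer.writerow(["a", "b"])
writer.writerow([1, 2])

# Flush so the encoded bytes reach the underlying buffer, then detach so
# the wrapper does not close the binary buffer when it is discarded.
text_wrapper.flush()
text_wrapper.detach()

data = binary_buffer.getvalue()
print(data)
```

The same kind of wrapped object is what one would hand to a text-writing API in place of the raw binary handle. The explicit `flush()` matters, since buffered text may otherwise not reach the underlying file.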
On Thu, Feb 22, 2018 at 7:56 PM, Jeff Reback wrote:
I suppose we could add a mini-example like this to the docs (io.rst). It
is a useful case I think.
@colobas <https://github.com/colobas> would you do a PR?
|
@TomAugspurger doesn't the EDIT: I had typed |
I assume you meant `to_csv`. In that case the mode still matters for write
vs. append vs. exclusive. It's just the text vs. binary part of `mode` that
would be "ignored".
Given that the current behavior is so unfriendly (raising a not great error
message), is there a harm in ignoring the text vs. binary part?
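For reference, a small sketch of the write, append, and exclusive behaviours that `mode` would still control even if the text vs. binary flag were ignored. It uses plain `open()` on a temporary path; the file name `out.csv` is only an illustrative assumption.

```python
import os
import tempfile

# Hypothetical target path inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "out.csv")

with open(path, "w") as f:   # write: create the file, or truncate it
    f.write("a,b\n")

with open(path, "a") as f:   # append: keep existing content, add to the end
    f.write("1,2\n")

try:
    open(path, "x")          # exclusive: refuses to touch an existing file
    exclusive_failed = False
except FileExistsError:
    exclusive_failed = True

with open(path) as f:
    contents = f.read()

print(repr(contents), exclusive_failed)
```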
On Fri, Feb 23, 2018 at 8:49 AM, Guilherme Pires wrote:
@TomAugspurger <https://github.com/tomaugspurger> doesn't the mode
parameter in read_csv stop having a purpose in that case?
|
No I think there would be no harm. I can make a PR during the weekend |
Appending to a file doesn't work for text mode either:
Even though However, appending does work when a file name is passed instead of a file handle:
I think this issue came up only recently, because I think this worked with a previous pandas version (although I didn't check). |
I updated my conda install after probably 6-12 months and this issue crept up for me, so I can confirm that it arose recently. Roughly, I was doing the following (essentially as cbrnr describes):
Looking through the installed versions in anaconda2/pkgs/ I see pandas-0.20.1 and pandas-0.23.1. Previously, the code would produce a single file with all the results produced. Now, the resulting file only contains the final results from the last iteration. The above code would work fine with 0.20, but produces the error as cbrnr describes now. |
Setting |
I've downgraded to 0.23.0 to check, and 0.23.0 works as expected. It appears to be specific to 0.23.1. |
Testing on current dev version (pandas: 0.24.0.dev0+232.g04caa569e) on my machine: append works fine. Wrapping the file with TextIOWrapper inside the with open() block as suggested above doesn't raise an error, but doesn't write to file either. binary write mode appears to be an issue with python3 csv module (see @chris-b1 's comment). Is there a particular fix that needs to happen? I'd be happy to work on it. |
Can I work on this issue? |
`import io df.to_excel(towrite) with open("someother.csv", "wb") as f:
` |
Hi folks, I wrote an article on my blog on how to Support Binary File Objects with |
I'm very confused about this, because passing a |
Code Sample, a copy-pastable example if possible
Problem description
When passing a file opened in binary mode to `df.to_csv` and also passing `mode='wb'`, this mode is ignored. I think it's because of these lines: https://github.com/pandas-dev/pandas/blob/master/pandas/io/common.py#L407-L411 and these ones: https://github.com/pandas-dev/pandas/blob/master/pandas/io/formats/format.py#L1660-L1662
It seems that `is_text` isn't passed, and so it assumes the default value of `True`.
Expected Output
A file should just be written.
Output of `pd.show_versions()`
INSTALLED VERSIONS
commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Linux
OS-release: 4.14.20-1-lts
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.22.0
pytest: 3.4.0
pip: 9.0.1
setuptools: 38.5.1
Cython: 0.27.3
numpy: 1.14.0
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.7.0
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2018.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.1.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.6.0
html5lib: 1.0b10
sqlalchemy: 1.2.3
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: 0.1.2
fastparquet: 0.1.3
pandas_gbq: None
pandas_datareader: None