to_clipboard is no longer Excel compatible #12529

Closed
dalito opened this Issue Mar 5, 2016 · 8 comments

dalito commented Mar 5, 2016

Using pandas-0.18.0rc1-cp27-cp27m-win_amd64.whl from Christoph Gohlke's site:

import pandas as pd
df = pd.DataFrame({u'a':[1,2], u'degC_°C':[3,4]})
df.to_clipboard()
df.to_clipboard(excel=True, sep='\t')
df.to_clipboard(excel=False, sep='\t')
df.to_clipboard(sep=',')

Always gives the same content in the clipboard:

   a  degC_°C
0  1        3
1  2        4

which cannot be pasted into Excel because it does not use a tab as the separator.
Copy/paste worked fine in pandas 0.17.1 except for unicode characters.

Output of pd.show_versions():

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.11.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: AMD64 Family 22 Model 0 Stepping 1, AuthenticAMD
byteorder: little
LC_ALL: None
LANG: de_DE

pandas: 0.18.0rc1
nose: None
pip: 8.0.2
setuptools: 19.6.1
Cython: 0.23.4
numpy: 1.10.4
scipy: 0.17.0
statsmodels: None
xarray: None
IPython: 4.1.0rc1
sphinx: 1.3.5
patsy: None
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8

Contributor

jreback commented Mar 8, 2016

Copy/paste worked fine in pandas 0.17.1 except for unicode characters.

and the clipboard routines have been updated to use unicode, see #9263

Contributor

jreback commented Mar 8, 2016

AFAICT this didn't work previously. Is the issue that the sep is not getting passed?

dalito commented Mar 8, 2016

Yes, the new issue in 0.18.0rc1 (vs. 0.17.x) is that the clipboard content no longer has a separator (only spaces). Even when I explicitly pass a separator via the keyword argument, that separator is not used.

Contributor

jreback commented Mar 8, 2016

OK, that certainly could be an issue. Can you step through it and see where it's not being passed?

Member

jorisvandenbossche commented Mar 8, 2016

Note that this only happens if there are unicode characters. If you leave out the °, it still works as expected, using tabs.
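
To check whether a given frame would hit this case, one option is to look for non-ASCII characters in its labels. This is only a sketch: `non_ascii_labels` is a hypothetical helper, not part of pandas, and it is written for Python 3.

```python
def non_ascii_labels(labels):
    """Return the labels that contain at least one non-ASCII character."""
    found = []
    for label in labels:
        text = str(label)
        # Code points above 127 cannot be encoded by the 'ascii' codec.
        if any(ord(ch) > 127 for ch in text):
            found.append(text)
    return found


print(non_ascii_labels(["a", "degC_°C"]))
```

With a DataFrame, the same helper could presumably be called as `non_ascii_labels(df.columns)`; cell values would need a similar check.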

dalito commented Mar 8, 2016

OK. I have stepped through the sequence of calls and found that everything works fine if an encoding is specified when calling to_clipboard:

df = pd.DataFrame({u'a':[1,2], u'degC_°C':[3,4]})
df.to_clipboard(encoding='cp1252')

Without an encoding, a UnicodeEncodeError ('ascii' codec can't encode character...) is raised in to_csv, which is called from to_clipboard. The error is suppressed, so the user has no idea what went wrong or that an encoding should be passed as a kwarg to to_clipboard. Another consequence of suppressing the error is that a separator passed as a kwarg is ignored.

I am not sure how best to fix this. Either an error could be raised with a message that an encoding is needed, or a default encoding could be used (cp1252 on Windows, utf8 everywhere else?).
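
The effect described above can be reproduced without pandas. The following is a minimal Python 3 sketch, not the pandas source: `to_delimited`, `to_aligned` and `copy_text` are hypothetical stand-ins, and the `encoding="ascii"` default is an assumption that imitates the implicit ASCII codec of Python 2.

```python
def to_delimited(rows, sep="\t", encoding="ascii"):
    """Join cells with sep; fail if the text does not fit the encoding."""
    text = "\n".join(sep.join(str(cell) for cell in row) for row in rows)
    text.encode(encoding)  # raises UnicodeEncodeError for e.g. '°' with ascii
    return text


def to_aligned(rows):
    """Space-separated fallback, similar in spirit to printing the frame."""
    return "\n".join("  ".join(str(cell) for cell in row) for row in rows)


def copy_text(rows, sep="\t", encoding="ascii"):
    """Swallow any error and fall back, so sep is silently dropped."""
    try:
        return to_delimited(rows, sep=sep, encoding=encoding)
    except Exception:
        return to_aligned(rows)


rows = [["a", "degC_°C"], [1, 3], [2, 4]]
print("\t" in copy_text(rows))                     # False: fallback was used
print("\t" in copy_text(rows, encoding="cp1252"))  # True: tabs survive
```

Because the `except` branch never reports anything, the caller cannot tell that the separator was discarded, which matches the behaviour reported in this issue.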

Contributor

jreback commented Mar 8, 2016

Ahh, OK.

Why don't we catch the encoding error and then raise a helpful message that you need to pass an encoding (rather than just going on)?
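
A catch-and-re-raise of this kind might look like the sketch below. It is written for Python 3 and is not the change that was eventually merged; `to_delimited` is a hypothetical function, and the `encoding="ascii"` default is an assumption standing in for the Python 2 behaviour.

```python
def to_delimited(rows, sep="\t", encoding="ascii"):
    """Join cells with sep and raise a descriptive error on encoding failure."""
    text = "\n".join(sep.join(str(cell) for cell in row) for row in rows)
    try:
        text.encode(encoding)
    except UnicodeEncodeError as err:
        # Surface the problem instead of silently falling back.
        raise ValueError(
            "cannot encode clipboard text as %r; pass an explicit encoding, "
            "for example encoding='cp1252'" % encoding
        ) from err
    return text


rows = [["a", "degC_°C"], [1, 3], [2, 4]]
try:
    to_delimited(rows)
except ValueError as exc:
    print(exc)
```

Chaining with `from err` keeps the original UnicodeEncodeError in the traceback, so the offending character can still be identified.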

dalito commented Mar 8, 2016

Yes, that is what I thought, too. I'll prepare a PR.

@jreback jreback added this to the 0.18.1 milestone Mar 8, 2016

@jreback jreback modified the milestones: 0.18.2, 0.18.1 Apr 25, 2016

@jreback jreback modified the milestones: 0.19.0, 0.19.1 Sep 28, 2016

@jorisvandenbossche jorisvandenbossche modified the milestones: 0.20.0, 0.19.1 Oct 29, 2016

@jreback jreback closed this in 4a1a330 Nov 18, 2016

@jreback jreback modified the milestones: 0.19.2, 0.20.0 Nov 18, 2016

amolkahat added a commit to amolkahat/pandas that referenced this issue Nov 26, 2016

BUG in clipboard (linux, python2) with unicode and separator (GH13747)
vendered updated version of Pyperclip

closes #13747
closes #14362
closes #12807
closes #12529

Author: Ajay Saxena <ajasaxen@Ajays-MacBook-Pro.local>
Author: Ajay Saxena <aileronajay@gmail.com>

Closes #14599 from aileronajay/master and squashes the following commits:

2aafb66 [Ajay Saxena] moved comment inside test and added github issue labels to test
b74fbc1 [Ajay Saxena] ignore lint test for pyperclip files
9db42d8 [Ajay Saxena] whatsnew conflict
1dca292 [Ajay Saxena] conflict resolution
98b61e8 [Ajay Saxena] merge conflict
cedb690 [Ajay Saxena] merge conflict in whats new file
7af95da [Ajay Saxena] merging lastest changes
ac8ae60 [Ajay Saxena] skip clipboard test if clipboard primitives are absent
b03ed56 [Ajay Saxena] changed whatsnew file
c0aafd7 [Ajay Saxena] Merge branch 'test_branch'
9946fb7 [Ajay Saxena] Merge branch 'master' of https://github.com/pandas-dev/pandas into test_branch
ed1375f [Ajay Saxena] Merge branch 'test_branch'
0665fd4 [Ajay Saxena] fixed linting and test case as per code review
d202fd0 [Ajay Saxena] added test for valid encoding, modified setup.py so that pandas/util/clipboard can be found
dd57ae3 [Ajay Saxena] code review changes and read clipboard invalid encoding test
71d58d0 [Ajay Saxena] testing encoding in kwargs to to_clipboard and test case for the same
02f87b0 [Ajay Saxena] removed duplicate files
825bbe2 [Ajay Saxena] all files related to pyperclip are under pandas.util.clipboard
c5a87d8 [Ajay Saxena] Merge branch 'test_branch' of https://github.com/aileronajay/pandas into test_branch
f708c2e [Ajay Saxena] Merge branch 'master' of https://github.com/aileronajay/pandas
d565b1f [Ajay Saxena] updated pyperclip to the latest version
14d94a0 [Ajay Saxena] changed the pandas util clipboard file to return unicode if the python version is 2, else str
66d8ebf [Ajay Saxena] removed the disabled tag for clipboard test so that we can check if they pass after this change
edb8553 [Ajay Saxena] refactored the new unicode test to be in sync with the rest of the file
c83d000 [Ajay Saxena] added test case for unicode round trip
fb922d6 [Ajay Saxena] changes for GH 13747

jorisvandenbossche added a commit to jorisvandenbossche/pandas that referenced this issue Dec 14, 2016

(cherry picked from commit 4a1a330)