BUG in clipboard (linux, python2) with unicode and separator #13747
Comments
jreback
closed this
Jul 22, 2016
jreback
added Data IO Unicode Compat
labels
Jul 22, 2016
jreback
added this to the
No action
milestone
Jul 22, 2016
|
OK. Actually, I saw that but thought it was purely windows related. The bug here is an incorrect use of |
|
I get the same on macosx / py2. so your report is prob better here. actually we cannot repro this on the builds anyhow which would be ideal. ok will re-open and make this an xref issue of that one. |
jreback
reopened this
Jul 22, 2016
jreback
modified the milestone: Next Major Release, No action
Jul 22, 2016
jreback
added the
Bug
label
Jul 22, 2016
|
I see now that Apparently, it also solves #12529. So, indeed, these issues are more closely related than I thought. |
jlou2u
commented
Sep 15, 2016
|
+1 to fix this when possible. It's the only test that fails for me (OSX) in the codebase. More of an annoyance but still...
I'd also suggest not changing/setting default values for kwargs when to_clipboard is called; it seems confusing at best, and I think functionality is unchanged (proposed fix in jlou2u/pandas@01277af that includes pijucha's change).
I haven't done anything with travis before, but looking at .travis.yml it seems that xsel is only added onto python 3 builds, not python 2 builds. I think pandas.util.clipboard will raise an import error if it can't find a clipboard utility, and test_clipboard.py will raise nose.SkipTest("no clipboard found") if it can't find a clipboard. That's my best guess at why this can't be reproduced in the builds. |
|
This test is failing for me on OSX (with the latest code). |
|
I was able to change the _copyOSX function in pandas.util.clipboard.py def _copyOSX(text): to make the test pass. The test fails because to_clipboard fails for the data frame and falls back to copying its string representation to the clipboard. to_clipboard fails because calling encode tries to convert from ASCII to UTF-8, but the str is already UTF-8 encoded when the dataframe contains non-ASCII characters. When it tries to read a non-ASCII character as ASCII, we get a UnicodeDecodeError. By catching the UnicodeDecodeError and passing the string as is (it is already UTF-8 encoded), we can make it work. |
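To illustrate the failure described in the comment above, here is a minimal sketch written for Python 3. It is not the change from any of the referenced commits. The helper names py2_style_encode and encode_or_pass_through are hypothetical; the first one only imitates what Python 2 does implicitly when encode is called on a byte string (decode as ASCII first, then encode), and the second shows the catch-and-pass-through idea.

```python
def py2_style_encode(data, encoding="utf-8"):
    """Imitate Python 2's str.encode() (hypothetical helper for illustration).

    In Python 2, calling .encode() on a byte string first decodes it with the
    default codec (ASCII), which fails on any non-ASCII byte.
    """
    if isinstance(data, bytes):
        data = data.decode("ascii")  # the implicit step Python 2 performs
    return data.encode(encoding)


def encode_or_pass_through(data):
    """Workaround from the comment: on UnicodeDecodeError, use the bytes as is."""
    try:
        return py2_style_encode(data)
    except UnicodeDecodeError:
        # The data was already encoded bytes, so hand it over unchanged.
        return data


already_utf8 = u"\u00e9|b".encode("utf-8")  # non-ASCII text, already encoded

try:
    py2_style_encode(already_utf8)
    failed = False
except UnicodeDecodeError:
    failed = True

print(failed)
print(encode_or_pass_through(already_utf8) == already_utf8)
```

Catching the exception works, but it relies on an error to detect the input type; checking the type up front is an alternative.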
|
@aileronajay The thing is the file If I have some time this week I can submit a PR. Unless someone else can do it faster and better. |
aileronajay
referenced
this issue
Nov 7, 2016
Closed
BUG in clipboard (linux, python2) with unicode and separator (GH13747) #14599
|
created a pull request (containing the same change as your commit pijucha@e53dcb0 ), fb922d6 |
jreback
closed this
in 4a1a330
Nov 18, 2016
jreback
modified the milestone: 0.19.2, Next Major Release
Nov 18, 2016
amolkahat
added a commit
to amolkahat/pandas
that referenced
this issue
Nov 26, 2016
|
|
+ amolkahat |
f96e472
|
jorisvandenbossche
added a commit
to jorisvandenbossche/pandas
that referenced
this issue
Dec 14, 2016
|
|
+ jorisvandenbossche |
6f55ab9
|
pijucha commented Jul 22, 2016 (edited)
This is probably a known bug but I couldn't find a github issue.
There is a disabled test in test_clipboard.py which fails with the following error
Code Sample, a copy-pastable example if possible
More explicitly (the example from the above test):
Expected Output
output of pd.show_versions()
There are probably 2 issues in the code.
to_clipboard falls back to to_string method. (In this case, fixing 1 solves the problem. But in general, if something else raises and we fall back here, a separator is ignored.)
I don't know what to do about 2, but 1 seems to be easy.
Part of the code in util.clipboard.py calls subprocess.Popen.communicate(), which operates on byte types (bytes in PY3 and strings in PY2). So, encode/decode are needed only in PY3.
I believe this 6d4fdb0 fixes the problem. But for now I tested only one pair of functions (in KDE) and couldn't possibly test it on OS X.
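The point about Popen.communicate() working on bytes can be sketched as follows. This is an illustration under stated assumptions, not the content of commit 6d4fdb0: the helpers to_bytes and round_trip are hypothetical, and a child Python process that echoes stdin stands in for a real clipboard utility such as xsel so the example runs without one installed. It targets Python 3.

```python
import subprocess
import sys

# Stand-in for a clipboard utility (assumption): a child process that writes
# its stdin back to stdout unchanged.
ECHO_CMD = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())",
]


def to_bytes(text, encoding="utf-8"):
    """Return bytes suitable for Popen.communicate() (hypothetical helper).

    Text is encoded; data that is already bytes is passed through untouched,
    so no implicit ASCII decode can be triggered.
    """
    if isinstance(text, bytes):
        return text
    return text.encode(encoding)


def round_trip(text):
    """Pipe text through the stand-in command and decode what comes back."""
    proc = subprocess.Popen(
        ECHO_CMD, stdin=subprocess.PIPE, stdout=subprocess.PIPE
    )
    out, _ = proc.communicate(input=to_bytes(text))
    return out.decode("utf-8")


if __name__ == "__main__":
    sample = u"a|en\u00e9"  # non-ASCII text with a '|' separator
    print(round_trip(sample) == sample)
```

Encoding only when the input is text keeps the call to communicate() on bytes in every case, which is the property the paragraph above relies on.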