UTF-8 string breaks R-session in Windows 10 #90
Comments
This might require some changes to reticulate, unless you can somehow pass a different permutation of the character vector that doesn't trigger the issue. We will investigate and fix this after our next release hits CRAN (which should be within the next few days).
Thanks for the response. Actually, many non-ASCII characters crash the R session, so it would be great if you could fix this issue.
Hello, yesterday I tried a package that uses reticulate to read .msg emails via a module written in Python. For some of the e-mails I tested (they all seem to contain special characters, such as Russian letters), the R session is also aborted (cf. the package and issue at hrbrmstr/msgxtractr#1). My problem may be linked to this issue (I'm also working on Windows 10, but using reticulate's "import" function), so I'm also interested in a fix :)
If you could also send a simple repro of the issue you are running into, that would be appreciated (as then I can ensure that the fix addresses your scenario/configuration as well).
OK, I've created two example emails (.msg): one contains special characters (the same symbols as amatsuo's, plus a smiley) and the other doesn't. Now the R code:
If you need the session info:
Matrix products: default locale: attached base packages: other attached packages: loaded via a namespace (and not attached): |
Thank you! What's particularly interesting about this example is that the string conversion piece appears to bypass reticulate entirely. It's almost as if the Python runtime embedded by reticulate doesn't know enough about the locale, and that might be causing the crash. Will update this when we know more.
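To make the locale hypothesis above easier to check, here is a minimal sketch (assuming Python 3) that prints the encoding-related settings an interpreter reports and confirms that a non-ASCII string survives a UTF-8 round trip. The function names `describe_text_environment` and `utf8_round_trip` are made up for illustration and are not part of reticulate or of the fix; comparing the output of a standalone interpreter with that of the embedded one might show whether the two disagree about the locale.

```python
import locale
import sys


def describe_text_environment():
    """Collect the encoding-related settings this interpreter reports."""
    return {
        # Encoding the locale machinery would pick for text data
        "preferred_encoding": locale.getpreferredencoding(False),
        # Encoding used to convert file names to and from bytes
        "filesystem_encoding": sys.getfilesystemencoding(),
        # Default str/bytes conversion encoding ('utf-8' on Python 3)
        "default_encoding": sys.getdefaultencoding(),
        # stdout may be missing or replaced in an embedded interpreter
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
    }


def utf8_round_trip(text):
    """Return True if text is unchanged after encoding to and decoding from UTF-8."""
    return text.encode("utf-8").decode("utf-8") == text


if __name__ == "__main__":
    for key, value in describe_text_environment().items():
        print(f"{key}: {value}")
    # Escaped Cyrillic sample, so the script itself stays ASCII-only
    print("round trip ok:", utf8_round_trip("\u041f\u0440\u0438\u0432\u0435\u0442"))
```

This only reports settings; it does not reproduce the crash, and a clean result here does not rule out a problem in how strings cross the R/Python boundary.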
Fixed here: 8917eeb
While exploring the cause of quanteda/spacyr#69, I found that the R session is terminated when a UTF-8 string is passed through r_to_py on Windows (Windows 10).
This is a minimal example which reproduces the issue:
Here is the session info:
The version of python:
(downloaded from python.org)
Is there any workaround?