it shows me this error LibrdataError: Unable to convert string to the requested encoding (invalid byte sequence) #64
Comments
as suggested in the issue template, please include a file (with no sensitive data) so that I can reproduce the issue. If I cannot reproduce the issue I cannot fix it. |
HEDIEH KARACHI has shared a OneDrive for Business file with you. To view it, click the link below.
<https://deakin365-my.sharepoint.com/personal/hkarachi_deakin_edu_au/Documents/Attachments/tip2020.rda>
Thanks for the reply. Please find the attached file.
Best,
Hedieh
…________________________________
From: Otto Fajardo <notifications@github.com>
Sent: Thursday, January 28, 2021 6:37 PM
To: ofajardo/pyreadr <pyreadr@noreply.github.com>
Cc: HEDIEH KARACHI <hkarachi@deakin.edu.au>; Author <author@noreply.github.com>
Subject: Re: [ofajardo/pyreadr] it shows me this error LibrdataError: Unable to convert string to the requested encoding (invalid byte sequence) (#64)
|
I can't access the file, it gives me an error. Please zip it and drag and drop here directly. |
HEDIEH KARACHI has shared a OneDrive for Business file with you. To view it, click the link below.
<https://deakin365-my.sharepoint.com/personal/hkarachi_deakin_edu_au/Documents/Attachments/rdaFile.zip>
I hope it works now. As the file is already compressed, when I zip it, it doesn't make it smaller. Let me know if you still can't open it.
Best,
Hedieh
…________________________________
From: Otto Fajardo <notifications@github.com>
Sent: Thursday, January 28, 2021 7:17 PM
|
After signing in, it keeps giving me a permission denied error. Please attach the file here on GitHub (you need to zip it not to reduce the size, but because GitHub accepts zip files) or look for another way to share it. |
Hopefully you can access the file now. I couldn't share it on GitHub, as the file is bigger than 10 MB.
https://www.dropbox.com/s/650m9kxkb8dzglw/tip2020.rda.zip?dl=0
…________________________________
From: Otto Fajardo <notifications@github.com>
Sent: Saturday, January 30, 2021 12:18 AM
Reopened #64.
|
I managed to download the file and reproduce the error. Reading the first bytes of the file I got this:
b'RDX3\nX\n\x00\x00\x00\x03\x00\x03\x06\x01\x00\x03\x05\x00\x00\x00\x00\x06CP1252\x00'
I think CP1252 is the encoding, meaning Windows-1252. As indicated in the Known limitations section of the README of this repo, pyreadr currently does not support encodings other than UTF-8:
Cannot read RData or rds files in encodings other than utf-8.
That means this file is not supported. This limitation comes from the C backend, librdata. Looking at the C source code, I have the feeling the error message should be different, so I am going to open an issue there for them to take a look. I will also ask whether other encodings could be supported; that may come at some point in the future. If you have control over the generation of the rda files, try saving them with UTF-8 encoding. |
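For anyone who wants to check a file before reporting, here is a minimal sketch of how the encoding can be read from the header. It assumes the file is gzip-compressed or uncompressed (bzip2 and xz are not handled) and that it uses the RData version 3 XDR layout visible in the bytes quoted in this thread. `read_header` and `native_encoding` are hypothetical helper names for illustration, not part of pyreadr or librdata.

```python
import gzip
import struct
from typing import Optional


def read_header(path: str, size: int = 64) -> bytes:
    """Return the first `size` decompressed bytes of an RData file."""
    with open(path, "rb") as f:
        magic = f.read(2)
    # Assumption: only gzip (magic bytes 1f 8b) or no compression is handled here.
    opener = gzip.open if magic == b"\x1f\x8b" else open
    with opener(path, "rb") as f:
        return f.read(size)


def native_encoding(header: bytes) -> Optional[str]:
    """Return the encoding name stored in a version 3 RData header, else None."""
    # "RDX3\n" marks RData version 3; "X\n" marks the XDR (big-endian) format.
    prefix = b"RDX3\nX\n"
    if not header.startswith(prefix):
        return None
    offset = len(prefix)
    # Three big-endian ints: format version, writer R version, minimum reader version.
    version, _writer, _min_reader = struct.unpack_from(">3i", header, offset)
    if version < 3:
        return None
    offset += 12
    # Then a length-prefixed string holding the native encoding name.
    (length,) = struct.unpack_from(">i", header, offset)
    offset += 4
    return header[offset:offset + length].decode("ascii")


# The header bytes quoted in this thread:
sample = (b"RDX3\nX\n\x00\x00\x00\x03\x00\x03\x06\x01"
          b"\x00\x03\x05\x00\x00\x00\x00\x06CP1252\x00")
print(native_encoding(sample))  # CP1252
```

For a file on disk, `native_encoding(read_header("tip2020.rda"))` would be the equivalent call; anything other than UTF-8 there suggests the file falls under the limitation described above.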
Thanks so much for your help. I really appreciate it.
Best,
Hedieh
…________________________________
From: Otto Fajardo <notifications@github.com>
Sent: Monday, February 1, 2021 7:51 PM
I managed to download the file and reproduce the error. Reading the first bytes of the file I got this:
b'RDX3\nX\n\x00\x00\x00\x03\x00\x03\x06\x01\x00\x03\x05\x00\x00\x00\x00\x06CP1252\x00'
I think CP1252 is the encoding, meaning Windows-1252. As indicated in the Known limitations section, pyreadr currently does not support encodings other than UTF-8:
Cannot read RData or rds files in encodings other than utf-8.
That means this file is not supported.
This limitation comes from the C backend librdata. Looking at the C source code I have the feeling the error message should be different, so I am going to make an issue there for them to take a look. I will also ask if other encodings could be supported. It may come at some point in the future.
|
@69hed could you please share the file again? It has been deleted from dropbox. |
@69hed I recovered the file and hosted it here: https://github.com/ofajardo/readstat_test_files/blob/master/tip2020.rda for easier sharing with the librdata people, who are looking at it. |
I want to open the dataset below in Python, but it keeps showing me an error. The code is:
The error:
~/opt/anaconda3/lib/python3.8/site-packages/pyreadr/pyreadr.py in read_r(path, use_objects, timezone)
46 if not os.path.isfile(path):
47 raise PyreadrError("File {0} does not exist!".format(path))
---> 48 parser.parse(path)
49
50 result = OrderedDict()
~/opt/anaconda3/lib/python3.8/site-packages/pyreadr/librdata.pyx in pyreadr.librdata.Parser.parse()
~/opt/anaconda3/lib/python3.8/site-packages/pyreadr/librdata.pyx in pyreadr.librdata.Parser.parse()
LibrdataError: Unable to convert string to the requested encoding (invalid byte sequence)
How can I fix this?