'ascii' codec can't decode byte 0xd1 in position 2: ordinal not in range(128) #23444

mau21mau · 2018-11-01T09:57:48Z

Code Sample, a copy-pastable example if possible

dataframe = pd.read_excel(
    StringIO(self.file_stream), sheet, na_values=['undefined', 'NaN'], header=None, keep_default_na=False
)

Problem description

I'm trying to read an .xls file with the read_excel() method, and it throws the error in the title. If I read the file with the xlrd library directly, I can fix the error by passing the encoding_override parameter with the file's encoding. I've seen some Stack Overflow answers, and all of them recommend using an encoding parameter, which doesn't exist. Why not implement an encoding parameter for read_excel() and just pass it as encoding_override when reading the file with xlrd?
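The xlrd workaround described above might look roughly like the sketch below. It assumes the workbook bytes are already in memory, that xlrd and pandas are both installed, and that the file's real code page is known. The helper name read_legacy_xls and the cp1252 default are illustrative assumptions, not part of either library.

```python
def read_legacy_xls(raw_bytes, encoding="cp1252", **read_excel_kwargs):
    """Open an .xls workbook with an explicit code page, then hand it to pandas.

    Hypothetical helper: the name and the cp1252 default are assumptions.
    """
    # Imported lazily so this module loads even if a dependency is missing.
    import pandas as pd
    import xlrd

    # encoding_override tells xlrd which code page to use when the file's
    # CODEPAGE record is missing or wrong.
    book = xlrd.open_workbook(file_contents=raw_bytes, encoding_override=encoding)

    # read_excel accepts an already-open xlrd.Book in place of a path or buffer.
    return pd.read_excel(book, engine="xlrd", **read_excel_kwargs)
```

With the call from the code sample, this would be used along the lines of read_legacy_xls(self.file_stream, sheet_name=sheet, na_values=['undefined', 'NaN'], header=None, keep_default_na=False), with the encoding argument set to whatever code page the file actually uses.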

gfyoung · 2018-11-01T17:56:39Z

That seems reasonable to me. We have an encoding parameter in read_csv, so adding it to read_excel would be consistent.

WillAyd · 2018-11-05T17:06:59Z

Is this only applicable to files created with Excel 95 and earlier?

https://xlrd.readthedocs.io/en/latest/unicode.html

If so, I am -1 here, as I can't imagine we explicitly support anything else of that age.

gfyoung · 2018-11-05T17:10:46Z

@WillAyd : Consistency with read_csv is the main reason why I would support this parameter. Our data IO API is quite fragmented, so adding this parameter is a step in the right direction.

WillAyd · 2018-11-05T17:17:17Z

Am I misreading it that all Excel files created in the past 21 years contain an encoding of utf-16-le though? If so while consistency is good the keyword would either be unused or actually confusing / counter-productive to almost every Excel file still out there in the wild.
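To make the distinction concrete, here is a small standard-library sketch of the two storage schemes in question: a single-byte code page for older files and utf-16-le for newer ones. The sample word and the choice of cp1252 are assumptions for illustration only; the actual encoding of the reporter's file is not known.

```python
text = "CA\u00d1A"  # sample value; U+00D1 is the single byte 0xd1 in cp1252

legacy_bytes = text.encode("cp1252")     # code-page storage: one byte per character
modern_bytes = text.encode("utf-16-le")  # utf-16-le storage: two bytes per character here

# Decoding code-page bytes without knowing the code page fails on the first
# non-ASCII byte, which is the kind of error reported in the issue title.
try:
    legacy_bytes.decode("ascii")
except UnicodeDecodeError as exc:
    error_message = str(exc)

# With the right encoding named explicitly, both forms round-trip cleanly.
decoded_legacy = legacy_bytes.decode("cp1252")
decoded_modern = modern_bytes.decode("utf-16-le")
```

An override only matters for the first form: utf-16-le text needs no code page, which is why such a keyword would go unused for files stored that way.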

gfyoung · 2018-11-05T17:26:55Z

> Am I misreading it that all Excel files created in the past 21 years contain an encoding of utf-16-le though?

Uncertain.

> If so while consistency is good the keyword would either be unused or actually confusing / counter-productive to almost every Excel file still out there in the wild.

Confusing? Not if good documentation is written for it. It would be good then to clarify the xlrd docs. @mau21mau : we might need more detail on the type of Excel file you were trying to read.

WillAyd · 2018-11-05T17:31:52Z

My big pushback is on referring to this as encoding because I don't think it covers the same concept as other IO functions. I haven't stepped through the source code of xlrd but I am interpreting it as an intentional disambiguation that they chose the parameter name of encoding_override and not just encoding. Their docs suggest that this is only used in case of missing or incorrect code pages, and therefore may not explicitly determine encoding.

What if we just either changed the intention here to add encoding_override as a parameter or alternately allowed kwargs to go through to read_excel? I'd be fine with either of those, but don't want to mangle concepts with other IO functions

gfyoung · 2018-11-05T17:45:30Z

@WillAyd : I'm not sure I fully understand your argument. The word "encoding" seems to mean the same thing for xlrd as it does for read_csv, even if the determination / assumption of encoding seems predicated on the existence of a record.

mau21mau · 2018-11-07T14:23:38Z

> My big pushback is on referring to this as encoding because I don't think it covers the same concept as other IO functions. I haven't stepped through the source code of xlrd but I am interpreting it as an intentional disambiguation that they chose the parameter name of encoding_override and not just encoding. Their docs suggest that this is only used in case of missing or incorrect code pages, and therefore may not explicitly determine encoding.
>
> What if we just either changed the intention here to add encoding_override as a parameter or alternately allowed kwargs to go through to read_excel? I'd be fine with either of those, but don't want to mangle concepts with other IO functions

I don't know exactly what kind of file it is, since it came from one of our users (I don't know how he generated the file). The thing is that xlrd does support it, so I thought that read_excel, since it uses xlrd, should be prepared to handle that scenario, whether through an encoding parameter or kwargs.

sindhuprakasam · 2019-03-22T12:23:44Z

I am facing the same issue when I try to export my pandas DataFrame to an Excel file, so is the issue still open for that as well?

gfyoung · 2019-03-22T18:52:32Z

@sindhusubha : Absolutely

mau21mau mentioned this issue Nov 1, 2018

feat: Add support for encoding parameter on read_excel #23448

Closed

4 tasks

gfyoung added Enhancement IO Excel read_excel, to_excel labels Nov 1, 2018

mau21mau added a commit to mau21mau/pandas that referenced this issue Nov 5, 2018

enhancement: Wrapped encoding into kwargs for xlrd lib (closes pandas…

077efe7

…-dev#23444)

WillAyd mentioned this issue Mar 4, 2019

ExcelFile class has no attribute 'encoding'. Is it correct? #25523

Closed

This was referenced Aug 16, 2020

BUG: read_excel not accepting encoding on 1.1.0 #35753

Closed

REGR: re-add encoding for read_excel #35758

Closed
