UnicodeDecodeError when using cp-1252 file encoding #2612
Comments
On Python 2, this error can also occur when Unicode characters are present in the source code. When Sentry captures an exception, it retrieves the source code surrounding the error. During this process, Sentry checks the length of each line and may truncate it using strip_string:

    def strip_string(value, max_length=None):
        # type: (str, Optional[int]) -> Union[AnnotatedValue, str]
        if not value:
            return value

        if max_length is None:
            max_length = DEFAULT_MAX_VALUE_LENGTH

        length = len(value.encode("utf-8"))

        if length > max_length:
            return AnnotatedValue(
                value=value[: max_length - 3] + "...",
                metadata={
                    "len": length,
                    "rem": [["!limit", "x", max_length - 3, max_length]],
                },
            )
        return value

The problematic line here is this one:

    length = len(value.encode("utf-8"))

On Python 2, len() counts characters for a unicode string but bytes for a byte string:

    # -*- coding: utf-8 -*-
    print(len(u"••••"))  # -> 4
    print(len("••••"))  # -> 12

The proposed solution was to use the length of the UTF-8 encoded value:

    print(len(u"••••".encode("utf-8")))  # -> 12

However, this approach causes an issue: if the value is a byte string containing Unicode characters, an exception will be raised:

    print(len("••••".encode("utf-8")))  # -> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 0: ordinal not in range(128)

To summarize, if a Python 2 project uses Unicode characters in parts of its source code, then Sentry might raise an exception when capturing an event.
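The failure described above comes from Python 2 silently decoding a byte string with the ASCII codec before it can encode it. The following is a minimal sketch that spells out that implicit step so it behaves the same on Python 2 and Python 3. The helper name py2_style_encode is hypothetical and is not part of Sentry or the standard library.

```python
# -*- coding: utf-8 -*-
# Sketch only: make explicit the ASCII decode that Python 2 performs
# implicitly when .encode() is called on a byte string.

# The bytes that a Python 2 byte-string literal "••••" holds in a UTF-8 source file.
raw = u"••••".encode("utf-8")


def py2_style_encode(byte_string):
    # Hypothetical helper: Python 2's str.encode("utf-8") first decodes the
    # bytes with the default codec (ASCII) and only then encodes to UTF-8.
    return byte_string.decode("ascii").encode("utf-8")


try:
    py2_style_encode(raw)
    outcome = "ok"
except UnicodeDecodeError as exc:
    # The decode step fails on the first non-ASCII byte.
    outcome = "UnicodeDecodeError: %s" % exc.reason

print(len(raw))
print(outcome)
```

The decode step is where the exception originates, which is why measuring the byte string directly with len() avoids the problem.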
Hey @stefan-hoelzl and @mcuppi, thanks for reporting this. We are currently in the process of dropping support for Python 2.x (#2581), so this bug will probably never be fixed. Sorry.
@antonpirker; Thank you for the quick reply. I understand that Python 2 is end-of-life and support is being phased out universally, but I think this bug is pretty severe. This bug not only masks the original exception, but it also raises an entirely new one. The fix is probably as simple as something like this (pseudo code):

    if isinstance(value, binary_type):
        length = len(value)
    else:
        length = len(value.encode("utf-8"))
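The pseudo code above can be turned into a small runnable sketch. This is an illustration under assumptions, not the change that was actually released: the function name utf8_length is hypothetical, and the built-in bytes type stands in for binary_type (bytes is an alias of str on Python 2.6+ and the byte-string type on Python 3).

```python
# -*- coding: utf-8 -*-
# Sketch only: compute the UTF-8 byte length of a value without triggering
# an implicit ASCII decode on byte strings.


def utf8_length(value):
    # Hypothetical helper, not part of the Sentry SDK.
    if isinstance(value, bytes):
        # Already a byte string: its length is the byte count.
        return len(value)
    # Text string: encode to UTF-8 first, then count bytes.
    return len(value.encode("utf-8"))


text_value = u"••••"
byte_value = u"••••".encode("utf-8")

print(utf8_length(text_value))
print(utf8_length(byte_value))
```

Both calls report the same byte length, so a truncation check based on this helper would treat text and byte strings consistently.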
Let's see if we can carve out some time in January to address this.
@antonpirker; Thank you. I'll see if I can submit one soon.
Hey folks, a fix for this will be out tomorrow in 1.40.0.
How do you use Sentry?
Sentry Saas (sentry.io)
Version
1.39.1
Steps to Reproduce
When I run the following script
I do not get any Issues reported
I poked around a bit and found out it is because of this line: a UnicodeDecodeError gets raised here.
I am using Python 2.7.
Expected Result
The RuntimeError appears as Issue in the web console.
Actual Result
No Issue appears