The base bencoding format specifies encoding only for strings of bytes, not Unicode strings. When decoding, there is no way to tell whether the original data was a UTF-8 string or a byte buffer that merely looks like one.
The module documentation should clearly specify (and the implementation should be tested for) how bencode handles input Perl strings that have the utf8 flag set. Throwing an exception would be appropriate behavior, since it forces users of the module to encode their data as bytes explicitly.
The bdecode function should clearly disallow a string of characters and allow only a string of bytes.
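To illustrate why this matters, here is a minimal sketch of the caller-side encoding step. It assumes only the core Encode module. `bencode_bytes` is a hypothetical stand-in that emits the `<length>:<bytes>` form bencoding uses for strings; it is not the module's own code.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper (not part of the module): bencode one byte string
# as "<length>:<bytes>", where the length counts bytes.
sub bencode_bytes {
    my ($bytes) = @_;
    return length($bytes) . ':' . $bytes;
}

# A character string: c, a, f, U+00E9, space, U+263A.
my $chars = "caf\x{e9} \x{263a}";

# Explicit conversion to bytes before bencoding.
my $bytes = encode('UTF-8', $chars);

my $wire = bencode_bytes($bytes);

# After decoding, only the caller knows the payload was UTF-8 text,
# so the caller turns the bytes back into characters.
my $roundtrip = decode('UTF-8', $bytes);

printf "characters: %d, bytes: %d\n", length($chars), length($bytes);
```

The character count (6) and the byte count (9) differ, so a length prefix computed from the unencoded character string would not match the data actually written.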
The UTF8 flag is irrelevant, and whether the string was bytes or characters cannot be known outside of one particular case. Namely, if the string matches /[^\x0-\xff]/, then it contains some wide characters, so it cannot be a (proper) byte string. (It may contain bytes mixed in with the characters if the code constructing it is buggy.) But a string that does not match this could be anything, regardless of whether its UTF8 flag is set.
In any case you are right, the docs should declare how the module handles this issue.
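The one detectable case described above could be checked roughly like this. This is only a sketch: `assert_byte_string` is a hypothetical name, and nothing here reflects what the module currently does.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical guard: reject strings containing any character above 0xFF,
# since such a string cannot be a proper byte string.
sub assert_byte_string {
    my ($str) = @_;
    croak 'Wide character in string; encode it to bytes first'
        if $str =~ /[^\x00-\xff]/;
    return $str;
}

# A wide character is rejected.
my $wide_ok = eval { assert_byte_string("\x{263a}"); 1 } ? 1 : 0;

# A string with the UTF8 flag set but no wide characters still passes,
# which is why the flag alone says nothing useful here.
my $flagged = "\xe9";
utf8::upgrade($flagged);
my $flag_set   = utf8::is_utf8($flagged) ? 1 : 0;
my $flagged_ok = eval { assert_byte_string($flagged); 1 } ? 1 : 0;

print "wide accepted: $wide_ok, flag set: $flag_set, flagged accepted: $flagged_ok\n";
```

A string that passes this check may still be mis-encoded character data; the check only catches input that is definitely wrong.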