Don't decode bytes (which may not be UTF8) when displaying SecretBytes #8012

alexmojaki · 2023-11-04T19:38:02Z

Change Summary

Removes unneeded .decode().

I couldn't find any justification for decoding the bytes. The code was introduced in this commit in #5674 and I don't see any PR comments explaining it. @adriangb any idea?

There are some encodings where a non-empty byte string can decode to an empty string:

>>> ''.encode('utf16')
b'\xff\xfe'
>>> ''.encode('utf16').decode('utf16')
''
>>> ''.encode('utf16').decode('utf16') == ''
True

but .decode() without arguments should always use UTF8 in Python 3 regardless of locale settings.
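As a minimal sketch of why the default decode is still a problem for arbitrary bytes: the same two bytes that decode to an empty string as UTF-16 are not valid UTF-8 at all, so a bare .decode() raises. The helper name try_default_decode is made up for this illustration and is not part of pydantic.

```python
def try_default_decode(raw: bytes):
    """Hypothetical helper: return the UTF-8 decoded text, or None if invalid."""
    try:
        # bytes.decode() defaults to encoding='utf-8' in Python 3
        return raw.decode()
    except UnicodeDecodeError:
        return None


bom_only = b'\xff\xfe'  # a little-endian UTF-16 byte order mark and nothing else

# Non-empty bytes, yet an empty string once decoded as UTF-16.
assert bom_only.decode('utf16') == ''

# 0xff can never appear in valid UTF-8, so the default decode fails.
assert try_default_decode(bom_only) is None

# Valid UTF-8 still round-trips as expected.
assert try_default_decode(b'abc') == 'abc'
```

So decoding before display can either fail outright or, with another encoding, misreport non-empty bytes as empty.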

I've checked that any non-empty byte string with up to 3 bytes decodes to a non-empty string if it can be decoded at all:

import itertools

count = 0
total = 0
for length in [1, 2, 3]:
    for b in itertools.product(range(256), repeat=length):
        total += 1
        b = bytes(b)
        assert b
        assert len(b) == length > 0
        try:
            s = b.decode()
        except UnicodeDecodeError:
            continue
        assert s
        assert len(s) > 0
        assert b == s.encode()
        count += 1

print(count)  # 2668544
print(total)  # 16843008

I also think that any non-empty byte string should be treated as non-empty when displaying, even if it can be decoded to an empty unicode string. That said, I'm not sure we should reveal whether a secret value is empty at all.
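To illustrate the difference, here is a rough sketch of masking based on the raw bytes versus masking based on decoded text. Both function names and the ten-asterisk mask are assumptions made for this example, not pydantic's actual implementation.

```python
MASK = b'**********'  # assumed mask, for illustration only


def display_from_bytes(value: bytes) -> bytes:
    """Hypothetical: decide emptiness from the raw bytes, never decoding."""
    return MASK if value else b''


def display_from_decoded(value: bytes, encoding: str = 'utf-8') -> bytes:
    """Hypothetical: decide emptiness from the decoded text (the fragile way)."""
    return MASK if value.decode(encoding) else b''


raw = b'\xff\xfe'  # non-empty, not valid UTF-8

# The bytes-based check works for any input and reports it as non-empty.
assert display_from_bytes(raw) == MASK
assert display_from_bytes(b'') == b''

# The decode-based check reports the same bytes as empty under UTF-16 ...
assert display_from_decoded(raw, 'utf16') == b''

# ... and fails entirely under the default UTF-8.
try:
    display_from_decoded(raw)
except UnicodeDecodeError:
    decode_failed = True
else:
    decode_failed = False
assert decode_failed
```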

Related issue number

Closes #7971

Checklist

The pull request title is a good summary of the changes - it will be used in the changelog
Unit tests for the changes exist
Tests pass on CI
Documentation reflects the changes where applicable
My PR is ready to review, please add a comment including the phrase "please review" to assign reviewers

Selected Reviewer: @lig

alexmojaki · 2023-11-04T19:51:43Z

please review

alexmojaki · 2023-11-04T19:53:42Z

> I couldn't find any justification for decoding the bytes. The code was introduced in this commit in #5674 and I don't see any PR comments explaining it. @adriangb any idea?

Pinging @adriangb because I wrote the above in an edit and I don't know if that triggers a notification.

adriangb

I don’t see any problem with this. I'm not sure why I did that in that commit; it may have been carryover from moving files or something. I don’t see the decode() call in v1.

adriangb · 2023-11-04T20:00:10Z

Mind adding a small release note as a bug fix?

alexmojaki · 2023-11-04T20:14:06Z

> Mind adding a small release note as a bug fix?

Where would I add that? I can't see anything in https://docs.pydantic.dev/latest/contributing/ and adding to HISTORY.md doesn't look right.

adriangb · 2023-11-06T01:44:53Z

I guess we just use a tag on the PR now

Don't decode bytes (which may not be UTF8) when displaying SecretBytes

0da788a

alexmojaki marked this pull request as ready for review November 4, 2023 19:51

pydantic-hooky bot added the ready for review label Nov 4, 2023

pydantic-hooky bot assigned lig Nov 4, 2023

adriangb approved these changes Nov 4, 2023

adriangb added the relnotes-fix Used for bugfixes. label Nov 4, 2023

adriangb merged commit 833faaf into pydantic:main Nov 6, 2023
59 of 61 checks passed
