Base64decode(x).decode("utf-8") doesn't handle multiple escape characters properly #99022

@funkybunchofoats

Bug report

If there are multiple escape sequences back to back, .decode("utf-8") may "eat" trailing escape sequences/special characters. The call in question is:
base64.b64decode(x).decode('utf-8')
For example, when parsing HL7 messages from base64 encoding, I had the following case:
b"\\.br\\\n"
This should decode to: "\.br\\n"
but instead it is translated to: "\.br"
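
To see exactly which characters come out of the decode, it can help to round-trip the sample and compare repr() and length before and after. This is only a minimal sketch: it assumes the payload is exactly the six bytes written as the literal above (the real HL7 message isn't shown here), and the variable names are made up for illustration.

```python
import base64

# Assumed sample: backslash, ".br", backslash, newline (6 bytes),
# written with the same literal as in the report.
raw = b"\\.br\\\n"

encoded = base64.b64encode(raw)
decoded = base64.b64decode(encoded).decode("utf-8")

# repr() shows escapes explicitly; print(decoded) would render the
# newline as an actual line break, which can look like a missing character.
print(repr(decoded))            # → '\\.br\\\n'
print(len(raw), len(decoded))   # → 6 6
```

Comparing repr() output rather than the printed string avoids confusing how the text is displayed with what it actually contains.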

I was able to work around this by doing the decode in two steps and replacing b"\\.br\\\n" with b"\n\n", as that is what the decoded message should look like.
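
A rough sketch of that two-step workaround, for reference. The function name decode_hl7_payload is hypothetical, and the replacement of b"\\.br\\\n" with b"\n\n" is taken from the description above rather than from any HL7 library.

```python
import base64

def decode_hl7_payload(encoded: bytes) -> str:
    """Hypothetical helper: base64-decode, normalize HL7 breaks, then decode as UTF-8."""
    # Step 1: base64 -> raw bytes, with no text decoding yet.
    raw = base64.b64decode(encoded)
    # Step 2: replace the backslash-.br-backslash-newline sequence
    # with two newlines while still working on bytes.
    raw = raw.replace(b"\\.br\\\n", b"\n\n")
    # Step 3: decode the cleaned bytes to str.
    return raw.decode("utf-8")

sample = base64.b64encode(b"line1\\.br\\\nline2")
print(repr(decode_hl7_payload(sample)))  # → 'line1\n\nline2'
```

Doing the replacement on bytes keeps the substitution independent of how the resulting string is later printed or logged.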

Your environment

  • CPython versions tested on: 3.10.7
  • Operating system and architecture: Windows Server 2016 64 bit

Metadata

    Labels

    type-bug: An unexpected behavior, bug, or error