Reading UTF16 with BOM from Blob #58718

Open
@jimmywarting

Version

No response

Platform


Subsystem

No response

What steps will reproduce the bug?

const data = new Uint8Array([
  255, 254,  82, 0, 101, 0, 100, 0,  32, 0,  82, 0,
  111,   0,  99, 0, 107, 0, 115, 0,  32, 0,  40, 0,
   70,   0, 105, 0, 114, 0,  97, 0, 115, 0,  32, 0,
   84,   0,  97, 0, 114, 0, 104, 0, 105, 0, 110, 0,
  105,   0,  32, 0,  82, 0, 101, 0, 109, 0, 105, 0,
  120,   0,  41, 0
])
const type = 'text/plain; charset=utf-16'
const blob = new Blob([data], {type})

blob.text().then(str => {
  console.assert(str.length === 31, 'reading text from utf16 blob should remove BOM')
})

How often does it reproduce? Is there a required condition?

Always.

What is the expected behavior? Why is that the expected behavior?

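`str.length` should be 31 characters: the 64 bytes hold a 2-byte BOM followed by 31 UTF-16 code units, and the BOM should be stripped.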
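For comparison, here is a minimal sketch of the BOM-stripping behavior described above, using `TextDecoder` directly instead of `Blob#text`. The helper name `decodeUtf16le` is hypothetical and used only for illustration. The sketch relies on `TextDecoder` dropping a leading BOM by default (`ignoreBOM` is `false` unless set).

```javascript
// Hypothetical helper (not part of any API): decode bytes as UTF-16LE.
// TextDecoder strips a leading BOM by default (ignoreBOM: false).
function decodeUtf16le(bytes) {
  return new TextDecoder('utf-16le').decode(bytes)
}

// "Hi" encoded as UTF-16LE with a leading BOM (0xFF 0xFE).
const sample = new Uint8Array([0xff, 0xfe, 72, 0, 105, 0])
const text = decodeUtf16le(sample)

console.log(text)        // Hi
console.log(text.length) // 2, the BOM is not part of the result
```

Applied to the blob from the reproduction, something like `blob.arrayBuffer().then(decodeUtf16le)` could serve as a workaround, on the assumption that the caller already knows the data is UTF-16LE.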

What do you see instead?

An unexpected BOM character at the beginning of the string.

Additional information

Results in the following environments:
Firefox: ❌
Chromium: ✅
Safari: ✅
Bun: ✅
Deno: ❌
Node: ❌

I tried doing the same with `Response#text`; every environment reported 64 characters (BOM included).
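One possible explanation for the figure of 64 is that `text()` decodes the bytes as UTF-8 regardless of the `charset` parameter, as the File API and Fetch specifications describe. That is an assumption about the cause, not a confirmed diagnosis. The sketch below uses a hypothetical helper, `decodeAsUtf8`, to show what a non-fatal UTF-8 decode does to UTF-16LE input: each input byte becomes one code unit, and the two BOM bytes become U+FFFD replacement characters.

```javascript
// Hypothetical helper: decode bytes as UTF-8 with a non-fatal decoder.
function decodeAsUtf8(bytes) {
  return new TextDecoder('utf-8').decode(bytes)
}

// UTF-16LE BOM followed by "Hi" (2 + 4 = 6 bytes).
const bomSample = new Uint8Array([0xff, 0xfe, 72, 0, 105, 0])
const decoded = decodeAsUtf8(bomSample)

console.log(decoded.length)                     // 6, one code unit per input byte
console.log(decoded.startsWith('\ufffd\ufffd')) // true, 0xFF and 0xFE are invalid in UTF-8
```

By the same reasoning, the 64 bytes in the reproduction would yield a 64-character string, which matches the `Response#text` result reported above.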
