Support unicode characters longer than a byte

At the moment the `String.charCodeAt()` method is used to get the character codes of the user script, and its return value is stored in a `Uint8Array`.
If the returned value does not fit in a byte, as happens with any character above U+00FF, the upper bits are lost and the wrong character is encoded into the hex file.


https://github.com/bbcmicrobit/PythonEditor/blob/3d30f7ff2180c2eea0a9d9737edba7c817a413a5/python-main.js#L84-L92

This is easy to reproduce: write a script containing a character that needs more than one byte, download the hex file, and load it back into the editor.

```
# UTF-8 character longer than a byte: Σ
```

Becomes:

```
# UTF-8 character longer than a byte: £
```

This happens because the character `0x03A3` (Σ) is truncated to its low byte and encoded as `0xA3` (£).
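
The truncation can be shown in isolation. This is only a minimal sketch of the behaviour described above, not code from the editor; the variable names are made up for illustration.

```javascript
// Illustrative only: variable names are hypothetical, not from python-main.js.
var sigma = 'Σ';
var codeUnit = sigma.charCodeAt(0);   // UTF-16 code unit: 0x03A3

var bytes = new Uint8Array(1);
bytes[0] = codeUnit;                  // Uint8Array keeps only the low 8 bits: 0xA3

// Reading the byte back as a character code gives U+00A3, the pound sign.
var roundTripped = String.fromCharCode(bytes[0]);
console.log(codeUnit.toString(16), bytes[0].toString(16), roundTripped);
```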

	// add header, pad to multiple of 16 bytes
	data = new Uint8Array(4 + script.length + (16 - (4 + script.length) % 16));
	data[0] = 77; // 'M'
	data[1] = 80; // 'P'
	data[2] = script.length & 0xff;
	data[3] = (script.length >> 8) & 0xff;
	for (var i = 0; i < script.length; ++i) {
		data[4 + i] = script.charCodeAt(i);
	}
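
One possible direction, sketched here under assumptions rather than proposed as the final fix, is to encode the script to UTF-8 bytes before copying it into the buffer, and to store the byte length instead of `script.length` in the header. The helper names `buildScriptData` and `readScriptData` are hypothetical, and the sketch assumes that whatever reads the hex file (the editor on load and the runtime on the device) decodes the payload as UTF-8 and that `TextEncoder`/`TextDecoder` are available in the target browsers.

```javascript
// Hypothetical helper: same 'MP' + 16-bit length + payload layout as the
// snippet above, but the payload is UTF-8 bytes, not UTF-16 code units.
function buildScriptData(script) {
    var utf8 = new TextEncoder().encode(script); // Uint8Array of UTF-8 bytes
    var total = 4 + utf8.length;
    // Same padding rule as the original code (always adds 1 to 16 bytes).
    var data = new Uint8Array(total + (16 - total % 16));
    data[0] = 77; // 'M'
    data[1] = 80; // 'P'
    data[2] = utf8.length & 0xff;        // length in bytes, low byte
    data[3] = (utf8.length >> 8) & 0xff; // length in bytes, high byte
    data.set(utf8, 4);
    return data;
}

// Hypothetical inverse, assuming the loader decodes the payload as UTF-8.
function readScriptData(data) {
    var length = data[2] | (data[3] << 8);
    return new TextDecoder('utf-8').decode(data.subarray(4, 4 + length));
}

var encoded = buildScriptData('# Σ');
console.log(readScriptData(encoded));
```

With this approach Σ is stored as the two bytes `0xCE 0xA3`, so the length field counts four bytes for the three-character script `# Σ`.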

Support unicode characters longer than a byte #60
