Description
Previous ID | SR-7602 |
Radar | None |
Original Reporter | @weissi |
Type | Bug |
Status | Resolved |
Resolution | Done |
Additional Detail from JIRA
Votes | 26 |
Component/s | Standard Library |
Labels | Bug, AffectsABI |
Assignee | @milseman |
Priority | Medium |
md5: f681e7f0741f98e436f811971add77c3
Sub-Tasks:
- SR-7725 [String] New validity model
Issue Description:
I believe that there is really only one (and a half) encoding that matters today: UTF-8 (and its subset ASCII).
Therefore it's important that Swift's fastest String encoding is UTF-8.
From what I can tell, the fastest String encodings today are UTF-16 and ASCII. Everything else has worse performance.
This also seems to be ABI-relevant, so AFAIK this needs to be fixed very soon.
Requirements:
- being able to copy UTF-8 encoded bytes from a `String` into a pre-allocated raw buffer must be allocation-free and as fast as `memcpy` can copy them
- creating a String from UTF-8 encoded bytes should just validate the encoding and store the bytes as they are
- slightly softer but still very strong requirement: currently (even with ASCII) only the stdlib seems to be able to get a pointer to the contiguous ASCII representation (if at all in that form). That works fine if you just want to copy the bytes (`UnsafeMutableBufferPointer(start: destinationStart, count: destinationLength).initialize(from: string.utf8)`, which will use `memcpy` if in ASCII representation) but doesn't allow you to implement your own algorithms that are only performant on a contiguously stored `[UInt8]`
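
To make the three requirements concrete, here is a minimal sketch of the access patterns they describe. The function names `roundTripThroughRawBuffer` and `byteSum` are hypothetical and exist only for illustration. The sketch assumes a Swift 5.1 or later toolchain, since `withContiguousStorageIfAvailable` is not available in earlier standard libraries. Whether the copy actually lowers to a `memcpy`, and whether contiguous storage is exposed, depends on the string's internal representation and the stdlib version, so this shows the call shape rather than a performance guarantee.

```swift
// Requirements 1 and 2: copy a String's UTF-8 bytes into a pre-allocated
// buffer, then build a String back from those bytes.
// (Hypothetical helper; the allocation here stands in for a buffer the
// caller would normally already own.)
func roundTripThroughRawBuffer(_ string: String) -> String {
    let destinationLength = string.utf8.count
    let destination = UnsafeMutableBufferPointer<UInt8>.allocate(capacity: destinationLength)
    defer { destination.deallocate() }

    // Same call shape as in the issue text. The second tuple element is the
    // index one past the last byte written.
    let (_, copiedEnd) = destination.initialize(from: string.utf8)

    // Note: String(decoding:as:) repairs invalid sequences with U+FFFD
    // instead of rejecting them, so it is not a pure "validate and store".
    return String(decoding: destination[..<copiedEnd], as: UTF8.self)
}

// Requirement 3: run a custom algorithm over contiguous UTF-8 bytes when the
// String exposes them, and fall back to plain iteration otherwise.
// (Hypothetical helper; the byte sum is a stand-in for a real algorithm.)
func byteSum(_ string: String) -> Int {
    let fast: Int? = string.utf8.withContiguousStorageIfAvailable { bytes in
        bytes.reduce(0) { $0 + Int($1) }
    }
    if let fast = fast {
        return fast
    }
    return string.utf8.reduce(0) { $0 + Int($1) }
}
```

Calling `roundTripThroughRawBuffer("héllo")` should give back an equal string, and `byteSum` returns the same value on either path; only the fast path hands the closure an `UnsafeBufferPointer<UInt8>` that a custom algorithm can work on directly.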