-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Algorithmic complexity of instructions #56
Comments
In an ideal world, they'd all be O(1), but the real world isn't always ideal :-) In particular, when a module creates a situation where an encoding change is unavoidable (such as: creating a string from utf8 data, then measuring its wtf16 length, or vice versa), O(n) cost is unavoidable. What we can hope and should aim for is that avoidable O(n) costs are, indeed, avoided. The stringref proposal is very consciously designed around accepting this reality. It aims to give engines the freedom to pick from a variety of fundamental implementation techniques as well as from a large bag of optional optimizations to make as many cases as possible as fast as possible. Lazy conversions and flexible internal representations are key tools there, at least in the long run; to get an implementation off the ground, fairly simple implementations are viable as well. This does lead to the situation that it may well be the case in many engines that either Andy has written up the performance characteristics of the V8 prototype: https://docs.google.com/document/d/1w2jLY7LuMG1grm_u7avtoAqYW1tcvPt4zc5_yNwTRyQ/edit Pragmatically, I'd recommend to start with whatever string representations/operations you already have in Spidermonkey, and only implement additional optimizations/representations over time if the need arises. |
Given the whole point of this proposal is that engines/hosts use different implementations for strings, it's not surprising that different types of strings will have different performance characteristics. Although I do wonder, some languages might well be prepared to compile to handle either kind of string, in which case an instruction for determining the best type of stringview to use would be potentially useful. i.e. Something like: (if
(i32.eq
;; Returns a code for preferred encoding
(stringview.preferred)
(i32.const $CONST_FOR_UTF8)
)
(then
;; Use the utf-8 implementation
)
(else
;; Use other implementation
)
) This could even be parameterized to take a stringref as hosts/engines may well even have multiple kinds of stringref: (stringview.preferred (local.get $theString)) |
What is the desired algorithmic complexity of the following instructions?
Disclaimer: I understand that the spec says nothing about algorithmic complexity of instructions. However the rough algorithmic complexity of instructions does matter in-practice for whether a module can feasibly run on an implementation.
get_stringview_wtf8
stringref
is implemented with a WTF16 encoding, this sounds like it’s O(n) to copy.stringref
is implemented with a WTF8 encoding, this can be a O(1) no-op.get_stringview_wtf16
stringref
is implemented with a WTF16 encoding, this can be a O(1) no-op.stringref
is implemented with a WTF8 encoding, this sounds like it’s O(n) to copy, or compute breadcrumbs.string.concat
string.eq
string.measure_[utf8/wtf8/wtf16]
string.is_usv_sequence
It seems like many operations here are either linear or constant time depending on how the host implements stringref (WTF8/WTF16 backed) (with ropes/without) (hash-consed constants or not).
If this is the case, I’m concerned that this could lead to future incompatibilities where modules are written assuming something is constant time, but it’s in fact linear on a host with a different implementation.
The text was updated successfully, but these errors were encountered: