You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
To compare against "abc", we currently emit code to test whether uint16(first two bytes)==uint16("ab") and then whether (third byte)=="c".
I suspect that it'd be more efficient to load the three bytes into the first three bytes of a uint32 (two loads combined with |) and then compare that to uint32("abc\0").
I'm not sure whether we could do this with comparison-combining rewrite rules (note that the loads have side-effects in general, but we're sure in this case they won't fault) or whether it'd be better to fix this in walk.go, where this optimization is generated.
The text was updated successfully, but these errors were encountered:
Tricky one. This transformation could slow down/bloat some code (generally when uint16(first two bytes)==uint16("ab") is unlikely), so while I think the backend could do it I'm not sure it should unless it has likeliness information.
Note that for longer strings we could overlap the loads given they are unaligned anyway (not sure if we do this already - I don't think we do from what I see in walk):
I think that transformation could be applied greedily since it doesn't add any extra instructions to the 'not taken' path. It could also be applied in any situation where we want to check the last 3 characters in a string and we also know the length of the string is greater than 3 (i.e. we can access index -1 in the string and be in bounds). I suspect this crops up occasionally in string switch statement lowering. The overlapping characters could be masked out even when known to avoid double-load related issues in racy code. That said, this would all be quite hard in SSA, but maybe string comparison lowering could do it.
The path to resolution is known, but the work has not been done.
Feedback is required from experts, contributors, and/or the community before a change can be made.
Feb 18, 2020
I just remembered (vaguely) that overlapping loads caused performance problems a few years ago because they interfered with load-store forwarding.
Yeah, this might be a problem. Though since Go programs don't tend to copy strings by value that much we might not see load-hit-store hazards very often. My guess would be that strings are rarely still in the store buffer when we are comparing them, but that is just a guess.