There are some complications here because we'd like to minimize how many times we scan through each unique string, but we also don't necessarily want to allocate and store states at every selected cut point. This is probably what we should do when we get to handling the issue:
First, confirm that we can actually get a substantial performance improvement by testing repeated substring extraction from a long string; if we can't, there really isn't much of a benefit, as for substrings of any substantial size the bulk of the current computation is in substr anyway.
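The shape of that test can be sketched as below. This is a hypothetical micro-benchmark in Python (standing in for R, where the real test would call substr() on long strings); the function name and sizes are illustrative, and only the access pattern matters, not the absolute timings.

```python
import timeit

def repeated_substr(s, n_cuts):
    """Extract n_cuts substrings spread evenly across s."""
    step = max(1, len(s) // n_cuts)
    return [s[i:i + step] for i in range(0, len(s), step)]

long_str = "x" * 1_000_000
# Time many substring extractions from one long string; if this dominates
# runtime even with no ESC parsing at all, caching parse states can't help
# much, because the substring extraction itself is the bottleneck.
t = timeit.timeit(lambda: repeated_substr(long_str, 1000), number=5)
```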
Compute how many ESC sequences there are between the lowest and highest cut points.
We will store these ESC sequences along with their states.
What do we do about sequential ESC sequences? Ideally we only store one entry per run of consecutive ESC sequences, but that means we have to parse each sequence on the first pass.
This is all under the presumption that there are fewer ESC sequences than distinct cut points.
Additionally, one big issue with this is that it only works well for byte-encoded strings, as we still have to find the byte that corresponds to a particular width or character position between two ESC sequences. THIS COULD BE A DEAL BREAKER.
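To make the encoding concern concrete, here is a minimal sketch of the extra work required between two stored ESC states when the string is UTF-8 rather than byte-encoded: mapping a character count onto a byte offset still means walking every byte in between. The function name is illustrative, and it assumes `start_byte` always lands on a UTF-8 lead byte.

```python
def char_to_byte_offset(raw: bytes, start_byte: int, n_chars: int) -> int:
    """Advance n_chars UTF-8 characters from start_byte; return byte offset."""
    pos = start_byte
    for _ in range(n_chars):
        lead = raw[pos]
        if lead < 0x80:      # 1-byte sequence (ASCII)
            pos += 1
        elif lead < 0xE0:    # 2-byte sequence
            pos += 2
        elif lead < 0xF0:    # 3-byte sequence
            pos += 3
        else:                # 4-byte sequence
            pos += 4
    return pos

raw = "a\u00e9\u4e2d".encode("utf-8")  # 1-byte + 2-byte + 3-byte chars
```

For byte-encoded strings this lookup is a constant-time offset addition; for UTF-8 (and even more so for display width) it is a linear scan, which is exactly why this could undermine the whole caching scheme.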
Once we have the recorded points, we can binary search for them, since we'll presumably store them sorted.
We can even keep the most recent search paths for the beginning and end cut points: those are only log(N) long, and we can possibly re-use them for subsequent cut points to reduce how many hops we have to do.
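The lookup step above amounts to finding the last stored state at or before a requested cut point in a sorted structure; a minimal sketch, with an assumed parallel-list layout and illustrative names:

```python
import bisect

def nearest_state(offsets, states, cut):
    """Return (offset, state) of the last stored entry at or before cut.

    offsets must be sorted ascending; states[i] is the parse state
    recorded at offsets[i].
    """
    i = bisect.bisect_right(offsets, cut) - 1
    if i < 0:
        return None  # cut precedes the first stored state
    return offsets[i], states[i]

offsets = [0, 10, 25]
states = ["initial", "bold", "bold+red"]
```

The further refinement mentioned above, caching the most recent log(N)-long search path and restarting subsequent searches from it, is not shown here.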
Alternate:
Compute the state at every cut point.
TBD whether we need different states for starting and ending cut points.
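The alternate plan can be sketched as a single pass that snapshots the state at each requested cut point as the scan crosses it. The state here is deliberately minimal (just a set of active SGR parameter strings) and all names are illustrative, not the package's actual representation.

```python
import re

# Matches only CSI SGR sequences (ESC [ params m); real ESC handling is broader.
SGR = re.compile(r"\x1b\[([0-9;]*)m")

def states_at(s, cuts):
    """Return {cut: frozenset(active SGR params)} for sorted cut points."""
    states, active = {}, set()
    ci = 0
    for m in SGR.finditer(s):
        # Snapshot the state for every cut point before this ESC sequence.
        while ci < len(cuts) and cuts[ci] <= m.start():
            states[cuts[ci]] = frozenset(active)
            ci += 1
        params = m.group(1).split(";") if m.group(1) else ["0"]
        for p in params:
            if p in ("", "0"):
                active.clear()  # SGR 0 resets all styles
            else:
                active.add(p)
    # Remaining cut points fall after the last ESC sequence.
    while ci < len(cuts):
        states[cuts[ci]] = frozenset(active)
        ci += 1
    return states

styled = "ab\x1b[31mcd\x1b[0mef"
```

This stores one state per cut point rather than one per ESC sequence, so it trades memory for never having to re-parse, which is why it only wins when cut points are scarcer than ESC sequences are dense.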
Currently `ansi_substr2` does a lot of the work in R.