While profiling hxb we came across an unrelated problem with typing passes. Delays for those are currently managed in a big `(int * (unit -> unit)) list`, where the integer corresponds to the typing pass. Adding entries browses the list until the priority is equal (or larger for `delay_late`) and then inserts the new element. Flushing a given pass removes elements until the priority is lower.

The problem with the addition case is that it has linear cost over the number of entries. For example, any addition of the lowest priority
`PFinal` will scroll past all current entries to insert it. This gets even worse because `delay` isn't tail-recursive, which leads to really deep call stacks.

This PR changes it so that we maintain an array of lists, where the array indices correspond to the priorities. Insertion then just has to index into the array and append to the list, while flushing checks the array slots in order until it finds one with a non-empty list, and then recurses after executing its task.
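
To make the cost difference concrete, here is a minimal sketch of both insertion strategies. It is not the code from this PR: the names `delay_in_list`, `delay_in_array`, `make_delayed` and `num_passes` are made up for illustration, passes are plain integers, and the array version prepends to the list, which is an assumption about ordering within a pass.

```ocaml
(* Old shape, as described above: a single list kept sorted by pass number. *)
type delayed_list = (int * (unit -> unit)) list

(* Walk past every entry with a lower pass number, then insert.
   Not tail-recursive: each skipped entry leaves a stack frame behind,
   so the cost and the stack depth grow with the number of queued entries. *)
let rec delay_in_list (p : int) (f : unit -> unit) (l : delayed_list) : delayed_list =
  match l with
  | (p2, f2) :: rest when p2 < p -> (p2, f2) :: delay_in_list p f rest
  | _ -> (p, f) :: l

(* New shape: one list per pass, indexed by pass number. *)
let num_passes = 3 (* hypothetical; the real number of passes is not stated here *)

let make_delayed () : (unit -> unit) list array = Array.make num_passes []

(* Constant time, no matter how many tasks are queued in other passes.
   Assumption: prepending is fine; order within a pass is not specified here. *)
let delay_in_array (delayed : (unit -> unit) list array) (p : int) (f : unit -> unit) =
  delayed.(p) <- f :: delayed.(p)

let () =
  let noop () = () in
  (* Inserting the last pass has to skip both existing entries. *)
  let l = delay_in_list 2 noop [ (0, noop); (1, noop) ] in
  assert (List.map fst l = [ 0; 1; 2 ]);
  (* The array version touches only slot 2. *)
  let a = make_delayed () in
  delay_in_array a 2 noop;
  assert (List.length a.(2) = 1)
```

With the list, an insertion at the last pass pays for every entry queued before it. With the array, the same insertion touches a single slot.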
Importantly, flushing only ever handles the first element it finds for a given pass and then starts over. This is because any delay might insert a higher-priority delay which then has to be processed before the rest of our current priority list. In order to not do too many pointless array lookups, we remember the `delayed_min_index`, which is the lowest pass whose list might have entries.

While this is clearly the better approach in my head, it would be good to have some profiling data to support this.
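
The flush loop described above could look roughly like the following sketch. It is an illustration under assumptions, not the PR's implementation: the record type `delays` and the functions `create`, `delay` and `flush_pass` are hypothetical, passes are plain integers where a lower index means higher priority, and only the field name `delayed_min_index` is taken from the description.

```ocaml
type delays = {
  slots : (unit -> unit) list array;   (* one task list per pass *)
  mutable delayed_min_index : int;     (* lowest pass whose list might be non-empty *)
}

(* Start with every list empty; the lower bound sits past the last slot. *)
let create n = { slots = Array.make n []; delayed_min_index = n }

let delay st p f =
  st.slots.(p) <- f :: st.slots.(p);
  (* A task for an earlier pass pulls the lower bound back down. *)
  if p < st.delayed_min_index then st.delayed_min_index <- p

(* Run every task queued for passes <= p. Assumes p is a valid slot index.
   Only one task is taken per step; the search then restarts from
   delayed_min_index, because the task may have queued work for an earlier pass. *)
let rec flush_pass st p =
  if st.delayed_min_index <= p then begin
    match st.slots.(st.delayed_min_index) with
    | [] ->
      (* Nothing left at this pass: raise the lower bound and keep looking. *)
      st.delayed_min_index <- st.delayed_min_index + 1;
      flush_pass st p
    | f :: rest ->
      st.slots.(st.delayed_min_index) <- rest;
      f ();
      flush_pass st p
  end

let () =
  let log = ref [] in
  let say s () = log := s :: !log in
  let st = create 3 in
  delay st 2 (say "final");
  delay st 1 (say "d");
  (* This task queues work for pass 0 while pass 1 is being flushed. *)
  delay st 1 (fun () -> say "b" (); delay st 0 (say "c"));
  flush_pass st 2;
  (* "c" runs before "d" even though "d" was already queued at pass 1. *)
  assert (List.rev !log = [ "b"; "c"; "d"; "final" ])
```

Both recursive calls are in tail position, so flushing does not deepen the stack no matter how many tasks are queued.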