Discrimination trees for instance search #7109
I actually managed to get motivated to finish this today, though not before getting nerd-sniped by a GHC bug. Anyway, this PR is pretty much finished: the instances pulled in from the scope are now stored in a discrimination tree(-like) data structure (it has not changed since last week; the same remarks about the case-tree-like representation being an implementation of path compression apply). Other than wiring it up to the type checker, what did change was the handling of eta-equality. The simplest approach is to just replace everything that might be of an eta-record type by a dummy; this is simpler than the "correct" approach of having an
Perhaps more importantly, I added a billion profiling points everywhere. Running with
Then, for each "class", we have a block like this, tallying the performance of the discrimination tree. In order, we have the total number of instance constraints headed by this name; the total number of candidates that the discrimination tree excluded; the number of times that there was only one possible candidate (this includes locals), and the total number of
A refreshingly simple counter, this time for the function which implements sorting candidates for #6955. The 1Lab uses
Back to the useful statistics, we have a tally of what, if anything, made the discrimination tree matcher go exploring. "Exploration" is what I call what happens whenever you have something like
  instance
    foo : X A
    bar : X B
  ⊢ _ : X ?0
Since
The benchmarking for instance search also got an overhaul, since it was pretty hard to tell where time was being spent before; this included splitting off
Here are some more complete performance numbers for the 1Lab, this patch vs.
While 7 seconds might seem like a bit much spent looking up instances, keep in mind that the 1Lab calls instance search 4,695,879 times --- so, on average, looking up instance candidates takes 0.0016 ms per constraint. Here are some other interesting things I've noticed staring at the per-class statistics.
Overlap with sorts
This is now fixed
In the 1Lab, we have an instance
  Underlying-Type : ∀ {ℓ} → Underlying (Type ℓ)
  Underlying-n-Type : ∀ {ℓ n} → Underlying (n-Type ℓ n)
In the discrimination tree,
  class Underlying: attempts                 280,570
  class Underlying: discarded early          1,232,608
  class Underlying: total candidates visited 908,626
While quite a few instances are discarded early, not a single instance constraint had a single candidate after filtering with the discrimination tree. This could easily be remedied by adding a
Classes on universe-polymorphic data types
The discrimination tree hates classes like
  record Functor {adj : Level → Level} (M : ∀ {ℓ} → Type ℓ → Type (adj ℓ)) : Typeω where
    field
      map : ∀ {ℓ} {ℓ'} {A : Type ℓ} {B : Type ℓ'} → (A → B) → M A → M B
  instance
    Functor-List : Functor List
    Functor-List = record { map = go } where
      go : ∀ {ℓ ℓ'} {A : Type ℓ} {B : Type ℓ'} → (A → B) → List A → List B
      go f [] = []
      go f (x ∷ xs) = f x ∷ go f xs
The "key" generated for
The PathP conundrum
This is now fixed
Because instances are added under the normalised form of their types, any
Changing
Every single
This looks really awesome. I didn't have the time to really dive into the details of the implementation, but I scrolled through and left a few comments. In any case, since there are good performance improvements and there do not seem to be any regressions in the test suite I am happy to approve this.
I have no idea why but this causes the 1Lab's implementation of "deriving show" for lists to no longer termination-check; this would probably affect other users, too, and possibly even the stdlib/cubical test suites.
Selfishly bumping to include agda/agda#7109. Go forth and instance search
This PR optimizes instance search by storing global instances in a discrimination tree(-like; see below) data structure, and using this to narrow the list of possible candidates, instead of the previous strategy of linearly trying all instances in scope. The performance gains are significant: see this comment below for some numbers.
The initial lookup in the discrimination tree still has to be followed by a linear narrowing of the candidates using the conversion checker proper, since some (classes of) terms are all bunched together — e.g., all sorts become one Key, and non-constant lambda abstractions become wildcards.
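The bunching described above can be illustrated with a small sketch. It is a toy model in Python, not Agda's implementation: the term constructors (App, Sort, Lam, Var), the key shapes, and the "const-λ" key are all assumptions made up for this example. It shows sorts collapsing to one key, non-constant lambdas turning into wildcards, and a cheap key comparison that only rules candidates out; whatever survives would still have to go through the conversion checker.

```python
from dataclasses import dataclass

# Hypothetical miniature term language; none of these names come from Agda.
@dataclass(frozen=True)
class App:            # rigid head applied to arguments, e.g. Dec (PathP ...)
    head: str
    args: tuple = ()

@dataclass(frozen=True)
class Sort:           # a universe such as Type ℓ
    level: str

@dataclass(frozen=True)
class Lam:            # λ var → body
    var: str
    body: object

@dataclass(frozen=True)
class Var:            # bound variable or metavariable: never rigid
    name: str

WILD = "_"

def free_in(name, t):
    """Does the variable `name` occur free in term `t`?"""
    if isinstance(t, Var):
        return t.name == name
    if isinstance(t, App):
        return any(free_in(name, a) for a in t.args)
    if isinstance(t, Lam):
        return t.var != name and free_in(name, t.body)
    return False  # Sort

def keys(t):
    """Flatten a term into a preorder list of keys."""
    if isinstance(t, App):
        out = [(t.head, len(t.args))]
        for a in t.args:
            out += keys(a)
        return out
    if isinstance(t, Sort):
        return ["Sort"]                          # every sort gets the same key
    if isinstance(t, Lam) and not free_in(t.var, t.body):
        return [("const-λ", 1)] + keys(t.body)   # constant lambda: keep the body
    return [WILD]                                # variables, non-constant lambdas

def skip(ks, i):
    """Index just past the subterm that starts at position i."""
    arity = ks[i][1] if isinstance(ks[i], tuple) else 0
    i += 1
    for _ in range(arity):
        i = skip(ks, i)
    return i

def may_match(inst, goal):
    """Cheap pre-filter: False only if the two key lists definitely clash."""
    i = j = 0
    while i < len(inst) and j < len(goal):
        if inst[i] == WILD or goal[j] == WILD:
            i, j = skip(inst, i), skip(goal, j)
        elif inst[i] != goal[j]:
            return False
        else:
            i, j = i + 1, j + 1
    return i == len(inst) and j == len(goal)

def dec_path(family):
    """Build Dec (PathP family x y) with non-rigid endpoints."""
    return App("Dec", (App("PathP", (family, Var("x"), Var("y"))),))

dec_bool = keys(dec_path(Lam("i", App("Bool"))))
dec_nat = keys(dec_path(Lam("i", App("Nat"))))
goal = keys(dec_path(Lam("i", App("Nat"))))
survivors = [name for name, ks in [("Dec-Bool", dec_bool), ("Dec-Nat", dec_nat)]
             if may_match(ks, goal)]
print(survivors)
```

In this toy model, dropping the constant-lambda case from keys makes both instance heads carry a wildcard in the PathP family position, so both survive the filter; that is the accuracy-versus-work trade-off in miniature.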
The trade-off here is (essentially) how much work it takes to turn a term into a Key, vs. how many candidate checks we expect the extra accuracy to save. For example, an accurate representation of constant lambdas (or reflexive paths) takes more work, but allows us to narrow Dec (PathP (λ i → A) x y) instances based on A --- since, without this extra bit of work, Dec (PathP (λ i → Bool) x y) and Dec (PathP (λ i → Nat) x y) look indistinguishable to the discrimination tree.
The representation of the tree data structure implemented here looks less like a Trie, and more like the abstract machine's fast case trees. Indeed they are essentially the same, with the difference being entirely in how the lookup function uses the tree to traverse the term: rather than blocking on metavariables, we instead explore all the possibilities; and rather than having fallback cases, for when none of the branches matched, there is an "and-also" used to implement overlap. This representation essentially bakes in a form of "path compression" for the instance tree.
Consider this instance from the 1Lab:
Naïvely, we could turn the "instance head", Extensionality ..., into a list of Keys, by repeatedly destructuring the term and noting down the actual rigid symbols we run into. The result is as follows:
Here, _ is used wherever the term wasn't rigid enough. In addition to any variables bound by the type of the instance, they include (e.g.) level arguments. If we use this key to generate a literal Trie, there would be indirections standing for each of the underscores, including those at the end, which is quite expensive. Moreover, to look up in this trie, we would have to force the type of the instance constraint quite far, to generate the corresponding key.
By representing the instance tree through the phrasing of case trees, we simultaneously remove the extra indirections and record exactly what parts of the term need to be forced to look up the instance candidates. From the instance above, we obtain the case tree fragment
Note that the "variable to case on" field uses the same scoping as the AM's case trees, i.e. the term is unpacked into the spine for matching. This case tree precisely records that we have to force (a) the type of the instance meta itself, to find the class name; (b) the argument to Extensionality, to find out whether it is a function type; and (c) the domain of the function type, to find out whether it is a Σ.
The PR also adds benchmarking points for the various parts of instance search (and adds a benchmarking point for evalTCM, used in unquoting tactics), and quite verbose ticky profiling using --profile=instances; see this comment for information on how to read the output.