-
Notifications
You must be signed in to change notification settings - Fork 52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Imperfect Filtering #408
Comments
I am very much in favour of this. Both imperfect indexing and unification (or maybe some unification-like thing - UWA?) can be implemented very efficiently and concisely, but perfect indexing is much trickier. I'm also not convinced that perfect indexing pays off.
Could you expand on this a little? Maybe an example? I think there's two separate things to try here:
I think both are worth exploring, to be clear. |
Do we know of some papers evaluating indexing techniques in the past? Let's avoid common pitfalls and not try to reinvent the wheel ;) |
I am not suggesting abandoning substitution trees. The suggestion is to combine similar indices (your #2). For example, instead of having a substitution tree that indexes only left-hand sides for demodulation, and another that indexes only left-hand sides for superposition, we have a tree that contains both sets of terms. When performing a query on the index from demodulation, we may receive extra left-hand sides (those suitable for superposition, not demodulation) which will then need filtering out. Looking into discrimination trees and other techniques is also interesting, but an even more radical departure from the current code base. |
I see. This seems like a good idea. My only concern is that the performance win from combining indices (mostly spatial locality and reduced index maintenance, right?) will be lost by increased query time resulting a larger index. But it's worth a try! |
Here are a couple: https://www.sciencedirect.com/science/articl/pii/B978044450813350028X?via%3Dihub |
And of course: |
Another bullet point in this story is the removal of things from indices with AVATAR. If we’re doing imperfect induces/filtering then we can try leaving things in and filtering them. The reason this may help is that it’s likely that some things are removed and re added many times.
Also, keep Michael’s implementation of fingerprint indices in the story.
Finally, we abandoned dismatching constraints in inst gen as substitution trees were too inefficient.
…Sent from my iPhone
On 15 Aug 2022, at 08:10, ibnyusuf ***@***.***> wrote:
And of course:
https://people.mpi-inf.mpg.de/~hillen/compit/evaluation.pdf
—
Reply to this email directly, view it on GitHub<#408 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/AAJCILWIAIX4AJYHS5FAOLTVZJMX7ANCNFSM56SDWBGQ>.
You are receiving this because you are subscribed to this thread.Message ID: ***@***.***>
|
We should avoid this abstract discussion. The main question with indexing
is always the performance.
1. Have you done profiling on index maintenance and use?
2. If yes, then what is the indexing operation (like forward demodulation)
that is of potential concern?
For AVATAR: indeed, immediately when AVATAR was introduced, there was a
question of what is better:
a. all clauses are kept in the index
b. only current branch clauses are kept in the index
There was an idea to have it as an option. While (b) seemingly can make
index maintenance very expensive, we need experimental evidence of this.
I think what we need is to do thorough profiling on hundreds of thousands
of runs and find out whether ANY index essentially affects runtime on some
problems. This would probably mean using over 50% of the overall runtime.
If we don't do this, we may end up wasting time on discussing the issues
and even more time in implementing variants of indexes.
Best regards,
Andrei
|
Currently, Vampire code base maintains a lot of indices that work as perfect filters (return all and only terms required by theory).
It would be interesting to try replacing these with fewer indices that contain more terms each. Inferences then share indices and carry out post-procesing to filter out unwanted terms.
There is a trade-off in this proposal between time taken on managing indices vs time taken on post processing.
The text was updated successfully, but these errors were encountered: