Imperfect Filtering #408

ibnyusuf · 2022-08-15T12:41:30Z

Currently, Vampire code base maintains a lot of indices that work as perfect filters (return all and only terms required by theory).

It would be interesting to try replacing these with fewer indices that contain more terms each. Inferences then share indices and carry out post-procesing to filter out unwanted terms.

There is a trade-off in this proposal between time taken on managing indices vs time taken on post processing.

MichaelRawson · 2022-08-15T12:50:29Z

I am very much in favour of this. Both imperfect indexing and unification (or maybe some unification-like thing - UWA?) can be implemented very efficiently and concisely, but perfect indexing is much trickier. I'm also not convinced that perfect indexing pays off.

It would be interesting to try replacing these with fewer indices that contain more terms each. Inferences then share indices and carry out post-procesing to filter out unwanted terms.

Could you expand on this a little? Maybe an example? I think there's two separate things to try here:

imperfect indexing techniques like discrimination trees
coalescing multiple indices into one and filtering out unwanted items

I think both are worth exploring, to be clear.

quickbeam123 · 2022-08-15T13:33:40Z

Do we know of some papers evaluating indexing techniques in the past? Let's avoid common pitfalls and not try to reinvent the wheel ;)

ibnyusuf · 2022-08-15T13:35:07Z

I am not suggesting abandoning substitution trees. The suggestion is to combine similar indices (your #2). For example, instead of having a substitution tree that indexes only left-hand sides for demodulation, and another that indexes only left-hand sides for superposition, we have a tree that contains both sets of terms.

When performing a query on the index from demodulation, we may receive extra left-hand sides (those suitable for superposition, not demodulation) which will then need filtering out.

Looking into discrimination trees and other techniques is also interesting, but an even more radical departure from the current code base.

MichaelRawson · 2022-08-15T13:38:56Z

The suggestion, is to combine similar indices (your #2). For example, instead of having a substitution tree that indexes only left-hand sides for demodulation, and another that indexes only left-hand sides for superposition, we have a tree that contains both sets of terms.

I see. This seems like a good idea. My only concern is that the performance win from combining indices (mostly spatial locality and reduced index maintenance, right?) will be lost by increased query time resulting a larger index. But it's worth a try!

ibnyusuf · 2022-08-15T13:44:07Z

Do we know of some papers evaluating indexing techniques in the past? Let's avoid common pitfalls and not try to reinvent the wheel ;)

Here are a couple:

https://www.sciencedirect.com/science/articl/pii/B978044450813350028X?via%3Dihub
https://link.springer.com/content/pdf/10.1007/3-540-45744-5_18.pdf

ibnyusuf · 2022-08-15T15:10:44Z

And of course:
https://people.mpi-inf.mpg.de/~hillen/compit/evaluation.pdf

selig · 2022-08-16T03:21:28Z

Another bullet point in this story is the removal of things from indices with AVATAR. If we’re doing imperfect induces/filtering then we can try leaving things in and filtering them. The reason this may help is that it’s likely that some things are removed and re added many times. Also, keep Michael’s implementation of fingerprint indices in the story. Finally, we abandoned dismatching constraints in inst gen as substitution trees were too inefficient.

…

Sent from my iPhone On 15 Aug 2022, at 08:10, ibnyusuf ***@***.***> wrote: And of course: https://people.mpi-inf.mpg.de/~hillen/compit/evaluation.pdf — Reply to this email directly, view it on GitHub<#408 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/AAJCILWIAIX4AJYHS5FAOLTVZJMX7ANCNFSM56SDWBGQ>. You are receiving this because you are subscribed to this thread.Message ID: ***@***.***>

easychair · 2022-08-16T09:11:56Z

We should avoid this abstract discussion. The main question with indexing is always the performance. 1. Have you done profiling on index maintenance and use? 2. If yes, then what is the indexing operation (like forward demodulation) that is of potential concern? For AVATAR: indeed, immediately when AVATAR was introduced, there was a question of what is better: a. all clauses are kept in the index b. only current branch clauses are kept in the index There was an idea to have it as an option. While (b) seemingly can make index maintenance very expensive, we need experimental evidence of this. I think what we need is to do thorough profiling on hundreds of thousands of runs and find out whether ANY index essentially affects runtime on some problems. This would probably mean using over 50% of the overall runtime. If we don't do this, we may end up wasting time on discussing the issues and even more time in implementing variants of indexes. Best regards, Andrei

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Imperfect Filtering #408

Imperfect Filtering #408

ibnyusuf commented Aug 15, 2022

MichaelRawson commented Aug 15, 2022

quickbeam123 commented Aug 15, 2022

ibnyusuf commented Aug 15, 2022 •

edited

Loading

MichaelRawson commented Aug 15, 2022

ibnyusuf commented Aug 15, 2022

ibnyusuf commented Aug 15, 2022

selig commented Aug 16, 2022 via email

easychair commented Aug 16, 2022 via email

Imperfect Filtering #408

Imperfect Filtering #408

Comments

ibnyusuf commented Aug 15, 2022

MichaelRawson commented Aug 15, 2022

quickbeam123 commented Aug 15, 2022

ibnyusuf commented Aug 15, 2022 • edited Loading

MichaelRawson commented Aug 15, 2022

ibnyusuf commented Aug 15, 2022

ibnyusuf commented Aug 15, 2022

selig commented Aug 16, 2022 via email

easychair commented Aug 16, 2022 via email

ibnyusuf commented Aug 15, 2022 •

edited

Loading