reimplement foreign audit handling in resolver #240

Gankra · 2022-06-29T11:19:39Z

this code was hacked up in like week 1 and hasn't been touched since.

The main challenge to solve is criteria mapping. Foreign criteria are namespaced, but can be mapped into strict equivalences with local criteria (implies but in both directions). The two builtins should automap, but the user can also specify custom M:N mappings. How to apply that mapping efficiently is a bit of a concern for me. As far as I can tell, you need to do a quadratic amount of work in the worst-case to apply the mappings.

Consider the case of only 2:1 mappings, ([x] is set, [ ] is unset):

mapping 1: [ ][ ][ ][x][x] [ ][ ][ ][ ][ ] == [ ][ ][ ][ ][ ] [x][ ][ ][ ][ ]
mapping 2: [ ][ ][x][ ][ ] [x][ ][ ][ ][ ] == [ ][ ][ ][ ][ ] [ ][x][ ][ ][ ]
mapping 3: [ ][x][ ][ ][ ] [ ][x][ ][ ][ ] == [ ][ ][ ][ ][ ] [ ][ ][x][ ][ ]
mapping 4: [x][ ][ ][ ][ ] [ ][ ][x][ ][ ] == [ ][ ][ ][ ][ ] [ ][ ][ ][x][ ]

Then with this input, we have a cascading series of equalities:

[x][x][x][x][x] [ ][ ][ ][ ][ ]
[x][x][x][x][x] [x][ ][ ][ ][ ]  via mapping 1
[x][x][x][x][x] [x][x][ ][ ][ ]  via mapping 2
[x][x][x][x][x] [x][x][x][ ][ ]  via mapping 3
[x][x][x][x][x] [x][x][x][x][ ]  via mapping 4

At each step we have one criteria that wasn't involved before, and one criteria that only came into existence in the last step. I believe that this shows that you cannot simply do as we currently do with "implies" where you compute some kind of transitive closure ahead of time. Nor can you do a random-access lookup against your currently set criteria for what rules definitely apply.

Seemingly this means that you need to linearly scan over every mapping rule in order and apply them. Then you just need to do it all over again until you hit a fixed-point. Such a process would be O(k * j) with k criteria and j rules. The process will definitely terminate at this point as we are always making progress on each linear scan. Note that this cost would have to be paid for every union_with or set_criteria we do. So if we do g of those, we're talking about O(k * j * g).

One possible way to optimize this would be to have a mapping from individual criteria to all the rules that refer to them. So whenever we apply a rule, we can know what criteria are newly set and only check mappings that involve those criteria. Such a mapping would have O(k * j) size worst-case, and therefore take just as long to compute, but that would be a fixed cost instead of a cost we pay on every operation. Sadly I think you could still tune such an input such that the each entry has O(j) rules so this doesn't gain much in theory. In practice it might gain a lot, as most rules will only refer to a few criteria, and we will go from having to naively check every rule to knowing exactly what rules to check.

This is probably overthinking things but the resolver is a combinatoric time bomb, so shaving off multiplicative factors is worth some consideration. Also I have insomnia and apparently that means algorithms brain.

The text was updated successfully, but these errors were encountered:

Gankra · 2022-06-29T11:31:16Z

See also #16 and #4. Those discussions were a bit hairy because we had such a rough understanding of how the resolver would work and the semantics of criteria. These days I'm pretty confident in the semantics and this seems to make sense.

In the past I was thrown off by wanting to map foreign audit entries into local criteria only, but this proves problematic for dependency-criteria and unmapped criteria. You're punching a lot of semantic holes in their audits and playing with things that look a lot like "variance" on functions with types in the outputs (criteria) and inputs (dependency-criteria), although I think the variance stuff is mostly resolved by making all the mappings "strict equality" instead of only one-way.

I am now of the opinion that it's much simpler to just have all local and foreign criteria live at once, to reapply all equality rules at every step, and to evaluate foreign audits in their native criteria. The above comment is one issue that comes from doing this. The other issue is that the set of criteria we have to track can grow much faster, blowing the budget on CriteriaSet's current static allocation of 64 bits. We may need to move to a hybrid where we can fallback to bitvec. The code has always anticipated this, which is why CriteriaSets aren't Copy and need to be explicitly initialized with their "capacity" (number of criteria). Still I'd like to avoid doing that for as long as we can.

bholley · 2022-06-29T21:37:53Z

Would it simplify things if we only allowed 1:1 mappings for now? I'm not sure how widely custom criteria maps will be used, and so I don't want to put too much engineering effort into them before we have a clearer sense of the use-cases.

Gankra · 2022-06-29T22:42:43Z

Yeah it it's always 1:1 we can do some transitive closure stuff again and use the implies machinery.

bholley · 2022-06-29T22:48:28Z

Let's do that for now and then see if it's an issue in practice.

Gankra added the p1 do now label Jun 29, 2022

This was referenced Jun 30, 2022

implement builtin criteria automapping #65

Closed

criteria should be namespaced #16

Closed

Audit imports corner cases #4

Closed

Gankra mentioned this issue Jul 20, 2022

namespace and map criteria properly #273

Merged

Gankra closed this as completed in #273 Jul 25, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

reimplement foreign audit handling in resolver #240

reimplement foreign audit handling in resolver #240

Gankra commented Jun 29, 2022

Gankra commented Jun 29, 2022 •

edited

bholley commented Jun 29, 2022

Gankra commented Jun 29, 2022

bholley commented Jun 29, 2022

reimplement foreign audit handling in resolver #240

reimplement foreign audit handling in resolver #240

Comments

Gankra commented Jun 29, 2022

Gankra commented Jun 29, 2022 • edited

bholley commented Jun 29, 2022

Gankra commented Jun 29, 2022

bholley commented Jun 29, 2022

Gankra commented Jun 29, 2022 •

edited