Enable unions in lockstep tests #614
a96a9cb to
46b6c90
Compare
This commit should be dropped before merging, in favour of #614.
We can rebase this once #608 is merged.
46b6c90 to
402b713
Compare
jorisdral
left a comment
Nice nice!
genUnionTableVar = QC.frequency [
    (9 * length unionDescendantTableVars, genUnionDescendantTableVar)
  , (1 * length notUnionDescendantTableVars, genNotUnionDescendantTableVar)
  ]
So the idea is to skew more towards union descendants?
Yes, though in a follow-up patch I made this even more extreme and moved the trivial cases of non-union tables to a unit test, and replaced genUnionTableVar with genUnionDescendantTableVar so the QLS test only covers the non-trivial cases of union-derived tables.
BTW, the original code does not do what it intends to do. It intends to have 90% : 10% skew toward union tables, but in fact the result is 40% : 60% skew, i.e. 60% are trivial non-union tables.
I fix this in a later patch (by removing the trivial case entirely), but it could also be fixed by moving the two cases up into the parent QC.frequency, each with their corresponding guard. That way we don't get forced into picking the trivial case when there are no non-trivial cases available. It's very common that there are no union tables available.
👍 but even in the parent QC.frequency you'd get a different distribution than the integers you supply, depending on what actions are available and which guards are true/false. But I agree it's still better
Yes you would, but that should then work as intended.
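For illustration, here is a minimal sketch of the arithmetic behind the inner generator. It assumes only that QC.frequency picks each alternative in proportion to its weight; `probabilities` and `unionShare` are hypothetical helpers that are not part of the test suite, and the sketch does not try to reproduce the 40% : 60% figure measured above. It shows two ways the intended 90% : 10% split can fail to appear: the weights scale with the number of candidate tables, and a weight of 0 forces the other branch.

```haskell
import Data.Ratio ((%))

-- | Exact probability of each alternative in a weighted choice.
-- Alternatives with weight 0 can never be picked, so they are dropped.
probabilities :: [(Int, a)] -> [(Rational, a)]
probabilities alts =
    [ (toInteger w % toInteger total, x) | (w, x) <- alts, w > 0 ]
  where
    total = sum (map fst alts)

-- | Share of draws that land on a union descendant, given the number of
-- union-descendant table variables and the number of other table variables.
-- The weights mirror the ones in the snippet under review.
unionShare :: Int -> Int -> Rational
unionShare nUnion nOther =
    sum [ p | (p, True) <- probabilities alts ]
  where
    alts = [ (9 * nUnion, True), (1 * nOther, False) ]

main :: IO ()
main = do
  print (unionShare 1 1)  -- 9 % 10: equal populations give the intended skew
  print (unionShare 1 9)  -- 1 % 2:  many non-union tables dilute it
  print (unionShare 0 5)  -- 0 % 1:  no union tables forces the trivial case
```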
402b713 to
72d1f12
Compare
@jorisdral I see my changes raced with your review, but I happened to address some of your comments.
449d67f to
c66028b
Compare
dcoutts
left a comment
Obviously I'm also reviewing my own changes here, so I think @mheinzel should review this one too now.
-- We already want to enable unions, but some operations on tables don't
-- support unions yet. Therefore, we want to only run them on tables that
-- don't descend from a union.
With the extra changes I've added to this PR, there's now another reason: because we don't want to squander QLS coverage on boring trivial cases that are now covered in unit tests.
--
-- Tests for 0-way and 1-way unions are included in the UnitTests
-- module. n-way unions for n>3 lead to large unions, which are less
-- Tests for 1-way unions are included in the UnitTests module.
The point being that there are no 0-way tests (the type forbids it), so it's not actually covered in the unit tests.
[ genActionsSession
, genActionsTables
, genUnionActions
, genActionsUnion
just for consistency, while we're at it
-- | The state of model tables at the point they were closed. This is used
-- to augment the tables from the final model state (which of course has
-- only tables still open in the final state).
, closedTables :: !(Map Model.TableID Model.SomeTable)
It turned out that I never needed this change (as I had expected I would) but I've left it here anyway since it's a cheap and sane generalisation, and may indeed be useful later.
-- | The subset of tables (open or closed) that were created as a result
-- of a union operation. This can be used for example to select subsets of
-- the other per-table tracking maps above, or the state from the model.
-- The map value is the size of the union table at the point it was created,
-- so we can distinguish trivial empty unions from non-trivial.
, unionTables :: !(Map Model.TableID Int)
BTW, the reason we need this is that the model tables do not in fact tell us that they're derived from a union, despite the existence of isUnionDescendant :: Model.Table k v b -> IsUnionDescendant. That's because Model.Table k v b and Model.Table k v b are not the same type. One is Database.LSMTree.Model.Session as Model and the other is Database.LSMTree.Model.Table as Model. The latter is the one we get from the model state, and does not tell us about unions. The former is really a table handle, while the latter is a table state.
Otherwise I would have used the model final state + the closed tables (generalised in the previous patch) and selected a subset using isUnionDescendant.
But as it is, we have to track union tables again in the stats.
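As a rough sketch of how a map like unionTables could feed the trivial versus non-trivial labelling, the following counts both kinds. It assumes that "trivial" means a union whose size was 0 when it was created; `TableId` and `countUnions` are hypothetical names for this illustration, not the ones used in the PR.

```haskell
import qualified Data.Map.Strict as Map
import           Data.Map.Strict (Map)

-- | Stand-in for Model.TableID (assumption for this sketch).
type TableId = Int

-- | Count (trivial, non-trivial) union tables, where the map value is the
-- size of the union table at the point it was created.
countUnions :: Map TableId Int -> (Int, Int)
countUnions unionTables = (Map.size trivial, Map.size nonTrivial)
  where
    (trivial, nonTrivial) = Map.partition (== 0) unionTables

main :: IO ()
main =
  -- Tables 1 and 3 were empty unions, table 2 had four entries.
  print (countUnions (Map.fromList [(1, 0), (2, 4), (3, 0)]))  -- (2,1)
```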
"That's because Model.Table k v b and Model.Table k v b are not the same type. One is Database.LSMTree.Model.Session as Model and the other is Database.LSMTree.Model.Table as Model."
This is a slightly confusing sentence. Maybe we should qualify the imports differently? Or is there a typo?
It is a confusing sentence reflecting the confusion I encountered in debugging it :-)
In each individual module the imports are clear, but it goes like this:
module Test.Database.LSMTree.StateMachine where
import qualified Database.LSMTree.Model.Session as Model
module Database.LSMTree.Model.Session where
import qualified Database.LSMTree.Model.Table as Model
so for code like
, let unionTables =
Map.filter (Model.withSomeTable (\t -> Model.isUnionDescendant t == Model.IsUnionDescendant))
(snd <$> Model.tables finalState)
we get
• Couldn't match expected type: Model.Table k40 v0 b0
with actual type: Database.LSMTree.Model.Table.Table k v b
NB: ‘Model.Table’ is defined in ‘Database.LSMTree.Model.Session’
‘Database.LSMTree.Model.Table.Table’
is defined in ‘Database.LSMTree.Model.Table’
• In the first argument of ‘Model.isUnionDescendant’, namely ‘t’
which is sort-of clear actually. GHC does tell us the definition sites.
inserts table' [(Key1 17, Value1 43, Nothing)]
inserts table'' [(Key1 17, Value1 44, Nothing)]
Oops! The untested unit test was buggy!
Which no doubt git blame will tell me was my fault!
It was all me 😜
  | credits <= 0 = do
      _ <- guardTableIsOpen t
      pure c
It's a bit more consistent/regular to specify it this way. We don't really expect the impl to allow supplying 0 credits to closed tables without an error. All operations on closed tables are an error, trivial or not.
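A simplified sketch of that rule, under stated assumptions: the `Table` record, the `Err` type, and the leftover arithmetic in the general case are all hypothetical and stand in for the real model. Only the shape matters here: the open-table check runs in every branch, including the one for zero credits.

```haskell
data Err = ErrTableClosed
  deriving (Show, Eq)

-- | Hypothetical model table: whether it is open, and its outstanding
-- union debt (0 for a table with no union level).
data Table = Table
  { tableOpen :: Bool
  , unionDebt :: Int
  }

guardTableIsOpen :: Table -> Either Err ()
guardTableIsOpen t
  | tableOpen t = Right ()
  | otherwise   = Left ErrTableClosed

-- | Returns the leftover credits. The closed-table check comes first in
-- both branches, so supplying 0 credits to a closed table is still an error.
supplyUnionCredits :: Table -> Int -> Either Err Int
supplyUnionCredits t credits
  | credits <= 0 = do
      _ <- guardTableIsOpen t
      pure credits                          -- assumption: returned unchanged
  | otherwise = do
      _ <- guardTableIsOpen t
      pure (max 0 (credits - unionDebt t))  -- assumption: debt is paid first

main :: IO ()
main = do
  print (supplyUnionCredits (Table False 0) 0)  -- Left ErrTableClosed
  print (supplyUnionCredits (Table True 0) 5)   -- Right 5
  print (supplyUnionCredits (Table True 3) 5)   -- Right 2
```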
genUnionTableVar = QC.frequency [
    (9 * length unionDescendantTableVars, genUnionDescendantTableVar)
  , (1 * length notUnionDescendantTableVars, genNotUnionDescendantTableVar)
  ]
This nesting of QC.frequency in an outer QC.frequency doesn't do what one might expect. The decisions are dependent rather than independent.
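One possible shape for the flattened alternative, sketched with hypothetical `Action` and `TableVar` types and made-up weights: each case sits directly in the outer QC.frequency list behind its own guard, so an unavailable case simply drops out instead of forcing the other one. As noted earlier in the thread, the resulting distribution still depends on which guards hold at any given point.

```haskell
import qualified Test.QuickCheck as QC

newtype TableVar = TableVar Int
  deriving (Show, Eq)

-- | Hypothetical actions, standing in for the real state machine actions.
data Action
  = NewTable
  | SupplyCredits TableVar         -- on a union-descendant table
  | SupplyCreditsTrivial TableVar  -- on any other table
  deriving (Show, Eq)

-- | Weighted alternatives for the outer choice. A guarded case contributes
-- nothing when it has no candidate table variables.
actionWeights :: [TableVar] -> [TableVar] -> [(Int, QC.Gen Action)]
actionWeights unionVars otherVars =
     [ (5, pure NewTable) ]  -- always available, so the list is never empty
  ++ [ (9, SupplyCredits <$> QC.elements unionVars) | not (null unionVars) ]
  ++ [ (1, SupplyCreditsTrivial <$> QC.elements otherVars) | not (null otherVars) ]

genAction :: [TableVar] -> [TableVar] -> QC.Gen Action
genAction unionVars otherVars = QC.frequency (actionWeights unionVars otherVars)

main :: IO ()
main = do
  actions <- QC.generate
               (QC.vectorOf 10 (genAction [TableVar 1] [TableVar 2, TableVar 3]))
  mapM_ print actions
```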
--TODO: remove this 0 special case once the general case covers it.
-- We do not need to optimise the 0 case. It is just here to
-- simplify test coverage.
| otherwise -> error "supplyUnionCredits: not yet implemented"
I figured we would more reliably notice this TODO and resolve it when we implement the general case, rather than notice a TODO on an out-of-the-way unit test to say to enable it once supplyUnionCredits was implemented. So better to have the unit test running, and temporarily implement the special case in the real impl.
If @jorisdral and @mheinzel are happy, we can squash the fixes and merge.
1cde123 to
07c4354
Compare
The parentTable in the Stats tracks the ultimate parent for each table. This is used along with the action log to measure coverage of operations on table duplicates. With unions, a table can now have multiple ultimate parents. A union table's ultimate parents are the ultimate parents of each of its input tables. Also deal with the consequences of multiple parent tables for the dupTableActionLog tracking: we extend the action log for _each_ ultimate parent table separately.
Co-authored-by: Joris Dral <joris@well-typed.com>
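The parent tracking described in this commit message could look roughly like the sketch below. The types and function names are hypothetical, and it assumes that a freshly created table counts as its own ultimate parent; the real Stats record may represent this differently.

```haskell
import qualified Data.Map.Strict as Map
import           Data.Map.Strict (Map)
import qualified Data.Set as Set
import           Data.Set (Set)

type TableId = Int

-- | Hypothetical tracking map: the set of ultimate parents of each table.
type Parents = Map TableId (Set TableId)

lookupParents :: TableId -> Parents -> Set TableId
lookupParents t = Map.findWithDefault (Set.singleton t) t

-- | A new table is its own ultimate parent (assumption of this sketch).
newTable :: TableId -> Parents -> Parents
newTable t = Map.insert t (Set.singleton t)

-- | A duplicate inherits the ultimate parents of its source table.
duplicate :: TableId -> TableId -> Parents -> Parents
duplicate src new ps = Map.insert new (lookupParents src ps) ps

-- | A union's ultimate parents are those of each of its input tables.
union :: [TableId] -> TableId -> Parents -> Parents
union inputs new ps =
    Map.insert new (Set.unions (map (`lookupParents` ps) inputs)) ps

-- | Extend the action log of each ultimate parent separately (newest first).
logAction :: String -> TableId -> Parents
          -> Map TableId [String] -> Map TableId [String]
logAction act t ps log0 =
    foldr (\p -> Map.insertWith (++) p [act]) log0
          (Set.toList (lookupParents t ps))

main :: IO ()
main = do
  let ps = union [1, 3] 4 (duplicate 2 3 (newTable 2 (newTable 1 Map.empty)))
  print (lookupParents 4 ps)                                -- fromList [1,2]
  print (Map.toList (logAction "Lookups" 4 ps Map.empty))
```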
Track the whole closed table, not just the table size. This will let us reuse the same tracking for additional purposes.
Label the number of tables with trivial and non-trivial unions. Label the number of actions on tables from non-trivial unions.
Can be enabled now that unions are implemented. And of course, because the test was not being run, there was a previously undetected bug in the unit test. Now fixed.
…redits The trivial case is when the table has no union level, as then there is no union debt, and supplying credits returns them all as leftovers. And implement these trivial cases.
We cover the trivial case of querying or supplying credits to tables that are not derived from a union operation in the unit tests, so we do not need to spend coverage space in the QLS tests on the trivial cases. So only generate RemainingUnionDebt and SupplyUnionCredits for tables that are derived from union operations.
Same principle as previous patches. Keep the precious coverage space in the QLS tests for the non-trivial cases, and cover the trivial case in a unit test.
e837a83 to
a3c5cd0
Compare
Slightly better memory use in the tests.
Description
Should go on top of #608 and #609. We should also add some union-related labelling.