Skip to content

Conversation

@jorisdral
Copy link
Collaborator

@jorisdral jorisdral commented Sep 26, 2024

No description provided.

@jorisdral jorisdral self-assigned this Sep 26, 2024
@jorisdral jorisdral force-pushed the jdral/unify-IO-models branch from 56d1c73 to 850b4a8 Compare September 26, 2024 12:09
@dcoutts
Copy link
Collaborator

dcoutts commented Sep 26, 2024

A bit of explanation of motivation would be good here.

Copy link
Collaborator

@dcoutts dcoutts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks ok, and it's nice to have a clear structure and less duplication. What I want to see somewhere in these models and instances is a high level explanation of them all. A reader coming to see and understand needs some guidance. And then elsewhere perhaps links back to the central explanation.


{-------------------------------------------------------------------------------
Model instance
Model 2 instance
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Want some docs somewhere that says what we're doing here with these different models. There's a hierarchy, but no explanation of what it is.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a short module description for each model, but it could probably still need some more documentation.

What I want to see somewhere in these models and instances is a high level explanation of them all.

Not sure where best to put that, but it could be make use of the comparison in #403 (comment).

@jorisdral jorisdral force-pushed the jdral/unify-IO-models branch from 416c70b to 4ce8ef4 Compare October 8, 2024 15:52
@jorisdral jorisdral assigned mheinzel and unassigned jorisdral Oct 15, 2024
@mheinzel mheinzel force-pushed the jdral/unify-IO-models branch 3 times, most recently from 4809594 to 4d88ef3 Compare October 21, 2024 12:27
@mheinzel
Copy link
Collaborator

A refactoring to reduce code duplication. Previously, multiple models (table-level, session-level, mutable API on top) had been implemented twice, once for normal tables and once for monoidal tables.

The main change is that now the table- and session-level don't make a distinction between normal and monoidal tables any more. This also comes with some other re-organisation, which IMO makes the structure and dependencies between different models and tests clearer.

The are now three models, each depending on the previous one:

  • Model.Table: a pure model of a single table (e.g. updates :: [(k, Update v b)] -> Table k v b -> Table k v b)
  • Model.Session: a pure model of a session (e.g. updates :: MonadState Model m => [(k, Update v b)] -> TableHandle k v b -> m ())
  • Model.IO.{Normal,Monoidal}: STM-based models of an explicit (potentially closed) session (e.g. updates :: MonadSTM m => MSession m -> [(k, Update v b)] -> TableHandle k v b -> m ()), separated between normal/monoidal, which requires some type conversions such as convLookupResult

In a way, the StateMachine tests are a fourth model, based on Model.Session.

@jorisdral, since I picked up the work from you, I left the original commit in place (only rebased it), so you can see what changed from when it was handed over. A few things I changed:

  • moved modules around to have all models under a Model directory, each with a small description.
  • re-added export lists
  • avoided convLookupResult creeping into observeReal etc., and the need for convLookupResult' etc. in general.
  • avoided ResolveValue creeping into any Normal modules by passing ResolveSerialisedValue as an argument instead of using a class in Model.Session

@mheinzel mheinzel marked this pull request as ready for review October 21, 2024 12:42
@mheinzel mheinzel force-pushed the jdral/unify-IO-models branch from 4d88ef3 to ed722d6 Compare October 21, 2024 14:03
Comment on lines 1 to 10
-- | A pure model of a session, supporting both blobs and mupserts.
module Database.LSMTree.Model.Session (
-- * Model
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not important enough to change, but maybe the main type in this module should not be called Model, but Session? It models a session, similar to how Table models a table.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jorisdral Do you think we should rename it?

Copy link
Collaborator Author

@jorisdral jorisdral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating the PR! Few things to address, and then we can merge 🚀

...

The are now three models, each depending on the previous one:

* `Model.Table`: a pure model of a single table (e.g. `updates :: [(k, Update v b)] -> Table k v b -> Table k v b`)

* `Model.Session`: a pure model of a session (e.g. `updates :: MonadState Model m => [(k, Update v b)] -> TableHandle k v b -> m ()`)

* `Model.IO.{Normal,Monoidal}`: STM-based models of an explicit (potentially closed) session (e.g. `updates :: MonadSTM m => MSession m -> [(k, Update v b)] -> TableHandle k v b -> m ()`), separated between normal/monoidal, which requires some type conversions such as `convLookupResult`

In a way, the StateMachine tests are a fourth model, based on Model.Session.

You could put this overview of the hierarchy, maybe extended to also mention the class modules and normal/monoidal, in a top-level module that only contains a bit of documentation. For example, a new Database.LSMTree.ModelHierarchy module, or something similar.

Also, isn't the StateMachine model "just" the session model? I wouldn't say it adds any functionality over the session model, it is just adding some boilerplate for the lockstep framework

@jorisdral, since I picked up the work from you, I left the original commit in place (only rebased it), so you can see what changed from when it was handed over. A few things I changed:

* moved modules around to have all models under a `Model` directory, each with a small description.

* re-added export lists

* avoided `convLookupResult` creeping into `observeReal` etc., and the need for `convLookupResult'` etc. in general.

* avoided `ResolveValue` creeping into any `Normal` modules by passing `ResolveSerialisedValue` as an argument instead of using a class in `Model.Session`

Regarding avoiding convLookupResult, what it the concrete change that made this possible?

Regarding ResolveValue, I don't think it was "bad" to have it in the normal modules, but it's nice that it is no longer required

, MCursor (..)
, MErr (..)
, MBlobRef (..)
, module Types
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a big fan of re-exports, because it makes it hard to follow where identifiers are coming from. However, it's also weird that one would have to import a bunch of other modules. What do you think?

If we change this, we should make sure it's consistent across the model/class modules

I'm currently leaning towards having no re-exports in the Class and Model modules. The model/class hierarchy is already a little bit complicated, and re-rexports of identifiers might not help

Also, I haven't really been consistent with re-exports vs. no re-exports in these modules myself, so I'm mainly criticising myself 😝

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it makes a big difference, but let me share some of my reasoning.

IMHO it's sometimes a bit clearer to have the re-export, for example with convLookupResult :: Model.LookupResult v b -> Class.LookupResult v b it makes it clear that it exists because the model's type doesn't quite fit the class we want to implement. If it was something like -> Real.LookupResult, the purpose is already less obvious.

Another way of thinking about it: The class uses a lot of the same types as the public API of the implementation. Now the question is: Is that a consciously chosen property of the class? If so, we can expect downstream code to also import something like LSMTree.Normal. If it is more of an implementation detail, we should re-export the types and use them as Class._, which allows using/implementing the class without considering the implementation, and also makes it easier to swap types out later.

{-# OPTIONS_GHC -Wno-missing-export-lists #-}

module Database.LSMTree.TableModel where
-- | A pure model of a single table, supporting both blobs and mupserts.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
-- | A pure model of a single table, supporting both blobs and mupserts.
-- | A pure model of a single table.
--
-- This model supports all the features for /both/ normal and monoidal tables. For example, the table model supports both blobs and mupserts. The former is typically only provided by the normal API, and the latter only by the monoidal API.

Comment on lines +45 to +55
updates :: V.Vector (Key, Update Value Blob) -> Table Key Value Blob -> Table Key Value Blob
updates = Model.updates Model.getResolve

inserts :: V.Vector (Key, Value, Maybe Blob) -> Table Key Value Blob -> Table Key Value Blob
inserts = Model.inserts Model.getResolve

deletes :: V.Vector Key -> Table Key Value Blob -> Table Key Value Blob
deletes = Model.deletes Model.getResolve

mupserts :: V.Vector (Key, Value) -> Table Key Value Blob -> Table Key Value Blob
mupserts = Model.mupserts Model.getResolve
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without the ResolveValue -> ResolveSerialisedValue change, we would not need these functions

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it adds a little bit of boilerplate here (didn't want to make the properties themselves more noisy), but I think it's overall still nicer than having spurious ResolveValue constraints show up.

@mheinzel
Copy link
Collaborator

Thanks for the review, good points!

Regarding avoiding convLookupResult, what it the concrete change that made this possible?

-    MLookupResult :: (Model.C_ v, Model.C_ blob)
-                  => Model.LookupResult v (Val h (WrapBlobRef h IO blob))
-                  -> Val h (R.LookupResult v (WrapBlobRef h IO blob))
+    MLookupResult :: (Class.C_ v, Class.C_ blob)
+                  => LookupResult v (Val h (WrapBlobRef h IO blob))
+                  -> Val h (LookupResult v (WrapBlobRef h IO blob))

And same for OLookupResult. So the conversion from Model.LookupResult (which can contain mupserts and blobs) to the real/Normal version happens as part of runModel.

@mheinzel mheinzel force-pushed the jdral/unify-IO-models branch from ed722d6 to bf808b3 Compare October 24, 2024 20:01
@mheinzel
Copy link
Collaborator

mheinzel commented Oct 24, 2024

I rebased and addressed all feedback. Just needs to be squashed.

Copy link
Collaborator Author

@jorisdral jorisdral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I can't approve, so you'll have to do it @mheinzel

I rebased and addressed all feedback. Just needs to be squashed.

And yes, let's squash the commits. Make sure to add yourself and me as author for the resulting commit

Co-authored-by: Joris Dral <joris@well-typed.com>
@mheinzel mheinzel force-pushed the jdral/unify-IO-models branch from bf808b3 to 6907a73 Compare October 25, 2024 11:44
@mheinzel mheinzel enabled auto-merge October 25, 2024 11:47
@mheinzel mheinzel added this pull request to the merge queue Oct 25, 2024
Merged via the queue into main with commit 1678933 Oct 25, 2024
24 checks passed
@mheinzel mheinzel deleted the jdral/unify-IO-models branch October 25, 2024 12:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants