Add projections #135

Merged
merged 4 commits into from
Nov 10, 2022

@the-mikedavis the-mikedavis commented Aug 22, 2022

Projections are a system for creating replicated ETS caches available on all members of a Khepri cluster. A new function khepri_projection:new/3 creates a projection resource:

ProjectionName = wood_stocks,
ProjectionFun = fun([stock, wood, Kind], Stock) -> {Kind, Stock} end,
Options = #{type => set, read_concurrency => true},
Projection = khepri_projection:new(ProjectionName, ProjectionFun, Options).

The projection resource may then be registered against a path pattern, which selects the tree nodes it covers, with khepri:register_projection/4:

StoreId = stock,
PathPattern = "/:stock/:wood/*",
RegisterOptions = #{},
khepri:register_projection(
  StoreId, PathPattern, Projection, RegisterOptions).

Once registered, the projection's ETS table is created and then filled with any existing records matching the pattern. Paths and data are passed through the ProjectionFun to create records which are then stored in the table. Any future changes to records which match the PathPattern are also applied to the projection table.
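To make the fill-and-update behaviour concrete, here is a minimal sketch that uses only `ets` and does not call Khepri at all. The module name `projection_sketch`, the helper `fill/3`, and the table name `wood_stocks_sketch` are hypothetical and exist only for illustration; the real work is done inside the store's state machine.

```erlang
-module(projection_sketch).
-export([demo/0]).

%% Hypothetical stand-in for what the store does for a projection:
%% run every matching {Path, Data} pair through the projection
%% function and insert the resulting record into the ETS table.
fill(Table, ProjectionFun, Entries) ->
    lists:foreach(
      fun({Path, Data}) ->
              true = ets:insert(Table, ProjectionFun(Path, Data))
      end, Entries).

demo() ->
    Table = wood_stocks_sketch,
    ProjectionFun = fun([stock, wood, Kind], Stock) -> {Kind, Stock} end,
    %% Named table with the same options as in the description.
    Table = ets:new(Table, [set, named_table, {read_concurrency, true}]),
    %% Pretend these two tree nodes already existed at registration.
    fill(Table, ProjectionFun,
         [{[stock, wood, <<"oak">>], 100},
          {[stock, wood, <<"birch">>], 42}]),
    %% A later change to a matching node overwrites its record.
    fill(Table, ProjectionFun, [{[stock, wood, <<"oak">>], 80}]),
    Result = {ets:lookup_element(Table, <<"oak">>, 2),
              ets:lookup_element(Table, <<"birch">>, 2)},
    true = ets:delete(Table),
    Result.
```

Because the table type is `set`, a second record with the same key replaces the first, which is how an update to an existing tree node shows up in the table.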

The projection table is a named ETS table which may be queried directly with ets:

khepri:put(StoreId, "/:stock/:wood/oak", 100),
ets:lookup_element(ProjectionName, <<"oak">>, 2).
%%=> 100

Projections should be used to maximize query throughput and/or minimize query latency at the cost of consistency and memory consumption. Projections have the same consistency guarantees as queries run with the #{favor => low_latency} option. Data is at least partially duplicated between the store and the projection table, so increased memory consumption is expected for projections whose path patterns match many nodes.

codecov bot commented Aug 22, 2022

Codecov Report

Base: 90.96% // Head: 89.36% // Decreases project coverage by -1.60% ⚠️

Coverage data is based on head (64df3a4) compared to base (c77bac6).
Patch coverage: 84.44% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #135      +/-   ##
==========================================
- Coverage   90.96%   89.36%   -1.61%     
==========================================
  Files          18       19       +1     
  Lines        3167     3292     +125     
==========================================
+ Hits         2881     2942      +61     
- Misses        286      350      +64     
Flag               Coverage Δ
erlang-24          ?
erlang-25          89.36% <84.44%> (-0.37%) ⬇️
os-ubuntu-latest   89.36% <84.44%> (-1.48%) ⬇️
os-windows-latest  ?

Impacted Files             Coverage Δ
src/khepri_tx_adv.erl      77.34% <ø> (-3.14%) ⬇️
src/khepri.erl             91.70% <33.33%> (-3.23%) ⬇️
src/khepri_projection.erl  83.33% <83.33%> (ø)
src/khepri_machine.erl     93.94% <94.20%> (-0.78%) ⬇️
src/khepri_fun.erl         89.91% <0.00%> (-3.51%) ⬇️

@dumbbell dumbbell left a comment

I'm so glad about this addition! This is really cool :-)

I submitted several comments. I could split my feedback into two topics:

  • Some documentation comments
  • The way payloads are passed from the state machine to the projection function

About the payloads, I suggest we do what the rest of the library does: call khepri_machine:gather_node_props() to create a node properties map, pass this common format to khepri_projection:trigger(), and then decide in that function how to call the projection function.

I think we could pass the entire node properties maps (old and new) to the extended projection function, and reduce payloads to bare data for the simple one (possibly undefined for the old data in case a payload is added). The simple function will never be called when the new payload has no data.

What do you think?

@the-mikedavis the-mikedavis force-pushed the md-projections branch 2 times, most recently from 34d66d0 to 9c99147 Compare November 8, 2022 16:42
@the-mikedavis

Thanks for the feedback @dumbbell!

I like the idea of passing khepri:node_props() around rather than payloads 👍. I switched to that and now khepri_projection:extended_projection_fun()s take full node-props maps too.

For the {simple | extended, Fun} part: originally I thought we could avoid using standalone funs for simple projections (we can't because the funs are persisted to the log) and that tag is a relic from that. I switched to just holding on to a standalone fun and checking the arity.
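The arity check discussed above could look roughly like the sketch below. It is not the code from this pull request: the module name, `trigger/5`, and the choice of arity 4 with the argument order `(Table, Path, OldProps, NewProps)` for the extended function are assumptions made for illustration. Node properties are modelled as plain maps with an optional `data` key.

```erlang
-module(projection_dispatch_sketch).
-export([trigger/5, demo/0]).

%% Simple function (arity 2): receives the path and the bare new
%% data; its return value is inserted as a record.
trigger(Fun, Table, Path, _OldProps, #{data := Data})
  when is_function(Fun, 2) ->
    true = ets:insert(Table, Fun(Path, Data)),
    ok;
%% Simple function, but the new properties carry no data: the
%% function is not called. This sketch leaves any old record alone.
trigger(Fun, _Table, _Path, _OldProps, _NewProps)
  when is_function(Fun, 2) ->
    ok;
%% Extended function (arity 4 is an assumption): receives the table
%% and both properties maps, and manages the table itself.
trigger(Fun, Table, Path, OldProps, NewProps)
  when is_function(Fun, 4) ->
    _ = Fun(Table, Path, OldProps, NewProps),
    ok.

demo() ->
    Table = ets:new(dispatch_sketch, [set]),
    Simple = fun([stock, wood, Kind], Stock) -> {Kind, Stock} end,
    Extended = fun(T, [stock, wood, Kind], _Old, New) ->
                       case New of
                           #{data := Stock} -> ets:insert(T, {Kind, Stock});
                           _ -> ets:delete(T, Kind)
                       end
               end,
    Oak = [stock, wood, <<"oak">>],
    Birch = [stock, wood, <<"birch">>],
    ok = trigger(Simple, Table, Oak, #{}, #{data => 100}),
    ok = trigger(Simple, Table, Oak, #{data => 100}, #{}),
    ok = trigger(Extended, Table, Birch, #{}, #{data => 7}),
    ok = trigger(Extended, Table, Birch, #{data => 7}, #{}),
    Result = {ets:lookup(Table, <<"oak">>), ets:lookup(Table, <<"birch">>)},
    true = ets:delete(Table),
    Result.
```

Dispatching on arity keeps the stored value a single standalone fun, with no `{simple | extended, Fun}` tag to carry around.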

@dumbbell dumbbell left a comment

Thanks for the changes, it looks good to me. I have a few more tiny comments on the documentation and one question for the simple projection function.

Ra 2.4.0 introduces new reply modes which control which member of a
cluster performs the `gen_statem:reply/2` to a command. This can be
used to block the caller until the command has been handled by a given
member. The child commit will introduce a way to pass the new reply
modes in `khepri:command_options()`.

This option wraps the equivalent option in `ra:process_command/3` -
introduced in the dependency upgrade in the parent commit - which
controls which member of the cluster calls `gen_statem:reply/2`. This
can be used to block until a command has been handled by the local
member which can prevent consistency issues with projections. A caller
of `khepri:put/4` might assume that a subsequent `ets:lookup/2` (or
similar) call is up-to-date immediately. By calling `khepri:put/4` with
the `#{reply_from => local}` option, the caller can rely on this
behavior.
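
A write followed by a local projection read might then look like the sketch below. It assumes a running Khepri store that already has the `wood_stocks` projection from the pull request description registered; the module and function names are hypothetical, and the return value of `khepri:put/4` is passed through untouched because its exact shape depends on the Khepri version.

```erlang
-module(reply_from_sketch).
-export([put_then_read/3]).

%% Assumes a running store `StoreId' with the `wood_stocks'
%% projection registered on the local member.
put_then_read(StoreId, Kind, Stock) when is_binary(Kind) ->
    Path = [stock, wood, Kind],
    %% Block until the local member has applied the command.
    Ret = khepri:put(StoreId, Path, Stock, #{reply_from => local}),
    %% Direct ETS read from the local projection table.
    {Ret, ets:lookup(wood_stocks, Kind)}.
```

Without the `reply_from` option, the reply may come from another member before the local one has applied the command, so the lookup could still return the previous record.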
@the-mikedavis the-mikedavis force-pushed the md-projections branch 3 times, most recently from da046b6 to 8cd3925 Compare November 9, 2022 18:23
Projections are similar in spirit to database views. They are a subset
of the store collected into an ETS table which is replicated to all
members of the Khepri cluster. Implementation-wise, projections are
similar to triggers except that projections are updated synchronously
by the machine and are triggered on all changes to a given path
pattern.

Projections can be used for cases where reads are much more common than
writes or in cases where read throughput and/or latency are important.

This change initially introduces projections with the following
additions to the API:

* `khepri_projection:new/3` - Create a projection record.
* `khepri_projection:name/1` - Retrieve the name (the ETS table name)
  of the projection. (Necessary because the projection record is
  opaque).
* `khepri:register_projection/4` - Register a projection within the
  store against a given path pattern.

The change also includes the internal implementation in
`khepri_machine` and the associated documentation.

With this change, if any projections exist, they are listed in the
output of `khepri:info/2`.
@dumbbell dumbbell merged commit 7f0b8b8 into rabbitmq:main Nov 10, 2022
@the-mikedavis the-mikedavis deleted the md-projections branch November 10, 2022 12:54
@dumbbell dumbbell added this to the v0.6.0 milestone Nov 14, 2022
Labels
enhancement New feature or request