Implement `reduce` and `dependency` task logic #144

sandreae · 2022-06-04T10:44:02Z

`reduce`

`dependency`

~~retrieve relation graph from db (including parent and children)~~ check child pinned relations are materialised
return reduce tasks for every missing child dependency
error on deleted documents
error handling
tests

📋 Checklist

Add tests that cover your changes
Add this PR to the Unreleased section in CHANGELOG.md
Link this PR to any issues it closes
New files contain a SPDX license header

codecov · 2022-06-04T10:49:56Z

Codecov Report

Merging #144 (b0d0124) into development (32d12e7) will increase coverage by 0.95%.
The diff coverage is 96.05%.

@@               Coverage Diff               @@
##           development     #144      +/-   ##
===============================================
+ Coverage        87.63%   88.59%   +0.95%     
===============================================
  Files               42       42              
  Lines             2055     2280     +225     
===============================================
+ Hits              1801     2020     +219     
- Misses             254      260       +6

| Impacted Files | Coverage Δ | |
|---|---|---|
| aquadoggo/src/db/provider.rs | `87.50% <ø> (ø)` | |
| aquadoggo/src/db/stores/document.rs | `93.46% <ø> (-0.04%)` | ⬇️ |
| aquadoggo/src/materializer/input.rs | `100.00% <ø> (+100.00%)` | ⬆️ |
| aquadoggo/src/materializer/tasks/schema.rs | `0.00% <ø> (ø)` | |
| aquadoggo/src/materializer/worker.rs | `88.34% <ø> (ø)` | |
| aquadoggo/src/materializer/tasks/reduce.rs | `93.80% <93.80%> (+93.80%)` | ⬆️ |
| aquadoggo/src/materializer/tasks/dependency.rs | `98.26% <98.26%> (+98.26%)` | ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 32d12e7...b0d0124.

adzialocha

Just writing some comments while we're going as I was very curious 🤩

This is really really really beautiful, the tasks are so small and powerful!

aquadoggo/src/materializer/tasks/reduce.rs

aquadoggo/src/materializer/tasks/dependency.rs

adzialocha · 2022-06-08T22:15:45Z

The error handling is funny as we probably end up only having one critical error type or Ok(None) 🤣

There could be two approaches to the semantics of the return values:

Ok(...) Task finished without any critical problem
Err(Critical) Task failed with a critical problem

.. or:

Ok(...) Task finished and did its job
Err(Failure) Task finished but couldn't do its job
Err(Critical) Task failed with a critical problem
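To make the second approach concrete, here is a minimal sketch of what the three outcomes could look like as Rust types. Everything in it is hypothetical and only for illustration: `TaskError`, `Task`, `TaskResult`, `Outcome`, `run_task` and `is_fatal` are assumed names, not the types used in this PR.

```rust
use std::fmt;

/// Hypothetical error type for the second approach: two kinds of `Err`.
#[derive(Debug, PartialEq)]
pub enum TaskError {
    /// Task finished but couldn't do its job.
    Failure,
    /// Task failed with a critical problem.
    Critical(String),
}

impl fmt::Display for TaskError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            TaskError::Failure => write!(f, "task could not do its job"),
            TaskError::Critical(reason) => write!(f, "critical task error: {}", reason),
        }
    }
}

impl std::error::Error for TaskError {}

/// Hypothetical stand-in for a follow-up task dispatched by a finished task.
#[derive(Debug, PartialEq)]
pub struct Task(pub &'static str);

/// `Ok(None)` means the task finished and has nothing further to dispatch.
pub type TaskResult = Result<Option<Vec<Task>>, TaskError>;

/// Hypothetical situations a task can end up in (assumption for this sketch).
pub enum Outcome {
    Done,
    NothingToDo,
    CouldNotComplete,
    StorageBroken,
}

/// Maps each situation onto one of the three return values.
pub fn run_task(outcome: Outcome) -> TaskResult {
    match outcome {
        Outcome::Done => Ok(Some(vec![Task("dependency")])),
        Outcome::NothingToDo => Ok(None),
        Outcome::CouldNotComplete => Err(TaskError::Failure),
        Outcome::StorageBroken => Err(TaskError::Critical("storage unavailable".into())),
    }
}

/// Only a critical error needs special treatment by whoever runs the task.
pub fn is_fatal(result: &TaskResult) -> bool {
    matches!(result, Err(TaskError::Critical(_)))
}
```

With the first approach the `Failure` variant would disappear and those cases would be folded into `Ok(None)`, which is why we probably end up with only one error type there.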

sandreae · 2022-06-09T09:18:59Z

Thanks for all the comments @adzialocha! I will make those changes.

When you have time, you can also look over the dependency task logic. It seems a little more nuanced than I originally imagined. The current logic is:

for "parent" dependencies which have been materialised (the only kind) => dispatch a dependency task
for "child" dependencies which have been materialised => do nothing
for "child" dependencies which have not been materialised => dispatch a reduce task
for documents or document views which have all their child dependencies met => dispatch a schema task

"parent dependency" = a document relating to the document being processed by the task
"child dependency" = a document which the task's document relates to
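The four rules above can be sketched roughly like this. It is a simplified illustration, not the code in `dependency.rs`: `Relation`, `Dependency`, `Task` and `dependency_task` are hypothetical names, and the materialised state is passed in as a flag instead of being looked up in the store.

```rust
/// Hypothetical: how a related document stands to the task's document.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Relation {
    /// A document relating to the document being processed by the task.
    Parent,
    /// A document which the task's document relates to.
    Child,
}

/// Hypothetical, flattened view of one dependency (assumption for this sketch).
pub struct Dependency {
    pub id: &'static str,
    pub relation: Relation,
    pub materialised: bool,
}

/// Hypothetical follow-up tasks, identified by the document they act on.
#[derive(Debug, PartialEq)]
pub enum Task {
    Reduce(&'static str),
    Dependency(&'static str),
    Schema(&'static str),
}

/// Applies the rules listed above and returns the tasks to dispatch next.
pub fn dependency_task(document: &'static str, deps: &[Dependency]) -> Vec<Task> {
    let mut next = Vec::new();
    let mut all_children_met = true;

    for dep in deps {
        match (dep.relation, dep.materialised) {
            // Materialised parent: dispatch a dependency task for it.
            (Relation::Parent, true) => next.push(Task::Dependency(dep.id)),
            // Not expected, parents are only known once they are materialised.
            (Relation::Parent, false) => {}
            // Materialised child: nothing to do.
            (Relation::Child, true) => {}
            // Missing child: dispatch a reduce task to materialise it.
            (Relation::Child, false) => {
                all_children_met = false;
                next.push(Task::Reduce(dep.id));
            }
        }
    }

    // All child dependencies met: dispatch a schema task.
    if all_children_met {
        next.push(Task::Schema(document));
    }

    next
}
```

For a document with one materialised parent, one materialised child and one missing child, this sketch returns a dependency task for the parent and a reduce task for the missing child, and no schema task until the missing child is there.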

adzialocha

I haven't reflected yet on the new schema store in this context but I think we currently don't need a schema task with this db layout.

I'd add it as soon as we build the service 👍

For now we can build the schema through schema store on every request. That's naive of course but it will be fine for now!

Later we can add the schema service which will be used to query schema instances in memory (as a caching layer), it will be populated during node startup and filled with new schemas in a schema task (only if the wanted schema was missing during startup). Later we can deprecate / delete schemas in the task when the related document got deleted.
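A rough sketch of how such a caching layer could look, just to illustrate the idea. All names here (`Schema`, `SchemaService`, `from_startup`, `insert_if_missing`, `remove`) are assumptions made up for this example, and a real service would need shared access across tasks, which is left out.

```rust
use std::collections::HashMap;

/// Hypothetical, heavily simplified schema instance.
#[derive(Debug, Clone, PartialEq)]
pub struct Schema {
    pub id: String,
    pub fields: Vec<String>,
}

/// Hypothetical in-memory cache of schema instances, keyed by schema id.
#[derive(Default)]
pub struct SchemaService {
    cache: HashMap<String, Schema>,
}

impl SchemaService {
    /// Populates the cache once during node startup.
    pub fn from_startup(schemas: Vec<Schema>) -> Self {
        let cache = schemas.into_iter().map(|s| (s.id.clone(), s)).collect();
        Self { cache }
    }

    /// Queries a schema instance from memory.
    pub fn get(&self, id: &str) -> Option<&Schema> {
        self.cache.get(id)
    }

    /// Called from a schema task: adds the schema only if it was missing.
    /// Returns `true` when the schema was newly inserted.
    pub fn insert_if_missing(&mut self, schema: Schema) -> bool {
        if self.cache.contains_key(&schema.id) {
            return false;
        }
        self.cache.insert(schema.id.clone(), schema);
        true
    }

    /// Called from a schema task when the related document got deleted.
    pub fn remove(&mut self, id: &str) -> Option<Schema> {
        self.cache.remove(id)
    }
}
```

Until something like this exists, building the schema through the schema store on every request plays the same role, just without the cache.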

sandreae · 2022-06-09T16:20:23Z

> but I think we currently don't need a schema task with this db layout

Yep, that makes sense.

> Later we can add the schema service which will be used to query schema instances in memory (as a caching layer), it will be populated during node startup and filled with new schemas in a schema task (only if the wanted schema was missing during startup). Later we can deprecate / delete schemas in the task when the related document got deleted.

Cool, this is what I imagined we'd be doing as well 👍

adzialocha · 2022-06-13T10:52:13Z

> for "parent" dependencies which have been materialised (the only kind) => dispatch a dependency task

This looks really close to the specification we have in the Whimsical 👍 except for this one point, where I would expect this to dispatch a reduce task (which again will dispatch a dependency task), ~~we might otherwise miss out on materialising the specific document view of this parent (it might be pinned as well from another parent).~~ (that turned out to be wrong after meeting with @sandreae, we actually don't need to check for parents at all)

aquadoggo/src/materializer/tasks/dependency.rs

aquadoggo/src/materializer/worker.rs

cafca

I added some documentation for the dependency task, I think the same could be very helpful for the reducer (but dinner is ready so I stop now ;) ). The control flow is quite deeply nested with multiple matches. Flattening that out would probably help make this maintainable.

Overall this is good to go though :)

aquadoggo/src/materializer/tasks/dependency.rs

aquadoggo/src/materializer/tasks/reduce.rs

sandreae · 2022-06-15T09:13:50Z

Thanks for the review @cafca, I improved the docs and did a little flattening, I like it! Think I covered everything now.

adzialocha · 2022-06-15T15:59:18Z

Alright! I did some crazy rebase-kung-fu and now this is ready for merging into development! Also, I had a go at @cafca's idea to improve the control flow nesting in `reduce_task` by factoring everything into smaller methods. Please have a (hopefully) last look, especially @sandreae, I hope I didn't mess with your code too much.

sandreae · 2022-06-15T16:19:28Z

Thanks! That's great 👍 For some reason I thought @cafca's comment was about the dependency task...

* development:
  * Implement `reduce` and `dependency` task logic (#144)
  * Send published entries to materializer (#161)
  * Updates for use of `VerifiedOperation` (#158)
  * A backlink is not a skiplink (#163)
  * Update README.md

* development: (42 commits)
  * Implement `reduce` and `dependency` task logic (#144)
  * Send published entries to materializer (#161)
  * Updates for use of `VerifiedOperation` (#158)
  * A backlink is not a skiplink (#163)
  * Update README.md
  * Remove clippy warnings
  * Update `p2panda-rs` API & refactor tests (#147)
  * fmt
  * Rename `ES` generic type param to `EntryStore`
  * Add better instructions for regenerating the schema file.
  * Rename `skiplinks` to `certificate_pool` and regen schema
  * Use cafca's wording for error message.
  * Fix typo
  * Target p2panda main (#142)
  * Fix a clipp warn
  * clippy --fix
  * Add rt-multi-thread tokio feature to fix build of dump_schema
  * Fix naming of param in SingleEntryAndPayload
  * Add docstring to the `AliasedAuthor` type.
  * Fix import order of tests
  * ...

sandreae changed the title ~~WIP: implement document materialisation and storage~~ Materialise and store documents and views in reduce_task Jun 4, 2022

sandreae changed the title ~~Materialise and store documents and views in reduce_task~~ Materialise and store document views in reduce_task Jun 4, 2022

sandreae changed the title ~~Materialise and store document views in reduce_task~~ Implement reduce and dependency task logic Jun 8, 2022

adzialocha reviewed Jun 8, 2022

This was linked to issues Jun 9, 2022

Dependency task in worker #106

Closed

Schema task in worker #107

Closed

adzialocha reviewed Jun 9, 2022

This was referenced Jun 10, 2022

How do we deal with deleted documents in the reduce task? #150

Open

StorageProvider methods [tracking] #149

Closed

adzialocha linked an issue Jun 13, 2022 that may be closed by this pull request

Use task queue for materialization and resolving relations of documents #59

Closed

adzialocha removed a link to an issue Jun 13, 2022

Schema task in worker #107

Closed

sandreae linked an issue Jun 13, 2022 that may be closed by this pull request

Reduce task in worker #105

Closed

sandreae marked this pull request as ready for review June 13, 2022 15:48

sandreae force-pushed the implement-reducer-task branch from 5ce599f to a1e0e16 Compare June 14, 2022 11:15

sandreae changed the base branch from development to updates-for-verified-operation June 14, 2022 11:15

adzialocha requested changes Jun 14, 2022

sandreae requested a review from adzialocha June 14, 2022 13:00

adzialocha approved these changes Jun 14, 2022

cafca approved these changes Jun 14, 2022

sandreae requested a review from cafca June 15, 2022 09:14

adzialocha changed the base branch from updates-for-verified-operation to development June 15, 2022 14:22

sandreae added 3 commits June 15, 2022 16:59

WIP: implement document materialisation and storage

7dd1adf

WIP: Implement document_view materialisation

f19c136

Bump p2panda_rs branch

be401bf

sandreae and others added 22 commits June 15, 2022 17:02

Handle errors

11dd828

Completely redo dependency task logic :lolz:

ddda36c

Add test for deleted documents

82188c2

Schema task

2b68e43

Refactoring and commetns in reduce

54e4a9b

Comment for helper tasks

17d0aa4

A few more comments

a5f9415

Some more comments

2be82ba

We don't need to check parent dependencies in dependency task

e8eca47

Remove code relating to schema task

89d4357

Fix reduce error test

15ff1a0

Make clippy happy

e2193a7

Update CHANGELOG

6f8b43d

Changes after rebase

89783ac

Review changes

b788737

Rephrase dependency task docstrings

1bb76ec

Refactoring in dependency task

1c6bd0a

More detailed docs for sependency task

95abba9

Reduce task doc string

7263bfc

Remove comment about wanted schema

1f33743

Fix rebase

43bd91a

Fix missing docstring after rebase

ba3d649

adzialocha force-pushed the implement-reducer-task branch from c94379d to ba3d649 Compare June 15, 2022 15:09

Try to improve control flow in reduce task as well

b0d0124

adzialocha merged commit 72df98b into development Jun 15, 2022

adzialocha deleted the implement-reducer-task branch June 29, 2022 14:00
