@awrichar
Contributor

First step toward hyperledger/firefly-fir#10

@codecov-commenter
codecov-commenter commented Feb 14, 2022

Codecov Report

Merging #517 (3369cea) into main (82b8d54) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##              main      #517    +/-   ##
==========================================
  Coverage   100.00%   100.00%            
==========================================
  Files          295       303     +8     
  Lines        17000     17386   +386     
==========================================
+ Hits         17000     17386   +386     
Impacted Files Coverage Δ
internal/config/config.go 100.00% <ø> (ø)
internal/txcommon/txcommon.go 100.00% <ø> (ø)
pkg/fftypes/operation.go 100.00% <ø> (ø)
internal/apiserver/route_post_op_retry.go 100.00% <100.00%> (ø)
internal/assets/manager.go 100.00% <100.00%> (ø)
internal/assets/operations.go 100.00% <100.00%> (ø)
internal/assets/token_approval.go 100.00% <100.00%> (ø)
internal/assets/token_pool.go 100.00% <100.00%> (ø)
internal/assets/token_transfer.go 100.00% <100.00%> (ø)
internal/batch/batch_processor.go 100.00% <100.00%> (ø)
... and 21 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@awrichar awrichar force-pushed the opmanager branch 3 times, most recently from 7b66d2b to 4ff5b01 February 14, 2022 20:30
@awrichar awrichar marked this pull request as draft February 16, 2022 15:39
Signed-off-by: Andrew Richardson <andrew.richardson@kaleido.io>
Other managers can register to handle particular operation types. This
keeps the logic for each type concentrated in the owning Manager instead
of giving too much specialized knowledge to the Operations Manager.

Also introduce a serializable PreparedOperation type for wrapping
operations before they are sent off to plugins - makes for a neater
split between parsing and running operations, and may also be useful for
tests/debugging later.

Signed-off-by: Andrew Richardson <andrew.richardson@kaleido.io>
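
The registration pattern described in the commit message above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `OpType`, `OperationHandler`, `Registry`, `RegisterHandler`, and the fields of `PreparedOperation` are hypothetical names chosen for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// OpType identifies a kind of operation (hypothetical type for this sketch).
type OpType string

// PreparedOperation is a serializable wrapper for an operation whose inputs
// have already been parsed and are ready to hand to a plugin.
// The field layout here is an assumption, not the PR's actual struct.
type PreparedOperation struct {
	ID   string          `json:"id"`
	Type OpType          `json:"type"`
	Data json.RawMessage `json:"data"`
}

// OperationHandler is implemented by each manager that owns an operation type.
type OperationHandler interface {
	RunOperation(op *PreparedOperation) error
}

// Registry routes prepared operations to the handler registered for their type,
// so it needs no specialized knowledge of any individual operation type.
type Registry struct {
	handlers map[OpType]OperationHandler
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{handlers: make(map[OpType]OperationHandler)}
}

// RegisterHandler records h as the owner of each listed operation type.
func (r *Registry) RegisterHandler(h OperationHandler, types ...OpType) {
	for _, t := range types {
		r.handlers[t] = h
	}
}

// RunOperation dispatches op to its owning handler, or fails if none is registered.
func (r *Registry) RunOperation(op *PreparedOperation) error {
	h, ok := r.handlers[op.Type]
	if !ok {
		return fmt.Errorf("no handler registered for operation type %q", op.Type)
	}
	return h.RunOperation(op)
}

// assetsManager stands in for a manager that owns token operations.
type assetsManager struct{ ran []string }

// RunOperation records the ID of each operation it was asked to run.
func (am *assetsManager) RunOperation(op *PreparedOperation) error {
	am.ran = append(am.ran, op.ID)
	return nil
}
```

In this sketch a manager would call `RegisterHandler` once at startup for the types it owns, and callers would only ever go through `Registry.RunOperation`.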
Signed-off-by: Andrew Richardson <andrew.richardson@kaleido.io>
Signed-off-by: Andrew Richardson <andrew.richardson@kaleido.io>
Signed-off-by: Andrew Richardson <andrew.richardson@kaleido.io>
Signed-off-by: Andrew Richardson <andrew.richardson@kaleido.io>
@awrichar awrichar changed the title Add Operations Manager and use for token operations Add Operations Manager and use for all operations Feb 17, 2022
@awrichar awrichar marked this pull request as ready for review February 17, 2022 01:43
@awrichar awrichar requested a review from nickgaski as a code owner February 17, 2022 20:47
@awrichar awrichar changed the title Add Operations Manager and use for all operations Add Operations Manager, wrap all operations, and add retry functionality Feb 17, 2022
Retrying an operation that has already been retried will cause it to look
up the newest copy of the operation, and retry that one. In this way,
retries will always form a single chain, and attempting to re-run any of
them will always add a new one to the end of the chain.

Signed-off-by: Andrew Richardson <andrew.richardson@kaleido.io>
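
The single-chain behavior described above can be illustrated with a small in-memory sketch. It assumes each operation stores the ID of the operation that retried it (the `Retry` field, the `Store` type, and the function names are hypothetical and stand in for the real database layer).

```go
package main

import "fmt"

// Operation is a pared-down stand-in for an operation record.
// Retry (hypothetical name) holds the ID of the operation that superseded it.
type Operation struct {
	ID    string
	Retry string // empty when this operation has not been retried
}

// Store is an in-memory stand-in for the operations table.
type Store struct {
	ops    map[string]*Operation
	nextID int
}

// NewStore returns an empty store.
func NewStore() *Store { return &Store{ops: make(map[string]*Operation)} }

// Insert creates and stores a fresh operation with a generated ID.
func (s *Store) Insert() *Operation {
	s.nextID++
	op := &Operation{ID: fmt.Sprintf("op%d", s.nextID)}
	s.ops[op.ID] = op
	return op
}

// newest follows Retry links from the given operation to the end of its chain.
func (s *Store) newest(id string) (*Operation, error) {
	op, ok := s.ops[id]
	if !ok {
		return nil, fmt.Errorf("operation %s not found", id)
	}
	for op.Retry != "" {
		next, ok := s.ops[op.Retry]
		if !ok {
			return nil, fmt.Errorf("operation %s not found", op.Retry)
		}
		op = next
	}
	return op, nil
}

// RetryOperation appends a new operation to the end of the chain containing id,
// regardless of which member of the chain was named.
func (s *Store) RetryOperation(id string) (*Operation, error) {
	tail, err := s.newest(id)
	if err != nil {
		return nil, err
	}
	retry := s.Insert()
	tail.Retry = retry.ID
	return retry, nil
}
```

Retrying the first operation twice in this sketch yields a chain of three operations rather than two branches off the first.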
@awrichar
Contributor Author

I've continued working this so it now encompasses all operations and actually adds the /retry route for operations. Unit tests and basic manual verification look good. This now fully implements hyperledger/firefly-fir#10, except for the unresolved questions listed there and the E2E tests.

Contributor

@peterbroadhurst peterbroadhurst left a comment

Really think this worked out well. A very clear template for what an operation means in the codebase, that's easy to extend 👍

Some minor comments to consider @awrichar

@@ -0,0 +1,3 @@
BEGIN;
ALTER TABLE operations DROP COLUMN retry_id;
Contributor

Just a note that you'll need to get a different slot (think currently 70 is next after PRs in the pipe)

Contributor Author

➡️ ➡️ ➡️ ➡️ ➡️


func (om *operationsManager) writeOperationSuccess(ctx context.Context, opID *fftypes.UUID) {
	if err := om.database.ResolveOperation(ctx, opID, fftypes.OpStatusSucceeded, "", nil); err != nil {
		log.L(ctx).Errorf("Failed to update operation %s: %s", opID, err)
Contributor

Think it would be good to write the full data of the operation here - particularly the outputs - as they would have been lost.

Contributor Author

We don't have any path that writes operation outputs here. I'm not sure there's any more data to capture beyond what is in the log. Need to think further to be sure there's no time we should be capturing outputs here...

Contributor

Ok - I understand now the ResolveOperation call is only modifying one field on the operation, it's not the call that does the full update with the outputs etc.


func (om *operationsManager) writeOperationFailure(ctx context.Context, opID *fftypes.UUID, err error) {
	if err := om.database.ResolveOperation(ctx, opID, fftypes.OpStatusFailed, err.Error(), nil); err != nil {
		log.L(ctx).Errorf("Failed to update operation %s: %s", opID, err)
Contributor

As above: Think it would be good to write the full data of the operation here - particularly the outputs - as they would have been lost.

Contributor Author

Same as above - the only "outputs" from a synchronous failure are in the form of a Go error which is logged here. I'm not sure there's anything else useful to log... unless it's a sign that we're not returning enough info from some other layer.

InsertOperation(ctx context.Context, operation *fftypes.Operation) (err error)

// ResolveOperation - Resolve operation upon completion
ResolveOperation(ctx context.Context, id *fftypes.UUID, status fftypes.OpStatus, errorMsg string, output fftypes.JSONObject) (err error)
Contributor

Now we have a generic UpdateOperation - I wonder if it would be more consistent to remove the ResolveOperation from the DB layer as that's just a thin wrapper:

func (s *SQLCommon) ResolveOperation(ctx context.Context, id *fftypes.UUID, status fftypes.OpStatus, errorMsg string, output fftypes.JSONObject) (err error) {
	update := database.OperationQueryFactory.NewUpdate(ctx).
		Set("status", status).
		Set("error", errorMsg)
	if output != nil {
		update.Set("output", output)
	}
	return s.updateOperation(ctx, id, update)
}

Contributor Author

Yea, was thinking the same thing. I can do this.

Contributor Author

After looking at this again, there are a lot of places where we invoke ResolveOperation, and this change would take all of those from 1 line to 5 lines. So it would be more consistent, but more verbose as well. I slightly prefer keeping the concise helper.

Could also move the helper somewhere other than the database layer. For instance, it could be on the Operations Manager (but would then need to add a dependency from Event Manager on Operations Manager).

awrichar added 2 commits March 1, 2022 17:23
Includes support for token approvals in Operations Manager.
Signed-off-by: Andrew Richardson <andrew.richardson@kaleido.io>
Signed-off-by: Andrew Richardson <andrew.richardson@kaleido.io>
Contributor

@peterbroadhurst peterbroadhurst left a comment

I've marked approval @awrichar as I think you were proposing this goes in at this point, and the comments are taken as input for future work in other changes.

Plugin: op.Plugin,
Input: op.Input,
}
key, err := json.Marshal(opCopy)
Contributor

This feels like extra evidence for the point that we should treat the inputs as references, rather than full data, wherever possible, as this is going to be a large string cache key that we are marshaling.

Contributor

@peterbroadhurst peterbroadhurst Mar 2, 2022

... as an aside I still agree serialized JSON is a more efficient cache key than a SHA etc. - so the choice looks right here to me

Contributor Author

Cool, that was my inkling as well but wanted to have someone else back me up. But agreed it's yet another reason to minimize the size of "Input" as much as is practical.
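
As a rough illustration of the cache-key approach discussed in this thread, the sketch below marshals only the identifying fields of an operation to JSON and uses the resulting string as the key. The `Operation` and `opKey` types and the `CacheKey` function are hypothetical simplifications; only the idea of copying `Plugin` and `Input` into a reduced struct and calling `json.Marshal` comes from the snippet under review.

```go
package main

import "encoding/json"

// Operation is a simplified, hypothetical stand-in for the real operation type.
type Operation struct {
	ID     string
	Status string
	Type   string
	Plugin string
	Input  map[string]interface{}
}

// opKey holds only the fields that make two operations equivalent for caching.
// ID and Status are deliberately left out so that a retry of the same work
// produces the same key.
type opKey struct {
	Type   string                 `json:"type"`
	Plugin string                 `json:"plugin"`
	Input  map[string]interface{} `json:"input"`
}

// CacheKey serializes the identifying fields to JSON. encoding/json writes map
// keys in sorted order, so equal inputs give equal strings.
func CacheKey(op *Operation) (string, error) {
	key, err := json.Marshal(opKey{Type: op.Type, Plugin: op.Plugin, Input: op.Input})
	if err != nil {
		return "", err
	}
	return string(key), nil
}
```

Because the key length grows with the size of `Input`, a sketch like this also shows why keeping inputs small (references rather than full data) keeps the cache cheap.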

Contributor

@peterbroadhurst peterbroadhurst left a comment

Thanks for adding the caching. Like that implementation, and think it's great we'll have de-dup on retry within the batch processor.

@awrichar awrichar merged commit ea83d3a into hyperledger:main Mar 3, 2022
@awrichar awrichar deleted the opmanager branch March 3, 2022 18:21