Allow sending a message with a token transfer #245

awrichar · 2021-10-12T20:52:38Z

The message hash will be recorded with the transfer, and the message will
not be considered confirmed until the transfer is also confirmed.

Signed-off-by: Andrew Richardson <andrew.richardson@kaleido.io>

awrichar · 2021-10-12T20:58:32Z

internal/assets/manager.go

+	}
+	switch in.Header.Type {
+	case fftypes.MessageTypeTransferBroadcast:
+		return am.broadcast.BroadcastMessage(ctx, ns, in, false)


Note that even in the waitConfirm=true case, the message is dispatched async. This is because dispatching the message synchronously would block forever waiting for the transfer to arrive.

Technically we could build and seal the message without sending it in this case, then record the hash, then send the transfer synchronously, then send the message synchronously. I spent some time going down that road and it requires some non-trivial refactoring of the message sending logic, so I tabled it for the moment.

This means that the current behavior of waitConfirm=true guarantees the token transfer has occurred, but does not totally guarantee the message data has been received (although it should have been in most cases).
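A minimal Go sketch of the alternative ordering described above (seal the message without sending it, record the hash, send the transfer, then send the message). The `Sender` interface, the `recorder` fake and the simplified `Message`/`Transfer` structs are assumptions made for illustration and are not the FireFly types; each send is assumed to block until confirmation.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Message and Transfer are simplified stand-ins for illustration only.
type Message struct {
	Data string
	Hash string
}

type Transfer struct {
	Amount      int
	MessageHash string
}

// Sender is a hypothetical interface; each call is assumed to block until confirmed.
type Sender interface {
	SendTransfer(t *Transfer) error
	SendMessage(m *Message) error
}

// seal computes the message hash without sending anything.
func seal(m *Message) {
	sum := sha256.Sum256([]byte(m.Data))
	m.Hash = hex.EncodeToString(sum[:])
}

// transferWithMessage seals the message, records its hash on the transfer,
// then sends the transfer followed by the message.
func transferWithMessage(s Sender, m *Message, t *Transfer) error {
	seal(m)
	t.MessageHash = m.Hash
	if err := s.SendTransfer(t); err != nil {
		return fmt.Errorf("transfer failed, message not sent: %w", err)
	}
	return s.SendMessage(m)
}

// recorder is a fake Sender that logs the order of calls.
type recorder struct{ calls []string }

func (r *recorder) SendTransfer(t *Transfer) error {
	r.calls = append(r.calls, "transfer")
	return nil
}

func (r *recorder) SendMessage(m *Message) error {
	r.calls = append(r.calls, "message")
	return nil
}

func main() {
	r := &recorder{}
	m := &Message{Data: "payment for invoice 42"}
	t := &Transfer{Amount: 10}
	if err := transferWithMessage(r, m, t); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(r.calls, t.MessageHash == m.Hash)
}
```

With this ordering the caller only returns once both sends have completed, which is the guarantee the current waitConfirm=true path does not give for the message.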

peterbroadhurst · 2021-10-13T12:03:04Z

My understanding of where we were aiming with this feature was that the transaction object would be the thing we generate an event on, and so the sync/async code would need a path where the thing in flight is not a message but a higher-level transaction.

... going to look through the rest of the changes to see if we've learned more that means this approach didn't work out, or if we're still on that path.

awrichar

There is a transaction in flight, but it's possible I'm resolving that prematurely. We discussed holding the message confirmation until the transfer completes, which is what I tackled in the event aggregation - but I think I am still missing a few pieces of the puzzle here.

peterbroadhurst · 2021-10-14T12:07:38Z

Just to close the loop on the comments here: per the summary in #245 (comment), when we're done with this we will block until the message confirms.

However, that's not a blocker on merging this PR, as we've done the architectural work to know where we're up to.

awrichar · 2021-10-12T22:06:36Z

internal/events/aggregator.go

+		if err != nil {
+			return false, err
+		} else if len(transfers) == 0 {
+			return false, fmt.Errorf("transfer for message '%s' not available", msg.Hash)


It felt like maybe I should be able to return false, nil here - but that didn't seem to trigger retries. Perhaps there's an alternate path that I'm supposed to trigger explicitly if the transfer comes in after the message? @peterbroadhurst maybe you can offer some insight?

On the other hand, if returning an error is the correct/optimal route here, I guess maybe I should add it to i18n - this was essentially just a placeholder.

peterbroadhurst · 2021-10-13T12:08:56Z

Going to spend some time looking at the private message correlation case. This certainly doesn't seem like an Errorf condition to me - it's simply that things arrived in the other of the two possible orders. Given the use of two independent blockchains, we need to be able to process them in either order with a deterministic outcome.

peterbroadhurst · 2021-10-13T12:12:39Z

We do also need rules on which order we dispatch the events in. Is it the order of the messages, or the order of the transfers? That needs to be deterministic. Given transfers could be happening for lots of different reasons, I believe it's impossible for it to be the order of the transfers, so that means we have to block further messages on a given topic until the transfer associated with an earlier message has completed.

Or we need to have some other rule that allows applications to function without needing their own stateful message aggregation capability.

peterbroadhurst · 2021-10-13T12:14:54Z

Here's where we deal with a blockchain transaction arriving before the message, within the correlation of a private pinned message (note the logging is Debug level - which I think is appropriate):

firefly/internal/events/aggregator.go

Line 177 in 06dedc5

l.Debugf("Batch %s not available - pin %s is parked", pin.Batch, pin.Hash)

peterbroadhurst · 2021-10-13T12:26:38Z

Maybe we can discuss it, but my current opinion as I review things here is that the blockchain event for a transfer that contains a message reference needs to be added to the pins table, and be processed by the aggregator sequentially with other pins.

The object structure I think we discussed is:

- Messages of type transfer* contain an extra data array element that is the Transaction object
- The Transaction object contains a reference to the UUID of a token transfer

This means we can have:

- Pins that refer to indexes of a message within a batch
- Pins that refer to the UUID of a transfer

Note this means some tweaking to the pins table. Specifically, I think we need a type field (which may be performance sensitive enough that we want to make it numeric), and to rename batch to ref, as it might not be a batch.

I'd thought we could avoid that, and consider transfers themselves to be pins. But the problem is how you get a single deterministic order of processing without making the aggregator listen to two tables.

awrichar

Yea, I was struggling with the charter of what is considered a pin.

We did discuss sharing the transaction details as part of the message, but I didn't include that yet, because I wasn't sure if it was needed. In the case of pool creation, the transaction ID had to be shared via message, because there was no guarantee of a blockchain transaction happening (which is the "normal" way that transaction IDs propagate currently). As of #239, I am assuming that all transfers include a blockchain transaction, and the FireFly transaction ID is shared as part of that blockchain transaction - therefore it was not necessary to also share the transaction ID in the message. But it sounds like it may still be needed in both pieces (the transfer and the message) to tie them together.

peterbroadhurst

Hey @awrichar - I've probably raised more questions than answers here sorry.
The code proposal is awesome in moving the ball forwards, but I think we need a bit more consideration to the points above before we've got the determinism we're hoping to promise apps on these coordinated transfers.

peterbroadhurst · 2021-10-13T13:30:11Z

I've started working on a flow chart, to help flesh out the challenges here.
I think there's a key bit of thinking and decisions to make about how ordering works between transfers and pins.
Per the sticky, it seems like it's not possible to have messages and transfers in one single deterministic order. However, we can have transfers in a deterministic order, and messages in a deterministic order.

peterbroadhurst · 2021-10-13T18:08:38Z

So a bit more thinking (thanks @awrichar), and actually there is not a big gap between how the code is in this PR and what we need.
Adding a new type of pin would have introduced problems, because we're trying to bi-directionally order two streams that are ordered on different blockchains 🤯 (where the yellow sticky came from above).

So instead, we just need to perform a rewind when transfers come in, and (as the code already does in this PR) add an additional check on transfer messages that the transfer is complete.

Summary

- Transfers are ordered within the transfers collection, via the token-backing blockchain
- Messages and data are ordered using the pins, via the FireFly primary blockchain
- Messages associated with transfers will not become confirmed until any associated transfer has been confirmed
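A minimal Go sketch of the rewind idea summarized above, under stated assumptions: every message shares one topic, an unconfirmed message blocks the ones after it (one of the ordering rules discussed in this thread, not confirmed behavior), and the `aggregator`, `poll` and `transferConfirmed` names are hypothetical simplifications of the real event aggregator.

```go
package main

import "fmt"

// message is a simplified pinned message, listed in pin order.
type message struct {
	hash          string
	needsTransfer bool
	confirmed     bool
}

// aggregator walks messages in pin order, starting from offset.
type aggregator struct {
	messages  []*message
	transfers map[string]bool // message hash -> transfer confirmed
	offset    int
}

// blocked reports whether any earlier message is still unconfirmed.
func (a *aggregator) blocked(i int) bool {
	for _, m := range a.messages[:i] {
		if !m.confirmed {
			return true
		}
	}
	return false
}

// poll processes messages from offset onward and returns the hashes confirmed
// on this pass. Messages waiting on a transfer, or queued behind one, are skipped.
func (a *aggregator) poll() []string {
	var confirmed []string
	for ; a.offset < len(a.messages); a.offset++ {
		m := a.messages[a.offset]
		if m.confirmed || a.blocked(a.offset) {
			continue
		}
		if m.needsTransfer && !a.transfers[m.hash] {
			continue
		}
		m.confirmed = true
		confirmed = append(confirmed, m.hash)
	}
	return confirmed
}

// transferConfirmed records the transfer and rewinds the offset to the
// matching message if the aggregator has already moved past it.
func (a *aggregator) transferConfirmed(msgHash string) {
	a.transfers[msgHash] = true
	for i, m := range a.messages {
		if m.hash == msgHash && i < a.offset {
			a.offset = i
			return
		}
	}
}

func main() {
	a := &aggregator{
		messages: []*message{
			{hash: "m1", needsTransfer: true},
			{hash: "m2"},
		},
		transfers: map[string]bool{},
	}
	fmt.Println(a.poll()) // nothing can confirm until the transfer for m1 arrives
	a.transferConfirmed("m1")
	fmt.Println(a.poll()) // the rewind lets m1 and then m2 confirm in pin order
}
```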

As the above flowchart was helpful in describing the aggregator core function, I've updated it as follows:

codecov-commenter · 2021-10-13T20:20:03Z

Codecov Report

Merging #245 (31d0c65) into main (c894700) will not change coverage.
The diff coverage is 100.00%.

❗ Current head 31d0c65 differs from pull request most recent head c75f160. Consider uploading reports for the commit c75f160 to get more accurate results

@@            Coverage Diff            @@
##              main      #245   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files          214       214           
  Lines        11820     11860   +40     
=========================================
+ Hits         11820     11860   +40

| Impacted Files | Coverage Δ |
|---|---|
| pkg/fftypes/message.go | `100.00% <ø> (ø)` |
| internal/apiserver/route_post_token_transfer.go | `100.00% <100.00%> (ø)` |
| internal/assets/manager.go | `100.00% <100.00%> (ø)` |
| internal/broadcast/manager.go | `100.00% <100.00%> (ø)` |
| internal/events/aggregator.go | `100.00% <100.00%> (ø)` |
| internal/orchestrator/orchestrator.go | `100.00% <100.00%> (ø)` |
| internal/privatemessaging/privatemessaging.go | `100.00% <100.00%> (ø)` |
| internal/tokens/fftokens/fftokens.go | `100.00% <100.00%> (ø)` |

awrichar · 2021-10-13T20:40:36Z

I've pushed a few more commits that hopefully address the majority of the concerns here. Some highlights of the behavior in the context of pins and ordering (reiterating some of what @peterbroadhurst stated above):

- token operations and blockchain-pinned message operations may occur on different blockchains
- it is guaranteed that each node will receive token events in the same order, and message events in the same order (that is, relative to other events of the same type)
- in the case of transfers with messages attached, it is not totally guaranteed that the ordering of transfers matches the ordering of their corresponding messages (particularly when different blockchains are used)
- token transfers (and the corresponding FireFly transaction of type token_transfer) will be confirmed as soon as the transfer occurs on the blockchain
- messages accompanying transfers (of type transfer_broadcast or transfer_private) will not be considered confirmed until the message and its data and the actual transfer have all been recorded
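The confirmation rule in the last bullet can be sketched as a small Go predicate. The `recorded` struct and `confirmable` function are hypothetical, and the `broadcast` value is only an illustrative non-transfer type; just the transfer_broadcast and transfer_private values come from the list above.

```go
package main

import "fmt"

// MessageType is a simplified stand-in for the message type enum.
type MessageType string

const (
	TypeBroadcast         MessageType = "broadcast" // illustrative non-transfer type
	TypeTransferBroadcast MessageType = "transfer_broadcast"
	TypeTransferPrivate   MessageType = "transfer_private"
)

// recorded tracks which pieces a node has stored locally (hypothetical).
type recorded struct {
	message  bool
	data     bool
	transfer bool
}

// confirmable reports whether a message may be marked confirmed.
// Transfer message types additionally require the transfer to be recorded.
func confirmable(t MessageType, r recorded) bool {
	if !r.message || !r.data {
		return false
	}
	switch t {
	case TypeTransferBroadcast, TypeTransferPrivate:
		return r.transfer
	default:
		return true
	}
}

func main() {
	state := recorded{message: true, data: true}
	fmt.Println(confirmable(TypeBroadcast, state))
	fmt.Println(confirmable(TypeTransferBroadcast, state))
	state.transfer = true
	fmt.Println(confirmable(TypeTransferBroadcast, state))
}
```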

Some items that are still outstanding:

- initiating a transfer+message with confirm=true will return as soon as the transfer is confirmed, without waiting for the message to also be confirmed (as I already noted)
- I'm unsure if sending a transfer with a private unpinned message accompanying it should be supported, and if it requires any special treatment (thus far I've focused specifically on pinned messages)

peterbroadhurst

One last question on LocalID @awrichar, and a couple of comments to state my understanding of where we'll be when this PR drops, and what's left.

peterbroadhurst · 2021-10-14T12:13:00Z

internal/assets/manager.go

+		transfer.MessageHash = msg.Hash
+	}
+
+	result, err := am.transferTokensWithID(ctx, fftypes.NewUUID(), ns, typeName, poolName, &transfer.TokenTransfer, waitConfirm)


It's my understanding @awrichar that this will switch around, so that in the case of a message+transfer we'll no longer use syncasync.SendConfirmTokenTransfer to block until the transfer completes, but instead will restructure the code to allow the transfer to be fired off with the message hash, but then to block on the message returning. But that's going to be in a follow-on PR.

awrichar

#249 is my parallel attempt to restructure the messaging code in order to have a hook at the point that the message is sealed but not sent. Hopefully I can leverage that to fire off the token transfer just before firing the message, and then wait for the message to be confirmed. Once both of these PRs are merged, I'll tackle that follow-on change.

peterbroadhurst · 2021-10-14T12:17:57Z

internal/events/tokens_transferred.go

+	if err != nil {
+		return false, err
+	}
+	if len(operations) > 0 {


@awrichar - I need a bit of help understanding why we're using an operation here to find the LocalID of the transfer. I had thought this was something we were passing through the data of the transfer, so that it could be consistent on all nodes in the case of a FireFly initiated transfer.

awrichar

Short answer - either route is definitely possible.

I only ended up here because I felt it was easier to explain - in this case, each node always assigns a different LocalID for each transfer, and that ID is never shared with other nodes or written to the chain.

We could easily stay with the original route - where the ID is shared on the chain for FireFly-initiated transfers, but assigned randomly by each node for non-FireFly-initiated transfers. It just felt a little more confusing. I foresee cases of "why does this transfer ID match across nodes but this other one does not?"

peterbroadhurst

👍 - going with "local means local" sounds good to me.

I understand now why we're looking at the operation here, because the transfer ID needs to be correlated with an ID, and that ID needs to be known at the beginning of the process by the submitter (only) - because they might be blocking with confirm for it to complete.
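A minimal Go sketch of the "local means local" correlation discussed above, under stated assumptions: the `node` and `operation` types and the counter-based IDs are hypothetical (real IDs would be UUIDs), and the transaction ID is assumed to be the only identifier shared between nodes.

```go
package main

import "fmt"

// operation links a shared transaction ID to a locally assigned transfer ID.
type operation struct {
	txID    string
	localID string
}

// node models one participant; IDs are simple counters here for readability.
type node struct {
	name   string
	ops    []operation
	nextID int
}

func (n *node) newID() string {
	n.nextID++
	return fmt.Sprintf("%s-%d", n.name, n.nextID)
}

// submitTransfer assigns the local ID up front so the submitter can block on it.
func (n *node) submitTransfer(txID string) string {
	id := n.newID()
	n.ops = append(n.ops, operation{txID: txID, localID: id})
	return id
}

// onTransferEvent reuses the local ID from a matching operation if this node
// submitted the transfer; otherwise it assigns a fresh local ID.
func (n *node) onTransferEvent(txID string) string {
	for _, op := range n.ops {
		if op.txID == txID {
			return op.localID
		}
	}
	return n.newID()
}

func main() {
	submitter := &node{name: "node-a"}
	other := &node{name: "node-b"}
	id := submitter.submitTransfer("tx1")
	fmt.Println(id, submitter.onTransferEvent("tx1"), other.onTransferEvent("tx1"))
}
```

Only the submitting node has the operation record, so only it resolves the event to the ID it handed out at submission time; every other node ends up with its own unrelated local ID.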

peterbroadhurst

👍

Allow sending a message with a token transfer

1e505ea

The message hash will be recorded with the transfer, and the message will not be considered confirmed until the transfer is also confirmed. Signed-off-by: Andrew Richardson <andrew.richardson@kaleido.io>

awrichar requested review from nguyer, nickgaski and peterbroadhurst as code owners October 12, 2021 20:52

awrichar commented Oct 12, 2021

peterbroadhurst reviewed Oct 13, 2021

awrichar mentioned this pull request Oct 13, 2021

Fabric plugin #184

Merged

awrichar added 2 commits October 13, 2021 15:35

Move token transfer event handler into events package

11188fb

Signed-off-by: Andrew Richardson <andrew.richardson@kaleido.io>

Rewind event aggregator if a token transfer arrives after its message

1ff0916

Signed-off-by: Andrew Richardson <andrew.richardson@kaleido.io>

Use underscores in enum values

c75f160

Signed-off-by: Andrew Richardson <andrew.richardson@kaleido.io>

awrichar force-pushed the transfer branch from d2c325a to c75f160 Compare October 13, 2021 20:21

awrichar mentioned this pull request Oct 13, 2021

Token support #218

Closed

peterbroadhurst reviewed Oct 14, 2021

Merge branch 'main' of github.com:hyperledger/firefly into transfer

4884273

peterbroadhurst approved these changes Oct 15, 2021

peterbroadhurst merged commit ea1f5ca into hyperledger:main Oct 15, 2021

peterbroadhurst deleted the transfer branch October 15, 2021 18:20
