
Time Monotonicity Enforcement #141

Merged
merged 34 commits into main from alderfly-ibc-fix on May 10, 2021

Conversation

@AdityaSripal (Member) commented Apr 26, 2021

Description

  • enforce time monotonicity (a sketch of the condition follows this description)
  • create time misbehaviour
  • early return on identical updates
  • misbehaviour detection within UpdateClient

closes: #XXXX
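For context on the first bullet, a minimal sketch of the time misbehaviour condition being enforced (the helper name and accessors are illustrative, not the PR's exact code): two otherwise-valid headers constitute misbehaviour when height increases but time does not.

// Sketch only: h1 is assumed to be the header submitted at the
// greater height. Time must strictly increase with height.
func isTimeMisbehaviour(h1, h2 *Header) bool {
    if h1.GetHeight().GT(h2.GetHeight()) && !h1.GetTime().After(h2.GetTime()) {
        return true // time stalled or went backwards as height increased
    }
    return false
}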


Before we can merge this PR, please make sure that all the following items have been
checked off. If any of the checklist items are not applicable, please leave them but
write a little note why.

  • Targeted PR against correct branch (see CONTRIBUTING.md)
  • Linked to GitHub issue with discussion and accepted design OR link to spec that describes this work.
  • Code follows the module structure standards.
  • Wrote unit and integration tests
  • Updated relevant documentation (docs/) or specification (x/<module>/spec/)
  • Added relevant godoc comments.
  • Added a relevant changelog entry to the Unreleased section in CHANGELOG.md
  • Re-reviewed Files changed in the GitHub PR explorer
  • Review Codecov Report in the comment section below once CI passes

@codecov-commenter commented Apr 26, 2021

Codecov Report

Merging #141 (76e932a) into main (5b9e3c0) will increase coverage by 13.06%.
The diff coverage is 95.40%.


@@             Coverage Diff             @@
##             main     #141       +/-   ##
===========================================
+ Coverage   65.92%   78.99%   +13.06%     
===========================================
  Files         131      109       -22     
  Lines        8382     6516     -1866     
===========================================
- Hits         5526     5147      -379     
+ Misses       2476     1009     -1467     
+ Partials      380      360       -20     
Impacted Files Coverage Δ
...clients/07-tendermint/types/misbehaviour_handle.go 85.45% <69.23%> (-5.85%) ⬇️
modules/core/02-client/keeper/client.go 98.52% <100.00%> (+0.25%) ⬆️
.../light-clients/07-tendermint/types/client_state.go 70.00% <100.00%> (ø)
...odules/light-clients/07-tendermint/types/header.go 100.00% <100.00%> (ø)
.../light-clients/07-tendermint/types/misbehaviour.go 72.13% <100.00%> (-0.89%) ⬇️
modules/light-clients/07-tendermint/types/store.go 89.24% <100.00%> (+0.35%) ⬆️
...odules/light-clients/07-tendermint/types/update.go 81.41% <100.00%> (+3.52%) ⬆️
testing/simapp/encoding.go
testing/simapp/export.go
testing/app.go
... and 19 more

@AdityaSripal (Member, Author) left a comment

I've decided to keep the FrozenHeight as the height of the offending header that caused the misbehaviour. For time misbehaviour I use the larger of the two heights. Since we no longer use its specific value, it doesn't really matter so long as it's non-zero. @colin-axner argues for just making the height {0, 1}, since it isn't used for anything other than a boolean value.

I figure if we're keeping it, we may as well make it meaningful. Care to weigh in, @cwgoes?
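For reference, the {0, 1} sentinel @colin-axner suggests would look roughly like this (a sketch; the constant name is illustrative):

// Any non-zero FrozenHeight means "frozen"; {0, 1} deliberately
// carries no meaning beyond that boolean.
var FrozenHeight = clienttypes.NewHeight(0, 1)

func (cs ClientState) IsFrozen() bool {
    return !cs.FrozenHeight.IsZero()
}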

// If client state is not frozen after clientState CheckHeaderAndUpdateState,
// then write the update state changes, and set new consensus state.
// Else the update was proof of misbehaviour and we must emit appropriate misbehaviour events.
if !newClientState.IsFrozen() {
@AdityaSripal (Member, Author)

There was too much confusion regarding the separation of responsibilities for detecting misbehaviour here: a conflicting header can be detected at this point, but a time monotonicity violation can't be. Thus, it makes more sense to make client developers responsible for doing this correctly, so that we have a clear separation of responsibility.

Here I just check whether the new client state is frozen, and emit the appropriate events and state writes accordingly.

I think the resulting code is cleaner.
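Roughly, the keeper-side flow described here looks like the following (a simplified sketch, not the exact diff; the event emission lines are placeholders):

newClientState, newConsensusState, err := clientState.CheckHeaderAndUpdateState(ctx, k.cdc, clientStore, header)
if err != nil {
    return err
}

if !newClientState.IsFrozen() {
    // valid update: persist the new client and consensus state
    k.SetClientState(ctx, clientID, newClientState)
    k.SetClientConsensusState(ctx, clientID, header.GetHeight(), newConsensusState)
    // emit an update-client event (placeholder)
} else {
    // the update was proof of misbehaviour: persist only the frozen
    // client state and emit a misbehaviour event instead
    k.SetClientState(ctx, clientID, newClientState)
    // emit a client-misbehaviour event (placeholder)
}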

@colin-axner (Contributor)

I don't understand why we cache the context if it is now the full responsibility of the client developers to handle all instances of misbehaviour correctly.

Solo machines store the consensus state in the client state, so the only protection the cached context adds is against metadata writes. It still seems confusing to me, since client developers should be very aware not to write unwanted state changes for an update which is actually evidence of misbehaviour. What if a client wanted to write metadata every time it handled misbehaviour in an update client message?

We should either be as defensive as possible, by assuming client developers miss checks, or be as explicit as possible in saying it is entirely the responsibility of the client developer. If we cache the context, then I think we might as well do the duplicate consensus state check (and return an error if a duplicate update is successful).

I'd actually prefer to be as defensive as possible. In which case, we should keep the cached context and return an error if a duplicate update occurs without the client detecting misbehaviour.

Regardless, these requirements should be clearly documented in a light_client.md under docs/. These are subtle checks that are essential for security.

@AdityaSripal (Member, Author)

I don't understand why we cache the context if it is now the full responsibility of the client developers to handle all instances of misbehaviour correctly

They are responsible for telling core IBC if an update was misbehaviour. They are not responsible for rolling back all state changes.

What if a client wanted to write metadata every time it handled misbehaviour in an update client message?

This is definitely possible; I guess it's up to us what we want to enable. The downside is accidentally leaving in metadata writes after misbehaviour that we intend to write only for valid updates. I think @cwgoes can weigh in on the tradeoff between flexibility and opinionated code. I believe being opinionated here, and having the ClientKeeper write metadata on a valid update, makes more sense. Light client implementations are fully responsible for doing the update logic (UpdateClient will do none of that), but the ClientKeeper will take the returned output and do all of the necessary store writes. I think that's a clean separation of responsibility.

I'd actually prefer to be as defensive as possible. In which case, we should keep the cached context and return an error if a duplicate update occurs without the client detecting misbehaviour

I tried doing this and the code got super ugly, because there was freezing logic in both the ClientKeeper and in tendermint's update function. It would have been much cleaner if that logic lived in just one place.

Furthermore, I think it's possible to take all client developer checks that must be done by every light client and put them in the ClientKeeper to minimize the possibility of light-client developer error.
But I think in practice, this would make things less secure if it trades off too much on separation of concerns.
Critically, I think it just needs to be clear to a reviewer/developer where a particular check is supposed to happen.
My proposal is that we create a very clear separation of concern that acts as a contract between core IBC and light client developer.

Light client implementation must give core IBC the updated clientstate/consensus state. And it must return a frozen client state if the update was evidence of misbehaviour.
Core IBC will in turn store the clientstate (and consensus if valid update), write all callback state changes on successful updates, and emit appropriate events.

This means that there may be redundant checks happening across light clients, and that some implementations may miss them. But it gives a very clear rule for what a light client implementation is responsible for. Even though I place responsibility for all misbehaviour checks on the light client, as a reader and reviewer I can analyze the light-client implementation in isolation and check that it is catching all misbehaviour and holding up its side of the bargain.

Without this, I need to check whether the ClientKeeper and the light client's misbehaviour handling together are catching all misbehaviour, make sure they don't leave a gap between them, and verify that they aren't doing redundant checks. It also gets harder, as time goes on, to determine where a check should go: we would need to make a subjective decision on whether we think a given check is universal or not.

For these reasons I think clear separation of concerns is more important than putting all universal checks (even subtle ones) in the ClientKeeper. But yes, this should absolutely be documented in light_client.md. Will do so once there's consensus on this point.
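The contract proposed above, expressed as a rough interface sketch (simplified from the exported interfaces, not verbatim):

// The light client's side of the contract: return the updated (possibly
// frozen) state; core IBC handles all persistence and events.
type ClientState interface {
    // Must return a frozen client state if the update was
    // evidence of misbehaviour.
    CheckHeaderAndUpdateState(ctx sdk.Context, cdc codec.BinaryCodec, store sdk.KVStore, header Header) (ClientState, ConsensusState, error)
    IsFrozen() bool
}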

@AdityaSripal (Member, Author)

Regardless of whether we allow metadata writes on misbehaviour, we still want to cache so we can discard on error.

Developers shouldn't be forced to revert state themselves on error
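For readers following along, the caching under discussion is the SDK's cached context: writes against it only reach the underlying store when the returned write function is called. A sketch of the pattern, not the exact diff:

cacheCtx, writeFn := ctx.CacheContext()
clientStore := k.ClientStore(cacheCtx, clientID) // store must be derived from the cached ctx

newClientState, newConsensusState, err := clientState.CheckHeaderAndUpdateState(cacheCtx, k.cdc, clientStore, header)
if err != nil {
    return err // writes against cacheCtx are simply dropped
}
if !newClientState.IsFrozen() {
    writeFn() // commit the client's state changes (e.g. metadata)
}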

@colin-axner (Contributor)

I think you make great points.

it must return a frozen client state if the update was evidence of misbehaviour.

I agree with this.

As a reader and reviewer I can analyze the light-client implementation in isolation and check that it is catching all misbehaviour and holding up its side of the bargain.

I like this, and I think we can still achieve this with a duplicate check. My concern is that allowing a duplicate update at an existing height is a critical security vulnerability and I'm hesitant to let it go by when we have the capacity to do the check. This is the code I have in mind:

consState, exists := keeper.GetConsensusState(ctx, clientID, header.GetHeight())

newClientState, newConsensusState, err := clientState.CheckHeaderAndUpdateState(ctx, cdc, clientStore, header)
if err != nil {
    return err
}

// write client state; errors returned later revert state changes

switch {
case newClientState.IsFrozen():
    // use the logic you have
case exists && !reflect.DeepEqual(consState, newConsensusState):
    // light client implementation missed misbehaviour handling
    return fmt.Errorf("conflicting consensus state for height %s", header.GetHeight())
default:
    // regular update code
}

I don't see why this code gets ugly. It allows light clients to fully implement misbehaviour logic without relying on 02-client, and it allows 02-client to prevent duplicate updates which are misbehaviour.

@colin-axner (Contributor)

Regardless of whether we allow metadata writes on misbehaviour, we still want to cache so we can discard on error.

Developers shouldn't be forced to revert state themselves on error

Do you have in mind a use case where update is called by an external module? Messages that result in errors always have their state changes reverted by baseapp. I think this is a safe assumption to make.

@AdityaSripal (Member, Author)

Do you have in mind a use case where update is called by an external module? Messages that result in errors always have their state changes reverted by baseapp. I think this is a safe assumption to make.

Oh yes, you're correct about this. We should only cache if we discard metadata on misbehaviour.

@cwgoes (Contributor)

My concern is that allowing a duplicate update at an existing height is a critical security vulnerability and I'm hesitant to let it go by when we have the capacity to do the check.

Here's a question - is this always true? Certainly it is a problem if unequal consensus states at the same height would allow for violation of exactly-once packet delivery guarantees or timeouts, but there could conceivably be client types which allow duplicate consensus states, just not verification at them (so they are only intermediate update points) - for example, a (non-Tendermint) consensus algorithm could have a block history which looks like this:

[diagram: a block history in which multiple blocks share a height as intermediate update points]

Is this a case we want to consider? There is something to be said for not overly constraining what it means for clients to be "correct", since clients implement all of the packet data / timeout / etc. verification functions anyways.

@colin-axner (Contributor)

Great question! I didn't realize intermediate update points were a possibility.

In light of our discussion yesterday, I don't see the usefulness of adding this check if, in the near future, light client implementations will be responsible for getting/setting client and consensus states. In this design, light clients should definitely take care to guard against duplicate updates which constitute misbehaviour.

@colin-axner added this to the 1.0.0 milestone Apr 27, 2021

@colin-axner (Contributor) left a comment

Overall LGTM, nice work!

Resolved review threads on:

  • modules/core/02-client/keeper/client.go
  • modules/light-clients/07-tendermint/types/update_test.go
@colin-axner (Contributor)
changelog entry is needed (I keep forgetting as well)

@AdityaSripal mentioned this pull request Apr 28, 2021
@colin-axner mentioned this pull request Apr 29, 2021
@colin-axner (Contributor) left a comment

LGTM, will approve once we resolve decisions on:

  • using cache ctx (no preference)
  • returning error if client did not handle misbehaviour (duplicate update produces different consensus state)

I have a preference for doing the consensus state check, but willing to concede if others consider it unnecessary

Resolved review threads on modules/core/02-client/keeper/client_test.go
@cwgoes (Contributor) commented Apr 30, 2021

I figure if we're keeping it, we may as well make it meaningful. Care to weigh in @cwgoes ?

Just to refresh my memory, the goal here is backwards compatibility, right? Otherwise it seems odd to keep a field with a range (an integer) far greater than the set of semantically meaningful values (zero vs. non-zero). For backwards compatibility I guess there isn't too much difference between 0, 1 and a non-zero height, although the latter runs the risk of appearing more semantically relevant than it actually is, and we don't really want IBC users to start using this value for other kinds of data processing; if they really care about misbehaviour heights they should read them from the event logs. Those are both (weak) reasons to prefer 0, 1, I think.

@colin-axner (Contributor) left a comment

Excellent work! In light of our discussion yesterday, I think the current solution aligns perfectly with future changes.

@colin-axner (Contributor)
@AdityaSripal I fixed the merge conflicts. Everything was straightforward except for the check in update client for whether the client is frozen. I updated the code to be semantically equivalent:

Before

if !client.IsFrozen() {

} else {

}

After

if status := client.Status(ctx, clientStore, cdc); status != exported.Frozen {

} else {

}

It is just slightly odd because we aren't checking if the client is active, just that it is not frozen
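For what it's worth, an explicit Active check (an illustrative variant, not what was merged) would avoid that oddness:

if status := client.Status(ctx, clientStore, cdc); status == exported.Active {
    // regular update path
} else {
    // frozen, expired, or unknown
}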

@cwgoes (Contributor) left a comment

ACK modulo quick confirmation question

@colin-axner enabled auto-merge (squash) May 10, 2021 11:02
@colin-axner merged commit 2d3132f into main May 10, 2021
@colin-axner deleted the alderfly-ibc-fix branch May 10, 2021 11:11
faddat referenced this pull request in notional-labs/ibc-go Feb 23, 2022
faddat referenced this pull request in notional-labs/ibc-go Mar 1, 2022