Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify CheckHeaderAndUpdateState to set client state/consensus states in IBC light client implementations #284

Closed
3 tasks
colin-axner opened this issue Jul 22, 2021 · 21 comments
Labels
02-client core type: refactor Architecture, code or CI improvements that may or may not tackle technical debt.

Comments

@colin-axner
Copy link
Contributor

Summary

02-client should not set client states and consensus states as generalizing IBC client implementations is difficult. Better to give IBC clients more control and flexibility. This will already be useful for solo machines which don't need to store consensus states for each update height

This design is already specified in ICS, our implementation didn't follow those changes. See comment


For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate contributors tagged/assigned
@colin-axner
Copy link
Contributor Author

Revisiting this decision many months later, I'm trying to piece together the exact benefits.

This decision was sparked by the motivation of allowing the client state definition to be upgraded over time. The idea was that the light client code internally use a different type than what is stored in the client store. But now I see that it would also be possible to create an entirely new client state definition for the breaking versions of the same client. That is, if we modified the client state definition of the IBC tendermint client, the new definition should be v2 and treated as an entirely different client rather than a continuation of v1. Note, this upgrade path still requires all counterparty chains to understand v2 before upgrading.

There do remain subtle benefits, such as the consensus state not necessarily needing to be set for every consensus state (solo machines), or removal of the GetHeight function from the header interface. The removal of the GetHeight function from the header interface was requested as part of the Wasm client implementation, but was also justified by the proposed changes. It might be possible to find a solution which allows for the GetHeight removal without completely handing over all responsibility to the light client developer

The changes would also align with the spec, but increases the responsibility of light client developers.

One possible adjustment that comes to mind is only removing the setting of consensus states from 02-client. This would require light client developers to set the consensus states themselves (allowing header.GetHeight to be removed and stopping unnecessary state storage for solo machines). Although at that point, I'm not sure why they shouldn't be expected to update the client state as well

@mkaczanowski do you have any insights into whether this change would positively or negatively benefit the Wasm client?

cc @AdityaSripal @cwgoes any thoughts on how other light client types might be affected by this decision?

@colin-axner
Copy link
Contributor Author

Another consideration which might add to this discussion is the getting of timestamps. Currently ibc-go will attempt to get a timestamp for a height by getting the timestamp of a consensus state at a specific height, but this does not work for solo machines. The solo machine will sign a message at sequence X indicating a timeout, but core IBC will attempt to check the timeout using sequence X as the consensus state height. The consensus state isn't written (if at all) until after the timeout message is expected to succeed.

The GetTimeoutAtHeight should likely be brought into the client state interface, or perhaps we should move the GetConsensusState into the client state interface?

@crodriguezvega crodriguezvega modified the milestones: vNext, v4.0.0 Dec 16, 2021
@crodriguezvega crodriguezvega added the type: refactor Architecture, code or CI improvements that may or may not tackle technical debt. label Jan 3, 2022
@seunlanlege
Copy link
Contributor

I'm currently leading the development efforts of 11-beefy which is to be the canonical ibc light client for polkadot.

I'm in support of these changes for 2 reasons:

  • This would allow light clients handle batch header updates in CheckHeaderAndUpdateState, for the special case of 11-beefy proving the finality for a batch of headers is much more space and time efficient than the space/time complexity of proving each individual headers in that batch, combined.

  • This also allows for a single light client instance of 11-beefy be used to prove finality for every parachain connected to the relay chain (Polkadot/Kusama). We achieve this by setting the appropriate ConsensusState for individual parachain headers in CheckHeaderAndUpdateState

@cwgoes
Copy link
Contributor

cwgoes commented Jan 19, 2022

I'm still strongly in favour of this separation of concerns, I think the abstractions are much clearer.

The changes would also align with the spec, but increases the responsibility of light client developers.

This is true - if we want to simplify light client development, can we just put some sort of standard interface in between for all light clients which use consensus states in a "standard" way (that is currently serviced by the implementation)? That seems also beneficial to me.

@colin-axner
Copy link
Contributor Author

Thanks everyone for contributing to the discussion. This has helped clarify things for me.

@seunlanlege would the proposed changes in #668 work for your light client design? Also see minor modifications to the proposal below.


I think these reasons provide strong justification for removing the setting of consensus states from the 02-client create/update/upgrade functions. I think it also justifies removing GetHeight from the Header interface.

If we remove GetHeight from the Header interface, one may notice that:

/ Header is the consensus state update information
type Header interface {
    proto.Message

    ClientType() string
    ValidateBasic() error
}
// Misbehaviour defines counterparty misbehaviour for a specific consensus type
type Misbehaviour interface {
    proto.Message

    ClientType() string
    GetClientID() string
    ValidateBasic() error
}

The Header and Misbehaviour interfaces are the same (we should remove GetClientID() - it is already supplied by MsgSubmitMisbehaviour). I bring this up, because I think it probably makes sense to combine their usage. We already discovered with 07-tendermint, evidence of misbehaviour isn't just two headers, it can also be a header compared against existing state. Thus by combining these interfaces, we could replace the following functions:

CheckHeaderAndUpdateState(header, ...)
CheckMisbehaviourAndUpdateState(misbehaviour, ...)

with

VerifyHeader(header,...)
CheckMisbehaviour(header, ...)
UpdateState(header, ...)

In relation to #668, instead of adding an additional CheckHeaderForMisbehaviour function, we could reuse the CheckMisbehaviour function.

Under the hood, I would expect 07-tendermint to do something like:

switch header.(type) {
case: NormalHeader{}
    // check time monotonicity misbehaviour  
case: MisbehaviourHeader{}
    // check double singing misbehaviour

Of course, it might be best to keep these concepts separate, but misbehaviour detection has already become an ingrained concept in updating a client, so I think it might be best to embrace this direction and make it easier for light client developers to reason about.

I think when doing these changes we should also move GetTimestampAtHeight() to the client state interface. This would fix issues for the solo machine.

Proposal

  • remove setting of consensus states from Create/Update/Upgrade Client functions in 02-client
  • Move GetTimestampAtHeight to ClientState interface
  • Remove GetHeight from the Header interface
  • Combine Header and Misbehaviour interface

The only part I'm still uncertain about is what to do with the ClientState. I don't see good reasons for why it must be set by the client state (since it is being used by 02-client). I guess the only benefit is avoiding unnecessary writes on no-ops when the client state doesn't change. Also, it might be more explicit of a change that all client store changes are now managed by the light client.


This part is documenting reasoning which may not be obvious to everyone

One question that pops into my head is why 02-client should be capable of understanding the consensus height to consensus state mapping. As the spec states:

It must finally be introspectable by the state machine which it is for, such that the state machine can look up its own ConsensusState at a past height.

So why do we need to allow functionality of introspecting non-self consensus states? Strictly speaking, we don't need to, but if the spec assumes consensus states are associated with a consensus height then we might as well.

Why do we need to assume consensus heights are associated with consensus states? Well, we need timeouts. We must have monotonically increasing time and comparing the "time" of two consensus states must determine which is older/newer. In theory, I think, consensus heights could be even more abstract than they already are. Within ibc-go, Height is an interface only because of go cyclical imports. But heights are a fairly straightforward concept. It is probably hard to find a good reason for why heights should be an opaque type and not a concrete type.

Thus, it makes sense that each consensus state is associated with a consensus height. It also makes sense that updating from a consensus height via some opaque type like Header may perform batch updates (writing many consensus states). An update may even remove past consensus heights.

@AdityaSripal
Copy link
Member

I really like the proposal, @colin-axner! My question is if CheckMisbehaviour and VerifyUpdate should not write to state? If so, should we enforce this in code somehow?

Also, in this case UpdateState looks like it will need to be passed in information on whether the header is an update or a misbehaviour since the checks are happening earlier.

It also looks like we need to inform light client developers to deal with these headers in standard ways:

  • If it is a regular update, then create a consensus state for each height
  • If it is misbehaviour, then freeze the client at the lowest provided height

@colin-axner
Copy link
Contributor Author

colin-axner commented Jan 21, 2022

My question is if CheckMisbehaviour and VerifyUpdate should not write to state? If so, should we enforce this in code somehow?

If the SDK supported read-only store access, then I think maybe we could enforce no writes to state, but since the SDK doesn't, I think we should just document the intended behaviour of these functions as clearly as possible. I don't see any potential issues from the core IBC side if light clients write state. It could just be problematic for the light client developer who does go against the recommended functionality.

Also, in this case UpdateState looks like it will need to be passed in information on whether the header is an update or a misbehaviour since the checks are happening earlier.

Yes. In #668 I intended to convey that CheckHeaderForMisbehaviour should check for misbehaviour and update state if necessary (misbehaviour detected). Thus UpdateState is assumed to only do valid updates. We can change the CheckMisbehaviour naming to CheckForMisbehaviourAndUpdateState or something else?

Alternatively we could create another function, UpdateStateOnMisbehaviour

I don't have strong preferences.

@colin-axner
Copy link
Contributor Author

I'm also kinda wondering if we should deprecate MsgSubmitMisbehaviour. It will do the exact same as MsgUpdateClient minus regular updates

@notbdu
Copy link
Contributor

notbdu commented Jan 22, 2022

The solo machine will sign a message at sequence X indicating a timeout, but core IBC will attempt to check the timeout using sequence X as the consensus state height. The consensus state isn't written (if at all) until after the timeout message is expected to succeed.

I found a note here on solo machine's lack of support for GetTimestampAtHeight. Is the issue here that a consensus state is not guaranteed to be persisted for every block height?

// NOTE: this is a temporary fix. Solo machine does not support usage of 'GetTimestampAtHeight'
// A future change should move this function to be a ClientState callback.
if clientType != exported.Solomachine {
latestTimestamp, err := k.connectionKeeper.GetTimestampAtHeight(ctx, connectionEnd, latestHeight)
if err != nil {
return err
}
if packet.GetTimeoutTimestamp() != 0 && latestTimestamp >= packet.GetTimeoutTimestamp() {
return sdkerrors.Wrapf(
types.ErrPacketTimeout,
"receiving chain block timestamp >= packet timeout timestamp (%s >= %s)", time.Unix(0, int64(latestTimestamp)), time.Unix(0, int64(packet.GetTimeoutTimestamp())),
)
}
}

It looks like the transaction is aborted and consensus state is not set if the TimestampedSignatureData timestamp is <= consensus timestamp.

if cs.ConsensusState.GetTimestamp() > timestamp {
return nil, nil, 0, 0, sdkerrors.Wrapf(ErrInvalidProof, "the consensus state timestamp is greater than the signature timestamp (%d >= %d)", cs.ConsensusState.GetTimestamp(), timestamp)
}

If TimestampedSignatureData is signed with now(), wouldn't every update get accepted? I'm not quite following here what you mean by consensus state isn't written (if at all) until after the timeout message is expected to succeed.

Or are you saying that the TimestampedSignatureData is signed with a future timestamp indicating a timeout and we end up storing this future timeout ts as the consensus timestamp?

@notbdu
Copy link
Contributor

notbdu commented Jan 22, 2022

I think when doing these changes we should also move GetTimestampAtHeight() to the client state interface. This would fix issues for the solo machine.

Also wondering how doing the above would fix the solo machine issues. What extra context does the 02-client have that the solo machine light client lacks to support this method? Perhaps the answer to this will also help answer the Qs I had above.

@notbdu
Copy link
Contributor

notbdu commented Jan 22, 2022

@seunlanlege would the proposed changes in #668 work for your light client design? Also see minor modifications to the proposal below.

Seems like removing the height <> header association would allow a light client impl to opaquely accept batched headers?

Also, I like the separation of the methods. Think it's more clear to the light client implementer what needs to be done (and where).

@seunlanlege
Copy link
Contributor

@seunlanlege would the proposed changes in #668 work for your light client design? Also see minor modifications to the proposal below.

yes it works perfectly 👍🏿 .

@notbdu
Copy link
Contributor

notbdu commented Jan 22, 2022

This decision was sparked by the motivation of allowing the client state definition to be upgraded over time. The idea was that the light client code internally use a different type than what is stored in the client store. But now I see that it would also be possible to create an entirely new client state definition for the breaking versions of the same client.

Isn't it already possible for a light client impl to return a different client definition and have that persisted by 02-client as long it satisfies the Client iface? Looks like 02-client just persists whatever is returned below. You could technically overwrite the existing client definition originally persisted by 02-client after a chain upgrade.

// Any writes made in CheckHeaderAndUpdateState are persisted on both valid updates and misbehaviour updates.
// Light client implementations are responsible for writing the correct metadata (if any) in either case.
newClientState, newConsensusState, err := clientState.CheckHeaderAndUpdateState(ctx, k.cdc, clientStore, header)
if err != nil {
return sdkerrors.Wrapf(err, "cannot update client with ID %s", clientID)
}
// emit the full header in events
var (
headerStr string
consensusHeight exported.Height
)
if header != nil {
// Marshal the Header as an Any and encode the resulting bytes to hex.
// This prevents the event value from containing invalid UTF-8 characters
// which may cause data to be lost when JSON encoding/decoding.
headerStr = hex.EncodeToString(types.MustMarshalHeader(k.cdc, header))
// set default consensus height with header height
consensusHeight = header.GetHeight()
}
// set new client state regardless of if update is valid update or misbehaviour
k.SetClientState(ctx, clientID, newClientState)

Update:
Forgot that 02-client and 03-connection that forces ClientState to be 07-tendermint client state.
Validation checks on counterparty connection open.

if err := k.clientKeeper.ValidateSelfClient(ctx, clientState); err != nil {

if err := k.clientKeeper.ValidateSelfClient(ctx, clientState); err != nil {

Enforce 07-tendermint client type.

func (k Keeper) ValidateSelfClient(ctx sdk.Context, clientState exported.ClientState) error {
tmClient, ok := clientState.(*ibctmtypes.ClientState)
if !ok {
return sdkerrors.Wrapf(types.ErrInvalidClient, "client must be a Tendermint client, expected: %T, got: %T",
&ibctmtypes.ClientState{}, tmClient)
}

The above would need to be changed to support more client types?

@AdityaSripal
Copy link
Member

If the SDK supported read-only store access, then I think maybe we could enforce no writes to state, but since the SDK doesn't, I think we should just document the intended behaviour of these functions as clearly as possible. I don't see any potential issues from the core IBC side if light clients write state. It could just be problematic for the light client developer who does go against the recommended functionality.

Agreed, theoretically i think we could wrap the store in an interface with a no-op write but that should be a feature SDK introduces not something we maintain.

Yes. In #668 I intended to convey that CheckHeaderForMisbehaviour should check for misbehaviour and update state if necessary (misbehaviour detected). Thus UpdateState is assumed to only do valid updates. We can change the CheckMisbehaviour naming to CheckForMisbehaviourAndUpdateState or something else?
Alternatively we could create another function, UpdateStateOnMisbehaviour

Ahh ok, I thought the UpdateState does the updating for both verify header and check misbehaviour. I think we should rename it to CheckForMisbehaviourAndUpdateState so that it is clear we will be doing update in here, and we should make UpdateClient do:

If that is the case, then why are we splitting up VerifyHeader and UpdateState. Why shouldn't it just be VerifyHeaderAndUpdateState?

@colin-axner
Copy link
Contributor Author

colin-axner commented Jan 24, 2022

If that is the case, then why are we splitting up VerifyHeader and UpdateState. Why shouldn't it just be VerifyHeaderAndUpdateState?

Misbehaviour checks also need verified headers

@AdityaSripal
Copy link
Member

Ahh got it, ok then. Makes sense

@AdityaSripal
Copy link
Member

Though why don't we just do VerifyHeaderAndUpdateState in 02-client and have the individual client implementation have complete control and check for misbehaviour in the function if appropriate.

I understand splitting up the functions to make it explicit in-code for the light client developers that they must check for misbehaviour. But we are already moving towards giving them more flexibility and holding their hand less in 02-client. I think given that design direction, it makes sense to go all the way and just document very clearly that misbehaviour should also be checked within that function.

To me, it seems like creating these additional methods will cause more confusion and inefficiency than any benefits.

@colin-axner
Copy link
Contributor Author

colin-axner commented Jan 24, 2022

I disagree. A function should not be overloaded, the greater complexity of a function, the more likely bugs will occur/slip by. The proposed functionality would require light clients developers to:

  • verify the header
  • account for duplicate headers
  • automatically detect misbehaviour
  • return updated client and consensus states

Given that all light clients need to verify the header, check for misbehaviour, and update state, I don't see a reason why we need to bundle this functionality under the hood of a single function.

Furthermore, it makes 02-client event emission difficult. The function could error because of:

  • invalid headers
  • duplicate headers
  • misbehaviour
  • no error (success)

In the first two cases we don't want to emit events, in the last two we want to emit different events.

What are the points of confusion and inefficiency you foresee?

@AdityaSripal
Copy link
Member

Those are all fair points.

My concern was as an interface we have one operation: UpdateClient that is split into a stateless verification and stateful update
But then Misbehaviour is just one function that does both verification and the update in a single function.

So from an interface standpoint there's this inconsistency that is at best inelegant and at worst confusing without being very clear with our naming.

I would prefer to design the interface to be more consistent if possible

@colin-axner
Copy link
Contributor Author

colin-axner commented Jan 24, 2022

Good point.

I suppose we consider the following?

    if err := clientState.VerifyHeader(header); err != nil {
        return err
    }
    
    foundMisbehaviour := clientState.CheckForMisbehaviour(header)
    if foundMisbehaviour {
        clientState.UpdateStateOnMisbehaviour(header)
        // emit misbehaviour event
        return 
    }
    
    clientState.UpdateState(header) // expects no-op on duplicate header
    // emit update event
    return
}

MsgSubmitMisbehaviour would reuse the first 3 calls (we could optionally deprecate this message)

This would give us stateless verification/checks and 2 explicit types of updates

@AdityaSripal
Copy link
Member

Agreed with your final proposal 👍

@colin-axner colin-axner removed their assignment Feb 7, 2022
faddat pushed a commit to notional-labs/ibc-go that referenced this issue Feb 23, 2022
Base64 encoded return data on wasm raw query REST endpoint
faddat pushed a commit to notional-labs/ibc-go that referenced this issue Mar 1, 2022
* Working state

* Fix tests

* WIP packet relay

* Merge PR cosmos#285: Fixed issues in scripts to start two or three chains

* Compling

* Finish denom work, move on to streaming relayer proof query issue

* working streaming relayer

* Add debug logging to faucet for remote debug

* add additional logging

* moar faucet debugging

* Update to v0.40.0-rc0 and ensure working README, other docs, tests, etc...

Co-authored-by: Andy Nogueira <me@andynogueira.dev>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
02-client core type: refactor Architecture, code or CI improvements that may or may not tackle technical debt.
Projects
Status: Done 🥳
Development

No branches or pull requests

7 participants