Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(feat) Add PFM v6.1.1 #8878

Merged
merged 26 commits into from
Feb 23, 2024
Merged

(feat) Add PFM v6.1.1 #8878

merged 26 commits into from
Feb 23, 2024

Conversation

ivanlei
Copy link
Contributor

@ivanlei ivanlei commented Feb 9, 2024

closes: #7712

Description

The aim of this change is to add the Strangelove Packet Forwarding Middleware to the Agoric chain.

PFM Version Choice
PFM v6.1.1 best version matches HEAD of agoric-sdk master

--- PFM agoric-sdk
go.mod go.mod go.mod
Tendermint v0.34.29 v0.34.29
Cosmos-SDK v0.46.15 v0.46.16
IBC-Go v6.2.0 v6.2.1

PFM v6.1.1 Useful Links

PFM Integration
I largely followed the Integration Guide. However, the guide did have some bugs:

  1. The v6 guide says to import v7 of PFM instead of v6
  2. The guide does not mention the need to integrate the PFM with the cosmos store loader.
  3. The call to ibcRouter.AddRoute uses the wrong module name.

For setting up the IBC router in app.go & properly wiring middleware, I took a lot of looks at the osmosis implementation osmosis v21.2.1

Security Considerations

The PFM has been deployed on many cosmos chains. It is not officially 3rd party audited. This is realistically the same level of audit the rest of the cosmos SDK receives.

PFM v7 had a well-publicized security vulnerability named Pigeonfall. This vuln does not impact IBCv6 or PFMv6.

Scaling Considerations

PFM has a number of settings that will impact scalability. For now, defaults are being used:

  • Retries On Timeout - how many times will a forward be re-attempted in the case of a timeout?
  • Timeout Period - how long can a forward be in progress before giving up?
  • Refund Timeout - how long can a forward be in progress before issuing a refund back to the original source chain?
  • Fee Percentage - % of the forwarded packet amount which will be subtracted and distributed to the community pool.

All of these have been left to default values as is done in gaia, umee, osmosis, omniflix.

Documentation Considerations

Testing Considerations

$ cd golang/cosmos/e2e_test
$ make TestPFM
$ make TestConformance

To verify the PFM integration, two new supporting tools are introduced: the Strangelove Interchain Test & the Strangelove Heighliner docker image creation tool.

Strangelove Interchain Test

Agoric-labs Interchain Test Fork
The interchaintest v6 assumes that & every cosmos chain has an x/crisis module. More recent versions of interchaintest drop this assumption, but Agoric must use interchaintest v6 to pair with IBCv6. Interchain v6 is a close/non-supported version so Agoric won't be able to rapidly upstream changes to solve the x/crisis issue. After a lot of messing about, the easiest solution was to create a fork of interchaintest at https://github.com/Agoric-labs/interchaintest

The v6_agoric branch - and associated v6.0.0-agoric tag - of agoric-labs/interchaintest has the necessary changes to avoid the x/crisis module. It's important to note that the go.mod for these new tests explicitly references the v6.0.0-agoric tag. For agoric-sdk to consume new changes in agoric-labs/interchaintest, either the v6.0.0-agoric tag must be updated, or a new tag must be created and added to agoric-sdk/golang/cosmos/agoricinterchaintest/go.mod. It's a pretty crappy workflow TBH.

Heighliner
Heighliner creates docker images for consumption by Interchain Test. I didn't play too much with it, mostly used a fork already made by @toliaqat at strangelove-ventures/heighliner#211

It probably would be smart to make this an agoric-labs fork as well. It would also be smart to modify the agoric Dockerfile in heighliner to make it easier to specify a base image. Thus far, I did that manually, using a3p-integration docker images as my base image and pushing the heighliner images to my own account on the Dockerhub.

# One time
git clone git@github.com:strangelove-ventures/heighliner.git
cd heighliner
git remote add toliaqat git@github.com:toliaqat/heighliner.git
git pull toliaqat
git merge remotes/toliaqat/main
go build

# Every time
# make sure to point `dockerfile/agoric/Dockerfile` at the right agoric image
go build
./heighliner build -c agoric -g main -t heighliner-agoric

agoric-sdk/golang/cosmos/e2e_test
With Agoric compatible version of Interchaintest and Heighliner in place, we're finally able to test IBC/PFM conformance of the Agoric chain.

  • Tests in agoric-sdk/golang/cosmos/e2e_test/ibc_conformance_test.go validate IBC conformance. Run them with go test -v -run ^TestChainPair.
  • Tests in agoric-sdk/golang/cosmos/e2e_test/pfm_test.go validate PFM conformance. They're a port of https://github.com/strangelove-ventures/noble-router-audit-release/blob/19022562dfdc64f6d7165b5fb9770391742634b2/interchaintest/packet_forward_test.go#L36 to from IBCv7 to IBCv6.
  • The tests show a repeatable issue in compatibility between Agoric & Hermes relayer. I have not fully debugged but the issue doesn't crop up with the Strangelove relayer and doesn't crop up when testing CosmosHub instead of Agoric. I think it is related to the test infrastructure.
  • These tests could potentially move out of the agoric-sdk repo and into Agoric-labs/interchaintest. The tests have a complex go.mod so there's warnings not to include interchaintest in the same go.mod as agd is built from.

Upgrade Considerations

This must be released as part of a chain-halting upgrade. After bug fixing, this passes a manual run of e2e_test.

Add per the Integration Guide - https://github.com/cosmos/ibc-apps/blob/middleware/packet-forward-middleware/v6.1.1/middleware/packet-forward-middleware/docs/integration.md
At this point, the integrated change built locally after `go mod tidy` to pull in new dependencies
No testing has been run yet.

Version Choice:
* PFM v6.1.1 - https://github.com/cosmos/ibc-apps/releases/tag/middleware%2Fpacket-forward-middleware%2Fv6.1.1
    * go.mod - https://github.com/cosmos/ibc-apps/blob/middleware/packet-forward-middleware/v6.1.1/middleware/packet-forward-middleware/go.mod
        * Cosmos-SDK v0.46.15
        * IBC-Go v6.2.0
        * Tendermint v0.34.29

* Works with agoric-sdk versions
    * go,mod - https://github.com/Agoric/agoric-sdk/blob/master/golang/cosmos/go.mod
    * Cosmos-SDK v0.46.16
        * github.com/cosmos/cosmos-sdk => github.com/agoric-labs/cosmos-sdk v0.46.16-alpha.agoric.1
    * IBC-Go v6.2.1
    * Tendermint v0.34.29
        * github.com/tendermint/tendermint => github.com/agoric-labs/cometbft v0.34.30-alpha.agoric.1
The Strangelove Interchain Test
- https://github.com/strangelove-ventures/interchaintest
- https://strange.love/blog/announcing-interchaintest
allows for spinning up docker images representing chains - like osmosis or cosmoshub - and IBC relayers - like the go relayer or hermes - and then connecting them all together and testing IBC, ICA, PFM conformance.

This change adds a new agoricinterchain package that runs basic conformace tests against osmosis & the hub. To run the tests `make interchaintest` from  `agoricinterchain/`. Example output here: https://gist.github.com/ivanlei/6275c67ffd19204a127139b18b97c612

The major challenge with this commit was finding a good set of go.mod replace directives to get everything building.

My next commit should add agoric to the test mix.
…r like:

`failed to load latest version: version of store packetfowardmiddleware mismatch root store's version; expected 1337 got 0`

Found cosmos/cosmos-sdk#16003 which hinted "Probably forgot to configure the store upgrades in a upgrade that added this store." The upgrade guide for PFM didn't mention this but I got lucky and remembered the `Delete` block about `x/crisis`. Created an `Added` block in the same place and things got happy.
@ivanlei ivanlei marked this pull request as draft February 11, 2024 14:33
golang/cosmos/agoricinterchaintest/packet_forward_test.go Outdated Show resolved Hide resolved
golang/cosmos/daemon/cmd/root.go Outdated Show resolved Hide resolved
packages/cosmic-swingset/test/test-make.js Outdated Show resolved Hide resolved
@ivanlei ivanlei marked this pull request as ready for review February 22, 2024 00:41
@ivanlei ivanlei added the automerge:squash Automatically squash merge label Feb 22, 2024
Copy link
Contributor

@JimLarson JimLarson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with some minor comments I trust you to resolve.

I am unsure whether the test timeouts are based on real time, and hence might be flaky. If the test framework doesn't support a purely simulated time, we'll have to live with it.

# TestChainPair - Minimal version of TestConformance does less permutations
TestChainPair:
# Add a 20min timeout since tests are slow
go test -timeout 20m -v -run ^TestChainPair
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add newline to end of file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do


To test PFM functionality only:
```
$ make Test
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see no Test target in the Makefile.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cause I'm a derp. Fixing...

func TestConformance(t *testing.T) {
cs := getChainSpec(t)
rf := getRelayerFactory(t)
testConformance(t, cs[0:2], rf)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider(*) inlining the helper function testConformance() to keep everything in one place (and the same for TestChainPair() below).

(*) In my code reviews, a request to "consider" can be satisfied by pausing, staring into space for a moment, then saying "nah" with no further justification needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the "nah" comes from the fact that I have - during the development process - had multiple entry points that call helpers like testConformance.

Comment on lines +45 to +50
var (
ctx = context.Background()
client, network = interchaintest.DockerSetup(t)
rep = testreporter.NewNopReporter()
eRep = rep.RelayerExecReporter(t)
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just make these short variable declarations (e.g. ctx := context.Background()) unless there's some reason for the var block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're only like that because I stole this test from the location referenced in the PR description and changed only what I needed to. Will follow up briefly in slack to get more opinion from you.

Channel: bcChan.ChannelID,
Port: bcChan.PortID,
Retries: &retries,
Timeout: 1 * time.Second,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this simulated time or real-world time? If it's real-world time, please comment that it might flake, or give a reason why it should never flake.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A real-world time of 1 sec for timeout is always < 1 block, so this doesn't flake.

Comment on lines +180 to +181
nv := 1
nf := 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider changing nv to numValidators, nf to numFull.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is idiomatic and follows https://go.dev/wiki/CodeReviewComments#variable-names

@mergify mergify bot merged commit 08c75a9 into master Feb 23, 2024
67 checks passed
@mergify mergify bot deleted the ivan/8876-pfm-on-chain branch February 23, 2024 02:01
This was referenced Feb 23, 2024
# - `E2ETEST_CHAINIMAGE_AGORIC` - the value of this will be used specific the repository & version of docker image to use for the agoric chain. a valid value must have a semicolon and be formatted as `repository:tag`. ex: `E2ETEST_CHAINIMAGE_AGORIC="ghcr.io/agoric/agoricinterchain:latest"`
# - `E2ETEST_RELAYERNAME` - set to `"cosmos"` or `"hermes"` to choose the relayer type
# - `E2ETEST_BLOCKS_TO_WAIT` - set to a number to control how many blocks to wait for an ACK from an IBC transfer and how many blocks to wait for TX settlement.
all: TestConformance TestPFM
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question

Any reason why TestChainPair is not included in all target?

$ git clone git@github.com:strangelove-ventures/heighliner.git
$ cd heighliner
$ git remote add toliaqat git@github.com:toliaqat/heighliner.git
$ git pull toliaqat
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got below error here

git pull toliaqat
Confirm user presence for key ED25519-SK SHA256:F0zHI5oyzBKE4NVE3NrWOKx66JhphjwPzJqkGyo1+dM
User presence confirmed
You asked to pull from the remote 'toliaqat', but did not specify
a branch. Because this is not the default configured remote
for your current branch, you must specify a branch on the command line.

Did you mean git fetch toliaqat

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
agoric-cosmos automerge:squash Automatically squash merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Integrate IBC Packet Forwarding Middleware
4 participants