Make terra relayer more resillient #120

Merged 1 commit on Mar 25, 2022

@ali-bahjati (Contributor) commented Mar 25, 2022

The purpose of this PR is to reduce relaying failures. I investigated because there was a 5-minute period during which the relayer didn't work. I strongly suspect that this was a network/Terra issue, but it's still good to increase the relayer's tolerance.

  • Increase retry attempts (4 to 6) and retry_delay (250ms to 1s) to be more resilient
    • This is because when an account sequence mismatch happens, it might take some time to be fixed
  • Removed the fee estimate because it's already being done in wallet.createAndSignTx (fewer requests)
  • Improved logging when an error happens
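
For readers who want to picture the retry behaviour, here is a minimal sketch of a fixed-delay retry loop. It is not the relayer's actual code: `withRetry`, `RetryOptions` and `sleep` are hypothetical names, and it assumes "attempts" means the total number of tries.

```typescript
// Hypothetical retry helper; names and option shape are illustrative only.
interface RetryOptions {
  attempts: number; // total number of tries (assumption), e.g. 6 as in this PR
  delayMs: number; // fixed wait between tries, e.g. 1000 as in this PR
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function withRetry<T>(
  action: () => Promise<T>,
  { attempts, delayMs }: RetryOptions
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await action();
    } catch (e) {
      lastError = e;
      // Log which attempt failed and why, so failures are traceable later.
      console.warn(`relay attempt ${attempt}/${attempts} failed: ${String(e)}`);
      // Wait before the next try, but not after the last one.
      if (attempt < attempts) await sleep(delayMs);
    }
  }
  // Every try failed: surface the last error to the caller.
  throw lastError;
}
```

A call could look like `await withRetry(() => relayBatch(batch), { attempts: 6, delayMs: 1000 })`, where `relayBatch` stands in for whatever function actually submits the transaction.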

On the sequence number mismatch:
The sequence number is the payer's transaction number. Terra requires it to enforce an order on an account's transactions and to prevent anyone from reordering someone else's transactions for profit.

I couldn't understand why it happened during the downtime, but it appeared a lot in the dev environment. The error says that a bigger sequence number (like 101) was expected but a smaller one (like 100) was provided.

Where does the sequence number in a transaction come from? When creating a transaction, the wallet calls the API to get the account's sequence number.

So the problem happens because in devnet we have 9 symbols, which form 2 batches that are sent to Wormhole at almost the same time. It seems that although the 1st one executes correctly, the 2nd one doesn't get the correct sequence number. It is not an async issue either: it happened even on the 2nd or 3rd retry, more than 1s after the first transaction.

It seems that the API that returns the sequence number is not consistent with the API that does the transaction simulation in terms of which transactions they consider committed. By increasing the retry delay I fixed the problem in devnet so that it no longer misses messages.

This PR doesn't completely solve it. It just makes the relayer more resilient: the added delay lets retries succeed because it gives both endpoints some time to get in sync. Solving it is not important now because in devnet we only have 1 batch (hence the error was very weird for me).

So, how do we solve it in the future?

  1. Add manual sequence number tracking, as the relayer had before.
  2. Move the retry logic into each relayer (by extracting the common parts) and, when such an error is received, fix the sequence number based on the new number in the error log.
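
To make the two options more concrete, here is a rough sketch of how they could fit together. Everything in it is hypothetical: `parseExpectedSequence` and `SequenceTracker` are made-up names, and the error wording matched by the regular expression ("account sequence mismatch, expected N, got M") is an assumption about what the node returns, not something verified against the relayer's logs.

```typescript
// Hypothetical helper: pull the expected sequence out of an error message shaped like
// "account sequence mismatch, expected 101, got 100". The exact wording is an assumption.
function parseExpectedSequence(message: string): number | undefined {
  const match = /account sequence mismatch, expected (\d+), got (\d+)/.exec(message);
  return match ? Number(match[1]) : undefined;
}

// Hypothetical local tracker (option 1) that is corrected from error logs (option 2).
class SequenceTracker {
  private next: number | undefined;

  // Sequence to use for the next transaction; undefined means "ask the chain".
  current(): number | undefined {
    return this.next;
  }

  // Call after a transaction that used sequence `used` was accepted.
  onSuccess(used: number): void {
    this.next = used + 1;
  }

  // Call on failure; returns true if the tracker was corrected and a retry is worthwhile.
  onError(message: string): boolean {
    const expected = parseExpectedSequence(message);
    if (expected === undefined) return false;
    this.next = expected;
    return true;
  }
}
```

With something like this, a retry after a mismatch would reuse the number the node reported instead of asking the sequence endpoint again, so it would not depend on the two endpoints being in sync.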

There are other ways to achieve it too. I like the second one more because it covers broader cases, such as the downtime we faced today. I will probably do it after the EVM changes are merged.

I created #122 to not forget it.

@ali-bahjati ali-bahjati merged commit aaa44ad into dev.v2 Mar 25, 2022
@ali-bahjati ali-bahjati deleted the abehjati/terra-relayer-make-more-resillient branch March 25, 2022 16:54
ali-bahjati added a commit that referenced this pull request Apr 11, 2022
* Move js sdk on p2w-sdk to js folder

Also modifies other dependencies to correct path

* Reversed removal of wasm build for nodejs

* Add newline to a file

* pyth2wormhole: Fix attestation validation bug

commit-id:567942d7

* Add p2w sdk

It uses Pyth clients structs and cleans some of definitions for Pyth2Wormhole structures.

* Add emitter type and add wasm function for it

- It requires solitaire and it requires nightly rust
- No logic is applied, code is from p2w solana contract. (Eventually will be removed from there)

* Add new line

* Move WASM gen docker to root

It is because wasm is going to be used for p2w-sdk too.

* Fix unchanged cache mount paths

* Move terra relayer into the repo

* Update readme

* p2w-client: Add lib target, make helpers into lib functions there

commit-id:3aeb9ee6

* pyth2wormhole-client: Implement retries

commit-id:462677a2

* Make p2w-sdk js use p2w-sdk rust wasm bindings (#65)

* Make p2w-sdk js use p2w-sdk rust wasm bindings (instead of solana contract bindings)
- Removes `wasm.rs` in solana contract too.

* p2w attester contract use p2w-sdk (#68)

* Make solana pyth2wormhole contract to use the sdk

* Use threadpool to set up price symbols (#69)

* Add solana feature flag for p2w sdk (#71)

* Pyth bridge terra contract support batch attestation + use p2w sdk (#72)

* Make terra contract to use pyth2wormhole-sdk and support batch attestation

* Update packages + code format

* Move terra dockerfile out to support third-party dependency

* pyth2wormhole-client: Add polling-based concurrent tx confirmation

commit-id:5d16d035

* chore: p2w spy guarding improve Dockerfile

* fix: p2w_autoattest don't die after initialization

also minimal formatting

* add P2W_EXIT_ON_ERROR

* set P2W_EXIT_ON_ERROR default to True

* Remove bool test

* hopefully this time.

* add tilt p2w-attest P2W_EXIT_ON_ERROR

* convert P2W_EXIT_ON_ERROR to "true"

* Fix pyth test publisher (#76)

* Fix test pyth publisher to actually publish price

- Uses newer pyth images and removes existing hacks for old versions. It essentially makes dockers cleaner.
- Also improve some adds in dockers to cache more efficiently

* Support Batch Price attestation for terra relay (#75)

* Support Batch Price attestation for terra relay

* Abehjati/update p2w sdk to pyth sdk (#83)

* Make p2w-sdk use pyth-sdk

* Correct test values to reflect .env.test

* update p2w sdk to use ema instead of twa (#84)

* Rename twa to ema in terra relay (#85)

* Bring PythStructs.PriceAttestation struct in line with new API

* Add ability to parse batch price attestations

* Pyth terra remove wormhole governance (#87)

* Pyth in terra: remove wormhole governance

* [WIP] p2w-relay-iface: Add NPM package with relayer interface PoC

commit-id:efcb9b34

* Define Pyth SDK Price struct

* Define internal PythStructs.PriceInfo struct

* Cache price updates in standardised PriceInfo format

* Cache price updates from batch attestations

* p2w-relay-iface -> p2w-relay-terra/src/relay/iface.ts

commit-id:ed9846e3

* p2w relay interface: remove config from Relay iface

commit-id:0359d886

* Remove now unused parsePriceAttestation function

* Pyth terra bridge: add contract deployment script (#88)

* Add pyth deployment script

- Also updates build.sh to build pyth completely
- Add a readme for deployment guide

* Add test for partial update behaviour

* update p2w sdk to new pyth (#91)

* p2w-sdk/rust use pyth sdk solana v2

* Dockerfile.client: solana 1.8.1 -> 1.9.4

commit-id:643299d3

* p2w-terra-relay: ignore lib and node, own project dir in docker

commit-id:b084bc40

* p2w-terra-relay: iface.ts review nits, naive impl for Terra

commit-id:0ecbfdd6

* Terra contract public api (#79)

* Use pyth-sdk in terra contract
* Update terra contract according to agreed API
- Also adds v2 suffix to price_info key because this migration is breaking.

* p2w-terra-relay: apply review nits

commit-id:aec39c85

* p2w-terra-relay: make worker.ts generic w.r.t. Relay interface

commit-id:5937a08c

* terra.ts: add missing return statement

commit-id:ba0365e6

* Update worker to handle timeout in callback correctly (#97)

* Remove wormhole-based governance

* Remove now unused legacy governance state and variables

* Remove Pyth Implementation implementation

* p2w-terra-relay: run formatter

commit-id:df311e23

* p2w-terra-relay: apply review nits

commit-id:5034b061

* Run formatter to trigger CI

commit-id:7c643d79

* p2w-terra-relay: EVM boilerplate

commit-id:8ad73ded

* Remove old PythProxy inheritance hierarchy

* Remove now unused initialized implementations map

* Remove old mock bridge implementation

* Remove dependency to wormhole sdk as path and cleanup wrong eth copies (#104)

* Dockerfile.pyth_relay: Fix lockfile issue in ethereum

This commit fixes a lockfile issue resulting from newer NPM in our
container.

Specifically, our Dockerfile is pinned, relaxes Ethereum's
lockfile (npm ci -> npm install) and hardens our lockfile (npm install
-> npm ci)

commit-id:3381c8ec

* p2w-terra-relay: Admit loss against mkdir -p

commit-id:3abdb58d

* Remove unused components from wormhole (#108)

* Remove unused components from wormhole

Removes the following:
- explorer
- e2e
- bridge_ui
- algorand stuff (teal dockerfile and third_party/algorand)
- ci_tests (testing directory)  which are for JS/Bridge UI

* Remove unused terra contracts (#109)

- Note: Terra contract addresses are changed by this PR due to deterministic ordering.
- Removed unused nft and token bridge, and migration contracts in Terra
- Modified documentation to remove info regarding removed contracts (docs/devnet.md).

* Remove unused solana contracts and their wasm creations (#110)

Removes token bridge, nft bridge, migration. Also removes them from deployments and docs.

* Add fee estimate for terra relay (#112)

* Removes directories which are not related to p2w (#111)

Removes
- audits
- dashboards (dashboard is removed from Tilt)
- event_database (all of its dependencies are removed from Tilt and it's not for p2w)
- lp_ui: a project (presumably liquidity pool) not related to p2w
- sdk: wormhole sdk, p2w depends on its npm package and there is no dependency on the Rust one
- spydk: it's not used anywhere in p2w
- staging/algorand: these are for Algorand, which is not used in p2w
- whitepapers: these are for wormhole

* Add and update openzeppelin packages

* Add initializer to Pyth contract

* Add upgradable PythProxy contract

* Update tests to work with new proxy setup

* Update migrate script to work with new proxy setup

* Add tests for new proxy setup

* Inline PythStorage.Provider struct

* Make Pyth.verifyPythVM function internal

* Fix struct field names

* Rename Price to PriceFeed to be consistent with SDK

* Replace PythGetters.latestPriceInfo with Pyth.queryPriceFeed in public API

* p2w-terra-relay: Add a query() EVM call and Tilt boilerplate

commit-id:f97d0c16

* Clarify test comments

* Add health probe (#107)

* Rename PythProxy to PythUpgradable

* p2w-evm-relay: Backport the proxy address change from debug session

commit-id:55b63ed5

* p2w-terra-relay -> p2w-relay, split EVM relay into new service

commit-id:36d0db6e

* Tiltfile: typo

commit-id:3bbba986

* p2w-evm-relay.yaml: typo

commit-id:35c87c79

* p2w-evm-relay.yaml: typo 2: electric boogaloo

commit-id:40892265

* Add build folder to dockerignore

* Rename attestPriceBatch to updatePriceBatchFromVm

* Update comment on time check

* Trigger Build

* Tiltfile: Fix port forwards for p2w-evm-relay

commit-id:6e5e9c14

* p2w-relay: PythImplementation -> PythUpgradable

commit-id:bfea7eb5

* Remove unused Pyth Chain ID metadata

* Add the query() call

commit-id:02966ce5

* p2w-terra-relay: Fix evm.ts after contract rename

commit-id:87381bec

* Make truffle migrations directory configurable

* p2w-evm-relay: Fix wrong EVM contract ID, add a check for it

This commit takes care of an outdated pyth2wormhole EVM contract
address and implements a contract/non-contract check using
web3.eth.getCode() (empty for non-contracts).

This problem cost us several hours of debugging and resulted from an
EVM gotcha - a contract call to a non-contract address will simply
ignore the call payload and make a plain transfer. Additionally, ETH
accounts don't have a notion of initialization - used and unused
addresses are equally valid tx recipients. Resulting from both
properties, any unused address could potentially yield wrongly
successful calls, wasting funds and debug time over p2w-relay. Thus
the heuristic to protect us from this is to see if the address' code
storage is populated.

commit-id:b655a720

* p2w-relay: Also implement the contract check in EVM relay()

commit-id:e28709e5

* evm.ts: Fix wording in changed/unchanged logs

commit-id:13c81625

* Make terra relayer more resillient (#120)

- Increase retry attempts (4 to 6) and retry_delay (250ms to 1s) to be more resilient
  - This is because when an account sequence mismatch happens, it might take some time to be fixed
- Removed the fee estimate because it's already being done in wallet.createAndSignTx (fewer requests)
- Improved logging when an error happens

* Update dockerfile to chown less files (#121)

* Update dockerfile to chown sooner

* p2w-relay: review nits

* p2w-evm-relay: make feed verification queries configurable

* p2w-relay: cache wormhole import

* p2w-relay: formatter, remove getcode() from relay(), add comments

commit-id:1a65c52c

* p2w-relay: typos and leftovers

commit-id:9b523b25

* Change websocket to json socket to support bsc testnet + improves env vars (#139)

* Change websocket to json socket to support bsc testnet + improving env vars

* Add unit test to Pyth Terra Contract (#123)

* Add unit test to the terra contract

- Refactors the code into multiple functions to make unit testing easier
- Adds build and test of terra contract to CI according to #73

* p2w-relay: harden exception handling, yell about uncaught stuff

commit-id:24e14835

* p2w-relay: Correct outdated comment

commit-id:d0b57d33

* p2w-evm-relay: s/async (e)/(e)/

commit-id:11b3a474

* Modify proto docker and tiltfile to stop creating unnecessary files (#144)

* Remove sdk/spydk from wasm and remove buf gen web yaml (#145)

* Remove wormhole contract from wasm generation (#160)

* pyth2wormhole: Add num_publishers to libraries and contracts

commit-id:f7263eed

* pyth2wormhole: add max_num_publishers to cross-chain metadata

commit-id:7550fa50

* Move p2w relayer parsing to p2w sdk js (#162)

* Move Price Attestation parsing logic to the sdk

* pyth2wormhole: Add contract testing boilerplate for attest()

commit-id:51949fbe

* Create p2w-api base (from p2w-relay) (#142)

* Create p2w-api base (from p2w-relay)

* Refactor project structure

* Rename p2w to pyth price service (#166)

* Abehjati/price-service-add-rest-layer (#167)

* Add rest api for latest vaa

Co-authored-by: Stan Drozd <stan@nexantic.com>
Co-authored-by: Eran Davidovich <edavidovich@jumptrading.com>
Co-authored-by: Eran Davidovich <erancx@users.noreply.github.com>
Co-authored-by: Tom Pointon <tom@teepeestudios.net>
Co-authored-by: Stan Drozd <drozdziak1@gmail.com>