Error: insertRewardStakeAddressAndPool #546

Closed
erikd opened this issue Mar 15, 2021 · 10 comments
Labels
bug Something isn't working priority high issues/PRs that MUST be addressed. The release can't happen without this;

Comments

Contributor

erikd commented Mar 15, 2021

@jbgi reports on stage net:

[2021-03-11 22:27:15.48 UTC] Finishing epoch 244: 2 rewards, 0 orphaned_rewards, 10 stakes.
[2021-03-11 22:27:15.49 UTC] Error: insertReward StakeAddressAndPool e12ec4894e33...b00e0ebdba36
[2021-03-11 22:27:15.49 UTC] Shutting down DB thread
[2021-03-11 22:27:15.49 UTC] recvMsgRollForward: AsyncCancelled

Connecting to stagenet:

nix build -f default.nix scripts.staging.node -o launch_node_staging.sh
# on db-sync:
nix build -f default.nix scripts.staging.db-sync -o launch_db-sync_staging.sh \
    --arg customConfig '{ services.cardano-db-sync.stateDir = "./staging-db-sync"; }'
Contributor Author

erikd commented Mar 15, 2021

Weird thing is that restarting db-sync clears this error. How? Why?

Is that because a previous insert has not been committed yet?

Contributor Author

erikd commented Mar 15, 2021

Nope, adding a transactionCommit just before the insert does not fix the problem, but restarting db-sync does. Why?

Contributor

jbgi commented Mar 15, 2021

I don't really observe that a restart fixes the issue. After a restart it does advance to the next epoch, but then it fails with the same error.

@disassembler disassembler added bug Something isn't working priority high issues/PRs that MUST be addressed. The release can't happen without this; labels Mar 15, 2021
Contributor Author

erikd commented Mar 15, 2021

@jbgi But the restart seems to have fixed the issue according to the logs from @ArturWieczorek .

Contributor Author

erikd commented Mar 16, 2021

Ok, found something incredibly weird.

The rejected address is \xe12ec4894e334a8d39e07eba00ff8f4a02122cf67f2be5b00e0ebdba36 but on this network, all stake addresses should start with \xe0 .
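
To make the leading-byte comparison concrete, here is a minimal sketch of how a Shelley address header byte splits into two halves: the upper four bits carry the address type and the lower four bits carry the network id (0 for testnet, 1 for mainnet). The names `Network`, `HeaderInfo` and `decodeHeader` are hypothetical and exist only for this illustration; they are not db-sync or ledger identifiers.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Word (Word8)

-- Hypothetical types for this sketch; not taken from db-sync or the ledger.
data Network = Testnet | Mainnet
  deriving (Eq, Show)

data HeaderInfo = HeaderInfo
  { hiAddrType :: Word8          -- upper four bits of the header byte
  , hiNetwork  :: Maybe Network  -- lower four bits, if it is a known id
  } deriving (Eq, Show)

-- Split a header byte into its address type and network id.
decodeHeader :: Word8 -> HeaderInfo
decodeHeader b =
  HeaderInfo
    { hiAddrType = b `shiftR` 4
    , hiNetwork = case b .&. 0x0f of
        0 -> Just Testnet
        1 -> Just Mainnet
        _ -> Nothing
    }
```

Under this reading, `decodeHeader 0xe1` and `decodeHeader 0xe0` report the same address type (14, a reward account) and differ only in the network half.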

Contributor Author

erikd commented Mar 16, 2021

All the stake addresses in the stake_address table start with \xe1... and all the reward addresses in the pool_update table start with \xe0..... I think the former is incorrect.

Something really wrong here.

Contributor Author

erikd commented Mar 16, 2021

All inserts in the pool_update table are done with:

    , DB.poolUpdateRewardAddr = Shelley.serialiseRewardAcnt (Shelley._poolRAcnt params)

and all stake addresses are inserted using:

    { DB.stakeAddressHashRaw = Shelley.serialiseRewardAcnt rewardAddr

However, there is a function insertStakeAddressRefIfMissing which does:

        Shelley.Addr nw _pcred sref ->
          case sref of
            Shelley.StakeRefBase cred ->
              Just <$> insertStakeAddress txId (Shelley.RewardAcnt nw cred)

So this looks like the wrong nw value in the Shelley.Addr struct.
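
A simplified model may help show why the `nw` value matters here. It assumes that serialising a reward account amounts to a header byte derived from the network followed by the credential hash. The types and `serialiseRewardAcntSketch` are hypothetical stand-ins written for this sketch, not the ledger's actual `Shelley.serialiseRewardAcnt`, and only the key-hash case is modelled.

```haskell
import Data.Word (Word8)

-- Hypothetical, simplified stand-ins for the ledger types.
data Network = Testnet | Mainnet
  deriving (Eq, Show)

newtype Credential = Credential [Word8]
  deriving (Eq, Show)

data RewardAcnt = RewardAcnt Network Credential
  deriving (Eq, Show)

networkId :: Network -> Word8
networkId Testnet = 0
networkId Mainnet = 1

-- Assumed layout: one header byte (0xe0 plus the network id for a
-- key-hash reward account), followed by the credential hash bytes.
serialiseRewardAcntSketch :: RewardAcnt -> [Word8]
serialiseRewardAcntSketch (RewardAcnt nw (Credential hash)) =
  (0xe0 + networkId nw) : hash
```

In this model the same credential wrapped with two different `Network` values serialises to byte strings that differ only in the first byte, which matches the `\xe1` versus `\xe0` pattern seen in the two tables.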

Contributor Author

erikd commented Mar 16, 2021

The \xe1 start of a stake address means Mainnet, so why does the reward_addr in the pool_update table look like this?

staging=# select reward_addr from pool_update ; 
                         reward_addr                          
--------------------------------------------------------------
 \xe02ec4894e334a8d39e07eba00ff8f4a02122cf67f2be5b00e0ebdba36
 \xe02ec4894e334a8d39e07eba00ff8f4a02122cf67f2be5b00e0ebdba36
 \xe02ec4894e334a8d39e07eba00ff8f4a02122cf67f2be5b00e0ebdba36
 \xe02ec4894e334a8d39e07eba00ff8f4a02122cf67f2be5b00e0ebdba36
 \xe02ec4894e334a8d39e07eba00ff8f4a02122cf67f2be5b00e0ebdba36
 \xe02ec4894e334a8d39e07eba00ff8f4a02122cf67f2be5b00e0ebdba36
 \xe02ec4894e334a8d39e07eba00ff8f4a02122cf67f2be5b00e0ebdba36
 \xe0dbd46a6ddd2f9df9e48c0517a4718d8bd91448e1da7eb7a74e1dcb99
 \xe09cd4b79e0ee98afc1aae680892be141e972b16190891e8b4618dc93e
(9 rows)

Interestingly, it's the first address that should match, but it differs in the leading byte: \xe1 vs \xe0.

Contributor Author

erikd commented Mar 16, 2021

The root cause of this problem is that ledger-specs accepts pool_updates with a reward_addr field that contains an incorrect network tag. Then when rewards are distributed they are distributed to a reward_addr with a "corrected" network tag.

The real fix is in ledger-specs but I will need to come up with a work around here in db-sync.
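
The failure mode can be reproduced in a toy model, assuming the lookup is an exact match on the raw address bytes. `AddrTable` and `lookupAddrId` are hypothetical names for this sketch, the table is a plain association list rather than a database, and the address bytes are shortened to the first few bytes of the address quoted earlier in this thread.

```haskell
import Data.Word (Word8)

type AddrBytes = [Word8]

-- Hypothetical stand-in for a table keyed by raw address bytes.
type AddrTable = [(AddrBytes, Int)]

-- Exact-match lookup: any difference in the header byte is a miss.
lookupAddrId :: AddrTable -> AddrBytes -> Either String Int
lookupAddrId table addr =
  maybe (Left "address not found") Right (lookup addr table)

-- Address as stored from the pool registration (shortened for illustration).
registered :: AddrBytes
registered = [0xe0, 0x2e, 0xc4, 0x89]

-- Address the reward is paid to: same payload, different network tag.
rewarded :: AddrBytes
rewarded = [0xe1, 0x2e, 0xc4, 0x89]
```

With a table containing only `registered`, looking up `rewarded` returns the `Left` case even though both values refer to the same credential.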

Contributor Author

erikd commented Mar 17, 2021

Two possible solutions to this:

  • Keep the database info as it is, but ignore the network id when looking up the reward address. This would almost certainly cause a slowdown.
  • Fix the network id when the pool certificate's reward address is inserted into the database.
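
The two options above can be sketched side by side. This is an illustration under assumptions, not the db-sync change itself: `lookupIgnoringNetwork` and `setNetworkId` are hypothetical helpers, the table is an in-memory list, and the first option skips the whole header byte for simplicity, so it also ignores the address type.

```haskell
import Data.Bits ((.&.), (.|.))
import Data.List (find)
import Data.Word (Word8)

type AddrBytes = [Word8]

-- Option 1: compare everything except the header byte. Every entry has
-- to be inspected, which hints at why an exact-match index would not help.
lookupIgnoringNetwork :: [(AddrBytes, Int)] -> AddrBytes -> Maybe Int
lookupIgnoringNetwork table addr =
  snd <$> find (\(key, _) -> drop 1 key == drop 1 addr) table

-- Option 2: overwrite the network half of the header byte before the
-- address is stored, keeping the address-type half unchanged.
setNetworkId :: Word8 -> AddrBytes -> AddrBytes
setNetworkId _ [] = []
setNetworkId nid (header : rest) =
  ((header .&. 0xf0) .|. (nid .&. 0x0f)) : rest
```

With the second option the stored value already carries the expected tag, so later lookups can remain plain equality checks.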

erikd added a commit that referenced this issue Mar 18, 2021
This is a work around for issue #546. The ledger accepts and allows
reward addresses in a pool registration (or re-registration) with an
incorrect network tag. When rewards are distributed for that pool it
goes to the version of that reward address with the correct network
id.

The chosen (ie least bad) fix is to correct the reward address network
tag in pool registration.

Closes: #546
erikd added a commit that referenced this issue Mar 19, 2021
This is a work around for issue #546. The ledger accepts and allows
reward addresses in a pool registration (or re-registration) with an
incorrect network tag. When rewards are distributed for that pool it
goes to the version of that reward address with the correct network
id.

The chosen (ie least bad) fix is to correct the reward address network
tag in pool registration.

Closes: #546
erikd added a commit that referenced this issue Mar 19, 2021
This is a work around for issue #546. The ledger accepts and allows
reward addresses with an incorrect network tag (ie mainnet/testnet) in
a pool registration (or re-registration). However, when rewards are
distributed for that pool it goes to the version of that reward address
with the correct network id for the current network.

The chosen (ie least bad) fix is to correct the reward address network
tag in pool registration.

Closes: #546
@erikd erikd closed this as completed in 18e7113 Mar 19, 2021
erikd added a commit that referenced this issue Mar 22, 2021
This is a work around for issue #546. The ledger accepts and allows
reward addresses with an incorrect network tag (ie mainnet/testnet) in
a pool registration (or re-registration). However, when rewards are
distributed for that pool it goes to the version of that reward address
with the correct network id for the current network.

The chosen (ie least bad) fix is to correct the reward address network
tag in pool registration.

Closes: #546