Incorrect leader_vrf_0 values in chain table? #19
Comments
@AndrewWestberg would you please describe how the leader vrf value is calculated for a block or how one can independently verify it? |
@TerminadaPool I'll check into this next week, but I've done a video on the babbage changes here: https://www.youtube.com/watch?v=SGNAsfsVr6Q |
@AndrewWestberg Thanks. I watch all of your excellent videos and I just watched that one again. I wish I could understand Rust and Haskell better to be more helpful. But maybe I can help with some investigative work. Here is a query to retrieve the last 5 slot battles where the winner unexpectedly had a higher leader_vrf_0 value:
Here is what my logs show for the latest slot battle:
And the second latest slot battle:
In both cases the second block was received only a few milliseconds later (9 ms and 22 ms respectively) and was immediately preferred, despite having the higher leader_vrf_0 value in the cncli database. Possible explanations:
Sorry to give you an extra concern to look at when you are busy. |
I looked at logs on my relay3, which is on the other side of the world. It happened to receive the latest slot battle blocks in the opposite order to my relay1. And, it didn't extend its tip with the second block. Notice the difference between the logs below and the first output in my previous comment which logs the same blocks.
Block 7e509... was received 59ms after 9ae2d... but relay3 did not adopt block 7e509... So the order of block receipt doesn't matter. In both relays block 9ae2d was preferred. I must presume that both these nodes preferred block 9ae2d because its leader vrf score was actually lower. Whereas the cncli database shows block 7e509 as having the lower leader_vrf_0 score:
Therefore the likely explanation is:
|
@TerminadaPool There is nothing wrong in cncli. I believe you've actually found a bug in cardano-node. It appears that the node is incorrectly using the block_vrf_0 value for determining who wins the slot battle instead of the leader_vrf_0 value. This is incorrect because it removes the slight advantage that small pools had in slot leadership and instead makes it entirely random. I queried my cncli db in a similar manner as yours and determined a fairly random pattern of winners when I looked at leader_vrf_0. Out of the 30 most recent slot battles queried, 13 were showing the higher leader_vrf_0 value winning. When you compare the block_vrf_0 values for all 30, they are perfectly in line with the lower block_vrf_0 winning each time. |
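The comparison described above can be sketched roughly as follows. This is only an illustration, not the query or tooling actually used in this thread: the `count_lower_wins` helper, the dict layout, and the sample hex values are all hypothetical, and it assumes each slot battle has already been reduced to a winner/loser pair carrying both VRF columns as hex strings.

```python
from typing import Dict, List

# One slot battle: the adopted block ("winner") and the orphaned one ("loser").
# Each side maps a column name to its hex string (hypothetical layout).
Battle = Dict[str, Dict[str, str]]


def lower_value_won(battle: Battle, field: str) -> bool:
    """True if the winner's value for `field` is numerically lower than the loser's."""
    return int(battle["winner"][field], 16) < int(battle["loser"][field], 16)


def count_lower_wins(battles: List[Battle], field: str) -> int:
    """Count the battles in which the lower value of `field` won."""
    return sum(1 for b in battles if lower_value_won(b, field))


# Made-up sample data, for illustration only (not taken from any chain).
sample_battles: List[Battle] = [
    {"winner": {"leader_vrf_0": "0f", "block_vrf_0": "01"},
     "loser": {"leader_vrf_0": "02", "block_vrf_0": "a0"}},
    {"winner": {"leader_vrf_0": "03", "block_vrf_0": "10"},
     "loser": {"leader_vrf_0": "0c", "block_vrf_0": "7f"}},
]

leader_hits = count_lower_wins(sample_battles, "leader_vrf_0")
block_hits = count_lower_wins(sample_battles, "block_vrf_0")
```

If the node were tie-breaking on leader_vrf_0, `leader_hits` would equal the number of battles; a count well below that, together with a perfect score for block_vrf_0, is the pattern reported above.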
@TerminadaPool I'll reach out to IOG to see what they want to do about this. |
@TerminadaPool Please create a github issue here and link this ticket in it. https://github.com/input-output-hk/ouroboros-network/issues |
Bug confirmed by Jared from IOG. In Alonzo and earlier, the leader_vrf value was being used. In Babbage, the raw VRF value is being passed instead of calculating the leader_vrf from it. |
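For anyone wanting to verify values independently, here is a minimal sketch of how the leader value could be derived from the raw VRF output in Babbage. It assumes, based on my reading of the Praos design and not on anything stated in this thread, that the leader value is the Blake2b-256 hash of the ASCII tag "L" followed by the raw VRF output bytes, read as a big-endian integer. The function names are hypothetical; please check the details against the consensus code before relying on them.

```python
import hashlib


def leader_value_from_vrf_output(vrf_output: bytes) -> int:
    """Derive a leader value from a raw VRF output.

    Assumption: Blake2b-256 over b"L" + output, interpreted as a
    big-endian natural number. Not confirmed by this thread.
    """
    digest = hashlib.blake2b(b"L" + vrf_output, digest_size=32).digest()
    return int.from_bytes(digest, "big")


def leader_value_hex(vrf_output: bytes) -> str:
    """Same value, zero-padded to 64 hex characters (32 bytes)."""
    return format(leader_value_from_vrf_output(vrf_output), "064x")
```

Under that assumption the derived value is 32 bytes (64 hex characters), while a raw VRF output is 64 bytes (128 hex characters), which would be consistent with the shorter leader_vrf_0 strings mentioned in the issue description.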
Watch it in action: https://www.youtube.com/shorts/pXGQQmCneM4 |
Now that I've looked into this, I know enough to be able to make an issue myself. |
Apologies to @AndrewWestberg for thinking the bug was in cncli. |
@TerminadaPool no need to apologize whatsoever. You found a bug. This will forever be a contribution to improving cardano that you have made. |
In case this "bug" was instead a design decision: I pointed out that the leader VRF value cannot be gamed, whereas the block VRF can be manipulated by custom software through transaction selection. |
I have been corrected by @JaredCorduan
|
I hope that "corrected" reads like "helped" to most folks! None of this is obvious, there is a lot of technical detail here which is not easy to discover, and my hope is that I can empower y'all. |
Closing this issue on the cncli side for now. Thanks for reporting @TerminadaPool and thanks @JaredCorduan for taking the ball. |
Describe the bug
Chain table format for leader_vrf_0 values has changed in recent versions and now these values appear to be incorrect as they no longer correlate with the slot battle winner.
To Reproduce
Query the chain table for slot battles where the winner had a higher leader_vrf_0 value
Eg:
Note that the winner should have the lower leader_vrf_0 value, so the above query should only retrieve cases where a slot battle subsequently became a height battle. This happens when another node builds the next block on top of the block with the higher leader_vrf_0 value because it had not yet received the competing block with the lower value. What should have been a slot battle is then resolved by the longest chain rule instead. This should be rare and can be confirmed by reviewing the log files on a node.
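For readers without the original query to hand, a rough sketch of this kind of lookup is shown here. It is not the query from this report. It assumes a `chain` table with `slot_number`, `hash`, `leader_vrf_0` and `orphaned` columns (my assumption about the cncli schema, so check it against your own database), and it builds a tiny in-memory table with made-up rows so the example is self-contained.

```python
import sqlite3

# Pair each adopted block (orphaned = 0) with an orphaned block in the same
# slot and keep pairs where the adopted block has the HIGHER leader_vrf_0.
# Hex strings are compared as text, so only equal-length values are compared.
QUERY = """
SELECT w.slot_number, w.hash AS winner_hash, l.hash AS loser_hash
FROM chain AS w
JOIN chain AS l ON l.slot_number = w.slot_number
WHERE w.orphaned = 0
  AND l.orphaned = 1
  AND length(w.leader_vrf_0) = length(l.leader_vrf_0)
  AND w.leader_vrf_0 > l.leader_vrf_0
ORDER BY w.slot_number DESC
LIMIT 5
"""


def unexpected_winners(conn: sqlite3.Connection) -> list:
    """Return (slot, winner_hash, loser_hash) rows for unexpected winners."""
    return conn.execute(QUERY).fetchall()


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory table with hypothetical rows (not real chain data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE chain (slot_number INTEGER, hash TEXT, "
        "leader_vrf_0 TEXT, orphaned INTEGER)"
    )
    conn.executemany(
        "INSERT INTO chain VALUES (?, ?, ?, ?)",
        [
            (100, "aa", "0f", 0), (100, "bb", "01", 1),  # winner has higher value
            (200, "cc", "02", 0), (200, "dd", "0e", 1),  # winner has lower value
        ],
    )
    return conn
```

Against a real database you would pass a connection to your own cncli database file to `unexpected_winners` instead of using `demo_connection()`.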
I noticed that the above query returns a lot more instances in more recent times. However, upon reviewing log files all these instances are simple slot battles which should have been resolved in favour of the block with the lower leader_vrf_0 score.
Since around slot 72236516 the leader_vrf_0 hex string has been much shorter. In recent versions of cncli this shorter leader_vrf_0 value no longer appears to predict the slot battle winner.
Expected behavior
According to Duncan Coutts' statement at IntersectMBO/ouroboros-network#2913 (comment) slot battles should be won by the block that has the lower leader vrf score.
Additional context
It looks like the format of the chain table leader_vrf_0 values changed in recent versions of cncli. Are these values still correct?