Cannot handle RoundStart: Error: Cannot find RoundStart event #307

Closed
bajtos opened this issue May 16, 2024 · 2 comments

@bajtos
Member

bajtos commented May 16, 2024

While deploying #306, I found the following error in the logs:

2024-05-16T14:19:02Z app[48ed67dc0e0308] cdg [info]Cannot handle RoundStart: Error: Cannot find RoundStart event for round index 6994
2024-05-16T14:19:02Z app[48ed67dc0e0308] cdg [info]    at getRoundStartEpoch (file:///app/lib/round-tracker.js:266:7)
2024-05-16T14:19:02Z app[48ed67dc0e0308] cdg [info]    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
2024-05-16T14:19:02Z app[48ed67dc0e0308] cdg [info]    at async updateSparkRound (file:///app/lib/round-tracker.js:35:29) {
2024-05-16T14:19:02Z app[48ed67dc0e0308] cdg [info]  roundIndex: 6994n,
2024-05-16T14:19:02Z app[48ed67dc0e0308] cdg [info]  recentRoundStartEvents: [
2024-05-16T14:19:02Z app[48ed67dc0e0308] cdg [info]    { blockNumber: 3918547, roundIndex: 6989n },
2024-05-16T14:19:02Z app[48ed67dc0e0308] cdg [info]    { blockNumber: 3918588, roundIndex: 6990n },
2024-05-16T14:19:02Z app[48ed67dc0e0308] cdg [info]    { blockNumber: 3918629, roundIndex: 6991n },
2024-05-16T14:19:02Z app[48ed67dc0e0308] cdg [info]    { blockNumber: 3918672, roundIndex: 6992n },
2024-05-16T14:19:02Z app[48ed67dc0e0308] cdg [info]    { blockNumber: 3918714, roundIndex: 6993n }
2024-05-16T14:19:02Z app[48ed67dc0e0308] cdg [info]  ]
2024-05-16T14:19:02Z app[48ed67dc0e0308] cdg [info]}
2024-05-16T14:19:02Z app[3d8dd965ae2998] cdg [info]Mapped 0x8460766Edc62B525fc1FA4D628FC79229dC73031 IE round index 6994n to SPARK round number 12417n
2024-05-16T14:19:03Z app[9185e760f41658] cdg [info]Cannot handle RoundStart: Error: Cannot find RoundStart event for round index 6994
2024-05-16T14:19:03Z app[9185e760f41658] cdg [info]    at getRoundStartEpoch (file:///app/lib/round-tracker.js:266:7)
2024-05-16T14:19:03Z app[9185e760f41658] cdg [info]    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
2024-05-16T14:19:03Z app[9185e760f41658] cdg [info]    at async updateSparkRound (file:///app/lib/round-tracker.js:35:29) {
2024-05-16T14:19:03Z app[9185e760f41658] cdg [info]  roundIndex: 6994n,
2024-05-16T14:19:03Z app[9185e760f41658] cdg [info]  recentRoundStartEvents: [
2024-05-16T14:19:03Z app[9185e760f41658] cdg [info]    { blockNumber: 3918547, roundIndex: 6989n },
2024-05-16T14:19:03Z app[9185e760f41658] cdg [info]    { blockNumber: 3918588, roundIndex: 6990n },
2024-05-16T14:19:03Z app[9185e760f41658] cdg [info]    { blockNumber: 3918629, roundIndex: 6991n },
2024-05-16T14:19:03Z app[9185e760f41658] cdg [info]    { blockNumber: 3918672, roundIndex: 6992n },
2024-05-16T14:19:03Z app[9185e760f41658] cdg [info]    { blockNumber: 3918714, roundIndex: 6993n }
2024-05-16T14:19:03Z app[9185e760f41658] cdg [info]  ]
2024-05-16T14:19:03Z app[9185e760f41658] cdg [info]}

Notice that:

  • Machine 48ed67dc0e0308 was not able to find the RoundStart event for roundIndex 6994
  • Machine 3d8dd965ae2998 mapped round index 6994 to a new Spark round 12417n, so presumably it was able to find the event
  • Machine 9185e760f41658 was not able to find the RoundStart event for roundIndex 6994
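
To make the failure mode concrete, here is a minimal sketch of a lookup that fails the same way when the requested round index is missing from the list of recent events. This is not the actual code from lib/round-tracker.js: the function name `findRoundStartEpoch` and its signature are hypothetical, and only the event data and the error message are taken from the log above.

```javascript
// Hypothetical sketch, NOT the real implementation in lib/round-tracker.js.
// Events use the same shape as `recentRoundStartEvents` printed in the log.
function findRoundStartEpoch (roundIndex, recentRoundStartEvents) {
  const event = recentRoundStartEvents.find(e => e.roundIndex === roundIndex)
  if (!event) {
    // Attach the lookup context to the error, mirroring the fields in the log.
    throw Object.assign(
      new Error(`Cannot find RoundStart event for round index ${roundIndex}`),
      { roundIndex, recentRoundStartEvents }
    )
  }
  // Assumption: the block number of the RoundStart event is the round's start epoch.
  return event.blockNumber
}

// Event list copied from the log; round index 6994 is not in it yet.
const recentRoundStartEvents = [
  { blockNumber: 3918547, roundIndex: 6989n },
  { blockNumber: 3918588, roundIndex: 6990n },
  { blockNumber: 3918629, roundIndex: 6991n },
  { blockNumber: 3918672, roundIndex: 6992n },
  { blockNumber: 3918714, roundIndex: 6993n }
]

try {
  findRoundStartEpoch(6994n, recentRoundStartEvents)
} catch (err) {
  console.log(err.message)
}
```

With this event list, a lookup for 6993n succeeds while a lookup for 6994n throws, which matches what machines 48ed67dc0e0308 and 9185e760f41658 reported.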

A bit later, when the machine 3d8dd965ae2998 was restarted and picked up the new version, it found round index 6993n and mapped it to Spark round number 12416n.

2024-05-16T14:19:08Z app[3d8dd965ae2998] cdg [info]Mapped 0x8460766Edc62B525fc1FA4D628FC79229dC73031 IE round index 6993n to SPARK round number 12416n
2024-05-16T14:19:08Z app[3d8dd965ae2998] cdg [info]SPARK round started: 12416n (epoch: 3918714)

This is what I found in the production database: Spark round 12417 (Meridian round index 6994) exists but was not picked up by the spark-api instance.

spark=# SELECT * FROM spark_rounds ORDER BY id DESC LIMIT 3;
  id   |          created_at           |              meridian_address              | meridian_round | max_tasks_per_node | start_epoch
-------+-------------------------------+--------------------------------------------+----------------+--------------------+-------------
 12417 | 2024-05-16 14:19:32.632483+00 | 0x8460766Edc62B525fc1FA4D628FC79229dC73031 |           6994 |                 15 |     3918757
 12416 | 2024-05-16 13:57:32.491816+00 | 0x8460766Edc62B525fc1FA4D628FC79229dC73031 |           6993 |                 15 |     3918714
 12415 | 2024-05-16 13:36:32.672675+00 | 0x8460766Edc62B525fc1FA4D628FC79229dC73031 |           6992 |                 15 |     3918672
(3 rows)
@bajtos
Member Author

bajtos commented May 16, 2024

A few moments later, the new version was deployed to the machine 9185e760f41658, and it picked up the correct round number.

2024-05-16T14:20:14Z app[9185e760f41658] cdg [info]Mapped 0x8460766Edc62B525fc1FA4D628FC79229dC73031 IE round index 6994n to SPARK round number 12417n
2024-05-16T14:20:14Z app[9185e760f41658] cdg [info]SPARK round started: 12417n (epoch: 3918757)
2024-05-16T14:20:14Z app[9185e760f41658] cdg [info]Initialized round tracker in 1587ms. SPARK round number at service startup: 12417n

@bajtos
Member Author

bajtos commented Oct 3, 2024

I don't remember encountering this problem recently. Perhaps the changes in the round-tracking logic fixed the cause.

@bajtos bajtos closed this as completed Oct 3, 2024
@github-project-automation github-project-automation bot moved this to ✅ done in Space Meridian Oct 3, 2024