Add webseed checker #18755

Merged
anacrolix merged 6 commits into main from anacrolix/webseed-checker
Feb 9, 2026
@anacrolix
Contributor

@anacrolix anacrolix commented Jan 22, 2026

I added a subcommand that checks that webseed snapshot data matches the preverified/chain.tomls, after we had an issue on hoodi where a torrent file did not match its data file.

I'm not sure where the command should go, or what it should be called. I put it as `snapshots preverified` for now in `cmd/erigon`, but it doesn't quite fit in with the flags and assumed handling that occurs. So maybe it belongs in `cmd/downloader`.

It reuses the `--preverified` flag from `snapshots reset`, which is very useful.

I also exported the signal listener for use with subcommands, not just the default command, so you can cancel them and get proper cleanup.

Update:

I've added it as cmd/downloader verify_webseeds as well.

I've used it to test gnosis webseed issues:

GOFLAGS='-race -v' just run-raw-erigon-cmd downloader --chain gnosis webseed_verify --preverified embedded | tee -i out
jq < out '.gnosis | map_values(select(.Err)) | to_entries | map({name: .key, err: .value.Err}) | .[]' -c > gnosis
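
For readers who prefer Go to jq, the filter above can be approximated as follows. This is only a sketch: it assumes the report is a single JSON object keyed by chain name, then by snapshot name, with an optional `Err` field per entry, which is what the jq expression implies. The `failedEntry` type, the `failedSnapshots` function, and the sample data are hypothetical and not part of the PR.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// failedEntry mirrors the {name, err} objects emitted by the jq filter.
// Hypothetical type, not taken from the erigon code base.
type failedEntry struct {
	Name string `json:"name"`
	Err  any    `json:"err"`
}

// failedSnapshots returns the entries for one chain whose "Err" field is
// truthy in the jq sense (present, not null, not false), sorted by name.
func failedSnapshots(report []byte, chain string) ([]failedEntry, error) {
	// Assumed report shape: chain -> snapshot name -> result object.
	var byChain map[string]map[string]struct {
		Err any `json:"Err"`
	}
	if err := json.Unmarshal(report, &byChain); err != nil {
		return nil, err
	}
	var out []failedEntry
	for name, res := range byChain[chain] {
		if res.Err == nil {
			continue // missing or null
		}
		if b, ok := res.Err.(bool); ok && !b {
			continue // explicit false
		}
		out = append(out, failedEntry{Name: name, Err: res.Err})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out, nil
}

func main() {
	// Made-up sample data, for illustration only.
	sample := []byte(`{"gnosis":{"a.seg":{"Err":"hash mismatch"},"b.seg":{"Err":null}}}`)
	failed, err := failedSnapshots(sample, "gnosis")
	if err != nil {
		panic(err)
	}
	for _, f := range failed {
		fmt.Printf("%s: %v\n", f.Name, f.Err)
	}
}
```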

Member

@sudeepdino008 sudeepdino008 left a comment

The only place I can think of adding it is the erigon-snapshot repo, in the mirror workflow (which copies the merged TOML from GitHub to the webseeds).

@anacrolix
Contributor Author

I will try to get it merged in. There's some low-hanging fruit: reusing ETags and skipping refetches of unchanged data would make it check webseeds extremely quickly. It outputs JSON to allow quick queries on the state of things and the cause of any failure.
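
The ETag idea mentioned here could look roughly like the sketch below, which uses a standard HTTP conditional request (`If-None-Match`) to skip work when a webseed resource has not changed. This is not how the checker is implemented in the PR; `unchangedSince` and `newFakeWebseed` are hypothetical names, and the local test server only stands in for a real webseed that is assumed to return `ETag` headers.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// unchangedSince sends a conditional HEAD request. It reports whether the
// resource still matches the previously recorded ETag, and returns the ETag
// to remember for the next run.
func unchangedSince(client *http.Client, url, etag string) (bool, string, error) {
	req, err := http.NewRequest(http.MethodHead, url, nil)
	if err != nil {
		return false, "", err
	}
	if etag != "" {
		req.Header.Set("If-None-Match", etag)
	}
	resp, err := client.Do(req)
	if err != nil {
		return false, "", err
	}
	defer resp.Body.Close()
	switch resp.StatusCode {
	case http.StatusNotModified:
		return true, etag, nil // unchanged: no need to refetch or recheck
	case http.StatusOK:
		return false, resp.Header.Get("ETag"), nil
	default:
		return false, "", fmt.Errorf("unexpected status %s", resp.Status)
	}
}

// newFakeWebseed starts a local server that always serves the given ETag.
// It exists only so the sketch runs without network access.
func newFakeWebseed(etag string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("If-None-Match") == etag {
			w.WriteHeader(http.StatusNotModified)
			return
		}
		w.Header().Set("ETag", etag)
		w.WriteHeader(http.StatusOK)
	}))
}

func main() {
	srv := newFakeWebseed(`"v1"`)
	defer srv.Close()

	// First run: nothing cached, so the resource counts as changed.
	same, tag, err := unchangedSince(srv.Client(), srv.URL, "")
	fmt.Println(same, tag, err)

	// Second run: the cached ETag lets the server answer 304 Not Modified.
	same, tag, err = unchangedSince(srv.Client(), srv.URL, tag)
	fmt.Println(same, tag, err)
}
```

A real checker would need to persist the ETags between runs and fall back to a full check when a server does not send them.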

@anacrolix anacrolix force-pushed the anacrolix/webseed-checker branch from ad06b69 to ab894f4 February 9, 2026 00:12
@anacrolix anacrolix self-assigned this Feb 9, 2026
@anacrolix anacrolix merged commit 08553fb into main Feb 9, 2026
22 of 24 checks passed
@anacrolix anacrolix deleted the anacrolix/webseed-checker branch February 9, 2026 09:52
Sahil-4555 pushed a commit to Sahil-4555/erigon that referenced this pull request Feb 11, 2026
wmitsuda added a commit that referenced this pull request Feb 15, 2026
PR #18755 accidentally passed `base` (the webseed URL) for both the
`webseedUrlBase` and `name` parameters of GetMetainfoFromWebseed,
producing URLs like:
  https://host/https://host/.torrent

Pass the snapshot `name` instead so webseedMetainfoUrl correctly
builds: webseedUrlBase + name + ".torrent".

Fixes #19094

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
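
The doubled URL described in this commit message can be reproduced with plain string concatenation. The sketch below is illustrative only: `metainfoURL` is a hypothetical stand-in for the `webseedUrlBase + name + ".torrent"` construction, not the actual `webseedMetainfoUrl` code, and the snapshot name is made up.

```go
package main

import "fmt"

// metainfoURL is a hypothetical stand-in for the construction described
// above: webseedUrlBase + name + ".torrent".
func metainfoURL(webseedUrlBase, name string) string {
	return webseedUrlBase + name + ".torrent"
}

func main() {
	base := "https://host/"
	name := "example-snapshot.seg" // made-up snapshot name

	// Bug: the base URL is passed for both parameters.
	fmt.Println(metainfoURL(base, base)) // https://host/https://host/.torrent

	// Fix: pass the snapshot name as the second argument.
	fmt.Println(metainfoURL(base, name)) // https://host/example-snapshot.seg.torrent
}
```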
wmitsuda added a commit that referenced this pull request Feb 16, 2026
## Summary

- PR #18755 ("Add webseed checker") accidentally passed `base` for both
`webseedUrlBase` and `name` in `GetMetainfoFromWebseed`, producing
doubled URLs like `https://host/https://host/.torrent` which return HTTP
404
- Fix: pass the snapshot `name` instead of `base` as the second argument

Fixes #19094

## Test plan

- [x] Built and ran an ephemeral mainnet instance for 2 minutes — no
`"error fetching metainfo from webseeds"` warnings, all 305/305 file
metadata fetched, webseed download steady at ~62-70 MB/s

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>