
keep per-provider data on partial failures and surface errors in fingerprint generation #499

Open: Mzack9999 wants to merge 1 commit into main from 375-fingerprint-resilience

Conversation

Mzack9999 (Member) commented May 20, 2026

Summary

  • generate.Compile / Category.fetchInputItem no longer abort an entire category on the first failing URL or ASN. Per-source errors are accumulated and joined.
  • Multiple URLs or ASNs declared for the same provider are now merged (deduped) instead of overwritten.
  • getIpInfoASN now includes IPv6 prefixes (Prefixes6).
  • cmd/generate-index gains a -strict flag (default true) so a failing provider source surfaces as a non-zero exit, instead of silently producing an incomplete sources_data.json.
  • Fingerprint Update workflow bumps actions/checkout@v4 and actions/setup-go@v5, and adds a jq -e guard that fails the run if any top-level category collapses to zero providers.
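
The zero-provider guard from the last bullet can be pictured in Go as follows. This is only a sketch of the check, not the workflow's actual jq expression: it assumes sources_data.json is a top-level object mapping each category to an object of providers, and `emptyCategories` is a hypothetical name.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// emptyCategories returns the top-level categories that contain no providers.
// Assumption: the file is shaped as {category: {provider: ...}}.
func emptyCategories(data []byte) ([]string, error) {
	var doc map[string]map[string]json.RawMessage
	if err := json.Unmarshal(data, &doc); err != nil {
		return nil, err
	}
	var empty []string
	for name, providers := range doc {
		// A null or {} category decodes to a map of length zero.
		if len(providers) == 0 {
			empty = append(empty, name)
		}
	}
	sort.Strings(empty) // stable output regardless of map iteration order
	return empty, nil
}

func main() {
	sample := []byte(`{"cdn":{"example":["192.0.2.0/24"]},"waf":{}}`)
	empty, err := emptyCategories(sample)
	fmt.Println(empty, err) // prints "[waf] <nil>"
}
```

A CI step built on this idea would exit non-zero whenever the returned slice is non-empty.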

This was triggered by silent data loss in recent runs (e.g. WAF Akamai disappearing from sources_data.json after ipinfo started returning 401 for the first ASN of the category).

Closes #375

Test plan

  • go test ./...
  • go vet ./...
  • New unit tests cover: merge of multiple URLs per provider, continue-on-error with one failing URL, static CIDR + URL merge, ASN skipped without auth while URL still fetched.
  • Manual rerun of Fingerprint Update after merge + token rotation to confirm Akamai/Sucuri/Leaseweb repopulate.

Summary by CodeRabbit

  • New Features

    • Added -strict flag to control provider source compilation failure behavior; when disabled, generation continues with available data despite failures.
    • Improved data merging from multiple provider sources with automatic deduplication.
  • Bug Fixes

    • Enhanced compilation error handling to accumulate errors and continue processing instead of stopping on the first failure.


@auto-assign auto-assign Bot requested a review from dwisiswant0 May 20, 2026 06:07

coderabbitai Bot commented May 20, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: faff7d8d-8564-44f1-9a3d-0defbccbb295

📥 Commits

Reviewing files that changed from the base of the PR and between 1ba9617 and 15cada8.

⛔ Files ignored due to path filters (1)
  • .github/workflows/fingerprint-update.yml is excluded by !**/*.yml
📒 Files selected for processing (5)
  • cmd/generate-index/main.go
  • generate/cidrs.go
  • generate/cidrs_test.go
  • generate/input.go
  • generate/input_test.go

Walkthrough

This PR changes provider source compilation from fail-fast to error-accumulating behavior. A new CIDR deduplication helper merges results across multiple sources while preserving order and preventing duplicates. Input processing now accumulates errors and continues through all providers instead of returning immediately on failure. A CLI -strict flag controls whether accumulated errors are fatal or lenient.
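
An order-preserving merge with deduplication, as described above, could be sketched like this. The real helper is appendUniqueCIDRs in generate/cidrs.go, whose signature is not shown here; `mergeUnique` is a hypothetical stand-in that treats CIDRs as plain strings and compares them exactly, without normalization.

```go
package main

import "fmt"

// mergeUnique returns the items of dst followed by the items of src,
// dropping any value that has already been emitted. First-seen order is kept.
func mergeUnique(dst, src []string) []string {
	seen := make(map[string]struct{}, len(dst)+len(src))
	out := make([]string, 0, len(dst)+len(src))
	for _, list := range [][]string{dst, src} {
		for _, cidr := range list {
			if _, dup := seen[cidr]; dup {
				continue
			}
			seen[cidr] = struct{}{}
			out = append(out, cidr)
		}
	}
	return out
}

func main() {
	a := []string{"192.0.2.0/24", "198.51.100.0/24"}
	b := []string{"198.51.100.0/24", "2001:db8::/32"}
	fmt.Println(mergeUnique(a, b)) // prints "[192.0.2.0/24 198.51.100.0/24 2001:db8::/32]"
}
```

Ranging over a nil slice is a no-op in Go, so nil inputs need no special handling in this sketch.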

Changes

Provider source compilation with error resilience

| Layer / File(s) | Summary |
|---|---|
| CIDR deduplication utility (generate/cidrs.go, generate/cidrs_test.go) | appendUniqueCIDRs merges CIDR slices while deduplicating and preserving original order using a set-based map. Unit tests validate merging of disjoint and overlapping inputs, nil handling, and deduplication. |
| Error accumulation in input source processing (generate/input.go, generate/input_test.go) | Categories.Compile and fetchInputItem now accumulate per-provider errors and merge CIDR results via appendUniqueCIDRs instead of fail-fast, continuing through all providers and returning joined errors with partially populated output. getIpInfoASN includes IPv6 prefix netblocks. Tests validate multi-URL merging, continuation on provider failure, combining static and fetched CIDRs, and auth token constraints. |
| CLI strict mode flag (cmd/generate-index/main.go) | New -strict boolean flag (default true) controls whether accumulated compilation errors are fatal. When disabled, partial results are preserved even if provider sources fail. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A rabbit's ode to resilience:

Through CIDR deduplication spun with care,
Provider sources now merge everywhere.
When one path breaks, the rest carry on,
Collecting errors, but the work lives on—
Strict mode waits, while lenient shines free.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 18.18% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title clearly summarizes the main change: keeping per-provider data during partial failures and surfacing errors, which aligns with the core modifications across multiple files. |
| Linked Issues check | ✅ Passed | The PR directly addresses issue #375 by implementing error handling to prevent silent data loss and surface failures, accumulating per-provider/source errors instead of aborting on first failure. |
| Out of Scope Changes check | ✅ Passed | All changes are in scope: error handling improvements in generate/input.go, CIDR merging in generate/cidrs.go, CLI flag in cmd/generate-index/main.go, and comprehensive tests align with the resilience objectives. |



Development

Successfully merging this pull request may close these issues.

Error happen when generating sources_data.json

1 participant