Processing Lifecycle
You will learn what happens after the downloader stages a feed body, how the processing engine produces published artifacts, and how heavy phases and background work fit in.
When the processing loop wakes, it follows this sequence:
- Claim staged body — rename a `.new` file to `.processing` for each feed in the batch.
- Feed-local processing — analyze the canonical feed body for per-feed state.
- Heavy phases — run global enrichment and comparison across all relevant feeds.
- Commit — atomically promote `.processing` files to committed feed bodies and publish all artifacts.
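The claim step can be pictured as a plain rename. The sketch below is illustrative only: the function name `claim_staged_bodies`, the flat directory layout, and the one-file-per-feed naming are assumptions, not the engine's actual code.

```python
from pathlib import Path


def claim_staged_bodies(feed_dir: Path) -> list[Path]:
    """Rename every staged `.new` file in feed_dir to `.processing`.

    Returns the claimed paths in sorted order. The directory layout is an
    assumption made for this sketch.
    """
    claimed = []
    for staged in sorted(feed_dir.glob("*.new")):
        target = staged.with_suffix(".processing")
        # A rename within one filesystem is atomic, so a concurrent reader
        # never observes a half-moved file.
        staged.replace(target)
        claimed.append(target)
    return claimed
```

Files without the `.new` suffix are left alone, so a feed with no staged body is simply not part of the batch.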
For each admitted feed, the engine produces:
- Metadata — size, unique IP count, IP family, change rate.
- History — bounded point-in-time snapshots of feed size over time.
- Retention — how long IPs have been listed, and how long removed IPs stayed listed before removal.
- Change rate — rotation percentage, update frequency measurements.
- Provider enrichment — ASN distribution, geographic distribution, bogon overlap.
The engine reads only local canonical feed bodies and local provider databases. It never fetches upstream.
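As a rough illustration of the metadata artifact, the following sketch derives an entry count, a unique IP count, and an IP family from a feed body using only Python's standard `ipaddress` module. The function name, the field names, and the `#` comment syntax are assumptions for this example; the real artifact schema may differ.

```python
import ipaddress


def feed_metadata(lines: list[str]) -> dict:
    """Compute entry count, unique IP count, and IP family for one feed body."""
    networks = set()
    for raw in lines:
        entry = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not entry:
            continue
        # strict=False accepts entries such as 10.0.0.1/24 with host bits set.
        networks.add(ipaddress.ip_network(entry, strict=False))

    unique_ips = 0
    for version in (4, 6):
        same_family = [n for n in networks if n.version == version]
        # collapse_addresses merges overlapping and adjacent networks, so an
        # address covered by two entries is counted once.
        collapsed = ipaddress.collapse_addresses(same_family)
        unique_ips += sum(n.num_addresses for n in collapsed)

    families = {n.version for n in networks}
    if not families:
        family = "empty"
    elif families == {4}:
        family = "ipv4"
    elif families == {6}:
        family = "ipv6"
    else:
        family = "mixed"

    return {"entries": len(networks), "unique_ips": unique_ips, "family": family}
```

Note that the function takes lines already read from a local file, mirroring the rule that processing never fetches upstream.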
After feed-local work completes for the batch, the engine runs global phases:
| Phase | What it does |
|---|---|
| Pairwise comparison | Compares each updated feed against every other enabled public feed. Updates overlap counts on both sides. |
| GeoIP fan-out | Updates geographic enrichment for affected feeds using the current GeoIP provider. |
| ASN fan-out | Updates ASN enrichment for affected feeds using the current ASN provider. |
| Bogon analysis | Checks affected feeds against bogon reference data. |
| Critical infrastructure | Generates overlap artifacts for critical-infrastructure reference feeds. |
| Insights | Produces deterministic insights from all computed facts. |
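To make the pairwise comparison row concrete, here is a minimal sketch in which each feed is modeled as a set of IP strings. The function name `update_overlaps` and the nested-dictionary result are hypothetical; the sketch only shows why one updated feed touches overlap counts on both sides of every pair.

```python
def update_overlaps(
    updated: list[str],
    feeds: dict[str, set[str]],
    overlaps: dict[str, dict[str, int]],
) -> dict[str, dict[str, int]]:
    """Recompute overlap counts between each updated feed and every other feed."""
    for name in updated:
        for other in feeds:
            if other == name:
                continue
            shared = len(feeds[name] & feeds[other])
            # Record the count on both sides so either feed's artifact is current.
            overlaps.setdefault(name, {})[other] = shared
            overlaps.setdefault(other, {})[name] = shared
    return overlaps
```

Feeds that were not updated are compared only against the updated ones, which keeps the work proportional to the batch rather than to all pairs.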
Heavy-phase concurrency is independently configurable. The engine stops admitting new heavy work during shutdown and waits for in-flight workers to settle.
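One way to picture bounded heavy-phase concurrency with a graceful shutdown is a thread pool guarded by an event. This is a sketch under assumptions: `run_heavy_phases`, the `max_workers` parameter, and the `shutdown` event are illustrative names, and the engine's real scheduler may work differently.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, wait
from typing import Callable, Iterable


def run_heavy_phases(
    tasks: Iterable[Callable[[], object]],
    max_workers: int,
    shutdown: threading.Event,
) -> tuple[list[object], int]:
    """Run tasks with bounded concurrency and stop admitting work on shutdown.

    Returns the results of admitted tasks in submission order and the number
    of tasks that were skipped because shutdown had begun.
    """
    futures = []
    skipped = 0
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for task in tasks:
            if shutdown.is_set():
                skipped += 1  # no new heavy work is admitted once shutdown begins
                continue
            futures.append(pool.submit(task))
        wait(futures)  # let already admitted workers settle
    return [f.result() for f in futures], skipped
```

In this simplified model, "admitted" means "submitted to the pool": work submitted before the shutdown signal still runs to completion.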
The commit step:
- Writes all public artifacts (metadata, history, comparisons, enrichment, insights).
- Promotes each `.processing` body to the committed feed body.
- Sets correct mtimes on published files for pipeline integrity.
If any step fails before promotion, the previous committed outputs remain authoritative. The staged `.processing` body stays on disk for retry.
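The commit sequence can be sketched with write-to-temp, rename, and `os.utime`. Everything named here (`commit_feed`, the `.tmp` suffix, the single output directory, the `mtime` argument) is an assumption for illustration, not the engine's implementation.

```python
import os
from pathlib import Path


def commit_feed(
    processing: Path,
    committed: Path,
    artifacts: dict[str, bytes],
    out_dir: Path,
    mtime: float,
) -> None:
    """Publish artifacts, then promote the `.processing` body."""
    # Write every artifact under a temporary name first. If a write fails,
    # nothing published has been replaced and the `.processing` body is
    # still on disk for a retry.
    staged = []
    for name, payload in artifacts.items():
        tmp = out_dir / (name + ".tmp")
        tmp.write_bytes(payload)
        staged.append((tmp, out_dir / name))

    # Swap each artifact into place and stamp its modification time.
    for tmp, final in staged:
        os.replace(tmp, final)
        os.utime(final, (mtime, mtime))

    # Promotion is last: rename the body over the committed one.
    os.replace(processing, committed)
    os.utime(committed, (mtime, mtime))
```

Each individual rename is atomic, but this sketch does not make the whole set of renames atomic as a group; how the engine guarantees that is not covered here.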
After the main batch commits, some work runs in the background:
- Entity artifact patching — country and ASN detail pages update incrementally.
- Entity sidecar generation — per-feed entity sidecars are precomputed during processing, then consumed by the background patcher.
Background work is visible in the admin UI. It does not block the next processing cycle.
Within one batch, feeds are processed in this order:
- Normal feeds (plain sources, artifact children)
- History derivatives
- Merges (ordered by increasing dependency count)
This ensures deterministic publication order. The engine does not compose history derivatives or merges — that is the downloader's job.
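The ordering rule above maps naturally onto a sort key. In this sketch the `Feed` class, the `kind` labels, and the tie-break on feed name are assumptions added to make the example deterministic; the source only specifies the three groups and the dependency-count ordering for merges.

```python
from dataclasses import dataclass, field

# Lower rank is processed first (labels are hypothetical).
KIND_ORDER = {"normal": 0, "history": 1, "merge": 2}


@dataclass
class Feed:
    name: str
    kind: str  # "normal", "history", or "merge"
    dependencies: list[str] = field(default_factory=list)


def processing_order(feeds: list[Feed]) -> list[Feed]:
    """Sort feeds: normal first, then history derivatives, then merges."""

    def key(feed: Feed) -> tuple[int, int, str]:
        # Only merges are ordered by dependency count; the name tie-break
        # is an assumption that keeps the result stable across runs.
        dep_count = len(feed.dependencies) if feed.kind == "merge" else 0
        return (KIND_ORDER[feed.kind], dep_count, feed.name)

    return sorted(feeds, key=key)
```

Because the key is a pure function of the feed definition, the same batch always yields the same publication order.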
- Pipeline Overview — how processing fits into the full pipeline
- Download Lifecycle — what happens before processing
- Triggers and Reprocessing — what causes processing to run