jks0x/dnsflow

dnsflow

Name

dnsflow — an advanced DNS forwarding plugin for CoreDNS focused on speed, reliability, and intelligent traffic routing. It extends the built-in forward plugin with CIDR-based dynamic rerouting, self-learning domain tracking, a high-performance cache, and Linux firewall integration.

Features

Multi-protocol upstreams: Supports UDP, TCP, and DNS-over-TLS (DoT). Connections are reused across requests to minimise handshake overhead.

Intelligent routing with self-learning: The check directive inspects the IP addresses in each DNS response against one or more CIDR files. Domains whose resolved IPs fall outside (or inside) the expected range are automatically rerouted to a different upstream. Combined with track, newly discovered domains are appended to a target file in full:DOMAIN format, persisted for crash safety, and restored on restart — enabling hands-free, long-lived adaptive routing.

High-performance cache: A sharded, 2Q (Two-Queue) eviction cache shields hot entries from one-time scan evictions. Supports proactive background prefetching (prefetch), Stale-While-Revalidate serving (serve_stale), gzip-compressed persistence across restarts (persist), and top-hit warmup on startup (warmup).

Linux firewall integration: Resolved IP addresses can be automatically injected into ipset or nftables sets via an internal async worker, enabling per-domain transparent proxy steering without external scripts. Per-entry timeouts are supported.

Load balancing & health checking: Three upstream selection policies — random, round_robin, and sequential — with continuous health probing. A configurable spray fallback prevents SERVFAIL when all upstreams are simultaneously down.

Observability: Exports a comprehensive set of Prometheus metrics covering request latency (microsecond precision), cache hit/miss/stale rates, CIDR reroute events, domain tracker state, and firewall queue health.

Syntax

In its most basic form, a simple DNS redirector uses the following syntax:

dnsflow FROM... {
    to TO...
}
  • FROM... is a list of files containing the base domains/rules that a request must match in order to be redirected.

    . (i.e. the root zone) can be used on its own to match all incoming requests as a fallback.

    Several formats are supported, including standard and v2fly compatible formats:

    • DOMAIN: matches the domain itself and all its subdomains.

    • full:DOMAIN: matches the domain exactly (exact match).

    • domain:DOMAIN: matches the domain itself and all its subdomains.

    • keyword:KEYWORD: matches any domain that contains the keyword.

    • regexp:PATTERN: matches domains using a regular expression.

    • include:FILENAME: includes domain rules from another file (path relative to current file).

    • server=/DOMAIN/...: dnsmasq format, only the DOMAIN part will be used.

      Attributes can be appended to rules (e.g., domain:google.com @cn), although they currently serve only as metadata. Text after a # character is treated as a comment. Unparsable lines (including whitespace-only lines) are ignored.

  • to TO... are the destination endpoints to redirect to. This option is mandatory.

    The to syntax allows you to specify a protocol, a port, etc:

    [dns://]IP[:PORT] uses the protocol of the incoming DNS request, which may be UDP or TCP.

    [udp://]IP[:PORT] uses UDP for the DNS query, even if the request arrives over TCP.

    [tcp://]IP[:PORT] uses TCP for the DNS query, even if the request arrives over UDP.

    tls://IP[:PORT]@TLS_SERVER_NAME or tls://HOSTNAME[:PORT] for DNS over TLS. The @TLS_SERVER_NAME suffix supplies the SNI for the handshake; it is mandatory when the address is a literal IP. Hostname forms derive SNI from the hostname automatically.

    Example:

    dns://1.1.1.1
    8.8.8.8
    tcp://9.9.9.9
    udp://2606:4700:4700::1111
    
    tls://1.1.1.1@one.one.one.one
    tls://8.8.8.8@dns.google
    tls://dns.quad9.net
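
As a rough illustration of how one line of a FROM file maps onto the rule formats listed earlier in this section, here is a minimal Python sketch. It is not the plugin's implementation: the function names parse_rule and matches are hypothetical, and the sketch simplifies by assuming a # never appears inside a regexp pattern and that an unknown prefix makes a line unparsable.

```python
import re

PREFIXES = ("full", "domain", "keyword", "regexp", "include")


def parse_rule(line):
    """Return (kind, value) for one FROM-file line, or None if it is ignored."""
    # Text after '#' is a comment (assumption: no '#' inside regexp patterns).
    line = line.split("#", 1)[0].strip()
    if not line:
        return None
    # dnsmasq format: only the DOMAIN part is used.
    m = re.match(r"^server=/([^/]+)/", line)
    if m:
        return ("domain", m.group(1))
    # Attributes such as "@cn" are metadata, so only the first token matters.
    rule = line.split()[0]
    for prefix in PREFIXES:
        if rule.startswith(prefix + ":"):
            value = rule[len(prefix) + 1:]
            return (prefix, value) if value else None
    if ":" in rule:
        return None  # assumption: unknown prefixes count as unparsable
    return ("domain", rule)  # a bare DOMAIN behaves like domain:DOMAIN


def matches(rule, qname):
    """Check a query name against one parsed rule."""
    kind, value = rule
    name = qname.rstrip(".").lower()
    if kind == "full":
        return name == value.lower()
    if kind == "domain":
        v = value.lower()
        return name == v or name.endswith("." + v)
    if kind == "keyword":
        return value.lower() in name
    if kind == "regexp":
        return re.search(value, name) is not None
    return False  # include: is a file reference, not a match rule
```

For example, parse_rule("domain:google.com @cn") yields ("domain", "google.com"), which matches mail.google.com but not notgoogle.com.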
    

An expanded syntax unlocks the full power of the dnsflow plugin:

dnsflow FROM... {
    to TO...

    except IGNORED_NAME...

    ipset SETNAME FAMILY [timeout DURATION]
    nfset add element TABLE SET [ip|ip6|auto] [INTERVAL] [TIMEOUT]
    filter none|ipv4|ipv6
    check OUTPUT_FILE [in|out] CIDR_FILE... [match any|all]
    track [capacity N] [max_age DURATION] [compact_ratio FLOAT]

    cache {
        capacity SIZE
        persist FILE_PATH
    }

    # Advanced options (rarely needed, see below)
    fallback [spray|none]
    policy random|round_robin|sequential
    health_check DURATION
    path_reload DURATION
    metrics_lookup true|false

    # Advanced cache options
    cache {
        warmup COUNT [RATE]
        prefetch HITS PERCENTAGE
        serve_stale DURATION
        min_ttl DURATION
        max_ttl DURATION
    }
}

Some options take a DURATION argument. Unless explicitly stated otherwise, a zero duration (i.e. 0) disables the corresponding feature. Valid duration examples: 0, 500ms, 3s, 1h, 2h15m, 7d, 1d12h, etc. Bare numbers are treated as seconds (e.g. 300 = 5m).
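
To make the DURATION format concrete, the following Python sketch parses the example values above into milliseconds. It is an illustration under stated assumptions, not the plugin's parser: the name parse_duration and the accepted unit set (ms, s, m, h, d) are inferred only from the examples in this section.

```python
import re

# Unit sizes in milliseconds; the unit set is inferred from the examples above.
_UNIT_MS = {"ms": 1, "s": 1000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}
_TOKEN = re.compile(r"(\d+)(ms|s|m|h|d)")


def parse_duration(text):
    """Convert a DURATION string such as '2h15m' to integer milliseconds."""
    text = text.strip()
    if re.fullmatch(r"\d+", text):
        return int(text) * 1000  # bare numbers are seconds
    pos = 0
    total = 0
    for m in _TOKEN.finditer(text):
        if m.start() != pos:
            raise ValueError(f"invalid duration: {text!r}")
        total += int(m.group(1)) * _UNIT_MS[m.group(2)]
        pos = m.end()
    # Reject empty input and trailing garbage.
    if pos == 0 or pos != len(text):
        raise ValueError(f"invalid duration: {text!r}")
    return total
```

With this sketch, parse_duration("300") and parse_duration("5m") both return 300000, and parse_duration("1d12h") returns 129600000.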

  • FROM... and to TO... as above.

  • except is a space-separated list of domains to exclude from redirecting. Requests matching one of these names are not redirected by this block; requests that match none of them are passed through as usual.

    It is usually not a good idea to embed many except domains in the Corefile. If the list grows long, remove those domains directly from the FROM files instead.

  • ipset (requires root privileges) adds the IP addresses resolved for domains in FROM... to the ipset SETNAME.

    • FAMILY: The address family to use, e.g., ipv4, ipv6, inet, inet4, inet6.

    • timeout DURATION: Optional timeout for the ipset entry (e.g., 7d, 24h, 1h30m, 604800s, or bare seconds 604800). timeout 0 means a permanent entry, not disabled.

      Note that this option is only effective on Linux.

  • nfset add element (requires root privileges) adds resolved IP addresses to an nftables set.

    • TABLE: nftables table name.
    • SET: nftables set name.
    • [ip|ip6|auto]: Address family (default is auto).
    • [INTERVAL]: Set to true if the nftables set is an interval set.
    • [TIMEOUT]: Optional duration for the set element timeout (e.g., 7d, 24h, 1h30m). A zero timeout means a permanent element, not disabled.
  • filter filters the DNS responses.

    • none: No filtering (default).
    • ipv4: Filter IPv4 addresses (A records).
    • ipv6: Filter IPv6 addresses (AAAA records).
  • check evaluates DNS responses against CIDR ranges and automatically reroutes matching domains to the target upstream. IPv4-mapped IPv6 answers (::ffff:x.x.x.x) are normalized to IPv4 before matching, the same as firewall set dispatch.

    • OUTPUT_FILE: The domain list file of the target upstream to reroute matched domains into (in v2fly full: format). Must correspond to a FROM file used by another dnsflow block.
    • [in|out]: Optional keyword to specify the matching direction:
      • in: Match domains whose IPs ARE inside the CIDR range.
      • out (default): Match domains whose IPs are NOT inside the CIDR range (e.g. foreign IPs falling outside a domestic CIDR list).
    • CIDR_FILE...: One or more files containing CIDR ranges (one prefix per line) to check against.
    • [match any|all]: Optional match policy (default all):
      • all (default): Reroute only when all IPs in the DNS response satisfy the condition. Recommended — avoids false positives for CDNs that return a mix of domestic and foreign addresses.
      • any: Reroute as soon as any IP satisfies the condition. Use for strict pollution detection where even one foreign IP is enough reason to reroute.

    check can be specified multiple times on the same upstream to check against different CIDR sets or target different outputs.

    Routing internals and dynamic tracking behavior are documented in docs/routing.md.

  • track enables runtime domain lifecycle management on a target upstream. It must be placed on the upstream whose FROM file is referenced by a check OUTPUT_FILE on another upstream.

    • capacity N: Optional, maximum number of dynamically discovered domains to keep in memory (default 8000). When full, the least recently queried domain is evicted. Set to 0 for unlimited.
    • max_age DURATION: Optional, evict domains that have not been queried for longer than this duration (e.g. 720h, 30d). Default is disabled (0). On restart, entries older than max_age are also discarded. Entries without a timestamp (from legacy files) are never age-evicted.
    • compact_ratio FLOAT: Optional, fraction of capacity evictions that triggers a file compaction (default 1.0, i.e. one full store turnover). Lower values compact more frequently at the cost of extra I/O. For unlimited-capacity trackers with max_age set, compaction triggers after every 2000 age evictions.

    Tracked domains are appended to OUTPUT_FILE immediately for crash safety, and the file is compacted automatically when evictions accumulate. On restart, persisted entries are restored into memory.

  • cache enables a high-performance DNS cache with persistence and warmup support.

    • capacity SIZE: Approximate maximum number of cached records. The cache is internally split into 128 shards, each capped at ceil(SIZE / 128) entries; the true upper bound is therefore ceil(SIZE / 128) × 128, which can exceed SIZE by up to 127 entries (notable only for very small SIZE). Set to 0 for unlimited. Default is 10000.

    • persist FILE_PATH: Persist hot cache entries to a gzip-compressed file. Supports atomic writes and versioned file headers. When multiple dnsflow blocks use the same path, files are automatically scoped by upstream name. Default is disabled.

    Cache internals, persistence format, and refresh behavior are documented in docs/cache.md.
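
The capacity arithmetic described above for the sharded cache can be checked with a few lines of Python. This is only a worked calculation of the stated formula; the function names are hypothetical.

```python
SHARDS = 128  # shard count stated in the capacity description


def per_shard_cap(size):
    """ceil(SIZE / 128) using integer arithmetic."""
    return (size + SHARDS - 1) // SHARDS


def true_upper_bound(size):
    """Total entries the cache can hold: ceil(SIZE / 128) * 128."""
    return per_shard_cap(size) * SHARDS


# Default capacity: 79 entries per shard, 10112 entries in total.
print(per_shard_cap(10000), true_upper_bound(10000))
# Very small SIZE shows the largest relative overshoot: 1 -> 128.
print(true_upper_bound(1))
```

The overshoot is zero whenever SIZE is a multiple of 128 and reaches its maximum of 127 when SIZE is one more than a multiple of 128 (or SIZE is 1).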

Firewall backend behavior, timeout precedence, and async deduplication are documented in docs/firewall.md.
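
The check directive's direction (in or out) and match policy (any or all) described earlier can be sketched as a small decision function. The Python below is a hedged illustration, not the plugin's code: should_reroute and normalize are hypothetical names, and treating a response with no addresses as "do not reroute" is an assumption.

```python
import ipaddress


def normalize(ip_text):
    """Map IPv4-mapped IPv6 answers (::ffff:x.x.x.x) to plain IPv4."""
    ip = ipaddress.ip_address(ip_text)
    if ip.version == 6 and ip.ipv4_mapped is not None:
        return ip.ipv4_mapped
    return ip


def should_reroute(answer_ips, cidrs, direction="out", match="all"):
    """Decide whether a response's IPs satisfy the check condition."""
    nets = [ipaddress.ip_network(c, strict=False) for c in cidrs]

    def satisfies(ip):
        inside = any(ip.version == n.version and ip in n for n in nets)
        # in: IPs ARE inside the ranges; out: IPs are NOT inside.
        return inside if direction == "in" else not inside

    results = [satisfies(normalize(t)) for t in answer_ips]
    if not results:
        return False  # assumption: no addresses means nothing to reroute
    return all(results) if match == "all" else any(results)


domestic = ["203.0.113.0/24"]  # placeholder range for illustration only
mixed = ["203.0.113.5", "198.51.100.7"]
print(should_reroute(mixed, domestic, match="all"))  # False: one IP is inside
print(should_reroute(mixed, domestic, match="any"))  # True: one IP is outside
```

The mixed-answer case shows why all is the recommended default: a CDN response containing both inside and outside addresses does not trigger a reroute unless any is chosen.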

Advanced Options

The following options have sensible defaults and rarely need to be configured:

  • fallback configures the failsafe policy when all upstreams in to are marked as unhealthy. The default is spray, which will randomly pick one upstream to send the traffic to as a last resort, ignoring health status. Use fallback none to disable this behavior (requests will fail immediately with SERVFAIL when all upstreams are down).

  • policy specifies the policy to use for selecting upstream hosts. The default is random.

    • random will randomly select a healthy upstream host.

    • round_robin will select a healthy upstream host in round robin order.

    • sequential will select a healthy upstream host in sequential order.

  • health_check configures health checking of the upstream hosts:

    • DURATION specifies the health checking interval. Default is 2s, minimum is 1s.
  • path_reload sets the interval at which each path in FROM... is reloaded. Default is disabled (0), minimum is 1s.

  • metrics_lookup true|false enables Prometheus timing for the domain lookup path (coredns_dnsflow_name_lookup_duration_us). Default is false because this metric adds timing/Observe overhead to every request; enable it only while profiling routing lookup cost.
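
The interaction between policy and fallback described above can be sketched as follows. This is a simplified Python model, not the plugin's implementation: the Selector class is hypothetical, and it assumes that sequential means "first healthy upstream in configured order" and that round_robin advances a cursor past unhealthy hosts.

```python
import random


class Selector:
    """Toy model of upstream selection with a spray fallback."""

    def __init__(self, upstreams, policy="random", fallback="spray", rng=None):
        self.upstreams = list(upstreams)  # 'to' is mandatory, so non-empty
        self.policy = policy
        self.fallback = fallback
        self.rng = rng or random.Random()
        self._next = 0  # round-robin cursor

    def pick(self, healthy):
        """Return the chosen upstream, or None to signal SERVFAIL."""
        candidates = [u for u in self.upstreams if u in healthy]
        if not candidates:
            if self.fallback == "spray":
                # Last resort: ignore health status and pick at random.
                return self.rng.choice(self.upstreams)
            return None  # fallback none: fail immediately
        if self.policy == "sequential":
            return candidates[0]
        if self.policy == "round_robin":
            n = len(self.upstreams)
            for i in range(n):
                u = self.upstreams[(self._next + i) % n]
                if u in healthy:
                    self._next = (self._next + i + 1) % n
                    return u
        return self.rng.choice(candidates)  # policy random
```

With upstreams a, b, c and only a and c healthy, the round_robin model alternates between a and c, while sequential keeps returning a until it becomes unhealthy.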

Advanced Cache Options

  • warmup COUNT [RATE]: After restart, asynchronously pre-resolve the top N expired entries by hit count. Requires persist. Set 0 to disable. Default count is 200. Optional RATE (1–200) controls queries per second during warmup (default 10).

  • prefetch HITS PERCENTAGE: Trigger background refresh when an entry has at least HITS accesses and remaining TTL is below PERCENTAGE. HITS may be 0, which makes every cache hit eligible for the percentage check. PERCENTAGE may be 0 to effectively disable near-expiry prefetch. Default is 1 20%.

  • serve_stale DURATION: Maximum time window to serve expired entries as stale data (RFC 8767). 0 disables stale serving. Default is 1d.

  • min_ttl DURATION: Minimum cache TTL to prevent excessive origin lookups. 0 disables the lower clamp. Default is 5s.

  • max_ttl DURATION: Maximum cache TTL. Must be greater than or equal to min_ttl; 0 is only valid when min_ttl is also 0, and disables caching time for newly written entries. Default is 1h.
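
The TTL clamp, prefetch condition, and stale window described above can be modelled in a few lines of Python. This sketch rests on assumptions that the text does not confirm: that PERCENTAGE is measured against the entry's TTL at insertion time, that an entry exactly at the edge of the stale window is still served, and that all times are in seconds. The function names are hypothetical, and the max_ttl 0 case is deliberately not modelled.

```python
def clamp_ttl(ttl, min_ttl=5, max_ttl=3600):
    """Clamp an upstream TTL into [min_ttl, max_ttl]; min_ttl 0 skips the floor."""
    if max_ttl <= 0 or max_ttl < min_ttl:
        raise ValueError("this sketch only models max_ttl >= min_ttl and max_ttl > 0")
    if min_ttl > 0:
        ttl = max(ttl, min_ttl)
    return min(ttl, max_ttl)


def should_prefetch(hits, remaining, original_ttl, min_hits=1, percentage=20):
    """True when an unexpired entry is hot enough and close enough to expiry."""
    if remaining <= 0 or original_ttl <= 0:
        return False  # expired entries are a serve_stale matter, not prefetch
    # Integer form of: remaining / original_ttl < percentage / 100
    return hits >= min_hits and remaining * 100 < original_ttl * percentage


def lookup_state(remaining, serve_stale=86400):
    """Classify a cached entry as 'fresh', 'stale', or 'expired'."""
    if remaining > 0:
        return "fresh"
    if serve_stale > 0 and -remaining <= serve_stale:
        return "stale"
    return "expired"
```

Under the defaults, an entry with a 100 s TTL and at least one hit becomes eligible for prefetch once fewer than 20 s remain, and an expired entry is still served as stale for up to one day.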

Metrics

If monitoring is enabled (via the prometheus plugin) then the following metrics are exported:

  • coredns_dnsflow_name_lookup_duration_us{server, matched} - duration (in microseconds) per domain name trie lookup. This metric is emitted only when metrics_lookup true is configured.

  • coredns_dnsflow_request_duration_us{server, to} - end-to-end duration (in microseconds) per upstream interaction, including cache hits.

  • coredns_dnsflow_request_count_total{server, to} - query count per upstream.

  • coredns_dnsflow_response_rcode_count_total{server, to, rcode} - count of RCODEs per upstream.

  • coredns_dnsflow_hc_failure_count_total{server, to} - number of failed health checks per upstream.

  • coredns_dnsflow_hc_all_down_count_total{server, to} - number of times all upstreams were marked as down.

Here, server is the Server Block address responsible for the request (and metric), and matched is the match flag: "1" if the name is in any name list, "0" otherwise.

Cache Metrics

  • coredns_dnsflow_cache_hits_total{server, type} - cache lookup counts by result type: hit, miss, stale, expired.

  • coredns_dnsflow_cache_entries - current number of cached entries.

  • coredns_dnsflow_cache_size_bytes{upstream} - approximate DNS wire-format size of cached messages by upstream. This is useful for comparing relative cache footprint, but it is not Go heap usage or full process RSS.

  • coredns_dnsflow_cache_prefetch_total - number of background prefetch triggers.

  • coredns_dnsflow_cache_stale_refresh_total - number of async refresh operations triggered for stale (expired but within serve-stale window) entries.

  • coredns_dnsflow_cache_refresh_effective_total{type} - background refresh responses that actually updated the cache, by type (prefetch or stale). Compare against cache_prefetch_total / cache_stale_refresh_total to measure how often a refresh wins the race against a concurrent direct query. Triggered ≠ applied.

  • coredns_dnsflow_cache_refresh_skipped_fresher_total{type} - background refresh responses discarded because a fresher entry was written while the refresh was in flight. A steadily nonzero rate is healthy; a large ratio vs cache_refresh_effective_total suggests the prefetch window is too wide.

  • coredns_dnsflow_cache_drops_total - number of LRU evictions due to capacity limit.

  • coredns_dnsflow_cache_prefetch_drops_total - number of prefetch requests dropped due to semaphore saturation (concurrent prefetch limit reached).

  • coredns_dnsflow_firewall_queue_drops_total{backend} - number of firewall tasks dropped due to a full async queue, labeled by backend (ipset or nftables).

CIDR Reroute Metrics

  • coredns_dnsflow_cidr_reroute_total{file, result} - CIDR-triggered domain reroute events per target file. result is new (first discovery), known (already tracked by a concurrent goroutine), or no_tracker (fallback path without a DomainTracker).

Domain Tracker Metrics

  • coredns_dnsflow_tracker_domain_count{file} - current number of dynamically tracked domains per domain list file.

  • coredns_dnsflow_tracker_evictions_total{file, reason} - total evictions from the domain tracker. reason indicates whether the eviction was triggered by capacity (LRU) or age (TTL expiration).

  • coredns_dnsflow_tracker_compact_total{file} - total file compaction operations triggered.

  • coredns_dnsflow_tracker_restore_total{file} - total domains restored from disk on startup.

Bugs

If you modify the Corefile and the Caddy server fails to reload the new config with the error "Error during parsing", the cause is usually a configuration mistake. dnsflow performs sanity checks during parsing, so if the Corefile is misconfigured, you are out of luck. Common causes:

  • Argument count mismatch, out of range arguments, unrecognizable arguments, etc.

  • The mandatory to TO... property is missing.

  • An unsupported DNS transport type is used in to TO....

  • except contains domain names that conflict with FROM files.

  • .(i.e. root zone) is matched in a configuration block.

Also note that some of the properties are cumulative (can be specified multiple times): except, to, ipset, nfset, check.

Rationale: Strict checking helps users detect errors as early as possible and keeps the Corefile less confusing.

If you think you have found a bug in dnsflow, please file a bug report. Enhancements are also welcome.
