Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add plugin ACL for source ip filtering #3103

Merged
merged 24 commits into from Sep 4, 2019

Conversation

@ihac
Copy link
Collaborator

commented Aug 9, 2019

Signed-off-by: An Xiao hac@zju.edu.cn

1. Why is this pull request needed and what does it do?

This pr adds a new plugin acl to CoreDNS, which supports control of access to CoreDNS by enforcing custom ACL rules on source ip address, and protect DNS servers from being attacked.

2. Which issues (if any) are related?

#178
#1337
#2335

3. Which documentation changes (if any) need to be made?

A new README for the plugin itself demonstrating its basic usage.

4. Does this introduce a backward incompatible change or deprecation?

No.


Background

This is one of the projects under GSoC 2019 program (https://summerofcode.withgoogle.com/projects/#6519209754886144). It vision is to empower users to secure CoreDNS servers with minimal efforts.

When CoreDNS serves DNS queries publicly or inside Kubernetes clusters, the source IP of the incoming DNS query is an important identity. For security considerations, only certain queries (from specific source-IP or CIDR block) should be allowed to prevent the server from being attacked. The goal of this project is to support a firewall-like source-IP based block/allow mechanism for CoreDNS.

For more details about the background of this plugin or the GSoC project, you may refer to https://github.com/ihac/acl#story-of-gsoc.

Implementation

Guidelines

Below are some critical guidelines or specific metrics to measure the value or performance of this plugin quantitively.

For the target metrics (metrics we'd want to improve):

  • Rate of true positive (a request is blocked and actually it should be).
  • Rate of true negative (a request is allowed and actually it should be).
  • Mean time to crash/being attacked.

For the drag metrics (metrics we'd not want to hurt much):

  • Lines of the config file.
    • We should keep the project as easy-to-use as possible and don't bring much burden.
  • Rate of false positive (a request is blocked while actually it should not be).
    • We certainly never want to hurt the normal use of DNS.
    • A high rate of falseMy negative is not good of course but we could accept it.
  • Latency of DNS queries.
  • Throughput of CoreDNS server.
    • Requests per second (QPS).

Plugin Syntax

acl [ZONES…] {
    ACTION type QTYPE net SOURCE
    ACTION type QTYPE file LOCAL_FILE
}
  • ZONES zones it should be authoritative for. If empty, the zones from the configuration block are used.
  • ACTION (allow or block) defines the way to deal with DNS queries matched by this rule. The default action is allow, which means a DNS query not matched by any rules will be allowed to recurse.
  • QTYPE is the query type to match for the requests to be allowed or blocked. Common resource record types are supported. ANY stands for all kinds of DNS queries.
  • SOURCE is the source ip to match for the requests to be allowed or blocked. Typical CIDR notation and single IP address are supported. ANY stands for all possible source IP addresses.
  • LOCAL_FILE is a local file including a list of IP addresses or subnets. It allows users to import blacklist/whitelist from third party directly without setting rules by hand.

Implementation

Plugin acl is simply an IP filter and the core flow of the plugin is to check whether an ip address (source of incoming DNS query) is present in the user-configured set. Our original solution is quite naive: simply traversing all ip addresses (or subnets) in the set and matching source ip with them one by one. Obviously this algorithm performs poorly (in both time complexity and space complexity) when dealing with a huge set.

To optimize the performance of this plugin, there're several common techniques we could use:

  • Cuckoo filter: hashing.
    • Pros: space efficient, constant time complexity.
    • Cons: false positive (normal users being blocked) is possible, subnet not supported.
  • Trie: encoding all subnets / ip addresses in a binary tree (at most 32 layers for IPv4).
    • Pros: logarithmic time complexity (at most O(32) for IPv4), support both subnet and single ip address.
    • Cons: space not efficient enough (but still much better than an array).
  • Bit set: mapping every ip address to a single bit (at most 2 ^ 32 bits == 512M).
    • Pros: constant time complexity, cost at most 512M of memory in theory.
    • Cons: subnet not supported.

Currently, my plan is to combine trie and bit set together as the core filtering algorithm (trie for subnets and bit set for single ip addresses), but I'm pretty open for any comments or suggestions.

@yongtang @miekg

@corbot corbot bot requested a review from fastest963 Aug 9, 2019
@corbot

This comment has been minimized.

Copy link

commented Aug 9, 2019

Thank you for your contribution. I've just checked the OWNERS files to find a suitable reviewer. This search was successful and I've asked fastest963 (via /OWNERS) for a review.

If you have questions or suggestions for this bot, please file an issue against the miekg/dreck repository.

The bot understands the commands that are listed here.

@yongtang yongtang requested review from miekg, chrisohaver and johnbelamaric Aug 9, 2019
@yongtang

This comment has been minimized.

Copy link
Member

commented Aug 9, 2019

/cc @miekg @chrisohaver @johnbelamaric this PR is the acl we discussed in the email. Renamed to acl so that is will not conflict with firewall/policy.

@codecov-io

This comment has been minimized.

Copy link

commented Aug 9, 2019

Codecov Report

Merging #3103 into master will increase coverage by 0.31%.
The diff coverage is 91.11%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3103      +/-   ##
==========================================
+ Coverage   55.83%   56.14%   +0.31%     
==========================================
  Files         206      215       +9     
  Lines       10417    10882     +465     
==========================================
+ Hits         5816     6110     +294     
- Misses       4183     4314     +131     
- Partials      418      458      +40
Impacted Files Coverage Δ
plugin/acl/setup.go 90% <90%> (ø)
plugin/acl/acl.go 94.28% <94.28%> (ø)
plugin/file/reload.go 69.44% <0%> (-5.56%) ⬇️
plugin/file/zone.go 69.38% <0%> (-1.77%) ⬇️
plugin/file/xfr.go 0% <0%> (ø) ⬆️
plugin/sign/file.go 57.14% <0%> (ø)
plugin/sign/sign.go 0% <0%> (ø)
plugin/sign/signer.go 51.3% <0%> (ø)
plugin/sign/dnssec.go 100% <0%> (ø)
plugin/sign/nsec.go 72.22% <0%> (ø)
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update eec24cb...f9acf40. Read the comment docs.

@rdrozhdzh

This comment has been minimized.

Copy link
Contributor

commented Aug 15, 2019

Couple thoughts on design

  1. It would be nice to make keywords type and net optional in policy definition with default values ANY.
  2. It would be nice to allow definition of multiple query types for a single policy. This way user could avoid creating multiple policies with different query types and the same network list.
  3. Regarding IP filter implementation, I think it's not acceptable to use unreliable (Cuckoo filter) and memory greedy (bit set) solutions. Trie approach looks fine to me. Also, you can look into available implementations, e.g. https://github.com/infobloxopen/go-trees/tree/master/iptree (Package iptree implements radix tree data structure for IPv4 and IPv6 networks. Ideally, each Rule should have a single IP tree for Policy list (not easy to implement), so that we do only one IP check per query.
@yongtang

This comment has been minimized.

Copy link
Member

commented Aug 17, 2019

@ihac Can you address the review comments?

@ihac

This comment has been minimized.

Copy link
Collaborator Author

commented Aug 17, 2019

@yongtang Sure. Sorry for late reply.

As @rdrozhdzh suggested, I've already updated my code to make type and net optional for matching all.

Now I'm still working on the second comment. My plan is to allow to specify multiple query types in one line separated by ,, but I'm not sure whether there is any preference, like space over comma?

For the third comment, I'm a little bit curious about why bitset is not acceptable? Simply considering from the perspective of algorithm, I think bitset performs much better than any other tree-like data structure when it comes to a huge set of single ip addresses (e.g. blacklist).

Ideally, each Rule should have a single IP tree for Policy list (not easy to implement)

Yes, totally agree. I was going to follow this design, but abandoned it because combining multiple policies might introduce conflicts and break the matching order expected by users.

@miekg

This comment has been minimized.

Copy link
Member

commented Aug 17, 2019

upfront comment: in file we use * for any would be good to mimic existing conventions

@ihac

This comment has been minimized.

Copy link
Collaborator Author

commented Aug 17, 2019

@miekg Currently both * and any are supported. I can remove any from docs and code.

Copy link
Member

left a comment

initial batch

plugin/acl/README.md Outdated Show resolved Hide resolved
plugin/acl/README.md Outdated Show resolved Hide resolved
plugin/acl/README.md Outdated Show resolved Hide resolved
plugin/acl/README.md Outdated Show resolved Hide resolved
plugin/acl/README.md Outdated Show resolved Hide resolved
plugin/acl/acl.go Outdated Show resolved Hide resolved
plugin/acl/acl.go Outdated Show resolved Hide resolved
plugin/acl/acl.go Outdated Show resolved Hide resolved
plugin/acl/acl.go Outdated Show resolved Hide resolved
plugin/acl/acl.go Outdated Show resolved Hide resolved
@rdrozhdzh

This comment has been minimized.

Copy link
Contributor

commented Aug 17, 2019

@ihac, as you mentioned, bitset require up to 512M memory in worse case (depends on particular IP list configured by user), and this is only for IPv4 addresses. Such amount of memory may look insignificant at first sight, but coredns may be run on little devices for which 512M is very much.
In case of IPv6, it will take up to 2^128 bits = 2^125 bytes which is obviously too much for any device. BTW, it seems your solution doesn't support IPv6, at least the Trie implementation. That's why I suggested to looks into third-party libraries.

Copy link
Member

left a comment

another batch

plugin/acl/setup.go Show resolved Hide resolved
plugin/acl/setup.go Outdated Show resolved Hide resolved
plugin/acl/setup.go Outdated Show resolved Hide resolved
plugin/acl/setup.go Outdated Show resolved Hide resolved
plugin/acl/setup.go Outdated Show resolved Hide resolved
plugin/acl/setup.go Outdated Show resolved Hide resolved
plugin/acl/setup.go Outdated Show resolved Hide resolved
plugin/acl/setup.go Outdated Show resolved Hide resolved
plugin/acl/setup_test.go Outdated Show resolved Hide resolved
plugin/acl/setup_test.go Outdated Show resolved Hide resolved
@miekg

This comment has been minimized.

Copy link
Member

commented Aug 17, 2019

@ihac

This comment has been minimized.

Copy link
Collaborator Author

commented Aug 17, 2019

Thank @miekg for comments. I'll fix them in the morning (UTC +8).

@miekg

This comment has been minimized.

Copy link
Member

commented Aug 17, 2019

@rdrozhdzh

This comment has been minimized.

Copy link
Contributor

commented Aug 17, 2019

I'm not sure whether there is any preference, like space over comma?

I think space is preferable. I didn't see using comma in other plugins. Also, caddy automatically distinguishes arguments separated by space.

@miekg

This comment has been minimized.

Copy link
Member

commented Aug 18, 2019

@miekg

This comment has been minimized.

Copy link
Member

commented Aug 18, 2019

@miekg

This comment has been minimized.

Copy link
Member

commented Aug 18, 2019

@ihac

This comment has been minimized.

Copy link
Collaborator Author

commented Aug 19, 2019

Hi @rdrozhdzh

I took a look at the library iptree which you recommended. I agree that the variant of Radix Tree will perform better than ordinary Trie, because it reduces the both depth and size of the tree by splitting node dynamically.

I've replaced the original filter implementation with iptree. Would you mind helping me to check this? Thanks.

plugin/acl/acl.go Outdated Show resolved Hide resolved
plugin/acl/acl.go Outdated Show resolved Hide resolved
plugin/acl/acl.go Outdated Show resolved Hide resolved
plugin/acl/acl.go Outdated Show resolved Hide resolved
plugin/acl/acl.go Outdated Show resolved Hide resolved
plugin/acl/setup.go Outdated Show resolved Hide resolved
plugin/acl/setup.go Outdated Show resolved Hide resolved
plugin/acl/setup.go Outdated Show resolved Hide resolved
plugin/acl/acl.go Outdated Show resolved Hide resolved
plugin/acl/setup_test.go Outdated Show resolved Hide resolved
@miekg

This comment has been minimized.

Copy link
Member

commented Aug 20, 2019

Our original solution is quite naive: simply traversing all ip addresses (or subnets) in the set and matching source ip with them one by one.

Is this true? Aren't these trivial bit level calculations?

Writing anything to RAM doesn't scale esp, taking IPv6 into account (this is also why edns client subnet is so annoying to make efficient). Any plugin aiming for inclusion must make v6 a first class citizen.

@miekg

This comment has been minimized.

Copy link
Member

commented Aug 20, 2019

@miekg

This comment has been minimized.

Copy link
Member

commented Aug 20, 2019

@ihac

This comment has been minimized.

Copy link
Collaborator Author

commented Aug 20, 2019

To add to this. Only when parsing you need to compare the string value. In the rule itself it can be integers. Saves quite a few bytes as well

On Tue, 20 Aug 2019, 17:12 Miek Gieben, @.> wrote: No these are better as constants. Much quicker to compare against. So please fix. On Tue, 20 Aug 2019, 17:10 An Xiao, @.> wrote: > @.**** commented on this pull request. > ------------------------------ > > In plugin/acl/acl.go > <#3103 (comment)>: > > > + Zones []string > + Policies []Policy > +} > + > +// Policy defines the ACL policy for DNS queries. > +// A policy performs the specified action (block/allow) on all DNS queries > +// matched by source IP or QTYPE. > +type Policy struct { > + action string > + qtype uint16 > + filter filter.Filter > +} > + > +const ( > + // ALLOW allows authorized queries to recurse. > + ALLOW string = "allow" > > Sorry but I think they have to be string since I'm using them to compare > with directives. > > — > You are receiving this because you were mentioned. > Reply to this email directly, view it on GitHub > <#3103?email_source=notifications&email_token=AACWIW4TZ27CQIF2HCLFM5TQFQJOHA5CNFSM4IKVBGV2YY3PNVWWK3TUL52HS4DFWFIHK3DMKJSXC5LFON2FEZLWNFSXPKTDN5WW2ZLOORPWSZGOCCDPPKA#discussion_r315778649>, > or mute the thread > https://github.com/notifications/unsubscribe-auth/AACWIWZD6NMPVPPYKHUK6QDQFQJOHANCNFSM4IKVBGVQ > . >

That makes sense. Thanks.

@ihac ihac force-pushed the ihac:ihac_plugin_acl branch from 257a169 to 7dad32a Aug 20, 2019
@ihac

This comment has been minimized.

Copy link
Collaborator Author

commented Aug 20, 2019

Working to add IPv6 support.

ihac added 6 commits Aug 23, 2019
Signed-off-by: An Xiao <hac@zju.edu.cn>
Signed-off-by: Xiao An <hac@zju.edu.cn>
Signed-off-by: Xiao An <hac@zju.edu.cn>
Signed-off-by: Xiao An <hac@zju.edu.cn>
Signed-off-by: Xiao An <hac@zju.edu.cn>
Signed-off-by: Xiao An <hac@zju.edu.cn>
@ihac ihac force-pushed the ihac:ihac_plugin_acl branch from 8fdf54a to f19ff36 Aug 28, 2019
@ihac

This comment has been minimized.

Copy link
Collaborator Author

commented Aug 28, 2019

@miekg Sorry for wrongly force updating on the remote branch and making review work harder. Now updated.

@miekg

This comment has been minimized.

Copy link
Member

commented Aug 28, 2019

plugin/acl/setup.go Outdated Show resolved Hide resolved
plugin/acl/setup.go Outdated Show resolved Hide resolved
plugin/acl/setup.go Outdated Show resolved Hide resolved

p.qtypes = make(map[uint16]struct{})

p.filter = iptree.NewTree()

This comment has been minimized.

Copy link
@miekg

miekg Aug 29, 2019

Member

why doesn't this use createDefaultFilter?

This comment has been minimized.

Copy link
@rdrozhdzh

rdrozhdzh Aug 29, 2019

Contributor

default filter includes the widest possible IPv4 and IPv6 networks, i.e. any IP hits this filter.
Here we need to start with empty filter, i.e. no IP hits this filter initially. Upcoming lines will add the configured networks

This comment has been minimized.

Copy link
@ihac

ihac Aug 29, 2019

Author Collaborator

newDefaultFilter matches all IP addresses while iptree.NewTree is an empty filter matches nothing.

Update
Signed-off-by: Xiao An <hac@zju.edu.cn>
@miekg

This comment has been minimized.

Copy link
Member

commented Aug 29, 2019

this will be merged soon, can you add an OWNERS file and add yourself? I'll also will send you an gitub invite and can you email your email to me (miek@miek.nl) so I can add you to the maintainers list?

w.WriteMsg(m)
RequestBlockCount.WithLabelValues(metrics.WithServer(ctx), zone).Inc()
return dns.RcodeSuccess, nil
}

This comment has been minimized.

Copy link
@rdrozhdzh

rdrozhdzh Aug 29, 2019

Contributor

should we add here

else {
  break
}

If shouldBlock() returned false this means we found appropriate rule and the action was allow. Why we continue iterating over remaining rules?

This comment has been minimized.

Copy link
@ihac

ihac Aug 29, 2019

Author Collaborator

Well that depends on how you treat the relationship between different rules. Is it OR or AND? I think AND make more sense here because it supports a stricter control.

Besides, shouldBlock returning false only means that for this rule, the request should not be blocked. For example, not matched by any ACL policies. So using break here might not work.

This comment has been minimized.

Copy link
@rdrozhdzh

rdrozhdzh Aug 29, 2019

Contributor

So, in conflicting cases block action will have higher priority than allow?
BTW, is it possible/legitimate to create several ACLs with the same or intersecting set of zones?

This comment has been minimized.

Copy link
@ihac

ihac Aug 29, 2019

Author Collaborator

in conflicting cases block action will have higher priority than allow?

Logically yes.

is it possible/legitimate to create several ACLs with the same or intersecting set of zones?

No, I don't think so. It does not make any sense to have conflicts in ACL rules. Ideally we should detect this and warn user, but it's really hard to do this.

This comment has been minimized.

Copy link
@ihac

ihac Aug 29, 2019

Author Collaborator

Although I'm not sure whether there is a case where users have to introduce conflicts in ACLs, using a fine-grained subdomain should handle most of the cases:

example.org {
  acl a.example.org {
    block ...
    // ...
  }
  acl b.example.org c.example.org {
    // ...
  }
}

This comment has been minimized.

Copy link
@rdrozhdzh

rdrozhdzh Aug 29, 2019

Contributor

That's why I think iterating over remaining rules is just wasting time. In fine-grained config there should be no conflicts.
In case of conflicts, with my proposal the first encountered action will win. In your implementation block wins.

This comment has been minimized.

Copy link
@ihac

ihac Aug 29, 2019

Author Collaborator

Yes u r right. What about this:

isBlock, matched := shouldBlock(...)
if isBlock {
  // return dns.RcodeRefused.
}
if matched {
  // return with recurse
}
// continue checking with other rules.

This comment has been minimized.

Copy link
@ihac

ihac Aug 29, 2019

Author Collaborator

FYI: d09b3f6

This comment has been minimized.

Copy link
@rdrozhdzh

rdrozhdzh Aug 30, 2019

Contributor

Oh, I see, you want to distinguish between allow and no match.

Alternatively, you can define

type action int
const (
  actionNone = iota
  actionAllow
  actionBlock
)
...
func matchWithPolicies(policies []Policy, w dns.ResponseWriter, r *dns.Msg) action {
   ...
}

and return actionNone if no policy match

This comment has been minimized.

Copy link
@ihac

ihac Aug 30, 2019

Author Collaborator

Thanks for suggestion. Now updated.

plugin/acl/acl_test.go Show resolved Hide resolved
plugin/acl/acl_test.go Show resolved Hide resolved
hasNetSection := false

remainingTokens := c.RemainingArgs()
for len(remainingTokens) > 0 {

This comment has been minimized.

Copy link
@rdrozhdzh

rdrozhdzh Aug 29, 2019

Contributor

you can parse all arguments in a single loop, without nested loops.
See also methods c.NextArg() & c.Val()

This comment has been minimized.

Copy link
@rdrozhdzh

rdrozhdzh Aug 29, 2019

Contributor

well, I may be wrong about using c.NextArg() & c.Val() in our case. But anyway you can do loop like

for _, arg := range c.RemainingArgs() {

and process each argument according to last saved section

This comment has been minimized.

Copy link
@ihac

ihac Aug 29, 2019

Author Collaborator

Good call. I spent a few time thinking about how to parse tokens elegantly. My goal is to support a loose but safe config syntax:

// legal ones
block type <xxx> <xxx>
block net <xxx> <xxx>
block type <xxx> <xxx> net <xxx> <xxx>
block net <xxx> <xxx> type <xxx> <xxx>
block type <xxx> <xxx> net <xxx> <xxx> type <xxx> ...

// illegal ones
block type net <xxx> <xxx> // users might wrongly delete the original qtypes.

That is why I implemented a new loadFollowingTokens method (refer to #3103 (diff)) to load sections one by one. Since Miekg recommended to keep code consistent with other plugins, I reverted it back with RemainingArgs.

Simply loop through c.RemainingArgs and process token one by one might not work when I want to check whether a type/net section is empty (block type xxx xxx net).

ihac added 3 commits Aug 29, 2019
Signed-off-by: Xiao An <hac@zju.edu.cn>
Signed-off-by: Xiao An <hac@zju.edu.cn>
Signed-off-by: Xiao An <hac@zju.edu.cn>
@miekg

This comment has been minimized.

Copy link
Member

commented Aug 30, 2019

Signed-off-by: Xiao An <hac@zju.edu.cn>
@@ -28,53 +28,60 @@ type Rule struct {
Policies []Policy
}

// Action defines the action against queries.
type Action int

This comment has been minimized.

Copy link
@rdrozhdzh

rdrozhdzh Aug 30, 2019

Contributor

Normally identifiers should not be capitalized unless you want they to be used by external packages

This comment has been minimized.

Copy link
@ihac

ihac Aug 30, 2019

Author Collaborator

Ah, I just capitalized it simply because I wanna use action as variable name. But will change this if it's of idiomatic Go style.

This comment has been minimized.

Copy link
@ihac

ihac Aug 30, 2019

Author Collaborator

Sorry but I'm a little bit confused. If I change Rule, Action and Policy to lowercase, naming variables seems to be inconvenient...

I cannot use var rule Rule any longer.

This comment has been minimized.

Copy link
@ihac

ihac Aug 30, 2019

Author Collaborator

Anyway. PR updated by changing all types to private.

)

func (a acl) ServeDNS(ctx context.Context, w dns.ResponseWriter, r *dns.Msg) (int, error) {
state := request.Request{W: w, Req: r}
Check:

This comment has been minimized.

Copy link
@rdrozhdzh

rdrozhdzh Aug 30, 2019

Contributor

Can it be smth more meaningful, e.g. rulesLoop:

This comment has been minimized.

Copy link
@ihac

ihac Aug 30, 2019

Author Collaborator

Sure

ihac added 2 commits Aug 30, 2019
Signed-off-by: Xiao An <hac@zju.edu.cn>
Signed-off-by: Xiao An <hac@zju.edu.cn>
@yongtang

This comment has been minimized.

Copy link
Member

commented Sep 4, 2019

@rdrozhdzh I think the PR has been updated with all comment addressed. Can you take another look?

Copy link
Contributor

left a comment

I think we can merge this as initial version. We can fix issues (if any) in next PRs.

@yongtang

This comment has been minimized.

Copy link
Member

commented Sep 4, 2019

LGTM. Thanks @ihac for the work, and thanks all ( @miekg @rdrozhdzh) for review 👍 🎉 !

@yongtang yongtang merged commit 79f37a1 into coredns:master Sep 4, 2019
3 checks passed
3 checks passed
ci/circleci: kubernetes-tests Your tests passed on CircleCI!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
stickler-ci No lint errors found.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.