
Performance improvement for Dump (and tests) #12

Closed
Ryman wants to merge 8 commits into arlolra:master from Ryman:stabImpl

Conversation

@Ryman (Collaborator) commented Sep 9, 2013

Sets up an interval tree (one optimised for small integer values) with all the port ranges specified in the policies. The query time is relative to the output size, and the interval tree can in theory also perform multiple queries in basically the same time, if we ever need multiple ip/port queries in future. (Not currently implemented as it's not needed.)
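
For reference, the port-indexed lookup described above can be sketched roughly like this. Type names here are illustrative, not the actual intstab API, and this naive version trades the O(n) preprocessing of the published structure for a simple per-port bucket table, while keeping the O(1+m) query:

```go
package main

import "fmt"

// Interval is a port range attached to a rule id.
type Interval struct {
	Min, Max uint16
	RuleID   int
}

// Stab is a naive interval-stabbing index over the 16-bit port space:
// every port maps directly to the rules whose range covers it.
// Preprocessing costs the sum of the range lengths rather than the O(n)
// of the structure referenced in the thread, but queries are O(1+m).
type Stab struct {
	byPort [65536][]int
}

func NewStab(ivs []Interval) *Stab {
	s := &Stab{}
	for _, iv := range ivs {
		for p := int(iv.Min); p <= int(iv.Max); p++ {
			s.byPort[p] = append(s.byPort[p], iv.RuleID)
		}
	}
	return s
}

// Query returns the ids of all rules whose range contains port.
func (s *Stab) Query(port uint16) []int {
	return s.byPort[port]
}

func main() {
	ivs := []Interval{
		{Min: 80, Max: 80, RuleID: 0},
		{Min: 1, Max: 1024, RuleID: 1},
		{Min: 443, Max: 443, RuleID: 2},
	}
	s := NewStab(ivs)
	fmt.Println(s.Query(80))  // rules 0 and 1 cover port 80
	fmt.Println(s.Query(443)) // rules 1 and 2 cover port 443
}
```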

It gets a set of rules as the result of a port query and iterates over them to find matching rules. It writes straight to the output buffer currently, as that should cause fewer allocations. I can fix up some of the handlers to do similar once they're considered stable (if required).

Interestingly, the current bottleneck for dumping is checking IPs for matched rules. We could do a number of fun things to make that super fast too (the rules per port are actually static after each exit list rebuild, therefore the set of IPs they accept/reject is also static), but we're already dumping 900 results for a given ip/port in less than 0.0002 seconds...

So we're likely bottlenecked by other things (I haven't profiled the handlers), if at all, so I don't think it's worth pursuing except out of curiosity/perfectionism :) And I'm pretty sure IsTor (main site & Torbutton) is the main hit for the service, so we're already good there. I can't imagine the dump being called that frequently?

@arlolra (Owner) commented Sep 10, 2013

The interval stabbing looks neat. Where'd you come across that data structure (and the rather recent literature)?

I like it a lot and I'll get this merged soon enough, though I noticed quite a few TODOs in there. In the meantime, I've rebased and merged the tests. Thanks a bunch.

In d5fb685, I added support for masking. It should be compatible with what you've done here; it only affects the IP matching. Unfortunately, in the process, I broke your startup script. The consensus only has exit policy summaries, which basically means wildcard addresses for everyone. I needed to also pull in cached server descriptors to search for the entire exit policy. metrics.tp.o seems to have that data, so it should be possible to collect it. However, the directory structure is a little different, with each descriptor getting its own file rather than a concatenated list.

@arlolra arlolra mentioned this pull request Sep 10, 2013
@Ryman (Collaborator, Author) commented Sep 11, 2013

Pretty much just one of those data structures I came across years ago on Wikipedia and never had a reason to use. The recent limited-range optimisation is hinted at on Wikipedia: "If the endpoints of intervals are within a small integer range (e.g., in the range [1,...,O(n)]), faster data structures exist with preprocessing time O(n) and query time O(1+m) for reporting m intervals containing a given query point." But I had to do some awkward searching to find the paper Wikipedia was failing to reference.

Regarding the TODOs (as listed by grep):

  • The error returns in Dump suggest handling something that we deliberately don't account for; no point wasting CPU on invalid input.
  • Excluding IPs that were already included shouldn't actually happen; it was something I thought of during testing, but I think adding a preprocessing step to warn about bad/duplicate rules would be a good idea.
  • The type assertion comment is really only a todo if perf ever becomes a problem; it can probably be removed.
  • I think the TODOs in the tests can actually be removed; I was hoping you could review. They're testing implementation details instead of results, so I think the old implementation detail is irrelevant.
  • The policy todo is a valid concern, i.e. what do tor exits actually do if they say they accept and reject on the same port for a wildcard? I'm hoping the data never really has this, but right now it's assuming the reject will take precedence. There's also a warning present in case a reject rule is specified twice; this may be deemed unnecessary by other input validation.
  • The language TODOs were already in master. Also, my bad! :)

Good to get the ip masking support, we'll have to be sure to get tests for that soon!

@arlolra (Owner) commented Sep 14, 2013

Got a chance to review this tonight but, unfortunately, there's a problem.

There's also a warning present in case a reject rule is specified twice; this may be deemed unnecessary by other input validation.

That warning only makes sense for wildcard addresses. I added a test case in bc1b384 to catch this,

func TestDoubleReject(t *testing.T) {
  testData := `{"Rules": [{"IsAccept": false, "MinPort": 80, "MaxPort": 80, "Address": "222.222.222.222"}, {"IsAccept": false, "MinPort": 80, "MaxPort": 80, "Address": "123.123.123.123"}], "IsAllowedDefault": true, "Address": "111.111.111.111"}`
  exits := setupExitList(t, testData)
  expectDump(t, exits, "222.222.222.222", 80)
}

The issue is in GetRulesDefaultAccept. If you have two non-wildcard reject addresses, they collide and then IsAllowed will return conflicting answers.

For what it's worth, I've rebased your work on master here, https://github.com/arlolra/check/tree/stab. Assuming we can solve the above problem, I'm ready to merge. Though ... it's proving a little tricky.

@arlolra (Owner) commented Sep 14, 2013

The policy todo is a valid concern, i.e. what do tor exits actually do if they say they accept and reject on the same port for wildcard. I'm hoping the data never really has this, but right now it's assuming the reject will take precedence.

The spec says,

These lines describe an "exit policy": the rules that an OR follows when deciding whether to allow a new stream to a given address. The 'exitpattern' syntax is described below. There MUST be at least one such entry. The rules are considered in order; if no rule matches, the address will be accepted. For clarity, the last such entry SHOULD be accept *:* or reject *:*.

So it'll follow whichever appears first.
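
In other words, evaluation is first-match-wins. A minimal sketch of that semantics, with field names loosely modelled on the Rules JSON in the test above rather than the actual check implementation:

```go
package main

import "fmt"

// PolicyRule mirrors the fields under discussion: an accept/reject line
// with an address pattern (here just an exact string or "*") and a port
// range. Illustrative only, not the real check types.
type PolicyRule struct {
	IsAccept         bool
	Address          string // "*" is a wildcard
	MinPort, MaxPort uint16
}

// IsAllowed walks the rules in order and returns the verdict of the
// first matching rule, as dir-spec requires; defaultAccept applies if
// no rule matches.
func IsAllowed(rules []PolicyRule, addr string, port uint16, defaultAccept bool) bool {
	for _, r := range rules {
		if (r.Address == "*" || r.Address == addr) && port >= r.MinPort && port <= r.MaxPort {
			return r.IsAccept
		}
	}
	return defaultAccept
}

func main() {
	// accept *:80 followed by reject *:* -- order decides port 80.
	rules := []PolicyRule{
		{IsAccept: true, Address: "*", MinPort: 80, MaxPort: 80},
		{IsAccept: false, Address: "*", MinPort: 0, MaxPort: 65535},
	}
	fmt.Println(IsAllowed(rules, "1.2.3.4", 80, false)) // true: accept *:80 matches first
	fmt.Println(IsAllowed(rules, "1.2.3.4", 81, false)) // false: reject *:* matches
}
```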

@Ryman (Collaborator, Author) commented Sep 14, 2013

Excellent! I'll write up a test for the second comment and will try to fix this tomorrow.

@arlolra (Owner) commented Sep 14, 2013

Another problem is in a policy like this,

reject 123.123.123.123:80
accept *:80
reject *:*

Here you're discarding the first reject, since the default is reject.

What I think needs to be done is insert policies into the intstab with an ordering. Then sort the returned intervals by that order and keep track of which PolicyAddressNewLines have already been returned (or decided upon). That might be less performant, but should be correct.
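
The ordering idea might look roughly like this (a hypothetical sketch, not the actual check types): stab results carry their position in the original policy, get sorted back into order, and only the first verdict per address pattern sticks, so a later accept can no longer override an earlier reject on the same address.

```go
package main

import (
	"fmt"
	"sort"
)

// Hit is what a stab query would return in this scheme: the rule's
// position in the original policy plus its verdict and address pattern.
// Names are illustrative.
type Hit struct {
	Order    int
	IsAccept bool
	Address  string
}

// FirstVerdicts sorts the stabbed rules back into policy order and
// keeps only the first verdict seen for each address pattern.
func FirstVerdicts(hits []Hit) map[string]bool {
	sort.Slice(hits, func(i, j int) bool { return hits[i].Order < hits[j].Order })
	decided := make(map[string]bool)
	for _, h := range hits {
		if _, done := decided[h.Address]; !done {
			decided[h.Address] = h.IsAccept
		}
	}
	return decided
}

func main() {
	// reject 123.123.123.123:80, accept *:80, reject *:* -- all stab port 80,
	// and the query returns them in arbitrary order.
	hits := []Hit{
		{Order: 2, IsAccept: false, Address: "*"},
		{Order: 0, IsAccept: false, Address: "123.123.123.123"},
		{Order: 1, IsAccept: true, Address: "*"},
	}
	fmt.Println(FirstVerdicts(hits))
}
```

The explicit reject on 123.123.123.123 survives because it sorts before the wildcard accept, which is exactly the case the policy above breaks without ordering.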

@Ryman (Collaborator, Author) commented Sep 14, 2013

Yeah, the exit files I was testing with didn't have any explicit IPs in them, so the explicit-IP tests I made are really weak. I think the other two issues need a decent change to the preprocessing, but again I'll add tests around this and get it working. I think you're right about ordering; I had a short think about it after replying earlier, and it might be cleaner to change the usage of the intstab so it's not returning rules anymore.

@Ryman (Collaborator, Author) commented Sep 14, 2013

Please check the changes on your stab2 branch here: https://github.com/Ryman/check/tree/stab2 (log here: https://github.com/Ryman/check/commits/stab2 , up to 9c58621 specifically).

If you're happy with the changes we can consider this PR closed and use new ones for future changes. 👍

@arlolra (Owner) commented Sep 15, 2013

LGTM ... one question though, is this still a performance win over master (even w/o the caches)?

@arlolra (Owner) commented Sep 15, 2013

Closing for now ... so sad. I'll leave stab3 around for a brighter day.

@arlolra arlolra closed this Sep 15, 2013
@Ryman (Collaborator, Author) commented Sep 15, 2013

Got a decent idea when I got to the cash register :) We can do the same caching behaviour (per port, not per port/ip combo) on master as well, so I'm unsure if this branch is beneficial, but it's less likely to blow up since its cache is per port instead of per port/ip combo (higher expected hit rate). We could also just use an LRU cache to fix that problem.

Ryman@00f9ad7
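
For illustration, a per-port cache of the kind described could be as simple as the sketch below (hypothetical names, not the commit above; a real version would also need to be cleared on each exit-list rebuild and made safe for concurrent handlers). Keying by port alone bounds the key space at 65536 entries, which is why eviction may never be needed:

```go
package main

import "fmt"

// dumpCache is a per-port cache of rendered dump output. Because the
// rules per port are static between exit-list rebuilds, the rendered
// dump for a port is static too and can be cached whole.
type dumpCache struct {
	entries map[uint16][]byte
}

func newDumpCache() *dumpCache {
	return &dumpCache{entries: make(map[uint16][]byte)}
}

// Get returns the cached dump for port, computing and storing it on a miss.
func (c *dumpCache) Get(port uint16, compute func(uint16) []byte) []byte {
	if v, ok := c.entries[port]; ok {
		return v
	}
	v := compute(port)
	c.entries[port] = v
	return v
}

func main() {
	calls := 0
	c := newDumpCache()
	compute := func(p uint16) []byte {
		calls++ // count how often we actually render
		return []byte(fmt.Sprintf("exits for port %d", p))
	}
	c.Get(443, compute)
	c.Get(443, compute) // served from cache
	fmt.Println(calls)  // 1
}
```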

@Ryman (Collaborator, Author) commented Sep 16, 2013

Took another little stab at it before bed... optimising the cached data!
This one got it down to about 200,000ns/op
Ryman@24da118

This one gets it down to about 125,000ns/op
Ryman@93e01a9

go test check -benchtime 10s -bench ".Dump." -benchmem
PASS
BenchmarkDumpList     200000        124095 ns/op          16 B/op          1 allocs/op
ok      check   28.012s

I haven't even bothered to check what's causing the alloc... this is the profile result:

Total: 6046 samples
       0   0.0%   0.0%     5838  96.6% gosched0
       0   0.0%   0.0%     5832  96.5% testing.(*B).launch
       0   0.0%   0.0%     5832  96.5% testing.(*B).runN
       5   0.1%   0.1%     5830  96.4% check.BenchmarkDumpList
       0   0.0%   0.1%     5721  94.6% check.(*Exits).Dump
    1994  33.0%  33.1%     5595  92.5% check.(*Exits).IsAllowed
     324   5.4%  38.4%     2965  49.0% check.(*MatchRule).IsMatch
     410   6.8%  45.2%     2478  41.0% net.(*IPNet).Contains
     576   9.5%  54.7%     1681  27.8% net.IP.To4
     466   7.7%  62.4%     1305  21.6% net.networkNumberAndMask
    1105  18.3%  80.7%     1105  18.3% net.isZeros

So yeah, we should probably stop here. The IPNet.Contains call is taking ~40% (2478/6046 samples) of the CPU time in the benchmark, so I think we're definitely on the verge of not being able to improve without changing the stdlib :)

Again, both of these optimisations can probably be applied to master instead of stab; it's up to you :)
