
Rate Limiting, fixes #35 (#94)

Merged
merged 2 commits into dev on Dec 21, 2014

@SGrondin (Member) commented Dec 19, 2014

Fixes #35

# Unused in the last 5 minutes
if (limiter._nextRequest + (60*1000*5)) < time
  delete limiters[key]
, 60*1000 # Every minute
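The fragment above reads as the body of a `setInterval` sweep. A minimal runnable sketch in plain JavaScript (assuming, as the snippet does, that `_nextRequest` is the limiter's internal timestamp of its most recently scheduled request; `evictStale` is a hypothetical helper name):

```javascript
// Hypothetical sketch of the eviction pass. `limiters` maps key -> limiter;
// `_nextRequest` is assumed to be the timestamp (ms) of the limiter's most
// recently scheduled request.
const FIVE_MINUTES = 60 * 1000 * 5;

function evictStale(limiters, now) {
  for (const key of Object.keys(limiters)) {
    // Unused in the last 5 minutes
    if (limiters[key]._nextRequest + FIVE_MINUTES < now) {
      delete limiters[key];
    }
  }
}

// As in the snippet above, run the pass every minute:
// setInterval(() => evictStale(limiters, Date.now()), 60 * 1000);
```

Extracting the pass into a function also makes the eviction logic testable without waiting on real timers.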

@taoeffect (Member) commented Dec 19, 2014
In addition to this, we need an upper limit on the number of limiters.

Try to do it like so: if you hit the upper limit, start making your keys based on the domain name + IP instead of the hostname + IP. In other words, lop off the subdomain. Then after things "cool down", you can return to keeping the subdomain.
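A sketch of what that fallback keying could look like (hypothetical helper names; note that naively keeping the last two labels mishandles multi-label public suffixes like `.co.uk`, so real code would consult a public-suffix list):

```javascript
// Hypothetical sketch of the degraded keying mode described above.
// NOTE: keeping only the last two labels is wrong for multi-label public
// suffixes (e.g. "example.co.uk"); production code should use a
// public-suffix list.
function registrableDomain(hostname) {
  const labels = hostname.split('.');
  return labels.slice(-2).join('.');
}

function limiterKey(hostname, ip, overLimit) {
  // Under pressure, lop off the subdomain so fewer distinct keys exist.
  const name = overLimit ? registrableDomain(hostname) : hostname;
  return `dns-${ip}-${name}`;
}
```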

@SGrondin (Author, Member) commented Dec 19, 2014
I disagree with you.

Each DNS lookup is 1000 times bigger than the limiter it happens in. All the objects involved in Native-dns and the like are HUGE. The DNS response needs to be constructed and all the buffers, strings and concatenations involved are orders of magnitude bigger than a limiter (that is shared between requests anyway). It makes no sense to optimize the 0.001%, especially when it adds a lot of complexity to the code. And even if all the rest of the DNS code was optimized, it would still be 100+ times bigger than the limiter and even if it was on a massive server, it would still die from CPU exhaustion or run out of network bandwidth before objects less than a KB in size matter.

@taoeffect (Member) commented Dec 19, 2014
Fair enough. That's a good point regarding the other objects in memory. Good argument @SGrondin! 👍


# Initialize the global rate limiter pool; only an interface will be visible from the outside
limiters = {}
# Garbage collect the unused limiters every minute.

@taoeffect (Member) commented Dec 19, 2014
Every 5 minutes? (Is this a case of comments != code?)

@SGrondin (Author, Member) commented Dec 19, 2014
No, every minute it flushes the limiters that haven't been used in the last 5 minutes.

@taoeffect (Member) commented Dec 19, 2014
Ah, my bad, sorry, it's getting late.

@server.on 'request', @callback.bind(@)
@server.on 'request', (req, res) =>
  key = "dns-#{req.address.address}-#{req.question[0]?.name}"
  limiter = gThrottle key, -> new Bottleneck 1, 200, 2, Bottleneck.strategy.BLOCK

@taoeffect (Member) commented Dec 20, 2014
This seems a little too harsh... The reason I'm concerned is that I think I recall seeing browsers send requests for both IPv4 and IPv6 simultaneously. So maybe change the concurrency from 1 to 2?

@taoeffect (Member) commented Dec 20, 2014
Keep in mind also that popular websites will be accessed simultaneously from the same IP by large businesses and schools. It's reasonable to expect several simultaneous requests for Google.com, Facebook.com, Twitter.com, etc. to come from the same IP. These settings seem far too harsh given that reality.

@taoeffect (Member) commented Dec 20, 2014
Also, please make these configurable. We should not be making these decisions categorically for all servers. We can provide "sane defaults", but these settings belong in the configuration and should be overridable by sysadmins.

Edit: and don't forget the penalty below, and the other limiters...

@SGrondin (Author, Member) commented Dec 21, 2014
Good point, I remember now that in Unblock, IPv4 and IPv6 didn't share limiters, so I'll simply add the protocol to the key. I'm very aware that large networks share the same public IPs, but the limiter windows here are shorter than the blink of an eye, and on top of that there's a queue (max length of 2). The reason it works in practice is that bots make bursts of simultaneous requests while humans don't (human requests are never perfectly simultaneous). Believe me, it looks really harsh, but it works really well in practice.

I'm willing to be convinced to make them configurable, but I don't think it's a good idea, and here's why. No matter where you put your server, if it's accessible from the internet, there'll be bots. Those bots scan the entire IPv4 space to find openly accessible DNS servers, so EVERYBODY that hosts a DNS server gets the same bots. Therefore, it's our job to tweak those values to find the sweet spot, and those default values will be good for everybody. Adding them to the config complicates everything, makes it possible to REALLY break DNSChain, and tweaking them requires a very good understanding of concurrency and experience administering DNS servers. So I'm against it, but if you have better reasons than mine I'm always willing to be convinced.

The penalty is only something that affects BLOCK mode, so that's why I only set it in that one specific case.
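The fix mentioned above, adding the protocol to the key, could look like this (a sketch; `dnsKey` and the exact family labels are hypothetical):

```javascript
// Hypothetical sketch: include the query family in the key so IPv4 (A)
// and IPv6 (AAAA) lookups from the same client no longer share one limiter.
function dnsKey(family, ip, qname) {
  return `dns-${family}-${ip}-${qname}`;
}

// An A and an AAAA query for the same name now hit different limiters:
// dnsKey('A', '1.2.3.4', 'example.com')    -> "dns-A-1.2.3.4-example.com"
// dnsKey('AAAA', '1.2.3.4', 'example.com') -> "dns-AAAA-1.2.3.4-example.com"
```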

@taoeffect (Member) commented Dec 21, 2014
We can add an "advanced" config section for this stuff. It should still be configurable, but with red warnings everywhere.

Don't assume you know what everyone's use case will be.

@SGrondin (Author, Member) commented Dec 21, 2014
Fair enough.

if limiters[key]?
  limiters[key]
else
  limiters[key] = makeLimiter()
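In plain JavaScript, the get-or-create pattern above boils down to the following (a sketch, with `makeLimiter` standing in for the `new Bottleneck ...` call):

```javascript
// Sketch of the get-or-create core of gThrottle: return the cached limiter
// for a key, or build and cache a new one.
const limiters = {};

function gThrottle(key, makeLimiter) {
  if (limiters[key] != null) {
    return limiters[key];
  }
  return (limiters[key] = makeLimiter());
}
```

Memoizing per key is what lets every request from the same client share one limiter instead of creating a fresh one per request.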

@taoeffect (Member) commented Dec 20, 2014
Can't this be written like so? `limiters[key] || (limiters[key] = makeLimiter())` or `limiters[key] ? (limiters[key] = makeLimiter())` (I forget the difference)

@SGrondin (Author, Member) commented Dec 21, 2014
It can, but I personally hate using `||` for assignment because it relies on JavaScript's loose truthiness rules, and that's probably JavaScript's single most dangerous kind of feature. I should probably make it a one-liner with `if .. then .. else ..`
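The danger being discussed: `||` treats every falsy value as missing, while CoffeeScript's `?` compiles to a `!= null` check. A sketch of the difference (hypothetical cache and helper names):

```javascript
// Why `||` is risky for get-or-create: it re-creates the value whenever the
// cached entry happens to be falsy (0, '', false), while a null check keeps it.
const cache = {};

function getOrCreateLoose(key, make) {
  return cache[key] || (cache[key] = make()); // any falsy hit re-creates
}

function getOrCreateStrict(key, make) {
  // Roughly what CoffeeScript's `?` existential operator compiles to:
  return cache[key] != null ? cache[key] : (cache[key] = make());
}
```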

@server = http.createServer(@callback.bind(@)) or gErr "http create"
@server = http.createServer((req, res) =>
  key = "http-#{req.connection?.remoteAddress}"
  limiter = gThrottle key, -> new Bottleneck 2, 150, 10, Bottleneck.strategy.OVERFLOW
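For orientation, the arguments here appear to follow Bottleneck's `(maxConcurrent, minTime, highWater, strategy)` order, with OVERFLOW dropping new jobs once the queue reaches the high-water mark. A toy illustration of the overflow idea only (not the real Bottleneck implementation):

```javascript
// Toy illustration of the OVERFLOW strategy: once the queue already holds
// `highWater` jobs, new submissions are dropped instead of queued.
class OverflowQueue {
  constructor(highWater) {
    this.highWater = highWater;
    this.queue = [];
  }
  submit(job) {
    if (this.queue.length >= this.highWater) {
      return false; // dropped: queue is full
    }
    this.queue.push(job);
    return true;
  }
}
```

BLOCK (used for DNS above) is harsher: on overflow it also penalizes the client for a while, which is why the penalty setting only matters there.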

@taoeffect (Member) commented Dec 20, 2014
Make sure to make these configurable too.

@log.debug gLineInfo "Tunnel: "+host

key = "https-#{c.remoteAddress}"
limiter = gThrottle key, -> new Bottleneck 2, 150, 10, Bottleneck.strategy.OVERFLOW

@taoeffect (Member) commented Dec 20, 2014
Hmm... I'm just observing that if we have a limiter here, then the limiter for HTTP is redundant, right?

@SGrondin (Author, Member) commented Dec 21, 2014
No, because HTTP is accessible directly AND HTTPS (443) also serves Unblock, does SNI forwarding, etc etc.

@taoeffect (Member) commented Dec 20, 2014

Note to self: after @SGrondin addresses this set of comments, double-check that the cb for the bottlenecks is properly called everywhere. @SGrondin, maybe you can double-check all those code paths too, just to ensure we don't have any surprises.

@SGrondin (Author, Member) commented Dec 21, 2014

OH YES! It's extremely important. I spent 90% of the time implementing this feature making sure every single code path calls its callback; it'd be great if you could check to make sure I didn't leave anything out. Btw, in the case of DNS, make sure that cb is called AFTER the domain has been resolved and the response sent back to the user.

SGrondin added a commit that referenced this pull request Dec 21, 2014

@SGrondin SGrondin merged commit 748e9cf into dev Dec 21, 2014

@taoeffect (Member) commented on 5e985d6 Dec 25, 2014

👍 Great job.
