Error URI for Rate Limiting Purposes #510
Giving this some love definitely makes sense, in particular as I guess it popped up in real world usage, and is about closing small gaps which improve on practical use. In my eyes, these are the elements we need for a complete solution (and I might still have missed sth .. not sure):
The title of this issue suggests to me it is only about 4., and agreeing on one specific string as a standard error URI is good, easy and required, but the stuff around it - the semantics - is what makes the string into something of actual use in my eyes. Just my 2cts at least ;) Let's discuss.
While I agree that it would be good to discuss the 1-6 points @oberstet named, even this small URI introduction is already a good improvement. Let's introduce it, and start the discussion of the other things in parallel.
sure! good idea to split it up into smaller pieces. even so: the error URI should use a suggestive name, capturing/pointing to the right semantics, and thus the name cannot be fully separated from the other pieces
the spec talks only about WAMP actions I think .. at least for "publish" and "call" .. not sure. we definitely should use a consistent wording ... and "request" doesn't fit IMO ... is "publish" about limits on the number of PUBLISH messages, or on the number of resulting EVENTs? in the latter case, this only underlines even more the problem with using "request" .. the REGISTER, SUBSCRIBE etc messages are different again: the load resulting in a router is tied more to the overall size of the set of registrations/subscriptions for the originating client, and to the effective resulting load later on, eg 1 SUBSCRIBE to 1 wildcard URI. but I think we can nevertheless follow your suggestion and introduce only the URI in a first piece, after we have generally agreed about the rough scope ... at least for me, this is pretty unclear at this point still ... is it about limiting client load (protecting clients, and routers only secondarily)? is it about limiting router load (not necessarily the same as the previous)? is it about limiting the total set size of URIs registered by clients, or the load resulting from those registrations? is it about the turnover of such registered/subscribed URIs per client per period of time? and so on ...
to give an idea of what I mean .. a couple of URIs alternative or additional to the proposed one:
@oberstet I think you are talking about more fine-grained limit control, while the original @ecorm idea is about a generalized error URI. And actually, they don't contradict each other. In my eyes, one WAMP router implementation may do just a general limiter, and that is fine, while another one may decide to go beyond and implement all the fine-grained limits.
Ok, makes sense! You are right, I was more thinking in WAMP higher level terms. If this issue is more about lower level WAMP terms, then we might think of it more in terms of WAMP transport limits and protection. If so, we should have a clear picture of how that in turn differentiates/relates to the (still to be done) upper level WAMP limits (in general), and to limits below WAMP transports - namely pure network level limits, like from your standard IP or TCP level traffic limiter or DoS protection. In this case, rgd naming, here is a quick brain dump of options that come to my mind:
or similar. why? pls let me explain my thinking. first, I can't think of any other metrics than "number of messages per second" and "message data volume per second" which would make sense without reaching into error conditions which are better treated in the higher level limits (not treated in this issue). The main differentiator to lower levels (a general IP or TCP or TLS limiter): we can deal with such limits on a per-realm, per-session, per-authid or per-authrole basis, which a lower level limiter can not. the latter in fact will usually only see an encrypted TLS connection on top of a TCP connection on top of IP .. it doesn't have any idea about "individual WAMP messages", realms, authids or authroles .. following this perspective, the question then becomes: should the error URIs name errors differently depending on the subject the limit applies to? that is realm, session, authid or authrole. the object of the error is always a limit violation.
writing this, I see the above misses one final dimension: the direction (uplink and downlink). do we want the limits to cover only 1 direction? is it possible that a router fires a limit without the client sending anything? eg. when a subscribed client would receive an excessive amount of EVENTs if the router would honor the subscription?
IOW, how about
So 16 in total.
Do we want to have 16 error URIs?
As @KSDaemon emphasized, I'm looking more for a generic rate limiting mechanism that doesn't care about message types, as long as they originate from the client. The algorithm I was contemplating is the one provided by Nginx (https://www.nginx.com/blog/rate-limiting-nginx/) with burst mode enabled. It is also called sliding window in HAProxy. Having this simple mechanism has an immediate practical effect in my router code: I can have a fixed-capacity per-session queue of client-initiated WAMP messages, and can prevent a client from crashing the router simply by exhausting memory through spamming "requests". By also limiting the number of client connections and the maximum WAMP message length, I can put an upper bound on memory usage. Instead of the number of messages, the per-session queue limit could also be based on the sum of queued message lengths. But that's an implementation detail and how it's done should not be mandated in the WAMP spec. But if you want to consider finer-grained limits, that's fine too, as long as it doesn't preclude the generic one. Even finer than per-message-type would be individual API endpoints (i.e. individual RPC URIs).
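For illustration, the Nginx-style "rate plus burst" behaviour boils down to a small token/leaky bucket. The class name and structure below are made up for this sketch and are not taken from Nginx, HAProxy, or any WAMP implementation:

```python
import time

class BurstRateLimiter:
    """Sketch of an Nginx-style rate limiter with burst: a sustained
    `rate` of requests per second is allowed, and short spikes of up to
    `burst` requests are absorbed before rejection kicks in."""

    def __init__(self, rate: float, burst: int):
        self.rate = rate                # sustained requests per second
        self.burst = burst              # spike tolerance, in requests
        self.allowance = float(burst)   # current credit, starts full
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Replenish credit at `rate` per second, capped at `burst`.
        self.allowance = min(float(self.burst),
                             self.allowance + (now - self.last) * self.rate)
        self.last = now
        if self.allowance < 1.0:
            return False   # caller would answer with the rate-limit error URI
        self.allowance -= 1.0
        return True
```

A router could call `allow()` once per client-initiated message and reject with the (still to be agreed) rate-limit error URI when it returns `False`.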
I would suggest exponential backoff, but I don't think this should be mandated by the WAMP standard.
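A client-side exponential backoff could look like the sketch below (the "full jitter" variant, where each wait is randomized up to the exponential bound); the function name and defaults are invented for illustration, nothing here is mandated by WAMP:

```python
import random

def backoff_delays(base: float = 0.5, cap: float = 30.0, attempts: int = 5):
    """Yield randomized exponential backoff delays in seconds:
    attempt n waits a random duration in [0, min(cap, base * 2**n)]."""
    for n in range(attempts):
        yield random.uniform(0.0, min(cap, base * (2 ** n)))
```

A client receiving a rate-limit ERROR would sleep for the next yielded delay before retrying, and reset the sequence after a successful request.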
Why not "request"? It is a message originated by the client for which a response is expected (request-response). Those familiar with HTTP/REST would immediately understand its meaning. Instead of GET, POST, etc, the verbs in WAMP are SUBSCRIBE, CALL, etc. I'm open to other suggestions, but I can't think of a better term myself.
Unless the client needs to react differently per URI, they don't need to be distinct error URIs and more details could be added to the ERROR message as payload arguments.
I agree, but I don't mind that we first discuss an error URI "namespace" (as suggested by oberstet) under which all future rate limiting errors would be put.
This has to be the first priority in our discussions (but it doesn't have to be to the exclusion of others). Crashing an entire server is more damaging than crashing a single client.
It should just be one item, and further details on limits/quotas would be communicated via HELLO/WELCOME details or meta procedure.
Perhaps a meta procedure? I'm already finding the HELLO/WELCOME messages way too verbose with the role/feature dictionaries.
I'm not sure I understand the question, but the router should be in control of its own limits via configuration. If clients have their own limits (e.g. EVENTs per second), I'm not sure what the router can do to respect them. There is no notion of QoS in WAMP. Does the router simply discard events that cannot be delivered to clients due to client limits?
This was the goal of my original post.
I don't think the exact algorithms should be baked into the WAMP spec, just the ERROR URIs and message options.
Again, the exact algorithm should not be specified by the WAMP spec, the same way it's not specified for HTTP when a 429 response should be returned.
Instead of a combinatorial explosion of error URIs, there could be a single error URI with two payload arguments: one describing the limit that was exceeded, and one describing its scope.
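To make the single-URI idea concrete, here is a sketch of what such an ERROR message could look like on the wire. The URI `wamp.error.limit_exceeded` and the argument names `limit`/`scope` are placeholders still under discussion, not part of the spec; the message layout and type codes (`ERROR` = 8, `CALL` = 48) follow the WAMP basic profile:

```python
ERROR = 8   # WAMP ERROR message type code
CALL = 48   # WAMP CALL message type code

def make_limit_error(request_type: int, request_id: int,
                     limit: str, scope: str) -> list:
    """Build a WAMP ERROR message carrying one generic error URI plus
    two payload arguments describing the exceeded limit and its scope.
    The "limit" and "scope" kwarg names are hypothetical."""
    return [ERROR, request_type, request_id,
            {},                              # Details
            "wamp.error.limit_exceeded",     # single generic error URI
            [],                              # Arguments
            {"limit": limit, "scope": scope}]  # ArgumentsKw

# Example: a CALL (request id 7) rejected by a per-session call-rate limit.
msg = make_limit_error(CALL, 7, limit="call_rate", scope="session")
```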
yeah ... but what's the problem? URI components are just arguments .. of a flavor different from regular args/kwargs. having said that: I also feel a threat of "combinatorial explosion". but it's not tied to the use of URI components vs args/kwargs. it is tied to the inherent complexity of the subject matter (routing) that needs structuring. you cannot escape it by using args/kwargs instead of URI components. that would be following a chimera. in my mind/thinking at least ;)
ok, yes, agreed! let's nail that first. rgd the URI namespace: this is pretty generic .. and fuzzy .. I'm just saying .. maybe that's what we want ;) how about
e.g. for a thermal limit exceeded: it cannot be set, but is given by the system hardware. it may or may not be caused by/originating from a user. if not, why report it to the user? basically, the only difference is that the problem is related to a resource which is necessary for router operation, but is "limited" in the amount available. note that for my ears (and I'm not a native speaker), "resource exceeded" is more clear than "limit exceeded" .. the former implies there is some "resource" (assumed necessary for operation), and it's "empty/low". the latter only implies numbers with an ordering and size (a metric) and a "threshold smaller/bigger"
really? are you sure? ;) there are many ways. don't get me wrong, it would be highly useful to have a systematic analysis of this!! e.g. if I can connect to a router at all, and if I can trigger any non-fatal error like "no such procedure", what about amplification attacks? e.g. can I trigger an error response bigger than my request? like https://www.cloudflare.com/learning/ddos/dns-amplification-ddos-attack/
Having a large number of error URIs that is multiplied every time someone adds a new component choice is very problematic to me, because I map each of the known WAMP error URIs to a C++ error code. If an implementation wanted to do string-based pattern matching on the ERROR, couldn't it just generate a string by concatenating the URI with each payload argument and a separator?
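The concatenation idea could look like this sketch: pattern-matching implementations synthesize a dotted key from the URI plus selected payload arguments, while enum-based implementations keep matching on the bare URI. The URI, field names, and values are hypothetical placeholders from the discussion above:

```python
def error_key(uri: str, kwargs: dict,
              fields: tuple = ("limit", "scope")) -> str:
    """Concatenate an error URI with selected payload arguments so that
    string-based matchers can treat the combination as one dotted key.
    The `limit`/`scope` field names are hypothetical, not spec."""
    parts = [uri] + [str(kwargs[f]) for f in fields if f in kwargs]
    return ".".join(parts)

key = error_key("wamp.error.limit_exceeded",
                {"limit": "call_rate", "scope": "session"})
# key == "wamp.error.limit_exceeded.call_rate.session"
```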
I don't know where you're going with this question.
"Exceeding a resource" does not make semantic sense in English. You exceed the limits of a resource in its usage. Let's take fuel usage as an example of a resource. You can't say "you exceeded fuel", nor "you exceeded fuel usage", but you can say "you exceeded the fuel usage limit". Or consider the English phrase that is well-known to even non-native speakers: "exceeding the speed limit". Don't tell me you never heard the phrase "exceeding the speed limit". 😁 Also, the limits are not necessarily tied to a single specific resource. If I impose a "request rate limit" of 100 requests per 10 seconds, it affects both memory and CPU resources.
No, because of the English semantics I explained above (exceeding a resource), but I would be okay with
Another approach I'm toying with in my mind is that the error URIs prescribe how the client should react:
Details on the limit exceeded and scope would be in payload arguments. Sorry, too tired to elaborate more right now, so I'm just dumping the idea.
I gave this some more thought, and this is not a good idea because there would be applications where the client is in the best position to know how it should react. For some operations, the client could decide that data loss is acceptable (to avoid overloading the server) and not bother retrying. For example, a client could be reporting a sensor value every minute. If an acked PUBLISH operation fails due to rate limiting, the client could decide to abandon the failed request and just wait until the next sensor update tick to publish the latest sensor value.
Here is a big brain dump of me stepping back and looking at the big picture. I do not claim to be an expert in any of the computing fields mentioned below. I do not have time to become an expert, and only need a simplistic, generic rate limiting mechanism like the one provided by Nginx. The products I'm working on will be mainly accessible via local networks and will not be exposed directly to the Internet. I took the time to write this analysis so that we can establish an initial naming convention for error URIs related to rate limiting and resource exhaustion. I have no intention of implementing all the ideas I write below.

### Problems

There are several networking problems being addressed here, and the rabbit hole can go deep for all of them.
### Usage Limits vs Resource Exhaustion

We should distinguish between two broad categories of errors:
Note that usage limits can be established to prevent the exhaustion of resources. And current resource usage can be used to dynamically set the usage rate limits.

### Usage Limits

Usage limits to mitigate the above listed problems can be configured/enforced at the router level, but it could also be done at the application level. For example, API usage limits could be directly enforced by a callee, or the router could be directed to enforce them on the callee's behalf. Limitation mechanisms can be generic (requests per second per user), or they can be API-specific (e.g. customer Bob can only call a given procedure a limited number of times per day). Generic limitations that could be configured/enforced at the router include:
Note that there need not be separate URIs for subscribe/publish/register/call limits, because the context already determines which action triggered the limit.

### Scope of Limits

The scope of these limits can be (as suggested by oberstet):
I would also add:
### Resource Exhaustion

Resources that can become exhausted on the server include:
The feedback loop for avoiding resource exhaustion does not need to directly involve the client. Instead, the server limits (e.g. request rate) could be dynamically adjusted based on current resource usage. Instead of rate-limiting clients to mitigate resource exhaustion, they could also be dropped (load shedding). A load balancer could take care of forwarding a dropped client to a less busy server when it attempts to reconnect. Unless I'm mistaken, there is currently no ABORT error URI for this purpose.

### Client Reactions

When considering error URIs that are returned to the client, the client must make a decision on how to deal with the error. It can:
The prescribed reaction to a particular error can either be mandated by the application or micro-service, or be decided independently by the client. Some services could be safe to degrade, while others are not (prioritization). The reaction could also be performed by the server by dropping the client (e.g. DoS detection), in which case the error URI could be returned via a server-initiated ABORT message just before disconnection.

### Client Overload

When considering how a client can become overloaded, here are the use cases I can think of:
There are no EVENT acknowledgements, so there is no immediate way to notify the server of overload unless it's via another channel. There is also no notion of QoS message delivery in WAMP. The client is responsible for graceful degradation in this case, and this could be achieved by dropping events or via an API that's somehow associated with the publisher. For excessive invocations, the client can easily return an ERROR that is forwarded back to the caller via the router. For progressive invocations (e.g. streaming), the graceful degradation could be done via progressive results understood by the caller. Note that only the raw socket transport establishes payload limits. There are no such limits negotiated via Websocket.
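As an illustration of the "client drops events" degradation strategy mentioned above, a subscriber could keep a bounded buffer that silently discards the oldest undelivered EVENTs when it cannot keep up. This is a sketch with invented names, not the API of any WAMP client library:

```python
from collections import deque

class EventBuffer:
    """Bounded client-side event buffer: when the subscriber cannot
    keep up, the oldest buffered EVENTs fall off instead of letting the
    queue (and memory use) grow without bound."""

    def __init__(self, capacity: int = 100):
        # deque with maxlen drops items from the left as new ones arrive
        self._events = deque(maxlen=capacity)

    def on_event(self, event) -> None:
        """Called by the transport for each incoming EVENT."""
        self._events.append(event)

    def drain(self) -> list:
        """Hand the surviving events to the application and reset."""
        items = list(self._events)
        self._events.clear()
        return items
```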
Here's another argument in favor of the term "request": it is already used extensively in the spec as the `Request|id` element of messages.
First of all: @ecorm, huge research written above! ^^^ Since I didn't have time to respond earlier, there are going to be a lot of different comments from my side now.

In general, I believe that the WAMP protocol specification (which is an application-level protocol) should not describe specific algorithms and implementations of DDoS protection or traffic shaper mechanisms, and certainly not go down to lower transport layers (such as TCP/IP), but should only provide building blocks for such capabilities (by building blocks I mean error URIs, error payload descriptions and so on). Only such an option will allow us, on the one hand, to have a concise and understandable specification of the protocol, and provide freedom of implementation on the other hand. Let's take the HTTP specification for comparison: it simply describes the response codes and their brief purpose (perhaps a bit shorter than it should be, but let's not talk about that). So the protocol itself provides the code 429, while the exact rate limiting behaviour is left to implementations.

Regarding error URIs and their number: on the one hand I understand the perfectionism and granularity on the part of @oberstet in having a separate URI for each specific type of error. But based on my personal experience I'm sure that few people will implement different variants of handling for each error; rather it will be an inconvenience for the client, because in most cases the client doesn't care why its request was rejected (because of exceeding the number of calls from one IP, or limiting the number of published events at a certain time of day). In the best case, the client will just try to repeat the request after some time (good if not immediately). And it seems to me that from a practical point of view, it is better to have fewer different error URIs, hiding the details in the payload. This will simplify error handling in most cases without losing details for those rare cases when they are needed.
At least in none of my projects over the last 20 years have I had the need to implement such granular error handling (and sometimes there was simply no time for it). And here I like the @ecorm idea of having a single error URI with two payload arguments. I prefer to have one dictionary argument instead of a few positional arguments as a more self-expressive option, but those are details. Regarding the error URI, I still prefer
Regarding the term for
Yeah, we should clarify and agree on what we mean by that. And by
So I don't see any reason why the same URI cannot be used in both cases. That's my, emm... not 50 cents, but maybe 1 dollar ;)
Oh, and btw, I was also thinking and looking into the Nginx traffic shaper implementation as a reference :)
I'm +1 on
which nicely matches the theme established by
With those 2 or 4 URIs, implementations would be free to use payload arguments to provide supplemental information. For excessive authentication attempts, we can either:
Rgd auth flow: well, a router could return the
The idea is for a client app to know when to display "Too many login attempts" instead of "Incorrect username and/or password".
Yeah! Sure! I got exactly the same idea!
To be clear, I prefer the error URIs to be distinct for those two use cases.
lots of discussion ;) fwiw, just a quick note: I'm going to comment etc. within the week, currently busy with other stuff
There is currently no standard mechanism to allow the rate limiting of WAMP messages to mitigate DoS attacks, or just to prevent memory exhaustion during heavy loads.
HTTP has the `429 Too Many Requests` response status code, but WAMP does not have an equivalent error URI. WAMP router implementations wanting to reject messages due to rate limiting have to come up with their own non-standard error URI.

While HTTP reverse proxies can do rate limiting for HTTP requests, they don't understand the WAMP protocol over Websocket, so they're of no help to us in that respect.
WAMP client libraries wanting to support message rate limiting would need to understand a `wamp.error.too_many_requests` (or similar) error URI and implement a backoff-and-retry mechanism (or otherwise pass that burden up to the app developer).

The client-initiated messages that would be considered "requests" for rate limiting purposes would be:

- `SUBSCRIBE`
- `UNSUBSCRIBE`
- `PUBLISH` (acknowledged)
- `REGISTER`
- `UNREGISTER`
- `CALL`
A "request" that's rejected due to rate limiting would receive an `ERROR` response with a URI such as `wamp.error.too_many_requests`.

Note that an unacknowledged `PUBLISH` would be "lost" if it got rejected due to rate limiting.

I'm not asking for a specific rate-limiting algorithm to be standardized. I think all that's needed in the WAMP standard is a new error URI for this purpose.
I am aware of issue #36 , but it's more about the router not spamming clients too quickly with pub/sub events. The issue I'm raising here is to prevent clients DoS'ing routers.