Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A bit more informative errors of constraints #776

Closed
wants to merge 2 commits into from

Conversation

manifest
Copy link

Currently, we can't determine which of constraint fails and what a field for. in case the constraint returns false, we receive a not informative {reason, case_clause}. And when the constraint specified without a default value and a field isn't presented in the request, we receive {reason,bad_key}. We could make this behaviour a bit more informative by passing a field name and a constraint function with the exception {bad_constraint, Key | {Key, Constraint}}.

  • {bad_constraint, {Key, Constraint}}
    when the constraint returns 'false' and a field is presented in the request
  • {bad_constraint, Key}
    when the constraint specified without a default value and a field isn't presented in the request
%% Example

byte_size_eq(Size) ->
  fun         
    (Val) when byte_size(Val) =:= Size -> true;
    (_)                                -> false
  end.  

malformed_request(Req, State) ->
  ByteSizeEq1 = byte_size_eq(1),
  Constraints =
    [ {a, int, 42},
      {b, ByteSizeEq1} ],

  ErrorDesc =
    fun
      ({Key, F}) ->
        Field = atom_to_binary(Key, utf8),
        case F of
          int         -> <<"The '", Field/binary, "' field must be an integer.">>;
          ByteSizeEq1 -> <<"The '", Field/binary, "' field must be a bynary string of length 1.">>;
          _           -> <<"Invalid value of the '", Field/binary, "' field.">>
        end;
      (Key) ->
        <<"The '", (atom_to_binary(Key, utf8))/binary, "' field is required.">>
    end, 

  try cowboy_req:match_qs(Constraints, Req) of
    M ->
      {false, Req, M}
  catch
    _:{bad_constraint, Reason} ->
      error_logger:error_report([bad_constraint, {reason, Reason}]),
      Req2 = cowboy_req:set_resp_body(jsxn:encode(#{error => ErrorDesc(Reason)}), Req),
      Req3 = cowboy_req:set_resp_header(<<"content-type">>, <<"application/json">>, Req2),
      {true, Req3, State}
  end.
## Case 1

curl -4i -XGET "http://localhost:8081/?a=A"
HTTP/1.1 400 Bad Request
connection: keep-alive
server: Cowboy
date: Fri, 19 Dec 2014 13:59:39 GMT
content-length: 45
content-type: application/json

{"error":"The 'a' field must be an integer."}
=ERROR REPORT==== 19-Dec-2014::16:59:39 ===
    bad_constraint
    reason: {a,int}
## Case 2

curl -4i -XGET "http://localhost:8081/?a=1"
HTTP/1.1 400 Bad Request
connection: keep-alive
server: Cowboy
date: Fri, 19 Dec 2014 14:00:17 GMT
content-length: 32
content-type: application/json

{"error":"The 'b' field is required."}
=ERROR REPORT==== 19-Dec-2014::17:00:18 ===
    bad_constraint
    reason: b
## Case 3

curl -4i -XGET "http://localhost:8081/?a=1&b=BB"
HTTP/1.1 400 Bad Request
connection: keep-alive
server: Cowboy
date: Fri, 19 Dec 2014 14:00:50 GMT
content-length: 62
content-type: application/json

{"error":"The 'b' field must be a bynary string of length 1."
=ERROR REPORT==== 19-Dec-2014::17:00:50 ===
    bad_constraint
    reason: {b,#Fun<example_hello_handler.0.73648182>}
## Case 4

curl -4i -XGET "http://localhost:8081/?a=1&b=B"
HTTP/1.1 200 OK
connection: keep-alive
server: Cowboy
date: Fri, 19 Dec 2014 14:01:16 GMT
content-length: 15
content-type: application/json

{"a":1,"b":"B"}

@essen
Copy link
Member

essen commented Dec 19, 2014

I'm OK with the principle of making a more identifiable error, however it should really be an error and not a throw.

It should be a different error if the key is missing. I'm not convinced of the bad_constraint name either otherwise, it makes it sound like a constraint was invalid.

I'm also not convinced we should throw an exception as soon as an error is found. This is the way it is now because I went the easiest route first and then waited for feedback, but this is probably not how we want it if we want to adequately handle errors. We should probably throw something like error({badmatch, ListOfFailures}) (after all, we are doing a match), and then ListOfFailures can be a list like [{missing, Key} | {invalid, Key, Constraint}]. And invalid Key may be found more than once if the value fails more than one constraint.

This should give us all errors should we want to print a proper error message to the user.

@manifest
Copy link
Author

  • What reason to use an error here? From my opinion, we don't need a callstack, so throw looks like a better (lightweight) alternative.
  • How are you seeing the execution flow of constraints? I've thought the idea that value are passing through the constraints like it happens in a pipe. If one of constraint failed, the value wouldn't passed to the next one. It useful in case we want to reuse the constraints which transform the value (for instance, [fun base64/1, fun byte_size_32/1, ...] and not like [fun base64_and_then_byte_size32/1]). If so, the invalid Key may be found only once in the outcome list of errors. Am I wrong?

@essen
Copy link
Member

essen commented Dec 20, 2014

Hm not sure why you think throw has no stacktrace. In Erlang it is catch which generates the stacktrace. It doesn't matter if it is an error, exit or throw, they all end up generating a stacktrace if you catch them. (catch always does even when there is no exception; and try .. catch only does if there is an exception.)

So we use error because an error happened. Input was invalid: this is an error.

Good point on the pipe thing, we can't try the other constraints for the value. Forgot about that. On the other hand we can try other keys and still return a list of the above form. Just Key will be unique.

Because it is unique it might be more interesting to use a map instead, like #{Key => missing | {invalid, Constraint}}.

@fishcakez
Copy link
Contributor

catch always does even when there is no exception

Can you explain this in more detail?

@essen
Copy link
Member

essen commented Dec 20, 2014

catch (function()) will generate a stacktrace when an exception is triggered within function() but also when function() returns normally.

@fishcakez
Copy link
Contributor

I think you are confused. The expensive bit you may want to avoid is the part of catch that is equivalent to calling the erlang:get_stacktrace() when catch catches an error exception. This is of course only unnecessary expense if you don't make use of that stacktrace.

@essen
Copy link
Member

essen commented Dec 20, 2014

Reading again about it (http://erlang.org/pipermail/erlang-questions/2013-November/075928.html) it seems I am misremembering what actually happens, yes. catch is dangerous when it may receive error, but pretty much any code can trigger an error (function_clause, case_clause, badmatch...) so catch should just not be used if you care about performance.

In the case of try .. catch there is no difference between throw and error or exit however so there is no performance reason to go with one or another. The construct also has the advantage that it doesn't catch everything; you can simply catch the match error and let the rest through if needed. One may want to also catch error:_ though Cowboy will already send a 400 if you don't.

Looking further though I see the documentation constraints error specifically to runtime errors, which this isn't. So I retract my previous statements and would advise something like throw({badmatch, Map}) instead.

Thanks for telling me I was confused. :-)

@manifest
Copy link
Author

I've updated the source code.

curl -4i -XGET "http://localhost:8081/?a=A"
HTTP/1.1 400 Bad Request
connection: keep-alive
server: Cowboy
date: Sun, 21 Dec 2014 12:38:38 GMT
content-length: 89
content-type: application/json

{
  "errors": [
    "The 'a' field must be an integer.",
    "The 'b' is required."
  ]
}
=ERROR REPORT==== 21-Dec-2014::15:38:38 ===
    badmatch: #{a => {invalid,int},b => missing}

Two more questions about constraints:

  • Why are we using a boolean type for the outcome of a constraint function? May be we should switch true | {true, any()} | false to ok | {ok, any()} | error? Last ones are looking more familiar to Erlang.
  • Could we delegate catching constraint's badmatch exception to cowboy (cowboy_rest) insead of doing it in the handler? If exception was thrown, cowboy would catch it and call some new callback function (malformed_error) of ours. If the callback wasn't defined, cowboy would use its own predefined (reply with 'text/plain' for instance). By doing this, we will get a clear handlers and an informative error output out of the box.

@essen
Copy link
Member

essen commented Dec 21, 2014

Hm today Cowboy catches and send a 400 (or will, it may still send a 500 today, but I think I changed that). There is the onresponse hook which gets all info except the reason for the reply. Maybe we can have a separate hook for error replies but Cowboy needs to make sure a reply is sent, so the solution is not obvious and need some thinking.

As for true/false instead of ok/error, well, it's simply because it's much easier to use, for example like char_constraint(Value) -> Value >= 0 andalso Value =< 255..

@tuscland
Copy link

Hi,

I would like to mention that this "constraints" problem is not limited to Cowboy route constraints but also other types of input validation and transformation.

I have written a small module that uses function composition (think "type composition") and validate input the same way Cowboy does at the moment, that is, return {true, NewValue} or {false, Reason} where Reason is a programmatic description of the error (an erlang term).

This code is currently used for JSON or XML input as well. I use it with Cowboy constraints too, by waiving the Reason away, and returning false instead.

If Cowboy provided a callback function to allow for error processing, then route constraints would not be to be more complex than to return {false, Reason} when a constraint fails. The custom error handler would then take care of doing what is needed (e.g. format the error, set the body and reply the proper error code).

Camille

@essen
Copy link
Member

essen commented May 6, 2015

The subject of constraints has come up again in the context of reverse routing: #816

At this point I want to do a few more interface changes and tag 2.0.0-pre.2. I don't think constraints will be solved yet. This is a much larger problem than I thought.

In summary, so far, constraints need:

  • To be able to convert data to a usable type.
  • To have a normal and inverse operation int(binary()) -> integer() and inverse_int(integer()) -> binary() to make reverse routing possible.
  • To return {ok, Value} or {error, Reason}. Indeed this makes more sense.
  • The Reason should probably be an atom.
  • For the cases where the error is not catched in user code we should probably have a hook. This hook can be used to do formatting of the error and send an appropriate response. Otherwise Cowboy would just send a 400 with its own formatting.

I believe a hook is better because then you don't have to worry about it per handler, it will be used anywhere in the listener.

Am I missing something?

@tuscland
Copy link

tuscland commented May 6, 2015

Hello Loïc,

Please let me contribute my experience on that matter.

We have developped a simple module that we use to convert data into usable type for this very purpose. It is compatible with Cowboy's constraints, and we use it also to parse and validate JSON values. I'd be happy to send this code to you if you want to gather ideas.

It uses function composition to express the parsing intent, e.g. if you evaluate list(integer()), or any(null(), uuid), you get a function that can be used to parse binary input, and the result of calling this function is {true, Result} or {false, Reason}.

So in our system, we have a description of the structure and hierarchy of the URLs. This description is compiled to a list of strings that are used by Cowboy for routing, and to a list of functions that we use to generate URLs (reverse-routing if you will). This makes the routes definition practically empty since it is auto-generated.

Best,
Camille

@essen
Copy link
Member

essen commented Aug 10, 2015

This is interesting to get inspiration from when it comes to providing default constraints: http://livr-spec.org/

@essen essen modified the milestone: 2.0.0 Aug 18, 2015
@essen essen modified the milestones: 2.0.0, 2.0 Breaking Changes Feb 3, 2017
@essen
Copy link
Member

essen commented Jun 28, 2017

I've done both this and reverse constraints in c221730. Feedback welcome!

@essen essen closed this Jun 28, 2017
@manifest
Copy link
Author

it would be useful to access not only a value but also its key in case of format_error.

@essen
Copy link
Member

essen commented Jun 29, 2017

This module only validates values. I can see the exit exception having the field though, there it would definitely make sense (and it probably would make even more sense to aggregate all errors and only throw at the end).

@essen
Copy link
Member

essen commented Jul 1, 2017

Something like this?

diff --git a/src/cowboy_req.erl b/src/cowboy_req.erl
index 3eca379..f65ecc6 100644
--- a/src/cowboy_req.erl
+++ b/src/cowboy_req.erl
@@ -762,28 +762,43 @@ kvlist_to_map(Keys, [{Key, Value}|Tail], Map) ->
 		kvlist_to_map(Keys, Tail, Map)
 	end.
 
+filter(Fields, Map0) ->
+	case filter(Fields, Map0, #{}) of
+		{ok, Map} ->
+			Map;
+		{error, Errors} ->
+			exit({validation_failed, Errors})
+	end.
+
 %% Loop through fields, if value is missing and no default, crash;
 %% else if value is missing and has a default, set default;
 %% otherwise apply constraints. If constraint fails, crash.
-filter([], Map) ->
-	Map;
-filter([{Key, Constraints}|Tail], Map) ->
-	filter_constraints(Tail, Map, Key, maps:get(Key, Map), Constraints);
-filter([{Key, Constraints, Default}|Tail], Map) ->
+filter([], Map, Errors) ->
+	case maps:size(Errors) of
+		0 -> {ok, Map};
+		_ -> {error, Errors}
+	end;
+filter([{Key, Constraints}|Tail], Map, Errors) ->
+	filter_constraints(Tail, Map, Errors, Key, maps:get(Key, Map), Constraints);
+filter([{Key, Constraints, Default}|Tail], Map, Errors) ->
 	case maps:find(Key, Map) of
 		{ok, Value} ->
-			filter_constraints(Tail, Map, Key, Value, Constraints);
+			filter_constraints(Tail, Map, Errors, Key, Value, Constraints);
 		error ->
-			filter(Tail, Map#{Key => Default})
+			filter(Tail, Map#{Key => Default}, Errors)
 	end;
-filter([Key|Tail], Map) ->
-	true = maps:is_key(Key, Map),
-	filter(Tail, Map).
+filter([Key|Tail], Map, Errors) ->
+	case maps:is_key(Key, Map) of
+		true ->
+			filter(Tail, Map, Errors);
+		false ->
+			filter(Tail, Map, Errors#{Key => required})
+	end.
 
-filter_constraints(Tail, Map, Key, Value0, Constraints) ->
+filter_constraints(Tail, Map, Errors, Key, Value0, Constraints) ->
 	case cowboy_constraints:validate(Value0, Constraints) of
 		{ok, Value} ->
-			filter(Tail, Map#{Key => Value});
+			filter(Tail, Map#{Key => Value}, Errors);
 		{error, Reason} ->
-			exit(Reason)
+			filter(Tail, Map, Errors#{Key => Reason})
 	end.

@essen
Copy link
Member

essen commented Jul 1, 2017

Well I pushed in most recent commit. :-)

@manifest
Copy link
Author

manifest commented Jul 1, 2017

I see. So, I've got

{error, Constrains} = cowboy_req:match_qs(...)

Next, I could execute

maps:map(fun(Key, Val) -> fun cowboy_constraints:format_error(Val) end, Constrains)

Does the map is a right container then?

@essen
Copy link
Member

essen commented Jul 1, 2017

Depends on what you're going to do with it. Would certainly be better if you were going to put errors next to form fields in an HTML page for example. And conversion from this map should be fast if you need a list (it's unlikely to be a large map).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants