Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Driver panics on user data #190

Closed
or-else opened this issue May 20, 2015 · 17 comments
Closed

Driver panics on user data #190

or-else opened this issue May 20, 2015 · 17 comments

Comments

@or-else
Copy link
Contributor

or-else commented May 20, 2015

The driver panics on user data in
https://github.com/dancannon/gorethink/blob/master/query_control.go#L26
The panic is not documented. The doc just states "If the value cannot be converted, an error is returned at query .Run(session) time" which is only partially true.

Panicking on user data is not a good pattern. The driver should be able to handle any input without crashing.

If I do an Insert on user-provided data, the only way to avoid the panic is to pre-parse it myself to ensure the maximum nesting depth is not exceeding the driver-imposed limit. Which is a lot of unnecessary code duplication and extra CPU cycles.
Wrapping every call to gorethink with Recover does not seem like a clean solution either.

Why not just

return Term{
    termType: p.Term_DATUM,
    data:     nil,
}

instead of panicking?

@dancannon
Copy link
Collaborator

Hi, you have raised some pretty good points in the issue however I don't think it's as simple as just removing the panic (although I agree panicking on user data is not a good idea). My worry with just removing the panic is that you hide what is going on, for example you could try to persist some data and the driver would accept this data as being valid but silently discard some data.

Here are all the possible solutions I can think of, I think it would be good to see if anybody else has any other ideas.

  • Keep the panic
  • Remove the panic and return a nil Term, could cause silent data loss as mentioned above
  • Remove the panic and return a nil Term but log what has happened
  • Change the Expr function to return both a Term and and error, this is the safest option but makes calling the function significantly less tidy. For example you wouldnt be able to do something like this r.Table("test").Insert(r.Expr(...))
  • Save the error in the term and raise an error when building the query (when calling Run)

@kofalt
Copy link

kofalt commented May 20, 2015

It does looks like that pattern is already used later on:

if err != nil || data == nil {
    return Term{
        termType: p.Term_DATUM,
        data:     nil,
    }
}

In general, I am not a fan of losing control flow information this way - we can look forward to a very confused bug report on how data sometimes turns into nil at an arbitrary complexity threshold. Unfortunately, this pattern eliminates any distinction between an error and Expr(nil), a common problem when juggling JSON-like structures.

Unless I misunderstand, Expr is precluded from using a multi-return pattern as it needs to be chainable. Hitting a recursion limit does sound like exceptional behaviour, and I assert that responsible web development includes the use of some variant of a recovery handler.

@kofalt
Copy link

kofalt commented May 20, 2015

... I did not see @dancannon's post before mine, whoops!

@or-else
Copy link
Contributor Author

or-else commented May 20, 2015

Yes, saving the error in the Term to return it on Run seems like a much cleaner solution than panicking or silently dropping the data.

@or-else
Copy link
Contributor Author

or-else commented May 20, 2015

Regardless of the way you choose to approach the problem, panicking on user data is bad. If a malicious user knows that the service is written in Go and backed by Rethink, he/she will be able to predictably crash the service at will.

@kofalt
Copy link

kofalt commented May 20, 2015

predictably crash the service at will

Ref my comment:

I assert that responsible web development includes the use of some variant of a recovery handler.

Mainly for reasons unrelated to gorethink; for example, an out-of-bounds array access causing a production server process to fall over would be irresponsible indeed!

saving the error in the Term to return it on Run

This approach sounds like a good idea, and it's the fifth posibility proposed by @dancannon.

I would support this approach as it has the least loss of control flow information without significantly disrupting the usability of Expr.

@or-else
Copy link
Contributor Author

or-else commented May 20, 2015

@kofalt, it seems like you are commenting on the assumption that I have not read yours or @dancannon's comments. Please rest assured I have read all comments on this page.

@dancannon
Copy link
Collaborator

So I have come up with a quick solution in the feature/expr_term_errs branch, however when creating some test cases I noticed that the following code also panics.

type A struct {
    A *A
}

func main() {
    a := &A{}
    a.A = a

    r.Expr(a) // panics, same as encoding/json
    // runtime: goroutine stack exceeds 1000000000-byte limit
    // fatal error: stack overflow
}

Maybe since encoding/json doesn't even attempt to check to check for infinitely nested data structures (the struct <-> map encoding in GoRethink is based on this package) the driver shouldn't either, after thinking about it some more I think it's up to the application to properly validate the data from users not the database driver.

What do you think?

@or-else
Copy link
Contributor Author

or-else commented May 21, 2015

In your example a developer has created an object that cannot be marshaled into a valid json. There is no way to receive valid json from a user then unmarshal it into this recursive structure. Panicking on a bug is fine.

The other case is different. You can receive a valid deeply nested json, unmarshal it into a map, pass it to gorethink and have it panic:

https://play.golang.org/p/k4kL6IhTEF

@kofalt
Copy link

kofalt commented May 21, 2015

I can sympathize with wanting to check recursion depth - I don't offhand know of a golang deserializer that behaves intelligently in the face of untrusted input - but punting on that responsibility seems entirely reasonable.

I would be comfortable with either leaving the check or removing it at your digression. User input is capable of causing exceptions, full stop. But an unconfigurable N-deep limit isn't perfect either.

@dancannon
Copy link
Collaborator

I have decided to remove the depth check from Expr, my reasoning for doing this instead of passing the error to Run is that this was quite a messy solution and to be honest the DB driver should not be responsible for validating your data structures.

This change will probably be released in v1.0.

@kofalt
Copy link

kofalt commented May 23, 2015

👍

@or-else
Copy link
Contributor Author

or-else commented May 24, 2015

I think this is reasonable. Thanks.

@or-else
Copy link
Contributor Author

or-else commented May 24, 2015

As for passing the error to Run, it's not a bad idea in general. You have a few places where you just drop the error. It would be nicer to pass it to to Run instead.

@dancannon
Copy link
Collaborator

Merged to develop.

@dancannon
Copy link
Collaborator

Merged to master.

@or-else
Copy link
Contributor Author

or-else commented Jun 8, 2015

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants