pushing predicate generation into backend (important security issue) #60
Comments
I've been thinking about how to fix this. In general, predicates are generated in two ways:
- initial predicates specified using the declarative model;
- predicates produced by the predicate function of a jump.
The backend knows both the initial predicates and the predicate functions. So to have the backend generate the predicates entirely, the client only needs to send the input of the predicate function, which is a data tuple. However, a hacker can still send an "evil tuple" that causes the predicate function to generate an evil predicate. So I think we need to validate the generated predicates, e.g. check that they do not contain semicolons.
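To make the "evil tuple" concern concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `county` table, the `predicate_function` helper, and the use of an in-memory SQLite database are assumptions for illustration, not this project's code. It suggests that a semicolon check on its own would not catch a tuple whose value closes the string literal early, and it contrasts that with bound parameters, which are a different technique from the validation proposed above.

```python
import sqlite3

def predicate_function(row):
    # Hypothetical predicate function: builds a filter from one tuple field
    # by plain string concatenation.
    return "state = '" + row["state"] + "'"

def has_semicolon(predicate):
    # The validation floated in the comment above: reject any semicolon.
    return ";" in predicate

def count_rows(conn, predicate):
    # Splices the generated predicate straight into the SQL text.
    sql = "SELECT COUNT(*) FROM county WHERE " + predicate
    return conn.execute(sql).fetchone()[0]

def count_bound(conn, state):
    # Alternative (not what the thread proposes): pass the tuple value as a
    # bound parameter so it is never parsed as SQL.
    sql = "SELECT COUNT(*) FROM county WHERE state = ?"
    return conn.execute(sql, (state,)).fetchone()[0]

def make_demo_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE county (name TEXT, state TEXT)")
    conn.executemany(
        "INSERT INTO county VALUES (?, ?)",
        [("Suffolk", "MA"), ("Kings", "NY"), ("Cook", "IL")],
    )
    return conn

conn = make_demo_db()
benign = predicate_function({"state": "MA"})
evil = predicate_function({"state": "MA' OR '1'='1"})

print(evil)                                   # contains no semicolon
print(has_semicolon(evil))
print(count_rows(conn, benign), count_rows(conn, evil))
print(count_bound(conn, "MA' OR '1'='1"))
```

In this toy setup the evil tuple widens the filter to every row while passing the semicolon check, which is one argument for validating the structure of the predicate rather than scanning for individual characters.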
nice. typically, best practice is to send no "code" from clients, only
fixed, parseable atomic values or very simple expressions.
one idea is to parse for simple (nested) function calls, then provide a
library of functions. for user-defined functions, require a prefix, e.g.
udf_myfunc1(), as in OR(AND(UDF_MYFUNC1(arg1, arg2), expr1, expr2)).
the parser is trivial, and the generator is safe if you whitelist the functions
using the prefixes, so they can't call system-internal functions that could
cause trouble or break out of this jail.
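A rough Python sketch of the idea in this reply, under stated assumptions: the `BUILTINS` set, the `UDF_` prefix check being case-insensitive, and the token rules (identifiers, numbers, single-quoted strings without quotes or backslashes) are all hypothetical choices, not anything defined by the project. It parses nested function calls and rejects any function name that is neither in the library nor carries the prefix.

```python
import re

BUILTINS = {"AND", "OR", "NOT", "EQ", "LT", "GT"}  # hypothetical function library
UDF_PREFIX = "UDF_"                                # hypothetical required prefix

TOKEN = re.compile(
    r"\s*(?:([A-Za-z_][A-Za-z0-9_]*)|(\d+(?:\.\d+)?)|'([^'\\]*)'|([(),]))"
)

def tokenize(text):
    # Split the input into (kind, value) tokens; anything else is an error.
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"unexpected character at offset {pos}")
        if m.group(1) is not None:
            tokens.append(("name", m.group(1)))
        elif m.group(2) is not None:
            tokens.append(("num", m.group(2)))
        elif m.group(3) is not None:
            tokens.append(("str", m.group(3)))
        else:
            tokens.append(("punct", m.group(4)))
        pos = m.end()
    return tokens

def parse(tokens):
    # Recursive descent over: expr := atom | NAME "(" [expr ("," expr)*] ")"
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("eof", "")

    def take():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expr():
        kind, value = take()
        if kind in ("num", "str"):
            return (kind, value)
        if kind != "name":
            raise ValueError(f"unexpected token {value!r}")
        if peek() != ("punct", "("):
            return ("ident", value)  # bare identifier, e.g. a column name
        fname = value.upper()
        if fname not in BUILTINS and not fname.startswith(UDF_PREFIX):
            raise ValueError(f"function {value!r} is not whitelisted")
        take()  # consume "("
        args = []
        if peek() != ("punct", ")"):
            args.append(expr())
            while peek() == ("punct", ","):
                take()
                args.append(expr())
        if take() != ("punct", ")"):
            raise ValueError("expected ')'")
        return ("call", fname, args)

    tree = expr()
    if pos != len(tokens):
        raise ValueError("trailing input after expression")
    return tree

def check(text):
    return parse(tokenize(text))

tree = check("OR(AND(UDF_MYFUNC1(arg1, arg2), expr1, expr2))")
```

A generator would then walk the returned tree and emit SQL only for the node types it knows, so a call such as `pg_sleep(10)` never gets past `check`.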
…On Thu, Mar 28, 2019 at 10:10 PM Wenbo Tao ***@***.***> wrote:
been thinking about how to fix this.
In general, predicates are generated in two ways:
- Initial predicates specified using the declarative model;
- produced by the predicate function of a jump
Backend knows the initial predicates and the predicate functions. So to
completely have the backend generate the predicates, we just need to have
the client send the input (which is a data tuple) of the predicate function.
However, it is still possible for a hacker to send an "evil tuple" which
causes the predicate function to generate an evil predicate. So I think we
need to do some validations of the predicates generated... e.g. do not
contain semicolons.
In the latest PR I basically did what you suggested: I wrote a parser that only allows predicates conforming to a format like OR(AND(col1='str', col2='str'), AND(...)). No UDFs are allowed at the moment, since no application so far has required them (most predicates we've seen are pk-fk).
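The parser from the PR is not shown in this thread, so the following is only a sketch of what a validator for the `OR(AND(col='str', ...), AND(...))` shape could look like in Python. The `ALLOWED_COLUMNS` set, the rule that string values may not contain quotes or backslashes, and the `to_sql` step that emits `?` placeholders are all assumptions added for illustration.

```python
import re

ALLOWED_COLUMNS = {"col1", "col2", "state_id"}  # hypothetical schema allowlist
CMP = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*)\s*=\s*'([^'\\]*)'\s*")

def parse_predicate(text):
    # Accepts only: OR( AND(col='str', ...) , AND(...) , ... )
    pos = 0

    def skip_ws():
        nonlocal pos
        while pos < len(text) and text[pos].isspace():
            pos += 1

    def expect(literal):
        nonlocal pos
        if not text.startswith(literal, pos):
            raise ValueError(f"expected {literal!r} at offset {pos}")
        pos += len(literal)

    def comparison():
        nonlocal pos
        m = CMP.match(text, pos)
        if not m:
            raise ValueError(f"expected col='str' at offset {pos}")
        if m.group(1) not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown column {m.group(1)!r}")
        pos = m.end()
        return (m.group(1), m.group(2))

    def conjunction():
        skip_ws()
        expect("AND(")
        items = [comparison()]
        while text.startswith(",", pos):
            expect(",")
            items.append(comparison())
        expect(")")
        skip_ws()
        return items

    skip_ws()
    expect("OR(")
    groups = [conjunction()]
    while text.startswith(",", pos):
        expect(",")
        groups.append(conjunction())
    expect(")")
    skip_ws()
    if pos != len(text):
        raise ValueError("trailing input after predicate")
    return groups

def to_sql(groups):
    # Rebuild the WHERE clause from the parsed structure, with the string
    # values returned separately as bound parameters.
    clauses, params = [], []
    for group in groups:
        clauses.append("(" + " AND ".join(f"{col} = ?" for col, _ in group) + ")")
        params.extend(value for _, value in group)
    return " OR ".join(clauses), params

groups = parse_predicate("OR(AND(col1='a', col2='b'), AND(col1='c'))")
sql, params = to_sql(groups)
```

Because the SQL is regenerated from the parsed structure instead of reusing the incoming text, anything outside the grammar is rejected before it can reach the database. This sketch cannot represent values that contain a single quote, which a real implementation would need to handle.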
nice!!
Right now the predicate functions are run in the client, and the resulting predicates are then passed to the backend. This invites SQL injection.
The tentative solution is to move predicate generation into the backend. It's also necessary to write a parser to ensure the generated predicates are in "good form".