pushing predicate generation into backend (important security issue) #60

tracyhenry · 2019-02-28T19:17:10Z

Right now the predicate functions run in the client, and the resulting predicates are then passed to the backend. This invites SQL injection...

A tentative solution is to move predicate generation back into the backend. It is also necessary to write a parser that ensures the generated predicates are in "good form".

tracyhenry · 2019-03-29T02:10:21Z

been thinking about how to fix this.

In general, predicates are generated in two ways:

- initial predicates specified using the declarative model;
- predicates produced by the predicate function of a jump.

The backend knows the initial predicates and the predicate functions. So to have the backend generate the predicates entirely, the client only needs to send the input of the predicate function (which is a data tuple).

However, it is still possible for an attacker to send an "evil tuple" that causes the predicate function to generate an evil predicate. So I think we need to validate the generated predicates, e.g. check that they do not contain semicolons.
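A minimal sketch of the idea above, not the project's actual code: the backend owns the predicate function and the client sends only the data tuple. The names `jump_predicate`, `IDENTIFIER`, and `team_id` are hypothetical, and the `%s` placeholder assumes a DB-API driver in the style of psycopg2. The tuple's value is returned as a bind parameter and never spliced into the SQL text, so an "evil tuple" stays plain data.

```python
import re

# Hypothetical rule: column names must be plain SQL identifiers.
IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def jump_predicate(row, key_column):
    """Hypothetical backend-side predicate function for a pk-fk jump.

    `row` is the data tuple sent by the client (a dict here). Returns the
    predicate text with a placeholder, plus the list of bind parameters.
    """
    if not IDENTIFIER.match(key_column):
        raise ValueError(f"bad column name: {key_column!r}")
    if key_column not in row:
        raise ValueError(f"tuple has no field {key_column!r}")
    # The value travels separately from the SQL text.
    return f"{key_column} = %s", [row[key_column]]


# An "evil tuple": the payload ends up in params, not in the predicate.
evil_row = {"team_id": "1'; DROP TABLE teams; --"}
clause, params = jump_predicate(evil_row, "team_id")
print(clause)  # team_id = %s
```

Note that a semicolon check alone would not be enough when values are concatenated into the predicate: a value such as `x' OR '1'='1` contains no semicolon but still changes the meaning of the query.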

asah · 2019-03-29T04:06:03Z

Nice. Typically, best practice is to send no "code" from clients, only fixed, parseable atomic values or very simple expressions. One idea is to parse for simple (nested) function calls, then provide a library of functions. For user-defined functions, require a prefix such as udf_myfunc1(), e.g. OR(AND(UDF_MYFUNC1(arg1, arg2), expr1, expr2)). The parser is trivial, and the generator is safe if you whitelist the functions using the prefixes, so callers can't invoke system-internal functions that could cause trouble or break out of this jail.
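The suggestion above could be sketched roughly as follows. This is an illustration under stated assumptions, not code from the project: `ALLOWED`, `UDF_PREFIX`, `tokenize`, `parse`, and `parse_expression` are hypothetical names, atoms are limited to bare identifiers, and the built-in library is assumed to contain only AND and OR.

```python
import re

ALLOWED = {"AND", "OR"}   # hypothetical library of built-in functions
UDF_PREFIX = "UDF_"       # prefix required for user-defined functions

# A token is an identifier or one of the characters ( ) ,
TOKEN = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*|[(),])")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse(tokens, i=0):
    """Parse one expression at tokens[i]; return (tree, next_index)."""
    if i >= len(tokens) or tokens[i] in {"(", ")", ","}:
        raise ValueError("expected a name")
    name = tokens[i]
    i += 1
    if i < len(tokens) and tokens[i] == "(":
        upper = name.upper()
        # Whitelist check: only library functions or prefixed UDFs.
        if upper not in ALLOWED and not upper.startswith(UDF_PREFIX):
            raise ValueError(f"function not allowed: {name}")
        args = []
        i += 1
        while True:
            arg, i = parse(tokens, i)
            args.append(arg)
            if i >= len(tokens):
                raise ValueError("missing ')'")
            if tokens[i] == ")":
                return (upper, args), i + 1
            if tokens[i] != ",":
                raise ValueError(f"expected ',' or ')', got {tokens[i]!r}")
            i += 1
    return name, i  # a plain atom such as arg1


def parse_expression(text):
    tokens = tokenize(text)
    tree, end = parse(tokens)
    if end != len(tokens):
        raise ValueError("trailing input after expression")
    return tree


tree = parse_expression("OR(AND(UDF_MYFUNC1(arg1, arg2), expr1, expr2))")
```

Because SQL would then be generated only from the parsed tree, a call to anything outside the whitelist is rejected before it reaches the database.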

tracyhenry · 2020-05-05T15:39:54Z

In the latest PR I basically did what you suggested: I wrote a parser that only allows predicates conforming to a format like OR(AND(col1='str', col2='str'), AND(...)). No UDFs are allowed at the moment, since no application so far has required them (most predicates we've seen are pk-fk).
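For illustration only, a check for the format described above might look like the sketch below. It is not the parser from the PR. It assumes exactly two levels (one OR over one or more AND groups), plain identifiers as column names, and string literals that contain neither single quotes nor backslashes. `LEAF`, `AND_GROUP`, `PREDICATE`, and `is_well_formed` are hypothetical names.

```python
import re

# col='str' where the literal has no single quote and no backslash
LEAF = r"[A-Za-z_][A-Za-z0-9_]*\s*=\s*'[^'\\]*'"
# AND(leaf, leaf, ...)
AND_GROUP = rf"AND\(\s*{LEAF}(?:\s*,\s*{LEAF})*\s*\)"
# OR(AND(...), AND(...), ...)
PREDICATE = re.compile(rf"OR\(\s*{AND_GROUP}(?:\s*,\s*{AND_GROUP})*\s*\)")


def is_well_formed(predicate):
    """Return True only if the whole string matches the restricted format."""
    return PREDICATE.fullmatch(predicate.strip()) is not None


ok = is_well_formed("OR(AND(col1='a', col2='b'), AND(col3='c'))")
injected = is_well_formed("OR(AND(col1='a'; DROP TABLE t; --'))")
```

Using `fullmatch` matters here: a partial match would accept a valid prefix followed by arbitrary SQL.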

asah · 2020-05-05T16:19:10Z

nice!!

tracyhenry added bug front-end labels Feb 28, 2019

tracyhenry self-assigned this Mar 29, 2019

tracyhenry added a commit that referenced this issue Apr 22, 2020

fixes #60.

ee818e7

tracyhenry mentioned this issue May 5, 2020

bug fix laundry list #147

Merged

tracyhenry closed this as completed in #147 May 6, 2020
