Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fully deprecate textual coercion in where(), filter(), order_by(), add docnotes not to pass untrusted input to thsee #4481

Closed
21k opened this issue Jan 31, 2019 · 35 comments

Comments

@21k
Copy link

21k commented Jan 31, 2019

@zzzeek
1、SQLAlchemy through 1.2.17 and 1.3.x through 1.3.0b2 allows SQL Injection via the order_by parameter.
2、exp

  • save the code to local: https://github.com/21k/sqlalchemy/blob/master/dal.py
  • exec code at shell terminal
    ---------------------------------
    python dal.py 'if(1=1,create_time,username)'
    python dal.py 'if(1=2,create_time,username)'
    python dal.py 'create_time'
    ---------------------------------
  • the vul happens at fun order_by

3、reserved by CVE-2019-7164

@zzzeek
Copy link
Member

zzzeek commented Jan 31, 2019

hi there -

why do you consider order_by() to be something that would receive untrusted user input? Please be specific why these are not also CVEs:

    stmt = select([table]).where(<string>)
   
    sqlalchemy.text(<string>)

    connection.execute(<string>)

    exec(<string>)

Mitigations for "SQL injection" deal completely with statement parameters (see https://en.wikipedia.org/wiki/SQL_injection#Mitigation), and not the syntax of the statement itself. SQL injection is not expected to be a vulnerability at the syntax level.

All elements of the select() object accept plain SQL as an argument which is assumed to be programmatic input. This has been the explicit contract of these methods for many years. In recent versions they emit a warning when the coercion to text() occurs, but we haven't created a formal deprecation schedule, although the 1.3 series is a good place to do that (see #4393). But we can't just take it out retroactively because it would break existing code. We can certainly add documentation and make the warning messages more dire, but I wonder why Python's own docs at https://docs.python.org/3/library/functions.html#exec say nothing similar for exec().

@zzzeek zzzeek added the sql label Jan 31, 2019
@21k
Copy link
Author

21k commented Feb 1, 2019

1、In some website, developers use dynamic sort operation like this

2、If we don't remid developers the potential risks of using untrusted input at order_by ,sql injection may occur.
3、The sqlalchemy demos for python framework like django、flask just use order_by(raw_user_input) to show the sort functions ,developers may think the instructions are safe and sure of security.Most of them think the use of orm is 100% guard against sql injection.
4、So can we add a fun like 'order_by_safe' which receive field only in models

@kamikaze
Copy link

kamikaze commented Feb 1, 2019

never trust input data

@21k
Copy link
Author

21k commented Feb 1, 2019

never trust input data

It is truth.
But most developers and even adepts may forget this principle sometimes,or they just misunderstand that the use of orm is safe.
In front end,frameworks like vue、react、angular ,renders of dom use {{}} which is sure of guard against xss.They also provide trustAsHtml、dangerouslySetInnerHTML for unsafe scenes.

@zzzeek
Copy link
Member

zzzeek commented Feb 1, 2019

the steps to take here are:

  1. promote the warnings that are currently emitted to deprecation messages, so the text coercion can be removed in 1.4 (it is likely not used by now anyway since it has been emitting a warning for a couple of years now).

  2. add deprecation notes to the docstrings for all of order_by(), where(), etc. and everywhere else that text coercion is occurring here and that it is deprecated

  3. the deprecation note will have a sub-section that is a dragon "warning" that untrusted user input should not be passed.

  4. if there is flask documentation that is illustrating the passing of what is explicitly noted to be raw user input from a web form directly into order_by(), you need to show me that because that is an extraordinary claim. THAT is your CVE.

@zzzeek zzzeek added this to the 1.3 milestone Feb 1, 2019
@zzzeek zzzeek changed the title sql injection via the order_by parameter fully deprecate textual coercion in where(), filter(), order_by(), add docnotes not to pass untrusted input to thsee Feb 1, 2019
@zzzeek
Copy link
Member

zzzeek commented Feb 1, 2019

never trust input data

It is truth.
But most developers and even adepts may forget this principle sometimes,or they just misunderstand that the use of orm is safe.
In front end,frameworks like vue、react、angular ,renders of dom use {{}} which is sure of guard against xss.They also provide trustAsHtml、dangerouslySetInnerHTML for unsafe scenes.

in a front-end JS framework, EVERYTHING is untrusted user input. The source code can be directly rewritten by a hostile application in an adjoining window. There's no comparison between a front-end framework guarding against XSS, which can literally be anywhere, and a database driver that must ensure that parameters are bound.

@zzzeek
Copy link
Member

zzzeek commented Feb 1, 2019

I've googled a few flask tutorials and the official flask documentation and all examples of order_by() illustrate an ORM-mapped column object being passed in, no strings. People generally don't even know these methods accept strings.

@21k
Copy link
Author

21k commented Feb 1, 2019

I've googled a few flask tutorials and the official flask documentation and all examples of order_by() illustrate an ORM-mapped column object being passed in, no strings. People generally don't even know these methods accept strings.

flask sqlalchemy also has this problem https://github.com/21k/sqlalchemy/blob/master/flask/__init__.py

@sqla-tester
Copy link
Collaborator

Mike Bayer has proposed a fix for this issue in the master branch:

Remove all remaining text() coercions https://gerrit.sqlalchemy.org/1126

@zzzeek
Copy link
Member

zzzeek commented Feb 6, 2019

hi did you also submit CVE-2019-7548 ? why is there no bug for that one here?

@21k
Copy link
Author

21k commented Feb 11, 2019

hi did you also submit CVE-2019-7548 ? why is there no bug for that one here?

not me

@zzzeek
Copy link
Member

zzzeek commented Feb 20, 2019

well, you and whoever else posted that CVE have lit up the whole internet so just the group_by and order_by being blocked I think I should just release in 1.2.19.

@Beuc
Copy link

Beuc commented Mar 11, 2019

I see that Debian LTS and RHEL 7 ship sqlalchemy-0.9.8, and both mark it as vulnerable (it is).
However 30307c4 builds on prior de-coercion work - I tried backporting it today for Debian LTS and the test suite doesn't like it much.
Do you know is there's any plan on updating this version?
Or would you recommend an alternate fix?

@zzzeek
Copy link
Member

zzzeek commented Mar 11, 2019

0.9.8 is EOL for many years on the upstream side. RHEL will be porting patches internally (UPDATE: they are "wontfix"). All the tests that assume this functionality need to be changed or removed. It will also cause breakage for any applications that are dependent on the behavior.

@zzzeek
Copy link
Member

zzzeek commented Mar 11, 2019

to patch the CVEs you only need:

this line:

30307c4#diff-9deda8c7c10c7034cd9463f530ca81f0R769

and this method:

30307c4#diff-e9e388f99dea16cf0a1ad8f227cf6920R4495

nothing else is part of the CVE.

@Beuc
Copy link

Beuc commented Mar 11, 2019

Hmmm, 0.9.8 lacks visit_textual_label_reference and AFAICS all the "Textual*" prior work.
Any ETA for RHEL?

@zzzeek
Copy link
Member

zzzeek commented Mar 11, 2019

they are assigned to me so sometime in the coming weeks, but if the change is too involved in the 0.9 series if might not be worth it. note these "CVEs" are against an API feature that has been part of SQLAlchemy's tutorial for over ten years - all components of the select() construct accept plain text. Only in 1.0 was it changed to favor the use of text() over plain string and a warning was emitted if text() was omitted.

@Beuc
Copy link

Beuc commented Mar 11, 2019

Thanks a lot for the detailed feedback!

@sqla-tester
Copy link
Collaborator

Mike Bayer referenced this issue:

Illustrate fix for #4481 in terms of an 0.9 patch https://gerrit.sqlalchemy.org/1165

@zzzeek
Copy link
Member

zzzeek commented Mar 12, 2019

Alrighty, the 0.9 version is illustrated at https://gerrit.sqlalchemy.org/1165. There is the significant caveat that it breaks compatibility with the textual use case, that is order_by("some expresion") is passed through. However this patch can be applied to the downstream distributions if they wish to disallow the entire feature, which might be fine considering this is a little known feature in any case.

It also breaks compatibility with a subset of that feature that is still supported today, which is where select([table.c.col1]).order_by("col1") will match up the string "col1" to the table.c.col1 object. That capability is still used in the most recent 1.3 series; it's "safe" because the given string is not passed through for render, it is matched against a column expression elsewhere in the select() which is used instaed. that is, it's just like saying select([table.c.col1]).order_by(table.c['col1']).

with this patch, none of the above features are allowed and a blanket exception is raised requesting that the text() construct be used, e.g. order_by(text("some expression")).

@Beuc
Copy link

Beuc commented Mar 12, 2019

Hi!

I tested f128dd6.diff.base64 and it seems the POC still works though:

root@jessie-lts:~/sqlalchemy# python dal.py username
/usr/lib/python2.7/dist-packages/sqlalchemy/ext/declarative/api.py:55: SAWarning: Ignoring declarative-like tuple value of attribute username: possibly a copy-and-paste error with a comma left at the end of the line?
  _as_declarative(cls, classname, cls.__dict__)
['dal.py', 'username']
raw_sql:SELECT "user".id AS user_id, "user".create_time AS user_create_time 
FROM "user" ORDER BY username
3 2019-03-11 11:48:08
1 2019-03-11 11:42:19
2 2019-03-11 11:44:30
root@jessie-lts:~/sqlalchemy# python dal.py 'if(1=1,create_time,username)'
/usr/lib/python2.7/dist-packages/sqlalchemy/ext/declarative/api.py:55: SAWarning: Ignoring declarative-like tuple value of attribute username: possibly a copy-and-paste error with a comma left at the end of the line?
  _as_declarative(cls, classname, cls.__dict__)
['dal.py', 'if(1=1,create_time,username)']
raw_sql:SELECT "user".id AS user_id, "user".create_time AS user_create_time 
FROM "user" ORDER BY if(1=1,create_time,username)
1 2019-03-11 11:42:19
2 2019-03-11 11:44:30
3 2019-03-11 11:48:08
root@jessie-lts:~/sqlalchemy# python dal.py 'if(1=2,create_time,username)'
/usr/lib/python2.7/dist-packages/sqlalchemy/ext/declarative/api.py:55: SAWarning: Ignoring declarative-like tuple value of attribute username: possibly a copy-and-paste error with a comma left at the end of the line?
  _as_declarative(cls, classname, cls.__dict__)
['dal.py', 'if(1=2,create_time,username)']
raw_sql:SELECT "user".id AS user_id, "user".create_time AS user_create_time 
FROM "user" ORDER BY if(1=2,create_time,username)
3 2019-03-11 11:48:08
1 2019-03-11 11:42:19
2 2019-03-11 11:44:30

(I expected something like Textual SQL expression 'if(1=2,create_time,userna...' should be explicitly declared as text('if(1=2,create_time,userna...'.)

The test suite (py.test --db sqlite --db postgresql --db mysql) passes OK on Debian Jessie (save for a few mysql unicode errors that were already there prior patching).
Only I had to fix a missing ) in this patched test, before .all():

diff --git a/test/orm/test_query.py b/test/orm/test_query.py
index 444bd3c..337bb91 100644
--- a/test/orm/test_query.py
+++ b/test/orm/test_query.py
@@ -1349,7 +1349,8 @@
         q2 = s.query(User).filter(User.name == 'fred').with_labels()
         eq_(
             s.query(User).from_statement(union(q1, q2).
-            order_by('users_name')).all(), [User(name='ed'), User(name='fred')]
+            order_by(text('users_name')).all(),
+            [User(name='ed'), User(name='fred')]
         )

@zzzeek
Copy link
Member

zzzeek commented Mar 12, 2019

0.9 does this more crappily, missed a few, one moment

@zzzeek
Copy link
Member

zzzeek commented Mar 12, 2019

OK please see patch set 2 at https://gerrit.sqlalchemy.org/1165

@Beuc
Copy link

Beuc commented Mar 12, 2019

The POCs are now blocked.
The patch didn't apply cleanly on 0.9.8 for one test but that was easily fixable.
I get more test failures though (https://pastebin.com/HvjHbXaY), those are missing text() AFAICS.
I packaged this and I'll advertise it for users to test https://people.debian.org/~beuc/lts/sqlalchemy/

@zzzeek
Copy link
Member

zzzeek commented Mar 12, 2019

FYI we just had our first official "someone was relying on this behavior" report in #4538, depite that it's been emitting a warning for four years, so this might break a very small portion of downstream applications

@sqla-tester
Copy link
Collaborator

Mike Bayer referenced this issue:

Illustrate fix for #4481 in terms of a 1.2 patch https://gerrit.sqlalchemy.org/1184

@sanderfoobar
Copy link

I was hoping to find a gaping hole in SQLa, instead I see a wannabe sec researcher trying to make a name for himself. Boooo.

Sorry for off-topic.

@ccaapton
Copy link

ccaapton commented Jul 12, 2019

Will there be any mechanism for whitelisting some functions?

I'm using clickhouse with parametric functions. In version 1.2, I can use func.__getattr__('quantiles(0.9)')(col) to represent the query, and sqlalchemy will render it as quantiles(0.9)(col), but now it becomes "quantiles(0.9)"(col) thus no-longer a valid query in clickhouse. Do you have any suggestion on this?

@zzzeek
Copy link
Member

zzzeek commented Jul 12, 2019

here is a new hack, however, it is not guaranteed to remain functioning, since this is overall not how func was intended to be used:


from sqlalchemy import func
from sqlalchemy.sql import quoted_name

print(
func.__getattr__(quoted_name('quantiles(0.9)', False))('foo')
)

we right now don't have a built-in construct for a first-class SQL function like that, the most legit way to implement this in your project would be to use @compiles with GenericFunction:

from sqlalchemy import column
from sqlalchemy import func
from sqlalchemy import literal_column
from sqlalchemy.ext.compiler import compiles
from sqlalchemy.sql import functions



class quantiles(functions.GenericFunction):
    def __init__(self, arg, expr):
        self.arg = arg
        super(quantiles, self).__init__(expr)

@compiles(quantiles)
def compile(element, compiler, **kw):
    return "quantiles(%s)(%s)" % (
        compiler.process(element.arg, **kw),
        compiler.process(element.clauses, **kw),
    )


print(func.quantiles(literal_column(0.9), column('x')))

see https://docs.sqlalchemy.org/en/13/core/functions.html?highlight=genericfunction#sqlalchemy.sql.functions.GenericFunction

@tongxiaoge1001
Copy link

tongxiaoge1001 commented Jul 12, 2021

I find the MR (https://gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/1184/) is abandoned, So how do you fix this bug in the rel_1_2 branch?@zzzeek

@zzzeek
Copy link
Member

zzzeek commented Jul 12, 2021

1.2 is EOL. if you are stuck on 1.2, use the warnings filter to promote the warning to an error.

@tongxiaoge1001
Copy link

1.2 is EOL. if you are stuck on 1.2, use the warnings filter to promote the warning to an error.

Can I use the patch (https://gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/1184/ ) to fix CVE-2019-7164 and CVE-2019-7548 ? I use sqlalchemy 1.2.19.

@zzzeek
Copy link
Member

zzzeek commented Jul 14, 2021

sure

@tongxiaoge1001
Copy link

sure

ok, thanks for your quick reply

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants