New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prepared Statements and Security #1608
Comments
The justifications are in the original issue i believe, one of them being that the drivers didn't support it - Not sure if that is the case anymore? Can you point to any reading material on bound parameters being inheritently more secure than framework escaping? |
For a fairly decent treatment of the topic: http://stackoverflow.com/questions/8263371/how-prepared-statements-can-protect-from-sql-injection-attacks This node library implements prepared statement support: https://github.com/sidorares/node-mysql2 The main gist here is that you are passing a 'template' to the server first and then you are passing the parameters to be used within the template separately. As a result, there is 0% chance of the server (assuming it is handling its arguments correctly) misinterpreting a value in a bound parameter as an instruction that should be executed as part of the template. Any library that implements parameter escaping instead is inherently less secure as it is possible that malicious 'instructions' could be inserted into the string that is ultimately passed to the server. |
We use node-mysql though, plus we have support for sqlite and postgres aswell. |
The link you provided seems to cover general SQL injection protection - I was hoping for resources digging into framework vs database. But as i said, i see your point ;) In any case the server is most likely implemented "better" than any framework. |
Wait, are you seriously resending the whole SQL for EVERY CRUD OPERATION? Do you realize that this means that the server will have to PARSE SQL on every operation, which is ridiculous and totally unnecessary? Not to mention the overhead to escape things and so on. It's unfortunate that Node.js ORM software is such a joke. Please fix yesterday. |
@bill-myers Awaiting your PR. |
@mickhanson if it still interests you. http://security.stackexchange.com/questions/3611/sql-injection-why-isnt-escape-quotes-safe-anymore From a theoratical point of view, prepare statements would prepare the access plans awaiting the parameters to be passed in, hence shouldn't be slow, as compared to javascript escaping the queries. And plus, it was viewed that it's never possible to cover the myriad of possibilities to inject given all possible permutations of sql constructs. Though general escaping would reduce such possibilities to near zero. Prepare statement is a solution that attacks the fundaments in a very common sense way. That is, if the thing that you were intending to do has changed so much after the parameters have been passed in, the third party must have passed in something that changed what you have intended. Thats quite rock solid. |
@mickhansen Overall I'm disappointed this is not considered a priority over other issues or features. The security benefits are well documented, personally I would think that would trump most if not all other issues outside of bugs breaking basic functionality. And then there are the performance benefits you get as a bonus. I get this is an open source project and if I really needed this I could submit a PR myself (which I don't, since I'm not using sequelize). But there is a difference between "we as maintainers don't have the time to implement new features like this" vs "we as maintainers have time to do some stuff, but we've decided this isn't important enough". I consider something that would improve security greatly and make the web a safer place of critical importance, so if other features are being prioritized over this it really bums me out. |
@efuquen I understand your sentiment. Can you quantify the exact security benefits? So far the single security vulnerability in sequelize has been from an injection into the column name, which prepared statenents wouldn't have done much again. As for performance benefits, i'm still actively looking for some documentation or benchmarks on that (other than that i'm apparently an idiot for not obviously accepting that fact). (also say switching to node-mysql2 rather than node-mysql might take a performance hit) I don't doubt that security would improve from a theoretical point of view, however from a practical point the effects are probably minimal. It's still a priority for us but this change is more a bottom-up refactor than a 2-3 hour bugfix/feature. "Important enough" is a dangerous term to use, there's a factor of 10 more people who think that getting polymorphic associations or JSONB implementated is more important than prepared statements. (although i guess in fairness they might not be running penetration tests on their codebase). @efuquen Your comment was appreciated while highly negative and doesn't really seem to take into account the resources needed to run a project of sort-of-decent size - But i've put investigating the effort needed to implement this for postgres, sqlite, mysql and mssql on my list: #3495 and am hopefully able to do a review soon. @mbroadst Does the MSSQL driver support prepared statements? |
@mickhansen sure does! I'll definitely take over migrating that over if you end up choosing a pattern for the other dialects. http://pekim.github.io/tedious/parameters.html for reference. |
I also believe that prepared statements can improved performance if used However, the key here is "if prepared statement can be reused". This If this can't be achieved, it could actually decrease performance a lot. The MySQL 2 driver supports 1 but not 2 ( The Postgres driver ( For what it's worth, I think that prepared statements support could be Again, you are doing great work with sequelize. I follow and use quite a Cheers, Manuel On 9 April 2015 at 01:26, Mick Hansen notifications@github.com wrote:
|
@mdarveau Prepared statements (atleast in the case of PG) can only be cached per connection - And since we use pooling which will often release and create new connections it's hard to tell what impact this would have :) |
And thanks for the kind words @mdarveau |
I've been following this discussion on the sideline, but I feel I should give my input as well. First of all let me say that I am completely behind mick on this one. We appreciate that the issue has been brought up, but we have lots of other things to tend to - We are looking into it, but we are not going to fix it yesterday. I see the theoretical security benefit, but not much other than that. As for performance I doubt there is much to gain. I can't see how any DBMS would implemented prepared statements that support @mdarveau's 2nd point. If the prepared statement can be used by several connections, how will the DBMS know when to garbage collect the statement? Of course it could rely on the user to explicitly call unprepare when the statement is no longer needed, but I doubt that. Pg explicitly states that prepared statements are per connection http://www.postgresql.org/docs/9.4/static/sql-prepare.html and I'm pretty sure mysql and mssql are the same. And thanks for your words @mdarveau :) |
At least for postgresql (and node-pg) there is a difference between "prepared statement" and "using bind parameter". A prepared statement is done by sending the "prepare" sql statement, and can then be used (on that connection) as often as needed. It would usually have parameters, but that is not a must. bind parameters can be used without explicit prepare.
will send "foo" as bind parameter (watch the pqsql log). Using bind parameter should be more secure than quoting and embedding in the sql. just to say, I would much like to see bind param to be used. |
@User4martin Agreed, the title is a bit misleading here - bind parameters would give the security improvement, not prepared statements ;) |
Yes, there are measurable improvements in database throughput and especially latency when you can reuse a query plan. It's been a few years, so I don't know where things are today, but around the time Oracle 10 came out we were trying to diagnose a throughput problem. Disk, CPU and Network were all well below red line, quite a head scratcher. It came down to the revelation that Oracle could only run a certain number of query plans simultaneously before they would queue. The culprit was the guy who wrote our most important search filter query. He didn't take us quite as seriously as I hoped when we carefully explained prepared statements to him, and so only about 2/3rds of the optional parameters were bound. We were flooding the query plan cache with unique queries and they were queuing up waiting for others to finish. Googling around it looks like perhaps Oracle still works this way. There may be some benefit from Postgres but they're a bit vague on the details. Can't speak for the others, but at scale this could matter quite a bit. |
There's also this article: https://securityblog.redhat.com/2015/05/20/json-homoiconicity-and-database-access/ This is kind of a big deal, as in "One of the classic blunders" big deal, and I can completely sympathize with @bill-myers for his brash "fix this yesterday". It's frustrating for us old guys to watch every crop of languages over multiple decades repeat the same security mistakes. Bind variables guarantee that the parameters are only ever interpreted as parameters, regardless of user treachery and the vagaries of the Unicode spec. You are working far too hard trying to emulate that behavior, and I'm curious to know how many of the library dependencies are only there for that purpose. |
@jdmarshall The performance improvements of reusing a query plan assumes you are constantly reusing the same connection (which is not necessarily the case with pooling). I can certainly sympathize with anyone that wants a change in an OSS library, it's not very often they put in any work for to help accomplish it though :) Using bound parameters is still a priority on our list, but this project is not light on requests and we don't have as much time as we'd like. |
@mickhansen I was only trying to say that prepared statements without bound parameters don't scale very high. Would it help if someone opened another ticket titled 'Bind Parameters for Security'? As @janmeier stated a while back, PSs in and of themselves don't do anything for security. In fact in the case I mentioned they gave a false sense of security because someone didn't follow procedure. I get that you have to have a stock answer for people shouting at you about their feature X that you guys are too lazy to implement, or whatever else people insinuate. But there's a big difference between people complaining about missing features and someone opening a ticket about a fundamental security issue in a widely deployed library. This ticket is 20, 21 months old? I think the message about priorities has been communicated clearly. I prefer your API designs over the other guys', but I personally can't peddle something while I know it's not architected for safety. I hope you do (and I think you can), but meanwhile I have to take care of my own projects. |
@jdmarshall Increased security would be the goal, and we're generally pretty quick about fixing security issues. I guess this one hasn't proven too immediate. What we need help with is a strategy for rebuilding all queries with bound parameters from the ground up. |
If node-mysql isn't using prepared statements/bind parameters, isn't it inherently vulnerable to SQL injection? Shouldn't that be a concern? |
@cullylarson not really It's definitely easier to make mistake in driver's implementation of escape/interpolation of parameters but thus far there are no known problems |
Similar to postgres, MSSQL also has a different between prepared statements and parameterized queries. Via tedious, queries are already being send as In MSSQL, it isn't just about security but also about memory usage and speed. Each query will be unique and generate it's own query plan. SQL Server does support forced-parameterization, which appears to help mitigate the issue, but a number of the generated queries include disqualifies. @mickhansen I was thinking about work-arounds that wouldn't require mass refactoring. One thought would be to replace most calls to |
@kjvalencik That might work |
node-sql correctly constructs parameterized queries for many dialects (I think all the ones sequelize tries to support). |
@taoeffect Using a separate query builder is highly impractical for us since we have an API that doesn't map to it. There's also no garantee that a separate query builder supports everything we need. But it's worth investigating |
Sqlite and postgres already has support for bind parameters in raw queries, thanks to @User4martin (#4688) - So all it takes for this to be fixed is for someone to begin converting the query generator to return |
@janmeier , I think this is an oversimplification. It's true that using bind parameters with non-prepared queries addresses one set of risks (SQL encoding bugs in sequelize itself). But the risk that applications process JSON data and thus run SQL queries that where never intended still remains. |
@fweimer, not sure how you could run unintentional queries when using bind parameters - My understanding is that bind parameters (which translate to parameterized queries in postgres and sqlite) do not allow queries. The query is planned before the parameters are inserted, and thus does not allow you to inject SQL, https://github.com/brianc/node-postgres/wiki/Prepared-Statements#parameterized-queries |
@janmeier, the structural alteration of the query happens before sequelize is called, in application code. The difference to regular SQL injection is that it's not about strings and missing boundaries, but about JSON data. |
@fweimer Oh, are you talking about raw queries? In that case yes, users can definitely be "fooled" to create malicious queries, but sequelize doesn't really have control over that - We can only help them by recommending bind parameters. For queries generated by sequelize on the other hand, we do control the queries, and can use parameterized queries properly - We don't do at the moment, but the infrastructure is there |
I think @fweimer is referring to using something like |
User.findAll({
where: req.body.where
}); We should be able to fix that by changing the query generator to use bind parameters - Since the query is planned with the placeholders, so even if |
@janmeier That doesn't fix the security issue, say you do: User.findAll({
where: {
id: req.body.id
}
});
Of course in my opinion that query should be scoped to some kind of ACL anyways so it becomes less of a problem if you have proper defensive coding in place. |
Ah yes, that is still an issue - But not something that bind paramters can solve :) |
Trying to handle I have to admit, I was a bit surprised to find that sequelize doesn't use parameterized queries by default. I understand that could be a significant overhaul, though. Just to make sure I understand, when using raw queries with bind, the native driver bind / parameterization capabilities do get utilized, correct? But they aren't utilized when using standard Model methods? |
Unexpected data in @dvlsg Correct - raw queries use bind in sqlite and postgres, sequelize metohods do not |
To be honest I am extremely surprissed that bind/prepared statements is not used for a modern SQL framework. It is much much safer, and will in practice prevent catastrophic failures. I appreciate this project and the fact that it is open source, but leaving security to the developers when it can be handled at the framework level, is not going to be a sound strategy. |
@holm I appreciate your feedback. It's simply just a massive refactor when it's not built-in from the ground up. But we've not forgotten it and are still thinking about how best to refactor. |
Just a sidenote (in case it affects anyone else!) There is discussion above about whether or not this would affect performance at all. I've a confirmed use-case where the design decision not to use prepared statements/parameterised queries does have a dramatic affect on performance, which I'm documenting here so that someone else doesn't have to re-research this behaviour! On MSSQL (tedious), with a SQL table such as:
And a matching model such as:
If I create an 'artifact' with a content field that is a Buffer which contains about 1.3MB of data, then it takes SQL server ~30seconds to round-trip. If I take the SQL sequelize generates to perform the create;
If however I re-write the sql used in my test app to be of the following form:
and provide the tedious request with the 'data' parameter containing my Buffer, the result is instantaneous (or near enough!) We spent some time analysing the SQL server side of things and we think it is something to do with SQL Server trying to find an optimised plan by matching the query directly (all the wait is cpu-based) but we're not sure why this would be (as when we replicate the sequelize SQL from directly within the SQL enterprise manager we cannot replicate the delay!) I appreciate this isn't directly helping move this bug forward, but it may help someone else struggling with why the persistance of 'BLOB' types in MSSQL under Sequelize is so slow! If you really want to subvert sequelize but still interact with shared transactions (ymmv!) I used the following code.
... But you almost certainly shouldn't be doing this ;) |
Very surprised there's no parameter support for mysql. So if anyone passes in say ';DELETE * from table;'. Then your data are all gone. I guess that's one way to get people to use postgres.. lol |
I just wanted to pop in here and weigh in since I've been using this library for the last 2 days and this immediately got my attention the moment I noticed in console that full query strings were being emitted. I was naturally very concerned and end up here. I have since done a penetration test on my application, and while the lack of parameterized queries is a flag raised in said testing, all sql injection attempts fail. My sequelize layer is hidden behind a graphql layer, so I have already one library which is doing type safety and input control on my inputs. Sequelize protects against sql level injections by doing escape work on all parameters. Is this ideal? Probably not, but it is effective. Through the API gateway, I have no injection exposure. So, basically, sanely structure your queries, and if you are really nervous, sanitize all user input yourself to remove sql keywords. That said, I do hope that the security benefit of a second lock on the database door (prepared/parameterized statements) is realized soon by this library. This will not however stop me from using it in production. |
@DraconPern Why would that be the case? Unless you're personally injecting raw statements into raw queries then sequelize won't allow bobby to drop tables. @lassombra Nobody is argueing that prepared/parameterized statements is a bad idea, it's mostly a matter of resources. |
@mickhansen As the original poster of this issue, I never thought it would see this much continued activity. I see that now in v3, Postgres and Sqlite (I think) both support parameterized queries at least in the raw query documentation. Is the same true when using Sequelize Models? I also noticed that it looks like Sequelize has migrated to MySQL2 which supports bound parameters but according to the documentation for the latest release of v3, MySQL prepared statements are still not supported. Could you give me a short rundown on what still remains to be done so that we could have prepared statements and server bound parameters for MySQL? Professionally I've been working on other projects in other languages for the past while, but this might be something I could take a look at tackling if I understood the scope of the problem. |
@mickhansen I think you misunderstood my statement perhaps. I am confident that the current implementation, when behind a service/api layer (as it is for me) and with sane service implementation is secure. I was simply throwing my voice in so as to voice support for prioritizing this whenever and however possible. I look forward to the conversion being complete. I know that it is no small matter to rebuild an entire query language system (having done so once myself in another language) so I accept that this isn't a fix yesterday thing, but I wanted to express that I still think it's a priority. That said, order/limit with includes is maybe a higher one? |
@noah-goodrich It's had quite the attention. We supported parameterized raw queries but generated queries don't currently make use of this. Basically all query generation has to be rewritten to return a query and a set of values, rather than embed values themselves. @lassombra Re-reading your comment i might have, just wanted to clarify that no maintainer is against the idea. |
Should this issue be re-opened? I see that INSERT support was added for bind but it's still missing from SELECT (I've noticed it from my own tests in UPDATE statements): #9431 This would help with tooling to group queries that are the same (with different params) and to prevent leaking of data to external tooling like pganalyze. Edit: seems to be tracked in various other issues/PRs and looks like support is coming: #14524 |
We do support bind parameters everywhere nowadays, I think. Its support is even better in v7 Prepared statements are also planned, but #3495 is the relevant issue for them. We have some draft design documents for them over here https://gist.github.com/ephys/50a6a05be7322c64645be716fea6186a#prepared-statements |
According to #998, the issue of prepared statements and parameter binding through the native mysql driver was considered and rejected previously.
I would like to reopen the issue as a security concern for queries being run against the server.
Frameworks in other languages use bound parameters natively because it is inherently more secure than doing the escaping in the framework and passing pure strings to the mysql server to process.
Can you explain the justification for not using prepared statements by default in Sequelize for security purposes?
The text was updated successfully, but these errors were encountered: