New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

true server-side Prepared Statements #274

Open
jokeyrhyme opened this Issue Aug 14, 2012 · 83 comments

Comments

Projects
None yet
@jokeyrhyme

jokeyrhyme commented Aug 14, 2012

I noticed that Prepared Statements seem to be emulated client-side by escaping certain characters.

Any plans to fully support service-side Prepared Statements? This can be done via the binary protocol, but there's a slower SQL-based approach available for non-binary clients:
http://dev.mysql.com/doc/refman/5.5/en/sql-syntax-prepared-statements.html

@felixge

This comment has been minimized.

Collaborator

felixge commented Aug 14, 2012

Yes, prepared statements are on my todo list. I don't need them myself, so unfortunately they kind of linger at the bottom of the list unless somebody wants to sponsor some of my time to work on the feature.

That being said, the SQL based approach looks interesting as a stop-gap solution for the short term.

@jokeyrhyme

This comment has been minimized.

jokeyrhyme commented Aug 14, 2012

The only downside with the SQL-based approach is that you probably still end up needing to do client-side escaping. Still it does offer a little bit more structure, so it might still buy some protection. Depending on how you do it, it might also simplify the escaping part.

Unless I'm mistaken, you are already implementing the actual protocol at the lower levels of your driver. I wonder how much more you need at that level to finish?
http://dev.mysql.com/doc/internals/en/command-packet-details.html

@felixge

This comment has been minimized.

Collaborator

felixge commented Aug 14, 2012

Prepared statements use a range of additional packets that are currently not implemented by my driver:

I have not yet analyzed how much work it would be to implement them, but my gut feeling is ~5 days of work.

@jokeyrhyme

This comment has been minimized.

jokeyrhyme commented Aug 14, 2012

How does https://github.com/sidorares/nodejs-mysql-native handle this? Any reason not to just borrow parts of the way it's done over there?

I'm still somewhat struggling with the number of different MySQL drivers for Node.JS. I think Node makes it way too fun to write network protocol code. :P Maybe in a year or so the community will have coalesced around one or two really solid libraries.

@felixge

This comment has been minimized.

Collaborator

felixge commented Aug 15, 2012

How does https://github.com/sidorares/nodejs-mysql-native handle this?

It seems to implement the parts of the protocol that are required for prepared statements.

Any reason not to just borrow parts of the way it's done over there?

Yes, I didn't have the time to work on this yet. I'm also not in the business of copying code unless it's up to my personal coding standards. So even with good inspiration like this, it will still take me some time.

Maybe in a year or so the community will have coalesced around one or two really solid libraries.

This library is solid. It just does not implement all features.

@dresende

This comment has been minimized.

Collaborator

dresende commented Nov 19, 2012

Couldn't we just prepare and execute statements using SQL instead of raw packets?

@jokeyrhyme

This comment has been minimized.

jokeyrhyme commented Nov 19, 2012

@dresende the SQL method still winds up tampering with values to make them safe (escaping quotes, etc), whereas the protocol method explicitly separates query from values so tampering is not necessary. To be fair, as long as its impossible to smuggle a query in as a value, the driver is plenty secure enough. I suppose I'm just being a nitpicky ex-PHP developer who wants everything to be conceptually elegant. :P

@dresende

This comment has been minimized.

Collaborator

dresende commented Nov 21, 2012

I just asked because I don't know the protocol deeply, but besides that it seems easy to implement based on the current module structure. I just don't have the knowledge about the protocol and documentation is scarce..

@fzaninotto

This comment has been minimized.

fzaninotto commented Jan 30, 2013

I also need this feature from a security point of view. Your implementation of parameter escaping in JS only is naive (no harm intended - I've written things like that myself in the past), and I'm pretty sure it's open to any semi sophisticated SQL injection bot (e.g. using UTF-8 escape strings). MYSQL Prepared statements uses MYSQL own escaping subsystem, which has been refined for years, and is maintained by security experts.

Support for prepared statements and parameter binding is a must if you want your library to be used in corporate world.

@felixge

This comment has been minimized.

Collaborator

felixge commented Jan 30, 2013

I also need this feature from a security point of view. Your implementation of parameter escaping in JS only is naive (no harm intended - I've written things like that myself in the past), and I'm pretty sure it's open to any semi sophisticated SQL injection bot (e.g. using UTF-8 escape strings).

If you care about security, please don't spread FUD.

Support for prepared statements and parameter binding is a must if you want your library to be used in corporate world

I see that you're a prolific open source contributor yourself, so you should know that this isn't exactly the most successful strategy for requesting features. Of course this is an important feature and needs implementing, but unless somebody steps up to the challenge, it won't happen anytime soon.

@fzaninotto

This comment has been minimized.

fzaninotto commented Jan 30, 2013

Again, no harm intended. I don't know your security background, and both your past answers in this issue and the current implementation of parameter escaping made me think that you may not care about this as much as it deserves to. If I was wrong, sorry to have offended you.

So I understand that you care about this issue but won't implement it yourself, is that correct? I am surprised you don't put this one on top of your list, but maybe you're a much better expert in security that me and I overestimate the risks.

@felixge

This comment has been minimized.

Collaborator

felixge commented Jan 30, 2013

and the current implementation of parameter escaping made me think that you may not care about this as much as it deserves to.

This is what I mean by FUD. Please point out an actual attack vector. If it exists, it will be fixed.

So I understand that you care about this issue but won't implement it yourself, is that correct? I am surprised you don't put this one on top of your list, but maybe you're a much better expert in security that me and I overestimate the risks.

As I said before, I don't have any time to hack on this module right now / don't need this feature myself. This says nothing about the importance of the feature, it's just my current situation. If somebody was to sponsor the development costs, I could make room in my schedule, otherwise this feature won't land until somebody else contributes a patch.

but maybe you're a much better expert in security that me and I overestimate the risks.

I'm not a security expert. But I implemented the same escaping approach outlined by the MySQL documentation, which is also used by libmysql as well as mysqlnd. Afaik, it is correct and secure.

Additionally, multiple statement execution is disabled by default in this module, this limits the impact of any possible exploit could have.

@cblage

This comment has been minimized.

cblage commented Feb 22, 2013

PHP has had tons of different vectors of SQL injection attacks before everyone started using prepared statements by default. Escaping is better left for MySQL's engine, rather than trying to reinvent the wheel on the client-side (in this case, the client is the app making the query).

For example, if I remember correctly, in PHP one attack vector involved the escaping mechanism trying to escape strings for a certain encoding, and the connection using a slightly different encoding, causing some characters to not be escaped properly for what MySQL was expecting.

Prepared statements are also important for queries that are re-used a lot. There are significant performance gains to he bad by using the same prepared statement multiple times rather than the fully query every time.

@cblage

This comment has been minimized.

cblage commented Feb 22, 2013

Some literature on the encoding thing:
http://security.stackexchange.com/questions/9908/multibyte-character-exploits-php-mysql
http://stackoverflow.com/questions/5741187/sql-injection-that-gets-around-mysql-real-escape-string/12118602#12118602

I'm not a accusing the library of being bad / insecure, but PHP battled with SQL injection problems for many years. The only "real" solution they came up with after all these years is using prepared statements.

@efuquen

This comment has been minimized.

efuquen commented Feb 22, 2013

So there is event more issues then security with this, which regardless should obviously be the most important reason for implementing this. I've done my own personal benchmarks vs mysql-native, which has prepared statements, and this library and it's over twice as performant, that's a pretty big jump. I'm not sure if this will convince you more that this might be a feature you need, honestly I think the whole security aspect should do it alone IMO.

I'm sympathetic to the idea of if you want this feature, submit a patch. But I have a few issues with the attitude taken here:

  1. This isn't some minor feature. A lot of people, myself included, would consider prepared statements essential to any driver considering itself to be mature and stable. As of now, from what I can see this is the main driver the node community seems to be coalescing around. To not have prepared statements be a priority I think is a real problem because of that.

  2. You haven't really abandoned this project, from what I can see. There is active development going on, there is a 2.0 alpha 7 version. From the commit logs it doesn't seem you're doing it yourself now, so I would be interested to see the developers who are actually actively working on this chime in. Stemming from the point above I see it as a real concern that a 2.0 version of mysql driver is being developed and prepared statements just aren't "that important" to be included.

  3. If you care above the Node js community then you should care about this issue. If you care about security or radically increasing the performance of your code base then you should care about this issue. If you don't care about any of the above I'm a little concerned for the Node.js community in general, if this is the attitude essential libraries and their maintainers are going to take.

Yes, I'm free to not use this code, I'm free to not use Node.js in general. But if you care about actually improving the community and improving the quality of code out there on node js then the attitude shouldn't be either pay me or write it yourself. In the end this was your project and you decided to contribute back and still have developers actively working on it. And I think if you really want to keep contributing to the community you should reevaluate your attitude to this sort of issue.

@dresende

This comment has been minimized.

Collaborator

dresende commented Feb 22, 2013

I think everyone is convinced but none of us is payed to do it. We have other projects (some give as money, others give us joy..), so we can't be 100% on this. If anyone wants to step up and do something, just do it and make a pull request. It doesn't need to be perfect or even complete, we can improve it then. Feel free to be part of that community and actually improve the quality of this code.

@dresende dresende closed this Feb 22, 2013

@cblage

This comment has been minimized.

cblage commented Feb 22, 2013

Why are you closing this issue? You don't think not supporting prepared statements is an issue? People can still contribute to this issue and discuss it. If you close it, someone else will eventually open a duplicate of it.

@dresende

This comment has been minimized.

Collaborator

dresende commented Feb 22, 2013

Supporting prepared statements is not an issue. Having someone do it is an issue. Since when someone does something about it and does a pull request, a new issue is opened, this is not an issue anymore.

@cblage

This comment has been minimized.

cblage commented Feb 22, 2013

I don't think anyone here is demanding that you drop everything and implement prepared statements. But I do feel like the lack of prepared statements is a bigger issue than you guys are making it out to be, and closing the ticket on github seems like an attempt to sweep this under the proverbial rug instead leaving it open to encourage a transparent and constructive discussion, inducing collaboration / contributing with code. Just my 2 cents.

@felixge felixge reopened this Mar 7, 2013

@felixge

This comment has been minimized.

Collaborator

felixge commented Mar 7, 2013

@dresende I'm re-opening this. The open github issues should list all reasonable suggestions for improvements, this is one of them.

@efuquen please go away. People arguing that me not implementing this feature in my spare time with virtually no benefits for myself is the same as not caring about the community is incredibly hurtful. I've poured my heart and soul into this library, and yes, I didn't get to implement every feature yet, and yes, this is unfortunately a very important one, but seriously, it's people like you that make me regret giving away so much of my work for free sometimes

@efuquen

This comment has been minimized.

efuquen commented Mar 14, 2013

go away? I've made what I feel are valid arguments. I'm sorry I hurt your feelings but I was only being direct and I certainly wasn't rude and I don't feel like that means I should "go away" or that I'm somehow a bad person like you've implied in your comment. You've made a library that right now is the defacto mysql library in the nodejs world, there are no other maintained alternatives. There is a "2.0" version being developed missing a critical feature that was brushed aside, I tried to highlight this with new arguments because previous ones haven't worked. Anyway there doesn't seem much more to be gained by commenting anymore so I'll leave it at that.

@dresende

This comment has been minimized.

Collaborator

dresende commented Mar 14, 2013

@efuquen :
You seem not to understand the issue. Everyone agrees that prepared statements are important, no need to argue on that. Instead, if it's so important to you please step forward. I don't have the knowledge to do it. If I did, it would be already in an alpha version, and I don't have time to search for it. Do you? Can you please explain me how to do it? I'll happily implement it if you don't want to contribute and code some lines and just give me a technical explanation. I'll wait for a comment explaining or pointing to a clear explanation page and I'll do it.

This applies to anybody really wanting this feature. I want it to, it's just not my top priority and I don't have time for research.

@felixge

This comment has been minimized.

Collaborator

felixge commented Apr 4, 2013

Everybody who wants/needs prepared statements, here is your chance: @pyramation contacted me to make this happen, setup a crowd funding campaign, and made an incredibly generous initial donation: https://www.crowdtilt.com/campaigns/prepared-statements-for-nodemysql/description

If the funding goal is met, I've guaranteed to handle the implementation. However, if somebody from the community would like to lead this effort, I'd be happy to mentor / share the funds. I'm mostly thinking about @dresende who's been the driving force of goodness behind node-mysql for several months now and would really deserve to be paid for some open source hacking. So if anybody is interested, let me know - otherwise I'll write the code.

@dresende

This comment has been minimized.

Collaborator

dresende commented Apr 4, 2013

I'm in :) 👍

@cblage

This comment has been minimized.

cblage commented Apr 4, 2013

I will try to help as much as I can. What's the game plan? :D

@pyramation

This comment has been minimized.

pyramation commented Apr 5, 2013

Great to see @dresende is in!

@cblage one easy thing people can do is share this link where people who are interested will find it: https://www.crowdtilt.com/campaigns/prepared-statements-for-nodemysql/description

@cblage

This comment has been minimized.

cblage commented Oct 15, 2013

What? LRU caches discard the least recently used items first, MRU do the opposite (they discard the most recently used ones first). It's about prioritization of what is retained vs what is ejected. I believe that for the case of caching prepared statements, an LRU cache would probably generate a higher cache hit ratio.

@sidorares

This comment has been minimized.

Collaborator

sidorares commented Oct 15, 2013

During nearly 4 years of prepared statements support in mysql-native and later, mysql2 all issues related to leaking memory on client or server due to growing number of prepared statements were of similar pattern:

    for (var i=0; i < 10000000; ++i)
      conn.execute('select foo from bar where id=' + i); // ...

which obviously results in max_prepared_stmt_count error and not the way prepared statements are intended to be used.

I can't imagine application where you would need more than few hundred of prepared statements. On the client hash of 100 ints adds very little memory footprint. Not sure how much memory it takes for mysql server to store PS, but they are all disposed after connection is closed.

@rwky

This comment has been minimized.

Contributor

rwky commented Oct 23, 2013

I also think the revised api by @sidorares is the best. I've a couple of projects that could benefit from prepared statements.

@doug65536

This comment has been minimized.

doug65536 commented Dec 13, 2013

Any SQL driver that doesn't have REAL prepared statements is useless and not worth consideration. Even node-mysql2 is useless: It doesn't have a separate prepare and execute. What is the point of a prepared statement that you use once? With all due respect, this seems to be a case of "it's too difficult to support prepared statements, so let's pretend that we don't need them".

@felixge

This comment has been minimized.

Collaborator

felixge commented Dec 13, 2013

Any SQL driver that doesn't have REAL prepared statements is useless and not worth consideration. Even node-mysql2 is useless: It doesn't have a separate prepare and execute. What is the point of a prepared statement that you use once? With all due respect, this seems to be a case of "it's too difficult to support prepared statements, so let's pretend that we don't need them".

Your comment is really discouraging to people like me who have donated large amount of time to this community. Please read: http://felixge.de/2013/03/07/open-source-and-responsibility.html . You're welcome to contribute.

@sidorares

This comment has been minimized.

Collaborator

sidorares commented Dec 13, 2013

@doug65536 its prepared once and can be used many times. If it's useless for you - please don't waste your time writing comment. And 30000+ downloads a week for node-mysql is a good sign that it's actually useful for many people.

@doug65536

This comment has been minimized.

doug65536 commented Dec 31, 2013

Sorry, I was completely wrong about node-mysql2. It does indeed prepare once. The API led me to believe that it didn't have a separate prepare and execute, but I looked at the code and found that it does have a cache that reuses the prepared statement if you use the same query text.

I was a bit nasty when I posted that comment. I was frustrated. Sorry about that.

@EnzoMartin

This comment has been minimized.

EnzoMartin commented Mar 20, 2014

Would love to see this functionality added, any chance of a status update?

@lewispham

This comment has been minimized.

lewispham commented Aug 29, 2014

About prepare statement, I think it is not just about security, it also greatly improves performace. Because we can reuse queries with different corresponding values instead of recreating new queries.

@felixge I really surprise that you didn't receive a penny from this project. This is so sad. Your work is amazing. I strongly believe, that someday, node-mysql will be much more valuable. I even think that this module can be a nodejs built-in package. It's so real that more and more projects are depending on node-mysql. I am one of them. Thanks for your great job.

@sidorares

This comment has been minimized.

Collaborator

sidorares commented Aug 30, 2014

@Tresdin there is no huge speed improvement with prepared statements - cost of parsing SQL is usually not that big compared to actual data access and IO. Of course this depends on your data and type of queries. Latency wise it's same sequence "request - wait for data - request etc" as with text based protocol (aka non-prepared statements). If you have real life examples and data sets I'd be happy to see your benchmark results

@lewispham

This comment has been minimized.

lewispham commented Aug 30, 2014

@sidorares My point is just based on this article.
http://dev.mysql.com/doc/refman/5.1/en/c-api-prepared-statements.html

According to this mysql dev docs, using prepare statment gains performance benefits because:

  • Query is parsed only once.
  • Prepared execution also can provide a reduction of network traffic because for each execution of the prepared statement, it is necessary only to send the data for the parameters.
  • Prepare statement uses a binary protocol that makes data transfer between client and server more efficient.

But you're right. It also heavily depends on specific situation.

@sidorares

This comment has been minimized.

Collaborator

sidorares commented Aug 30, 2014

My point is based on actual benchmarks using node-mysql2 library. You are right as well, but usually one can get much better performance for example by going from single connection to a pool than by switching from text protocol to binary

@sidorares

This comment has been minimized.

Collaborator

sidorares commented Dec 6, 2014

Hi all! I'd like to give node-mysql prepared statements some attention during christmas break. I changed my node-mysql2 api to conform my proposal discussed here, if anyone can look at sidorares/node-mysql2#133 pr or try 'standalone-prepare' branch that would be helpful. I'd like to see some feedback on api in node-mysql2 before introducing it to node-mysql - because of number of users here it would be harder to fix. I'm going to release mysql2 with new api in a couple of days and start working on prepared statements for node-mysql late December (unless @dougwilson have this already implemented :) )

@dougwilson

This comment has been minimized.

Member

dougwilson commented Dec 6, 2014

unless @dougwilson have this already implemented :)

I don't :) I saw you working on your branch and if there is an implementation here, that would be amazing!

@sidorares

This comment has been minimized.

Collaborator

sidorares commented Dec 6, 2014

there is very little to re-use except for tests - the internals are too different, but I'm really keen to implement the same here

@felixge

This comment has been minimized.

Collaborator

felixge commented Dec 7, 2014

@sidorares wow, this sounds great. I'll have a look once you got a branch for node-mysql going.

@Redsandro

This comment has been minimized.

Redsandro commented Sep 15, 2016

Any progress on this?

I'm more interested in the speed benefit of using prepared statements with big data and analysis than the alleged security benefits. Preparing an insert query once and executing millions of times will surely benefit more than node-mysql batching 100 non-prepared statements at a time.

Just curious. I'm very happy with node-mysql (I mean this module, called mysql, but that's ambiguous). I understand very much the issues of volunteering spare time to contribute and getting payed in joy, and was just wondering if any progress was made on this in the past year.

@felixge I feel bad about how some people in the community have difficulty expressing their appreciation for the work you have done. Know that it is appreciated. 👍

@sidorares

This comment has been minimized.

Collaborator

sidorares commented Sep 15, 2016

I'm more interested in the speed benefit of using prepared statements with big data and analysis than the alleged security benefits. Preparing an insert query once and executing millions of times will surely benefit more than node-mysql batching 100 non-prepared statements at a time.

this is probably not the case. ( mostly because of serial nature of protocol ) 10000x 100batched non-prepared is going to be way faster than 1m prepared. Even more faster way is to use LOAD INFILE (basically whole 1m in one batch with very little overhead)

@felixge

This comment has been minimized.

Collaborator

felixge commented Sep 16, 2016

@Redsandro thanks for the kind words. I can't comment on progress, I haven't written any node.js stuff in several years now.

@sidorares

This comment has been minimized.

Collaborator

sidorares commented Sep 16, 2016

@Redsandro unfortunately I didn't kept my promise and not invested my time into porting prepared statements to here (yet). I might have a look again and see how much work left

@Redsandro

This comment has been minimized.

Redsandro commented Sep 17, 2016

@sidorares

this is probably not the case. ( mostly because of serial nature of protocol ) 10000x 100batched non-prepared is going to be way faster than 1m prepared.

Really? Hmm, in that case I'm a tad bit disappointed by the speed, but perhaps the server is to blame. I'm talking about "Nested arrays are turned into grouped lists (for bulk inserts)". I assumed the translating would add some overhead.

Even more faster way is to use LOAD INFILE (basically whole 1m in one batch with very little overhead)

Unfortunately in my particular case I can't (or don't want to) do this, because I can only generate a 100-ish records at a time, depending on input data. Right now it's very resilient. It just waits if the table is locked (in case something else does a big query), data is in the table directly and isn't delayed by a day writing an intermediate INFILE first, and it picks up where it left off.

I have experimented with node not waiting for the INSERT to call back, but then the queue can fill up and choke the script.

I might have a look again and see how much work left

I applaud this! 👍
Although as you pointed out, I might not benefit from this in my particular use case. But I'm looking forward to experimenting with this.

@Turneliusz

This comment has been minimized.

Turneliusz commented Jan 2, 2018

@sidorares we're using mysql quite extensively and we're having quite broad set of integration tests for our applications. If we could help somehow, even with testing non-production ready prepared statements implementation, happy to help. Looking forward and thanks for the effort!

@sidorares

This comment has been minimized.

Collaborator

sidorares commented Jan 2, 2018

@Turneliusz unfortunately not very much spare time with day job and 6 kids. At the moment the only way to bump priority is to hire me for a few days to work full time on this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment