Changes default encoding from UTF8_GENERAL_CI to UTF8MB4_UNICODE_CI #1408
Travis configuration should be updated:
As for this change, I definitely think it needs to wait for 3.0, probably going along with actually fixing the way encoding works in this module, which is still fundamentally broken. The reason I think this is a major version change is that it is breaking (it will break against old server versions that do not understand this encoding). I also noticed it will act differently for people who use the current encoding and write chars outside the BMP to mb4 columns, thus changing their data storage without a major version bump.

As for your trouble testing, I'm not sure why you are having trouble. I literally just ran all these tests at the beginning of this week and everything worked. I am not going to remove 0.6/0.8 testing as long as we support those versions, and removing support is also a major version bump. I agree that we can definitely remove it in 3.0, but this should not be because it's not technically possible to test, just fewer headaches.
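To make the "chars outside the BMP" point concrete, here is a minimal sketch in plain Node.js (no database involved). MySQL's legacy `utf8` charset stores at most 3 bytes per character, so only characters that need 4 bytes in UTF-8 are affected by a switch to an mb4 charset. The helper names `utf8ByteLength` and `hasAstralChars` are hypothetical and are not part of this module.

```javascript
// Number of bytes a string occupies when encoded as UTF-8.
function utf8ByteLength(str) {
  return Buffer.byteLength(str, 'utf8');
}

// Hypothetical helper: true if the string contains a code point outside
// the BMP (above U+FFFF), i.e. one that needs 4 bytes in UTF-8.
function hasAstralChars(str) {
  // for...of iterates by code point, so surrogate pairs stay together.
  for (const ch of str) {
    if (ch.codePointAt(0) > 0xFFFF) return true;
  }
  return false;
}

console.log(utf8ByteLength('é'));  // 2 bytes, fits in legacy utf8
console.log(utf8ByteLength('€'));  // 3 bytes, fits in legacy utf8
console.log(utf8ByteLength('💩')); // 4 bytes, needs utf8mb4

console.log(hasAstralChars('test'));    // false
console.log(hasAstralChars('test 💩')); // true
```

A check like this could be used to estimate whether an application's data would be stored differently after the default changes.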
Ah, @dresende, I see why the tests are failing: you committed failing commits directly to master. Please revert commits on master that fail CI. Please use branches to test commits before making them to master, or at least revert them on master as soon as you see them fail. Otherwise everyone coming to this project for the past 2 hours will think this is a joke project that cannot even pass its own tests.
@dresende for now I just moved those commits over to this branch so we can take whatever time is needed figuring out what the test failures are and can get the build badges back to "green" in the meantime (I'm surprised I figured out how to do this from my phone :O while I'm away from the computer). |
Apparently even after reverting those changes from |
About the encoding, I would be happy with change from |
About committing to master, it's supposed to be unstable. And AppVeyor is to blame for this too. I prefer to see a project failing on master with a commit from 1 week ago than a project succeeding with a commit 2 months old :)
It's still a major version bump in my mind, unless you can explain how it would not change the behavior of people's apps when upgrading? It would seem to me if it doesn't change anything then there is no point and if it does change something... well now we're breaking people and it would need to be a major version bump.
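One way an application could shield itself from a change in the default is to set the connection charset explicitly instead of relying on whatever the module ships with. The sketch below only builds the options object; it assumes the object would be passed to `mysql.createConnection()` and uses the module's documented `charset` option. The host, user, and database values are placeholders.

```javascript
// Assumption: this object is what an application would hand to
// mysql.createConnection(). Nothing here talks to a server.
const connectionConfig = {
  host: 'localhost',   // placeholder
  user: 'app',         // placeholder
  database: 'app_db',  // placeholder
  // Pin the current default to keep existing behaviour across upgrades,
  // or set 'UTF8MB4_UNICODE_CI' to opt in to the proposed default early.
  charset: 'UTF8_GENERAL_CI'
};

console.log(connectionConfig.charset);
```

Pinning the value makes the choice visible in the application's own configuration, so an upgrade of the module would not silently change it.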
No, not on this repo, which is run with a stable master. If we think it should be unstable, then let me know what needs to be changed so that the badges on our published npm version won't be red and failing, driving users away.
Sure, but if something is pushed to master and breaks the build, the committer should not be doing anything else but fixing the issue. Yes, CIs can break, but I don't think that after all these hours any effort has been put into trying to fix it? |
The commit may have been old, but I literally just pushed it up this week, and it has been succeeding for just a few days now. Anyone can click on the build and see when it was last run.
I had pushed up this weekend because I wanted to cut a patch with the pending changes. Now that the build is failing, I will not release any patches until it is resolved and passing so we know we are releasing something that functions on Windows. |
Actually that commit was just pushed up two days ago, not months ago. You can see on the build page https://travis-ci.org/felixge/node-mysql/builds that "Update bignumber.js to 2.3.0" was only just built 2 days ago, not months ago, so this means the project was building fine just recently. And the same with AppVeyor: https://ci.appveyor.com/project/dougwilson/node-mysql/history. This project was building just fine only 2 days ago. |
So since I got to a computer, I was able to read the AppVeyor upgrade notes and just update the path; it was pretty straightforward.

Back on topic to this pull request, I think changing the charset makes sense. My point is just that it would be a major version bump, unless there is some reason it's not that I'm not seeing (http://semver.org/ has the general guidelines for how to determine these things). If this is agreeable, we can mark this as milestone 3.0. Currently everything in https://github.com/felixge/node-mysql/milestones/3.0 is slated for 3.0, ranging from items that are ready to go and just waiting for a 3.0 to land on, to others that are not started but are really BIG problems, for example the really nonsensical charset handling in this module (#804).

Since this seems so closely related to #804 to me, I think it would make the most sense to land them together. I know #804 does not have much detail, but it boils down to this: the module completely ignores all the charset information coming in the fields packets, creating a bunch of garbage in various cases. In fact, this module literally assumes that the charset sent in the handshake was accepted by the server, which is not always the case, and changing this to an mb4 one could make this case more common and cause a lot of issues if #804 is not fixed first. For example, the client starts asking for mb4, the server doesn't like it and just falls back to latin1. Now this module is decoding as UTF-8, not latin1.
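The latin1 fallback case can be illustrated without a server. The following is a minimal sketch using only Node's `Buffer`: it simulates a server that sends latin1 bytes and a client that decodes them as UTF-8 anyway. It is not this module's actual decoding path, just the same mismatch in isolation.

```javascript
// Text as the server would store and send it after falling back to latin1.
const original = 'café';
const latin1Bytes = Buffer.from(original, 'latin1'); // bytes: 63 61 66 e9

// A client that assumes the handshake charset was accepted decodes as UTF-8.
// The lone 0xE9 byte is not valid UTF-8, so it becomes U+FFFD.
const decodedAsUtf8 = latin1Bytes.toString('utf8');

// Decoding with the charset the server actually used round-trips cleanly.
const decodedAsLatin1 = latin1Bytes.toString('latin1');

console.log(decodedAsUtf8 === original);   // false
console.log(decodedAsLatin1 === original); // true
```

This is why honoring the charset reported by the server, rather than the one requested in the handshake, matters more once the requested default becomes one that older servers may reject.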
Check the response on stack overflow that I mentioned, I added on purpose. |
Right, I read the SO post :) and I agree, this change makes sense, even as-is being MB4-based. I guess my only opinion is just that I think it makes sense to target 3.0, not that the change itself is bad, if that makes sense. |
About supporting v0.6 and v0.8, I was just considering tests. We shouldn't worry about versions that are no longer maintained (v0.10 will be off soon). And to properly update the module to handle encodings I think those versions will be an obstacle. |
Indirectly, it can force people to upgrade to more stable and maintained versions (in terms of security). |
Since I couldn't figure out how to do that in a backwards-compatible way, we can drop any versions we want during that change, since it's a major anyway. I figure there's no point in even trying to go below 0.10 these days.
I think, changing the default encoding is a good thing. |
For some reason the code below works OK with the utf8 encoding but not with utf8mb4 (I'd expect the opposite):

    c.query('select "test 💩" ', (err, rows, fields) => {
      console.log(rows, fields);
    });

When UTF8MB4_UNICODE_CI is used as the charset, all chars outside the BMP are replaced with '?' (a single 0x3f byte) in the field name only, not in the value, so the output using this branch is
and with master it's
@dougwilson, do you have an explanation for this MySQL behaviour?
Does anyone have another opinion on this?
http://stackoverflow.com/a/766996/977422