UTF8 Encoding isn't consistent with .Net Framework #1679
Comments
The difference is that on .Net Core 3.0+, some invalid byte sequences produce two �, while they produced only one � on older frameworks. This was an intentional breaking change to follow Unicode best practices, see dotnet/docs#13547 for more details. (Also, this has nothing to do with .Net Standard. If you write a .Net Standard library and then run it on .Net Framework or .Net Core 2.x, you should see the old behavior.) |
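To make the "one � versus two �" difference concrete, here is a small sketch in Python (chosen only for illustration; it is not .NET code). Python's built-in decoder with `errors="replace"` emits one U+FFFD per maximal invalid subsequence, which is the practice the Unicode Standard recommends. Whether it matches .NET Core 3.0+ byte for byte is an assumption that has not been verified here, and the sample byte strings are made up for the example.

```python
# Illustration only: this exercises Python's UTF-8 decoder, not .NET's.
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER, rendered as "�"


def count_replacements(data: bytes) -> int:
    """Decode as UTF-8, substituting U+FFFD for invalid input, and count them."""
    return data.decode("utf-8", errors="replace").count(REPLACEMENT)


# Hypothetical sample inputs, each invalid UTF-8 in a different way.
samples = {
    "lone continuation byte": b"\x80",
    "truncated 3-byte sequence": b"\xe2\x82",
    "two stray lead bytes": b"\xc3\xc3",
    "overlong encoding of '/'": b"\xc0\xaf",
}

for label, data in samples.items():
    print(f"{label}: {data.hex()} -> {count_replacements(data)} replacement char(s)")
```

Running the same byte strings through `Encoding.UTF8.GetString` on each target framework would show which inputs actually differ; dotnet/docs#13547 is the authoritative description of the change.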
Thanks. You would think they would provide an overload to get the old behaviour, as there is no easy way to fix this. I can’t just use the new encoding: the mechanisms are used for existing hashed password authentication, and this makes it difficult to upgrade to .Net Core. It’s good that they finally fixed this, but breaking backwards compatibility is a problem. The only way for me to fix this is to roll my own UTF8 encoding. |
Supporting invalid UTF-8 sequences for passwords sounds like a bad idea to me and something nobody would intentionally use. Can't you change your policy to disallow such passwords (and suggest password reset if anyone actually tries to log in with such a password)? |
We weren’t supporting this in passwords but rather in the auto-generated salts that are stored as binary data. We can't validate the existing password hashes correctly if we can't decode the existing UTF8 salts. I discovered the system was creating Unicode strings and validating using UTF8, which is a bug in our system and is now fixed, but there's not much I can do about validating existing salts without the same UTF8 handling as in .Net Framework. |
I'll just leave this here in case anyone else runs into this and it's a blocker: I created a NuGet package that provides UTF8 encoding as .Net Framework does it: Text.UTF8.Legacy |
<security hat on> I'm glad you found a workaround for your scenario. However, I want to provide a general warning that this is not the right way to go about doing things. The Encoding classes are intended for processing structured data, not random data (as you would get from cryptography). The end result is that much of the entropy of the "salt" data will be lost during the conversion process. If the target application is bound by legal or regulatory requirements, a security audit will find such application out of compliance. The safest course of action for the scenario mentioned above would be to perform a one-time conversion of the existing data in the database, performing the UTF-8 decoding step, then re-storing this information back in the database (as binary) and marking it as tainted. The next time a user attempts to log in, the salt is read from the database and treated in pure binary form rather than having the application attempt to stringify it. Additionally, if the salt is marked tainted, the application should re-generate the salt using the full amount of required entropy, then store this new salt back in the database and remove the tainted flag. This process should have the effect of automatically upgrading all users' login data the next time they log in. The process should be fully transparent to users and shouldn't result in any "you must reset your password" like interruptions to their workflows. Hope this helps! |
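The migration described above could be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the application's real code: the `UserRecord` schema, the PBKDF2 stand-in hash, the 16-byte salt size, and every function name are hypothetical. In particular, `stand_in_legacy_decode` is only a placeholder; a real migration needs a decoder that reproduces the .NET Framework replacement behavior exactly.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass
from typing import Callable


@dataclass
class UserRecord:  # hypothetical schema
    salt: bytes
    password_hash: bytes
    tainted: bool = False


def hash_password(password: str, salt: bytes) -> bytes:
    # Stand-in for whatever hash the application already uses (assumption).
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def stand_in_legacy_decode(raw: bytes) -> str:
    # ASSUMPTION: placeholder for a decoder with the old framework's behavior.
    return raw.decode("utf-8", errors="replace")


def convert_once(record: UserRecord, legacy_decode: Callable[[bytes], str]) -> None:
    """One-time step: apply the legacy decode, store the result as binary, flag it."""
    record.salt = legacy_decode(record.salt).encode("utf-8")
    record.tainted = True


def login(record: UserRecord, password: str) -> bool:
    """Verify using the salt as raw bytes; re-salt transparently if tainted."""
    candidate = hash_password(password, record.salt)
    if not hmac.compare_digest(candidate, record.password_hash):
        return False
    if record.tainted:
        record.salt = os.urandom(16)  # fresh salt with full entropy
        record.password_hash = hash_password(password, record.salt)
        record.tainted = False
    return True


# Usage: a legacy record whose hash was computed over the stringified salt.
raw_salt = b"\xff\xfe\x01salt"
legacy_salt = stand_in_legacy_decode(raw_salt).encode("utf-8")
user = UserRecord(salt=raw_salt, password_hash=hash_password("s3cret", legacy_salt))

convert_once(user, stand_in_legacy_decode)
print(login(user, "s3cret"), user.tainted)
```

Note that `convert_once` still depends on a decoder with the old behavior; after it has run, the login path never stringifies the salt again.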
This is only possible when you have a way to UTF8 decode the original data
in order to validate at least once, so that an upgrade and re-encoding of
the data can be performed. In our case, this means every user in a
particular database tenant must be upgraded before the old UTF8 data can be
upgraded permanently. So access to the old UTF8 encoding is still required,
and the Microsoft team provided no option for backwards compatibility,
which forces custom solutions.
This is the 6th non-backwards compatible change in .NET Core 3+ I’ve found
and seemingly all introduced by a single developer at Microsoft.
…On Thu, Apr 23, 2020 at 10:11 AM Levi Broderick ***@***.***> wrote:
<security hat on>
I'm glad you found a workaround for your scenario. However, I want to
provide a general warning that this is not the right way to go about doing
things. The Encoding classes are intended for processing structured data,
not random data (as you would get from cryptography). The end result is
that much of the entropy of the "salt" data will be lost during the
conversion process. *If the target application is bound by legal or
regulatory requirements, a security audit will find such application out of
compliance.*
The safest course of action for the scenario mentioned above would be to
perform a one-time conversion of the existing data in the database,
performing the UTF-8 decoding step, then re-storing this information back
in the database (as binary) and marking it as *tainted*. The next time a
user attempts to log in, the salt is read from the database and treated in
pure binary form rather than having the application attempt to stringify
it. Additionally, if the salt is marked *tainted*, the application should
re-generate the salt using the full amount of required entropy, then store
this new salt back in the database and remove the *tainted* flag.
This process should have the effect of automatically upgrading all users'
login data the next time they log in. The process should be fully
transparent to users and shouldn't result in any "you must reset your
password" like interruptions to their workflows.
Hope this helps!
|
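The entropy loss mentioned in the warning above can be shown with a short Python sketch. It uses Python's decoder as a stand-in (an assumption; the exact replacement rules differ between decoders and .NET versions), and the helper name `stringify_round_trip` is made up for the example. The idea is simply that every invalid byte sequence collapses to U+FFFD, so many different random salts end up as the same string.

```python
def stringify_round_trip(raw: bytes) -> bytes:
    """Decode bytes as UTF-8 (invalid input becomes U+FFFD), then re-encode."""
    return raw.decode("utf-8", errors="replace").encode("utf-8")


# Every possible single-byte "salt".
inputs = [bytes([b]) for b in range(256)]
outputs = {stringify_round_trip(x) for x in inputs}

print(len(inputs), "distinct inputs ->", len(outputs), "distinct outputs")
```

With this decoder, the 128 ASCII bytes survive unchanged while all 128 non-ASCII bytes map to the same replacement character, which is why a salt should be kept and compared as binary rather than passed through an Encoding class.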
.NET Core does not hold itself to the same back-compat bar as .NET Full Framework did. The philosophy behind this is that for .NET Full Framework, updates are in-place and are pushed out over Windows Update. Developers and consumers often have no control over when updates are deployed to a machine. Imagine an application works on Monday, then you go to bed and wake up on Tuesday, and the application stops working. That kind of change is generally unacceptable. So we do try to hold ourselves to a very high bar there. One of the consequences is that we often can't even introduce seemingly innocuous bug fixes. Somebody, somewhere, may have taken a dependency on the buggy behavior. For .NET Core, we're more free to make such changes. The reason for this is that framework updates are a deliberate action by the developer. By definition .NET Core applications can't run into the "works one day, fails the next day" scenario I mentioned above. We're not ignorant to the fact that breaking changes are sometimes painful, and we're not in the habit of making such changes blithely. When we do make such changes we weigh the benefit vs. harm to the ecosystem as a whole. For example, we generally see improved interoperability with accepted industry standards as a large overall ecosystem benefit. We also document such changes at https://docs.microsoft.com/en-us/dotnet/core/compatibility/breaking-changes. You said this is the 6th breaking change you've encountered in .NET Core 3.x. Check the link above (or, specifically, https://docs.microsoft.com/en-us/dotnet/core/compatibility/corefx). If we missed something, then please open issues in the docs repo. Or comment here and I can open issues on your behalf. |
if you store as the upgrade process would have 2 steps:
|
I found a subtle difference that was revealed in a bunch of hashing code I had written a while back for .Net Framework. I wrote a multi-platform test that shows UTF8Encoding is treated slightly differently in .Net Standard, and I don't really have a good way to solve it yet.
Consider the following - I encoded a string in hex to guarantee the bytes are the same for the test:
(hopefully GitHub doesn't mangle the expected string; it looks correct after previewing)
This test will pass on .Net Framework 4.8, but will fail on .Net Standard 2.0.