Remove BinaryFormatter from the shared framework in .NET 5 #29976
BinaryFormatter has the advantage that it's incredibly fast. Faster than any other formatter, by a lot. If you're considering removing it, please consider providing a reasonable alternative that is fast and secure. If you do, then I don't see any reason not to agree with that, regardless of whether my personal opinion counts or not.
@Symbai I think a very important thing to keep in mind here is that BinaryFormatter is not fast and secure. It is fast largely because it is not secure.
@Symbai If you're looking for something where the payload format is human-readable, check out the work taking place in this repo. In the proposal I have in this issue, I do suggest that we should create a
Pushing it into its own package seems a reasonable compromise.
Is it a good idea to refer to benaadams/System.Ben#107? Too soon?
Then maybe replace the BinaryFormatter implementation under the hood with ProtoBuf?
I'm sure you are considering partners here which are relying on BinaryFormatter in .NET Core. Keep in mind to loop them in whenever you start this effort.
@samsosa just to be explicit re:
That isn't a realistic option. Protobuf is a powerful tool, but it has a different feature-set to
I would like to try and put a stop to the trend I'm seeing in multiple issues of ProtoBuf being the solution to all things serialization. It is not a good serialization format, and it was never designed to be one. It is designed to be a compact way to represent a limited set of features for sending an object graph to a gRPC endpoint. It has very severe limitations when trying to be used for general-purpose serialization.

It doesn't have support for references, so if an object appears in an object graph multiple times, it will be completely serialized multiple times. We've had customers get tripped up by this before when they overrode

The lack of references means it can't handle cycles in your object model, which will cause infinite recursion on serialization. It simply cannot handle that scenario at all. There have been attempts to come up with a way to support references to allow for these two scenarios. The one that protobuf-net uses needs explicit opt-in for the types/members which need this support. This requires understanding your entire object model and knowing where cycles might exist and when having duplicates will be a problem. This could be a LOT of work depending on how complex your data model is. It greatly complicates your on-wire representation, and none of the benchmarks I've seen have been against an OM using these features. Any benchmarks on size and speed of serialization need to take this into account. My understanding is that the Google implementation of gRPC serialization doesn't support this feature and expects you to support this mechanism at your OM level and not at the serializer level. As the response from the ProtoBuf team has been that they aren't going to build this capability into the protocol and you should implement your own solution, you lose the interoperability that ProtoBuf is supposed to bring, as there will always be competing ways to do this by different ProtoBuf libraries.

Protobuf is a bad format for use with an existing set of classes. Its best use is when designing a new OM with these limitations in mind. E.g. I'm creating a new gRPC endpoint and so define my data model using IDL and then convert that to classes. Going the reverse direction is going to be problematic for many people, as Protobuf is not a general-purpose serialization format.

Incidentally, I read a blog post comparing MessagePack, ProtoBuf, JSON.NET and DataContractSerializer on size of output and speed of serialization and deserialization, which showed DataContractSerializer to be the worst on both fronts. I cloned their code and added a variation for DataContractSerializer using a binary XmlDictionaryReader/Writer, and the size was about 10% bigger than ProtoBuf, and the time to serialize and deserialize was about 10% slower than ProtoBuf. If you were to have multiple duplicate objects and you turned on the reference support in DCS, you could easily create a case where DCS is both faster and smaller than ProtoBuf.
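As a hedged illustration of the DataContractSerializer reference support mentioned above: opting in via [DataContract(IsReference = true)] lets DCS preserve shared references and cycles that would otherwise fail to serialize. The Node type here is a made-up example, not something from the discussion.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical demo type. IsReference = true opts in to reference tracking,
// so shared instances and cycles round-trip instead of throwing.
[DataContract(IsReference = true)]
class Node
{
    [DataMember] public string Name { get; set; }
    [DataMember] public Node Next { get; set; }
}

class Program
{
    static void Main()
    {
        var a = new Node { Name = "a" };
        var b = new Node { Name = "b", Next = a };
        a.Next = b; // cycle: a -> b -> a

        var dcs = new DataContractSerializer(typeof(Node));
        using var ms = new MemoryStream();
        dcs.WriteObject(ms, a); // would throw on the cycle without IsReference = true

        ms.Position = 0;
        var roundTripped = (Node)dcs.ReadObject(ms);
        // The cycle is preserved: following Next twice returns to the same instance.
        Console.WriteLine(ReferenceEquals(roundTripped, roundTripped.Next.Next)); // True
    }
}
```

This is the opt-in cost the comment describes: every type participating in a cycle needs the attribute, which is why it "could be a LOT of work" on a large existing model.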
@mconnew Thank you for the excellent analysis! You're right - I should have been more careful with my wording in my earlier comment. I didn't intend to suggest that ProtoBuf is The One True Replacement(tm) going forward, just that it's one of many options available to developers depending on their scenarios.

I think part of the solution is that we need to convince the ecosystem that a general-purpose serializer that can handle arbitrary object graphs is not a sustainable concept. Serializers like

Whether this is a documentation issue that gives best practices for creating DTOs which can be safely serialized, whether we recommend switching to a safe but restricted-capability serializer like

Another option - though one that might be more difficult to swallow since it'll require significant work within the ecosystem - would be to create a set of educational guidelines that explains in detail the concepts behind serialization, trust boundaries, and how the two interact. This would make individual developers responsible for analyzing their specific scenarios and choosing data models and serialization formats which are appropriate for their needs, while also giving them the knowledge to make trade-offs when necessary. I'm not a fan of this solution by itself because it presents a steep learning curve for our audience and goes against .NET's general premise of creating usable frameworks. But perhaps these educational guidelines could be useful in conjunction with another more approachable solution.
@GrabYourPitchforks, my comment was in response to my seeing multiple places where people were suggesting using ProtoBuf for serialization. It's NOT a serialization format, and I just wanted to make sure I put my thoughts out there somewhere contextually relevant.
@mconnew I don't want to distract from the conversation about BinaryFormatter, so I don't suggest "here", but I'd love to have an offline conversation with you about what does and doesn't qualify as a "serialization format" and why; it seems we have some different ideas there, and I'd love to hear a new perspective. I'm easy to get hold of if you're willing :)
IMO it would be useful for all devs to have an "open" discussion on this... all devs have to serialize something at some point in their career, and the hidden issues and choices are not so trivial.
@MarcoRossignoli great idea; I've dropped a discussion here; I'd love your input there if you have the time, @mconnew
BinaryFormatter is an established format to transfer data through the clipboard and during drag'n'drop operations, and is likely to be in use in existing Desktop Framework applications. If you rip that out of the .NET Core framework without a replacement that can produce/consume the same binary data, then .NET Core 5 applications will be unable to interact with Desktop Framework applications through drag'n'drop/clipboard when custom formats are in use.
Thanks @weltkante for your feedback! Your point that there needs to be some replacement is well-taken, and I think the proposal outlined here addresses the issue. Let me elaborate. The proposal here is to remove

In the scenario you describe, even though the Framework itself won't have default built-in support for deserializing arbitrary objects on clipboard / drag-and-drop operations, the application can continue to support the legacy behavior by hooking the clipboard / drag-and-drop handlers. For example, if the application hooks the "paste" handler and determines that the incoming data is a

This should provide a viable outline for applications which absolutely need to retain the existing Full Framework behaviors to be ported to .NET Core.
Ok, I see where you are going with that; moving it into a separate component may work. However, note that deserialization alone won't be enough for all compat scenarios: interacting with Desktop Applications through clipboard or drag'n'drop may also require serialization (e.g. if you want to put something into the clipboard that an existing Desktop Application should consume, or drag something onto a Desktop Application window).
Just a quick reminder that new client / server applications that cost millions to develop and will have to be maintained for more than a decade are currently being built using

Moving the

goes too far from my point of view. Especially as it's not completely clear that a viable (= feature-complete; drop-in, with some configuration) more secure replacement that will handle serialization AND deserialization will be available. Currently we're using .NET Framework on the client (MvvmCross, WPF) and .NET Core 2.2 (ASP.NET Core) on the server, planning to migrate both to .NET Core 3. Having to live with the fear (= project risk) that Microsoft may "drop" / declare obsolete APIs just added to their newest platform is not a great outlook. Still, I appreciate that this discussion is happening in the open and customers can participate and hopefully be heard ;)
@HaraldMuehlhoffCC Thank you for your feedback. Your comments hit on a few key points which I think are prime examples of why I had proposed this work item to begin with. You are correct in that new applications (in 2019!) are being built with

As a security professional, I strongly disagree with this statement. This might be an appropriate assumption for your application today, as perhaps you control both the client and the server in tightly restricted environments. But as the application evolves, it's almost certain that this assumption will not hold going forward. Want to move your service to the cloud? Want to deploy the client on customers' workstations? Want to add the ability for additional less-authenticated users to access the application (perhaps in a read-only mode)? These processes are all now significantly more difficult because the application protocol is based on the insecure

Another way to think about this is to ask the following questions: "Would the server be comfortable allowing the client to upload an arbitrary .exe to the server and run it as part of request processing? Would the client be comfortable allowing the server to transmit an arbitrary .exe and run it as part of response processing? Is it acceptable for this behavior to be a permanent part of the application's request/response protocol?" I would wager that vanishingly few of our developer audience would answer affirmatively to all three questions. But because our messaging on
@GrabYourPitchforks Thanks for your comment. We do indeed control both the client and the server, and it's running in a tightly restricted environment (electric utility companies). And this won't change for the client I'm talking about. Our web client is of course using a different protocol (JSON). I don't quite get your running-arbitrary-code analogy. In a sense all protocols use some binary encoding (in the JSON & XML cases it's just UTF-8 ...). Just because the

Finally, my primary concern with your proposal was
How does this make the systems that will still be relying on
Just because the BinaryFormatter isn't using a human readable format doesn't mean that it's creating some sort of executable.

I'm afraid you're incorrect. It's not a matter of human-readable versus binary; it's how the serialization works. BinaryFormatter puts the class information inside the serialized data. When it's deserialized, that information is used to create the objects embedded in the file. This is dangerous, as there are classes within .NET which can be abused to deliver code execution. In fact json.net has the same option, but it's off by default. In BinaryFormatter we can't turn it off at all. There have been numerous presentations about this at various hacker conferences; the seminal work is "Are you my type?" and the best recent one is "Friday the 13th: JSON Attacks".
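To illustrate the Json.NET option mentioned in the comment above: TypeNameHandling is off by default, and enabling it embeds type names in the payload, so the payload rather than the code decides which types get instantiated on deserialization. A minimal sketch (the Pet type is a made-up example; this assumes the Newtonsoft.Json package):

```csharp
using System;
using Newtonsoft.Json;

class Pet { public string Name { get; set; } }

class Program
{
    static void Main()
    {
        var settings = new JsonSerializerSettings
        {
            // Off by default for good reason: with it enabled, the *payload*
            // names the types to instantiate on deserialization.
            TypeNameHandling = TypeNameHandling.All
        };

        string json = JsonConvert.SerializeObject(new Pet { Name = "Rex" }, settings);
        Console.WriteLine(json);
        // Produces something like {"$type":"Pet, <assembly>","Name":"Rex"}

        // Deserializing attacker-controlled input with these settings is the
        // JSON analogue of BinaryFormatter's always-on behavior.
        var pet = JsonConvert.DeserializeObject<object>(json, settings);
        Console.WriteLine(pet.GetType().Name); // Pet
    }
}
```

The difference the comment draws is that here the dangerous behavior is an explicit opt-in, whereas BinaryFormatter offers no way to turn it off.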
We'd be happy to switch to another Formatter AS LONG AS it would be feature-compatible and not much slower.

It is hard to comment on "feature-compatible" here without talking about what features you're using, as several of BinaryFormatter's "features" are really anti-features, but: you might find it relatively easy and performant to try protobuf-net (I'm hugely biased, disclosure blah blah). I should note that protobuf-net *also* had a similar optional / opt-in "bake your type meta into the data" feature, but I'm killing it hard in V3 for multiple reasons, including the security aspect and the fact that a lot of types have moved around in .NET Core. Fortunately protobuf has a related "any" concept, which will be implemented very soon.
And if you want to see examples of BinaryFormatter abuse in action: https://youtu.be/kz7wmRV9xsU?t=1273
Thank you, @blowdart and @mgravell! I get that security is not free & I've been actively following quite a few security folks on Twitter. Nevertheless, for our very large object model I wouldn't necessarily want to add as much metadata as protobuf-net seems to require (even though we generate part of our model via a custom ORM and could generate attributes for some of the classes programmatically). We also do some clever stuff serializing LINQ queries using custom serializable Expression trees; even serializing Types (via their names & structural info in some cases). But we make sure on the server that the resulting queries are safe.

I would prefer some sort of whitelisting of the classes over all this verbosity. Actually, shouldn't I be able to do exactly this in a custom

Also, we control the server, the client, and the network, and we secure the communication channel; so I'm still not completely sold that the security implications are so pressing in our case. And other formatters can be made unsafe by adding types / classes (see https://www.newtonsoft.com/json/help/html/SerializationSettings.htm#TypeNameHandling).
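The whitelisting idea above can be sketched with a custom SerializationBinder. This is a hypothetical minimal example (the AllowListBinder and Person types are mine, not from the thread); note the caveat raised elsewhere in this discussion that a binder can reject unexpected types but does not protect against DoS/OOM attacks inside the BinaryFormatter infrastructure itself, and that recent .NET versions disable BinaryFormatter entirely by default.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

// A minimal allow-list binder sketch: anything not explicitly listed is rejected.
sealed class AllowListBinder : SerializationBinder
{
    private static readonly Type[] Allowed = { typeof(Person) };

    public override Type BindToType(string assemblyName, string typeName)
    {
        foreach (var t in Allowed)
            if (t.FullName == typeName)
                return t;
        throw new SerializationException($"Type '{typeName}' is not allowed.");
    }
}

[Serializable]
class Person { public string Name; }

class Program
{
    static void Main()
    {
        var bf = new BinaryFormatter();
        var ms = new MemoryStream();
        bf.Serialize(ms, new Person { Name = "Ada" });

        ms.Position = 0;
        bf.Binder = new AllowListBinder(); // deserialization now rejects unexpected types
        var p = (Person)bf.Deserialize(ms);
        Console.WriteLine(p.Name); // Ada
    }
}
```

Per the later comments in this thread, this mitigates type-confusion attacks but is not a guarantee of safety.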
As soon as we start adding limitations, it's no longer compatible. We can't break the entire world like that. And surrogates have their own problems with DoS and OOM attacks.
@blowdart "With great power comes great responsibility" ;) The currently available API should make it possible to add whitelisting of types if required; see

I'm with you that it's a good idea to make using the

But forcing your customers to roll their own solutions won't make the world safer either. Thanks everyone for the enlightening discussion; I have to get a few hours' sleep as I have to get up quite early over here in Germany.
Moving it to a separate package doesn't mean you roll your own; you just have to opt in to using it, with suitable dire warnings. And then point dotnet/corefx#4 in Levi's plan would get you a safer one anyway.
@blowdart Then just keep it supported (or at the very least hand it to the community like WCF) ... otherwise customers would still be required to "roll their own" or use some other formatter - which wouldn't necessarily be safer - once critical (security) bugs were discovered. (I'm not talking about the architectural shortcomings you've written about.)
I will look into adding whitelisting for our
Per @blowdart's earlier comment (and my original description in the issue), surrogates have their own problems that won't always protect you. The simple and unfortunate fact is that the

That fact is what led to this issue being opened in the first place. Most customers aren't aware that the above sentence explains the state of the world. Hence my desire for somewhat drastic measures to make them aware of this, to migrate existing code away from
I've moved it to future.
Thanks for moving it. I'm finishing up the new plan and should have something to publish in the next few weeks. That should supersede this issue, at which point this can be closed.
*sigh* Having moved to use Immutable* in a few places, once again I've been hit by this same issue in .NET Core 3.1 vs. Framework - and I thought I'd managed to get away from it when I wrote my own Random ("because we won't add [Serializable] to it") and stopped using delegates. Nope. Please can I request a more useful response than "Don't use BinaryFormatter"? What tool do I use to serialize/deserialize object graphs of classes where:
Can you clarify this? I don't quite follow. Your object graph contains instances of some arbitrary type
Let's see!
Yes.
Don't care. Shouldn't be the type author's concern unless they're trying to ensure something special like serialization being compatible between versions, implementations, or architectures.
But also didn't necessarily do any work to make it not serializable. System.Random is a good example of that; for a given version on a given architecture, it serializes just fine.
No. I'm looking for a tech that allows me to test - for a particular version of all code and libraries, accepting that new versions of any code may change my results - whether a particular type happens to be serializable to within my acceptable limits, and to use those test results if I find out that the type happens to work. I would strongly prefer it if the tech had warnings plastered all over it that that's all it does, that it's insecure, and that you should keep your eyes peeled when using it. Caveat user.
Same as any other black / white box is made reliable in use: by eyeballing the code, testing, and by deducing that you can't do it if the tests don't pass. If I can't get reliability that's guaranteed by the author - which is quite a high burden for the author - then I want to be able to examine the type and judge reliability for myself, in my own use case. What I don't want is for someone who doesn't know my use case to be able to prevent me from examining and testing their type to determine whether it's sufficiently usable for my use case. Does that help?
Your problem isn't solvable in the general case, but maybe that's not your goal. Things like interop handles and pointers cannot be serialized without the cooperation of the type author; e.g. trying to forcefully serialize classes from a UI framework will not be doable without the framework's cooperation, since there is a lot of interop with the underlying platform. Even outside of UI frameworks, even simple things like weak references that are tied to the runtime will not be serializable without cooperation. Also, any class can have numeric indices into static caches/arrays, so unless you serialize the whole program state of static variables, those indices will be wrong upon deserialization as well.
Sounds like you want a "damn the torpedoes" style serializer. Something that recursively iterates over all the fields in a type and bitblts them into a byte buffer somewhere, with no regard for any deserialization-related attributes or API conventions. That's an interesting use case I've never heard of before. I'm not aware of any in-box serializer that follows this logic. But in theory it should be straightforward enough to write one, assuming the serializer isn't intending to special-case pointers / handles / reflection types / etc., and just relies on the caller not passing it object graphs which contain those fields?
@weltkante - agree entirely that this isn't solvable in general, and you're also absolutely correct that that isn't my goal. I agree with all of your points about ways of stopping this from working, and I've no doubt there are many more. I've got considerable experience at VM and image level with Smalltalk, so the whole special classes / weak references / oddnesses thing isn't new to me. Many Smalltalks can save image segments, which are effectively portions of the Smalltalk object graph; support for reloading those and, in particular, re-linking them with what's outside the segment is distinctly variable.

@GrabYourPitchforks - Almost. It's definitely a "Doctor, it hurts when I pass the serializer something that breaks it / Well, don't do that!" serializer. I want something that respects serialization information provided by type authors when it is present, but does not conflate a lack of explicit serialization provision ("I haven't said you can...") with an explicit lack of serialization provision ("... therefore you can't"). That's more complex than a straight blitter. It's actually very close to BinaryFormatter as it currently stands in Framework, except for the combination of BinaryFormatter assuming it's locked out of a type unless it sees SerializableAttribute, and the base library authors only having SerializableAttribute to mark information for multiple different use cases and being very conservative with the types they choose to mark.

I wouldn't even mind having to define explicitly when instantiating the serializer whether it should serialize types that don't provide any guarantees on architecture or version. It's perfectly reasonable to provide barriers to this behaviour such that folks who wish to damn the torpedoes have to opt into said damning, and make it very explicit that you're opting in to behaviour that is not guaranteed by the type authors.
@Ozzard I can attest that NuGet contains many serialisers that have the capabilities you require, though like you said, caveat user. So I guess the question is: what can we do from the perspective of the framework to better support such libraries?
This is already achievable to a large extent through opt-in via a surrogate selector. It allows you to take full control of the serialization pipeline for arbitrary types, including types that aren't otherwise serializable. But as you mentioned, caveat user.

using System;
using System.IO;
using System.Reflection;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;
namespace Demo
{
class Program
{
static void Main(string[] args)
{
Person p = new Person("John", "Doe");
var ms = new MemoryStream();
var bf = new BinaryFormatter();
// bf.Serialize(ms, p); // this line would throw
// !! WARNING !!
// This is not a good idea, and it's not officially supported. Attempting to BinaryFormatter serialize
// an instance of a type that is not marked as serializable can lead to all sorts of problems, and
// even a seemingly innocuous servicing release could bust your application. But if you wish to
// serialize them anyway, this sample shows how to do so.
SurrogateSelector selector = new SurrogateSelector();
selector.AddSurrogate(typeof(Person), new StreamingContext(StreamingContextStates.All), new SimpleReflectionSurrogate());
bf.SurrogateSelector = selector;
bf.Serialize(ms, p);
ms.Position = 0; // rewind stream
var deserializedObj = (Person)bf.Deserialize(ms);
Console.WriteLine("Deserialized: " + deserializedObj);
}
}
// n.b. Not marked [Serializable]!
class Person
{
// parameterful ctor
public Person(string firstName, string lastName)
{
FirstName = firstName;
LastName = lastName;
}
public string FirstName { get; }
public string LastName { get; }
public override string ToString() => $"{LastName}, {FirstName}";
}
class SimpleReflectionSurrogate : ISerializationSurrogate
{
public void GetObjectData(object obj, SerializationInfo info, StreamingContext context)
{
// walk type hierarchy if necessary
var fieldInfos = obj.GetType().GetFields(BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Instance);
foreach (var fieldInfo in fieldInfos)
{
info.AddValue(fieldInfo.Name, fieldInfo.GetValue(obj));
}
}
public object SetObjectData(object obj, SerializationInfo info, StreamingContext context, ISurrogateSelector selector)
{
// walk type hierarchy if necessary
var fieldInfos = obj.GetType().GetFields(BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Instance);
foreach (var fieldInfo in fieldInfos)
{
fieldInfo.SetValue(obj, info.GetValue(fieldInfo.Name, fieldInfo.FieldType));
}
return obj;
}
}
}
As someone who has spent much of their career looking at serialization, I'm horrified by trivialising it; the type author should absolutely have a say in how it is serialized. If you want to serialize someone else's hostile type: stop - create your own type, and serialize that instead!

As for your wishlist: I could offer you everything there except "graphs" - I'm actively trying to kill the reference-tracking support I already have in my serialization lib.
@eiriktsarpalis - well, you asked :-):
@GrabYourPitchforks - many thanks, Levi! I'm very happy with the warnings in the code. I expect to trip over a significant number of gotchas in use that will need ironing out; but at least it's possible to iron them out now. @mgravell - I know the feeling. I started looking at general object serialization for C in the late '80s using (of all things) Sun's XDR libraries, as part of a project to ship object graphs between processes across TCP, and have done quite a bit of work on it since, though by no means most of my career! I don't like the idea of taking over serialization of types without the co-operation of the type author, but consider my options here:
As with almost all technology development, my job is to maximise the total benefit to my client of a solution I create versus the effort involved in its creation. "Benefit" is quite broadly drawn here, including things like reliability, security, and performance as well as pure functionality; and note that different projects have different trade-offs there. Given the options of a) not doing it, b) duplicating and maintaining my own versions of many already-implemented types, or c) bypassing safeguards on a few types in a known way and analysing and testing the heck out of it, my rational option is c). I'd prefer d) already having serialization support in the types I use, because that would improve my ability to get benefit for my clients out of .NET Core, but it would appear that the .NET Core team won't give me that as an option; so let's go with what's possible.
This plan is now out-of-date and I am closing the issue. Thanks all for the feedback! |
Linking dotnet/designs#141 |
I have used .NET professionally for more than 15 years, and my company has built a multi-million-line codebase which supports our business operations. Binary serialization is a key tool we leverage extensively internally. There is no security concern because (among other reasons) all the development is internal within a small team. The ability to serialize arbitrarily complex graphs with high performance allows us to store and then retrieve intermediate results of our complex models, facilitating validation and incremental work. Please do not remove this; removing existing key functionality that we have built upon for 15 years would be very negatively impactful. Feel free to reach out to me offline for more background.
@dje-dev the design linked above might be a place to share input. |
@danmosemsft - given that the team appears to have ignored all other input in this area that demonstrates that people are using this functionality in real products, is there any chance of @dje-dev being listened to or are they wasting their effort? |
@Ozzard I think it is very likely that we will implement the linked proposal in some form, because of the long and continuing history of highly impactful security issues generated by BinaryFormatter use - even though we appreciate there are contexts where all data is trusted. This one is only the most recent. Feedback on the process, timeframe and approach are certainly welcome. I know that we need to provide more guidance on other serialization options and @GrabYourPitchforks is working on that. |
It's also worth noting that the proposal relates to .NET Core 5+ (really, 6+). .NET Framework applications are not affected and .NET Framework will be supported a very long time. https://github.com/dotnet/designs/pull/141/files#diff-9324781e4d7a839a358bc384f65a8327R7 |
@danmosemsft , I think there is a confusion between serialization for communication and for saving/restoring the state of an object graph. The security problem is caused by BinaryFormatter supporting Polymorphism, which is a necessary requirement for saving/restoring the state of object graphs. While it is right banning polymorphic serializers from communication protocols, since both parts must agree on the objects to exchange, by banning them in general, we de facto prevent saving/restoring an object graph that represents the state of an ongoing computation, which is unacceptable! Applications, of saving/restoring the state of an on-going computation, include but are not limited to:
While I can accept the removal of the BinaryFormatter, I can't accept the general principle of banning polymorphic serializers. While reading, the long term plan for BinaryFormatter removal it appears that the whole infrastructure will be removed including attributes, interfaces,....to prevent anyone else from providing an alternative. Is possible to ask, at least, that the attribute infrastructure is not removed? This way people that are using BinaryFormatter for saving/restore can provide a custom alternative WITHOUT losing tonnes of code decorated with those attributes? |
If it's not removed but not supported either, there is a high risk that it erodes over time. Newly added classes won't be annotated correctly. Previously annotated classes will be changed in incompatible ways. If you keep the annotations, you need to keep testing and supporting them for their purpose. (Just stating this as a fact, not as an argument for or against removing things.) |
@weltkante, |
We've known for a long while that `BinaryFormatter` is a dangerous type and that it's responsible for many of the security vulnerabilities real-world .NET applications have seen in recent years. In .NET Core 1.x, the type was removed from the product entirely, and it was brought back in .NET Core 2.0 under the guise of "app compat".

Though as of the time of this writing there are no known types in .NET Core which can be leveraged by `BinaryFormatter` to achieve RCE, it is impractical to make this guarantee going forward. As new types are introduced (or existing types from the full Framework are re-introduced), I have no doubt that one of them will contain such a vector that we weren't able to catch during development due to the complexity of this system.

Some applications use a custom `SerializationBinder` to mitigate attacks, both on .NET Core and in full Framework. The engineering team's current understanding is that a properly-written `SerializationBinder` is sufficient to block RCE attacks. However, this is not a guarantee, and even so, a custom `SerializationBinder` cannot mitigate other attacks like DoS within the `BinaryFormatter` infrastructure itself.

All of this said, the conclusion is that `BinaryFormatter` is not an appropriate serialization format for application developers to be using in a cloud-connected world. It was a mistake for us to bring it back in .NET Core 2.0, as re-introducing it into the shared framework gives the type an undeserved air of respectability, as if we're nodding and implicitly approving its usage by modern applications.

To address this, and to eventually migrate applications off of `BinaryFormatter`, I propose the following steps. These steps will only affect customers who target .NET 5 or above for their applications.

1. Remove `BinaryFormatter` from the shared framework and move it into its own standalone package that must be manually retrieved and installed into the application.
2. Mark the `BinaryFormatter` type itself with `[Obsolete]`, or include in the package a code analyzer which warns on the type's usage. Force developers to explicitly acknowledge that they're calling an extremely dangerous API and that they must be certain the data is coming from a trusted location.
3. Document the `BinaryFormatter` code base as legacy-only. Clearly communicate that Microsoft is making no further investments in this code base, and that all bug reports or feature requests will be immediately closed.
4. Create a separate "safe" (type-limiting, bounds-checking, etc.) serializer whose payload format is `BinaryFormatter`-compatible and which can be used to deserialize existing payloads in a safe fashion, even if those payloads originate from an untrusted source. The API surface will look different than the traditional `BinaryFormatter.Deserialize` call due to needing to configure the serializer on a per-call basis, but at minimum it gives applications a way to read existing payloads. This allows them to migrate their data without leaving them hanging. This serializer could be in the same package that `BinaryFormatter` is in.

These steps would serve to modernize the runtime and SDK by eliminating a type which has no place in today's development cycle, serve as a forcing function to get application developers to fix potential vulnerabilities in their code, and provide, on an opt-in basis, a way for applications to continue to work if they really must continue using the type going forward.