Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SecureString obsoletions and shrouded buffer proposal #147

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
188 changes: 188 additions & 0 deletions accepted/2020/better-obsoletion/securestring-obsoletion.md
@@ -0,0 +1,188 @@
# SecureString obsoletions and shrouded buffer proposal

Author: @GrabYourPitchforks

## SecureString obsoletion proposal

Obsolete the following APIs as warning, following the [better obsoletions](https://github.com/dotnet/designs/blob/master/accepted/2020/better-obsoletion/better-obsoletion.md) spec:

- All `System.Security.SecureString` [ctor overloads](https://docs.microsoft.com/dotnet/api/system.security.securestring.-ctor)
- All `System.Net.NetworkCredential` [ctor overloads](https://docs.microsoft.com/dotnet/api/system.net.networkcredential.-ctor) which accept SecureString as a parameter
- [`System.Net.NetworkCredential.SecurePassword`](https://docs.microsoft.com/dotnet/api/system.net.networkcredential.securepassword)
- All [`System.Diagnostics.Process.Start`](https://docs.microsoft.com/dotnet/api/system.diagnostics.process.start) overloads which accept SecureString as a parameter
- [`System.Diagnostics.ProcessStartInfo.Password`](https://docs.microsoft.com/dotnet/api/system.diagnostics.processstartinfo.password)
- All overloads of the following `System.Security.Cryptography.X509Certificates.X509Certificate` APIs which accept SecureString as a parameter:
- [The ctor](https://docs.microsoft.com/dotnet/api/system.security.cryptography.x509certificates.x509certificate.-ctor)
- [`Export`](https://docs.microsoft.com/dotnet/api/system.security.cryptography.x509certificates.x509certificate.export)
- [`Import`](https://docs.microsoft.com/dotnet/api/system.security.cryptography.x509certificates.x509certificate.import)
- Any equivalent ctor or method on [`X509Certificate2`](https://docs.microsoft.com/dotnet/api/system.security.cryptography.x509certificates.x509certificate2)
Comment on lines +9 to +18
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't call this out explicitly, but it looks like all these APIs have string equivalents (Process.Start doesn't, but one can construct ProcessStartupInfo and set PasswordClearText). I assume the (implicit) goal is that we'll ensure that the scenario (passing a credential) can still be achieved without using SecureString, correct?


> See later in this document for the list of APIs which would be preferred instead.

This does not change the user-observable runtime behavior of the obsoleted APIs. Any existing code compiled against these call sites will continue to work.

The above only proposes to obsolete the SecureString constructor and some specific receivers. The intent is that if an application does wish to continue using SecureString as an exchange type or for interop or manipulation purposes, we don't want to force them to sprinkle a bunch of unnecessary suppressions throughout the code base. Suppressing just the small handful of instances where these types are created or consumed should be enough for those developers to signal that they're comfortable using the type.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suppressing just the small handful of instances where these types are created or consumed should be enough for those developers to signal that they're comfortable using the type.

Assuming we use a single diagnostic ID for all secure string related obsoletions they wouldn't have to sprinkle anything -- they could just globally turn this particular obsoletion off.


## Background \& motivation

The specter of removing SecureString has been a topic of debate for some time. It first came up during [the .NET Core 1.x API discussions](https://github.com/dotnet/apireviews/blob/ac765393a7281184ec30948b7a6d86b7ad58af1c/2015/07-14-securestring/README.md). At that time introduction of the SecureString type was rejected. It was later added during the .NET Core 2.0 push to migrate a wider set of existing .NET Framework APIs. And more recently there has been a [desire to move people away](https://github.com/dotnet/runtime/issues/30612) from the type given that [it cannot fulfill the security promises](https://github.com/dotnet/platform-compat/blob/master/docs/DE0001.md) it guarantees.

We want users to be able to rely on the contracted guarantees of .NET cross-platform. This means that the framework should discourage the use of APIs which users believe may incorrectly fulfill a compliance requirement for them. `CryptProtectMemory`, which the Windows implementation of SecureString relies upon, [is even documented as inappropriate](https://docs.microsoft.com/windows/win32/api/dpapi/nf-dpapi-cryptprotectmemory) for the scenario that SecureString uses it for.

## Today's use cases for SecureString

> This list is largely drawn from the feedback at https://github.com/dotnet/runtime/issues/30612 plus our analysis of internal consumers.
GrabYourPitchforks marked this conversation as resolved.
Show resolved Hide resolved

### Storing credentials
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm hardly an expert in this domain but I've built enough web sites and console apps for cloud services now that I see emerging patterns: tokens and OAuth. Both rely on secrets being passed to APIs and both tend to be strings coming from configuration or environment variables, both of which tend to be strings. NetworkCredential could be used (because both userName and password are nullable) but that seems like a clutch to me. Also, it would suffer the same problem as SecureString in that the origin is already a string (set of strings). The concern here isn't so much security/clear text data in memory, it's more the exchange type problem. If the origin is a string, people are super likely to use that as the exchange type and everything else becomes annoying. See Uri. We end up in a world where a ton of APIs are either only taking string, only taking Uri, and some both. This forces users to frequently convert back and forth which is annoying. And that was even an area where we had tooling early on to push people towards Uri (FxCop).

So without covering entry points and having good analyzers/fixers, promoting a type as the exchange mechanism (whether or not the type already exists) will be hard. Also, because string is so pervasively used today, we'll have to make it easy to convert back and forth between whatever exchange type we'll want and string. But this also means that the other benefits, such as avoiding the data in crash dumps, will be harder to achieve. I still believe it's doable but we should think of a strategy to get there and pair it with analyzers and code fixers to help folks to get better over time, by, for example suggesting to not convert a <the exchange type> to a string if the receiver also accepts <the exchange type>.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH I expect SensitiveData<byte> to be used more frequently than SensitiveData<char>. This matches aspnet's existing Secret class today, whose purpose is to hold cryptographic key material. There is sample code for converting a SensitiveData<T> back into a T[] or a string (see line 94 of this doc). Is that code sufficient, or do you expect there will need to be first-class APIs for this?


Some users may use SecureString as a convenient way to store credentials for a remote server, such as a SQL server. This allows them to pass around a "credentials object" that has somewhat more foolproof usage patterns than handing around a simple string.

For these scenarios we should strongly encourage users to use passwordless mechanisms, such as using X.509 certificates or platform-specific credential management. These provide better, more modern security guarantees than typical username + password combinations.

For scenarios which continue to use username + password, we should encourage users to pass around a `NetworkCredential` object instead. The NetworkCredential class abstracts away how the password is represented in memory: it can be a string, a legacy SecureString, or any future mechanism we create. Encouraging users to pass around this container rather than a raw SecureString instance allows their code to take advantage of future improvements we make to this space, even while they continue to use username + password credentials.

### Preventing secrets from getting into crash dumps

Users might rely on the fact that the SecureString backing memory is enciphered (only on Windows) to prevent secrets from getting into crash dumps. This is not ideal for a variety of reasons. The secret won't be protected before it's converted into a SecureString or after it's converted from a SecureString (such as during a p/invoke call). `SocketsHttpHandler` [converts SecureString instances to normal strings](https://github.com/dotnet/runtime/blob/0e23ef47ff8a2de6a5b5898a849ba00303aba8df/src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/AuthenticationHelper.cs#L176-L185) as part of its operation. WPF's `PasswordBox` control [contains contorted code](https://github.com/dotnet/wpf/blob/53812c590c5f140297ed96ee2a78d05dffa7be35/src/Microsoft.DotNet.Wpf/src/PresentationFramework/System/Windows/Controls/PasswordTextContainer.cs#L74-L97) to manipulate SecureString instances from string-based input. `SqlClient` [copies SecureString data into an unpinned buffer](https://github.com/dotnet/SqlClient/blob/ead806e87a15be9f54c99db2ebe8918025c20fc9/src/Microsoft.Data.SqlClient/netfx/src/Microsoft/Data/Interop/SNINativeMethodWrapper.cs#L923) during manipulation. In all of these cases the secret has a high likelihood of ending up in the memory dump.

That secrets originate as strings deserves a bit more attention. The secret almost certainly entered the application via an HTTP request, or it was read from config, or it was entered into a WinForms textbox. These copies of the secrets are still floating around in memory (potentially even in a shared pool!), and SecureString does nothing to mitigate their being leaked.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably this (that you don't control all the buffers) is also a potential criticism of the proposal to have GC reliably zero memory (dotnet/runtime#10480) ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. Having the GC eagerly zero memory relies on arrays not being pooled, etc. Documents that discuss hypothetical "high security" modes for the runtime often hit on this point as well, saying that a good end-to-end solution also involves adding Kestrel flags which prohibit pooled array usage, etc.

Copy link
Member

@danmoseley danmoseley Apr 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Such a mode should probably also zero memory going back into the array pool, as well as zeroing it before handing out the first time. That presumably would be much cheaper than not pooling at all. Maybe we should have an issue for that - the closest is dotnet/runtime#7532

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That presumably would be much cheaper than not pooling at all.

Not as secure, though. The "avoid the pool" flag would be opt-in for applications which want better guarantees about avoiding potential use-after-free issues. Zeroing arrays when they're returned to the pool will not avoid this issue, unfortunately. :(


The only reliable mechanism for preventing secrets from getting into crash dumps is to disable crash dumps for applications which process sensitive data. For web servers, this might mean separating out the "sensitive" parts of the application (login, payment processing, etc.) into a separate service running in a separate secured environment. The remaining aspects of the web service can then run with normal crash dump diagnostics provided they never have access to the plaintext form of the sensitive data.
GrabYourPitchforks marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Member

@terrajobst terrajobst Apr 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should expand the list of secrets a bit more and also list how much we can/should protect them. For example, what about tokens? OAuth client IDs/secrets? Private keys for signing? Certs? Passwords are just one class of secrets.

Practically, I'm not sure I know how I can extract the secrets processing in my apps -- ultimately I need to send a token in every request. I guess that would mean that we don't consider a token a secret because it's short lived? Again, I'm not an expert so maybe that's obvious to experts, but we might want to spell this out explicitly to make it clear what the scope is we're trying to solve here.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. What is secret is context specific and is up to the reader to define as it's domain knowledge. If we list things people will go "It's not on the list so it's fine"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point isn't to be exhaustive. The point is to enumerate common secrets to ensure we have sensible end-to-end across the APIs/technologies that need we need to (1) get them into the process, (2) passing them around and (3) handle them to the correct receiver.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Experience shows that's not how it will be read. It will become a check list, in an audit, just like SecureString use is now. I don't want to see that happen.

To take your token example, it's still context sensitive, based on what the token is, what it's sent over, how it's created, and if it's backed by token binding. Simply saying token is not enough, and to be enough you have to be exhaustive, and suddenly the paragraph becomes bloated and unwieldy, and will take constant maintenance as new secrets come along.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can provide an example showing something Immo mentioned in another fork: how do I read from the environment variables or config and populate an instance of this type? It's also a good opportunity for us to show very clearly "it's ok to materialize a temporary array or string; we're not trying to prevent this." It doesn't really answer the question of "what is a secret?", but TBH that question isn't really important. What is important is how to use this as an exchange type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably it's already best practice to secure access to the disk any dumps may be written to, and attempting to keep data out of dumps is defense in depth? Or, is securing access to that disk not possible to do reliably?
I guess the same question could be asked for logging.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Best practice is to turn dumps off entirely.


### Minimizing accidental exposure of sensitive data

Sometimes the goal is not to provide a foolproof mechanism for keeping data out of crash dumps. Rather, the goal might be to reduce the window in which secrets are available. One side effect of this is that unsafe-equivalent code (such as use of the uninitialized `ArrayPool<T>.Shared`) has less chance to inadvertently leak these secrets.

One way to accomplish this is by using .NET 5.0's new [pinned object heap](https://github.com/dotnet/runtime/pull/32283). Arrays allocated on the heap are guaranteed not to be moved during a GC, so applications don't need to worry about sensitive contents being copied before the application logic has had a chance to clear the memory. This has the benefit that an application can treat the data like a normal array.

Another way is to invest in [a proposed runtime feature](https://github.com/dotnet/runtime/issues/10480) which would make the GC more aggressive about zeroing memory on compaction. This has the benefit that _many_ different types – including standard strings – would become candidates for having their data zeroed out during a compaction phase. High-sensitivity applications may wish to opt in to this GC mode anyway depending on their risk analysis.

### Minimizing misuse of sensitive data

Here, the goal isn't to hide secrets from memory, but to signal to developers "this data is sensitive and should be handled cautiously." In other words, the goal is to minimize the risk of a developer _accidentally_ using the data in an insecure fashion, such as writing it to a log file. SecureString actually does a fine job of this by virtue of the fact that callers must drop down to unsafe code or use the `Marshal` class to interact with it.

There is opportunity for .NET to provide a better type for this scenario, described in more detail later.

### Using it because it has "secure" in the name

Based both on internal and external conversations, there seems to be a significant audience using the SecureString type because it has "secure" in its name. I have seen many code bases which have contorted themselves into using SecureString in as many places as possible in order to fulfill of a well-intentioned but ill-executed "security compliance" edict. Since this sort of maneuvering generally involves unsafe code, this often ends up _introducing_ security bugs. (One partner was even using unsafe code to zero-out string instances after converting them to SecureString.)

During these conversations it's apparent that they're not attempting to mitigate any specific threats. They're instead using SecureString because it's listed as a requirement in their engineering processes. These users would be best served by using the existing string class or by using the `NetworkCredential` wrapper as suggested earlier.

## A proposal for shrouding data

Of all the scenarios mentioned above, the one that strikes me as most interesting is the use of SecureString as a marker class to mean "you should be careful when using this data." It's possible to extend this into a more generalized proposal that can serve as a wrapper for arbitary types _T_, so it can be used to shroud both `byte` and `char`.

```cs
namespace System.Runtime.InteropServices
{
public class ShroudedBuffer<T> : IDisposable where T : unmanaged
GrabYourPitchforks marked this conversation as resolved.
Show resolved Hide resolved
{
public ShroudedBuffer(ReadOnlySpan<T> contents);
public int Length { get; }
public void CopyTo(Span<T> destination);
public void Dispose();
protected virtual void Dispose(bool disposing);
}
}
```

In this proposal, "shrouded" means that the buffer is nominally hidden from view, but it's not guaranteed to be enciphered or otherwise protected from inspection. How the data is stored behind the scenes for any given implementation can vary. For our built-in implementation, we protect the backing buffer to a reasonable degree from being copied during GC, and the `Dispose` method will attempt to zero out the buffer. (Whether the backing memory is managed or native is an implementation detail.) The contents are copied by the ctor, and the buffer is immutable. The caller can extract the contents into a destination buffer of their own choosing.

The API surface clearly delineates buffers for which the caller maintains responsibility (the `ctor` and `CopyTo` methods). If the caller wishes that the plaintext data is never copied during a GC, then the destination buffer they pass to `CopyTo` should be allocated from the pinned object heap, the stack, or the native memory pool. The API surface gives them full control over that process.

```cs
// Example showing unshrouding a ShroudedBuffer<char> into a temp char[].
ShroudedBuffer<char> shroudedBuffer = GetShroudedBuffer();
char[] tempArray = GC.AllocateArray<char>(shroudedBuffer.Length, pinned: true);
try
{
shroudedBuffer.CopyTo(tempArray);
// use 'tempArray' here
}
finally
{
Array.Clear(tempArray);
}

// Example showing unshrouding a ShroudedBuffer<char> into a string.
ShroudedBuffer<char> shroudedBuffer = GetShroudedBuffer();
string unshroudedString = string.Create(
shroudedBuffer.Length, shroudedBuffer, (span, shroudedBuffer) =>
{
shroudedBuffer.CopyTo(span);
});
```

We would clearly document that this type is not intended to offer cryptographic protection of the buffer contents; it's simply intended to enforce more hygienic usage of data by callers.

The type also intentionally does not give callers access to the raw backing span. This is because how the backing data is stored is strictly an implementation detail (it might not even be in-proc!), and exposing the span would leak that detail into the public API surface. The `CopyTo` method is the only way to get the raw contents. This also means that such buffers are immutable once created.

__We should take care not to position this "shrouded buffer" type as a security feature.__ It's intended solely for code hygiene purposes.

This proposal intentionally excludes `ToString`, `ToArray`, or other methods which might unshroud the buffer. `ToString` is called frequently by library code as part of logging, and we always want unshrouding to be a _deliberate_ action. Additionally, we want the caller to maintain full control over the unshrouded buffer where the contents are written, which precludes "opinionated" APIs like `ToArray` which create such buffers themselves.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I buy that ToString() is called by logging and other code paths that don't need to unshroud the buffer; but to me this doesn't mean we can (or shouldn't) provide a method like UnshroudToString() or UnshroudToArray(). Otherwise it will be hard to insert the type into existing code bases where virtually everyone uses strings today. The benefit of using a well-known method like UnshroudToString() is that it can be audited. If everyone just does the string.Create trick it's harder to audit (and harder to tool with an analyzer).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm I can see both sides of that argument :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Providing a well-known method would be fine in spirit IMO. The point of this type is to prevent accidental materialization of the string / array. If somebody is intentionally doing it, who are we to stop them?

Though to be honest the reason I didn't add such methods in the first place is that I assumed people would be nitpicky about how they wanted the information to be materialized. "I want it in a pinned array!" or whatever. CopyTo is intended to be a compromise which allows the caller to allocate the buffer themselves, then the secret is copied into that buffer. So CopyTo becomes the auditable method call. But if we need a simple accelerator, we could add one.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd much rather not. A single CopyTo is auditable, adding more methods means more noise, and more chance for a bad audit to miss things. Keeping it to a single method is better in terms of auditing, there's less things to miss.


The proposal also intentionally leaves the type unsealed but provides no virtual methods other than `Dispose`. This is because all aspects of the implementation must be free to change between .NET versions. If there is a desire to modify the implemenation itself, we could consider an interface `IShroudedBuffer<T>` to accompany the concrete type.
GrabYourPitchforks marked this conversation as resolved.
Show resolved Hide resolved

## New proposed SecureString-replacement APIs

For the X.509 APIs, we should expose APIs which take the password as a `ROS<char>` or a `ROS<byte>` directly. This gives the caller full control over what buffer the data is being read from, allowing them to put the password into a pinned buffer if they so desire.
GrabYourPitchforks marked this conversation as resolved.
Show resolved Hide resolved

```cs
namespace System.Security.Cryptography.X509Certificates
{
public class X509Certificate
{
public X509Certificate(byte[] rawData, ReadOnlySpan<char> password, X509KeyStorageFlags keyStorageFlags);
public virtual void Import(byte[] rawData, ReadOnlySpan<char> password, X509KeyStorageFlags keyStorageFlags);
/* ... */
}
}
```

For `Process.Start` and `ProcessStartInfo` we should expose overloads which take a `NetworkCredential`. These overloads would only work on Windows.
GrabYourPitchforks marked this conversation as resolved.
Show resolved Hide resolved

```cs
namespace System.Diagnostics
{
public class Process
{
public static Process Start(string fileName, NetworkCredential credential);
}

public class ProcessStartInfo
{
public NetworkCredential Credentials { get; set; }
}
}
```

For `NetworkCredential` we should expose a property to accept a shrouded password.

```cs
namespace System.Net
{
public class NetworkCredential
{
public IShroudedBuffer<char> ShroudedPassword { get; set; }
}
}
```

Finally, there should be a conversion mechanism between `SecureString` and `ShroudedBuffer<char>`. This should help transition developers who currently have SecureString instances to APIs which accept shrouded buffers.

```cs
namespace System.Security
{
public sealed class SecureString
{
public ShroudedBuffer<char> ToShroudedBuffer();
}
}
```

## Suggested behavioral changes to SecureString for .NET 6

If we implement the shrouded buffer concept in .NET 6, I suggest we rewrite `SecureString` in terms of it. This will provide a uniform implementation across both Windows and non-Windows: the data is intentionally difficult to get to but is not enciphered on any OS.

There is already [a working prototype](https://github.com/GrabYourPitchforks/runtime/pull/5) of this rewrite of SecureString. There is no user-visible behavioral change to the SecureString APIs after this rewrite.