New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Sema] Warn on unsound pointer conversions #20467
[Sema] Warn on unsound pointer conversions #20467
Conversation
WARNING(cannot_convert_non_ephemeral_warning,none, | ||
"passing temporary pointer argument of type %0 to parameter expecting " | ||
"pointer that outlives the duration of the call leads to undefined " | ||
"behaviour; this will be an error in a future release", (Type)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This diagnostic feels a bit too wordy, but I can't think of how to improve it – any ideas @jrose-apple?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, well, we could remove that "u" from "behavior". :-)
warning: temporary pointer will be destroyed after call to %0
Okay, that sounds sufficiently scary, but the problem is that the user probably doesn't think they've written a pointer. They wrote an inout expression or an array or string.
error: cannot pass array to %0; argument %1 must be a pointer that outlives the call to %0
error: cannot pass string to %0; argument %1 must be a pointer that outlives the call to %0
error: cannot use inout expression here; argument %1 must be a pointer that outlives the call to %0
How does that sound? And then we get the warning form for pre-Swift-5 (which we maybe take away completely in the future):
warning: passing array to %0, but argument %1 should be a pointer that outlives the call to %0
warning: passing string to %0, but argument %1 should be a pointer that outlives the call to %0
warning: inout expression creates a temporary pointer, but argument %1 should be a pointer that outlives the call to %0
I don't love this but it's a little less wordy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, need to start using a US spell checker :) Thanks for the suggestions, I like them!
IMO it would be nice to use the argument type of the array or string in the diagnostic message, and avoid repeating the name of the callee – so how about the following?
error: cannot pass %0 to parameter; argument must a pointer that outlives the call%select{| to %2}1
warning: passing %0 to parameter, but argument should be a pointer that outlives the call%select{| to %2}1; this will be an error in a future release
error: cannot use inout expression here; argument must a pointer that outlives the call%select{| to %1}0
warning: inout expression creates a temporary pointer, but argument should be a pointer that outlives the call%select{| to %1}0; this will be an error in a future release
For now, I've omitted the placeholder for "argument %1", as I wasn't too sure what it would be filled in with – would it be the argument position or the text of the argument expr?
I've gone ahead and pushed a commit with the message changes so far.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: "must a pointer".
The reason I had the name of the callee twice was that "to parameter" on its own sounds weird to me. The "argument %1" part was the position in my head, to make it slightly easier to tell what the diagnostic is referring to when just looking at the line and not the column or highlight.
I feel like the "this will be an error in a future release" isn't really adding that much value either, especially for a message that's already fairly long.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: "must a pointer"
Oops, good catch!
The reason I had the name of the callee twice was that "to parameter" on its own sounds weird to me.
Interesting! "to parameter" sounds natural to me – nonetheless I'm happy to change it. Though for cases where we don't have a callee decl, should it be "to parameter" or "to call"? The former sounds more natural to me, but the latter is more consistent with the message when we do have a callee.
The "argument %1" part was the position in my head, to make it slightly easier to tell what the diagnostic is referring to when just looking at the line and not the column or highlight.
Thanks for clarifying, I'll go ahead and make that change.
I feel like the "this will be an error in a future release" isn't really adding that much value either, especially for a message that's already fairly long.
Fair enough – I added it because I felt the diagnostic wasn't quite scary enough without it, but given the length of the message, I agree it probably doesn't quite pull its weight. Should I also remove it from the "dangling pointer" diagnostic?
This non-user-facing attribute is used to denote pointer parameters which do not accept pointers produced from temporary pointer conversions such as array-to-pointer, string-to-pointer, and in some cases inout-to-pointer.
Warn on ephemeral conversions that are passed to @_nonEphemeral parameters. This will hopefully transition to an error in a future version.
These include memberwise initializers for pointer properties and enum case constructors for pointer payloads. In both cases, the pointer is being immediately escaped, so don't allow the user to pass a temporary pointer value.
These include the pointer-to-pointer and pointer-to-buffer-pointer initialiser parameters amongst a couple of others, such as `Unmanaged.fromOpaque`, and the source for the `move[...]` family of methods.
…on-ephemeral For example, this enables: ```swift final class C { var foo: Int = 0 } var c = C() let ptr = UnsafePointer(&c.foo) ```
- For a string, we suggest withCString - For an array, we suggest withUnsafe[Mutable][Bytes/BufferPointer] - For an arbitrary inout-to-pointer, we suggest withUnsafe[Mutable][Bytes/Pointer]
bbb9d0c
to
47e8120
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll let @jrose-apple approve this because I don't know Sema well enough to determine if you're missing anything. But I did review the code and tests, and they looks astonishingly thorough to me.
47e8120
to
31afbe8
Compare
31afbe8
to
4e5a2da
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
: diag::cannot_construct_dangling_pointer; | ||
|
||
diags.diagnose(anchor->getLoc(), diagID, instanceTy, constructorKind) | ||
.highlight(anchor->getSourceRange()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the anchor expression always the correct expression? What if the construction is nested inside a larger expression?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understand it, unsimplified argument locators for an apply/subscript are always anchored on the apply/subscript expr that they correspond to.
This is how they're generated in CSApply and CSGen at least:
Lines 2561 to 2564 in af13914
CS.addConstraint(ConstraintKind::ApplicableFunction, | |
FunctionType::get(params, outputTy, extInfo), | |
CS.getType(expr->getFn()), | |
CS.getConstraintLocator(expr, ConstraintLocator::ApplyFunction)); |
Lines 3085 to 3089 in af13914
Expr *visitApplyExpr(ApplyExpr *expr) { | |
return finishApply(expr, cs.getType(expr), | |
ConstraintLocatorBuilder( | |
cs.getConstraintLocator(expr))); | |
} |
lib/Sema/ConstraintSystem.cpp
Outdated
@@ -2417,14 +2404,22 @@ bool ConstraintSystem::isConversionNonEphemeral( | |||
continue; | |||
} | |||
|
|||
// Look through load exprs. These can occur when projecting through a | |||
// property on a class, which we want to support if the class is | |||
// non-final. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is the correct condition. What case is this trying to handle?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is handling the following (which we want to allow):
final class C {
var str = ""
}
struct S {
var c = C()
}
var s = S()
let ptr = UnsafeMutablePointer(&s.c.str)
The base s
is an lvalue, but we want an rvalue C
, so a LoadExpr
is inserted before then performing the member access for str
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we want to allow this. Basically if you load from a class, you don't care what the base was anymore, and then either we allow forming pointers to instance variables or we don't. Being a global doesn't matter at that point. Right, @atrick?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let ptr = UnsafeMutablePointer(&s.c.str)
If s
is a var
, then there's there's nothing keeping c
alive. So escaping the addres of c.str
seems pretty bad.
If s
is a top-level or global let
then c
stays alive. However, technically we don't guarantee that a Swift class object has a persistent address (it may need to for satisfying some runtime constraint, but the language doesn't guarantee it). So I don't think we should allow that either.
Sorry if I missed that in my review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking through diag_non_ephemeral_warning.swift
again, I see these two cases that still don't produce warnings, but probably should:
takesRaw(&topLevelFinalC.storedProperty)
takesRaw(&topLevelFragileFinalC.storedProperty)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let ptr = UnsafeMutablePointer(&s.c.str)
If
s
is avar
, then there's there's nothing keepingc
alive. So escaping the addres ofc.str
seems pretty bad.
@atrick Should c
not be guaranteed to be kept alive as long as s.c
isn't changed to a different reference? IMO this would be reasonable constraint to place on the lifetime of the pointer created from the inout-to-pointer conversion.
If
s
is a top-level or globallet
thenc
stays alive. However, technically we don't guarantee that a Swift class object has a persistent address (it may need to for satisfying some runtime constraint, but the language doesn't guarantee it). So I don't think we should allow that either.
Ah, okay then. I think for now at least I'll revert b6e1e99, with the view of possibly reinstating it if we ever officially make such a guarantee. I somewhat doubt many (if any) people are using this pattern anyway – there was no sign of it in the source compat suite.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@atrick Should c not be guaranteed to be kept alive as long as s.c isn't changed to a different reference? IMO this would be reasonable constraint to place on the lifetime of the pointer created from the inout-to-pointer conversion.
Yes. If s
is a let
, then we have that guarantee. And on second thought, I don't think it will ever be possible to reallocate an object that has a let
reference. But I think @jrose-apple and I agree that it's better to have a simple rule that disallows non-ephemeral pointers to instance variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, okay then. I think for now at least I'll revert b6e1e99,
Wait a sec, that commit also allows non-ephemeral pointers to static
variables. I think that's ok under the circumstances that you've identified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@atrick Thanks for clarifying!
But I think @jrose-apple and I agree that it's better to have a simple rule that disallows non-ephemeral pointers to instance variables.
Does this include non-ephemeral pointers to stored instance properties on global/static variable structures? For example, rejecting:
struct S {
var i = 0
}
var s = S()
let ptr = UnsafeMutablePointer(&s.i)
Wait a sec, that commit also allows non-ephemeral pointers to
static
variables
static
variables were also allowed before the commit through the checking of a base static decl and a metatype base. 8b41b2a adds an additional check for static members in order to only apply the "base must be a mutable struct" rule to stored instance properties.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this include non-ephemeral pointers to stored instance properties on global/static variable structures?
No, stored properties on structs can be pointed to if the struct itself is global/static. We want to disallow pointing to stored properties inside class instances, which I think we were calling "instance variables".
- Add/clarify a couple of comments - Add a couple of tests - Rename test file - Remove extraneous locator simplification logic
4e5a2da
to
abf6d0b
Compare
I'll try to take a look at Sema changes tomorrow. |
|
||
/// Determines the element type of a known Unsafe[Mutable][Raw]BufferPointer | ||
/// variant, or returns null if the type is not a buffer pointer. | ||
Type getAnyBufferPointerElementType(BufferPointerTypeKind &BPTK); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only see one use of this either of new methods and it doesn't use result type, so should this just be a bool
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could be, although IMO it's a more flexible API returning the pointee type. In addition, it nicely mirrors the getAnyPointerElementType
API.
@@ -4854,11 +4862,15 @@ class ParamDecl : public VarDecl { | |||
StringRef StringRepresentation; | |||
}; | |||
|
|||
/// The default value, if any, along with whether this is varargs. | |||
llvm::PointerIntPair<StoredDefaultArgument *, 1> DefaultValueAndIsVariadic; | |||
enum class Flags { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this change very much! It might be worth separating it and merging standalone!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure thing! I've opened a separate PR: #20663
@@ -5454,6 +5459,59 @@ static bool hasCurriedSelf(ConstraintSystem &cs, ConcreteDeclRef callee, | |||
return false; | |||
} | |||
|
|||
static void checkNonEphemeralArgumentConversion( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this logic belongs in the solver itself (when we are matching arguments to parameters, or later when restricted constraint is being simplified), we should be able to create a ConstraintFix
which is going to diagnose as a warning on an error depending on swift version.
That would involve a bit of work to distinguish between fatal and warning fixes, but ultimately that's the right approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea! I am already using constraint fixes for the error logic, but I originally wasn't sure whether they'd also be suitable for the warning logic. Looking at it again, I believe it should be fairly straightforward to adapt to handle warnings. I'll work on making that change.
Any ideas on how non-ephemeral-ness should be modelled in the constraint system? In my WIP implementation, I'm currently introducing two new constraint kinds, but that doesn't seem terribly ideal, and certainly won't scale well if we ever introduce another parameter flag that affects conversions.
Thanks for the review!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would make most sense to make it aTypeMatchFlag
. Maybe some constraints should just carry type match options which could be deduced by matchCallArguments
? Or otherwise it has to be a "constraint restriction"...
|
||
/// Emits a note explaining to the user that an ephemeral conversion is only | ||
/// valid for the duration of the call, and suggests an alternative to use. | ||
static void emitIllegalEphemeralConversionNotes( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my previous comment about diagnosing this as a ConstraintFix
- this logic could to be a part of FailureDiagnostic
associated with such fix. I'm trying to move diagnostics away from type-checking sub-expressions and instead form solutions with fixes and diagnose them.
Sorry for taking so long to get back to this! I have opened an updated PR here: #27695. |
In this PR is the logic to emit a warning on unsound array-to-pointer, string-to-pointer, and inout-to-pointer conversions. This is done through the introduction of an underscored attribute
@_nonEphemeral
which denotes a pointer parameter which cannot accept pointer conversions that are only valid for the duration of the call they're passed to.This includes all array-to-pointer and string-to-pointer conversions, and most inout-to-pointer conversions with the exception of:
Various pointer parameters in the stdlib have been marked as
@_nonEphemeral
, primarily pointer parameters for initialisers on pointer types. In addition, this PR infers@_nonEphemeral
on pointer parameters for memberwise initialisers and enum case constructors.Essentially, this is the warning logic of #20070 plus:
The allowing of stored properties on final class global/static vars to be treated as non-ephemeral.withUnsafe[Mutable][Bytes/Pointer]
, or for arrays,withUnsafe[Mutable][Bytes/BufferPointer]
, or for strings,withCString
.@_nonEphemeral
for enum case constructors.If possible, I'd like to get this into Swift 5, allowing us to transition to an error for Swift 6. I do appreciate how late we are into the release cycle for such a (relatively big) change though.
I'm unsure if this PR requires Swift evolution discussion – I'm happy to start up a thread if it's felt that it does.
Resolves SR-2790.