Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Sema] Warn on unsound pointer conversions #20467

Closed

Conversation

hamishknight
Copy link
Collaborator

@hamishknight hamishknight commented Nov 9, 2018

In this PR is the logic to emit a warning on unsound array-to-pointer, string-to-pointer, and inout-to-pointer conversions. This is done through the introduction of an underscored attribute @_nonEphemeral which denotes a pointer parameter which cannot accept pointer conversions that are only valid for the duration of the call they're passed to.

This includes all array-to-pointer and string-to-pointer conversions, and most inout-to-pointer conversions with the exception of:

  • Global and static variables
  • Stored properties on such variables that are accessed directly (e.g excluding ones with property observers, ones on non-final classes and resilient types in general)
  • Force unwraps of any of the above

Various pointer parameters in the stdlib have been marked as @_nonEphemeral, primarily pointer parameters for initialisers on pointer types. In addition, this PR infers @_nonEphemeral on pointer parameters for memberwise initialisers and enum case constructors.

Essentially, this is the warning logic of #20070 plus:

  • The allowing of stored properties on final class global/static vars to be treated as non-ephemeral.
  • An additional note that suggests that the user use withUnsafe[Mutable][Bytes/Pointer], or for arrays, withUnsafe[Mutable][Bytes/BufferPointer], or for strings, withCString.
  • The inferring of @_nonEphemeral for enum case constructors.
  • More tests.

If possible, I'd like to get this into Swift 5, allowing us to transition to an error for Swift 6. I do appreciate how late we are into the release cycle for such a (relatively big) change though.

I'm unsure if this PR requires Swift evolution discussion – I'm happy to start up a thread if it's felt that it does.

Resolves SR-2790.

WARNING(cannot_convert_non_ephemeral_warning,none,
"passing temporary pointer argument of type %0 to parameter expecting "
"pointer that outlives the duration of the call leads to undefined "
"behaviour; this will be an error in a future release", (Type))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This diagnostic feels a bit too wordy, but I can't think of how to improve it – any ideas @jrose-apple?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, well, we could remove that "u" from "behavior". :-)

warning: temporary pointer will be destroyed after call to %0

Okay, that sounds sufficiently scary, but the problem is that the user probably doesn't think they've written a pointer. They wrote an inout expression or an array or string.

error: cannot pass array to %0; argument %1 must be a pointer that outlives the call to %0
error: cannot pass string to %0; argument %1 must be a pointer that outlives the call to %0
error: cannot use inout expression here; argument %1 must be a pointer that outlives the call to %0

How does that sound? And then we get the warning form for pre-Swift-5 (which we maybe take away completely in the future):

warning: passing array to %0, but argument %1 should be a pointer that outlives the call to %0
warning: passing string to %0, but argument %1 should be a pointer that outlives the call to %0
warning: inout expression creates a temporary pointer, but argument %1 should be a pointer that outlives the call to %0

I don't love this but it's a little less wordy?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, need to start using a US spell checker :) Thanks for the suggestions, I like them!

IMO it would be nice to use the argument type of the array or string in the diagnostic message, and avoid repeating the name of the callee – so how about the following?

error: cannot pass %0 to parameter; argument must a pointer that outlives the call%select{| to %2}1

warning: passing %0 to parameter, but argument should be a pointer that outlives the call%select{| to %2}1; this will be an error in a future release

error: cannot use inout expression here; argument must a pointer that outlives the call%select{| to %1}0

warning: inout expression creates a temporary pointer, but argument should be a pointer that outlives the call%select{| to %1}0; this will be an error in a future release

For now, I've omitted the placeholder for "argument %1", as I wasn't too sure what it would be filled in with – would it be the argument position or the text of the argument expr?

I've gone ahead and pushed a commit with the message changes so far.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: "must a pointer".

The reason I had the name of the callee twice was that "to parameter" on its own sounds weird to me. The "argument %1" part was the position in my head, to make it slightly easier to tell what the diagnostic is referring to when just looking at the line and not the column or highlight.

I feel like the "this will be an error in a future release" isn't really adding that much value either, especially for a message that's already fairly long.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: "must a pointer"

Oops, good catch!

The reason I had the name of the callee twice was that "to parameter" on its own sounds weird to me.

Interesting! "to parameter" sounds natural to me – nonetheless I'm happy to change it. Though for cases where we don't have a callee decl, should it be "to parameter" or "to call"? The former sounds more natural to me, but the latter is more consistent with the message when we do have a callee.

The "argument %1" part was the position in my head, to make it slightly easier to tell what the diagnostic is referring to when just looking at the line and not the column or highlight.

Thanks for clarifying, I'll go ahead and make that change.

I feel like the "this will be an error in a future release" isn't really adding that much value either, especially for a message that's already fairly long.

Fair enough – I added it because I felt the diagnostic wasn't quite scary enough without it, but given the length of the message, I agree it probably doesn't quite pull its weight. Should I also remove it from the "dangling pointer" diagnostic?

This non-user-facing attribute is used to denote pointer parameters which do not accept pointers produced from temporary pointer conversions such as array-to-pointer, string-to-pointer, and in some cases inout-to-pointer.
Warn on ephemeral conversions that are passed to @_nonEphemeral parameters. This will hopefully transition to an error in a future version.
These include memberwise initializers for pointer properties and enum case constructors for pointer payloads. In both cases, the pointer is being immediately escaped, so don't allow the user to pass a temporary pointer value.
These include the pointer-to-pointer and pointer-to-buffer-pointer initialiser parameters amongst a couple of others, such as `Unmanaged.fromOpaque`, and the source for the `move[...]` family of methods.
…on-ephemeral

For example, this enables:

```swift
final class C {
  var foo: Int = 0
}
var c = C()
let ptr = UnsafePointer(&c.foo)
```
- For a string, we suggest withCString
- For an array, we suggest withUnsafe[Mutable][Bytes/BufferPointer]
- For an arbitrary inout-to-pointer, we suggest withUnsafe[Mutable][Bytes/Pointer]
Copy link
Member

@atrick atrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll let @jrose-apple approve this because I don't know Sema well enough to determine if you're missing anything. But I did review the code and tests, and they looks astonishingly thorough to me.

include/swift/AST/DiagnosticsSema.def Outdated Show resolved Hide resolved
lib/Sema/ConstraintSystem.cpp Show resolved Hide resolved
test/Sema/diag_non_ephemeral_swift4.swift Outdated Show resolved Hide resolved
Copy link
Contributor

@jrose-apple jrose-apple left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot here so I might have still missed something, but this is what I noticed reading through the whole thing. @xedin or @rudkx should probably comment on the ConstraintSystem parts.

test/attr/attributes.swift Show resolved Hide resolved
lib/Sema/ConstraintSystem.cpp Show resolved Hide resolved
: diag::cannot_construct_dangling_pointer;

diags.diagnose(anchor->getLoc(), diagID, instanceTy, constructorKind)
.highlight(anchor->getSourceRange());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the anchor expression always the correct expression? What if the construction is nested inside a larger expression?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand it, unsimplified argument locators for an apply/subscript are always anchored on the apply/subscript expr that they correspond to.

This is how they're generated in CSApply and CSGen at least:

swift/lib/Sema/CSGen.cpp

Lines 2561 to 2564 in af13914

CS.addConstraint(ConstraintKind::ApplicableFunction,
FunctionType::get(params, outputTy, extInfo),
CS.getType(expr->getFn()),
CS.getConstraintLocator(expr, ConstraintLocator::ApplyFunction));

swift/lib/Sema/CSApply.cpp

Lines 3085 to 3089 in af13914

Expr *visitApplyExpr(ApplyExpr *expr) {
return finishApply(expr, cs.getType(expr),
ConstraintLocatorBuilder(
cs.getConstraintLocator(expr)));
}

lib/Sema/CSDiag.cpp Outdated Show resolved Hide resolved
lib/Sema/TypeCheckPattern.cpp Show resolved Hide resolved
@@ -2417,14 +2404,22 @@ bool ConstraintSystem::isConversionNonEphemeral(
continue;
}

// Look through load exprs. These can occur when projecting through a
// property on a class, which we want to support if the class is
// non-final.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is the correct condition. What case is this trying to handle?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is handling the following (which we want to allow):

final class C {
  var str = ""
}

struct S {
  var c = C()
}

var s = S()
let ptr = UnsafeMutablePointer(&s.c.str)

The base s is an lvalue, but we want an rvalue C, so a LoadExpr is inserted before then performing the member access for str.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to allow this. Basically if you load from a class, you don't care what the base was anymore, and then either we allow forming pointers to instance variables or we don't. Being a global doesn't matter at that point. Right, @atrick?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let ptr = UnsafeMutablePointer(&s.c.str)

If s is a var, then there's there's nothing keeping c alive. So escaping the addres of c.str seems pretty bad.

If s is a top-level or global let then c stays alive. However, technically we don't guarantee that a Swift class object has a persistent address (it may need to for satisfying some runtime constraint, but the language doesn't guarantee it). So I don't think we should allow that either.

Sorry if I missed that in my review.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking through diag_non_ephemeral_warning.swift again, I see these two cases that still don't produce warnings, but probably should:

takesRaw(&topLevelFinalC.storedProperty)
takesRaw(&topLevelFragileFinalC.storedProperty)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let ptr = UnsafeMutablePointer(&s.c.str)

If s is a var, then there's there's nothing keeping c alive. So escaping the addres of c.str seems pretty bad.

@atrick Should c not be guaranteed to be kept alive as long as s.c isn't changed to a different reference? IMO this would be reasonable constraint to place on the lifetime of the pointer created from the inout-to-pointer conversion.

If s is a top-level or global let then c stays alive. However, technically we don't guarantee that a Swift class object has a persistent address (it may need to for satisfying some runtime constraint, but the language doesn't guarantee it). So I don't think we should allow that either.

Ah, okay then. I think for now at least I'll revert b6e1e99, with the view of possibly reinstating it if we ever officially make such a guarantee. I somewhat doubt many (if any) people are using this pattern anyway – there was no sign of it in the source compat suite.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@atrick Should c not be guaranteed to be kept alive as long as s.c isn't changed to a different reference? IMO this would be reasonable constraint to place on the lifetime of the pointer created from the inout-to-pointer conversion.

Yes. If s is a let, then we have that guarantee. And on second thought, I don't think it will ever be possible to reallocate an object that has a let reference. But I think @jrose-apple and I agree that it's better to have a simple rule that disallows non-ephemeral pointers to instance variables.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, okay then. I think for now at least I'll revert b6e1e99,

Wait a sec, that commit also allows non-ephemeral pointers to static variables. I think that's ok under the circumstances that you've identified.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@atrick Thanks for clarifying!

But I think @jrose-apple and I agree that it's better to have a simple rule that disallows non-ephemeral pointers to instance variables.

Does this include non-ephemeral pointers to stored instance properties on global/static variable structures? For example, rejecting:

struct S {
  var i = 0
}

var s = S()
let ptr = UnsafeMutablePointer(&s.i)

Wait a sec, that commit also allows non-ephemeral pointers to static variables

static variables were also allowed before the commit through the checking of a base static decl and a metatype base. 8b41b2a adds an additional check for static members in order to only apply the "base must be a mutable struct" rule to stored instance properties.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this include non-ephemeral pointers to stored instance properties on global/static variable structures?

No, stored properties on structs can be pointed to if the struct itself is global/static. We want to disallow pointing to stored properties inside class instances, which I think we were calling "instance variables".

lib/Sema/CSApply.cpp Show resolved Hide resolved
test/Sema/diag_non_ephemeral_swift4.swift Outdated Show resolved Hide resolved
- Add/clarify a couple of comments
- Add a couple of tests
- Rename test file
- Remove extraneous locator simplification logic
@xedin xedin self-requested a review November 15, 2018 22:31
@xedin
Copy link
Member

xedin commented Nov 16, 2018

I'll try to take a look at Sema changes tomorrow.


/// Determines the element type of a known Unsafe[Mutable][Raw]BufferPointer
/// variant, or returns null if the type is not a buffer pointer.
Type getAnyBufferPointerElementType(BufferPointerTypeKind &BPTK);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only see one use of this either of new methods and it doesn't use result type, so should this just be a bool instead?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be, although IMO it's a more flexible API returning the pointee type. In addition, it nicely mirrors the getAnyPointerElementType API.

@@ -4854,11 +4862,15 @@ class ParamDecl : public VarDecl {
StringRef StringRepresentation;
};

/// The default value, if any, along with whether this is varargs.
llvm::PointerIntPair<StoredDefaultArgument *, 1> DefaultValueAndIsVariadic;
enum class Flags {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this change very much! It might be worth separating it and merging standalone!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing! I've opened a separate PR: #20663

@@ -5454,6 +5459,59 @@ static bool hasCurriedSelf(ConstraintSystem &cs, ConcreteDeclRef callee,
return false;
}

static void checkNonEphemeralArgumentConversion(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this logic belongs in the solver itself (when we are matching arguments to parameters, or later when restricted constraint is being simplified), we should be able to create a ConstraintFix which is going to diagnose as a warning on an error depending on swift version.
That would involve a bit of work to distinguish between fatal and warning fixes, but ultimately that's the right approach.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! I am already using constraint fixes for the error logic, but I originally wasn't sure whether they'd also be suitable for the warning logic. Looking at it again, I believe it should be fairly straightforward to adapt to handle warnings. I'll work on making that change.

Any ideas on how non-ephemeral-ness should be modelled in the constraint system? In my WIP implementation, I'm currently introducing two new constraint kinds, but that doesn't seem terribly ideal, and certainly won't scale well if we ever introduce another parameter flag that affects conversions.

Thanks for the review!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would make most sense to make it aTypeMatchFlag. Maybe some constraints should just carry type match options which could be deduced by matchCallArguments? Or otherwise it has to be a "constraint restriction"...


/// Emits a note explaining to the user that an ephemeral conversion is only
/// valid for the duration of the call, and suggests an alternative to use.
static void emitIllegalEphemeralConversionNotes(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my previous comment about diagnosing this as a ConstraintFix - this logic could to be a part of FailureDiagnostic associated with such fix. I'm trying to move diagnostics away from type-checking sub-expressions and instead form solutions with fixes and diagnose them.

@hamishknight
Copy link
Collaborator Author

Sorry for taking so long to get back to this! I have opened an updated PR here: #27695.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants