Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Sema Diagnostics] Add Miscellaneous Sema diagnostics to enforce that the new os log APIs and atomics are passed constants for certain arguments #26969

Merged
merged 1 commit into from
Apr 7, 2020

Conversation

ravikandhadai
Copy link
Contributor

@ravikandhadai ravikandhadai commented Aug 30, 2019

This PR introduces a new AST-level check to enforce correct usage of the new (as yet private) os log APIs based on string interpolation. The check is implemented in a new file: OSLogSemaDiagnostics.cpp and is invoked by performSyntacticExprDiagnostics defined in MiscDiagnostics.cpp. The following properties are enforced by this check:

  • Every os log call accepts only a string literal or an interpolated string literal.

  • If the interpolations in a log message take additional parameters (in order to specify formatting and privacy options), those additional parameters must be either literals or enum elements with literal arguments.

  • Every appendInterpolation function invoked to handle an interpolated expression in a log message must be defined in the system modules and is not defined by users. This requirement ensures that string interpolations passed to the log calls are processed by the implementation defined in the overlay. (This will be relaxed in the future so that users can extend the APIs to interpolate their own types. )

These properties guarantee that the string interpolation passed to the log calls will be de-constructed into a static format string and a sequence of arguments by the OSLogOptimization pass.

rdar://50413562

@ravikandhadai
Copy link
Contributor Author

@swift-ci please test

Copy link
Contributor

@xedin xedin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least first and second checks could be done and diagnosed in the constraint solver instead. We can detect what the declaration represents and put additional requirements on the associated (argument) types.

I think we should try and avoid producing solutions which don't fit the purpose.

@ravikandhadai
Copy link
Contributor Author

@xedin I have a concern about moving the first and second checks to the constraint solver. The requirement that formatting and the log message must be literals will be relaxed in the future. E.g. we could allow global "let" constants to be passed as format options to the log APIs. We may also support concatenation of log messages. Therefore other solutions would be relevant and acceptable in the future, though not right now.

Also, another minor concern is, given that this is a check specific to an overlay, would it be better to separate this from the main type checking algorithm? I don't think type checking performance would be a problem here.

@xedin
Copy link
Contributor

xedin commented Sep 16, 2019

@ravikandhadai Interesting, if this is just a temporary limitation then it's fine to keep it in MiscDiagnostics. It would be good to indicate that in the code via comment somehow.

@ravikandhadai ravikandhadai changed the title [Sema Diagnostics] Add Sema diagnostics to enforce the user model of the new os log APIs based on string interpolation [DNM][Sema Diagnostics] Add Sema diagnostics to enforce the user model of the new os log APIs based on string interpolation Feb 10, 2020
@ravikandhadai
Copy link
Contributor Author

ravikandhadai commented Feb 10, 2020

An update: I am planning to update this PR soon so that it can diagnose non-constants passed to the new OS log APIs. The interface of the new OS log APIs has changed since this PR had been created so this needs to be updated.

@ravikandhadai ravikandhadai changed the title [DNM][Sema Diagnostics] Add Sema diagnostics to enforce the user model of the new os log APIs based on string interpolation [Sema Diagnostics] Add Miscellaneous Sema diagnostics to enforce that the new os log APIs and atomics are passed constants for certain arguments Mar 31, 2020
@ravikandhadai
Copy link
Contributor Author

ravikandhadai commented Mar 31, 2020

Update: this PR implements a Sema check for enforcing that the arguments passed to certain APIs are constants. Specifically, it checks whether the new os_log APIs are passed constants and whether the ordering argument of the low-level atomic operations is a constant. These APIs are identified by an @_semantics attribute. For os_log APIs the semantics attribute is "oslog.requires_constant_arguments". For the atomic operations, the semantics attribute is "atomics.requires_constant_ordering".

The Sema check is implemented in a new file: lib/Sema/ConstantnessSemaDiagnostics.cpp. It is invoked by performSyntacticExprDiagnostics defined in MiscDiagnostics.cpp. This PR also includes two test files: diag_constantness_check.swift and diagnostics_constantness_check_os_log.swift. The latter uses the OSLogTestHelper module which lives in stdlib/private.

The following is the high-level overview of the Sema check lib/Sema/ConstantnessSemaDiagnostics.cpp.

  1. Walk over the AST looking for calls annotated with any of the above mentioned semantics attributes and extracts the arguments that are required to be constants. In the os_log case, all arguments have to be constants. In the atomics case, only the ordering argument should be a constant. The ordering argument is determined based on the typename of the parameter.

  2. Check that the arguments discovered in the above step are constants, which is recursively defined as follows. (a) All literals expression (including string-interpolation literals) are constants. (b) Closure expressions are constants. (c) enum cases are constants as long as their arguments, if any, are constants. (d) Function calls and property accesses are constants iff they are annotated as @_semantics("constant_evaluable") and their arguments are constants. This check is implemented by the function checkConstantness (in file MiscDiagnostics.cpp).

  3. Any errors discovered are diagnosed with a message the points to the sub-expression that is not a constant. E.g. calling a function foo(_: Int) whose parameter is required to be a constant with a variable would result in an error "argument must be an integer literal". The test files diag_constantness_check.swift and diag_constantness_check_os_log.swift have more examples of the diagnostics.

@ravikandhadai
Copy link
Contributor Author

@swift-ci Please test

@ravikandhadai ravikandhadai requested a review from xedin April 1, 2020 01:12
/// Given a call \c callExpr, if some or all of its arguments are required to be
/// constants, check that property on the arguments.
static void diagnoseConstantArgumentRequirementOfCall(const CallExpr *callExpr,
const ASTContext &ctx) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered integrating this into a constraint solver? There is a way to check flags like this already when argument is matched to a parameter for a given overload choice and record a fix per argument which would diagnose as a constness violation. In general I think we shouldn't be producing solutions which are "incorrect" this way.

Copy link
Contributor Author

@ravikandhadai ravikandhadai Apr 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xedin I am worried about putting this in the main type checking algorithm as this is not a language feature yet. We are just using some semantics attribute to enforce this on specific APIs. This uses hard coded knowledge of those APIs which are also outside the stdlib. The semantics attribute and also the constraints may also change as the APIs evolves (until of course we evolve a language feature that can subsume this ad hoc check). I think it will be a language feature eventually, and the annotation will be generalized to a type or parameter annotation, at which point it should certainly be in the constraint solver. But now, it is more fine tuned to specific APIs. That is why I wonder if it would be best as a MiscDiagnostics.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have multiple precedents of that so I wouldn't worry that much. I'd say we should just add a flag and put everything into the solver for a couple of reasons:

  1. We'd have to do this anyway :)
  2. It's much easier to work with calls in the solver without involving expressions;
  3. We are planning to change AST representation of calls so it would be great if new uses of TupleExpr/ParenExpr could be avoided;
  4. Avoids producing incorrect solutions, which means that solver behavior when it comes to overloading would be correct for free;
  5. Fix and diagnostics could be worked on separately for failure detection.

Copy link
Contributor Author

@ravikandhadai ravikandhadai Apr 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xedin Okay. I am quite new to that code base. It would very helpful if you can provide some pointers to how I can implement this in the constraint solver. I did a quick glance, and found somethings.

I see that there is TypeChecker::getDeclTypeCheckingSemantics. I can extend that enum to include these functions. I see that it is used in finishApply (particularly in finishApplyOfDeclWithSpecialTypeCheckingSemantics), and in getTypeOfReferenceWithSpecialTypeCheckingSemantics. It seems to me that most of the logic can go inside the finishApplyOfDeclWithSpecialTypeCheckingSemantics, where it can call the coerceCallArguments and then checks for constantness of some arguments. The diagnostics can also be emitted there. Does it match with what you were thinking?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the place you are looking for is matchCallArguments https://github.com/apple/swift/blob/master/lib/Sema/CSSimplify.cpp#L967. That's the place we arguments are matched to their corresponding parameter types. matchCallArguments also retrieves selected overload used to form this call, so all you need to do is to determine which parameters have to be constants and fail such comparisons unless shouldAttemptFixes() is set. There are also a couple of examples of use of fixes available in that method. I'll be happy to help you out with any other questions you might have.

APIs and atomic operations are passed compile-time constants for
certain arguments.
@ravikandhadai
Copy link
Contributor Author

@swift-ci Please test

@swift-ci
Copy link
Contributor

swift-ci commented Apr 7, 2020

Build failed
Swift Test Linux Platform
Git Sha - a822d80

@swift-ci
Copy link
Contributor

swift-ci commented Apr 7, 2020

Build failed
Swift Test OS X Platform
Git Sha - a822d80

@ravikandhadai
Copy link
Contributor Author

ravikandhadai commented Apr 7, 2020

Based on discussions with @xedin and some experimentation, we have planned to leave this check as MiscDiagnostics. (Moving this into the constraint solver to the point where arguments and parameters are matched (e.g. in matchTypes), was not possible in the current state as some member declarations referred to by the arguments, necessary for this check, were not resolved at that point.)

@xedin I have simplified the Misc Diagnostics code a bit, added more tests and rebased it on master.

@ravikandhadai ravikandhadai requested a review from xedin April 7, 2020 01:52
Copy link
Contributor

@xedin xedin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Unfortunately we'd have to do some work in the constraint system before this could be moved to the solver especially for unresolved member references e.g. foo(.bar(x)) because today we'd use contextual base type of the .bar as a type of argument conversion which means that conversion is going to be simplified before overload choice for .bar could be picked.

@ravikandhadai ravikandhadai merged commit 4264b39 into swiftlang:master Apr 7, 2020
@slavapestov
Copy link
Contributor

Is there a reason we can't diagnose this in the SIL pass where we constant-fold the calls? This way there's less risk of divergence between the implementation and specified behavior.

@ravikandhadai
Copy link
Contributor Author

Yes. The SIL level diagnostic pass (particularly in the os_log case) is not capable of providing good diagnostics in many situations. IMO, there are two reasons for this: (a) the nature of the problem itself. Here, we are analyzing a large piece of SIL code without any clear demarkation of the first and the last SIL instructions that defines the boundary of the relevant SIL code. This Sema check enforces a syntactic restriction and thereby provides some clear boundaries for the SIL code that needs to be analyzed. In the absence of this, the SIL pass must make a choice on whether it should stop analyzing and blame the user or must continue analyzing in the hope that it may potentially evaluate the code to a constant. This essentially means that the analysis budget and not the structure of the code describes what is an error and what is not. This can produce very unintuitive errors. We cannot clearly map out the source code snippets that would be accepted. E.g. simple patterns like creating a local variable to hold the value of a subexpression may or may not work depending on how far separated the point of definition and point of use is. (Note that the traditional constant folding pass e.g.ConstantFolding.cpp works at the level of a single builtin, where the boundaries are fixed by the builtin instruction and its literal arguments. Moreover, it also can have false negatives, and is not required to catch all errors.)

(b) Too much canonicalization and loss of information happens at the SIL level, which makes it quite difficult to diagnose the root cause of the problem. E.g. transparent functions loose their SourceLoc info. Also, since many source-level code fragments have identical SIL, imposing a syntactic model at the SIL level would be quite fragile.

First imposing a syntactic restriction on the API for what is allowed and what is not allowed, and ensuring that the SIL optimization pass can optimize everything that is allowed seems to provide a cleaner approach in this case. (Looking forward, I also think that a generalization of this Sema check may be used to power a much more general language-level concept like constexpr.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants