[Sema Diagnostics] Add Miscellaneous Sema diagnostics to enforce that the new os log APIs and atomics are passed constants for certain arguments #26969

ravikandhadai · 2019-08-30T22:27:05Z

This PR introduces a new AST-level check to enforce correct usage of the new (as yet private) os log APIs based on string interpolation. The check is implemented in a new file: OSLogSemaDiagnostics.cpp and is invoked by performSyntacticExprDiagnostics defined in MiscDiagnostics.cpp. The following properties are enforced by this check:

Every os log call accepts only a string literal or an interpolated string literal.
If the interpolations in a log message take additional parameters (in order to specify formatting and privacy options), those additional parameters must be either literals or enum elements with literal arguments.
Every appendInterpolation function invoked to handle an interpolated expression in a log message must be defined in the system modules and is not defined by users. This requirement ensures that string interpolations passed to the log calls are processed by the implementation defined in the overlay. (This will be relaxed in the future so that users can extend the APIs to interpolate their own types. )

These properties guarantee that the string interpolation passed to the log calls will be de-constructed into a static format string and a sequence of arguments by the OSLogOptimization pass.

rdar://50413562

ravikandhadai · 2019-08-30T22:41:52Z

@swift-ci please test

xedin

At least first and second checks could be done and diagnosed in the constraint solver instead. We can detect what the declaration represents and put additional requirements on the associated (argument) types.

I think we should try and avoid producing solutions which don't fit the purpose.

ravikandhadai · 2019-09-16T20:39:24Z

@xedin I have a concern about moving the first and second checks to the constraint solver. The requirement that formatting and the log message must be literals will be relaxed in the future. E.g. we could allow global "let" constants to be passed as format options to the log APIs. We may also support concatenation of log messages. Therefore other solutions would be relevant and acceptable in the future, though not right now.

Also, another minor concern is, given that this is a check specific to an overlay, would it be better to separate this from the main type checking algorithm? I don't think type checking performance would be a problem here.

xedin · 2019-09-16T20:55:25Z

@ravikandhadai Interesting, if this is just a temporary limitation then it's fine to keep it in MiscDiagnostics. It would be good to indicate that in the code via comment somehow.

ravikandhadai · 2020-02-10T19:09:32Z

An update: I am planning to update this PR soon so that it can diagnose non-constants passed to the new OS log APIs. The interface of the new OS log APIs has changed since this PR had been created so this needs to be updated.

ravikandhadai · 2020-03-31T23:24:01Z

Update: this PR implements a Sema check for enforcing that the arguments passed to certain APIs are constants. Specifically, it checks whether the new os_log APIs are passed constants and whether the ordering argument of the low-level atomic operations is a constant. These APIs are identified by an @_semantics attribute. For os_log APIs the semantics attribute is "oslog.requires_constant_arguments". For the atomic operations, the semantics attribute is "atomics.requires_constant_ordering".

The Sema check is implemented in a new file: lib/Sema/ConstantnessSemaDiagnostics.cpp. It is invoked by performSyntacticExprDiagnostics defined in MiscDiagnostics.cpp. This PR also includes two test files: diag_constantness_check.swift and diagnostics_constantness_check_os_log.swift. The latter uses the OSLogTestHelper module which lives in stdlib/private.

The following is the high-level overview of the Sema check lib/Sema/ConstantnessSemaDiagnostics.cpp.

Walk over the AST looking for calls annotated with any of the above mentioned semantics attributes and extracts the arguments that are required to be constants. In the os_log case, all arguments have to be constants. In the atomics case, only the ordering argument should be a constant. The ordering argument is determined based on the typename of the parameter.
Check that the arguments discovered in the above step are constants, which is recursively defined as follows. (a) All literals expression (including string-interpolation literals) are constants. (b) Closure expressions are constants. (c) enum cases are constants as long as their arguments, if any, are constants. (d) Function calls and property accesses are constants iff they are annotated as @_semantics("constant_evaluable") and their arguments are constants. This check is implemented by the function checkConstantness (in file MiscDiagnostics.cpp).
Any errors discovered are diagnosed with a message the points to the sub-expression that is not a constant. E.g. calling a function foo(_: Int) whose parameter is required to be a constant with a variable would result in an error "argument must be an integer literal". The test files diag_constantness_check.swift and diag_constantness_check_os_log.swift have more examples of the diagnostics.

ravikandhadai · 2020-03-31T23:24:36Z

@swift-ci Please test

xedin · 2020-04-02T00:23:40Z

lib/Sema/ConstantnessSemaDiagnostics.cpp

+/// Given a call \c callExpr, if some or all of its arguments are required to be
+/// constants, check that property on the arguments.
+static void diagnoseConstantArgumentRequirementOfCall(const CallExpr *callExpr,
+                                                      const ASTContext &ctx) {


Have you considered integrating this into a constraint solver? There is a way to check flags like this already when argument is matched to a parameter for a given overload choice and record a fix per argument which would diagnose as a constness violation. In general I think we shouldn't be producing solutions which are "incorrect" this way.

@xedin I am worried about putting this in the main type checking algorithm as this is not a language feature yet. We are just using some semantics attribute to enforce this on specific APIs. This uses hard coded knowledge of those APIs which are also outside the stdlib. The semantics attribute and also the constraints may also change as the APIs evolves (until of course we evolve a language feature that can subsume this ad hoc check). I think it will be a language feature eventually, and the annotation will be generalized to a type or parameter annotation, at which point it should certainly be in the constraint solver. But now, it is more fine tuned to specific APIs. That is why I wonder if it would be best as a MiscDiagnostics.

We already have multiple precedents of that so I wouldn't worry that much. I'd say we should just add a flag and put everything into the solver for a couple of reasons:

We'd have to do this anyway :)

It's much easier to work with calls in the solver without involving expressions;

We are planning to change AST representation of calls so it would be great if new uses of TupleExpr/ParenExpr could be avoided;

Avoids producing incorrect solutions, which means that solver behavior when it comes to overloading would be correct for free;

Fix and diagnostics could be worked on separately for failure detection.

@xedin Okay. I am quite new to that code base. It would very helpful if you can provide some pointers to how I can implement this in the constraint solver. I did a quick glance, and found somethings.

I see that there is TypeChecker::getDeclTypeCheckingSemantics. I can extend that enum to include these functions. I see that it is used in finishApply (particularly in finishApplyOfDeclWithSpecialTypeCheckingSemantics), and in getTypeOfReferenceWithSpecialTypeCheckingSemantics. It seems to me that most of the logic can go inside the finishApplyOfDeclWithSpecialTypeCheckingSemantics, where it can call the coerceCallArguments and then checks for constantness of some arguments. The diagnostics can also be emitted there. Does it match with what you were thinking?

I think the place you are looking for is matchCallArguments https://github.com/apple/swift/blob/master/lib/Sema/CSSimplify.cpp#L967. That's the place we arguments are matched to their corresponding parameter types. matchCallArguments also retrieves selected overload used to form this call, so all you need to do is to determine which parameters have to be constants and fail such comparisons unless shouldAttemptFixes() is set. There are also a couple of examples of use of fixes available in that method. I'll be happy to help you out with any other questions you might have.

APIs and atomic operations are passed compile-time constants for certain arguments.

ravikandhadai · 2020-04-07T01:44:23Z

@swift-ci Please test

swift-ci · 2020-04-07T01:47:04Z

Build failed
Swift Test Linux Platform
Git Sha - a822d80

swift-ci · 2020-04-07T01:47:15Z

Build failed
Swift Test OS X Platform
Git Sha - a822d80

ravikandhadai · 2020-04-07T01:52:29Z

Based on discussions with @xedin and some experimentation, we have planned to leave this check as MiscDiagnostics. (Moving this into the constraint solver to the point where arguments and parameters are matched (e.g. in matchTypes), was not possible in the current state as some member declarations referred to by the arguments, necessary for this check, were not resolved at that point.)

@xedin I have simplified the Misc Diagnostics code a bit, added more tests and rebased it on master.

xedin

LGTM! Unfortunately we'd have to do some work in the constraint system before this could be moved to the solver especially for unresolved member references e.g. foo(.bar(x)) because today we'd use contextual base type of the .bar as a type of argument conversion which means that conversion is going to be simplified before overload choice for .bar could be picked.

slavapestov · 2020-04-07T22:22:40Z

Is there a reason we can't diagnose this in the SIL pass where we constant-fold the calls? This way there's less risk of divergence between the implementation and specified behavior.

ravikandhadai · 2020-04-08T00:35:52Z

Yes. The SIL level diagnostic pass (particularly in the os_log case) is not capable of providing good diagnostics in many situations. IMO, there are two reasons for this: (a) the nature of the problem itself. Here, we are analyzing a large piece of SIL code without any clear demarkation of the first and the last SIL instructions that defines the boundary of the relevant SIL code. This Sema check enforces a syntactic restriction and thereby provides some clear boundaries for the SIL code that needs to be analyzed. In the absence of this, the SIL pass must make a choice on whether it should stop analyzing and blame the user or must continue analyzing in the hope that it may potentially evaluate the code to a constant. This essentially means that the analysis budget and not the structure of the code describes what is an error and what is not. This can produce very unintuitive errors. We cannot clearly map out the source code snippets that would be accepted. E.g. simple patterns like creating a local variable to hold the value of a subexpression may or may not work depending on how far separated the point of definition and point of use is. (Note that the traditional constant folding pass e.g.ConstantFolding.cpp works at the level of a single builtin, where the boundaries are fixed by the builtin instruction and its literal arguments. Moreover, it also can have false negatives, and is not required to catch all errors.)

(b) Too much canonicalization and loss of information happens at the SIL level, which makes it quite difficult to diagnose the root cause of the problem. E.g. transparent functions loose their SourceLoc info. Also, since many source-level code fragments have identical SIL, imposing a syntactic model at the SIL level would be quite fragile.

First imposing a syntactic restriction on the API for what is allowed and what is not allowed, and ensuring that the SIL optimization pass can optimize everything that is allowed seems to provide a cleaner approach in this case. (Looking forward, I also think that a generalization of this Sema check may be used to power a much more general language-level concept like constexpr.)

ravikandhadai requested review from xedin, devincoughlin, DougGregor, milseman and airspeedswift August 30, 2019 22:27

xedin reviewed Sep 16, 2019

View reviewed changes

ravikandhadai changed the title ~~[Sema Diagnostics] Add Sema diagnostics to enforce the user model of the new os log APIs based on string interpolation~~ [DNM][Sema Diagnostics] Add Sema diagnostics to enforce the user model of the new os log APIs based on string interpolation Feb 10, 2020

ravikandhadai force-pushed the oslog-user-model branch from daf9b28 to a822d80 Compare March 31, 2020 22:46

ravikandhadai requested a review from lorentey March 31, 2020 22:48

ravikandhadai requested a review from xedin April 1, 2020 01:12

xedin reviewed Apr 2, 2020

View reviewed changes

[Sema] Add miscellaneous sema diagnostics to check that the new os log

b57a1d7

APIs and atomic operations are passed compile-time constants for certain arguments.

ravikandhadai force-pushed the oslog-user-model branch from a822d80 to b57a1d7 Compare April 7, 2020 01:41

ravikandhadai requested a review from xedin April 7, 2020 01:52

xedin approved these changes Apr 7, 2020

View reviewed changes

ravikandhadai merged commit 4264b39 into swiftlang:master Apr 7, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Sema Diagnostics] Add Miscellaneous Sema diagnostics to enforce that the new os log APIs and atomics are passed constants for certain arguments #26969

[Sema Diagnostics] Add Miscellaneous Sema diagnostics to enforce that the new os log APIs and atomics are passed constants for certain arguments #26969

ravikandhadai commented Aug 30, 2019 •

edited

Loading

ravikandhadai commented Aug 30, 2019

xedin left a comment

ravikandhadai commented Sep 16, 2019

xedin commented Sep 16, 2019

ravikandhadai commented Feb 10, 2020 •

edited

Loading

ravikandhadai commented Mar 31, 2020 •

edited

Loading

ravikandhadai commented Mar 31, 2020

xedin Apr 2, 2020

ravikandhadai Apr 2, 2020 •

edited

Loading

xedin Apr 2, 2020

ravikandhadai Apr 2, 2020 •

edited

Loading

xedin Apr 2, 2020

ravikandhadai commented Apr 7, 2020

swift-ci commented Apr 7, 2020

swift-ci commented Apr 7, 2020

ravikandhadai commented Apr 7, 2020 •

edited

Loading

xedin left a comment

slavapestov commented Apr 7, 2020

ravikandhadai commented Apr 8, 2020

[Sema Diagnostics] Add Miscellaneous Sema diagnostics to enforce that the new os log APIs and atomics are passed constants for certain arguments #26969

[Sema Diagnostics] Add Miscellaneous Sema diagnostics to enforce that the new os log APIs and atomics are passed constants for certain arguments #26969

Conversation

ravikandhadai commented Aug 30, 2019 • edited Loading

ravikandhadai commented Aug 30, 2019

xedin left a comment

Choose a reason for hiding this comment

ravikandhadai commented Sep 16, 2019

xedin commented Sep 16, 2019

ravikandhadai commented Feb 10, 2020 • edited Loading

ravikandhadai commented Mar 31, 2020 • edited Loading

ravikandhadai commented Mar 31, 2020

xedin Apr 2, 2020

Choose a reason for hiding this comment

ravikandhadai Apr 2, 2020 • edited Loading

Choose a reason for hiding this comment

xedin Apr 2, 2020

Choose a reason for hiding this comment

ravikandhadai Apr 2, 2020 • edited Loading

Choose a reason for hiding this comment

xedin Apr 2, 2020

Choose a reason for hiding this comment

ravikandhadai commented Apr 7, 2020

swift-ci commented Apr 7, 2020

swift-ci commented Apr 7, 2020

ravikandhadai commented Apr 7, 2020 • edited Loading

xedin left a comment

Choose a reason for hiding this comment

slavapestov commented Apr 7, 2020

ravikandhadai commented Apr 8, 2020

ravikandhadai commented Aug 30, 2019 •

edited

Loading

ravikandhadai commented Feb 10, 2020 •

edited

Loading

ravikandhadai commented Mar 31, 2020 •

edited

Loading

ravikandhadai Apr 2, 2020 •

edited

Loading

ravikandhadai Apr 2, 2020 •

edited

Loading

ravikandhadai commented Apr 7, 2020 •

edited

Loading