Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Schema needs to be carefully scrubbed for minItems and uniqueItems use for all arrays #270

Closed
ghost opened this issue Oct 25, 2018 · 8 comments
Labels
2.1.0-CSD.1 Will be fixed in SARIF v2.1.0 CSD.1. design-approved The TC approved the design and I can write the change draft design-improvement impact-breaks-consumers impact-breaks-producers resolved-fixed triage-approved

Comments

@ghost
Copy link

ghost commented Oct 25, 2018

This was opened in the sarif-sdk repo: microsoft/sarif-sdk#1115. I'm opening this issue in the spec repo since it really is a spec issue.

@ghost ghost added bug impact-breaks-producers 2.1.0-CSD.1 Will be fixed in SARIF v2.1.0 CSD.1. labels Oct 25, 2018
@ghost ghost added the discussion-ongoing label Nov 6, 2018
@ghost ghost self-assigned this Nov 30, 2018
@ghost ghost added design-approved The TC approved the design and I can write the change draft triage-approved to-be-written and removed discussion-ongoing labels Nov 30, 2018
@ghost
Copy link
Author

ghost commented Nov 30, 2018

After discussion among @lgolding, @michaelcfanning, and @kupsch, we settled on the following principles to guide our choices:

  1. Producing highly optimized (e.g., most compact) SARIF is not a goal.
  2. The SARIF design should not force producers into an optimal implementation (i.e., we loosen some restrictions to accommodate sub-optimal approaches to analysis, fix generation, etc.).
  3. SARIF producers are allowed to emit explicit but unnecessary/not useful data (such as an empty array []) except in cases where doing so creates an invalid object.
  4. We will provide explicit default values in the schema, which can be used during serialization to make SARIF compact.
  5. We use an anyOf array + null construct to differentiate cases where we need to understand whether a producer has the capability to emit a construct which may be empty
  6. We have observed that the use of fileLocations generally requires an array to be unique
  7. We prefer to err on the side of allowing arrays to be non-unique, in order to prevent too draconian validations

@ghost
Copy link
Author

ghost commented Nov 30, 2018

index root object property anyOf+null required minItems uniqueItems default
1 sarifLog runs yes TRUE 0 FALSE none
2 attachment regions   FALSE 0 TRUE []
3 attachment rectangles   FALSE 0 TRUE []
4 codeFlow threadFlows   TRUE 1 FALSE none
5 conversion analysisToolLogFiles   FALSE 0 TRUE []
6 exception innerExceptions   FALSE 0 FALSE []
7 file roles   FALSE 0 TRUE []
8 fileChange replacements   TRUE 1 FALSE none
9 fix fileChanges   TRUE 1 TRUE none
10 graph nodes   FALSE 0 TRUE []
11 graph edges   FALSE 0 TRUE []
12 graphTraversal edgeTraversals   FALSE 0 FALSE []
13 invocation arguments yes FALSE 0 FALSE null
14 invocation responseFiles yes FALSE 0 TRUE null
15 invocation attachments   FALSE 0 TRUE []
16 invocation toolNotifications   FALSE 0 FALSE []
17 invocation configurationNotifications   FALSE 0 FALSE []
18 location annotations   FALSE 0 TRUE []
19 message arguments   FALSE 0 FALSE []
20 node children   FALSE 0 TRUE []
21 properties tags   FALSE 0 TRUE []
22 resources rules FALSE 0 TRUE []
23 result locations   FALSE 0 FALSE []
24 result stacks   FALSE 0 FALSE []
25 result codeFlows   FALSE 0 FALSE []
26 result relatedLocations   FALSE 0 FALSE []
27 result suppressionStates yes FALSE 0 TRUE null
28 result attachments   FALSE 0 TRUE []
29 result workItemUris yes FALSE 0 TRUE null
30 result fixes   FALSE 0 TRUE []
31 resultProvenance conversionSources   FALSE 0 TRUE []
32 run invocations   FALSE 0 FALSE []
33 run versionControlProvenance FALSE 0 TRUE []
34 run logicalLocations FALSE 0 FALSE []
35 run files FALSE 0 TRUE []
36 run results yes TRUE 0 FALSE  null
37 run aggregateIds   FALSE 0 TRUE []
38 run newlineSequences FALSE 1 TRUE ["\r\n", "\n"]
39 stack frames   TRUE 0 FALSE  []
40 stackFrame parameters   FALSE 0 FALSE []
41 threadFlow locations   TRUE 1 FALSE  

@michaelcfanning
Copy link
Contributor

Whew. You know, we really should try to get one more array into the format so that we have 42.

ghost pushed a commit that referenced this issue Dec 4, 2018
ghost pushed a commit that referenced this issue Dec 13, 2018
@ghost ghost closed this as completed Dec 13, 2018
ghost pushed a commit that referenced this issue Jan 2, 2019
@ghost
Copy link
Author

ghost commented Jan 3, 2019

At this time, the code generator cannot handle anyOf. So we cannot yet make the schema changes for properties that are specified as "anyOf + null" in the table above. We can make the rest of the changes.

The affected properties are:

  • sarifLog.runs
  • invocation.arguments
  • invocation.responseFiles
  • result.suppressionStates
  • result.workItemUris
  • run.results

@michaelcfanning FYI

@michaelcfanning
Copy link
Contributor

We don't necessarily need to allow limitations in the tool to drive the schema update. We could, for example, maintain an altered schema that drives the SDK code gen until that library is functional.

@ghost
Copy link
Author

ghost commented Jan 3, 2019

Yes, we can do that. In that case, we would:

  1. Validate the SDK's copy of the schema that does not include anyOf.
  2. Update the spec's copy of the schema by adding the anyOfs.
  3. Publish the spec's copy of the schema to schemastore.

@ghost
Copy link
Author

ghost commented Jan 3, 2019

(Of course now we have to remember to make future updates by hand in both copies of the schema until the code gen can handle the "real" schema.)

@ghost
Copy link
Author

ghost commented Jan 3, 2019

@michaelcfanning

The externalPropertyFiles work introduces some array-valued properties that we missed in the table above. Here is the way the schema stands today with respect to those properties. The spec isn't clear enough on these properties. It says what should happen if these properties are externalized (in which case, obviously, the array must have at least one element), but it doesn't say what should happen if they are not externalized (should it be absent? can it be an empty array? is there a default?). The spec doesn't even explicitly say that the array elements have to be unique, although obviously they do.

index root object property anyOf+null required minItems uniqueItems default
1 externalPropertyFiles files FALSE 1 TRUE
2 externalPropertyFiles invocations FALSE 1 TRUE
3 externalPropertyFiles logicalLocations FALSE 1 TRUE
4 externalPropertyFiles results FALSE 1 TRUE

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2.1.0-CSD.1 Will be fixed in SARIF v2.1.0 CSD.1. design-approved The TC approved the design and I can write the change draft design-improvement impact-breaks-consumers impact-breaks-producers resolved-fixed triage-approved
Projects
None yet
Development

No branches or pull requests

1 participant