Improve look-up semantics for message strings #214

michaelcfanning · 2018-08-03T14:52:10Z

We don't address the possibility that a message associated with a rule might refer to a shared string that is in the resources table (rather than being embedded in a specific rule).

a message within a result or notification (which can be associated with a rule) should first examine the resources.rules table to see whether there is a rule + message id that corresponds with the data in the result/notification. if there is not, the consumer should fallback to the resources.rules table. this would be the case for referring to a globally shared string. a rule that wants to override a globally shared string would have the ability to do so using this approach.

in the event that a notification associated with a rule wanted to refer to the globally shared string for some reason, the notification could eliminate populating the rule id (if appropriate) or the tool provider would need to provide unique names between the rules strings table and the global run.resources table.

we could alternately require look-up to start with the global table, but this would have the result of requiring that there be no name collisions between message ids stored in the global table and rule-specific table (which seems like it would have the possibility of introducing issues that we can avoid by using the inside out look-up approach).

michaelcfanning · 2018-08-23T15:10:37Z

Relates to #216

ghost · 2018-11-03T21:23:03Z

@michaelcfanning How strong is the use case for this change? Is it strong enough to burden implementers with the additional logic?

This is a file size optimization: it coalesces message strings that are used by multiple rules. Naively it doesn't seem to me that this would be common. Do you have a real-world example?

(I labeled this issue design-improvement because that's the intent (as opposed to a bug fix or the addition of new functionality). I'm not necessarily convinced it is an improvement.)

michaelcfanning · 2018-11-07T23:47:42Z

Why should we provide a global strings table if rules can't retrieve them? This would restrict those strings to notifications only, seems an arbitrary limitation.

To answer your question, this design emerged directly from existing tools. BinSkim, for example, has a subset of rules that emit a common message along the lines of: "I could not function because I could not locate a PDB for the target binary". Every Binskim rule has a shared 'not applicable' message: 'This rule did not execute because that target was observed to be a {0} binary and therefore not valid for analysis." where the format string is populated with things like '32-bit', 'pure managed MSIL', etc.

ghost · 2018-11-08T00:15:37Z

Your second point (a tool that, under certain conditions, can emit the same string for every rule) is compelling, so I accept this change.

FYI, to your first point: many objects -- not just notification -- have message object-valued message properties that would use the global (non-rule-specific) message string table:

result
location
region
rectangle
codeFlow
threadFlow
edgeTraversal
stack
notification
exception (not really the same as the others; I mention it only for completeness).

michaelcfanning · 2018-11-08T00:21:18Z

Wow. Of course. I forgot about the enormous annotation/messaging potential in the format. :)

ghost · 2018-11-08T22:45:07Z

To say nothing of the objects that have a message-valued description property (which I forgot about above):

attachment
fix
graph
graphTraversal
runAutomationDetails

ghost · 2018-11-08T22:45:50Z

Changed issue title. This isn't a "clarification"; it's a functional change. But I agree it's better, so, "improvement" :-)

ghost · 2019-03-16T17:49:04Z

This is partially done. The result.message section says:

NOTE: This interpretation of message.messageId differs from its interpretation elsewhere in this specification. In every other message object in this specification, messageId refers to a string within the resources.messageStrings property of the current run object (see §3.11.9, and see §3.11.6.2 and §3.11.6.4 for details of the resource message string lookup procedure). It is only result.message.messageId that refers to a string within resources.rules[ruleId].messageStrings.

However, the notification.message section does not say to do the same thing if the notification object contains a ruleId property -- which it should.

At this point, I won't bother to add that text. Instead, I'll incorporate all of this into the unified message lookup procedure in the description of the message object. That procedure will be complete when #179 (toolComponent/driver/extensions), #311 (notification metadata), #325 (remove current localization mechanism), and #338 (new localization mechanism) are all done. At that point, I'll close this issue.

@michaelcfanning FYI

ghost · 2019-03-18T23:28:10Z

Effectively a dup of the combination of #179, #311, #324, #325, and #338.

michaelcfanning added the 2.1.0-CSD.1 Will be fixed in SARIF v2.1.0 CSD.1. label Aug 3, 2018

ghost added impact-breaks-consumers design-improvement labels Nov 3, 2018

ghost added to-be-written design-approved The TC approved the design and I can write the change draft triage-approved labels Nov 8, 2018

ghost self-assigned this Nov 8, 2018

ghost changed the title ~~Clarify look-up semantics for message strings~~ Improve look-up semantics for message strings Nov 8, 2018

michaelcfanning mentioned this issue Jan 24, 2019

Spec should make it clear: we consult rules metadata first for string look-ups #310

Closed

ghost added resolved-wont-fix and removed to-be-written labels Mar 18, 2019

ghost closed this as completed Mar 18, 2019

ghost added resolved-duplicate and removed resolved-wont-fix labels Mar 18, 2019

This issue was closed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve look-up semantics for message strings #214

Improve look-up semantics for message strings #214

michaelcfanning commented Aug 3, 2018

michaelcfanning commented Aug 23, 2018

ghost commented Nov 3, 2018 •

edited by ghost

Loading

michaelcfanning commented Nov 7, 2018

ghost commented Nov 8, 2018

michaelcfanning commented Nov 8, 2018

ghost commented Nov 8, 2018

ghost commented Nov 8, 2018

ghost commented Mar 16, 2019 •

edited by ghost

Loading

ghost commented Mar 18, 2019 •

edited by ghost

Loading

Improve look-up semantics for message strings #214

Improve look-up semantics for message strings #214

Comments

michaelcfanning commented Aug 3, 2018

michaelcfanning commented Aug 23, 2018

ghost commented Nov 3, 2018 • edited by ghost Loading

michaelcfanning commented Nov 7, 2018

ghost commented Nov 8, 2018

michaelcfanning commented Nov 8, 2018

ghost commented Nov 8, 2018

ghost commented Nov 8, 2018

ghost commented Mar 16, 2019 • edited by ghost Loading

ghost commented Mar 18, 2019 • edited by ghost Loading

ghost commented Nov 3, 2018 •

edited by ghost

Loading

ghost commented Mar 16, 2019 •

edited by ghost

Loading

ghost commented Mar 18, 2019 •

edited by ghost

Loading