Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve look-up semantics for message strings #214

Closed
michaelcfanning opened this issue Aug 3, 2018 · 9 comments
Closed

Improve look-up semantics for message strings #214

michaelcfanning opened this issue Aug 3, 2018 · 9 comments
Labels
2.1.0-CSD.1 Will be fixed in SARIF v2.1.0 CSD.1. design-approved The TC approved the design and I can write the change draft design-improvement impact-breaks-consumers resolved-duplicate triage-approved

Comments

@michaelcfanning
Copy link
Contributor

We don't address the possibility that a message associated with a rule might refer to a shared string that is in the resources table (rather than being embedded in a specific rule).

a message within a result or notification (which can be associated with a rule) should first examine the resources.rules table to see whether there is a rule + message id that corresponds with the data in the result/notification. if there is not, the consumer should fallback to the resources.rules table. this would be the case for referring to a globally shared string. a rule that wants to override a globally shared string would have the ability to do so using this approach.

in the event that a notification associated with a rule wanted to refer to the globally shared string for some reason, the notification could eliminate populating the rule id (if appropriate) or the tool provider would need to provide unique names between the rules strings table and the global run.resources table.

we could alternately require look-up to start with the global table, but this would have the result of requiring that there be no name collisions between message ids stored in the global table and rule-specific table (which seems like it would have the possibility of introducing issues that we can avoid by using the inside out look-up approach).

@michaelcfanning michaelcfanning added the 2.1.0-CSD.1 Will be fixed in SARIF v2.1.0 CSD.1. label Aug 3, 2018
@michaelcfanning
Copy link
Contributor Author

Relates to #216

@ghost
Copy link

ghost commented Nov 3, 2018

@michaelcfanning How strong is the use case for this change? Is it strong enough to burden implementers with the additional logic?

This is a file size optimization: it coalesces message strings that are used by multiple rules. Naively it doesn't seem to me that this would be common. Do you have a real-world example?

(I labeled this issue design-improvement because that's the intent (as opposed to a bug fix or the addition of new functionality). I'm not necessarily convinced it is an improvement.)

@michaelcfanning
Copy link
Contributor Author

Why should we provide a global strings table if rules can't retrieve them? This would restrict those strings to notifications only, seems an arbitrary limitation.

To answer your question, this design emerged directly from existing tools. BinSkim, for example, has a subset of rules that emit a common message along the lines of: "I could not function because I could not locate a PDB for the target binary". Every Binskim rule has a shared 'not applicable' message: 'This rule did not execute because that target was observed to be a {0} binary and therefore not valid for analysis." where the format string is populated with things like '32-bit', 'pure managed MSIL', etc.

@ghost
Copy link

ghost commented Nov 8, 2018

Your second point (a tool that, under certain conditions, can emit the same string for every rule) is compelling, so I accept this change.

FYI, to your first point: many objects -- not just notification -- have message object-valued message properties that would use the global (non-rule-specific) message string table:

  • result
  • location
  • region
  • rectangle
  • codeFlow
  • threadFlow
  • edgeTraversal
  • stack
  • notification
  • exception (not really the same as the others; I mention it only for completeness).

@ghost ghost added to-be-written design-approved The TC approved the design and I can write the change draft triage-approved labels Nov 8, 2018
@michaelcfanning
Copy link
Contributor Author

Wow. Of course. I forgot about the enormous annotation/messaging potential in the format. :)

@ghost ghost self-assigned this Nov 8, 2018
@ghost
Copy link

ghost commented Nov 8, 2018

To say nothing of the objects that have a message-valued description property (which I forgot about above):

  • attachment
  • fix
  • graph
  • graphTraversal
  • runAutomationDetails

@ghost ghost changed the title Clarify look-up semantics for message strings Improve look-up semantics for message strings Nov 8, 2018
@ghost
Copy link

ghost commented Nov 8, 2018

Changed issue title. This isn't a "clarification"; it's a functional change. But I agree it's better, so, "improvement" :-)

@ghost
Copy link

ghost commented Mar 16, 2019

This is partially done. The result.message section says:

NOTE: This interpretation of message.messageId differs from its interpretation elsewhere in this specification. In every other message object in this specification, messageId refers to a string within the resources.messageStrings property of the current run object (see §3.11.9, and see §3.11.6.2 and §3.11.6.4 for details of the resource message string lookup procedure). It is only result.message.messageId that refers to a string within resources.rules[ruleId].messageStrings.

However, the notification.message section does not say to do the same thing if the notification object contains a ruleId property -- which it should.

At this point, I won't bother to add that text. Instead, I'll incorporate all of this into the unified message lookup procedure in the description of the message object. That procedure will be complete when #179 (toolComponent/driver/extensions), #311 (notification metadata), #325 (remove current localization mechanism), and #338 (new localization mechanism) are all done. At that point, I'll close this issue.

@michaelcfanning FYI

@ghost ghost added resolved-wont-fix and removed to-be-written labels Mar 18, 2019
@ghost
Copy link

ghost commented Mar 18, 2019

Effectively a dup of the combination of #179, #311, #324, #325, and #338.

@ghost ghost closed this as completed Mar 18, 2019
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2.1.0-CSD.1 Will be fixed in SARIF v2.1.0 CSD.1. design-approved The TC approved the design and I can write the change draft design-improvement impact-breaks-consumers resolved-duplicate triage-approved
Projects
None yet
Development

No branches or pull requests

1 participant