Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify URI normalization algorithm #315

Closed
michaelcfanning opened this issue Jan 25, 2019 · 4 comments
Closed

Specify URI normalization algorithm #315

michaelcfanning opened this issue Jan 25, 2019 · 4 comments
Labels
2.1.0-CSD.1 Will be fixed in SARIF v2.1.0 CSD.1. design-approved The TC approved the design and I can write the change draft design-improvement e-ballot-3 impact-non-breaking-change merged Changes merged into provisional draft. resolved-fixed tc-34

Comments

@michaelcfanning
Copy link
Contributor

TC recommendation for direct producers:
for files, you SHALL normalize to the file system's absolute path then create URIs from this
(for converters, we recommend creating a uriBaseId comprising everything to the left of the rightmost .. segment)
for other uris, follow the normalization semantics defined by 3986 (transforms double slashes to one, slash dot becomes nothing, etc).

@michaelcfanning michaelcfanning added the design-approved The TC approved the design and I can write the change draft label Jan 25, 2019
@ghost
Copy link

ghost commented Jan 28, 2019

Re #2, we need to do the remainder of the 3986 normalization for file URIs as well. For example, normalize the case of the "scheme" URI component, and of the hex representations of any percent-encoded characters.

@ghost ghost added the e-ballot-3 label Mar 18, 2019
@ghost ghost added impact-non-breaking-change 2.1.0-CSD.1 Will be fixed in SARIF v2.1.0 CSD.1. design-improvement tc-34 labels Mar 28, 2019
@ghost ghost self-assigned this Mar 28, 2019
@ghost
Copy link

ghost commented Mar 28, 2019

E-BALLOT #3 PROPOSAL

  • All URIs SHALL be fully normalized according to the requirements of RFC3986. (The spec actually already said that, but I'll make it clearer).
  • For file scheme URIs, direct producers SHALL preserve the file system's case. Converters SHOULD preserve the case from the analysis tool's native output format, since they don't know the file system's casing.
  • Direct producers SHALL resolve file scheme URIs to an absolute path, and optionally split a portion of the path into a uriBaseId.
  • Converters SHOULD split everything to the left of the rightmost .. into a uriBaseId.

SCHEMA CHANGES

None

@kupsch
Copy link

kupsch commented Apr 3, 2019

Producers SHOULD not include .. or . path segments in a URI after forming the full URI. Before converting to a URI, paths that contain a .. or . should be normalized to a canonical absolute path using an appropriate algorithm for the operating system on which the tool ran. This is necessary as the path /d1/../f naively converted to a URI is file:///d1/../f which resolves to file:///f using RFC 3986. If /d1 is a symbolic link to the directory d2/d3 then the correct URI is file:///d2/f.

If a SARIF producer cannot determine the correct canonical representation of the path for some reason such as the original file system not being available, then the producer MAY generate URIs with .. segments.

For file sheme URIs, consumer must not normalize .. segments out of the path. Any paths that contain a .. segment should treat the directory formed by the segments prior to and including the .. segnent as if it were a unique directory in the file system, even if RFC 3986 normalization produces identical.

@ghost
Copy link

ghost commented Apr 6, 2019

Approved in e-ballot-3. Incorporated @kupsch's ".." improvement.

@ghost ghost closed this as completed Apr 6, 2019
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2.1.0-CSD.1 Will be fixed in SARIF v2.1.0 CSD.1. design-approved The TC approved the design and I can write the change draft design-improvement e-ballot-3 impact-non-breaking-change merged Changes merged into provisional draft. resolved-fixed tc-34
Projects
None yet
Development

No branches or pull requests

2 participants