Merged
5 changes: 3 additions & 2 deletions Chapters/4.Syntax.md
@@ -39,8 +39,9 @@ the following grammar:
<visit_ctxt> ::= "visit" "=" <identifier_core> ;
<anchor_ctxt> ::= "anchor" "=" <identifier_core> ;
<path_ctxt> ::= "path" "=" <path_absolute_escaped> ;
-<fragment_qualifier> ::= "lines" "=" <line_number> ["-" <line_number>] ;
-<line_number> ::= <dec_digit> + ;
+<fragment_qualifier> ::= "lines" "=" <range> | "bytes" "=" <range> ;
+<range> ::= <number> ["-" <number>] ;
+<number> ::= <dec_digit> + ;
<url_escaped> ::= (* RFC 3987 IRI *)
<path_absolute_escaped> ::= (* RFC 3987 absolute path *)
```
19 changes: 18 additions & 1 deletion Chapters/6_Qualified_identifiers.md
@@ -18,13 +18,30 @@ The following *context qualifiers* are available:
A "line" in the context of file content refers to a sequence of characters that ends with a line break. A line can contain text, code, or any other form of data. In this specification, the line break is the ASCII LF character.
The "lines" qualifier makes it possible to designate a line range inside a content.
The range can be a single line number, or a pair of line numbers separated by the ASCII `-` character.
Line numbers start from 1, and the range is inclusive, i.e. the fragment includes both the start line and the end line of the range.

For example, [`swh:1:cnt:4d99d2d18326621ccdd70f5ea66c2e2ac236ad8b;lines=9-15`](https://archive.softwareheritage.org/swh:1:cnt:4d99d2d18326621ccdd70f5ea66c2e2ac236ad8b;lines=9-15)
designates the function `generate_intput_stream` that is found at lines 9 to 15 of the *content* with core SWHID `swh:1:cnt:4d99d2d18326621ccdd70f5ea66c2e2ac236ad8b`.

Note that the notion of "line number" is not always meaningful: the content
may be a binary file, or a file that uses non-standard line termination character(s).
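
The line-range semantics described above can be illustrated with a minimal Python sketch. It is not part of the specification: the helper name `select_lines` is hypothetical, and treating an unterminated final chunk as a line is an assumption made here for convenience.

```python
from typing import Optional


def select_lines(content: bytes, start: int, end: Optional[int] = None) -> bytes:
    """Return lines start..end (1-based, inclusive) of content.

    Hypothetical helper: only the ASCII LF byte counts as a line break.
    """
    if end is None:
        end = start  # a single line number designates just that line
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    # bytes.splitlines() is avoided because it also splits on CR and CRLF.
    parts = content.split(b"\n")
    # Re-attach the LF to every piece except the last one, which is either
    # empty (content ended with LF) or an unterminated final line.
    lines = [part + b"\n" for part in parts[:-1]]
    if parts[-1]:
        lines.append(parts[-1])  # assumption: count the unterminated tail as a line
    return b"".join(lines[start - 1:end])
```

With this sketch, a file using CRLF terminators still yields lines, but each keeps its trailing CR, which shows why line numbers depend on a convention about line breaks.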

### 6.1.2 Bytes qualifier

To overcome the limitations of the lines qualifier, the bytes qualifier makes it
possible to designate a byte range inside a content. The range can be a single byte number, or a pair of byte numbers separated by the ASCII `-` character.
Byte numbers start from 0, and the range is inclusive, i.e. the fragment includes both the start byte and the end byte of the range.
If the range is a single byte number, it designates the byte at that specific position.

For example, `swh:1:cnt:4d99d2d18326621ccdd70f5ea66c2e2ac236ad8b;bytes=154-315`
designates the same function `generate_intput_stream` as in the example above, but
does not rely on any convention about line numbers.
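
As a hedged illustration of the 0-based, inclusive byte semantics, the following Python sketch selects a byte range and converts a line range into the equivalent byte range. Both function names are hypothetical, and the conversion assumes LF line breaks as defined for the lines qualifier.

```python
from typing import Optional, Tuple


def select_bytes(content: bytes, start: int, end: Optional[int] = None) -> bytes:
    """Return bytes start..end (0-based, inclusive) of content."""
    if end is None:
        end = start  # a single byte number designates just that byte
    if start < 0 or end < start:
        raise ValueError("expected 0 <= start <= end")
    # Python slices exclude the upper bound, hence end + 1.
    return content[start:end + 1]


def lines_to_bytes(content: bytes, first: int, last: int) -> Tuple[int, int]:
    """Convert a 1-based inclusive line range to a 0-based inclusive byte range."""
    # starts[k] is the offset of the first byte of line k + 1.
    starts = [0] + [i + 1 for i, byte in enumerate(content) if byte == 0x0A]
    if starts[-1] == len(content):
        starts.pop()  # no line begins after a trailing LF (or in empty content)
    if not 1 <= first <= last <= len(starts):
        raise ValueError("line range is outside the content")
    begin = starts[first - 1]
    # The range ends at the LF closing line `last`, or at the final byte of
    # the content when that line is the unterminated last one.
    end = starts[last] - 1 if last < len(starts) else len(content) - 1
    return begin, end
```

For example, with `content = b"ab\ncd\nef\n"`, lines 2 to 3 map to bytes 3 to 8, and `select_bytes(content, 3, 8)` returns `b"cd\nef\n"`.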

### 6.1.3 Bytes and lines qualifiers are mutually exclusive

The `bytes` and `lines` qualifiers are mutually exclusive: a valid SWHID MUST NOT contain both qualifiers.
A conformant implementation MAY accept a SWHID that violates this constraint by ignoring the `lines` qualifier when the `bytes` qualifier is present.
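
One possible way to enforce this rule is sketched below in Python. The function name `fragment_qualifier` and its `strict` flag are hypothetical, and the sketch assumes that qualifier values never contain an unescaped `;`, so splitting on `;` is safe. It is an illustration, not a reference implementation.

```python
import re
from typing import Dict, Optional, Tuple

# <range> ::= <number> ["-" <number>] with <number> ::= <dec_digit>+
_RANGE = re.compile(r"([0-9]+)(?:-([0-9]+))?")


def fragment_qualifier(swhid: str, strict: bool = True) -> Optional[Tuple[str, int, int]]:
    """Return (kind, start, end) for the lines/bytes qualifier, or None."""
    _core, *qualifiers = swhid.split(";")
    found: Dict[str, Tuple[int, int]] = {}
    for qualifier in qualifiers:
        key, _, value = qualifier.partition("=")
        if key not in ("lines", "bytes"):
            continue  # context qualifiers are out of scope for this sketch
        match = _RANGE.fullmatch(value)
        if match is None:
            raise ValueError(f"malformed range in {qualifier!r}")
        start = int(match.group(1))
        # A single number designates a one-element range.
        end = int(match.group(2)) if match.group(2) is not None else start
        found[key] = (start, end)
    if "lines" in found and "bytes" in found:
        if strict:
            raise ValueError("lines and bytes qualifiers are mutually exclusive")
        del found["lines"]  # lenient mode: ignore lines when bytes is present
    for kind, (start, end) in found.items():
        return kind, start, end
    return None
```

In strict mode a SWHID carrying both qualifiers is rejected. With `strict=False` the sketch follows the MAY clause and keeps only the `bytes` range.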

## 6.2 Context qualifiers

### 6.2.1 Origin qualifier
@@ -75,7 +92,7 @@ its full state had the SWHID core identifier `swh:1:snp:d7f1b9eb7ccb596c2622c478
We recommend equipping identifiers meant to be shared with as many
qualifiers as possible. While qualifiers may be listed in any order, it
is good practice to present them in the following order:
-`origin`, `visit`, `anchor`, `path` and `lines`. Redundant information
+`origin`, `visit`, `anchor`, `path`, `lines` or `bytes`. Redundant information
should be omitted: for example, if the *visit* is present, and the
*path* is relative to the snapshot indicated there, then the *anchor*
qualifier is superfluous; similarly, if the *path* is empty, it may be