Examples vs. syntax spec inconsistencies

Hi! I've recently been working on a Rust implementation for this, and found a few corner cases where the [syntax](https://github.com/google/hrx/tree/502116a49c0e9619e05633f1aefa8daeae63d972#syntax) from the README didn't match up with the examples:

The most straightforward one is probably the [`duplicate-despite-quotes.hrx`](https://github.com/google/hrx/blob/502116a49c0e9619e05633f1aefa8daeae63d972/example/invalid/duplicates.hrx#L16) part of `example/invalid/duplicates.hrx`: it looks like quotes are no longer special-cased (the syntax says nothing about them, see #1)?

The next one I ran into was [`duplicate-files.hrx`](https://github.com/google/hrx/blob/502116a49c0e9619e05633f1aefa8daeae63d972/example/invalid/duplicates.hrx#L4) from that same file. According to the spec, that archive should (a) be valid and (b) contain a single file whose body is `<======> file\n`.
My reasoning is as follows: `contents` is defined as "any sequence of characters that does not include U+000A LINE FEED followed immediately by `boundary`", and `file` as `boundary " "+ path newline body?`.
Now, given a buffer containing

```
<======> file
A      BCD  EF
<======> file

```

We can see that the AB span matches `boundary`, C the spaces, DE the `path`, and F the `newline`. What is left to match is the optional `body`, which consists of the following:

```
<======> file

```

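Note how this chunk doesn't start with U+000A LINE FEED, despite the line starting with `boundary`: the preceding line feed was already consumed as the header's `newline`. No LF+`boundary` sequence occurs in the chunk, which means that the file contents continue until EOF.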
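
A minimal Rust sketch of this literal reading, for illustration only: `literal_body` is a hypothetical helper (not code from any existing HRX implementation), and it assumes the header line, including its `newline`, has already been consumed.

```rust
/// Returns the `body` (contents plus its trailing newline) found at the start
/// of `rest`, reading the quoted rule literally: contents end only at a
/// U+000A LINE FEED that is immediately followed by `boundary`.
///
/// Hypothetical helper; assumes `rest` begins right after the header's newline.
fn literal_body<'a>(rest: &'a str, boundary: &str) -> &'a str {
    // The only terminator the quoted rule recognises is LF + boundary.
    let terminator = format!("\n{}", boundary);
    match rest.find(&terminator) {
        // Keep the line feed itself: it is the `newline` that closes `body`.
        Some(lf_index) => &rest[..=lf_index],
        // No LF + boundary anywhere, so the body runs until EOF.
        None => rest,
    }
}
```

Called as `literal_body("<======> file\n", "<======>")`, this returns the whole remaining input, `"<======> file\n"`, because the second boundary sits at the very start of the chunk rather than after a line feed.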

The third mismatch is in [`example/empty-file.hrx`](https://github.com/google/hrx/blob/502116a49c0e9619e05633f1aefa8daeae63d972/example/empty-file.hrx). Using the same labels as before, we get (after the first comment)

```
<===> file1
A   BCD   EF
<===>
So is this one.
<===> file2

```

which means the first LF+`boundary` sequence is only hit on the line declaring `file2`. My parser returns `{file1: {cmt: "This file is empty.", ctnt: "<===>\nSo is this one."}, file2: { cmt: null, ctnt: "" }}`, which I believe is correct going solely by the syntax.

My hunch is that these weren't noticed earlier because of the use of splitting parsers (e.g. in [`hrx.js`](https://github.com/rebeccajae/hrx.js) and [`hrx.py`](https://github.com/rebeccajae/hrx.py)), which probably handle these examples as expected.
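
To illustrate the difference, here is a rough sketch of what a line-splitting approach could look like. This is an assumption about the general technique, not a description of how `hrx.js` or `hrx.py` actually work; `split_entries` is a made-up name, and the sketch skips path validation and the comment/file distinction.

```rust
/// Hypothetical line-oriented splitter: every line that starts with
/// `boundary` opens a new entry, no matter what came before it.
/// Returns `(path, body)` pairs; `path` is empty for a path-less header.
fn split_entries<'a>(input: &'a str, boundary: &str) -> Vec<(&'a str, String)> {
    let mut entries: Vec<(&str, String)> = Vec::new();
    for line in input.split_inclusive('\n') {
        if let Some(after_boundary) = line.strip_prefix(boundary) {
            // Header line: whatever follows the boundary is taken as the path.
            entries.push((after_boundary.trim(), String::new()));
        } else if let Some((_, body)) = entries.last_mut() {
            // Any other line belongs to the most recently opened entry.
            body.push_str(line);
        }
    }
    entries
}
```

On the `empty-file.hrx` excerpt above (`"<===> file1\n<===>\nSo is this one.\n<===> file2\n"` with boundary `<===>`), this yields three entries: an empty `file1`, a path-less entry holding `"So is this one.\n"`, and an empty `file2`. That matches what the example appears to intend, whereas the literal grammar reading folds the second boundary line into the body of `file1`.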

I'd be more than happy to submit a PR addressing these issues, if deemed valid :)
