Refactor formatting code #189

niklasmohrin · 2021-05-10T21:35:40Z

This aims to introduce some abstractions where needed and refactor code that has accumulated complexity over time

niklasmohrin · 2021-05-10T21:52:49Z

@dbrgn I bumped MSRV to 1.52 here because I wanted to use str::split_once. Like I said, I don't really see why we would have to support older versions of the compiler. Are you okay with that or do you see any problems?

niklasmohrin · 2021-05-10T22:41:29Z

Tests are complaining about highlighting, but I think it looks fine. I checked the binary output and it seems that the current implementation outputs more color modifiers than what I refactored it to (left / "stable.out" is current impl, right / "master.out" is this PR)

$ du *.out --bytes
1471    master.out
1542    stable.out

Can someone take a look that knows exactly what is important for the coloring? It looks just fine to me 😄 @SimplyDanny

SimplyDanny

The colorization looks fine for what I have checked so far. I guess, the reference files can just be updated.

Some suggestions:

src/formatter.rs

SimplyDanny

Just some more ideas ...

src/formatter.rs

niklasmohrin · 2021-06-02T20:54:17Z

@dbrgn @SimplyDanny I think I managed to include everything I had in mind, your turn :)

niklasmohrin · 2021-06-06T10:27:51Z

@dbrgn In case you feel like merging this PR, please make sure not to squash the history, so that we can keep the documentation I put in the commits

dbrgn

Thanks, this is a clear improvement.

However, with all the ad-hoc parsing in multiple passes (first we detect line types in the tokenizer, then we detect variables in highlihgt_code, then we detect the code name in highlight_code_segment I wonder if in the long term it would be better if the tokenizer (now line iterator) were more powerful and could emit more fine-grained tokens on the first pass. We would need up to len(command_name) lookahead, but that's not an issue at all. This way, we would not need to differentiate between LineType tokens and HighlightingSnippet anymore (they are kind of similar if you give up the line based approach that I used initially for simplicity).

The main reason for splitting tokenization and output formatting in parsers is that the processing of the input data is separated from the formatting of the output. Right now we process input data both in the tokenizer and in the formatter, which isn't really the cleanest approach.

(However, that was just an observation and does not affect this PR.)

@dbrgn In case you feel like merging this PR, please make sure not to squash the history, so that we can keep the documentation I put in the commits

Sure, which such clean commit descriptions there's no need for squashing :) If you want, you could squash the commit "Remove unneeded imports in main.rs" into the first commit ("Add FindFrom extension trait to str to start searching at a byte offset") which added those imports in the first place.

src/formatter.rs

src/line_iterator.rs

src/formatter.rs

dbrgn · 2021-09-07T20:22:40Z

I think only the renamings and a rebase against master are left until we can merge this!

…offset. This also moves the contents of `dedup.rs` into a shared module `extensions`.

This includes - adding the `HighlightingSnippet` enum as a common ground for highlighting and printing code to communicate - decomposing `print_lines` accordingly - clearing up `highlight_code_segment` (previously `highlight_command`) - adding unit tests (now that they can reason about `HighlightingSnippet`s instead of having to output on the integration test level

This also replaces the `next_token` method with the `next` method from the `Iterator` trait.

This moves `print_page` from `main.rs` and `print_snippet` from `formatter.rs` into a new file `output.rs`. To decompose `print_snippet` from `print_lines`, the latter now takes the `yield_snippet` callable as an argument (similar to how the helper methods got a hold of it). In order for this to work, callers have to provide a function that is generic over all possible snippet lifetimes.

As someone reading this file for the first time, I would want to see the only public and most general function first and find the specifics further down instead of having to look for the "module entry" first.

Although this did not fail on old master, it is still an important case to test. Actually, this test doesn't even break when introducing some errors that fail i18n unit tests, unless the command name itself is non-ascii.

- Rename: `yield_snippet` => `process_snippet` - Rename: `HighlightingSnippet` => `PageSnippet`

dbrgn

Cool, thank you!

niklasmohrin mentioned this pull request May 10, 2021

Directly return in case of an empty command #178

Merged

niklasmohrin force-pushed the simplify_highlight_commands branch 2 times, most recently from e406d4d to 8e8da2b Compare May 10, 2021 21:50

niklasmohrin force-pushed the simplify_highlight_commands branch 2 times, most recently from ac52725 to e4c23eb Compare May 10, 2021 22:44

niklasmohrin requested a review from dbrgn May 10, 2021 22:44

niklasmohrin force-pushed the simplify_highlight_commands branch 2 times, most recently from c395989 to f187714 Compare May 10, 2021 22:51

SimplyDanny suggested changes May 11, 2021

View reviewed changes

src/formatter.rs Outdated Show resolved Hide resolved

src/formatter.rs Outdated Show resolved Hide resolved

src/formatter.rs Outdated Show resolved Hide resolved

niklasmohrin force-pushed the simplify_highlight_commands branch from 0b30256 to de5f461 Compare May 16, 2021 09:50

SimplyDanny reviewed May 16, 2021

View reviewed changes

src/formatter.rs Outdated Show resolved Hide resolved

src/formatter.rs Outdated Show resolved Hide resolved

src/formatter.rs Outdated Show resolved Hide resolved

niklasmohrin force-pushed the simplify_highlight_commands branch 2 times, most recently from fd2a39f to 8a570f2 Compare May 21, 2021 10:50

niklasmohrin force-pushed the simplify_highlight_commands branch from e323830 to 25cde36 Compare June 2, 2021 19:34

niklasmohrin marked this pull request as ready for review June 2, 2021 20:40

dbrgn requested changes Jun 6, 2021

View reviewed changes

niklasmohrin force-pushed the simplify_highlight_commands branch from 5e679f2 to 2515abb Compare June 10, 2021 15:02

niklasmohrin requested a review from dbrgn June 29, 2021 09:47

niklasmohrin force-pushed the simplify_highlight_commands branch from 2515abb to df100dd Compare August 20, 2021 10:01

niklasmohrin mentioned this pull request Sep 7, 2021

Custom patches should not need to be formatted like a full page #181

Closed

dbrgn added the waiting-for-author The PR needs an update before it can be considered for merging. label Sep 7, 2021

niklasmohrin added 4 commits September 8, 2021 14:38

Add FindFrom extension trait to str to start searching at a byte …

cdbca5c

…offset. This also moves the contents of `dedup.rs` into a shared module `extensions`.

Rename Tokenizer to LineIterator

e8b1c9e

This also replaces the `next_token` method with the `next` method from the `Iterator` trait.

niklasmohrin added 4 commits September 8, 2021 14:44

Reorder functions in formatter.rs

743998a

As someone reading this file for the first time, I would want to see the only public and most general function first and find the specifics further down instead of having to look for the "module entry" first.

Update inkscape-default.expected

22baa45

Add integration test for non-ascii page rendering

b216a63

Although this did not fail on old master, it is still an important case to test. Actually, this test doesn't even break when introducing some errors that fail i18n unit tests, unless the command name itself is non-ascii.

Final touches for formatting refactor

3ac087a

- Rename: `yield_snippet` => `process_snippet` - Rename: `HighlightingSnippet` => `PageSnippet`

niklasmohrin force-pushed the simplify_highlight_commands branch from df100dd to 3ac087a Compare September 8, 2021 13:04

niklasmohrin added ready-for-review The PR can be reviewed and is waiting for the maintainers. and removed waiting-for-author The PR needs an update before it can be considered for merging. labels Sep 8, 2021

dbrgn added this to the v1.5.0 milestone Sep 12, 2021

dbrgn approved these changes Sep 12, 2021

View reviewed changes

dbrgn merged commit a811788 into dbrgn:master Sep 12, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor formatting code #189

Refactor formatting code #189

niklasmohrin commented May 10, 2021

niklasmohrin commented May 10, 2021

niklasmohrin commented May 10, 2021

SimplyDanny left a comment

SimplyDanny left a comment

niklasmohrin commented Jun 2, 2021

niklasmohrin commented Jun 6, 2021

dbrgn left a comment •

edited

dbrgn commented Sep 7, 2021 •

edited

dbrgn left a comment

Refactor formatting code #189

Refactor formatting code #189

Conversation

niklasmohrin commented May 10, 2021

niklasmohrin commented May 10, 2021

niklasmohrin commented May 10, 2021

SimplyDanny left a comment

Choose a reason for hiding this comment

SimplyDanny left a comment

Choose a reason for hiding this comment

niklasmohrin commented Jun 2, 2021

niklasmohrin commented Jun 6, 2021

dbrgn left a comment • edited

Choose a reason for hiding this comment

dbrgn commented Sep 7, 2021 • edited

dbrgn left a comment

Choose a reason for hiding this comment

dbrgn left a comment •

edited

dbrgn commented Sep 7, 2021 •

edited