
Misc clap needs #82

Closed
epage opened this issue Mar 1, 2023 · 6 comments
Comments

@epage
Contributor

epage commented Mar 1, 2023

For clap and related applications, I've been looking into ANSI parsers. None quite meet my needs, so I was looking at writing my own (or forking vte and turning it into what I want), but I figured I'd reach out first in case there is a way to make this work within vte and it's of interest.

vte seems like it is optimized for use in alacritty. Performance would be most noticeable in highly interactive applications, which have a high escape-code-to-text ratio. Since every control code needs to be processed to render correctly in alacritty, vte cares more about the control-code side of printable control codes like \n. As the build times and binary size are likely a drop in the bucket for alacritty, I'm assuming they haven't been optimized.

My needs are the opposite of the above. My applications are static and likely to have a low escape-code-to-text ratio, and I would want to treat all printable control codes as text. Technically, this can all be handled with vte's design, but the char-by-char processing won't be optimal; instead, I'd want to deal with slices of text. clap is also a heavily used crate, and there is a lot of interest in managing its build times and binary sizes. I could likely get away with replacing the proc-macro-generated state tables with a function of matches, with little performance hit but massive compile-time and binary-size improvements.
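As a minimal sketch of the slice-oriented interface described here: instead of feeding bytes one at a time, find the next ESC (0x1b) and hand back the whole printable run as one slice. `next_text_run` is a hypothetical name for illustration, not an existing vte API.

```rust
// Hypothetical slice-oriented entry point: split the input into the
// leading printable run and the remainder starting at the first ESC.
fn next_text_run(input: &[u8]) -> (&[u8], &[u8]) {
    // `position` stands in for a memchr-style scan in this sketch.
    match input.iter().position(|&b| b == 0x1b) {
        Some(i) => (&input[..i], &input[i..]),
        None => (input, &[]),
    }
}

fn main() {
    let (text, rest) = next_text_run(b"hello \x1b[31mred");
    assert_eq!(text, &b"hello "[..]);
    assert_eq!(rest, &b"\x1b[31mred"[..]);
}
```

The caller can then forward `text` wholesale and run the state machine only over the escape bytes in `rest`.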

What I'm trying to decide is how much to contribute to vte to handle both cases, or if it's better to go my own route. Thoughts?

(sorry, wasn't sure if there was a better medium to reach out)

@chrisduerr
Member

I'm not sure how much detail you actually need to figure out about the escape sequences, or if you're just looking to discard them, but to me it sounds like there are more things that your use case doesn't share with vte than things that are shared.

Ultimately, if you take out the char-by-char parsing, remove the performance optimizations, then throw out some of the escape sequences you don't care about, there's not much of an escape sequence parser left.

If you think you can make things work without affecting VTE's performance, I'd be happy to help adapt it to your use case. But it sounds like that might not be the best approach, depending on what your actual application is. That said, it might not make much sense to rewrite VTE entirely unless the char-by-char parsing performance and binary size actually make a difference, so I'd probably try to measure that first.

@epage
Contributor Author

epage commented Mar 1, 2023

For build times, my initial changes to vte dropped debug build times for parselog from 1.5s to 0.5s, which is significant for clap. proc-macro2 would be a reused dependency for clap_derive users, which helps somewhat.

I'm actually surprised that switching away from the table didn't change binary size, so that could possibly be reintroduced with self-modifying code.

I'll have to get further into my experiment to be able to say whether runtime performance is improved for my application. I'm actually considering this as a wrapper around Parser that delegates to Parser when needed. My hope is that this will also let me see what build times are like when dropping utf8parse.
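One way the wrapper idea could look, as a self-contained sketch: scan each slice for ESC (0x1b), copy printable runs through whole, and delegate only escape bytes to an inner byte-at-a-time parser. `ToyEscParser` here is a stand-in for `vte::Parser`, not vte's real API; it only recognizes CSI sequences and two-byte escapes.

```rust
// Toy inner parser standing in for a byte-at-a-time escape parser.
enum EscState {
    Ground,
    Escape,
    Csi,
}

struct ToyEscParser {
    state: EscState,
}

impl ToyEscParser {
    /// Feed one byte of an escape sequence; returns true once complete.
    fn advance(&mut self, byte: u8) -> bool {
        match self.state {
            // The ESC byte itself starts the sequence.
            EscState::Ground => {
                self.state = EscState::Escape;
                false
            }
            EscState::Escape => {
                if byte == b'[' {
                    self.state = EscState::Csi;
                    false
                } else {
                    // Treat `ESC x` as a complete two-byte sequence.
                    self.state = EscState::Ground;
                    true
                }
            }
            // A CSI sequence ends at a final byte in 0x40..=0x7e.
            EscState::Csi => {
                if (0x40..=0x7e).contains(&byte) {
                    self.state = EscState::Ground;
                    true
                } else {
                    false
                }
            }
        }
    }
}

/// Slice-oriented strip: printable runs are copied as whole slices; only
/// escape bytes go through the inner parser. Simplified: it assumes a
/// sequence does not span two calls.
fn strip_ansi(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut parser = ToyEscParser { state: EscState::Ground };
    let mut rest = input;
    while !rest.is_empty() {
        match rest.iter().position(|&b| b == 0x1b) {
            Some(i) => {
                out.extend_from_slice(&rest[..i]);
                // Delegate bytes until the inner parser says "done".
                let mut j = i;
                while j < rest.len() && !parser.advance(rest[j]) {
                    j += 1;
                }
                rest = if j + 1 < rest.len() { &rest[j + 1..] } else { &[] };
            }
            None => {
                out.extend_from_slice(rest);
                break;
            }
        }
    }
    out
}

fn main() {
    let stripped = strip_ansi(b"plain \x1b[31mred\x1b[0m text");
    assert_eq!(stripped, b"plain red text".to_vec());
}
```

The point of the split is that plain text never enters the state machine at all, which is where a low escape-code-to-text workload should win.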

@epage
Contributor Author

epage commented Mar 1, 2023

Dropping utf8parse saved around 40ms of build time. It's not much, but my design is unlikely to use it, so cutting it would be helpful.

@epage
Contributor Author

epage commented Mar 2, 2023

Would you be open to upstreaming parts of my experiment like:

  • Switching to criterion so you can run benches on stable, making it easier not to break benches during development
  • Renaming Action::None to Action::Nop to avoid any ambiguity when `use Action::*` happens?
  • Replacing the proc macro with test-time codegen?

Holding off on suggesting any more as I need to see how the ideas pan out
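To illustrate the `Action::None` ambiguity: Rust's prelude already brings `Option::None` into scope, so a glob import of the variants silently shadows it. The `Action` enum below is a reduced illustration, not vte's actual definition.

```rust
// Reduced illustration of the naming collision; vte's real Action enum
// has many more variants.
#[derive(Debug, PartialEq)]
enum Action {
    None, // collides with the prelude's Option::None under a glob import
    Print,
}

fn lookup(byte: u8) -> Action {
    use Action::*;
    // After the glob import, a bare `None` here means `Action::None`,
    // shadowing the prelude's `Option::None` in this scope.
    match byte {
        0x20..=0x7e => Print,
        _ => None,
    }
}

fn main() {
    assert_eq!(lookup(b'a'), Action::Print);
    assert_eq!(lookup(0x07), Action::None);
}
```

Renaming the variant to `Nop` would make a bare `None` unambiguous again, which is the motivation behind the suggestion.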

@chrisduerr
Member

> Switching to criterion so you can run benches on stable, making it easier not to break benches during development

In my experience doing benchmarks with criterion has been extremely unreliable. I doubt this would actually provide any real-world benefit.

> Renaming Action::None to Action::Nop to avoid any ambiguity when `use Action::*` happens?

I'd highly discourage any imports like this in Rust code, but I suppose I'd be open to the idea of changing it.

> Replacing the proc macro with test-time codegen?

You mean code generation that has to be done manually? That seems extremely inconvenient for development and its benefits seem questionable to me.

epage added a commit to rust-cli/anstyle that referenced this issue Mar 2, 2023
In #4, I researched different libs I could use for this but none fit the
bill.  I approached `vte` about making changes to fit our needs but it
didn't seem like there was interest
(alacritty/vte#82), so instead we are using it
as the base for what we need.
@epage
Contributor Author

epage commented Mar 3, 2023

> I'd highly discourage any imports like this in Rust code, but I suppose I'd be open to the idea of changing it.

Oh, I agree about avoiding `use Action::*`; I just did it for the sake of some dropped changes I made for my alternative to the proc macro. I kept the change as I felt it was a slight improvement, but one I'm willing to drop.

> You mean code generation that has to be done manually? That seems extremely inconvenient for development and its benefits seem questionable to me.

I've found that for low-churn codegen, it offers big compile-time improvements, and using a snapshotting library with it makes it easy to enforce and update. I've switched all of my codegen from build.rs to this approach and generally recommend it. As I linked before, matklad (of rust-analyzer fame) also does so.
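The enforcement mechanism can be sketched as an ordinary test: a generator function produces the table source, and the test compares it against the checked-in file, failing (with instructions to regenerate) on drift. `generate_table` and the file path below are hypothetical placeholders, not vte's actual generator.

```rust
use std::fmt::Write;

// Stand-in for the real state-table generator; the output format is
// illustrative only.
fn generate_table() -> String {
    let mut out = String::from("// @generated: do not edit by hand\n");
    for state in 0..4 {
        writeln!(out, "pub const STATE_{}: u8 = {};", state, state).unwrap();
    }
    out
}

fn main() {
    let expected = generate_table();
    // In a real test-time codegen setup this would read the snapshot:
    //   let actual = std::fs::read_to_string("src/table.rs").unwrap();
    //   assert_eq!(actual, expected, "stale src/table.rs; re-run the generator");
    let actual = expected.clone(); // simulated checked-in snapshot
    assert_eq!(actual, expected);
}
```

Because the generator only runs under `cargo test`, regular builds pay no proc-macro or build-script cost, which is the compile-time win being claimed.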

Thought I'd post an update now that I also have runtime performance numbers.

Key:

  • demo.vte was the "testfile" benchmark
  • state_changes was the data from the `state_changes` benchmark
  • rg_help.vte is rg -h which has no escape codes
  • rg_linus.vte is rg -i linux --color always on the Linux code base
  • advance is Parser::advance
  • advance_strip is Parser::advance used in a recreation of strip-ansi-escapes
  • next_bytes_strip is my new slice-oriented function used to recreate strip-ansi-escapes
advance/demo.vte        time:   [50.155 µs 50.369 µs 50.647 µs]
advance_strip/demo.vte  time:   [58.339 µs 58.573 µs 58.836 µs]
next_bytes_strip/demo.vte
                        time:   [49.043 µs 49.267 µs 49.525 µs]

advance/rg_help.vte     time:   [17.184 µs 17.236 µs 17.293 µs]
advance_strip/rg_help.vte
                        time:   [23.440 µs 23.483 µs 23.529 µs]
next_bytes_strip/rg_help.vte
                        time:   [4.3577 µs 4.3745 µs 4.3926 µs]

advance/rg_linus.vte    time:   [381.53 µs 382.55 µs 383.68 µs]
advance_strip/rg_linus.vte
                        time:   [447.79 µs 449.93 µs 452.51 µs]
next_bytes_strip/rg_linus.vte
                        time:   [375.67 µs 376.91 µs 378.20 µs]

advance/state_changes   time:   [69.255 ns 69.533 ns 69.847 ns]
advance_strip/state_changes
                        time:   [79.295 ns 79.547 ns 79.809 ns]
next_bytes_strip/state_changes
                        time:   [107.38 ns 107.96 ns 108.55 ns]

next_bytes passes all the same tests as advance. There are some more scenarios I'd like to write tests for, and I'd like to throw some fuzzing at it.
