Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve ada::can_parse performance #377

Open
anonrig opened this issue May 8, 2023 · 15 comments
Open

Improve ada::can_parse performance #377

anonrig opened this issue May 8, 2023 · 15 comments
Labels
help wanted Extra attention is needed

Comments

@anonrig
Copy link
Member

anonrig commented May 8, 2023

It is possible to speed-up ada::can_parse performance.

@anonrig anonrig added the help wanted Extra attention is needed label Jun 25, 2023
@darowny
Copy link

darowny commented Sep 13, 2023

I would like to try this

@darowny
Copy link

darowny commented Sep 13, 2023

@anonrig do you have anything specific in mind ? or should I just investigate and try to find something ?

@anonrig
Copy link
Member Author

anonrig commented Sep 13, 2023

ideally, we should try to avoid any string allocations whenever we are calling can_parse

@darowny
Copy link

darowny commented Sep 13, 2023

I looked briefly and I am not sure what can be done directly in ada:can_parse
Maybe:

  1. removing allocations in the parse_url itself and url_aggregator?
  2. changing parse signature to maybe take directly some views and changing url_aggregator in the process ?
  3. maybe parsing base + rest together ? it should fail early anyway if base is not ok and we don't have to allocate result for base before we parse together anyway and have to allocate again into aggregator

@darowny
Copy link

darowny commented Nov 25, 2023

I give up on this someone else might take over

@CarlosEduR
Copy link
Contributor

what do you guys use for profiling? I don't have much experience with C++, but I'd like to give a shot here

@lemire
Copy link
Member

lemire commented May 10, 2024

@CarlosEduR What do you run in your computer? Windows, macOS, Linux?

@CarlosEduR
Copy link
Contributor

@lemire I currently have a dual-boot so I run Linux and Windows, but I usually go with Linux for coding.

@lemire
Copy link
Member

lemire commented May 10, 2024

@CarlosEduR Ok. So the first step is 'just' to add can_parse to our benchmarks:

static void BasicBench_AdaURL(benchmark::State& state) {

Basically copying this code, and replacing 'parse' by 'can_parse' ought to do.

This would be an excellent start.

@CarlosEduR
Copy link
Contributor

@lemire awesome, I'll do it right now.

@CarlosEduR
Copy link
Contributor

@lemire can_parse benchmark added:


bytes/URL: 86.859205
curl : OMITTED
input bytes: 8688092
number of URLs: 100025
performance counters: Enabled
rust version : 1.76.0
zuri : OMITTED
--------------------------------------------------------------------------------------------
Benchmark                                  Time             CPU   Iterations UserCounters...
--------------------------------------------------------------------------------------------
BasicBench_AdaURL_href              30622042 ns     30620308 ns           23 GHz=4.24318 cycle/byte=14.8028 cycles/url=1.28576k instructions/byte=34.4159 instructions/cycle=2.32496 instructions/ns=9.86522 instructions/url=2.98934k ns/url=303.018 speed=283.736M/s time/byte=3.5244ns time/url=306.127ns url/s=3.26662M/s
BasicBench_AdaURL_aggregator_href   19617809 ns     19616697 ns           36 GHz=4.18634 cycle/byte=9.36962 cycles/url=813.838 instructions/byte=23.009 instructions/cycle=2.45571 instructions/ns=10.2804 instructions/url=1.99855k ns/url=194.403 speed=442.893M/s time/byte=2.25788ns time/url=196.118ns url/s=5.09897M/s
BasicBench_AdaURL_CanParse          19694538 ns     19692588 ns           36 GHz=4.27876 cycle/byte=9.55048 cycles/url=829.548 instructions/byte=23.5847 instructions/cycle=2.46948 instructions/ns=10.5663 instructions/url=2.04855k ns/url=193.876 speed=441.186M/s time/byte=2.26662ns time/url=196.877ns url/s=5.07932M/s
BasicBench_whatwg                   53192626 ns     53182009 ns           13 GHz=4.24331 cycle/byte=25.8289 cycles/url=2.24348k instructions/byte=68.8033 instructions/cycle=2.66381 instructions/ns=11.3034 instructions/url=5.9762k ns/url=528.709 speed=163.365M/s time/byte=6.12125ns time/url=531.687ns url/s=1.88081M/s
BasicBench_ServoUrl                 84392206 ns     84384304 ns            8 GHz=4.24188 cycle/byte=40.2331 cycles/url=3.49461k instructions/byte=108.216 instructions/cycle=2.68973 instructions/ns=11.4095 instructions/url=9.39958k ns/url=823.836 speed=102.959M/s time/byte=9.71264ns time/url=843.632ns url/s=1.18535M/s

@lemire
Copy link
Member

lemire commented May 12, 2024

@CarlosEduR Would you issue a PR? This is fantastic!!!

Once we have good metrics, we can more easily test out ideas. It seems that there are obvious things that we can drop... e.g., sections of code that always succeed and do not impact the rest of the processing. We should be able to go step-by-step.

@CarlosEduR
Copy link
Contributor

@lemire sure, I'll open a PR soon! And it sounds interesting, I'd love to help more on that too.

@CarlosEduR
Copy link
Contributor

@lemire @anonrig I'd like to keep the discussion

@lemire
Copy link
Member

lemire commented May 14, 2024

Yes. Absolutely.

Right now I don't know the easiest way to go forward.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

4 participants