get rid of str_len() #6

Open
DerDakon opened this issue Jul 8, 2019 · 4 comments

@DerDakon
Member

DerDakon commented Jul 8, 2019

strlen() is one of the functions that a compiler usually optimizes away, and much more efficient implementations are possible on today's SIMD processors. The old str_len() implementation uses loop unrolling, a common 1990s optimization technique that has mostly lost its benefit today.
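
For readers unfamiliar with the technique, here is a minimal sketch of what a hand-unrolled length function can look like. The name `unrolled_len` and the unroll factor of four are assumptions made for illustration; this is not copied from the project's str_len().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hand-unrolled length function: four byte tests per loop
 * iteration instead of one. Each byte is only read after all earlier
 * bytes were found to be non-zero, so it never reads past the terminator. */
size_t unrolled_len(const char *s)
{
    const char *t = s;

    for (;;) {
        if (!t[0]) return (size_t)(t - s);
        if (!t[1]) return (size_t)(t - s) + 1;
        if (!t[2]) return (size_t)(t - s) + 2;
        if (!t[3]) return (size_t)(t - s) + 3;
        t += 4;
    }
}

/* Sanity check: the unrolled version must agree with the libc strlen(). */
void check_against_strlen(const char *s)
{
    assert(unrolled_len(s) == strlen(s));
}
```

The unrolling only reduces loop overhead; the function still inspects one byte at a time, whereas a libc strlen() is free to examine many bytes per instruction.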

@schmonz
Member

schmonz commented Jul 8, 2019

Depending on the details of the change, I have the same concerns about timing as I expressed in #7.

@DerDakon
Member Author

str_len() also has some friends that can be handled the same way; see the std-functions branch.

@mbhangui
Contributor

I did some tests and it turns out the glibc versions are much faster: strlen is faster than str_len, strncmp is faster than str_diffn, strrchr is faster than str_rchr, and so on. I am curious to know what makes the glibc versions faster.

This is just one of the tests I carried out: searching for one byte in a 90-byte character string in a for() loop 1 million times. The difference is huge.

str_rchr

real 0m0.657s
user 0m0.653s
sys 0m0.002s

strrchr

real 0m0.020s
user 0m0.018s
sys 0m0.002s

I got similar results for strlen and other functions.
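
A comparison of this kind can be sketched roughly as follows. This is not the test program used above, which was not posted; `naive_rchr`, `libc_rchr`, `make_haystack`, and `time_rchr` are hypothetical names, and `naive_rchr` is a plain byte-at-a-time stand-in rather than the project's str_rchr. The calls go through a function pointer and a volatile sink to make it harder for the compiler to drop the loop, though the numbers will still depend on compiler and flags.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical byte-at-a-time search standing in for str_rchr; returns a
 * pointer to the last occurrence of c in s, or NULL if there is none.
 * Unlike strrchr, it does not match the terminating '\0'. */
const char *naive_rchr(const char *s, int c)
{
    const char *last = NULL;

    for (; *s; ++s)
        if (*s == (char)c)
            last = s;
    return last;
}

/* Thin wrapper so both candidates share one function-pointer type. */
const char *libc_rchr(const char *s, int c)
{
    return strrchr(s, c);
}

typedef const char *(*rchr_fn)(const char *, int);

/* Fill buf with 90 bytes of 'a', one 'x' at index 10, and a terminator. */
void make_haystack(char buf[91])
{
    memset(buf, 'a', 90);
    buf[10] = 'x';
    buf[90] = '\0';
}

/* Confirm that both searches agree before timing them. */
void check_agreement(const char *s, int c)
{
    assert(naive_rchr(s, c) == libc_rchr(s, c));
}

/* CPU seconds spent calling fn(s, c) `iterations` times. */
double time_rchr(rchr_fn fn, const char *s, int c, long iterations)
{
    const char *volatile sink = NULL;  /* volatile: every store must happen */
    clock_t start = clock();

    for (long i = 0; i < iterations; ++i)
        sink = fn(s, c);
    (void)sink;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling `time_rchr(naive_rchr, buf, 'x', 1000000L)` and `time_rchr(libc_rchr, buf, 'x', 1000000L)` on a buffer prepared with `make_haystack` gives two figures to compare.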

@DerDakon
Member Author

They are faster for multiple reasons:

  • they use assembler implementations for the most common platforms, making use of SIMD instructions where possible
  • the compiler will optimize these calls if it can, i.e. if you search for something constant it will probably bake that constant directly into the generated assembler
  • the compiler knows that calling these functions with the same arguments multiple times will return the same values, so it makes just one call
  • if you are operating on a constant string, the compiler often just optimizes away the whole call, directly embedding the result in your binary

No idea which of these applies in your case, but exactly those are the reasons to get rid of the str_*() functions.
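
Two of the cases listed above can be illustrated with a small sketch. The function names are hypothetical, and whether a given compiler actually performs these optimizations depends on the compiler, its version, and the flags used; the comments describe what optimizers such as GCC and Clang are generally able to do, not a guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Constant string: an optimizing compiler can typically replace the call
 * with the constant 5, so no strlen() call remains in the binary. */
size_t literal_len(void)
{
    return strlen("hello");
}

/* Same argument twice: the compiler knows strlen() has no side effects
 * and s is not modified in between, so it may call it only once. */
size_t doubled_len(const char *s)
{
    return strlen(s) + strlen(s);
}

/* The results are the same whether or not the optimizations happen. */
void check_examples(void)
{
    assert(literal_len() == 5);
    assert(doubled_len("abc") == 6);
}
```

One way to see what a particular compiler does is to compare the assembler output of `cc -O2 -S` with that of `cc -O0 -S` for a file like this. A hand-written replacement such as str_len() is just an ordinary function to the compiler, so it does not get this special treatment.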
