Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of URI#hostname(=) [RM#17219] #3635

Closed
wants to merge 1 commit into from

Conversation

lzap
Copy link
Contributor

@lzap lzap commented Oct 7, 2020

Hello,

URI#hostname extends URI#host with IPv6 support. In URI, IPv6 address must have square brackets (e.g. http://[2001:db8::1]), URI#hostname strips these characters out while URI#hostname= adds them if missing. There are three regular expressions to perform these tasks which can dramatically slow down performance. I am attaching a two-line patch and here is a benchmark: https://gist.github.com/lzap/24cbecb47daf29111350e41a24250922

https://bugs.ruby-lang.org/issues/17219

Calculating -------------------------------------
hostname on an ipv6 (orig)
                          4.018M (± 0.4%) i/s -     20.175M in   5.021128s
hostname on an ipv6 (new)
                          4.753M (± 0.5%) i/s -     23.930M in   5.034444s

Comparison:
hostname on an ipv6 (new):  4753483.7 i/s
hostname on an ipv6 (orig):  4018053.7 i/s - 1.18x  (± 0.00) slower

Calculating -------------------------------------
hostname on an ipv4 (orig)
                          4.145M (± 0.8%) i/s -     21.031M in   5.073638s
hostname on an ipv4 (new)
                          4.960M (± 0.6%) i/s -     25.142M in   5.069027s

Comparison:
hostname on an ipv4 (new):  4960111.9 i/s
hostname on an ipv4 (orig):  4145365.6 i/s - 1.20x  (± 0.00) slower

Calculating -------------------------------------
hostname on an host (orig)
                          4.183M (± 0.4%) i/s -     20.984M in   5.016865s
hostname on an host (new)
                          4.976M (± 0.7%) i/s -     25.213M in   5.066627s

Comparison:
hostname on an host (new):  4976454.5 i/s
hostname on an host (orig):  4182840.9 i/s - 1.19x  (± 0.00) slower

Calculating -------------------------------------
hostname= on an ipv6 (orig)
                        958.023k (± 1.1%) i/s -      4.798M in   5.008884s
hostname= on an ipv6 (new)
                          1.252M (± 1.2%) i/s -      6.337M in   5.063109s

Comparison:
hostname= on an ipv6 (new):  1251868.4 i/s
hostname= on an ipv6 (orig):   958023.5 i/s - 1.31x  (± 0.00) slower

Calculating -------------------------------------
hostname= on an ipv4 (orig)
                          2.246M (± 1.0%) i/s -     11.392M in   5.073679s
hostname= on an ipv4 (new)
                          5.739M (± 0.3%) i/s -     29.246M in   5.095776s

Comparison:
hostname= on an ipv4 (new):  5739237.9 i/s
hostname= on an ipv4 (orig):  2245518.7 i/s - 2.56x  (± 0.00) slower

Calculating -------------------------------------
hostname= on an host (orig)
                          2.266M (± 0.7%) i/s -     11.504M in   5.076980s
hostname= on an host (new)
                          5.795M (± 0.5%) i/s -     29.263M in   5.049941s

Comparison:
hostname= on an host (new):  5794820.2 i/s
hostname= on an host (orig):  2266068.9 i/s - 2.56x  (± 0.00) slower

@@ -659,7 +659,7 @@ def hostname
# it is wrapped with brackets.
#
def hostname=(v)
v = "[#{v}]" if /\A\[.*\]\z/ !~ v && /:/ =~ v
v = "[#{v}]" if v.index(':') && v[0] != '[' && v[-1] != ']'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the record, I have measured that '[' + v + ']' is actually a bit slower than interpolation. So keeping this as-is.

@nobu
Copy link
Member

nobu commented Oct 7, 2020

Have you tried start_with? and end_with??.

@krzysiek1507
Copy link

I guess this patch should be submitted to the URI repo? https://github.com/ruby/uri

@lzap
Copy link
Contributor Author

lzap commented Oct 7, 2020

Have you tried start_with? and end_with??.

I haven't, I have considered it for a moment and thought that [] must be faster but this was indeed a C-language mindset! Methods start_with? and end_with? are faster.

I also corrected a test failure (nil pointer exception missed).

I guess this patch should be submitted to the URI repo? https://github.com/ruby/uri

Okay, apologies this is my first contribution. I tried to research contributing guides, looks like RedMine is preferred. I will update and close this PR, file new one as you recommend and also update the RM ticket with a patch. Thanks.

@lzap
Copy link
Contributor Author

lzap commented Oct 7, 2020

There is still a failure, looking into this. I was only running rspec tests locally, this is another suite.

@krzysiek1507
Copy link

Okay, apologies this is my first contribution.

No need to apologize. This repo is being synced with gem repos. Probably this patch should be submitted to the URI repo and then it will be synced here.

@lzap
Copy link
Contributor Author

lzap commented Oct 7, 2020

Okay, thanks. Moved to: ruby/uri#12

@lzap lzap closed this Oct 7, 2020
@lzap lzap deleted the uri-hostname-17219 branch October 7, 2020 11:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants