Performance optimization for 'split' by 'partition' #15

Closed
Mifrill opened this issue Sep 23, 2021 · 2 comments · Fixed by #16

Comments

Mifrill commented Sep 23, 2021

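Where only the first whitespace-separated token of a line is needed, split can be replaced with partition: partition stops at the first match instead of splitting the whole line, which speeds up the entire process.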

w = line.split(/\s+/)[0]

require 'benchmark'

Benchmark.bm do |x|
  string = "'hood n 1 2 @ ; 1 0 08641944"
  x.report { 50000.times { string.split(/\s+/)[0] } }
  x.report { 50000.times { string.partition(/\s+/)[0] } }
end
[
 #<Benchmark::Tms:0x0000560ef6b6cc30 @label="", @real=0.18285021604970098, @cstime=0.0, @cutime=0.0, @stime=2.3999999999996247e-05, @utime=0.18282500000000024, @total=0.18284900000000023>, 
 #<Benchmark::Tms:0x0000560ef6b864a0 @label="", @real=0.05183851800393313, @cstime=0.0, @cutime=0.0, @stime=1.2000000000012001e-05, @utime=0.0518200000000002, @total=0.05183200000000021>
]

The total time drops from about 0.183 s to about 0.052 s (real time is nearly identical), roughly 3.5 times faster, which looks nice. @yohasebe what do you think?

Source:
https://stackoverflow.com/questions/7533479/ruby-string-search-which-is-faster-split-or-regex
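
As a quick sanity check on the swap proposed above, the sketch below compares the two calls on the benchmark string and on an empty line. The helper names first_token_split and first_token_partition are hypothetical and exist only for this illustration; they are not part of the project.

```ruby
# Hypothetical helpers, named here only for illustration.
def first_token_split(line)
  line.split(/\s+/)[0]
end

def first_token_partition(line)
  line.partition(/\s+/)[0]
end

line = "'hood n 1 2 @ ; 1 0 08641944"
first_token_split(line)      # => "'hood"
first_token_partition(line)  # => "'hood"

# The two calls differ on an empty line: split returns [], so [0] is nil,
# while partition returns ["", "", ""], so [0] is "".
first_token_split("")        # => nil
first_token_partition("")    # => ""
```

If the input can contain empty lines, the caller may need to handle "" where it previously handled nil.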

Mifrill commented Sep 23, 2021

This use of split, which keeps only the first two fields, might also be a candidate for the same kind of optimization. @yohasebe what do you think?

w, s = line.split(/\s+/)
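
One possible way to approach the two-field case, offered only as a sketch and not necessarily what the linked pull request does, is to pass a limit to split so that it stops after the second field. The helper names below are hypothetical, and whether this is actually faster on real data would need to be benchmarked.

```ruby
# Hypothetical helpers, named here only for illustration.

# Current form: splits the whole line, then keeps the first two fields.
def first_two_split(line)
  w, s = line.split(/\s+/)
  [w, s]
end

# Assumed alternative: a limit of 3 splits at most twice, so the third
# element holds the unsplit remainder and is simply ignored.
def first_two_limited(line)
  w, s = line.split(/\s+/, 3)
  [w, s]
end

line = "'hood n 1 2 @ ; 1 0 08641944"
first_two_split(line)    # => ["'hood", "n"]
first_two_limited(line)  # => ["'hood", "n"]

# Caveat: with a positive limit, split keeps trailing empty strings,
# so a one-field line with trailing whitespace behaves differently.
first_two_split("a ")    # => ["a", nil]
first_two_limited("a ")  # => ["a", ""]
```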

yohasebe (Owner) commented

I'm so sorry for my late reply. I needed a little time to refresh my memory about lemmatizer to respond to your suggestions and pull request. Your suggestion here looks good to me!
