New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster String#blank? #12976

Merged
merged 1 commit into from Nov 21, 2013

Conversation

Projects
None yet
7 participants
@tmm1
Contributor

tmm1 commented Nov 21, 2013

Our rails app spends 2-3% of production cpu time in this method:

$ stackprof data/stackprof-cpu-3975-1384979644.dump --method "String#blank?"
String#blank? (ruby/2.1.0/gems/activesupport-2.3.14.github30/lib/active_support/core_ext/object/blank.rb:80)
  samples:   305 self (2.9%)  /    317 total (3.0%)
  callers:
     158  (   49.8%)  Object#present?
      24  (    7.6%)  ActionController::Response#content_type
      24  (    7.6%)  block in ActiveRecord::Base.merge_conditions
      22  (    6.9%)  UrlHelper#escape_url_path
      16  (    5.0%)  ActionView::Helpers::AssetTagHelper#compute_public_path
       9  (    2.8%)  ActiveRecord::Base.sanitize_sql_for_conditions
       8  (    2.5%)  ActionController::Response#nonempty_ok_response?
       6  (    1.9%)  ActiveRecord::Base.add_conditions!
  callees (12 total):
      12  (  100.0%)  String#encoding_aware?
  code:
                                  |    80  |   def blank?
   67    (0.6%) /    67   (0.6%)  |    81  |     return true if empty?
                                  |    82  |
                                  |    83  |     # 1.8 does not takes [:space:] properly
   16    (0.1%) /     4   (0.0%)  |    84  |     if encoding_aware?
  232    (2.2%) /   232   (2.2%)  |    85  |       self !~ /[^[:space:]]/
                                  |    86  |     else
    2    (0.0%) /     2   (0.0%)  |    87  |       self !~ NON_WHITESPACE_REGEXP
                                  |    88  |     end

This pull request optimizes the regex in this method slightly by removing the double negative.

require 'benchmark'

iterations = 500_000
strings = ['', ' ', '  ', '   ', '     ', 'test']

Benchmark.bmbm(40) do |b|
  b.report('str !~ /[^[:space:]]/') do
    iterations.times{ strings.each{ |str| str !~ /[^[:space:]]/ } }
  end
  b.report('!!(str =~ /[^[:space:]]/)') do
    iterations.times{ strings.each{ |str| !!(str =~ /[^[:space:]]/) } }
  end
  b.report('str =~ /\A[[:space:]]*\z/') do
    iterations.times{ strings.each{ |str| str =~ /\A[[:space:]]*\z/ } }
  end
  b.report('str =~ /\A\s*\z/') do
    iterations.times{ strings.each{ |str| str =~ /\A\s*\z/ } }
  end
  b.report('str.empty? || str == " " || str =~ /\A\s*\z/') do
    iterations.times{ strings.each{ |str| str.empty? || str == " " || str =~ /\A\s*\z/ } }
  end
end

__END__

Rehearsal --------------------------------------------------------------------------------
str !~ /[^[:space:]]/                          3.340000   0.010000   3.350000 (  3.349366)
!!(str =~ /[^[:space:]]/)                      2.610000   0.000000   2.610000 (  2.614685)
str =~ /\A[[:space:]]*\z/                      2.470000   0.000000   2.470000 (  2.483112)
str =~ /\A\s*\z/                               2.520000   0.010000   2.530000 (  2.523422)
str.empty? || str == " " || str =~ /\A\s*\z/   2.690000   0.000000   2.690000 (  2.701668)
---------------------------------------------------------------------- total: 13.650000sec

                                                   user     system      total        real
str !~ /[^[:space:]]/                          3.360000   0.000000   3.360000 (  3.376225)
!!(str =~ /[^[:space:]]/)                      2.580000   0.010000   2.590000 (  2.592369)
str =~ /\A[[:space:]]*\z/                      2.480000   0.000000   2.480000 (  2.482733)
str =~ /\A\s*\z/                               2.420000   0.000000   2.420000 (  2.419532)
str.empty? || str == " " || str =~ /\A\s*\z/   2.680000   0.010000   2.690000 (  2.684017)
@fxn

This comment has been minimized.

Show comment
Hide comment
@fxn

fxn Nov 21, 2013

Member

❤️

Member

fxn commented Nov 21, 2013

❤️

fxn added a commit that referenced this pull request Nov 21, 2013

@fxn fxn merged commit 9b423c9 into rails:master Nov 21, 2013

1 check was pending

default The Travis CI build is in progress
Details
@tmm1

This comment has been minimized.

Show comment
Hide comment
@tmm1

tmm1 Nov 21, 2013

Contributor

The biggest caller here is Object#present?, which is often invoked as params[:something].present?. This requires a full scan of the string to check for blankness every time.

Perhaps String#present? should simply alias !empty? instead...

Object#present? (gems/activesupport-2.3.14.github30/lib/active_support/core_ext/object/blank.rb:20)
  samples:    11 self (0.1%)  /    177 total (1.7%)
  callees (166 total):
     158  (   95.2%)  String#blank?
       6  (    3.6%)  Object#blank?
       2  (    1.2%)  NilClass#blank?
  code:
                                  |    20  |   def present?
  177    (1.7%) /    11   (0.1%)  |    21  |     !blank?
                                  |    22  |   end
Contributor

tmm1 commented Nov 21, 2013

The biggest caller here is Object#present?, which is often invoked as params[:something].present?. This requires a full scan of the string to check for blankness every time.

Perhaps String#present? should simply alias !empty? instead...

Object#present? (gems/activesupport-2.3.14.github30/lib/active_support/core_ext/object/blank.rb:20)
  samples:    11 self (0.1%)  /    177 total (1.7%)
  callees (166 total):
     158  (   95.2%)  String#blank?
       6  (    3.6%)  Object#blank?
       2  (    1.2%)  NilClass#blank?
  code:
                                  |    20  |   def present?
  177    (1.7%) /    11   (0.1%)  |    21  |     !blank?
                                  |    22  |   end
@jeremy

This comment has been minimized.

Show comment
Hide comment
@jeremy

jeremy Nov 21, 2013

Member

Tried str =~ /\A\p{Z}*\z/ also. [[:space:]] is still king:

                                                   user     system      total        real
str !~ /[^[:space:]]/                          3.180000   0.000000   3.180000 (  3.180527)
!!(str =~ /[^[:space:]]/)                      2.800000   0.000000   2.800000 (  2.806802)
str =~ /\A[[:space:]]*\z/                      2.330000   0.000000   2.330000 (  2.336198)
str =~ /\A\s*\z/                               2.350000   0.000000   2.350000 (  2.345009)
str.empty? || str == " " || str =~ /\A\s*\z/   2.500000   0.010000   2.510000 (  2.503975)
str =~ /\A\p{White_Space}*\z/                  2.400000   0.000000   2.400000 (  2.404541)
str =~ /\A\p{Z}*\z/                            2.400000   0.000000   2.400000 (  2.399678)
Member

jeremy commented Nov 21, 2013

Tried str =~ /\A\p{Z}*\z/ also. [[:space:]] is still king:

                                                   user     system      total        real
str !~ /[^[:space:]]/                          3.180000   0.000000   3.180000 (  3.180527)
!!(str =~ /[^[:space:]]/)                      2.800000   0.000000   2.800000 (  2.806802)
str =~ /\A[[:space:]]*\z/                      2.330000   0.000000   2.330000 (  2.336198)
str =~ /\A\s*\z/                               2.350000   0.000000   2.350000 (  2.345009)
str.empty? || str == " " || str =~ /\A\s*\z/   2.500000   0.010000   2.510000 (  2.503975)
str =~ /\A\p{White_Space}*\z/                  2.400000   0.000000   2.400000 (  2.404541)
str =~ /\A\p{Z}*\z/                            2.400000   0.000000   2.400000 (  2.399678)
@rafaelfranca

This comment has been minimized.

Show comment
Hide comment
@rafaelfranca

rafaelfranca Nov 21, 2013

Member

@tmm1 if we change present? to be to !empty? we will have a behavior change:

>> ' '.present?
=> false
>> !' '.empty?
=> true
Member

rafaelfranca commented Nov 21, 2013

@tmm1 if we change present? to be to !empty? we will have a behavior change:

>> ' '.present?
=> false
>> !' '.empty?
=> true
@jeremy

This comment has been minimized.

Show comment
Hide comment
@jeremy

jeremy Nov 21, 2013

Member

String#present? could !empty? && self =~ /[[:graph:]]/ to avoid the full scan

Member

jeremy commented Nov 21, 2013

String#present? could !empty? && self =~ /[[:graph:]]/ to avoid the full scan

@tmm1

This comment has been minimized.

Show comment
Hide comment
@tmm1

tmm1 Nov 21, 2013

Contributor

The new regex covers the empty? case pretty fast already, so adding the conditional only makes it slower. We want to avoid the regex in the common !empty? case, where params[:key] == "value"

Contributor

tmm1 commented Nov 21, 2013

The new regex covers the empty? case pretty fast already, so adding the conditional only makes it slower. We want to avoid the regex in the common !empty? case, where params[:key] == "value"

@jeremy

This comment has been minimized.

Show comment
Hide comment
@jeremy

jeremy Nov 21, 2013

Member

Yeah - hence implementing String#present? separately rather than making it !blank?

Member

jeremy commented Nov 21, 2013

Yeah - hence implementing String#present? separately rather than making it !blank?

@homakov

This comment has been minimized.

Show comment
Hide comment
@homakov

homakov Nov 21, 2013

Contributor

Isn't just string search faster? or should it match other non-space sybmols

Contributor

homakov commented Nov 21, 2013

Isn't just string search faster? or should it match other non-space sybmols

@jeremy

This comment has been minimized.

Show comment
Hide comment
@jeremy

jeremy Nov 21, 2013

Member

@homakov please translate "string search" from English code to Ruby code and run the benchmark 😁

Member

jeremy commented Nov 21, 2013

@homakov please translate "string search" from English code to Ruby code and run the benchmark 😁

@homakov

This comment has been minimized.

Show comment
Hide comment
@homakov

homakov Nov 21, 2013

Contributor

this seems to be fastest

 b.report('strip') do
    iterations.times{ strings.each{ |str| str.strip == '' } }
  end
Contributor

homakov commented Nov 21, 2013

this seems to be fastest

 b.report('strip') do
    iterations.times{ strings.each{ |str| str.strip == '' } }
  end
@homakov

This comment has been minimized.

Show comment
Hide comment
@homakov

homakov Nov 21, 2013

Contributor

@jeremy nevermind, i meant just searching for ' ' but then i understood tabs/new lines are also space chars...
for strip function:

Rehearsal --------------------------------------------------------------------------------
str !~ /[^[:space:]]/                          2.630000   0.010000   2.640000 (  2.639136)
!!(str =~ /[^\s]/)                             2.120000   0.010000   2.130000 (  2.129457)
str =~ /\A[[:space:]]*\z/                      2.130000   0.010000   2.140000 (  2.143221)
strip                                          1.850000   0.000000   1.850000 (  1.856492)
str.empty? || str == " " || str =~ /\A\s*\z/   2.310000   0.010000   2.320000 (  2.319772)
---------------------------------------------------------------------- total: 11.080000sec

                                                   user     system      total        real
str !~ /[^[:space:]]/                          2.540000   0.010000   2.550000 (  2.557135)
!!(str =~ /[^\s]/)                             2.230000   0.010000   2.240000 (  2.246694)
str =~ /\A[[:space:]]*\z/                      2.010000   0.010000   2.020000 (  2.011918)
strip                                          1.840000   0.010000   1.850000 (  1.839065)
str.empty? || str == " " || str =~ /\A\s*\z/   2.440000   0.000000   2.440000 (  2.451348)
Contributor

homakov commented Nov 21, 2013

@jeremy nevermind, i meant just searching for ' ' but then i understood tabs/new lines are also space chars...
for strip function:

Rehearsal --------------------------------------------------------------------------------
str !~ /[^[:space:]]/                          2.630000   0.010000   2.640000 (  2.639136)
!!(str =~ /[^\s]/)                             2.120000   0.010000   2.130000 (  2.129457)
str =~ /\A[[:space:]]*\z/                      2.130000   0.010000   2.140000 (  2.143221)
strip                                          1.850000   0.000000   1.850000 (  1.856492)
str.empty? || str == " " || str =~ /\A\s*\z/   2.310000   0.010000   2.320000 (  2.319772)
---------------------------------------------------------------------- total: 11.080000sec

                                                   user     system      total        real
str !~ /[^[:space:]]/                          2.540000   0.010000   2.550000 (  2.557135)
!!(str =~ /[^\s]/)                             2.230000   0.010000   2.240000 (  2.246694)
str =~ /\A[[:space:]]*\z/                      2.010000   0.010000   2.020000 (  2.011918)
strip                                          1.840000   0.010000   1.850000 (  1.839065)
str.empty? || str == " " || str =~ /\A\s*\z/   2.440000   0.000000   2.440000 (  2.451348)
@SamSaffron

This comment has been minimized.

Show comment
Hide comment
@SamSaffron

SamSaffron Nov 21, 2013

Contributor

fast_blank ftw :) https://github.com/SamSaffron/fast_blank/blob/master/benchmark

agree with @tmm1 this is often heavily overused

Contributor

SamSaffron commented Nov 21, 2013

fast_blank ftw :) https://github.com/SamSaffron/fast_blank/blob/master/benchmark

agree with @tmm1 this is often heavily overused

@SamSaffron

This comment has been minimized.

Show comment
Hide comment
@SamSaffron

SamSaffron Nov 21, 2013

Contributor

@homakov run my test suite to ensure 100% parity.

Contributor

SamSaffron commented Nov 21, 2013

@homakov run my test suite to ensure 100% parity.

@schneems

This comment has been minimized.

Show comment
Hide comment
@schneems

schneems Apr 28, 2016

Member

Do you remember why you made this change? We're trying to figure out why one is slower than the other in some benchmarks but not in others #24658 (comment)

Member

schneems commented on 30ba7ee Apr 28, 2016

Do you remember why you made this change? We're trying to figure out why one is slower than the other in some benchmarks but not in others #24658 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment