Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster String#blank? #12976

Merged
merged 1 commit into from
Nov 21, 2013
Merged

Faster String#blank? #12976

merged 1 commit into from
Nov 21, 2013

Conversation

tmm1
Copy link
Contributor

@tmm1 tmm1 commented Nov 21, 2013

Our rails app spends 2-3% of production cpu time in this method:

$ stackprof data/stackprof-cpu-3975-1384979644.dump --method "String#blank?"
String#blank? (ruby/2.1.0/gems/activesupport-2.3.14.github30/lib/active_support/core_ext/object/blank.rb:80)
  samples:   305 self (2.9%)  /    317 total (3.0%)
  callers:
     158  (   49.8%)  Object#present?
      24  (    7.6%)  ActionController::Response#content_type
      24  (    7.6%)  block in ActiveRecord::Base.merge_conditions
      22  (    6.9%)  UrlHelper#escape_url_path
      16  (    5.0%)  ActionView::Helpers::AssetTagHelper#compute_public_path
       9  (    2.8%)  ActiveRecord::Base.sanitize_sql_for_conditions
       8  (    2.5%)  ActionController::Response#nonempty_ok_response?
       6  (    1.9%)  ActiveRecord::Base.add_conditions!
  callees (12 total):
      12  (  100.0%)  String#encoding_aware?
  code:
                                  |    80  |   def blank?
   67    (0.6%) /    67   (0.6%)  |    81  |     return true if empty?
                                  |    82  |
                                  |    83  |     # 1.8 does not takes [:space:] properly
   16    (0.1%) /     4   (0.0%)  |    84  |     if encoding_aware?
  232    (2.2%) /   232   (2.2%)  |    85  |       self !~ /[^[:space:]]/
                                  |    86  |     else
    2    (0.0%) /     2   (0.0%)  |    87  |       self !~ NON_WHITESPACE_REGEXP
                                  |    88  |     end

This pull request optimizes the regex in this method slightly by removing the double negative.

require 'benchmark'

iterations = 500_000
strings = ['', ' ', '  ', '   ', '     ', 'test']

Benchmark.bmbm(40) do |b|
  b.report('str !~ /[^[:space:]]/') do
    iterations.times{ strings.each{ |str| str !~ /[^[:space:]]/ } }
  end
  b.report('!!(str =~ /[^[:space:]]/)') do
    iterations.times{ strings.each{ |str| !!(str =~ /[^[:space:]]/) } }
  end
  b.report('str =~ /\A[[:space:]]*\z/') do
    iterations.times{ strings.each{ |str| str =~ /\A[[:space:]]*\z/ } }
  end
  b.report('str =~ /\A\s*\z/') do
    iterations.times{ strings.each{ |str| str =~ /\A\s*\z/ } }
  end
  b.report('str.empty? || str == " " || str =~ /\A\s*\z/') do
    iterations.times{ strings.each{ |str| str.empty? || str == " " || str =~ /\A\s*\z/ } }
  end
end

__END__

Rehearsal --------------------------------------------------------------------------------
str !~ /[^[:space:]]/                          3.340000   0.010000   3.350000 (  3.349366)
!!(str =~ /[^[:space:]]/)                      2.610000   0.000000   2.610000 (  2.614685)
str =~ /\A[[:space:]]*\z/                      2.470000   0.000000   2.470000 (  2.483112)
str =~ /\A\s*\z/                               2.520000   0.010000   2.530000 (  2.523422)
str.empty? || str == " " || str =~ /\A\s*\z/   2.690000   0.000000   2.690000 (  2.701668)
---------------------------------------------------------------------- total: 13.650000sec

                                                   user     system      total        real
str !~ /[^[:space:]]/                          3.360000   0.000000   3.360000 (  3.376225)
!!(str =~ /[^[:space:]]/)                      2.580000   0.010000   2.590000 (  2.592369)
str =~ /\A[[:space:]]*\z/                      2.480000   0.000000   2.480000 (  2.482733)
str =~ /\A\s*\z/                               2.420000   0.000000   2.420000 (  2.419532)
str.empty? || str == " " || str =~ /\A\s*\z/   2.680000   0.010000   2.690000 (  2.684017)

@fxn
Copy link
Member

fxn commented Nov 21, 2013

❤️

fxn added a commit that referenced this pull request Nov 21, 2013
@fxn fxn merged commit 9b423c9 into rails:master Nov 21, 2013
@tmm1
Copy link
Contributor Author

tmm1 commented Nov 21, 2013

The biggest caller here is Object#present?, which is often invoked as params[:something].present?. This requires a full scan of the string to check for blankness every time.

Perhaps String#present? should simply alias !empty? instead...

Object#present? (gems/activesupport-2.3.14.github30/lib/active_support/core_ext/object/blank.rb:20)
  samples:    11 self (0.1%)  /    177 total (1.7%)
  callees (166 total):
     158  (   95.2%)  String#blank?
       6  (    3.6%)  Object#blank?
       2  (    1.2%)  NilClass#blank?
  code:
                                  |    20  |   def present?
  177    (1.7%) /    11   (0.1%)  |    21  |     !blank?
                                  |    22  |   end

@jeremy
Copy link
Member

jeremy commented Nov 21, 2013

Tried str =~ /\A\p{Z}*\z/ also. [[:space:]] is still king:

                                                   user     system      total        real
str !~ /[^[:space:]]/                          3.180000   0.000000   3.180000 (  3.180527)
!!(str =~ /[^[:space:]]/)                      2.800000   0.000000   2.800000 (  2.806802)
str =~ /\A[[:space:]]*\z/                      2.330000   0.000000   2.330000 (  2.336198)
str =~ /\A\s*\z/                               2.350000   0.000000   2.350000 (  2.345009)
str.empty? || str == " " || str =~ /\A\s*\z/   2.500000   0.010000   2.510000 (  2.503975)
str =~ /\A\p{White_Space}*\z/                  2.400000   0.000000   2.400000 (  2.404541)
str =~ /\A\p{Z}*\z/                            2.400000   0.000000   2.400000 (  2.399678)

@rafaelfranca
Copy link
Member

@tmm1 if we change present? to be to !empty? we will have a behavior change:

>> ' '.present?
=> false
>> !' '.empty?
=> true

@jeremy
Copy link
Member

jeremy commented Nov 21, 2013

String#present? could !empty? && self =~ /[[:graph:]]/ to avoid the full scan

@tmm1
Copy link
Contributor Author

tmm1 commented Nov 21, 2013

The new regex covers the empty? case pretty fast already, so adding the conditional only makes it slower. We want to avoid the regex in the common !empty? case, where params[:key] == "value"

@jeremy
Copy link
Member

jeremy commented Nov 21, 2013

Yeah - hence implementing String#present? separately rather than making it !blank?

@homakov
Copy link
Contributor

homakov commented Nov 21, 2013

Isn't just string search faster? or should it match other non-space sybmols

@jeremy
Copy link
Member

jeremy commented Nov 21, 2013

@homakov please translate "string search" from English code to Ruby code and run the benchmark 😁

@homakov
Copy link
Contributor

homakov commented Nov 21, 2013

this seems to be fastest

 b.report('strip') do
    iterations.times{ strings.each{ |str| str.strip == '' } }
  end

@homakov
Copy link
Contributor

homakov commented Nov 21, 2013

@jeremy nevermind, i meant just searching for ' ' but then i understood tabs/new lines are also space chars...
for strip function:

Rehearsal --------------------------------------------------------------------------------
str !~ /[^[:space:]]/                          2.630000   0.010000   2.640000 (  2.639136)
!!(str =~ /[^\s]/)                             2.120000   0.010000   2.130000 (  2.129457)
str =~ /\A[[:space:]]*\z/                      2.130000   0.010000   2.140000 (  2.143221)
strip                                          1.850000   0.000000   1.850000 (  1.856492)
str.empty? || str == " " || str =~ /\A\s*\z/   2.310000   0.010000   2.320000 (  2.319772)
---------------------------------------------------------------------- total: 11.080000sec

                                                   user     system      total        real
str !~ /[^[:space:]]/                          2.540000   0.010000   2.550000 (  2.557135)
!!(str =~ /[^\s]/)                             2.230000   0.010000   2.240000 (  2.246694)
str =~ /\A[[:space:]]*\z/                      2.010000   0.010000   2.020000 (  2.011918)
strip                                          1.840000   0.010000   1.850000 (  1.839065)
str.empty? || str == " " || str =~ /\A\s*\z/   2.440000   0.000000   2.440000 (  2.451348)

@SamSaffron
Copy link
Contributor

fast_blank ftw :) https://github.com/SamSaffron/fast_blank/blob/master/benchmark

agree with @tmm1 this is often heavily overused

@SamSaffron
Copy link
Contributor

@homakov run my test suite to ensure 100% parity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants