Avoid constructing a Regexp when splitting a String with negative limit

When splitting a string using a String pattern, we don't want to turn it
into a Regexp unless we absolutely have to. Prior to this change,
passing a negative limit into String#split forced this conversion
and incurred a high performance cost as a result.
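
For context, a short illustration of what a negative limit means to a caller of String#split: trailing empty fields are kept rather than trimmed. The variable names here are illustrative and not taken from the commit.

```ruby
csv_row = "a,b,,"

default_split  = csv_row.split(",")      # trailing empty fields are dropped
zero_limit     = csv_row.split(",", 0)   # a limit of 0 behaves like no limit
negative_limit = csv_row.split(",", -1)  # trailing empty fields are kept

p default_split   # ["a", "b"]
p zero_limit      # ["a", "b"]
p negative_limit  # ["a", "b", "", ""]
```

This is the distinction the new trim_end flag in the diff appears to encode: the trailing fields are trimmed in the first two cases and preserved in the third.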

Running the benchmark for this particular case shows a three-fold
improvement.
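
The benchmark referred to above is not reproduced on this page. A rough way to time the same case, as a sketch only, is to compare a String separator against a Regexp separator with a negative limit using the standard Benchmark module. The input string and iteration count are arbitrary assumptions, and the timings depend on the Ruby implementation, so no figures are claimed here.

```ruby
require "benchmark"

# Arbitrary sample input: 50 fields followed by two trailing empty fields.
line = (["field"] * 50).join(",") + ",,"
iterations = 10_000

# Same split, once with a String separator and once with a Regexp separator.
string_time = Benchmark.realtime { iterations.times { line.split(",", -1) } }
regexp_time = Benchmark.realtime { iterations.times { line.split(/,/, -1) } }

puts format("String pattern: %.4fs", string_time)
puts format("Regexp pattern: %.4fs", regexp_time)
```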
1 parent 7978c7d, commit 04ca45e0b1d8c135d2005415d763308b11c7aa00, leocassarani committed Apr 17, 2013
Showing with 19 additions and 12 deletions.
  1. +19 −12 kernel/common/splitter.rb
@@ -39,19 +39,24 @@ def self.split(string, pattern, limit)
       else
         pattern = StringValue(pattern) unless pattern.kind_of?(String)
-        if !limited and limit.equal?(undefined)
+        trim_end = !tail_empty || limit == 0
+
+        unless limited
           if pattern.empty?
-            ret = []
-            pos = 0
+            if trim_end
+              ret = []
+              pos = 0
+              str_size = string.num_bytes
-            while pos < string.num_bytes
-              ret << string.byteslice(pos, 1)
-              pos += 1
-            end
+              while pos < str_size
+                ret << string.byteslice(pos, 1)
+                pos += 1
+              end
-            return ret
+              return ret
+            end
           else
-            return split_on_string(string, pattern)
+            return split_on_string(string, pattern, trim_end)
           end
         end
@@ -111,7 +116,7 @@ def self.split(string, pattern, limit)
       ret
     end
-    def self.split_on_string(string, pattern)
+    def self.split_on_string(string, pattern, trim_end)
       pos = 0
       ret = []
@@ -132,8 +137,10 @@ def self.split_on_string(string, pattern)
       # No more separators, but we need to grab the last part still.
       ret << string.byteslice(pos, str_size - pos)
-      while s = ret.at(-1) and s.empty?
-        ret.pop
+      if trim_end
+        while s = ret.at(-1) and s.empty?
+          ret.pop
+        end
       end
       ret
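
The diff only shows fragments of split_on_string, so the following is a minimal, self-contained sketch of the same idea in plain Ruby: split on a literal String separator without building a Regexp, and trim trailing empty fields only when asked. The method name split_on_string_sketch is hypothetical, and it uses String#index instead of the byte-level helpers (num_bytes, byteslice) seen in the diff. It is not the code from kernel/common/splitter.rb.

```ruby
# Hypothetical stand-in for a String-based splitter with a trim_end flag.
# Empty input strings are not special-cased in this sketch.
def split_on_string_sketch(string, pattern, trim_end)
  raise ArgumentError, "pattern must not be empty" if pattern.empty?

  pos = 0
  ret = []

  # Collect the text between successive occurrences of the separator.
  while (hit = string.index(pattern, pos))
    ret << string[pos...hit]
    pos = hit + pattern.size
  end

  # No more separators, but the last part still has to be collected.
  ret << string[pos..-1]

  # Drop trailing empty fields only when the caller asks for it.
  if trim_end
    ret.pop while ret.last && ret.last.empty?
  end

  ret
end

p split_on_string_sketch("a,b,,", ",", true)   # ["a", "b"]
p split_on_string_sketch("a,b,,", ",", false)  # ["a", "b", "", ""]
```

With trim_end set to false the result matches what a negative limit returns from String#split for this input, which is the case the commit moves off the Regexp path.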
