Short-circuit String tokenization. #89

wants to merge 1 commit into from



The short-circuit pattern doesn't really help anyone on the i18n front, but I don't know how multibyte characters are represented in YAML anyway, so this may not be an issue. All tests are passing, and I'm seeing a 3x speedup on the YAML file from issue #84.
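
For illustration, here is a minimal sketch of the short-circuit idea, using the two patterns from this pull request's diff. The constant names, the `short_circuits?` helper, and the sample strings are hypothetical and not part of Psych; the real `tokenize` method also applies a length check and a cache that this sketch leaves out.

```ruby
# Patterns copied from the diff in this pull request.
LEADING_WORD_CHAR = /^[A-Za-z_~]/
OBVIOUS_STRING    = /^[^(?:0x)\.:-][A-Za-z\s!@#\$%\^&\*\(\)\{\}\|\/\\]+/

# Hypothetical helper (not part of Psych): returns true when either cheap
# pattern matches, i.e. when the scanner could treat the value as a plain
# String without running the more expensive numeric/date checks.
def short_circuits?(string)
  case string
  when LEADING_WORD_CHAR, OBVIOUS_STRING then true
  else false
  end
end

short_circuits?("hello world") # starts with a letter
short_circuits?("!important")  # allowed first character, then letters
short_circuits?("0x1F")        # leading 0 is excluded, falls through
short_circuits?("-.inf")       # leading - is excluded, falls through
short_circuits?(":key")        # leading : is excluded, falls through
```

One detail worth keeping in mind when reading the second pattern: inside a bracket expression, `(?:0x)` is not a group, so `[^(?:0x)\.:-]` excludes any single one of the characters `(`, `?`, `:`, `0`, `x`, `)`, `.`, and `-` as the first character.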


I found I can do this with a single regexp and save the second comparison in many cases. I'll prepare a new pull request.

@nirvdrum closed this
Commits on Oct 2, 2012
  1. @nirvdrum
Showing with 3 additions and 1 deletion.
  1. +3 −1 lib/psych/scalar_scanner.rb
4 lib/psych/scalar_scanner.rb
@@ -24,7 +24,9 @@ def tokenize string
return string if @string_cache.key?(string)
case string
- when /^[A-Za-z_~]/
+ # Make sure it's not a hex string, a special float (e.g., -.inf), or hash key. Look for any character that would
+ # immediately qualify it as a String type.
+ when /^[A-Za-z_~]/, /^[^(?:0x)\.:-][A-Za-z\s!@#\$%\^&\*\(\)\{\}\|\/\\]+/
if string.length > 5
@string_cache[string] = true
return string