Fixing tokenization bug causing single quotes to not be removed

commit 371869c050c256abab7d7eb58d12495a1abcda04 1 parent 475ffcf
Cameron Dutro authored
--- a/lib/twitter_cldr/tokenizers/base.rb
+++ b/lib/twitter_cldr/tokenizers/base.rb
@@ -34,7 +34,7 @@ def tokenize_format(text)
       content = token.match(regexes[token_type][:content])[1]
       ret << CompositeToken.new(tokenize_format(content))
     else
-      ret << Token.new(:value => token, :type => token_type) # .gsub(/\A\'/, "").chomp("'")
+      ret << Token.new(:value => token, :type => token_type)
     end
   end
   ret
--- a/lib/twitter_cldr/tokenizers/calendars/datetime_tokenizer.rb
+++ b/lib/twitter_cldr/tokenizers/calendars/datetime_tokenizer.rb
@@ -20,7 +20,7 @@ def initialize(options = {})
       DateTokenizer::TOKEN_SPLITTER_REGEX,
       TimeTokenizer::TOKEN_SPLITTER_REGEX
     ),
-    :else => //
+    :else => /([^\s]+)/ # groups of non-space chars
   }

   @token_type_regexes = {
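
To see why the `:else` change matters, the sketch below compares what the old and new regexes match on a sample pattern string. The sample string and the constant names are hypothetical, and `String#scan` is used only to show the raw matches; the diff does not show how the tokenizer itself applies this regex. The last lines show what the commented-out `gsub`/`chomp` chain removed from `base.rb` would do to a quoted token.

```ruby
# Hypothetical sample input; not taken from the library or its tests.
SAMPLE = "yyyy 'at' h"

OLD_ELSE = //          # matches the empty string at every position
NEW_ELSE = /([^\s]+)/  # matches each run of non-space characters

# With the empty regex, every match is an empty string.
OLD_MATCHES = SAMPLE.scan(OLD_ELSE)
# With the new regex, each whitespace-delimited chunk is captured whole,
# so a quoted literal such as 'at' stays together as one candidate token.
NEW_MATCHES = SAMPLE.scan(NEW_ELSE).flatten

puts OLD_MATCHES.all?(&:empty?)  # true
puts NEW_MATCHES.inspect         # ["yyyy", "'at'", "h"]

# The chain that was only present as a comment in base.rb strips one
# leading and one trailing single quote from a token value.
STRIPPED = "'at'".gsub(/\A\'/, "").chomp("'")
puts STRIPPED                    # at
```

Under these assumptions, the empty regex never yields a non-empty match, while the new one returns whole non-space groups that later stages can classify.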
