A gem for converting Java-style regular expressions to Ruby ones.
With Java's String.matches(), a regex has to match the entire string to be a match. In Ruby, you have to explicitly use \A and \z if you want to match only the whole string.
The JavaRegex class automatically adds those anchors to every Java regex that it converts. It also attempts to convert Java regex conventions to their Ruby equivalents where possible, and throws an exception where a feature does not exist in Ruby. It's still very much a work in progress.
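The anchoring difference can be seen in plain Ruby without the gem. The snippet below is only an illustration; the pattern and input strings are made up for the example, and it assumes Ruby 2.4 or later for `match?`.

```ruby
# Plain Ruby, no gem required. Pattern and inputs are illustrative only.
pattern = 'some_regex_\d'

unanchored = Regexp.new(pattern)
anchored   = Regexp.new("\\A#{pattern}\\z") # what whole-string matching requires

input = 'prefix some_regex_1 suffix'

unanchored_hit = unanchored.match?(input)         # a substring match is enough
anchored_hit   = anchored.match?(input)           # the whole string must match
exact_hit      = anchored.match?('some_regex_1')  # the whole string does match

puts unanchored_hit # true
puts anchored_hit   # false
puts exact_hit      # true
```

Only the anchored pattern reproduces the whole-string behaviour of Java's String.matches().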
$ gem install java_regex
> require 'java_regex'
=> true
> jre = JavaRegex.new('.*some_regex_\p{Digit}')
=> #<JavaRegex:0x007fea73075ea8 @regex=".*some_regex_\\p{Digit}">
> puts jre.to_ruby()
\A.*some_regex_[[:digit:]]\z
=> nil
> re = jre.to_ruby_regex()
=> /\A.*some_regex_[[:digit:]]\z/
> 'some_regex_1'.scan(re)
=> ["some_regex_1"]
The regex feature differences between Java and Ruby were obtained from the comparison charts at Regular-Expressions.info.
- \Q...\E (escapes a string of metacharacters)
  Java Support: Java 6
  Ruby Support: No
  Result if Found: Throws an exception.
- \cA through \cZ (control character)
  Java Support: Yes
  Ruby Support: No
  Result if Found: Throws an exception.
- Hyphen in [\d-z] is a literal
  Java Support: Yes
  Ruby Support: No
  Result if Found: Escapes the hyphen: [\d-z] -> [\d\-z]
- \b (at the beginning or end of a word)
  Java Support: Yes
  Ruby Support: ASCII only
  Result if Found: Throws an exception if followed by a non-ASCII character.
- \B (NOT the beginning or end of a word)
  Java Support: Yes
  Ruby Support: ASCII only
  Result if Found: Throws an exception if followed by a non-ASCII character.
- Backreferences to non-existent groups are an error
  Java Support: Yes
  Ruby Support: No
  Result if Found: Throws an exception.
- (?s) (dot matches newlines)
  Java Support: Yes
  Ruby Support: (?m)
  Result if Found: Throws an exception if a newline character is found.
- (?m) (^ and $ match at line breaks)
  Java Support: Yes
  Ruby Support: Always on
  Result if Found: Handling not implemented.
- ?+, *+, ++ and {m,n}+ (possessive quantifiers)
  Java Support: Yes
  Ruby Support: No
  Result if Found: Handling not implemented.
- (?<=text) (positive lookbehind)
  Java Support: Finite length
  Ruby Support: No
  Result if Found: Handling not implemented.
- (?<!text) (negative lookbehind)
  Java Support: Finite length
  Ruby Support: No
  Result if Found: Handling not implemented.
- \u0000 through \uFFFF (Unicode character)
  Java Support: Yes
  Ruby Support: No
  Result if Found: Converted to 0x0000 format.
- \pL through \pC (Unicode properties)
  Java Support: Yes
  Ruby Support: No
  Result if Found: Handling not implemented.
- \p{L} through \p{C} (Unicode properties)
  Java Support: Yes
  Ruby Support: No
  Result if Found: Handling not implemented.
- \p{Lu} through \p{Cn} (Unicode property)
  Java Support: Yes
  Ruby Support: No
  Result if Found: Handling not implemented.
- \p{IsL} through \p{IsC} (Unicode properties)
  Java Support: Yes
  Ruby Support: No
  Result if Found: Handling not implemented.
- \p{IsLu} through \p{IsCn} (Unicode property)
  Java Support: Yes
  Ruby Support: No
  Result if Found: Handling not implemented.
- \p{IsBasicLatin} through \p{InSpecials} (Unicode block)
  Java Support: Yes
  Ruby Support: No
  Result if Found: Handling not implemented.
- Spaces, hyphens, and underscores allowed in all long names listed above (e.g. BasicLatin can be written as Basic-Latin or Basic_Latin or Basic Latin)
  Java Support: Yes
  Ruby Support: No
  Result if Found: Handling not implemented.
- \P (negated variants of all \p as listed above)
  Java Support: Yes
  Ruby Support: No
  Result if Found: Handling not implemented.
- \p{Alpha} POSIX character class
  Java Support: ASCII
  Ruby Support: No
  Result if Found: Converted to [:alpha:] POSIX character class.
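A few of the conversions listed above can be sketched in plain Ruby. This is not the gem's actual implementation: `sketch_convert` is a hypothetical helper, the list of POSIX class names is an assumed subset, and only three rows of the list are covered (the \Q...\E exception, hyphen escaping, and POSIX character classes).

```ruby
# Hypothetical helper for illustration only; not the gem's real code.
def sketch_convert(java_pattern)
  # Assumed subset of Java POSIX class names, chosen for this example.
  posix_names = %w[Alpha Digit Upper Lower Punct]

  # \Q...\E is listed above as unsupported in Ruby, so refuse rather than guess.
  if java_pattern.include?('\Q')
    raise ArgumentError, '\Q...\E is not supported in Ruby'
  end

  # \p{Alpha} -> [[:alpha:]]
  converted = java_pattern.gsub(/\\p\{(#{posix_names.join('|')})\}/) do
    "[[:#{Regexp.last_match(1).downcase}:]]"
  end

  # [\d-z] -> [\d\-z]: escape a hyphen that directly follows a shorthand class.
  # Simplification: this does not check that the hyphen is inside brackets.
  converted = converted.gsub(/(\\[dswDSW])-(?!\])/) do
    "#{Regexp.last_match(1)}\\-"
  end

  # Anchor the result so that it only matches the whole string.
  "\\A#{converted}\\z"
end

puts sketch_convert('.*some_regex_\p{Digit}') # \A.*some_regex_[[:digit:]]\z
puts sketch_convert('[\d-z]')                 # \A[\d\-z]\z
```

Raising on unsupported syntax, rather than passing it through, keeps a converted pattern from silently matching something different than it did in Java.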
Copyright (c) 2013 Dean Morin. See LICENSE for details.