[Truffle] Including RegexpOptions in RubyRegexp #2631

Merged
merged 3 commits into from Feb 28, 2015
1 change: 0 additions & 1 deletion spec/truffle/tags/core/encoding/aliases_tags.txt
@@ -1,2 +1 @@
fails:Encoding.aliases has a 'locale' key and its value equals to the name of the encoding finded by the locale charmap
fails:Encoding.aliases only contains valid aliased encodings
2 changes: 0 additions & 2 deletions spec/truffle/tags/core/encoding/ascii_compatible_tags.txt

This file was deleted.

17 changes: 0 additions & 17 deletions spec/truffle/tags/core/encoding/compatible_tags.txt
@@ -1,29 +1,12 @@
fails:Encoding.compatible? String, String when the first's Encoding is valid US-ASCII returns US-ASCII if the second String is ASCII-8BIT and ASCII only
fails:Encoding.compatible? String, String when the first's Encoding is valid US-ASCII returns ASCII-8BIT if the second String is ASCII-8BIT but not ASCII only
fails:Encoding.compatible? String, String when the first's Encoding is valid US-ASCII returns US-ASCII if the second String is UTF-8 and ASCII only
fails:Encoding.compatible? String, String when the first's Encoding is valid US-ASCII returns UTF-8 if the second String is UTF-8 but not ASCII only
fails:Encoding.compatible? String, String when the first's Encoding is ASCII compatible and ASCII only returns the first's Encoding if the second is ASCII compatible and ASCII only
fails:Encoding.compatible? String, String when the first's Encoding is ASCII compatible and ASCII only returns the second's Encoding if the second is ASCII compatible but not ASCII only
fails:Encoding.compatible? String, String when the first's Encoding is ASCII compatible but not ASCII only returns the first's Encoding if the second's is valid US-ASCII
fails:Encoding.compatible? String, String when the first's Encoding is ASCII compatible but not ASCII only returns the first's Encoding if the second's is UTF-8 and ASCII only
fails:Encoding.compatible? String, String when the first's Encoding is invalid returns the first's Encoding when the second's Encoding is US-ASCII
fails:Encoding.compatible? String, String when the first's Encoding is invalid returns the first's Encoding when the second String is ASCII only
fails:Encoding.compatible? String, Regexp returns US-ASCII if both are US-ASCII
fails:Encoding.compatible? String, Regexp returns the String's Encoding if it is not US-ASCII but both are ASCII only
fails:Encoding.compatible? String, Regexp returns the String's Encoding if the String is not ASCII only
fails:Encoding.compatible? String, Symbol returns US-ASCII if both are ASCII only
fails:Encoding.compatible? String, Symbol returns the String's Encoding if it is not US-ASCII but both are ASCII only
fails:Encoding.compatible? String, Symbol returns the String's Encoding if the String is not ASCII only
fails:Encoding.compatible? Regexp, String returns US-ASCII if both are US-ASCII
fails:Encoding.compatible? Regexp, Regexp returns US-ASCII if both are US-ASCII
fails:Encoding.compatible? Regexp, Regexp returns the first's Encoding if it is not US-ASCII and not ASCII only
fails:Encoding.compatible? Regexp, Symbol returns US-ASCII if both are US-ASCII
fails:Encoding.compatible? Regexp, Symbol returns the first's Encoding if it is not US-ASCII and not ASCII only
fails:Encoding.compatible? Symbol, String returns US-ASCII if both are ASCII only
fails:Encoding.compatible? Symbol, Regexp returns US-ASCII if both are US-ASCII
fails:Encoding.compatible? Symbol, Regexp returns the Regexp's Encoding if it is not US-ASCII and not ASCII only
fails:Encoding.compatible? Symbol, Symbol returns US-ASCII if both are US-ASCII
fails:Encoding.compatible? Symbol, Symbol returns the first's Encoding if it is not ASCII only
fails:Encoding.compatible? Object, Object returns nil for Object, String
fails:Encoding.compatible? Object, Object returns nil for Object, Regexp
fails:Encoding.compatible? Object, Object returns nil for Object, Symbol
6 changes: 0 additions & 6 deletions spec/truffle/tags/core/encoding/default_external_tags.txt
@@ -1,8 +1,2 @@
fails:Encoding.default_external with command line options is not changed by the -U option
fails:Encoding.default_external with command line options returns the encoding specified by '-E external'
fails:Encoding.default_external with command line options returns the encoding specified by '-E external:'
fails:Encoding.default_external= sets the default external encoding
fails:Encoding.default_external= can accept a name of an encoding as a String
fails:Encoding.default_external= calls #to_s on arguments that are neither Strings nor Encodings
fails:Encoding.default_external= raises a TypeError unless the argument is an Encoding or convertible to a String
fails:Encoding.default_external= raises an ArgumentError if the argument is nil
3 changes: 0 additions & 3 deletions spec/truffle/tags/core/encoding/list_tags.txt

This file was deleted.

1 change: 0 additions & 1 deletion spec/truffle/tags/core/encoding/locale_charmap_tags.txt
@@ -1,3 +1,2 @@
fails:Encoding.locale_charmap returns a String
fails:Encoding.locale_charmap returns a value based on the LC_ALL environment variable
fails:Encoding.locale_charmap is unaffected by assigning to ENV['LC_ALL'] in the same process
1 change: 0 additions & 1 deletion spec/truffle/tags/core/encoding/name_list_tags.txt

This file was deleted.

3 changes: 0 additions & 3 deletions spec/truffle/tags/core/regexp/escape_tags.txt

This file was deleted.

3 changes: 0 additions & 3 deletions spec/truffle/tags/core/regexp/quote_tags.txt

This file was deleted.

2 changes: 0 additions & 2 deletions spec/truffle/tags/core/regexp/union_tags.txt
@@ -1,3 +1 @@
fails:Regexp.union returns a Regexp with US-ASCII encoding if all arguments are ASCII-only
fails:Regexp.union raises ArgumentError if the arguments include conflicting fixed encoding Regexps
fails:Regexp.union returns a Regexp with the encoding of a String containing non-ASCII-compatible characters and another ASCII-only String
132 changes: 132 additions & 0 deletions truffle/src/main/java/org/jruby/truffle/nodes/core/EncodingNodes.java
@@ -24,16 +24,21 @@
import org.jcodings.specific.UTF8Encoding;
import org.jcodings.util.CaseInsensitiveBytesHash;
import org.jcodings.util.Hash;
import org.jruby.Ruby;
import org.jruby.RubyObject;
import org.jruby.runtime.encoding.EncodingService;
import org.jruby.truffle.nodes.RubyNode;
import org.jruby.truffle.nodes.coerce.ToStrNode;
import org.jruby.truffle.nodes.coerce.ToStrNodeFactory;
import org.jruby.truffle.runtime.RubyContext;
import org.jruby.truffle.runtime.control.RaiseException;
import org.jruby.truffle.runtime.core.RubyArray;
import org.jruby.truffle.runtime.core.RubyEncoding;
import org.jruby.truffle.runtime.core.RubyHash;
import org.jruby.truffle.runtime.core.RubyNilClass;
import org.jruby.truffle.runtime.core.RubyRegexp;
import org.jruby.truffle.runtime.core.RubyString;
import org.jruby.truffle.runtime.core.RubySymbol;
import org.jruby.truffle.runtime.hash.HashOperations;
import org.jruby.truffle.runtime.hash.KeyValue;
import org.jruby.util.ByteList;
@@ -142,6 +147,98 @@ public Object isCompatible(RubyEncoding first, RubyEncoding second) {
return getContext().getCoreLibrary().getNilObject();
}
}

@Specialization
public Object isCompatible(RubyString first, RubyRegexp second) {
notDesignedForCompilation();

Encoding compatibleEncoding = org.jruby.RubyEncoding.areCompatible(first.getByteList().getEncoding(), second.getRegex().getEncoding());

if (compatibleEncodingProfile.profile(compatibleEncoding != null)) {
return RubyEncoding.getEncoding(compatibleEncoding);
} else {
return getContext().getCoreLibrary().getNilObject();
}
}

@Specialization
public Object isCompatible(RubyRegexp first, RubyString second) {
notDesignedForCompilation();

Encoding compatibleEncoding = org.jruby.RubyEncoding.areCompatible(first.getRegex().getEncoding(), second.getByteList().getEncoding());

if (compatibleEncodingProfile.profile(compatibleEncoding != null)) {
return RubyEncoding.getEncoding(compatibleEncoding);
} else {
return getContext().getCoreLibrary().getNilObject();
}
}

@Specialization
public Object isCompatible(RubyRegexp first, RubyRegexp second) {
notDesignedForCompilation();

Encoding compatibleEncoding = org.jruby.RubyEncoding.areCompatible(first.getRegex().getEncoding(), second.getRegex().getEncoding());

if (compatibleEncodingProfile.profile(compatibleEncoding != null)) {
return RubyEncoding.getEncoding(compatibleEncoding);
} else {
return getContext().getCoreLibrary().getNilObject();
}
}

@Specialization
public Object isCompatible(RubyRegexp first, RubySymbol second) {
notDesignedForCompilation();

Encoding compatibleEncoding = org.jruby.RubyEncoding.areCompatible(first.getRegex().getEncoding(), second.getByteList().getEncoding());

if (compatibleEncodingProfile.profile(compatibleEncoding != null)) {
return RubyEncoding.getEncoding(compatibleEncoding);
} else {
return getContext().getCoreLibrary().getNilObject();
}
}

@Specialization
public Object isCompatible(RubySymbol first, RubyRegexp second) {
notDesignedForCompilation();

Encoding compatibleEncoding = org.jruby.RubyEncoding.areCompatible(first.getByteList().getEncoding(), second.getRegex().getEncoding());

if (compatibleEncodingProfile.profile(compatibleEncoding != null)) {
return RubyEncoding.getEncoding(compatibleEncoding);
} else {
return getContext().getCoreLibrary().getNilObject();
}
}

@Specialization
public Object isCompatible(RubyString first, RubySymbol second) {
notDesignedForCompilation();

Encoding compatibleEncoding = org.jruby.RubyEncoding.areCompatible(first, second);

if (compatibleEncodingProfile.profile(compatibleEncoding != null)) {
return RubyEncoding.getEncoding(compatibleEncoding);
} else {
return getContext().getCoreLibrary().getNilObject();
}
}

@Specialization
public Object isCompatible(RubySymbol first, RubySymbol second) {
notDesignedForCompilation();

Encoding compatibleEncoding = org.jruby.RubyEncoding.areCompatible(first, second);

if (compatibleEncodingProfile.profile(compatibleEncoding != null)) {
return RubyEncoding.getEncoding(compatibleEncoding);
} else {
return getContext().getCoreLibrary().getNilObject();
}
}

Review comment (Contributor):

Could we factor this out into a helper method that takes two ByteList objects?
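
One possible shape for that helper, as a minimal sketch rather than the code that was merged. Every type here (`Enc`, `Bytes`, `Str`, `Sym`, `Rx`, `NIL`) is a hypothetical stand-in so the example runs on its own, and `areCompatible` is a placeholder for `org.jruby.RubyEncoding.areCompatible`, not its real logic. The sketch also assumes a regexp's source bytes carry the same encoding that the diff currently reads from `getRegex().getEncoding()`, which may not hold in every case.

```java
/**
 * Sketch only. Enc, Bytes, Str, Sym, Rx and NIL are hypothetical stand-ins
 * for jcodings Encoding, ByteList, RubyString, RubySymbol, RubyRegexp and
 * the nil object; none of them is the real JRuby or Truffle API.
 */
final class CompatibleEncodingSketch {

    enum Enc { US_ASCII, UTF_8, ASCII_8BIT }

    /** Stand-in for ByteList: raw bytes tagged with an encoding. */
    static final class Bytes {
        final byte[] data;
        final Enc encoding;

        Bytes(byte[] data, Enc encoding) {
            this.data = data;
            this.encoding = encoding;
        }
    }

    static final class Str {
        final Bytes bytes;
        Str(Bytes bytes) { this.bytes = bytes; }
    }

    static final class Sym {
        final Bytes bytes;
        Sym(Bytes bytes) { this.bytes = bytes; }
    }

    static final class Rx {
        final Bytes source;
        Rx(Bytes source) { this.source = source; }
    }

    /** Stand-in for the nil singleton. */
    static final Object NIL = new Object();

    /**
     * Placeholder for org.jruby.RubyEncoding.areCompatible. It only treats
     * identical encodings as compatible; the real rules are richer.
     */
    static Enc areCompatible(Enc first, Enc second) {
        return first == second ? first : null;
    }

    /** The shared helper: two byte lists in, compatible encoding or nil out. */
    static Object compatibleEncodingOrNil(Bytes first, Bytes second) {
        final Enc compatible = areCompatible(first.encoding, second.encoding);
        return compatible != null ? compatible : NIL;
    }

    // Each specialization then reduces to a one-line delegation.
    static Object isCompatible(Str first, Rx second) {
        return compatibleEncodingOrNil(first.bytes, second.source);
    }

    static Object isCompatible(Rx first, Sym second) {
        return compatibleEncodingOrNil(first.source, second.bytes);
    }

    static Object isCompatible(Sym first, Sym second) {
        return compatibleEncodingOrNil(first.bytes, second.bytes);
    }

    public static void main(String[] args) {
        final Bytes ascii = new Bytes(new byte[] {'a'}, Enc.US_ASCII);
        final Bytes utf8 = new Bytes(new byte[] {'a'}, Enc.UTF_8);

        System.out.println(isCompatible(new Str(ascii), new Rx(ascii)));        // prints US_ASCII
        System.out.println(isCompatible(new Rx(ascii), new Sym(utf8)) == NIL);  // prints true
    }
}
```

With this shape the null check and the nil fallback live in one place, so the seven near-identical specialization bodies would differ only in how they obtain their two byte lists.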

}

@CoreMethod(names = "default_external", onSingleton = true)
@@ -217,6 +314,21 @@ public RubyEncoding defaultExternal(RubyEncoding encoding) {
return encoding;
}

@Specialization
public RubyEncoding defaultExternal(RubyString encodingString) {
notDesignedForCompilation();

final RubyEncoding rubyEncoding = RubyEncoding.getEncoding(encodingString.toString());
getContext().getRuntime().setDefaultExternalEncoding(rubyEncoding.getEncoding());
Review comment (Contributor):

This sets the default external encoding on the JRuby runtime. I'm not sure if we re-use that same value in Truffle. But since it passes the specs I won't worry too much.


return rubyEncoding;
}

@Specialization
public RubyEncoding defaultExternal(RubyNilClass nil) {
throw new RaiseException(getContext().getCoreLibrary().argumentError("default external can not be nil", this));
}

}

@CoreMethod(names = "default_internal=", onSingleton = true, required = 1)
@@ -361,6 +473,26 @@ public RubyArray list() {
}
}


@CoreMethod(names = "locale_charmap", onSingleton = true)
public abstract static class LocaleCharacterMapNode extends CoreMethodNode {

public LocaleCharacterMapNode(RubyContext context, SourceSection sourceSection) {
super(context, sourceSection);
}

public LocaleCharacterMapNode(LocaleCharacterMapNode prev) {
super(prev);
}

@Specialization
public RubyString localeCharacterMap() {
notDesignedForCompilation();
final ByteList name = new ByteList(getContext().getRuntime().getEncodingService().getLocaleEncoding().getName());
return getContext().makeString(name);
}
}

@CoreMethod(names = "dummy?")
public abstract static class DummyNode extends CoreMethodNode {

@@ -48,7 +48,7 @@ public RubyRegexp executeRubyRegexp(VirtualFrame frame) {

final org.jruby.RubyString preprocessed = org.jruby.RubyRegexp.preprocessDRegexp(getContext().getRuntime(), strings, options);

- final RubyRegexp regexp = new RubyRegexp(this, getContext().getCoreLibrary().getRegexpClass(), preprocessed.getByteList(), options.toOptions());
+ final RubyRegexp regexp = new RubyRegexp(this, getContext().getCoreLibrary().getRegexpClass(), preprocessed.getByteList(), options);

if (options.isEncodingNone()) {
if (!BodyTranslator.all7Bit(preprocessed.getByteList().bytes())) {
@@ -25,6 +25,7 @@
import com.oracle.truffle.api.frame.VirtualFrame;
import com.oracle.truffle.api.source.SourceSection;
import com.oracle.truffle.api.utilities.ConditionProfile;
import org.jruby.util.RegexpOptions;

@CoreClass(name = "Regexp")
public abstract class RegexpNodes {
@@ -253,8 +254,11 @@ public int options(RubyRegexp regexp) {

throw new RaiseException(getContext().getCoreLibrary().typeError("uninitialized Regexp", this));
}
if(regexp.getOptions() != null){
return regexp.getOptions().toOptions();
}

- return regexp.getRegex().getOptions();
+ return RegexpOptions.fromJoniOptions(regexp.getRegex().getOptions()).toOptions();
}

}
@@ -27,6 +27,7 @@
import org.jruby.truffle.runtime.RubyArguments;
import org.jruby.truffle.runtime.RubyContext;
import org.jruby.util.ByteList;
import org.jruby.util.RegexpOptions;
import org.jruby.util.StringSupport;

import java.nio.ByteBuffer;
@@ -43,16 +44,31 @@ public class RubyRegexp extends RubyBasicObject {
// TODO(CS): not sure these compilation finals are correct - are they needed anyway?
@CompilationFinal private Regex regex;
@CompilationFinal private ByteList source;
@CompilationFinal private RegexpOptions options;


public RubyRegexp(RubyClass regexpClass) {
super(regexpClass);
}


public RubyRegexp(Node currentNode, RubyClass regexpClass, ByteList regex, RegexpOptions options) {
this(regexpClass);
this.options = options;
initialize(compile(currentNode, getContext(), regex, options.toJoniOptions()), regex);
}

public RubyRegexp(Node currentNode, RubyClass regexpClass, ByteList regex, int options) {
this(regexpClass);
initialize(compile(currentNode, getContext(), regex, options), regex);
}

public RubyRegexp(RubyClass regexpClass, Regex regex, ByteList source, RegexpOptions options ) {
this(regexpClass);
this.options = options;
initialize(regex, source);
}

public RubyRegexp(RubyClass regexpClass, Regex regex, ByteList source) {
this(regexpClass);
initialize(regex, source);
@@ -76,6 +92,10 @@ public ByteList getSource() {
return source;
}

public RegexpOptions getOptions() {
return options;
}

@CompilerDirectives.TruffleBoundary
public Object matchCommon(RubyString source, boolean operator, boolean setNamedCaptures) {
final byte[] stringBytes = source.getByteList().bytes();
@@ -2349,9 +2349,9 @@ public RubyNode visitRedoNode(org.jruby.ast.RedoNode node) {

@Override
public RubyNode visitRegexpNode(org.jruby.ast.RegexpNode node) {
- Regex regex = RubyRegexp.compile(currentNode, context, node.getValue().bytes(), node.getEncoding(), node.getOptions().toOptions());
+ Regex regex = RubyRegexp.compile(currentNode, context, node.getValue().bytes(), node.getEncoding(), node.getOptions().toJoniOptions());

- final RubyRegexp regexp = new RubyRegexp(context.getCoreLibrary().getRegexpClass(), regex, node.getValue());
+ final RubyRegexp regexp = new RubyRegexp(context.getCoreLibrary().getRegexpClass(), regex, node.getValue(), node.getOptions());

// This isn't quite right - we shouldn't be looking up by name, we need a real reference to this constants
if (node.getOptions().isEncodingNone()) {