New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

expand_path should use the original encoding #3866

Closed
wants to merge 2 commits into
base: master
from

Conversation

Projects
None yet
2 participants
@ahorek
Contributor

ahorek commented May 10, 2016

fixes #3849

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius May 10, 2016

Member

It appears one of the tests in the JRuby suite failed, but I need to confirm if it's an appropriate test.

Member

headius commented May 10, 2016

It appears one of the tests in the JRuby suite failed, but I need to confirm if it's an appropriate test.

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius May 10, 2016

Member

Oh, silly me...it's the test you added. Testing on MRI now.

Member

headius commented May 10, 2016

Oh, silly me...it's the test you added. Testing on MRI now.

Pavel Rosický
@ahorek

This comment has been minimized.

Show comment
Hide comment
@ahorek

ahorek May 10, 2016

Contributor

@headius should be fixed now

Contributor

ahorek commented May 10, 2016

@headius should be fixed now

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius May 10, 2016

Member

@ahorek Ok that matches a patch I put together here that I think fits a little better. I'll merge your PR to get the test but I'll go with my patch and you can let me know if it fixes your local issues too.

Member

headius commented May 10, 2016

@ahorek Ok that matches a patch I put together here that I think fits a little better. I'll merge your PR to get the test but I'll go with my patch and you can let me know if it fixes your local issues too.

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius May 10, 2016

Member

Looks like with either of our patches I get a couple failures in MRI's suite:

TestFileExhaustive#test_expand_path_encoding_filesystem [/Users/headius/projects/jruby/test/mri/ruby/test_file_exhaustive.rb:660]:
<#<Encoding:UTF-8>> expected but was
<#<Encoding:US-ASCII>>.

Here's the code for that test:

  def test_expand_path_encoding_filesystem
    home = ENV["HOME"]
    ENV["HOME"] = "#{DRIVE}/UserHome"

    path = "~".encode("US-ASCII")
    dir = "C:/".encode("IBM437")
    fs = Encoding.find("filesystem")

    assert_equal fs, File.expand_path(path).encoding
    assert_equal fs, File.expand_path(path, dir).encoding
  ensure
    ENV["HOME"] = home
  end

And here's the patch I made (applies over your PR):

diff --git a/core/src/main/java/org/jruby/RubyFile.java b/core/src/main/java/org/jruby/RubyFile.java
index 23689f1..237fd90 100644
--- a/core/src/main/java/org/jruby/RubyFile.java
+++ b/core/src/main/java/org/jruby/RubyFile.java
@@ -785,13 +785,9 @@ public class RubyFile extends RubyIO implements EncodingCapable {

     @JRubyMethod(name = "expand_path", required = 1, optional = 1, meta = true)
     public static IRubyObject expand_path19(ThreadContext context, IRubyObject recv, IRubyObject[] args) {
-        RubyString path = (RubyString) expandPathInternal(context, recv, args, true, false);
-        path.force_encoding(context, context.runtime.getEncodingService().convertEncodingToRubyEncoding(get_path(context, args[0]).getEncoding()));
-
-        return path;
+        return expandPathInternal(context, recv, args, true, false);
     }

-
     /**
      * ---------------------------------------------------- File::absolute_path
      *      File.absolute_path(file_name [, dir_string] ) -> abs_file_name
@@ -1560,6 +1556,7 @@ public class RubyFile extends RubyIO implements EncodingCapable {

         RubyString origPath = StringSupport.checkEmbeddedNulls(runtime, get_path(context, args[0]));
         String relativePath = origPath.getUnicodeValue();
+        Encoding enc = origPath.getEncoding();

         // Special /dev/null of windows
         if (Platform.IS_WINDOWS && ("NUL:".equalsIgnoreCase(relativePath) || "NUL".equalsIgnoreCase(relativePath))) {
@@ -1613,7 +1610,7 @@ public class RubyFile extends RubyIO implements EncodingCapable {
                 // this is basically for classpath:/ and uri:classloader:/
                 relativePath = relativePath.substring(2).replace('\\', '/');
             }
-            return concatStrings(runtime, preFix, extra, relativePath);
+            return concatStrings(runtime, preFix, extra, relativePath, enc);
         }

         String[] uriParts = splitURI(relativePath);
@@ -1627,7 +1624,7 @@ public class RubyFile extends RubyIO implements EncodingCapable {
         if (uriParts != null) {
             //If the path was an absolute classpath path, return it as-is.
             if (uriParts[0].equals("classpath:")) {
-                return concatStrings(runtime, preFix, relativePath, postFix);
+                return concatStrings(runtime, preFix, relativePath, postFix, enc);
             }
             relativePath = uriParts[1];
         }
@@ -1745,13 +1742,14 @@ public class RubyFile extends RubyIO implements EncodingCapable {
                 }
             }
         }
-        return concatStrings(runtime, preFix, realPath, postFix);
+        return concatStrings(runtime, preFix, realPath, postFix, enc);
     }

-    private static RubyString concatStrings(final Ruby runtime, String s1, String s2, String s3) {
+    private static RubyString concatStrings(final Ruby runtime, String s1, String s2, String s3, Encoding enc) {
         return RubyString.newString(runtime,
                 new StringBuilder(s1.length() + s2.length() + s3.length()).
-                        append(s1).append(s2).append(s3)
+                        append(s1).append(s2).append(s3).toString(),
+                enc
         );
     }

Unfortunately this is where it gets a bit tricky. Looking at MRI's C code for expand_path does show it trying to negotiate the incoming string's encoding and the filesystem's encoding, but it's not clear why this test should produce the filesystem encoding rather than the input encoding.

Member

headius commented May 10, 2016

Looks like with either of our patches I get a couple failures in MRI's suite:

TestFileExhaustive#test_expand_path_encoding_filesystem [/Users/headius/projects/jruby/test/mri/ruby/test_file_exhaustive.rb:660]:
<#<Encoding:UTF-8>> expected but was
<#<Encoding:US-ASCII>>.

Here's the code for that test:

  def test_expand_path_encoding_filesystem
    home = ENV["HOME"]
    ENV["HOME"] = "#{DRIVE}/UserHome"

    path = "~".encode("US-ASCII")
    dir = "C:/".encode("IBM437")
    fs = Encoding.find("filesystem")

    assert_equal fs, File.expand_path(path).encoding
    assert_equal fs, File.expand_path(path, dir).encoding
  ensure
    ENV["HOME"] = home
  end

And here's the patch I made (applies over your PR):

diff --git a/core/src/main/java/org/jruby/RubyFile.java b/core/src/main/java/org/jruby/RubyFile.java
index 23689f1..237fd90 100644
--- a/core/src/main/java/org/jruby/RubyFile.java
+++ b/core/src/main/java/org/jruby/RubyFile.java
@@ -785,13 +785,9 @@ public class RubyFile extends RubyIO implements EncodingCapable {

     @JRubyMethod(name = "expand_path", required = 1, optional = 1, meta = true)
     public static IRubyObject expand_path19(ThreadContext context, IRubyObject recv, IRubyObject[] args) {
-        RubyString path = (RubyString) expandPathInternal(context, recv, args, true, false);
-        path.force_encoding(context, context.runtime.getEncodingService().convertEncodingToRubyEncoding(get_path(context, args[0]).getEncoding()));
-
-        return path;
+        return expandPathInternal(context, recv, args, true, false);
     }

-
     /**
      * ---------------------------------------------------- File::absolute_path
      *      File.absolute_path(file_name [, dir_string] ) -> abs_file_name
@@ -1560,6 +1556,7 @@ public class RubyFile extends RubyIO implements EncodingCapable {

         RubyString origPath = StringSupport.checkEmbeddedNulls(runtime, get_path(context, args[0]));
         String relativePath = origPath.getUnicodeValue();
+        Encoding enc = origPath.getEncoding();

         // Special /dev/null of windows
         if (Platform.IS_WINDOWS && ("NUL:".equalsIgnoreCase(relativePath) || "NUL".equalsIgnoreCase(relativePath))) {
@@ -1613,7 +1610,7 @@ public class RubyFile extends RubyIO implements EncodingCapable {
                 // this is basically for classpath:/ and uri:classloader:/
                 relativePath = relativePath.substring(2).replace('\\', '/');
             }
-            return concatStrings(runtime, preFix, extra, relativePath);
+            return concatStrings(runtime, preFix, extra, relativePath, enc);
         }

         String[] uriParts = splitURI(relativePath);
@@ -1627,7 +1624,7 @@ public class RubyFile extends RubyIO implements EncodingCapable {
         if (uriParts != null) {
             //If the path was an absolute classpath path, return it as-is.
             if (uriParts[0].equals("classpath:")) {
-                return concatStrings(runtime, preFix, relativePath, postFix);
+                return concatStrings(runtime, preFix, relativePath, postFix, enc);
             }
             relativePath = uriParts[1];
         }
@@ -1745,13 +1742,14 @@ public class RubyFile extends RubyIO implements EncodingCapable {
                 }
             }
         }
-        return concatStrings(runtime, preFix, realPath, postFix);
+        return concatStrings(runtime, preFix, realPath, postFix, enc);
     }

-    private static RubyString concatStrings(final Ruby runtime, String s1, String s2, String s3) {
+    private static RubyString concatStrings(final Ruby runtime, String s1, String s2, String s3, Encoding enc) {
         return RubyString.newString(runtime,
                 new StringBuilder(s1.length() + s2.length() + s3.length()).
-                        append(s1).append(s2).append(s3)
+                        append(s1).append(s2).append(s3).toString(),
+                enc
         );
     }

Unfortunately this is where it gets a bit tricky. Looking at MRI's C code for expand_path does show it trying to negotiate the incoming string's encoding and the filesystem's encoding, but it's not clear why this test should produce the filesystem encoding rather than the input encoding.

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius May 10, 2016

Member

Oh, line 660 is the first assert_equal.

Member

headius commented May 10, 2016

Oh, line 660 is the first assert_equal.

headius added a commit that referenced this pull request May 10, 2016

Further fixes for File.expand_path encoding.
See #3849 and the PR this is based on #3866.
@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius May 10, 2016

Member

Ok, I've merged this to master with my additional fixes and one newly-passing ruby/spec!

@ahorek Thanks for starting this off and providing more tests! Please confirm that #3849 is fixed for you on master.

Member

headius commented May 10, 2016

Ok, I've merged this to master with my additional fixes and one newly-passing ruby/spec!

@ahorek Thanks for starting this off and providing more tests! Please confirm that #3849 is fixed for you on master.

@headius headius closed this May 10, 2016

@headius headius added this to the JRuby 9.1.1.0 milestone May 10, 2016

@ahorek

This comment has been minimized.

Show comment
Hide comment
@ahorek

ahorek May 10, 2016

Contributor

@headius Tested, thanks for getting this done!

Contributor

ahorek commented May 10, 2016

@headius Tested, thanks for getting this done!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment