Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dir.glob with classpath: resource as a base is broken #6421

Open
Magisus opened this issue Sep 30, 2020 · 7 comments · May be fixed by #6427
Open

Dir.glob with classpath: resource as a base is broken #6421

Magisus opened this issue Sep 30, 2020 · 7 comments · May be fixed by #6427

Comments

@Magisus
Copy link

Magisus commented Sep 30, 2020

Environment Information

Provide at least:

  • JRuby version: 9.2.11+ (still broken on 9.2.13.0)
  • MacOS 10.15, Centos 7 (3.10.0-862.3.2.el7.x86_64)

Other relevant info you may wish to add:

  • Used within a clojure app
  • issue can be reproduced running irb in a leiningen REPL
  • Also reproducible inside a jar

Expected Behavior

Dir.glob should be able to take a classpath:<filename> resource as a base, since it apparently supports uri:classloader:<filename>.

Actual Behavior

We observed the following, starting a REPL by running lein irb in a checkout of a the puppetserver repo. Apologies that I'm not sure how to properly set up a classpath outside of this environment for a simpler repro:

# this is the correct behavior
irb(main):006:0> Dir.glob("*", base: "./src/ruby/puppetserver-lib")
=> ["puppet"]
 
# This started breaking with 9.2.11.0 (or possibly 9.2.10, but we skipped that version)
irb(main):009:0> Dir.glob "*", base:  "classpath:/puppetserver-lib"
Traceback (most recent call last):
16: from org.jruby.ir.interpreter.InterpreterEngine.processCall(InterpreterEngine.java:325)
15: from org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:205)
14: from org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:396)
13: from org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:208)
12: from org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:212)
11: from org.jruby.RubyDir$INVOKER$s$0$2$glob.call(RubyDir$INVOKER$s$0$2$glob.gen)
10: from org.jruby.RubyDir.glob(RubyDir.java:310)
9: from org.jruby.util.Dir.push_glob(Dir.java:310)
8: from org.jruby.util.Dir.push_braces(Dir.java:449)
7: from org.jruby.util.Dir.push_globs(Dir.java:481)
6: from org.jruby.util.Dir.glob_helper(Dir.java:693)
5: from org.jruby.util.Dir.glob_helper(Dir.java:745)
4: from org.jruby.util.JRubyFile.createResource(JRubyFile.java:82)
3: from org.jruby.util.JRubyFile.createResource(JRubyFile.java:119)
2: from org.jruby.util.JRubyFile.create(JRubyFile.java:62)
1: from org.jruby.util.JRubyFile.createNoUnicodeConversion(JRubyFile.java:143)
Java::JavaLang::IllegalArgumentException (Neither current working directory (classpath:/puppetserver-lib) nor pathname (.) led to an absolute path)

# This workaround seems to fix it
irb(main):009:0> Dir.glob "*", base:  "uri:classloader:/puppetserver-lib"
=> ["puppet"]

This worked on 9.2.8.0. When we upgraded to 9.2.11.0, it stopped working. We've switched to using uri:classloader for now as a workaround, but since this is a regression, and from looking at the code, classpath still seems to be generally supported, this seems like a bug and not an intentional change?

I did some spelunking, and noticed the following oddity that might explain this, even if the fix isn't exactly in that part of the code. I think we want to hit this code path but don’t, because while it converts a pathname with classpath in it to uri:classloader here, it doesn’t do the same for cwd, so instead we fall through to this which incorrectly treats it as a normal file. It's unclear to me if there's supposed to be a conversion for cwd too, or if this code expects that to already have been processed more before we get here. But hope that finding helps track this down.

Let me know if you need any more information.

@Magisus
Copy link
Author

Magisus commented Sep 30, 2020

I wasn't able to find a clear definition of the difference between the classpath and the uri:classloader protocols. Is that described somewhere? I was a little surprised to find them to be interchangeable for us, and would appreciate a better understanding of what each is for, if someone can explain. That would help me evaluate if this workaround is really going to be viable in all scenarios.

For context, we use it as an element in our configured load path for JRuby: https://github.com/puppetlabs/puppetserver/blob/74af3b155550c843b5705bbad263cb82940b0cfb/src/clj/puppetlabs/services/jruby/jruby_puppet_core.clj#L23-L35

@headius
Copy link
Member

headius commented Oct 6, 2020

Thanks for the report!

My memory is a bit fuzzy on the difference, but I believe "classpath" was the older URL format and we generally prefer "uri:classloader" now, but I agree they essentially mean the same thing and should behave the same.

@headius
Copy link
Member

headius commented Oct 6, 2020

Reproduced with a self-contained example:

irb(main):001:0> Dir.glob "*", base:  "uri:classloader:/jni"
=> ["sparcv9-Linux", "x86_64-Windows", "arm-Linux", "ppc-AIX", "x86_64-FreeBSD", "mips64el-Linux", "i386-SunOS", "x86_64-SunOS", "i386-Linux", "Darwin", "ppc64-AIX", "aarch64-Linux", "x86_64-Linux", "x86_64-DragonFlyBSD", "i386-Windows", "ppc64-Linux", "sparcv9-SunOS", "ppc64le-Linux", "x86_64-OpenBSD"]
irb(main):002:0> Dir.glob "*", base:  "classpath:/jni"
Traceback (most recent call last):
       16: from org.jruby.dist/org.jruby.ir.interpreter.InterpreterEngine.processCall(InterpreterEngine.java:325)
       15: from org.jruby.dist/org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:205)
       14: from org.jruby.dist/org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:392)
       13: from org.jruby.dist/org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:211)
       12: from org.jruby.dist/org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:215)
       11: from org.jruby.dist/org.jruby.RubyDir$INVOKER$s$0$2$glob.call(RubyDir$INVOKER$s$0$2$glob.gen)
       10: from org.jruby.dist/org.jruby.RubyDir.glob(RubyDir.java:310)

Note that this isn't just an error, it's a Java IllegalArgumentException error, so clearly violating our goal of hiding the Java bits of JRuby.

@headius
Copy link
Member

headius commented Oct 6, 2020

Note also removing the leading "/" does not solve the "classpath" case, though it works both ways with "uri:classloader".

@headius
Copy link
Member

headius commented Oct 6, 2020

This appears to be a symptom of inconsistent normalization.

We have code in JRubyFile to normalize "classpath:" to "uri:classloader:", but the glob logic doesn't use the normalized path other than to check the existence of the resource. So in the case here, it normalizes to "uri:classloader:/jni", sees that that resource exists, and then continues on using "classpath:/jni" which is not understood by the rest of the globbing logic.

So there's two fixes here that make sense independently or together:

  • After normalizing the path with JRubyFile, use that new path from then on.
  • Add support to glob internals for the "classpath:" form alongside the "uri:classloader:" form.

A patch for the first part, which is sufficient to fix this issue:

diff --git a/core/src/main/java/org/jruby/RubyDir.java b/core/src/main/java/org/jruby/RubyDir.java
index 4eb1385894..8916e70a05 100644
--- a/core/src/main/java/org/jruby/RubyDir.java
+++ b/core/src/main/java/org/jruby/RubyDir.java
@@ -300,11 +300,18 @@ public class RubyDir extends RubyObject implements Closeable {
         String base = globOptions(context, args, flags);
         List<ByteList> dirs;
 
-        if (base != null && !base.isEmpty() && !(JRubyFile.createResource(context, base).exists())){
+        if (base == null || base.isEmpty()) {
+            base = runtime.getCurrentDirectory();
+        }
+
+        // createResource will normalize URLs (see GH-6421)
+        FileResource resource = JRubyFile.createResource(context, base);
+
+        if (!(resource.exists())){
             dirs = new ArrayList<ByteList>();
         } else {
             IRubyObject tmp = args[0].checkArrayType();
-            String dir = base == null || base.isEmpty() ? runtime.getCurrentDirectory() : base;
+            String dir = resource.absolutePath();
 
             if (tmp.isNil()) {
                 dirs = Dir.push_glob(runtime, dir, globArgumentAsByteList(context, args[0]), flags[0]);

I'm on the fence about whether to fix the rest of the glob internals to support "classpath:". If such a fix were made, it would likely be simplest to normalize the URI in the same way, which then begs the question of why it's not normalized earlier in the call stack as in the above patch.

The patch above is the lowest-impact fix that works, and does exactly that: normalizes the URI before proceeding into glob internals.

headius added a commit to headius/jruby that referenced this issue Oct 6, 2020
The createResource call here normalizes a URI like "classpath:" to
our preferred "uri:classloader:" but only uses that normalized
version to check the existence of the resource in question. If the
resource exists, it proceeds to use the un-normalized version in
glob, and fails since "classpath:" is not supported by our glob
internals.

This patch uses the newly normalized path when calling glob
internals, so "classpath:" never leaks into that code.

Fixes jruby#6421
headius added a commit to headius/jruby that referenced this issue Oct 6, 2020
Not directly related to jruby#6421 but found along the way.
@headius headius linked a pull request Oct 6, 2020 that will close this issue
@headius headius linked a pull request Oct 6, 2020 that will close this issue
@headius
Copy link
Member

headius commented Oct 12, 2020

The patch and connected PR appear to cause some issues loading resources from jar files, so this needs another look.

@headius
Copy link
Member

headius commented May 11, 2023

This is now affecting autoload-based systems like Zeitwork when used inside a jar context, such as on Android.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants