File.directory? of a uri:classloader resources ending in "/" incorrectly returns false #4350

Closed
the-michael-toy opened this Issue Dec 1, 2016 · 30 comments

Projects

None yet

4 participants

@the-michael-toy
the-michael-toy commented Dec 1, 2016 edited

Environment

  • jruby 9.1.5.0 (2.3.1) 2016-09-07 036ce39 Java HotSpot(TM) 64-Bit Server VM 25.112-b16 on 1.8.0_112-b16 +jit [darwin-x86_64]
  • Darwin Quindici.local 16.1.0 Darwin Kernel Version 16.1.0: Thu Oct 13 21:26:57 PDT 2016; root:xnu-3789.21.3~60/RELEASE_X86_64 x86_64

Bug description

  here = File.dirname(__FILE__)
  FileUtils.cp_r(File.join(here, "."), somewhere_else)

Run from the file system the cp_r works as expected. If you run from a jar, you get an exception. After much investigation (see below), this is because File.join(here, ".") doesn't get recognized as a directory.

This used to work in 1.7 and was discovered when testing our app under 9000. It boils down, after all the mapping of paths, to fail because

  • File.dirname(__FILE__) equalsuri:classloader:/dirname`
  • File.join(dirname, ".") gets normalized from uri:classloader:/dirname/. to uri:classloader:/dirname/
  • Which gets mapped to jar:file:PATH_TO_JAR!/dirname/
  • Which is not recognized as a directory because URLResource.createClassloaderURIcalls getResources (here) to determine directory contents in the absence of a .jrubydir file, which returns no data if passed a name with a trailing slash

Update

I have actively been trying to find a fix for this, which makes this issue kind of long, sorry.

@the-michael-toy
the-michael-toy commented Dec 1, 2016 edited

Here's a small (60 line shar file) example which shows this: repro_shar.txt

% sh repro_shar.txt
% cd saml_event
% jruby bin/saml_event
SOURCE IS /private/tmp/saml-event/data
DESTNATIOM IS /var/folders/bk/k57jlwr48xjcgr0059bklvj00000gp/T/d20161201-2152-mbu6ao
/var/folders/bk/k57jlwr48xjcgr0059bklvj00000gp/T/d20161201-2152-mbu6ao/deeper/a_file found

## now build a jar and run it that way

% warble
rm -f saml-event.jar
Creating saml-event.jar
% java -jar saml_event.jar
java -jar saml-event.jar 
SOURCE IS uri:classloader:/saml-event/data
DESTNATIOM IS /var/folders/bk/k57jlwr48xjcgr0059bklvj00000gp/T/d20161201-2183-y0dzhs
Errno::EISDIR: Is a directory - /var/folders/bk/k57jlwr48xjcgr0059bklvj00000gp/T/d20161201-2183-y0dzhs/.
                 initialize at org/jruby/RubyFile.java:368
                       open at org/jruby/RubyIO.java:1139
         block in copy_file at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/fileutils.rb:1403
                       open at org/jruby/RubyIO.java:1141
                  copy_file at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/fileutils.rb:1402
                       copy at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/fileutils.rb:1370
        block in copy_entry at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/fileutils.rb:472
              wrap_traverse at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/fileutils.rb:1508
                 copy_entry at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/fileutils.rb:469
              block in cp_r at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/fileutils.rb:444
  block in fu_each_src_dest at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/fileutils.rb:1581
          fu_each_src_dest0 at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/fileutils.rb:1595
           fu_each_src_dest at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/fileutils.rb:1579
                       cp_r at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/fileutils.rb:443
                     <main> at uri:classloader:/saml-event/bin/saml-event:12
                       load at org/jruby/RubyKernel.java:977
                     <main> at uri:classloader:/META-INF/main.rb:1
                    require at org/jruby/RubyKernel.java:959
                     (root) at uri:classloader:/META-INF/main.rb:1
                     <main> at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:1
ERROR: org.jruby.embed.EvalFailedException: (EISDIR) Is a directory - /var/folders/bk/k57jlwr48xjcgr0059bklvj00000gp/T/d20161201-2183-y0dzhs/.
@the-michael-toy
the-michael-toy commented Dec 7, 2016 edited

@dmarcotte is showing me how to debug JRuby, and what it looks like from the debugger, though it is tough getting up to speed, is that path names ending in a /. are mistakenly identified as files and not directories by the jar resource loader. This test shows that to be true ...

here = File.dirname(__FILE__)
here_here = File.join(here, ".")
not_directories = [ here, here_here ].reject {|f| File.directory?(f) }
puts "GOT THESE WRONG: #{not_directories.join(" AND ")}" if not_directories.size > 0

Run from a jar, you see the WRONG output.

Still swimming in stack traces and files I am not familiar with, so I don't have a fix yet.

@the-michael-toy the-michael-toy changed the title from FileUtils:::cp_r doesn't recognize source as a directory when in a jar to File.directory? of a uri:classpath resources ending in "/." incorrectly returns false Dec 13, 2016
@the-michael-toy the-michael-toy changed the title from File.directory? of a uri:classpath resources ending in "/." incorrectly returns false to File.directory? of a uri:classpath resources ending in "/" incorrectly returns false Dec 13, 2016
@the-michael-toy
the-michael-toy commented Dec 13, 2016 edited

Debugging JRuby as it processes this, I have discovered that I can reduce the failing cp_r to this one simpler failing gesture ...

here = File.dirname(__FILE__) + "/"
puts File.directory?(here) ? "PASS -- #{here} is a directory" : "FAIL !! #{here} is not a directory"
@the-michael-toy
the-michael-toy commented Dec 13, 2016 edited

I have a jar with one ruby file in "bin", with the above code.

What happens is that URLResource.listClassLoaderFiles assumes that there is a .jrubydir file, and there isn't, and so the string url:classpath:/MYAPP/bin/ stats as a file because it has no entries.

Making the jar have a .jrubydir file makes the problem go away.

What I can't say is if this is a bug or not.

@the-michael-toy
the-michael-toy commented Dec 13, 2016 edited

Aha the code from #3073 ( and 97e35b8 in particular ), should have kicked in here, but didn't save me.

Back to the debugger.

@the-michael-toy
the-michael-toy commented Dec 13, 2016 edited

I'm documenting my journey in case someone who actually understands JRuby picks up this bug.

I tried adding a test to test/jruby/test_dir.rb because the @mkristian's PR seems like it should address this.

The problem seems to be at the line of code in URLResource.createClassloaderURI which tries to construct the directory contents when a .jrubydir is missing by calling getResources. I wrote this test ...

  def test_stat_trailing_slash_via_uri_classloader
    jar_file = File.expand_path('../jar_with_relative_require1.jar', __FILE__)
    $CLASSPATH << jar_file
    jar_path = "uri:classloader:test/require_relative1.rb"
    assert File.exist?(jar_path), "test is wrong, #{jar_path} doesn't even exist"
    dir_in_jar = File.dirname(jar_path) + "/"
    assert File.directory?(dir_in_jar), "#{dir_in_jar} claims to not be a directory"
  end

but the test PASSES because at this point in URLResource.createClassloaderURI

                  Enumeration<URL> urls = cl.getResources(pathname);

pathname = "file:/Users/mtoy/git/jruby/test/"

but in my failing test app-in-a-jar, at the same breakpoint, File.directory(File.dirname(__FILE__)+"/") results in

pathname = "jar:file:/Users/mtoy/saml-event/saml-event.jar!/saml-event/bin/"

and THAT path doesn't return resources ( though if you strip the trailing slash it does). I am not sure how to write a test to replicate what my test app does.

because I am a noob, I am currently stalled because the debugger I am using won't let me step into the getResources call to get closer to the line of code which is failing.

@the-michael-toy the-michael-toy changed the title from File.directory? of a uri:classpath resources ending in "/" incorrectly returns false to File.directory? of a uri:classloader resources ending in "/" incorrectly returns false Dec 13, 2016
@the-michael-toy
the-michael-toy commented Dec 13, 2016 edited

I was able to force a test failure by matching the jar path that uri:classloader gets re-written to in my warbler packaged test app. This test fails

  def test_stat_trailing_slash_via_uri_classloader
    jar_file = File.expand_path('../jar_with_relative_require1.jar', __FILE__)
    $CLASSPATH << jar_file
    source_file = "jar:file:#{jar_file}!/test/require_relative1.rb"
    assert File.exist?(source_file), "test is wrong, #{source_file} doesn't even exist"
    source_dir = File.dirname(source_file)
    assert File.directory?(source_dir), "#{source_dir} not found"
    source_dir += "/"
    assert File.directory?(source_dir), "#{source_dir} claims to not be a directory"
  end

Still no clue how to fix it other than the obvious workaround-without-understanding of stripping the trailing slash in URLResource.createClassloaderURI just before it calls getResources.

@headius
Member
headius commented Dec 19, 2016

Hey thanks for you patience. I can never remember whether the leading slash should be there or not, so this workaround may not be far from a fix. Any thoughts @mkristian?

@enebo
Member
enebo commented Dec 20, 2016

@the-michael-toy @mkristian @headius fwiw adding a slash also does not work in 1.7.26. I do not see the harm in stripping the "/" since: 'dir' -> true, 'dir/' -> false, 'dir/.' -> true. In fact, it really seems like a bug to me. I think @mkristian is the ultimate judge though. If there a reason why we would want this to return false?

@the-michael-toy

The question I had which prevented me from submitting a fix, and which problems in learning how to use the debugger kept me from answering: Patch URLResource.listClassLoaderFiles or is the more correct fix is to figure out why findResources doesn't like the slash and fixing it there.

@enebo enebo added a commit that closed this issue Dec 20, 2016
@enebo enebo Fixes #4350. File.directory? of a uri:classloader resources ending in…
… / incorrectly returns false
057fd3f
@enebo enebo closed this in 057fd3f Dec 20, 2016
@enebo enebo added this to the JRuby 9.1.7.0 milestone Dec 20, 2016
@enebo
Member
enebo commented Dec 20, 2016

@mkristian I am optimistically fixing this so I can ask some people to run broader tests but I am still hoping for you to bless this fix.

@the-michael-toy

Test is named in a misleading way because I wrote it and it passed, the classloader-path re-writing difference between the test and a warble jar forced me to not use a classloader url in the test. Maybe?

  def test_stat_directory_in_jar_with_trailing_slash
@the-michael-toy

thanks for the fix!

@enebo
Member
enebo commented Dec 20, 2016

@the-michael-toy I am confused by that .jrubydir logic in that method. To me, a path delimiter on the end of a file would never be considered to be a normative path since it is a delimiter and there is nothing after the delimiter. It is possible there is some reason why having a '/' at the end of a URI is valid here? I think we only use it for resolving real paths and not in the more generic sense of what you might use for a URI. Even then I am not sure that is valid for a URI?

I am not sure why that one-off check exists but it is the one thing which gives me pause about my fix.

@enebo
Member
enebo commented Dec 20, 2016

@the-michael-toy 04ba645 I corrected the test name and tried to acknowledge you wrote that test code. Thanks for digging in. It really helped focus things.

@mkristian
Member

@enebo a fix for the JarResource is OK for me. but I am surprised that a fix there resolved the initial issue with the uri:classloader: - if there is not .jrubydir file in the directory it tries to fall back on the JarResource if the underlying classloader is some URLClassloader - maybe that is why the JarResource fix helps. all the cases I cared regarding the j2ee and osgi classloaders I added unit and integration tests (which reminds me that I still need to look into some of those). so I am quite confident that it is OK .

for me dir/ and dir and dir/. should all be a directory. the Classloader.getResource is tricky since some return something for a directory path and others do not.

@the-michael-toy

@mkristian the problem is that the code you wrote for loading resources in the absence of a .jrubydir file fails when asking for the resources in a path of the form jar:file:JAR_PATH:/path/_in_jar/ because the findResouces on path/in_jar/ returns no entries.

@the-michael-toy
the-michael-toy commented Dec 20, 2016 edited

@enebo in the original bug, I started with essentially this failing code

File.directory?(File.dirname(__FILE__) + "/.")

__FILE__ becomes uri:classloader:/path/to/file.rb

and after normalization and failure to find a .jrubydir, the uri that gets requested is jar:file:PATH_TO_JAR:/path/to/ I assumed the normalization of "/." to "/" was to make file system references look more like URI like references and was probably there for a case like "/path/./to" and the end-of-path thing was an accident?

If you leave the dot out though and just use a path that ends in slash, you end up with the same problem.

File.directory?(File.dirname(__FILE__) + "/")
@the-michael-toy

But to the question about trailing slash in a URI ... it seems to indeed be legal and there are specific guidelines in the RFC about how to normalize

@enebo
Member
enebo commented Dec 20, 2016

@the-michael-toy ah thanks for that last comment. I sort of figured it is not really a delimeter in the general sense of an URI but in how we use URI I am not sure we need to worry about it since our URIs are exclusively related to filesystems. If we ever decide to support require "http://blergh/foo/bar/" then perhaps I introduced an issue but with that said we will probably have a new resource type for that and this code will still be ok...Too much coffee :)

@the-michael-toy

Hmmm ....

You have made File.stat(PATH_TO_DIRECTORY) and File.stat(PATH_TO_DIRECTORY + "/") be the same, which is correct, but have you also made File.stat(PATH_TO_FILE) and File.stat(PATH_TO_FILE + "/") the same.

I suspect that UNIX throws an ENOENT on the second case and you probably would want to do that to, somewhere in the world there is a piece of code which will lose it's mind if you don't.

Easy test to write .... (checking now)

@the-michael-toy

Yeah @enebo I don't know if this is a separate bug or just a deeper dive into this one, but even without any changes, MRI and JRuby handle /UNIX/PATH_TO_NORMAL_FILE/ differently ...

% cat test.rb
puts "This file is #{__FILE__}, add a slash to it in #{RUBY_DESCRIPTION}"
puts File.exist?(__FILE__+"/") ? "is INCORRECTLY found" : "is correctly not found"

Here's MRI

% rvm use 2.3.1
Using /Users/mtoy/.rvm/gems/ruby-2.3.1
% ruby test.rb 
This file is test.rb, add a slash to it in ruby 2.3.1p112 (2016-04-26 revision 54768) [x86_64-darwin16]
is correctly not found

And both 9.1.5.0 and 1.7.19 ( the two versions we use around here) ...

% rvm use jruby-9.1.5.0
Using /Users/mtoy/.rvm/gems/jruby-9.1.5.0
% ruby test.rb 
This file is test.rb, add a slash to it in jruby 9.1.5.0 (2.3.1) 2016-09-07 036ce39 Java HotSpot(TM) 64-Bit Server VM 25.112-b16 on 1.8.0_112-b16 +jit [darwin-x86_64]
is INCORRECTLY found
@enebo
Member
enebo commented Dec 20, 2016

@the-michael-toy yeah I agree this is an issue. Your example also points out the difficulty of supporting this. In the case of directory? we can know we are testing for a directory. In that sense, we have knowledge to remove a stray '/' so the zip/jar stuff works as expected. In stat, we don't know what it is. My fix would break your last stat.

This is particularly tricky to fix. For directory? if we knew it ended in we test without the '/'. For stat, if we have no / then it works as before the fix. If we do have / then we must assume the result is a directory or error?

@enebo
Member
enebo commented Dec 20, 2016

@the-michael-toy I would say that must be a new issue since we are not using JarResource for that. Can you open an issue for it?

@the-michael-toy
the-michael-toy commented Dec 20, 2016 edited

yup ... will do (#4403 )

@the-michael-toy

Interestingly, running that test from a jar (and 9.1.6.0 ... sorry gem juggling failure )

This file is uri:classloader:/saml-event/bin/saml-event, add a slash to it in jruby 9.1.6.0 (2.3.1) 2016-11-09 0150a76 Java HotSpot(TM) 64-Bit Server VM 25.112-b16 on 1.8.0_112-b16 +jit [darwin-x86_64]
is correctly not found
@enebo
Member
enebo commented Dec 20, 2016

This should solve all issues I hope :) we potentially ask zip file twice to check to see if it has children but for a file this will never work and we construct the proper subtype. The name is also not pruned. I think that is reasonable?

@the-michael-toy

Looks like MRI returns Errno::ENOTDIR on File.stat(NORMAL_FILE_ENDING_IN_SLASH) not ENOENT --- just FYI

@the-michael-toy

might be different depending on OS though

@enebo
Member
enebo commented Dec 20, 2016

@the-michael-toy I will make a note of ENOTDIR in #4403. This behavior is all wound together in respect to the notion that the resource exists but it ends in slash so it is ENOTDIR. When this is fixed the regression will break but we will realize why and fix it. At this point since regular files also return ENOENT I feel like it is still forward progress :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment