Skip to content
This repository has been archived by the owner on Sep 19, 2018. It is now read-only.

Dep.path on a file inside a symlinked directory doesn't work #25

Open
chenglou opened this issue May 30, 2016 · 27 comments
Open

Dep.path on a file inside a symlinked directory doesn't work #25

chenglou opened this issue May 30, 2016 · 27 comments

Comments

@chenglou
Copy link

Say I have top/foo/a.txt where foo/ is symlinked, creating a dependency on the path fails:

*** jenga: ERROR: top/foo/a.txt: No directory for target: not a directory

Discussion:

Hmm. It looks like jenga tries to ensure that foo is a directory before reading it. Really it should only do this if there is a rule for foo/a.txt not if it is a source file.

cc @lpw25

@jordwalke
Copy link

Thank you for any help you can provide on this issue. We chose to pursue Jenga because (among other things) it handles symlinks in every other respect aside from this issue.

@v-gb
Copy link

v-gb commented May 31, 2016

We're looking at a fix. If this is blocking you, I suppose changing this code from fs.ml:

let ensure_directory ~dir =
  stat ~follow_symlinks:false dir >>= function ...

to ~follow_symlinks:true should do the trick.

@jordwalke
Copy link

Thank you @sliquister. We will try this.

@chenglou
Copy link
Author

Tested and works. Thanks @sliquister! I guess I'll keep this issue open since maintaining a fork with a single change doesn't solve it for others =).

@jordwalke
Copy link

@sliquister : This is really great, thank you. Would you mind pushing an update to fs.ml and we can update our installation instructions to point to master?

@chenglou
Copy link
Author

Update: seems to only work with absolute path symlinks. If the symlink is relative, and points to somewhere above the root, we get the (somewhat expected) error:

  ("unhandled exception"
   ((src/monitor.ml.Error_
     ((exn (Failure ".. from the root!"))
      (backtrace
       ("Raised at file \"pervasives.ml\", line 30, characters 22-33"
        "Called from file \"list.ml\", line 84, characters 24-34"
        "Called from file \"lib/fs.ml\", line 590, characters 15-64"
        "Called from file \"tenacious/lib/tenacious.ml\", line 121, characters 19-25"
        "Called from file \"src/deferred0.ml\", line 64, characters 64-69"
        "Called from file \"src/job_queue.ml\", line 160, characters 6-47"
        ""))
      (monitor
       (((name try_with) (here ()) (id 4) (has_seen_error true)
         (is_detached true))))))
    ((pid 15455) (thread_id 17))))

@v-gb
Copy link

v-gb commented Jun 18, 2016

I made the change, not sure how long it takes before it shows up here. I'm guessing a few days with the bleeding edge packages, I don't know about the stabler ones.

About the symlink to outside the root, this is intentional and it's not specific to symlinks. We just don't want dependencies on the location of the repository, so the repository can be relocated without causing rebuilds.

@jordwalke
Copy link

We just don't want dependencies on the location of the repository, so the repository can be relocated without causing rebuilds.

@sliquister : I'm curious how preventing relative symlinks helps this.

Relative symlinks are really helpful because it actually allows me to check in several packages into a repo that should all interconnect via symlinks, but then also check in those symlinks into the repo (because they're relative, they won't be committed as absolute paths to my hard drive directories). That means I can create a repo that has: /packages/Package1, /packages/Package2 where Package2 depends on Package1 and I can symlink /packages/Package2/deps/P1 to point to Package1. I would normally cd into /packages/Package2 and run the build command, which has a symlink pointing outside of that build root.

This isn't a contrived use case because we actually hit this lack of support for relative symlinks while building a project that had the above configuration.

@v-gb
Copy link

v-gb commented Jun 18, 2016

To be clear, it's not relative symlinks that are problematic, it's any relative dependency, so long as they reach outside the root. Maybe avoiding rebuilds is not the right reason. Still

  • we would only reach outside the tree by mistake
  • it makes the build not self contained.

I'm confused about your example: since you have a single repository that contains /packages/Package1 and /packages/Package2, you can have the jengaroot in /packages rather than in /packages/Package2, and then the symlinks stay below the root. And in fact I don't see what the symlinks are needed for.
Something like:
/packages/.{hg|git}
/packages/jengaroot.ml
/packages/Package1/build-config
/packages/Package1/src/..
/packages/Package2/build-config
/packages/Package2/src/..

@Nick-Chapman
Copy link

@chenglou

About the failure:

 (exn (Failure ".. from the root!"))

This will come, for example, if jengaroot executes:

relative ~dir:Path.the_root ".."

The error is intended to block what we consider a probable user bug: forming a path by ".."ing w.r.t. a repo-root-relative path which reaches above the root of the repo. We did consider, but decided against, returning the absolute path. However, this appears to be exactly what you want in this situation... Good news. It's easy to get the the behaviour you want by jumping through just a tiny hoop:

let abs_root = absolute (Path.to_absolute_string Path.the_root)

relative ~dir:abs_root ".."

@jordwalke
Copy link

@sliquister

I'm confused about your example: since you have a single repository that contains /packages/Package1 and /packages/Package2, you can have the jengaroot in /packages rather than in /packages/Package2, and then the symlinks stay below the root. And in fact I don't see what the symlinks are needed for.

The reason is that each package should have its own jengaroot.

@chenglou
Copy link
Author

@Nick-Chapman I'm forming some deps through Dep.path foo where foo might be a relative symlink that points to somewhere outside of root. I turned all ~follow_symlinks in fs.ml to true and things work now. Not sure what consequence this has, beside the ones you've raised here.

@Nick-Chapman
Copy link

At the heart of the matter, jenga's support for symlinks is at best flaky. Changing [follow_links:true] just makes jenga use [stat] instead of [lstat]. This might be better for your purposes, but wont work properly in polling jenga if the link is repointed.

The proper solution requires jenga to track the dependencies at each stage of a symlink chain. I believe Leo started some work a while back to make this work properly. I'm not sure if there was something blocking its completion, or if it just dropped down the priority stack since we tend to avoid symlinks at JS.

@Nick-Chapman
Copy link

Actually it might not just be polling jenga which is broken by using stat instead of lstat.

@Nick-Chapman
Copy link

Sorry. The close issue was just my fat fingers.

@lpw25
Copy link
Member

lpw25 commented Jun 28, 2016

I think changing all follow_symlinks to true will make the symlink support much worse. There are plenty of cases where we deliberately lstat the link itself, and then follow it by hand. This means that Jenga has a record that there is a link involved and can respond to changes to this link.

@lpw25
Copy link
Member

lpw25 commented Jun 28, 2016

Leo started some work a while back to make this work properly.

Actually, Arseniy already did quite a bit of work on symlinks and the support is now pretty reasonable. This file describes the current state of symlink support.

@lpw25
Copy link
Member

lpw25 commented Jun 28, 2016

If you want to work around your issue I would suggest changing

stat (Path.relative_or_absolute ~dir link_destination)

in the uncached_stat function of Fs to be:

let abs_dir = Path.absolute (Path.to_absolute_string dir) in
stat (Path.relative_or_absolute ~dir:abs_dir link_destination)

I haven't tested this solution, but I reckon it should work.

This should force all links to be considered as pointing to an absolute path. This will allow your use case, but it means that there will be more absolute paths in the database so moving it will require more recompilation.

chenglou added a commit to chenglou/jenga that referenced this issue Jul 4, 2016
@jordwalke
Copy link

@lpw25 Is there any reason why this should not be upstreamed? If this is something that would not get in the way of your workflow, we would like to avoid maintaining a fork of jenga.

@lpw25
Copy link
Member

lpw25 commented Jul 20, 2016

It is something that would get in the way of our workflow unfortunately. We work hard to have very few absolute paths in our builds, and this would noticeably damage that. A more subtle fix, which only uses absolute paths when the path reaches outside the root, would probably be alright.

@jordwalke
Copy link

Okay, that's reasonable. I'm curious, does it get in the way of your workflow because you move directories around on disk frequently?

@lpw25
Copy link
Member

lpw25 commented Jul 20, 2016

I'm not sure about frequently, but it is useful to be able to move a directory without suddenly needing to rebuild the world. It's possible some of our CI stuff relies on it.

@aalekseyev
Copy link

Avoiding recompilation is only part of the issue. Avoiding absolute paths is necessary to be able to reproduce exactly the same build artifacts. We don't actually achieve perfect build determinism, but we like to get closer where we can.

@aalekseyev
Copy link

Actually, maybe this doesn't affect build determinism because these path never leak to build rules. Even stronger, I don't think they ever end up in the database so recompilation should not be affected either. Leo, you seem to have reached a different conclusion?

@lpw25
Copy link
Member

lpw25 commented Jul 20, 2016

Ah yes, you are right that it does not reach the database. Only the in-memory memoization tables see the path. It would still be slightly better to treat paths inside the root properly as it would mean fewer calls to stat, but it is not particularly important.

@jordwalke
Copy link

So to recap, there would be no objection to upstreaming a change that would convert to absolute paths only if the directories are symlinks?

@lpw25
Copy link
Member

lpw25 commented Jul 21, 2016

Yes I think it would be fine

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants