Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

canonicalizePath is no longer canonical for directories (trailing slashes) #63

Closed
dcoutts opened this issue Nov 26, 2016 · 11 comments
Closed
Assignees
Labels
type: a-bug The described behavior is not working as intended.
Milestone

Comments

@dcoutts
Copy link
Contributor

dcoutts commented Nov 26, 2016

It used to be the case that canonicalizePath for directories would return the directory without the trailing slash, irrespective of whether the directory was given with or without the trailing slash. As of some version, but certainly including 1.2.6 and still in 1.2.7.1 the result sometimes includes a trailing slash and sometimes does not. This violates the property that the result be canonical.

$ ghci -package directory-1.2.7.1
> :m System.Directory
> canonicalizePath "./"
"/home/duncan/"
> canonicalizePath "."
"/home/duncan"
> canonicalizePath "./foo/"
"/home/duncan/foo/"
> canonicalizePath "./foo"
"/home/duncan/foo"

This canonical property is important, lots of code relies on it. As an example of the kind of bug it causes:

cabal new-build ./
Unknown build target './'.
The project has no package directory './'.
Perhaps you meant the package directory './'?

This happens because this code relies on the canonical property to match up packages that are known to live at various locations with a directory given by the user. Since one code path happens to call canonicalizePath with a trailing slash and another without then there is no match and the confusing error.

@dcoutts dcoutts added the type: a-bug The described behavior is not working as intended. label Nov 26, 2016
@dcoutts
Copy link
Contributor Author

dcoutts commented Nov 26, 2016

Possibly related to #42

dcoutts added a commit to dcoutts/cabal that referenced this issue Nov 27, 2016
It is not actually canonical for directories with vs without a trailing
slash. This bug is present in newer directory releases, older versions
work correctly.

Filed as haskell/directory#63
dcoutts added a commit to dcoutts/cabal that referenced this issue Nov 27, 2016
It is not actually canonical for directories with vs without a trailing
slash. This bug is present in newer directory releases, older versions
work correctly.

Filed as haskell/directory#63
@Rufflewind
Copy link
Member

Rufflewind commented Nov 27, 2016

Sorry, I don't think there's a sensible way to fix this. The idea of “canonicalization” has always been quite vague and there does not seem to be an agreement on whether trailing slashes carry significant meaning.

The current behavior aligns with that of directory 1.2.2.1 or older. Between 1.2.3.0 and 1.2.5.0 the behavior was different, which was an unintentional result (#42) of fixing #23. The current behavior is now fully documented:

Similar to normalise, passing an empty path is equivalent to passing the current directory. The function preserves the presence or absence of the trailing path separator unless the path refers to the root directory /.

I suggest running normalise dropTrailingPathSeparator on it if you don't want the trailing slash.

@dcoutts
Copy link
Contributor Author

dcoutts commented Nov 29, 2016

Sorry, I don't think there's a sensible way to fix this.

Surely it's just a matter of integrating dropTrailingPathSeparator, no?

The idea of “canonicalization” has always been quite vague

The canonicalization is not that vague. Everyone accepts that file systems have multiple names that refer to the same file system object (hard links etc), but programmers and users have reasonable expectations about things that are really the same name.

there does not seem to be an agreement on whether trailing slashes carry significant meaning

There doesn't need to be any such agreement. It's clearly not canonical to preserve them. They should either always be there or never (the old behaviour was never). I'm fully aware about the vagaries of whether trailing slashes should carry meaning or not, but that really ought to be irrelevant for a function that is there to produce a canonical result.

The current behavior aligns with that of directory 1.2.2.1 or older.

No it does not. In particular the following versions have the correct behaviour (ie same result for input paths given with or without trailing slash):

  • 1.2.2.0 (shipped with ghc-7.10.3)
  • 1.2.1.0 (shipped with ghc-7.8.4)
  • 1.2.0.1 (shipped with ghc-7.6.3)
  • 1.1.0.2 (shipped with ghc-7.4.1)
  • 1.0.0.0 (shipped with ghc-6.8.2)

It is only the version shipped with ghc-8.0.1 that has the new incorrect behaviour of non-canonical results.

So it's quite clear the old behaviour is possible, and well established.

We need canonical results. Lots of apps rely on this. We need a form where we can compare strings and have those refer to the same name. To the best of my knowledge the old canonicalizePath was always fine for this. Sure being able to use it on paths that don't even exist is a bonus and there best effort is ok, but for ones that do exist it seems reasonably clear that there's a sensible meaning: and for trailing slashes that's plain and obvious.

It's not ok to just say "oh the behaviour changed, please add a call to dropTrailingPathSeparator", when there's no good reason for the change and particularly because it's a silent behaviour change that subtly breaks an important property. If we really thought that it was unsustainable to have a function that tries to produce canonical names then it'd be fine to add a few replacements and deprecate the old one. But I've not yet seen any argument that this is the situation here.

@Rufflewind
Copy link
Member

No it does not. In particular the following versions have the correct behaviour (ie same result for input paths given with or without trailing slash):

You're right. The quirky behavior did not exist prior to 1.2.2.1. Sorry, I did not verify the old behavior (I took #42 for granted). My guess is that the quirk had existed only on Windows systems.

In that case I will revert the fix for #42 and bump the major version.

@dcoutts
Copy link
Contributor Author

dcoutts commented Nov 29, 2016

Ok great, but presumably we don't want to re-introduce the odd behaviour described in #42. It should still be the case that canonicalizePath "." returns a path without the trailing slash.

BTW, I noticed this in the implementation of prependCurrentDirectory which is used by canonicalizePath:

      | isRelative path ->
          (</> path) . addTrailingPathSeparator <$> getCurrentDirectory

Specifically, addTrailingPathSeparator ought to be redundant there. The </> will add it if necessary. Or am I missing something?

@Rufflewind
Copy link
Member

Rufflewind commented Nov 29, 2016

Specifically, addTrailingPathSeparator ought to be redundant there. The </> will add it if necessary. Or am I missing something?

It used to matter, but now it's redundant because of the case path of { "" -> …; _ -> … }.

@Rufflewind Rufflewind added this to the 1.3.0.0 milestone Nov 29, 2016
@Rufflewind Rufflewind self-assigned this Nov 29, 2016
@Rufflewind
Copy link
Member

Let me know if Rufflewind@43488ba fixes it. In particular, the tests have been updated to reflect the change in behavior.

dcoutts added a commit to dcoutts/cabal that referenced this issue Dec 1, 2016
It is not actually canonical for directories with vs without a trailing
slash. This bug is present in newer directory releases, older versions
work correctly.

Filed as haskell/directory#63
@bgamari
Copy link
Contributor

bgamari commented Dec 2, 2016

@dcoutts any update on this? I'd like to be able to get a new release candidate out soon.

Rufflewind added a commit to Rufflewind/directory that referenced this issue Dec 3, 2016
After discussion with Duncan Coutts, it was found that the trailing
slash-preserving behavior was actually a bug on Windows.  This means
there is really no reason for the current, somewhat quirky behavior of
preserving trailing slashes.  However, it has been a while since the
change was made, so it would be safer to introduce this as a major
version bump.

The internal prependCurrentDirectory function has been reworked slightly
with regards to the behavior on empty paths, but this not have any
visible effect on the public API since they always end up normalizing
the result of prependCurrentDirectory in some way or another.

Fixes haskell#63.
@Rufflewind
Copy link
Member

The bug has been fixed (Rufflewind@43488ba), but I wanted to use the opportunity to also fix some other deficiencies with canonicalizePath's behavior on Windows (letter case normalization and symbolic link dereferencing). I hope to have it done before the weekend is over.

@bgamari
Copy link
Contributor

bgamari commented Dec 6, 2016

@Rufflewind, do you think we could just move ahead with a release so that I can push another release candidate out?

@Rufflewind
Copy link
Member

@bgamari 1.3.0.0 is out now. The fix for #64 is not included but hopefully that's not a major problem.

dcoutts added a commit to dcoutts/cabal that referenced this issue Dec 19, 2016
It is not actually canonical for directories with vs without a trailing
slash. This bug is present in newer directory releases, older versions
work correctly.

Filed as haskell/directory#63
dcoutts added a commit to dcoutts/cabal that referenced this issue Mar 7, 2017
It is not actually canonical for directories with vs without a trailing
slash. This bug is present in newer directory releases, older versions
work correctly.

Filed as haskell/directory#63
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: a-bug The described behavior is not working as intended.
Projects
None yet
Development

No branches or pull requests

3 participants