Do not attempt to find an executable if a script is referenced #216

Merged
merged 3 commits into from
Jan 25, 2023

Contributor

@adinapoli adinapoli commented Mar 28, 2022

@andreasabel This is a (potentially naive) attempt at fixing #189 (and perhaps #107?).

This was the result of a quick hacking session so I am sure the story is more complicated than this, and perhaps we should also have @gregwebs chime in on this one.

I am not sure why the switch from system-filepath to filepath triggered it, but the crux of the issue is that unless shelly is run with escaping False, the input FilePath fed into run & co will be "escaped", which in this context means calling the runCommand function, which will attempt to find an executable on the PATH.

This works most of the time, but if we have something like ./foo/bar/baz.sh, this won't work, and Shelly will complain.

The easiest fix I could think of was to simply check if we had a FilePath commencing with . and, in this case, we don't attempt to find an executable, but we run the command as-if we used escaping False.

I have also added a regression test, to catch this in the future.

Thoughts?

EDIT: For visibility, here is the postmortem copied from an inline comment:

I have finally nailed down exactly why the regression happened. We have this code, which is responsible for locating the file, in the guts of whichEith:

    whichFull fp = do
      (trace . mappend "which " . toTextIgnore) fp >> whichUntraced
      where
        whichUntraced | isAbsolute fp          = checkFile
                      | dotSlash splitOnDirs   = checkFile
                      | length splitOnDirs > 0 = lookupPath  >>= leftPathError
                      | otherwise              = lookupCache >>= leftPathError

        splitOnDirs = splitDirectories fp
        dotSlash ("./":_) = True
        dotSlash _ = False

As you can see, we are already accounting for dotSlash, so why do things fail? Well, it turns out that both system-filepath and filepath have a splitDirectories function, but they behave differently:

    > import qualified Filesystem.Path.CurrentOS as SFP
    > import qualified Data.Text as T
    > import System.FilePath as FP
    > FP.splitDirectories "./test/data/hello.sh"
    [".","test","data","hello.sh"]
    > SFP.splitDirectories (SFP.fromText $ T.pack "./test/data/hello.sh")
    [FilePath "./",FilePath "test/",FilePath "data/",FilePath "hello.sh"]

As you can see the former doesn't include the "./", so the pattern match on dotSlash fails, and here is our regression.
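
To make the mismatch concrete, here is a minimal sketch that runs both patterns against the output of filepath's `splitDirectories`. `dotSlashOld` mirrors the `dotSlash` pattern quoted above and `startsWithDot` is the name used later in this review; `comparePredicates` is a hypothetical helper added only for illustration, and `System.FilePath.Posix` is used so the result does not depend on the host platform.

```haskell
import qualified System.FilePath.Posix as FP

-- Pattern written for system-filepath, whose first component is "./".
dotSlashOld :: [FilePath] -> Bool
dotSlashOld ("./" : _) = True
dotSlashOld _          = False

-- Pattern matching what filepath's splitDirectories yields: "." first.
startsWithDot :: [FilePath] -> Bool
startsWithDot ("." : _) = True
startsWithDot _         = False

-- Hypothetical helper: evaluate both predicates on the same split path.
comparePredicates :: FilePath -> (Bool, Bool)
comparePredicates fp = (dotSlashOld parts, startsWithDot parts)
  where
    parts = FP.splitDirectories fp
```

With filepath, `comparePredicates "./test/data/hello.sh"` evaluates to `(False, True)`: the old pattern no longer fires, while matching on `"."` does.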

@andreasabel
Collaborator

CI reports some broken tests under Linux:

Failures:

  test/src/FindSpec.hs:114:22: 
  1) find follow symlinks
       expected: ["dir", "dir/symlinked_dir", "dir/symlinked_dir/hoge_file", "hello.sh", "nonascii.txt", "symlinked_dir", "symlinked_dir/hoge_file", "zshrc"]
        but got: ["dir", "dir/symlinked_dir", "dir/symlinked_dir/hoge_file", "nonascii.txt", "symlinked_dir", "symlinked_dir/hoge_file", "zshrc"]

  To rerun use: --match "/find/follow symlinks/"

  test/src/FindSpec.hs:129:22: 
  2) find not follow symlinks
       expected: ["dir", "dir/symlinked_dir", "hello.sh", "nonascii.txt", "symlinked_dir", "symlinked_dir/hoge_file", "zshrc"]
        but got: ["dir", "dir/symlinked_dir", "nonascii.txt", "symlinked_dir", "symlinked_dir/hoge_file", "zshrc"]

  To rerun use: --match "/find/not follow symlinks/"

  test/src/RunSpec.hs:31:5: 
  3) run script at $PWD
       uncaught exception: ReThrownException
       
       Ran commands: 
       ./test/data/hello.sh
       
       Exception: ./test/data/hello.sh: createProcess: exec: invalid argument (Bad file descriptor)

  To rerun use: --match "/run/script at $PWD/"

Could you please have a look at what causes these and how to update your PR accordingly, @adinapoli ?

@adinapoli
Contributor Author

adinapoli commented Mar 28, 2022

In a lucky turn of events I will be working remotely from Rome and my digital nomad workstation is a Linux machine (as opposed to my main Mac mini), so I can definitely look into this one later this week.

Thanks for the heads up!

@gregwebs
Owner

Thanks for adding a test case. With enough testing we can make a change.

@adinapoli adinapoli force-pushed the adinapoli/filepath-regression branch 3 times, most recently from c1dae8b to 51b2fa1, April 1, 2022 06:14
@adinapoli
Contributor Author

@andreasabel @gregwebs Ok, I think we should be in business now. Two failures were due to the fact that I had forgotten to add hello.sh to the data files ( 🤦 ), whereas the run failure was due to the fact that CI didn't have the +x permission set for hello.sh to execute it.
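
The thread does not say how the missing +x bit was restored. As one possible defensive measure, a test setup could set the owner-executable bit itself before running the script. The sketch below assumes the `directory` package; `ensureExecutable` is a hypothetical helper, not something taken from this PR.

```haskell
import qualified System.Directory as Dir

-- Hypothetical helper: set the owner-executable bit on a file,
-- leaving its other permissions as they were.
ensureExecutable :: FilePath -> IO ()
ensureExecutable path = do
  perms <- Dir.getPermissions path
  Dir.setPermissions path (Dir.setOwnerExecutable True perms)
```

A test could call `ensureExecutable "test/data/hello.sh"` before invoking the script, although keeping the executable bit in version control is arguably the simpler fix.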

As far as windows testing goes, obviously running that bash script doesn't make sense, so I have guarded the test using isWindows. I am not sure what the best course of action is here.

src/Shelly.hs Outdated
-- it would be better to specifically detect that bug
= case fp of
-- If the 'FilePath' contains a more articulated path than just
-- an executable, don't try to find it.
Collaborator

Should this rather check whether there is a directory component in exe? What about bla/hello.sh instead of ./hello.sh? Currently you would accept e.g. .bla/hello.sh or .hello.sh.
I think checking for starting with dot isn't the correct criterion.
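
The difference between the two criteria can be sketched as follows. This is only an illustration of the reviewer's point, not the code that was eventually merged; both function names are hypothetical, and `System.FilePath.Posix` is assumed so that `/` is the only separator.

```haskell
import qualified System.FilePath.Posix as FP

-- Naive criterion from the first draft: the path text begins with a dot.
startsWithDotChar :: FilePath -> Bool
startsWithDotChar fp = take 1 fp == "."

-- Alternative criterion: the path has a directory part, i.e. it is
-- more than a bare file name.
hasDirectoryComponent :: FilePath -> Bool
hasDirectoryComponent fp = FP.takeFileName fp /= fp
```

Under these definitions `bla/hello.sh` has a directory component but does not start with a dot, while `.hello.sh` starts with a dot but is a bare file name. Those are exactly the two cases the naive check gets wrong.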

Contributor Author

Yes, I take your point this is too simplistic. I will have to think about this a little bit.

Contributor Author

@andreasabel Aha! After a more thorough look, I have finally nailed down exactly why the regression happened. We have this code, which is responsible for locating the file, in the guts of whichEith:

    whichFull fp = do
      (trace . mappend "which " . toTextIgnore) fp >> whichUntraced
      where
        whichUntraced | isAbsolute fp          = checkFile
                      | dotSlash splitOnDirs   = checkFile
                      | length splitOnDirs > 0 = lookupPath  >>= leftPathError
                      | otherwise              = lookupCache >>= leftPathError

        splitOnDirs = splitDirectories fp
        dotSlash ("./":_) = True
        dotSlash _ = False

As you can see, we are already accounting for dotSlash, so why do things fail? Well, it turns out that both system-filepath and filepath have a splitDirectories function, but they behave differently:

    > import qualified Filesystem.Path.CurrentOS as SFP
    > import qualified Data.Text as T
    > import System.FilePath as FP
    > FP.splitDirectories "./test/data/hello.sh"
    [".","test","data","hello.sh"]
    > SFP.splitDirectories (SFP.fromText $ T.pack "./test/data/hello.sh")
    [FilePath "./",FilePath "test/",FilePath "data/",FilePath "hello.sh"]

As you can see the former doesn't include the "./", so the pattern match on dotSlash fails, and here is our regression.

Collaborator

Excellent analysis! Thanks, @adinapoli!

@adinapoli adinapoli force-pushed the adinapoli/filepath-regression branch from 51b2fa1 to 51aff5e, April 1, 2022 10:18
src/Shelly.hs Outdated
| otherwise = lookupCache >>= leftPathError
whichUntraced | isAbsolute fp = checkFile
| startsWithDot splitOnDirs = checkFile
| length splitOnDirs > 0 = lookupPath >>= leftPathError
Collaborator

I don't understand when the guard length splitOnDirs > 0 can fail (be False)... It seems this happens only when fp is []. But this should not happen, should it? So is the last case dead?
Here are some experiments:

    ghci> :t splitDirectories
    splitDirectories :: FilePath -> [FilePath]
    ghci> splitDirectories "."
    ["."]
    ghci> splitDirectories ""
    []
    ghci> splitDirectories "/"
    ["/"]
    ghci> splitDirectories "//"
    ["//"]
    ghci> splitDirectories "///"
    ["///"]
    ghci> splitDirectories "/a"
    ["/","a"]
    ghci> splitDirectories "/a/"
    ["/","a"]
    ghci> splitDirectories "/a/b"
    ["/","a","b"]
    ghci> splitDirectories "c:/a"
    ["c:","a"]
    ghci> splitDirectories "c:a"
    ["c:a"]
    ghci> splitDirectories "c:foo.bat"
    ["c:foo.bat"]
    ghci> splitDirectories "c:\\foo.bat"
    ["c:\\foo.bat"]
    ghci> splitDirectories "c:\\\\foo.bat"
    ["c:\\\\foo.bat"]
    ghci> splitDirectories "c:/foo.bat"
    ["c:","foo.bat"]

Contributor Author

@adinapoli adinapoli Apr 4, 2022

Which output of splitDirectories am I looking at? the one from system-filepath or filepath? 😅 Hard to tell why that guard was added, as it's alas not documented -- possibly some weird corner case?

Collaborator

Oh sorry, this is the one of filepath which is used in latest shelly. But the system-filepath one is not much different.
@gregwebs : Can you explain the mystery of the length splitOnDirs > 0 guard? You introduced it in d441583#r70391227

Collaborator

Greg agrees that the guard length splitOnDirs > 0 is unnecessary (and the clause below can be dropped).

Contributor Author

Excellent! I will amend this PR with this and the other changes between today and tomorrow. Thanks!

src/Shelly.hs Outdated
-- function, but it returns \"./\" as its first argument,
-- so we pattern match on both for backward-compatibility.
startsWithDot (".":_) = True
startsWithDot ("./":_) = True
Collaborator

Does it actually make sense to keep the old test for backwards-compatibility? I mean, System.FilePath.splitDirectories will never return a list that contains "./".

Contributor Author

It does look harmless -- the only reason why I have kept it was mostly for documenting purposes and to defend ourselves from draconian cases where people would attempt Frankenstein builds that use newer Shelly-s but system-filepath (assuming that is possible). I suppose it gives us a bit more flexibility at very little cost, but I am also happy to get rid of the redundant test 😉

Collaborator

Don't get me wrong: I am very much for documenting the why and also the history.
However, an impossible case shouldn't be there, because it will create insecurity in the reader of the code. It will actually suggest that the case is possible and should be handled. If you want to keep it in, it should be

startsWithDot ("./":_) = undefined -- this case is impossible

but I think this isn't necessary.
I'd keep references to previous behavior in the comment.

@adinapoli
Contributor Author

> Ok, we do not have to improve on Shelly in this PR, just restore its old behavior...

@andreasabel hehe yes, my temptation would be to eschew any temptation to improve Shelly as part of this PR and instead deliver the smallest fix that restores the old behaviour, and perhaps we flag all this low-hanging fruit in a ticket, as those sound very inviting for newcomers willing to contribute to the project 😉

@adinapoli
Contributor Author

@andreasabel Ok, I have addressed all the review remarks. Reshuffling that whichFull function also had the added bonus of removing some now-redundant functions.

Something I should point out though is that, at least on my Linux machine, splitDirectories can return a length of 0, which is when the empty string is passed:

    > import System.FilePath as FP
    > FP.splitDirectories ""
    []

This is very weird, because you also ran the same test here but got something different. Having said that, the tests seem to still all pass, so perhaps that's nothing to worry about.

Furthermore, I couldn't conceivably see why somebody would try to run a command by passing the empty string as the executable (and I agree with your analysis, garbage in, garbage out). A case could be made that Shelly could get rid of these unrepresentable states if it had its own (internal) definition of something like newtype FilePath = FilePath (NonEmpty Char) but again, this might be overkill (or too simplistic) and definitely doesn't belong to this PR 😉
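
For illustration, the `NonEmpty` idea could look roughly like the sketch below. It is not part of this PR or of Shelly; the names `ExePath`, `mkExePath` and `exePathToString` are hypothetical, chosen to avoid clashing with the Prelude's `FilePath`.

```haskell
import Data.List.NonEmpty (NonEmpty, nonEmpty)
import qualified Data.List.NonEmpty as NE

-- Hypothetical wrapper: a path that is non-empty by construction.
newtype ExePath = ExePath (NonEmpty Char)

-- Smart constructor: the empty string is rejected up front, so later
-- code never has to handle it.
mkExePath :: String -> Maybe ExePath
mkExePath = fmap ExePath . nonEmpty

-- Convert back to an ordinary String, e.g. for System.FilePath functions.
exePathToString :: ExePath -> String
exePathToString (ExePath cs) = NE.toList cs
```

Here `mkExePath ""` is `Nothing`, so the "empty executable" state is ruled out before any path lookup happens. The cost is a conversion at every boundary with functions that expect a plain `String`.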

@adinapoli
Contributor Author

@andreasabel I have lost track of this -- are you folks expecting anything else from me or is this essentially good to go? Thanks! 😉

adinapoli and others added 3 commits January 24, 2023 20:34
This commit simplifies the local `whichFull` function (local to `whichEith`)
by dropping a redundant guard as well as /always/ running
`lookupPath >>= leftPathError`. Previously we had this:

```
    whichFull fp = do
      (trace . mappend "which " . toTextIgnore) fp >> whichUntraced
      where
        whichUntraced | isAbsolute fp          = checkFile
                      | dotSlash splitOnDirs   = checkFile
                      | length splitOnDirs > 0 = lookupPath  >>= leftPathError
                      | otherwise              = lookupCache >>= leftPathError
        splitOnDirs = splitDirectories fp
```

However `splitOnDirs` can never return the empty list, so that guard was
redundant as that code path was always executed, and the /otherwise/
case never executed.
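
The resulting decision can be modelled as a small pure function. This is a sketch based on the description above, not the merged code: `Strategy` and `chooseStrategy` are hypothetical names, the real function runs in Shelly's monad and performs the lookup itself, and `System.FilePath.Posix` is assumed for platform-independent results.

```haskell
import qualified System.FilePath.Posix as FP

-- Hypothetical model of the two remaining outcomes.
data Strategy = CheckFile | LookupPath
  deriving (Eq, Show)

-- Absolute paths and paths whose first component is "." are checked
-- directly; everything else is searched for on the PATH.
chooseStrategy :: FilePath -> Strategy
chooseStrategy fp
  | FP.isAbsolute fp                       = CheckFile
  | startsWithDot (FP.splitDirectories fp) = CheckFile
  | otherwise                              = LookupPath
  where
    startsWithDot ("." : _) = True
    startsWithDot _         = False
```

In this model `chooseStrategy "./test/data/hello.sh"` and `chooseStrategy "/bin/ls"` both give `CheckFile`, while a bare `ls` gives `LookupPath`.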
Collaborator

@andreasabel andreasabel left a comment

Thanks again @adinapoli and apologies for the long break.
I am planning to release this as 1.11.0 now, assuming you haven't experienced any problems with this code in the last 9 months.
I rebased this on latest master but otherwise left the commits untouched.

@andreasabel andreasabel linked an issue Jan 24, 2023 that may be closed by this pull request
@andreasabel andreasabel added this to the 1.11.0 milestone Jan 24, 2023
@adinapoli
Contributor Author

> Thanks again @adinapoli and apologies for the long break. I am planning to release this as 1.11.0 now, assuming you haven't experienced any problems with this code in the last 9 months. I rebased this on latest master but otherwise left the commits untouched.

No worries, and thanks for picking this one up 😉 In the interest of full transparency, I ended up not using this branch/commit (just yet), as I try to avoid depending on unreleased/unapproved forks/branches/patches for production code, so I just ended up downgrading to a suitable Shelly. But given how much time we spent thinking about this issue, I think it should be fine. We know precisely where and why the regression happened and we have a test to prevent that from happening again, so we should be good 😉

@andreasabel andreasabel merged commit c68825e into gregwebs:master Jan 25, 2023
@andreasabel
Collaborator

Thanks again, @adinapoli !
I pushed the button now, so you have opportunities to evaluate the fix in production now: https://hackage.haskell.org/package/shelly-1.11.0

@andreasabel andreasabel self-assigned this Jan 25, 2023
@adinapoli
Contributor Author

> Thanks again, @adinapoli ! I pushed the button now, so you have opportunities to evaluate the fix in production now: https://hackage.haskell.org/package/shelly-1.11.0

Beautiful, I'll keep you posted, thanks!


Successfully merging this pull request may close these issues.

Shelly can't run command with a relative path anymore
3 participants