Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Partially revert glob support to Windows-only fallback #229

Merged
merged 3 commits into from Apr 24, 2023

Conversation

prettybits
Copy link
Contributor

This implements the approach discussed in #227 to remedy the reported regression by reverting to treating arguments as literal paths first and only on Windows attempting to match glob patterns when no direct match has been found.

I decided to implement the Windows-only behavior by checking against runtime.GOOS which is statically compiled into the executable as well and made more sense to me in this context than splitting this up across files. If you prefer I can certainly do that too. Also, to reduce the cases where a matching is attempted a bit I added a check for the relevant characters that could mark a valid pattern in the name.

I retained the changed matches/match variable names from #225 (which this PR subsumes) as you indicated you were fine with it, hope I got that right.

As already discussed, this won't solve all of the potentially surprising cases but will help for the most urgent and reported ones, we can keep further discussion for those in the original issue or open a separate one as you prefer. Seeing sul-dlss/technical-metadata-service#437 (comment) I'm slightly worried that there could already be more users that adjusted their integrations to shell-escape the arguments which they will have to revert should this PR get merged and eventually released.

Finally, I was wondering about whether the *coe "continue on errors" flag should have an effect on error handling across arguments and matches in addition to just the walk-function? I didn't add anything in this direction here, as this will be better discussed separately, just leaving this here for now while it's fresh in my head.

The support for glob arguments introduced in v1.10.0 led to regressions reported in richardlehane#227.

Special characters used for glob patterns are also allowed in filenames on Linux and to a reduced extent in Windows, consequently such files couldn't be matched directly anymore as before without escaping.

This switches the treatment of arguments back to the pre-1.10 behavior of doing a direct match first. Glob support is retained as an attempted fallback on Windows when no direct match could be found.
Copy link
Owner

@richardlehane richardlehane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you move the os.Lstat call within the "if runtime.GOOS = "windows"" block so that linux users avoid the extra syscall?

@richardlehane
Copy link
Owner

thanks Bernhard!

@richardlehane
Copy link
Owner

richardlehane commented Apr 19, 2023

can you move the os.Lstat call within the "if runtime.GOOS = "windows"" block so that linux users avoid the extra syscall?

just realised os.Lstat may be insufficient to test existence of files on windows due to the long path name issue, see e.g. this old issue #58. I.e. if the given path is too long then a normal lstat will fail even if the path exists

suggest adding a new exists(path) bool function in the longpath and longpath_windows files. This would do a lstat on unix.
On windows it would do lstat then the retrystat func defined in the same file.

You could then do if runtime.GOOS = "windows" && string.Contains("v", ....) && !exists(v) {...} else {...}

@prettybits
Copy link
Contributor Author

prettybits commented Apr 20, 2023

can you move the os.Lstat call within the "if runtime.GOOS = "windows"" block so that linux users avoid the extra syscall?

Good call, totally overlooked that, done.

just realised os.Lstat may be insufficient to test existence of files on windows due to the long path name issue, see e.g. this old issue #58. I.e. if the given path is too long then a normal lstat will fail even if the path exists

Right, I took the opportunity to follow the trail a bit and found that Go gradually added longpath handling since 1.8 and the current state even builds the native support available since Windows 10, Version 1607 into compiled Windows executables. At the same time, the handling implemented still leaves UNC paths untouched and doesn't handle relative paths, which makes understanding what the right thing to do really is quite confusing to me.

suggest adding a new exists(path) bool function in the longpath and longpath_windows files. This would do a lstat on unix.
On windows it would do lstat then the retrystat func defined in the same file.

Given that the native longpath handling is incomplete I think this is probably still a good thing to do, thanks for the suggestion! I decided to implement this as tryStat(path string) error so that I have access to the *PathError and can display the correct error message. I thought it a good idea to wrap that error in a walkError, so that the messages are consistent.

Looking at the error display there I notice that the messages are somewhat redundant:

[FATAL] file access error for example- [1.epub: lstat example- [1.epub: no such file or directory

Do you think it might a good idea to do a errors.Unwrap(we.err) to not show the name twice?

@mjgiarlo
Copy link

@prettybits 💬

Seeing sul-dlss/technical-metadata-service#437 (comment) I'm slightly worried that there could already be more users that adjusted their integrations to shell-escape the arguments which they will have to revert should this PR get merged and eventually released.

👋🏻 Hi! In fact, I'm already looking at reverting ☝🏻 change as it broke some other stuff (unrelated to sf). We are eagerly interested in an sf patch as soon as that's available.

@richardlehane
Copy link
Owner

@prettybits 💬

Seeing sul-dlss/technical-metadata-service#437 (comment) I'm slightly worried that there could already be more users that adjusted their integrations to shell-escape the arguments which they will have to revert should this PR get merged and eventually released.

👋🏻 Hi! In fact, I'm already looking at reverting ☝🏻 change as it broke some other stuff (unrelated to sf). We are eagerly interested in an sf patch as soon as that's available.

I'll try to get a point release out over the weekend incorporating this fix + fixes for a few other things 1.10 broke (e.g. the deb packages)

@mjgiarlo
Copy link

@richardlehane Much obliged!

@richardlehane richardlehane merged commit c1aa32b into richardlehane:develop Apr 24, 2023
@prettybits prettybits deleted the glob-regression-fix branch April 24, 2023 11:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants