Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SF fails ungracefully meeting Windows path length limit 260+ #58

Closed
ross-spencer opened this issue Nov 16, 2015 · 15 comments
Closed

SF fails ungracefully meeting Windows path length limit 260+ #58

ross-spencer opened this issue Nov 16, 2015 · 15 comments
Assignees
Labels
Milestone

Comments

@ross-spencer
Copy link
Collaborator

Given a structure; /e/DC_Spencer/folder-test/ab/cdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopq

With a file in it: abcd.txt

SF will:

panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xc0000005 code=0x0 addr=0x14 pc=0x41087e]

goroutine 1 [running]:
main.multiIdentifyS.func1(0x12938c30, 0xeb, 0x0, 0x0, 0x330602e0, 0x1292c0a0, 0x0, 0x0)
C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:123 +0x1be
path/filepath.walk(0x129384b0, 0xe2, 0xca4558, 0x12d541e0, 0x12921c38, 0x0, 0x0)
c:/go/src/path/filepath/path.go:370 +0x33d
path/filepath.walk(0x128c21a0, 0x2, 0xca4558, 0x12d540f0, 0x12921c38, 0x0, 0x0)
c:/go/src/path/filepath/path.go:374 +0x3f9
path/filepath.Walk(0x128c21a0, 0x2, 0x12921c38, 0x0, 0x0)
c:/go/src/path/filepath/path.go:396 +0xb7
main.multiIdentifyS(0x330601c8, 0x12d76008, 0x12d54050, 0x128c21a0, 0x2, 0x0, 0x0, 0x0)
C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:135 +0x87
main.main()
C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:344 +0x1e1e

If you shorten that path by removing the 'ab' directory then it works. Length 258 characters vs. 260.

Max path length specified here: https://msdn.microsoft.com/en-nz/library/windows/desktop/aa365247(v=vs.85).aspx

We have a real example I can't provide as its structure describes the content of the un-accessioned files we're working with.

Directory lengths like this can be created by transferring from Linux using Rsync or in Windows by using Cygwin to mkdir.

@richardlehane
Copy link
Owner

Thanks for reporting Ross.

There's an unchecked error there which I can tidy up to avoid the panic. But the underlying error may be difficult to address, it looks to be this: golang/go#3358

@ross-spencer
Copy link
Collaborator Author

Feedback from the error will help others and prevent them from having to chase it down. It seems we'll just have to work around the other issue outside of SF.

(Unless there was a way to simulate jumping into each directory one by one, referencing the next relatively instead of absolutely to ask isDir() so that we're never asking the call to look at a directory path as long? - I would have to understand what the SDK is doing better - I imagine the Go code base would already have considered that if possible)

@richardlehane
Copy link
Owner

there should be a workaround ... the docker devs ran into this and made this pkg: https://github.com/docker/docker/tree/master/pkg/longpath Can probably do something similar

@richardlehane richardlehane added this to the v1.4.2 milestone Nov 17, 2015
@richardlehane richardlehane self-assigned this Nov 17, 2015
@richardlehane
Copy link
Owner

fixed 06f73f3

(on os errors: try again with longpath code from docker repo)

@ross-spencer
Copy link
Collaborator Author

Tested, and the it works:

filename : 'folder-test\cdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuv

wxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopq\test.txt'

I'm wondering what the impact is on that tip in your SF recipes for piping a list of files to STD out. Without the drive letter does it reduce the number of potential uses like that in Windows?

@richardlehane
Copy link
Owner

hmmm... this should be fine so long as you don't change directories between commands I think? I.e. if it is a relative path and you are still in the right directory then that relative path will still resolve.

I was thinking about this the other day: is it better to return the filenames as passed to sf (relative or absolute) or should filenames all get turned into absolute paths (with drive letter etc.).

At the moment users get control over what kind of filenames get returned.

E.g. say I am in my desktop path I can either do:

sf cool_file.txt

and "filename: cool_file.txt" comes back
Or:

sf c:\Users\richardl\Desktop\cool_file.txt

and "filename: c:\Users\richardl\Desktop\cool_file.txt" comes back
Or even:

sf %cd%\cool_file.txt

which also gives "filename: c:\Users\richardl\Desktop\cool_file.txt"

The advantage of the way it presently is is that there might be legitimate reasons not to show the file system context your files are in (e.g. because you are processing a consignment on a lab PC and the meaningful context is the relationships between the files in a consignment folder and not the position of the consignment folder on the lab PC).

Disadvantage is that many users may need absolute paths for further processing work but not want to jump through hoops like typing full paths or using that %cd% trick to get there.

Options:

  • leave as is (but maybe add sf %cd%\file.txt as another tip and trick)
  • always report absolute filenames in results
  • return file URIs instead
  • or leave mostly as is, but change -log known | unknown to return absolute paths

@richardlehane
Copy link
Owner

OK ... I combined options 1) and 4).

Have added a tip and trick (https://github.com/richardlehane/siegfried/wiki/Tips-and-Tricks)

Also changed the logging output so the logging features always return absolute paths 7eaa3d0

@ross-spencer
Copy link
Collaborator Author

Thanks Richard. I wouldn't have been able to answer straight off. I think I will need to work through it with SF a little more and understand where the impacts are/how to work with it. I'm sure a combination of one and four will work well.

@ross-spencer
Copy link
Collaborator Author

I'm seeing that long path syntax in output now, e.g.

 \\?\E:\digital_archeology\sf\...\...\...

Should (can?) SF maintain a record of the original path before the IsFolder() check?

@richardlehane
Copy link
Owner

Hi Ross
Where you see the long path in output ... are you only seeing it for long file paths, or everywhere? This should relate to a second fix I applied to recover from long directory paths (the first fix only worked for file paths). It may be a little tricky to restore original paths post this fix but I will take a look

@ross-spencer
Copy link
Collaborator Author

In this case it seems to be in the final report (DROID output from SF). As
I haven't been able to test this more rigorously this week, I am happy to
rebuild SF at the beginning of next week and provide you with more
information. It doesn't seem to be a long path,
e.g. ?\E:\digital_archeology\sf###\ARCHIVES\REVIEWS##################92.DOC

On Thu, Dec 10, 2015 at 2:10 PM, Richard Lehane notifications@github.com
wrote:

Hi Ross
Where you see the long path in output ... are you only seeing it for long
file paths, or everywhere? This should relate to a second fix I applied to
recover from long directory paths (the first fix only worked for file
paths). It may be a little tricky to restore original paths post this fix
but I will take a look


Reply to this email directly or view it on GitHub
#58 (comment)
.

@richardlehane
Copy link
Owner

You might get this for a short path simply because sf will try a long path as a second attempt when it encounters any error. If the original error was something for which a simple retry might work (e.g. bandwidth related), then you may get long paths in your results. This would especially show up where you are throttling - and a longer throttle may eliminate it.

A fix might be to try to identify the exact error codes for path length errors and only try the longpath fix for those. In your case, this would mean you'd get a fatal error here instead (because the retry would never have happened). Or could keep the retry with long paths for any error, since it seems to have an unplanned nice side effect of recovering from other kinds of failures!, but add a new function to restore short paths.

@richardlehane
Copy link
Owner

Hi Ross
I pushed a change on develop branch to try to restore original paths if the long path fix is applied.
I may work on it further as it is all a bit of a hairy mess now but would be interested to hear if it resolves the latest issues you've been encountering.

@richardlehane richardlehane reopened this Dec 11, 2015
@ross-spencer
Copy link
Collaborator Author

Hi Richard.

I haven't been able to re-create this reliably since my return. I have been able to run the develop branch fix on the same source files and the files-paths with ?\ in them look correct, that is they no longer have the ?.

I am running the same set against the master branch properly overnight and should have the results tomorrow.

Just to keep you informed.

It does look like you will be able to close this issue out.

Thanks,

Ross

@richardlehane
Copy link
Owner

thx Ross - happy to reopen it if it recurs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants