Support core.longpaths on Windows #3053

Closed
8 tasks
ethomson opened this issue Apr 17, 2015 · 23 comments

Comments

@ethomson
Member

At some point we need to support the core.longpaths configuration setting and deal with Windows paths that are longer than MAX_PATH.

  • Change all the UTF8 to UTF16 code to not assume MAX_PATH
  • Change the checkout path sanitizer to (optionally) allow for long paths
  • Tests tests tests tests tests

Questions:

  • Do long paths work over SMB?

We will need copious tests here:

  • Test that we can create a folder whose absolute path would be longer than 260 characters
  • Test that a file can be created whose absolute path would be longer than 260 characters (inside a directory that is not excessively long)
  • Test a filename (just the name portion) that is itself longer than 260 characters
  • Ensure that we can't suddenly write long references
@DavidSimonsAdobe

This is my #1 request for libgit2. I'd be happy to help test it in our app. We need no interop with other git tools. All access to the private repo is via libgit2 (on mac & win). Possibly on a remote file system.

@rhuijben

Answering the question: yes, long paths work over SMB if you properly use \\?\UNC\server\share\etc...
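
For illustration, here is a minimal sketch of that prefixing. The helper name to_extended_path is hypothetical and is not a libgit2 or Win32 function; it only shows the string rewrite. It assumes the input is already an absolute, normalized, backslash-separated path, because the \\?\ prefix turns off the normalization Windows would otherwise do.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libgit2): rewrite an absolute
 * Windows path into its extended-length form.
 *   \\server\share\dir  ->  \\?\UNC\server\share\dir
 *   C:\dir\file         ->  \\?\C:\dir\file
 * Returns 0 on success, -1 if the output buffer is too small.
 */
int to_extended_path(char *out, size_t out_size, const char *path)
{
    int n;

    assert(out != NULL && path != NULL);

    if (strncmp(path, "\\\\?\\", 4) == 0)
        n = snprintf(out, out_size, "%s", path);                 /* already prefixed */
    else if (strncmp(path, "\\\\", 2) == 0)
        n = snprintf(out, out_size, "\\\\?\\UNC\\%s", path + 2); /* UNC share */
    else
        n = snprintf(out, out_size, "\\\\?\\%s", path);          /* drive path */

    /* snprintf reports the length it wanted; treat truncation as failure. */
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

A real implementation would do this on the UTF-16 string that is handed to the wide-character Win32 calls; the narrow-string version above is only meant to show the two prefix forms.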

@csware
Contributor

csware commented Dec 19, 2016

In Git for Windows they have a working workaround by having wrappers around the POSIX filesystem API (open, unlink, chmod, ...): See https://github.com/git-for-windows/git/blob/master/compat/mingw.h and https://github.com/git-for-windows/git/blob/master/compat/mingw.c; search for MAX_LONG_PATH, xutftowcs_path and xutftowcs_long_path.

@michelschep

I'm having the same issue with long file names.
With git on the command line there is no issue (thanks to "git config --system core.longpaths true").
However I also use GitKraken which depends on nodegit which depends on libgit2.
Is this issue already solved (and nodegit/GitKraken is not aware of it or not using it), is a PR required (unfortunately I'm not into C), or is a PR already in progress?

@ethomson
Member Author

No, there is no work in progress yet to deal with this.

@gw79

gw79 commented Aug 23, 2017

My company is trying to develop an internal tool to automate cloning many repositories to more easily create and configure a full system developer workspace - we've got some long path problems and having this support would make them go away and save the pain of refactoring a lot of code to reduce file paths.

Unfortunately my knowledge of C is limited to 2 terms of C++ at uni 20 years ago so I doubt my efforts at forking would be very productive ;)

@ethomson
Member Author

Indeed, this is regrettable, but this remains far from trivial.

This requires rewriting all the win32 file handling to understand long path support. And to do so efficiently. Right now we use static buffers for our utf8 to ucs2 conversion routines. Supporting long paths will require us to move to dynamically sized buffers, but that will probably charge us a malloc tax that is too onerous for us, so we'll probably end up doing something terrible like using fixed buffers when possible and move to a malloced buffer only when we need it.

The other problem is that core.longpaths is per-repository. This is fine for git, since it can only operate on one repository at a time. But libgit2 can have multiple git_repository objects in play at any given time, and each of them can have different notions of longpath support. So we need some way to switch long path support on or off on a per-repository basis. That means our POSIX emulation layer needs to have knowledge of whether long paths should be on or not. This will be insanely disruptive to the code base in some ugly way or another.
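
To make the per-repository point concrete, here is a rough sketch of a path-length check that has to be told which repository's setting applies. Everything in it is hypothetical: repo_opts is a stand-in rather than libgit2's git_repository, and the two limits are the classic MAX_PATH value and the approximate extended-length maximum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SHORT_PATH_MAX 260    /* classic MAX_PATH, including the terminator */
#define LONG_PATH_MAX  32767  /* approximate extended-length limit, in UTF-16 units */

/* Hypothetical stand-in for per-repository state. */
typedef struct {
    bool longpaths;  /* value of core.longpaths in this repository's config */
} repo_opts;

/* The limit depends on the repository, so it cannot be a global constant. */
size_t repo_path_limit(const repo_opts *opts)
{
    assert(opts != NULL);
    return opts->longpaths ? LONG_PATH_MAX : SHORT_PATH_MAX;
}

/* utf16_len excludes the terminator, which must still fit within the limit. */
bool repo_path_allowed(const repo_opts *opts, size_t utf16_len)
{
    return utf16_len < repo_path_limit(opts);
}
```

The disruptive part is visible in the signatures: every file-handling call that used to take only a path would also need the repository's options passed down to it.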

It might be interesting to consider the longpath changes in the Windows 10 anniversary update, instead of trying to emulate git's core.longpaths option.

@neuhausjulian

Hey guys, I activated this option. As far as I can tell, wingit doesn't need the "core.longpaths" attribute anymore, so this works fine.
But Gitkraken is still failing. Maybe this is the fault of Gitkraken and not libgit2, but I'd like to add my experience here to keep this topic alive :D

@ethomson
Member Author

> I don't understand why Gitkraken even lists Windows releases on their site. It's completely broken due to this bug. This is cross platform development in 2018, people.

And do a lot of Windows applications deal with longpaths properly?

@ethomson
Member Author

Because they use Git for Windows. Which is anomalous in its support for long paths in any year. This is something that almost every Windows application fails with because the standard library doesn’t support it. G4W went out of its way and you can trace its support to npm failing to understand how to build cross platform applications.

Supporting long paths is nontrivial. And even once we support them, now we have to support them. Long paths on Windows are a viper's nest of incompatibility with the rest of the system.

Yes, this is clearly something we need, but please don’t suggest it’s trivial or obvious because it’s neither.

@ethomson
Member Author

And as a reminder: this is a volunteer-run open source project. If this is something that you would like to see supported, you can help us achieve it.

@AFulgens

Just chipping in, I am currently evaluating using the new API in Windows for filenames via setting HKLM\SYSTEM\CurrentControlSet\Control\FileSystem LongPathsEnabled = true (cf. https://docs.microsoft.com/en-us/windows/desktop/FileIO/naming-a-file#maximum-path-length-limitation).

I did not test it directly with libgit2, but via Gitkraken this still does not work. The behaviour is the same as without setting the above flag, namely: Gitkraken says there is no repository at the path.

Please note for cross-compatibility: for msysgit the above flag alone is not enough either, and the repo won't work unless core.longpaths = true is set.

@ethomson
Member Author

Indeed. Path lengths are carefully tested and buffers are statically allocated for efficiency. There are two challenges here: allowing long paths if and only if core.longpaths is set, and not destroying perf with millions of tiny allocations when we do.

This is Of Interest to me, but I want to finish API stabilization and ship 1.0 before I think about tackling this.

@AFulgens

"Path lengths are carefully tested and buffers are statically allocated for efficiency. [...] not destroying perf with millions of tiny allocations when we do." → what I honestly don't get is, how is this a problem on NTFS/Windows but not on ext/btrfs/etc. on *nix?

I understand the technical challenges (NTFS/Win support is always a can of worms), I just don't understand the performance argument.

@retep998

You can do things like only heap allocating if the path is over a certain length, or you can even decide that 64KiB is a perfectly fine size for an array on the stack and not have to worry about heap allocations.

@pks-t
Member

pks-t commented Jun 21, 2019

> You can do things like only heap allocating if the path is over a certain length, or you can even decide that 64KiB is a perfectly fine size for an array on the stack and not have to worry about heap allocations.

I'd argue it's not, though. While many libc implementations have a huge default stack of multiple megabytes, more lightweight ones like musl have a default stack size of only 80k.

@ethomson
Member Author

> I understand the technical challenges (NTFS/Win support is always a can of worms), I just don't understand the performance argument.

On POSIX systems, we just hand them over a bunch of bytes that represent UTF8. On Windows, we need to do a UTF8 -> UTF16 conversion to hand them over a sequence of bytes that represent UTF16. That means we need a buffer to put it in.

At present, we can use static buffers on the stack since we know that they will never exceed 260 UTF16 characters. When we support core.longpaths we will need to revisit our entire strategy for dealing with UTF16 conversion.

> You can do things like only heap allocating if the path is over a certain length, or you can even decide that 64KiB is a perfectly fine size for an array on the stack and not have to worry about heap allocations.

Yep, I think that what probably makes sense here is to have a 260 character buffer on the stack and heap allocate if the string is larger than that. Then only people doing long paths will suffer. Or use our pool allocator.
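
A minimal sketch of that idea follows: a 260-unit buffer that lives inside the struct (so on the stack when the struct is a local), with a heap allocation only for longer strings. The names wide_path, wide_path_init and wide_path_free are hypothetical and are not libgit2 APIs. The hand-rolled decoder assumes well-formed UTF-8 and is there only to keep the sketch portable; on Windows the conversion itself would more likely go through MultiByteToWideChar.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define INLINE_UNITS 260  /* MAX_PATH-sized inline buffer */

/* Hypothetical type: ptr aims at inline_buf for short paths, at the heap otherwise. */
typedef struct {
    uint16_t *ptr;
    size_t len;                        /* UTF-16 code units, excluding the terminator */
    uint16_t inline_buf[INLINE_UNITS];
} wide_path;

/* Decode one code point and advance *s. Assumes well-formed UTF-8. */
static uint32_t next_codepoint(const unsigned char **s)
{
    const unsigned char *p = *s;
    uint32_t cp;
    int extra;

    if (p[0] < 0x80)      { cp = p[0];        extra = 0; }
    else if (p[0] < 0xE0) { cp = p[0] & 0x1F; extra = 1; }
    else if (p[0] < 0xF0) { cp = p[0] & 0x0F; extra = 2; }
    else                  { cp = p[0] & 0x07; extra = 3; }

    p++;
    /* Stop early on a non-continuation byte so we never run past the NUL. */
    while (extra-- > 0 && (*p & 0xC0) == 0x80)
        cp = (cp << 6) | (*p++ & 0x3F);

    *s = p;
    return cp;
}

int wide_path_init(wide_path *wp, const char *utf8)
{
    const unsigned char *s = (const unsigned char *)utf8;
    size_t units = 0, i = 0;

    assert(wp != NULL && utf8 != NULL);

    /* First pass: count UTF-16 units (two for code points above the BMP). */
    while (*s)
        units += next_codepoint(&s) > 0xFFFF ? 2 : 1;

    /* Short paths stay in the inline buffer; only long ones pay for malloc. */
    if (units < INLINE_UNITS)
        wp->ptr = wp->inline_buf;
    else if ((wp->ptr = malloc((units + 1) * sizeof(uint16_t))) == NULL)
        return -1;

    /* Second pass: encode, using surrogate pairs where needed. */
    s = (const unsigned char *)utf8;
    while (*s) {
        uint32_t cp = next_codepoint(&s);
        if (cp > 0xFFFF) {
            cp -= 0x10000;
            wp->ptr[i++] = (uint16_t)(0xD800 | (cp >> 10));
            wp->ptr[i++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        } else {
            wp->ptr[i++] = (uint16_t)cp;
        }
    }
    wp->ptr[i] = 0;
    wp->len = i;
    return 0;
}

void wide_path_free(wide_path *wp)
{
    if (wp->ptr != wp->inline_buf)
        free(wp->ptr);
    wp->ptr = NULL;
}
```

A caller declares a wide_path local, calls wide_path_init, uses ptr, and calls wide_path_free. For paths under 260 units the free is a no-op, so the common case never touches the allocator.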

@retep998

> I'd argue it's not, though. While many libc implementations have a huge stack of multiple megabytes by default, more lightweight ones like musl have a default stack size of 80k, only.

This is about Windows specifically where we can assume stack sizes of multiple megabytes.

So I guess the decision comes down to whether you want to heap allocate when it is over 260, or use a stack buffer all the way up to the real maximum of 64KiB.

@ItsGosho

After 4 years I still can't clone and use my repository with GitKraken, because if I clone it with gitbash or any other tool and then open it via GitKraken, it doesn't detect changes ...

@ethomson
Member Author

@ItsGosho This is not a constructive comment.

@myblindy

I just ran into this issue which brought me to this thread. No luck on figuring out long paths yet, eh?

@ethomson
Member Author

I'm locking this issue. There's work ongoing to implement this and this has just become a thread of pointless "me too!"s.

libgit2 locked and limited conversation to collaborators Mar 22, 2021
@ethomson
Member Author

ethomson commented May 6, 2021

@ianhattendorf has added longpath support, and it is now available in main. 🎉

ethomson closed this as completed May 6, 2021