Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Git discard changes fails for large changesets with "ENAMETOOLONG" Fix for #65693 #66095

Merged

Conversation

@rcbevans
Copy link
Contributor

commented Jan 5, 2019

Chunk clean, checkout and update submodule commands within repository.ts to ensure the length of the files passed to the repository are less than 30k characters to avoid ENAMETOOLONG failures on Windows when working with very large changesets.

This was reported in #65693 and #23943.

The problem is there is a limitation to the maximum size of the command line arguments passed to ChildProcess spawn, and the code was calling git commands with all full file paths listed which may exceed the maximum allowable length for arguments. This would cause a git failure, and the discard would not happen.

This change chunks the clean, checkout and submodule update commands to length of the joined file paths being passed to git is less than 30k characters.

Chunk clean, checkout and update submodule commands within repository…
….ts to ensure the length of the files passed to the repository are less than 30k characters to avoid ENAMETOOLONG failures on Windows when working with very large changesets
@msftclas

This comment has been minimized.

Copy link

commented Jan 5, 2019

CLA assistant check
All CLA requirements met.

@@ -844,21 +844,70 @@ export class Repository implements Disposable {
}
});

const promises: Promise<void>[] = [];
const maxCommandLineLength: number = 30000;

This comment has been minimized.

Copy link
@rcbevans

rcbevans Jan 5, 2019

Author Contributor

The limit on windows seems to be around 31k characters, so for now I chose 30k as a nice round number. The total length of the command line arguments will be something over the length of the joined files passed to the repository command based on other arguments added, such as checkout -q -- <files> in the case of the git repository.

Unix systems can potentially support much larger command line lengths, so this chunking size may be unnecessarily small for such systems, so it may be better to check the actual allowable chunk size for the system being ran on, than to use a hardcoded value such as this. xargs --show-limits can report the upper limit on command length on a unix, or most unix-like systems.

This comment has been minimized.

Copy link
@adamtajti

adamtajti Mar 4, 2019

Contributor

As a note, in the issue you mentioned that

As a first attempt, the code could take 4096 as an upper limit, which is the minimum allowable max length of a unix system, as reported by xargs, and is the smallest common denominator?

So if someone has that configured we could still get another Issue for vscode about the 30k not being small enough. This limitation is quite frustrating. I would love to know why they have these constraints in place.

@@ -844,21 +844,70 @@ export class Repository implements Disposable {
}
});

const promises: Promise<void>[] = [];

This comment has been minimized.

Copy link
@rcbevans

rcbevans Jan 5, 2019

Author Contributor

It appears that in the current code only one git operation was ever being done on a call to this function. With my changes producing multiple checkout operations for example, I was seeing failures because of the promises executing in parallel and fighting for index locks. Awaiting each chunk of work means they will execute sequentially, avoiding contention for the index lock.

@joaomoreno joaomoreno added this to the Backlog milestone Jan 7, 2019

@joaomoreno joaomoreno added the git label Jan 7, 2019


if (toClean.length > 0) {
promises.push(this.repository.clean(toClean));
let sliceStart: number = 0;

This comment has been minimized.

Copy link
@adamtajti

adamtajti Mar 4, 2019

Contributor

These if sections are quite similar in terms of logic. I think that the common slicing part should be extracted into a private method / utility function outside of clean. It could be useful for other repository methods in the future as well.

@joaomoreno

This comment has been minimized.

Copy link
Member

commented Aug 6, 2019

Lots of cleanup here:

  • Extracted the chunk logic into a splitInChunks() generator function in util.ts
  • Added tests for it
  • Moved the call to splitInChunk one level down, into git.ts
  • Added a limiter login on top of clean since that command might spawn too many processes (contention) yet we might not want to serialize all those spawns (too slow)

Overall, great work and thanks for the PR! 🍻

@joaomoreno joaomoreno modified the milestones: Backlog, August 2019 Aug 6, 2019

@joaomoreno joaomoreno merged commit fc488dd into microsoft:master Aug 6, 2019

2 checks passed

VS Code #20190105.11 succeeded
Details
license/cla All CLA requirements met.
Details
joaomoreno added a commit that referenced this pull request Aug 6, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.