Optimization of Git commit dialog finding changed files #9304

Open
OndroMih wants to merge 1 commit into apache:master from OndroMih:ondromih-git-commit-dlg-optimization

OndroMih (Contributor) commented Mar 28, 2026

Speeds up GitClient.getStatus() by deferring the expensive evaluation of object IDs, which often requires hashing file content, so that they are computed lazily, only when needed.
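
The lazy evaluation described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `LazyObjectId`, `LazyObjectIdDemo`, and the plain SHA-1 over the content (a stand-in for a real Git object ID) are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.function.Supplier;

/** Hypothetical holder that computes a content hash only on first access. */
final class LazyObjectId {
    private final Supplier<byte[]> contentLoader; // stands in for reading the file
    private byte[] cachedHash;                    // null until first requested
    private int computeCount;                     // for demonstration only

    LazyObjectId(Supplier<byte[]> contentLoader) {
        this.contentLoader = contentLoader;
    }

    /** Hashes the content on the first call and returns the cached value afterwards. */
    synchronized byte[] get() {
        if (cachedHash == null) {
            try {
                // Assumption: plain SHA-1 of the content, not Git's real blob format.
                MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
                cachedHash = sha1.digest(contentLoader.get());
                computeCount++;
            } catch (NoSuchAlgorithmException e) {
                throw new IllegalStateException(e);
            }
        }
        return cachedHash;
    }

    synchronized int computeCount() {
        return computeCount;
    }
}

final class LazyObjectIdDemo {
    /** Returns how many hashes were computed for n files when only one ID is read. */
    static int hashesComputed(int n) {
        LazyObjectId[] ids = new LazyObjectId[n];
        for (int i = 0; i < n; i++) {
            byte[] content = ("file " + i).getBytes(StandardCharsets.UTF_8);
            ids[i] = new LazyObjectId(() -> content);
        }
        ids[0].get();
        ids[0].get(); // second read is served from the cache
        int total = 0;
        for (LazyObjectId id : ids) {
            total += id.computeCount();
        }
        return total;
    }
}
```

In this sketch, creating the holders costs no hashing at all; only the entry whose ID is actually read pays for it, and only once.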

Additionally, skips the call to isEntryIgnored, which recursively scans for .gitignore files up to the root directory, when its result is not needed at all. This saves a few hundred more milliseconds.
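
A rough sketch of the idea of skipping the ignore lookup: the exact condition used in this PR is not spelled out here, so the example assumes the ignore status only matters for entries that are not tracked in the index. `IgnoreCheckDemo`, `CountingIgnoreCheck`, and the `.class` rule are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

final class IgnoreCheckDemo {
    /** Stand-in for the expensive lookup; counts how often it is invoked. */
    static final class CountingIgnoreCheck implements Predicate<String> {
        int calls;

        @Override
        public boolean test(String path) {
            calls++;                        // a real check would scan .gitignore files
            return path.endsWith(".class"); // hypothetical ignore rule
        }
    }

    /** Returns untracked, non-ignored paths; tracked paths never reach the ignore check. */
    static List<String> untrackedNotIgnored(List<String> workingTree,
                                            Set<String> tracked,
                                            Predicate<String> isIgnored) {
        List<String> result = new ArrayList<>();
        for (String path : workingTree) {
            if (tracked.contains(path)) {
                continue; // assumption: ignore status is irrelevant for tracked files
            }
            if (!isIgnored.test(path)) {
                result.add(path);
            }
        }
        return result;
    }
}
```

With two tracked and two untracked paths, the expensive predicate runs only twice instead of four times.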

On the NetBeans repository, which has a lot of files, this speeds up GitClient.getStatus() execution from 4 seconds to 1 second.

PR approval and merge checklist:

  1. Was this PR correctly labeled, did the right tests run? When did they run?
  2. Is this PR squashed?
  3. Are author name / email address correct? Are co-authors correctly listed? Do the commit messages need updates?
  4. Does the PR title and description still fit after the Nth iteration? Is the description sufficient to appear in the release notes?

If this PR targets the delivery branch: don't merge. (full wiki article)

mbien added the git and ci:dev-build labels Mar 28, 2026
apache locked and limited conversation to collaborators Mar 28, 2026
apache unlocked this conversation Mar 28, 2026

mbien (Member) commented Apr 1, 2026

The description makes sense to me. Unfortunately, this causes several test failures, which I could also reproduce locally (same module). Those would have to be investigated before this can proceed.

OndroMih (Contributor, Author) commented Apr 1, 2026

@mbien, it's getting a bit more complicated :)

I wonder whether the tests are really valid and whether files really need to be present in the resulting map when they are unchanged. The Commit dialog always calls the StatusCommand for a directory, not for a single file.

For now, to satisfy the tests, and potentially some real functionality that needs the status of a single file, I modified the solution (in the second commit) so that it applies only to directories, where it omits unchanged files from the statuses map. When called for a single file, it keeps that file in the statuses map.
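
The directory versus single-file behavior just described could look roughly like this. It is only a sketch of the rule, with hypothetical names (`StatusMapDemo`, `Status`, `singleFileQuery`); the actual StatusCommand code is not shown in this conversation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class StatusMapDemo {
    enum Status { UNCHANGED, MODIFIED }

    /**
     * Builds the result map. For a directory query, unchanged files are omitted;
     * for a single-file query, the file is always reported.
     */
    static Map<String, Status> statuses(Map<String, Status> computed, boolean singleFileQuery) {
        Map<String, Status> result = new LinkedHashMap<>();
        for (Map.Entry<String, Status> e : computed.entrySet()) {
            if (singleFileQuery || e.getValue() != Status.UNCHANGED) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }
}
```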

A better approach would be to preload file status and parallelize computing hashes with fti.isModified(indexEntry, true,...), so that it doesn't block the main thread. This, however, requires a lot of refactoring, so let's try a simple approach first. I asked AI to analyze how Git does it, and it said that Git works in a similar way: it first preloads status info from the filesystem in parallel, computes hashes from content if needed, and then goes through the precomputed information to build the result.
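
To make the two-phase idea concrete, here is a minimal sketch, not a proposal for the actual refactoring: hashes are computed in parallel on a thread pool first, and the result is then built sequentially from the precomputed values. `ParallelHashDemo`, the in-memory maps, and the plain SHA-1 hex string (instead of a real Git object ID) are assumptions made for illustration.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelHashDemo {
    /** Plain SHA-1 of the content as lowercase hex (not Git's real blob hashing). */
    static String sha1Hex(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-1").digest(content);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Phase 1 hashes all files in parallel; phase 2 compares them with the index sequentially. */
    static Map<String, Boolean> modifiedFlags(Map<String, byte[]> workingTree,
                                              Map<String, String> indexHashes,
                                              int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Phase 1: submit one hashing task per file.
            Map<String, Future<String>> pending = new LinkedHashMap<>();
            for (Map.Entry<String, byte[]> e : workingTree.entrySet()) {
                byte[] content = e.getValue();
                Callable<String> task = () -> sha1Hex(content);
                pending.put(e.getKey(), pool.submit(task));
            }
            // Phase 2: walk the precomputed hashes and build the result.
            Map<String, Boolean> result = new LinkedHashMap<>();
            for (Map.Entry<String, Future<String>> e : pending.entrySet()) {
                String hash = e.getValue().get();
                result.put(e.getKey(), !hash.equals(indexHashes.get(e.getKey())));
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

The caller still waits for all hashes in this sketch, but the hashing itself is spread across the pool rather than done one file at a time.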

P.S. I'm adding commits without squashing, to keep the history until I find the best solution. Then I'll squash the commits.

OndroMih force-pushed the ondromih-git-commit-dlg-optimization branch from 52ee181 to d72d934 on April 4, 2026 15:52

OndroMih (Contributor, Author) commented Apr 4, 2026

@mbien, I dug deeper and found that there are better ways to optimize the current algorithm, without skipping any files.

I started from scratch and managed to reduce the time spent in GitClient.getStatus() in the refreshAllRoots method by 75%, from 4s to 1s on the NetBeans repo, mostly by no longer executing I/O operations whose results were ignored anyway.

This reduces the total time it takes to compute file statuses for the NetBeans repo in the Commit dialog from about 17s to about 14s.

And I have even better news: I managed to speed up another part of the refreshAllRoots method by about 10 more seconds. I'll raise a separate PR for that because the two changes are independent.

OndroMih (Contributor, Author) commented Apr 4, 2026

Here's the other PR: #9324

Labels: ci:dev-build, git, performance