Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: avoid the GitHelper and instead keep data on the stack #126

Merged
merged 4 commits into from
Mar 25, 2024

Conversation

Byron
Copy link
Contributor

@Byron Byron commented Mar 24, 2024

This enables far-better performance.

Fixes #123

Review Notes

I didn't actually test performance, and don't know how much tests would catch.

Copy link
Contributor Author

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also left some comments to better explain what's happening here.

Something I'd love to ask is if you could make all changes you need directly in this PR (gh pr checkout 126 && changes && git commit -m "better" && git push) - this is usually the most efficient way to get things done, both for you and for me.

Thanks for your consideration

fmt/src/selection.rs Show resolved Hide resolved
fmt/src/selection.rs Show resolved Hide resolved
fmt/src/selection.rs Show resolved Hide resolved
.strip_prefix(&workdir)
.expect("git repository encloses iteration");
let platform = excludes
.at_path(rela_path, Some(file_type.is_dir()))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is why this is fast - after all the initialization, only the necessary work is performed here. All the pre-amble is hidden in git2, but that does only mean it does a lot of work for each call. Think about the relative-path requirement, it's the same for git2 and it will canonicalize everything because it has to, which is really expensive as well.

That's also the reason gix won't offer such a nice and convenient API like git2, it's simply too wasteful (or very stateful internally, which comes with its own problems).

Copy link
Member

@tisonkun tisonkun Mar 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree.

Out of the thread, in another project I participate we discover an issue that PROST's performance is even "worse" than go-protobuf, because it makes less assumption/requirement on ownership/lifetime so it has to copy the buffer internally.

if file_type.is_symlink() {
debug!("skip symlink: {:?}", path);
} else if file_type.is_dir() {
if git_helper.ignored(path, true)? {
if platform.is_excluded() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This call also performs work, as it now has to search through the initialized stack of .gitignore patterns to find a match.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get it clear. Isn't this change part of the cascading of

let platform = excludes.at_path(rela_path, Some(file_type.is_dir()))

above? In the previous code, it wrote:

        let platform = attrs
            .at_path(at_path, Some(is_dir))
            .context(GixCheckExcludeOpSnafu)?;
        Ok(platform.is_excluded())

The construction of attrs is different, but the rest is the same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, what I tried to say is that the is_excluded() information isn't cached, but it is computed during the call. It was probably unnecessary to point that out (but I found it interesting enough).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know :D

@tisonkun
Copy link
Member

tisonkun commented Mar 24, 2024

Something I'd love to ask is if you could make all changes you need directly in this PR (gh pr checkout 126 && changes && git commit -m "better" && git push) - this is usually the most efficient way to get things done, both for you and for me.

Of course. I like this way; although I was told not to do this in another project >_<.

It's 11pm here now. Perhaps I will give this a look tomorrow.

rust-toolchain.toml Outdated Show resolved Hide resolved
@Byron
Copy link
Contributor Author

Byron commented Mar 24, 2024

Of course. I like this way; although I was told not to do this in another project >_<.

Yeah, you never know. Fortunately authors can untick a checkbox to prevent it, so if it's enabled I usually do it.

@tisonkun tisonkun changed the title refactor: avoid the GitHelper and instead keep data on the stack (#123) refactor: avoid the GitHelper and instead keep data on the stack Mar 25, 2024
Copy link
Member

@tisonkun tisonkun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Overall it looks good.

Only one thing I'd like to request a change with your knowledge guard.

fmt/src/selection.rs Outdated Show resolved Hide resolved
Signed-off-by: tison <wander4096@gmail.com>
@tisonkun
Copy link
Member

tisonkun commented Mar 25, 2024

I didn't actually test performance, and don't know how much tests would catch.

I don't have time to build a bench suite yet. The observed slow process is when I check against a large repo (>~10 thousands files tested), whether it can finish < 1 second. git2 can fail this check.

@Byron
Copy link
Contributor Author

Byron commented Mar 25, 2024

I didn't actually test performance, and don't know how much tests would catch.

I don't have time to build a bench suite yet. The observed slow process is when I check against a large repo (>~10 thousands files tested), whether it can finish < 1 second. git2 can fail this check.

I am definitely curious to learn some of the performance numbers, if you get to compare the two on a large repo one day.

Besides that, the PR seems ready once your patch is applied.

Signed-off-by: tison <wander4096@gmail.com>
Copy link
Member

@tisonkun tisonkun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your feedback.

Pending to merge ...

@tisonkun tisonkun enabled auto-merge (squash) March 25, 2024 07:15
@tisonkun tisonkun merged commit 17a466e into korandoru:main Mar 25, 2024
10 checks passed
@tisonkun
Copy link
Member

Thanks for your contribution @Byron! I enjoy to collaborate with you.

@Byron Byron deleted the simplify branch March 25, 2024 07:28
@Byron
Copy link
Contributor Author

Byron commented Mar 25, 2024

Likewise! Fast, efficient, constructive and friendly - my pleasure :).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve the Gix check ignore code
2 participants