ghq get is slow when directory already exists #379

Closed
davinkevin opened this issue Apr 3, 2024 · 6 comments · Fixed by #386

Comments

@davinkevin

Hello 👋

First, thank you for your tool, which is really good and I use it a lot… maybe too much considering this issue 😅.

Sometimes, when I want to jump into a directory, I use ghq get -l <repo_url> and it's very slow (~7 seconds). See hyperfine results:

[image: hyperfine results]

I also include ghq in scripts when I have to automate tasks… and it's slow (+2sec), especially when the repository is already here. See results just for a get on an already existing repository:

[image: hyperfine results]

Is something misconfigured in my setup, or is this expected?

@stong1994
Contributor

In your case, ghq traverses all repositories three times, and I'll explain why:

Firstly, before getting or updating a repository, ghq traverses all repositories to determine if a specific repository exists.

Secondly, after getting or updating a repository, ghq attempts to match the repository with the provided URL. However, it matches against the repository's subpath, so no match is found.

Lastly, ghq parses the URL provided by the user and uses the parsed information to match the repository, where it succeeds in finding a match.

Overall, the execution time of the ghq get -l command is approximately three times longer than the ghq get command.
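
Each of the three steps above is a full walk of the root, so the total cost grows with three times the size of the tree. The sketch below is not ghq's code. It is a minimal shell illustration of the alternative, walking the tree once and answering several lookups from the cached listing. The helper names `list_repos` and `find_repo` are hypothetical.

```shell
#!/bin/sh
# Illustration only: hypothetical helpers, not part of ghq.

# list_repos ROOT: walk ROOT once and print every directory that has a .git
# entry. Nested checkouts (for example submodules) are listed too.
list_repos() {
  find "$1" -name .git -prune -print | sed 's|/\.git$||'
}

# find_repo LISTING SUFFIX: print the first path in LISTING that ends with
# /SUFFIX. It only reads the cached listing and never touches the file system.
find_repo() {
  printf '%s\n' "$1" | while IFS= read -r path; do
    case "$path" in
      */"$2") printf '%s\n' "$path"; break ;;
    esac
  done
}

# Usage sketch: one walk, several lookups.
#   listing=$(list_repos "$(ghq root)")
#   find_repo "$listing" "x-motemen/ghq"
#   find_repo "$listing" "github.com/x-motemen/ghq"
```

With this shape, an extra lookup costs a pass over a few dozen lines of text rather than another traversal of the directory tree.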

@davinkevin
Author

Thank you for the explanation.

However, does 7 secs seem reasonable to you? Does it depend on the number of repositories I've already checked out?

@stong1994
Contributor

I have created a PR here to ensure that it scans all repositories only once in your scenario. By the way, how many projects do you have? I have dozens of projects, and it only takes me 40 milliseconds to scan them.

@davinkevin
Author

Thanks.

I would say I have more, but how can I count them?
Because of groups in GitLab, I can't just count folders.
It should be around 60 at most, so nothing very big.

Is ghq affected by other things stored in the $GHQ_ROOT directory? I just discovered I have a pkg/mod folder there, from a Go environment I set up some time ago.

[image]

And it contains… ~60k folders 😅.
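
One way to check whether a tree such as pkg/mod dominates the walk is to count directories per top-level entry of the root. This is a minimal sketch, assuming a POSIX shell and `find`; `dir_counts` is a hypothetical helper name. It only shows where the directories are. Whether ghq actually descends into all of them is the open question here, and the script does not prove it.

```shell
#!/bin/sh
# dir_counts ROOT: print "<count> <entry>" for each top-level directory of
# ROOT, largest first. The count includes the entry itself.
# Hypothetical helper, not part of ghq. Hidden top-level entries are skipped.
dir_counts() {
  for entry in "$1"/*/; do
    [ -d "$entry" ] || continue
    count=$(find "$entry" -type d | wc -l | tr -d ' ')
    printf '%s %s\n' "$count" "${entry%/}"
  done | sort -rn
}

# Usage sketch:
#   dir_counts "$(ghq root)"
```

If one entry is orders of magnitude larger than the hosts that hold the actual repositories, moving it out of the root is a cheap experiment.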

@stong1994
Contributor

You can use this command to display the projects:

ghq root --all | xargs -I {} fd --type d --max-depth 4 --hidden .git {} | sed 's/\/\.git//'

Also, I think pkg should not be under the workspace 🤔

@davinkevin
Author

davinkevin commented Apr 22, 2024

[image]

I think --max-depth 4 shortens the list: I have groups nested inside groups in GitLab, so some repositories get filtered out.
But it should be approximately that…
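
For nested GitLab groups, a count without a depth limit avoids this undercount. The sketch below assumes a POSIX shell and `find`; `count_repos` is a hypothetical helper name. It stops descending as soon as a directory contains a `.git` entry, so checkouts nested inside a repository are not counted. `ghq list | wc -l` should give a comparable number, but it pays for the same slow walk this issue is about.

```shell
#!/bin/sh
# count_repos ROOT: count directories under ROOT, at any depth, that contain
# a .git entry. Descent stops at each match, so repositories nested inside
# another repository (submodules, vendored checkouts) are not counted.
# Hypothetical helper, not part of ghq.
count_repos() {
  find "$1" -type d -exec sh -c 'test -e "$1/.git"' sh {} \; -print -prune |
    wc -l | tr -d ' '
}

# Usage sketch:
#   count_repos "$(ghq root)"
```

Spawning one `sh` per directory is slow on a large tree, but it keeps the test portable across `find` implementations.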

About pkg, I'll give it a try tomorrow by (re)moving it somewhere else and see if it's faster.

yujinyuz added a commit to yujinyuz/dotfiles that referenced this issue May 25, 2024
When the list of repositories in my ghq source directory got large, it
was getting extremely slow when performing a clone[^1][^2].

It turns out that I have a lot of directories that are not managed by
git especially within my work folder. I was able to figure this out
after moving out the _work_ directory outside the `~/Sources` dir.

I found https://github.com/siketyan/ghr and it seems to be fast because
it doesn't check `.git` folders (which I'm fine with)

Refs:

[^1]: x-motemen/ghq#323
[^2]: x-motemen/ghq#379