identifying nested repositories #49

Closed
pythys opened this issue Dec 18, 2022 · 5 comments

@pythys

pythys commented Dec 18, 2022

Great work, I really like the project and thank you for your initiative.

In reference to #48, it would be great if grm repos find local did not stop as soon as it finds a repository, but instead continued to walk the directory tree to find other possible git repos. Some projects add a directory to .gitignore and then place sub-repositories there without registering them as submodules.

@pythys
Author

pythys commented Dec 18, 2022

Perhaps we can put the above idea behind a flag if performance is an issue. I will investigate the code to see if I can contribute to that bit.

@hakoerber
Owner

Hey, I saw your other issue and agree that it's a use case that makes sense to support.

I share your concern about performance. For the most common use case, I expect the repositories to usually be quite "high" in the file tree, so they will be found quickly and recursion stops. Also, repositories may contain a lot of files and directories to search through.

I thought about only going through paths that are part of .gitignore, but this has two issues:

  • It adds additional complexity, while still not covering every single use case 100% (maybe someone has a sub-repository in a repo without having it in the .gitignore). Too many what-ifs in my view.
  • It does not totally solve the performance problems. Gitignored directories in particular often contain a huge number of files (like build directories).

I think having a command line flag, as you suggested, would be the best option.

"I will investigate the code to see if I can contribute to that bit"

Awesome!

The relevant code would be in the find_repo_paths() function. As you can see, it's recursive, but stops as soon as a git repository is found.

Adding a command line flag would be straightforward.
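
For illustration, here is a minimal sketch of how such a flag could be threaded through the recursion. It is not the project's actual code: the name find_repo_paths_sketch and the recursive parameter are hypothetical, only a plain .git entry is checked (the real function also checks for a worktree directory), and skipping the .git directory itself while descending is an extra assumption made here.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Collects repository roots at or below `path`.
///
/// `recursive` is a hypothetical parameter standing in for the proposed
/// flag: when false, the walk stops at the first repository found on each
/// branch; when true, it keeps descending to find nested repositories.
pub fn find_repo_paths_sketch(path: &Path, recursive: bool) -> Result<Vec<PathBuf>, String> {
    let mut repos = Vec::new();

    // Simplification: only a `.git` entry marks a repository here.
    if path.join(".git").exists() {
        repos.push(path.to_path_buf());
        if !recursive {
            return Ok(repos);
        }
    }

    let entries = fs::read_dir(path)
        .map_err(|e| format!("Failed to open \"{}\": {}", path.display(), e))?;

    for entry in entries {
        let entry = entry.map_err(|e| format!("Error accessing directory: {}", e))?;
        let child = entry.path();
        // Skip symlinks, and do not walk through git's own metadata directory.
        if child.is_symlink() || child.ends_with(".git") {
            continue;
        }
        if child.is_dir() {
            repos.append(&mut find_repo_paths_sketch(&child, recursive)?);
        }
    }

    Ok(repos)
}
```

With recursive set to false this behaves like the current early-stopping walk, so a flag that defaults to false would leave existing setups unaffected.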

Let me know if you need any more information. Otherwise, just go for it, we can discuss implementation details on a PR!

@pythys
Author

pythys commented Jan 14, 2023

On my first attempt, I got nesting to work perfectly well. BUT I think two things are missing to make a PR:

  1. I must check against .gitmodules and ignore any repos defined there, since those are the responsibility of git itself.
  2. Possibly add a flag for recursing, with a default of "false".

diff --git a/src/tree.rs b/src/tree.rs
index c53882f..3f818c0 100644
--- a/src/tree.rs
+++ b/src/tree.rs
@@ -100,43 +100,43 @@ pub fn find_repo_paths(path: &Path) -> Result<Vec<PathBuf>, String> {
 
     if git_dir.exists() || git_worktree.exists() {
         repos.push(path.to_path_buf());
-    } else {
-        match fs::read_dir(path) {
-            Ok(contents) => {
-                for content in contents {
-                    match content {
-                        Ok(entry) => {
-                            let path = entry.path();
-                            if path.is_symlink() {
-                                continue;
-                            }
-                            if path.is_dir() {
-                                match find_repo_paths(&path) {
-                                    Ok(ref mut r) => repos.append(r),
-                                    Err(error) => return Err(error),
-                                }
-                            }
+    }
+
+    match fs::read_dir(path) {
+        Ok(contents) => {
+            for content in contents {
+                match content {
+                    Ok(entry) => {
+                        let path = entry.path();
+                        if path.is_symlink() {
+                            continue;
                         }
-                        Err(e) => {
-                            return Err(format!("Error accessing directory: {}", e));
+                        if path.is_dir() {
+                            match find_repo_paths(&path) {
+                                Ok(ref mut r) => repos.append(r),
+                                Err(error) => return Err(error),
+                            }
                         }
-                    };
-                }
-            }
-            Err(e) => {
-                return Err(format!(
-                    "Failed to open \"{}\": {}",
-                    &path.display(),
-                    match e.kind() {
-                        std::io::ErrorKind::NotADirectory =>
-                            String::from("directory expected, but path is not a directory"),
-                        std::io::ErrorKind::NotFound => String::from("not found"),
-                        _ => format!("{:?}", e.kind()),
                     }
-                ));
+                    Err(e) => {
+                        return Err(format!("Error accessing directory: {}", e));
+                    }
+                };
             }
-        };
-    }
+        }
+        Err(e) => {
+            return Err(format!(
+                "Failed to open \"{}\": {}",
+                &path.display(),
+                match e.kind() {
+                    std::io::ErrorKind::NotADirectory =>
+                        String::from("directory expected, but path is not a directory"),
+                    std::io::ErrorKind::NotFound => String::from("not found"),
+                    _ => format!("{:?}", e.kind()),
+                }
+            ));
+        }
+    };
 
     Ok(repos)
 }
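
The .gitmodules check from point 1 above could look roughly like the following sketch. Everything here is an assumption for illustration: the helper names submodule_paths and without_submodules are hypothetical, and the parser is deliberately simplified to read only the "path = ..." lines of .gitmodules rather than handling the full git config syntax.

```rust
use std::collections::HashSet;
use std::fs;
use std::path::{Path, PathBuf};

/// Returns the submodule paths listed in `<repo>/.gitmodules`, resolved
/// against `repo`. A missing or unreadable file yields an empty set.
/// Simplified parser: only `path = <value>` lines are considered.
pub fn submodule_paths(repo: &Path) -> HashSet<PathBuf> {
    let text = match fs::read_to_string(repo.join(".gitmodules")) {
        Ok(text) => text,
        Err(_) => return HashSet::new(),
    };

    text.lines()
        .filter_map(|line| line.trim().split_once('='))
        .filter(|(key, _)| key.trim() == "path")
        .map(|(_, value)| repo.join(value.trim()))
        .collect()
}

/// Drops every candidate that another candidate declares as a submodule,
/// leaving those repositories to git itself. Input order is preserved.
pub fn without_submodules(candidates: Vec<PathBuf>) -> Vec<PathBuf> {
    let declared: HashSet<PathBuf> = candidates
        .iter()
        .flat_map(|repo| submodule_paths(repo))
        .collect();

    candidates
        .into_iter()
        .filter(|repo| !declared.contains(repo))
        .collect()
}
```

Filtering after the walk keeps the recursion itself unchanged. An alternative would be to consult .gitmodules during the walk and skip those directories entirely, which would also avoid descending into them.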

@pythys
Author

pythys commented Jan 14, 2023

I should note that with the above fix:

  1. Performance was not an issue at all: I have thousands of directories and everything synced quickly, in less than a second I believe.
  2. I find myself much happier now in using grm given that I can structure my repositories into sub-directories and into various layouts. This gives me much more freedom in structuring where I'd like to keep things.

@pythys
Author

pythys commented Jan 23, 2023

Pull request created.

@pythys pythys closed this as completed Jan 23, 2023