Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

submodule: check index for path and prefix before adding submodule #4378

Merged
merged 9 commits into from
Mar 30, 2018

Conversation

cjhoward92
Copy link
Contributor

This is to fix an issue where libgit2 will allow you to add a submodule even if the path you are trying add is found on the index. In the CLI we get a nice error about this, but not in libgit2.

I would obviously like some pointers on how to do this, as I do not think it is 100% correct. For one, when I add the trailing slash to the path, I feel like I could clean that up or have a better implementation. I am also unsure if there is a more streamlined routine to check the index for the path.

Some advice on how I could clean this up would be awesome.

Copy link
Member

@pks-t pks-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! I think it looks good regarding what it tries to do, but I'm a bit in a hurry right now. I'll do a proper review of the logic at a later point, probably when you've got around to amending this PR :)

src/submodule.c Outdated
@@ -688,6 +691,34 @@ int git_submodule_add_setup(
goto cleanup;
}

/* get the index for the repo */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to put the following checks into a simple function. Something like can_add_submodule(repo, path).

src/submodule.c Outdated
path_len = strlen(path);
dir = git__malloc(path_len + 1);
strcpy(dir, path);
git_path_string_to_dir(dir, path_len + 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use a buffer here with git_path_to_dir:

git_buf path = GIT_BUF_INIT;
git_buf_sets(&path, path);
git_path_to_dir(&path);

Error handling is obviously missing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code is obviously bogus, I should've chosen two different variable names. But I guess you get what I mean :P

/* In this repo, HEAD (master) has no remote tracking branc h*/
g_repo = cl_git_sandbox_init("testrepo");

git_buf_joinpath(&dirname, git_repository_workdir(g_repo), "TestGitRepository");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please wrap every call which you expect to just return 0 for success in cl_git_pass.


cl_git_pass(
git_repository_index__weakptr(&index, g_repo)
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no need for the newlines here. In most cases, we simply ignore line length limits for our tests and put those things in one line.


p_mkdir(dirname.ptr, 0700);
fd = fopen(filename.ptr, "w");
fclose(fd);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should also assert that those three functions return expected values.

);

git_buf_free(&dirname);
git_buf_free(&filename);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have to free the submodule, as well.

git_buf_joinpath(&filename, dirname.ptr, "test.txt");

p_mkdir(dirname.ptr, 0700);
fd = fopen(filename.ptr, "w");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have cl_git_mkfile for that

/* In this repo, HEAD (master) has no remote tracking branc h*/
g_repo = cl_git_sandbox_init("testrepo");

git_buf_joinpath(&name, git_repository_workdir(g_repo), "TestGitRepository");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please assert the return value here.

git_buf_joinpath(&name, git_repository_workdir(g_repo), "TestGitRepository");

fd = fopen(name.ptr, "w");
fclose(fd);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have cl_git_mkfile for that.

);

git_buf_free(&name);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, you also have to free the submodule here

@cjhoward92
Copy link
Contributor Author

@pks-t Also thank you for reviewing this. I will make these changes as soon as I can! I appreciate you taking the time to look over my work.

@cjhoward92
Copy link
Contributor Author

@pks-t Updated. Let me know what you think!

@cjhoward92 cjhoward92 changed the title WIP: submodule: check index for path and prefix before adding submodule submodule: check index for path and prefix before adding submodule Nov 1, 2017
@cjhoward92
Copy link
Contributor Author

Any chance anyone will take a look at this?

Copy link
Member

@pks-t pks-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for taking that long to review, I've been quite busy recently with life :/

Anyway, thanks for your fixups, it's a lot more readable already. Besides some style-related things, I've also taken a look at it regarding functionality now. I've added some comments regarding that, would be nice if you could clarify those.

Keep going your good work and this PR will be ready soon :)

src/submodule.c Outdated
@@ -149,6 +149,41 @@ static int find_by_path(const git_config_entry *entry, void *payload)
return 0;
}

static int can_add_submodule(git_repository *repo, const char *path) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move the opening curly brace to a new line.

src/submodule.c Outdated
if ((error = git_path_to_dir(&dir)) < 0)
return error;

/* get the index for the repo */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice if you removed those comments. You shouldn't describe what you are doing with comments, as most often that is obvious from the code itself. Instead, you should try to document why you are doing that. E.g. a useful comment here would be one preceding the function which explains exactly when you can add a submodule and when you can't.

src/submodule.c Outdated
if ((error = git_buf_sets(&dir, path)) < 0)
return error;

if ((error = git_path_to_dir(&dir)) < 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You start to leak memory as soon as one of those functions returns an error. Instead of "return error;", we usually have e.g. a "goto error;" where all memory is free'd. E.g.:

static int can_add_submodule(git_repository *repo, const char *path)
{
...

if ((error = git_path_to_dir(&dir)) < 0)
    goto out;

...

out:
    git_buf_free(&dir);
    return error;
}

src/submodule.c Outdated

/* see if the submodule name exists as a file on the index */

if ((error = git_index_find(NULL, index, path)) == 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's something off here. You first normalize path to a directory path by using git_path_to_dir and now you try to check if there's a file at that location matching the path. I guess what you wanted to do instead is first using the "normal" path value and only normalize it to a directory name after you've used git_index_find here to check whether there's a file there.

src/submodule.c Outdated

if ((error = git_index_find_prefix(NULL, index, dir.ptr)) == 0) {
giterr_set(GITERR_SUBMODULE,
"'%s' already exists in the index", path);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be cool if you disambiguated those two error messages based on their type (that is directory vs file already existing).

git_buf dirname = GIT_BUF_INIT;
git_buf filename = GIT_BUF_INIT;

/* In this repo, HEAD (master) has no remote tracking branc h*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a typo. Also: why does this matter for our particular unit test here? I don't want to propose it doesn't, but for myself I don't get why this is interesting. Please either extend that comment and explain a bit more why this is important here or just remove it :)

git_submodule *sm;
git_buf name = GIT_BUF_INIT;

/* In this repo, HEAD (master) has no remote tracking branc h*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

/* In this repo, HEAD (master) has no remote tracking branc h*/
g_repo = cl_git_sandbox_init("testrepo");

cl_git_pass(git_buf_joinpath(&dirname, git_repository_workdir(g_repo), "TestGitRepository"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rename "TestGitRepository" to e.g. just "subdirectory" or "directory". Otherwise it's a bit confusing, as that path should not represent a real git repository but should only contain a file later on.

@@ -128,3 +128,51 @@ void test_submodule_add__url_relative_to_workdir(void)

assert_submodule_url("TestGitRepository", git_repository_workdir(g_repo));
}

void test_submodule_add__path_exists_in_index(void)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's one problem with that test. You want to verify that creation of a submodule fails if the path exists in the index, but actually when you create your submodule it exists both inside the index and also inside of the working directory. So us failing here might not be the result from the entry existing in the index, but instead that file existing in the working directory.

You could either just remove the directory and file after you've added them to the index or, which is the preferred method if you ask me, instead add the index entry directly by using git_index_add.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get the problem you are describing, but I am unsure how to use git_index_add. How can I create a git_index_entry properly? I s there something I am missing? Other than index_create_entry, which is not exposed outside of index.c I do not know how to create an entry. For now, I am updating the routine to just delete the file and directory after I add to the index.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Take a look at for example "tests/index/add.c", test_add_entry(). You can create an index entry by simply assigning the structure fields.


git_submodule_free(sm);
git_buf_free(&name);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now you're checking two out of four cases:

  1. an index entry exists inside of the directory you want to add
  2. a file exists in the working directory

But there are two cases missing:

  1. an index entry exists directly where you want to create the submodule
  2. a directory exists where you want to create the submodule

Would it make sense to add tests for those two cases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure exactly what your test cases mean. I was trying to test two cases:

  1. A file path exists in the index with the same name as the submodule
  2. A directory exists in the index with the same name as the submodule

If you could offer some clarification I would be more than happy to add extra test cases :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean we should test against the workdir as well? I am not sure that my added code does that, it is an explicit check against the index but it makes sense that we should check untracked files in the workdir too. Probably a good check to have in can_add_submodule. Thoughts?

@cjhoward92
Copy link
Contributor Author

I janked this up bad. I will recover the remote from my other machine :(

@cjhoward92 cjhoward92 force-pushed the fix/submodule-add-check-index branch 3 times, most recently from 12c939c to 420ffbd Compare November 20, 2017 04:00
@cjhoward92 cjhoward92 force-pushed the fix/submodule-add-check-index branch 3 times, most recently from 5d7f680 to bea77bb Compare December 18, 2017 17:50
@cjhoward92
Copy link
Contributor Author

@pks-t I finally got around to making all of the changes you requested, I think. Let me know if there is anything else I need to do. Sorry for the wait, been a busy few months!

@cjhoward92
Copy link
Contributor Author

bump

Copy link
Member

@pks-t pks-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, I'm being a bit nitpicky with the can_add_submodule function, but we do usually return -1 codes in case of an unexpected error only. It's also a bit weird to see something like if (can_add_submodule() < 0), because that doesn't explain the semantics immediately. On the other hand, if (is_path_occupied(&occupied)) < 0) return -1; if (occupied) ... makes the intentions totally clear.

The tests now look perfect, thanks a lot for improving them!

src/submodule.c Outdated
error = GIT_EEXISTS;
goto out;
}
error = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not needed here, as the next line will already unconditionally set error

src/submodule.c Outdated
@@ -688,6 +723,9 @@ int git_submodule_add_setup(
goto cleanup;
}

if ((error = can_add_submodule(repo, path)) < 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh. Looks alright to me, too. Dunno either :P

src/submodule.c Outdated
* Checks to see if the submodule shares its name with a file or directory that
* already exists on the index. If so, the submodule cannot be added.
*/
static int can_add_submodule(git_repository *repo, const char *path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a bit misnamed. You do not check if we can add the submodule at a given path, as that would involve other checks as well. Instead, you only check whether the given path is already occupied by something else. How about is_path_occupied instead?

One more thing is that functions which should answer a certain question are expected to return a bool, not an int. So I'd propose a signature like static int is_path_occupied(bool *occupied, git_repository *repo, const char *path). Like this, a real unexpected error can be returned by returing a negative value. If no error happens we return 0 and occupied will then be set to true if the path is occupied and false otherwise.


cl_git_pass(git_repository_index(&index, g_repo));

test_add_entry(index, valid_blob_id, filename.ptr, GIT_FILEMODE_BLOB);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect, these tests look very nice now! :)

@cjhoward92
Copy link
Contributor Author

@pks-t Should be updated with is_path_occupied. Let me know what you think!

src/submodule.c Outdated
if ((error = git_path_to_dir(&dir)) < 0)
goto out;

if ((error = git_index_find_prefix(NULL, index, dir.ptr)) == 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're ignoring any potential error returned by git_index_find_prefix other than GIT_ENOTFOUND. Granted, right now it will not return any other error, but you never know how it will change in the future. Maybe something like this:

if ((error = git_index_find_prefix(NULL, index, dir.ptr)) < 0 && error != GIT_ENOTFOUND)
    goto out;

if (!error) {
    giterr_set(GITERR_SUBMODULE, ...);
    *occupied = true;
    goto out;
}

error = 0;

@cjhoward92 cjhoward92 force-pushed the fix/submodule-add-check-index branch from d89af2c to c07abd6 Compare March 27, 2018 14:37
@cjhoward92
Copy link
Contributor Author

This is also updated! Thanks again!

Copy link
Member

@pks-t pks-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be the last iteration, I guess. Looks very clean already

src/submodule.c Outdated
if ((error = git_repository_index__weakptr(&index, repo)) < 0)
goto out;

if ((error = git_index_find(NULL, index, path)) == 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say it'd be a bit less fragile if the check was if ((error = git_index_find(NULL, index, path)) != GIT_ENOTFOUND). Otherwise, we might just ignore all errors by git_index_find (even though it currently does not return any error different than GIT_ENOTFOUND.

src/submodule.c Outdated
goto out;

if ((error = git_index_find_prefix(NULL, index, dir.ptr)) < 0 && error != GIT_ENOTFOUND)
goto out;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This uses spaces instead of tabs.

src/submodule.c Outdated
if ((error = git_path_to_dir(&dir)) < 0)
goto out;

if ((error = git_index_find_prefix(NULL, index, dir.ptr)) < 0 && error != GIT_ENOTFOUND)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could probably do the same as above:

if ((error = git_index_find_prefix(NULL, index, dir.ptr)) != GIT_ENOTFOUND) {
    if (!error) {
        giterr_set(GITERR_SUBMODULE, "Already exists");
        *occupied = true;
        error = -1;
    }
    goto out;
}


cl_git_pass(git_buf_joinpath(&filename, "subdirectory", "test.txt"));

cl_git_pass(git_repository_index(&index, g_repo));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should use git_repository_index__weakptr, otherwise you'd have to free index


g_repo = cl_git_sandbox_init("testrepo");

cl_git_pass(git_repository_index(&index, g_repo));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

@cjhoward92
Copy link
Contributor Author

Thanks for all of the hard work. I have no problem making as many iterations as possible to get this to be perfect!

@pks-t
Copy link
Member

pks-t commented Mar 28, 2018

Looks good to me, thanks! Waiting for CI to catch up and will give others a chance to review.

@pks-t pks-t merged commit dc27772 into libgit2:master Mar 30, 2018
@pks-t
Copy link
Member

pks-t commented Mar 30, 2018

Thanks again for this PR!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants