Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Checkout submodules error if the previous update of submodules was interrupted #12806

Open
AlenaKarp opened this issue May 28, 2024 · 16 comments · May be fixed by #12807
Open

Checkout submodules error if the previous update of submodules was interrupted #12806

AlenaKarp opened this issue May 28, 2024 · 16 comments · May be fixed by #12807

Comments

@AlenaKarp
Copy link

Issue Type
  • Bug Report
Summary

There is a repository with submodules. Each time switching between branches, the git is reset and the submodules are updated. In case of interruption of the update process, all the following build processes crash at the stage of checking out submodules, because the repository is in an invalid state.

Basic environment details
  • Go Version: 23.1.0
Steps to Reproduce
  1. Switch branch
  2. Interrapt job at the update step. Get an error:
The plugin sent a response that could not be understood by Go. Plugin returned with code '500' and the following response: '"Exception (Process exited with an error: 1 (Exit value: 1)) Occurred: [git, submodule, update] - /Users/agent/go-agent/pipelines/pipelineName"'
  1. Switch to another branch. There is an error when doing a job:
The plugin sent a response that could not be understood by Go. Plugin returned with code '500' and the following response: '"Exception (Process exited with an error: 128 (Exit value: 128)) Occurred: [git, submodule, foreach, --recursive, git, checkout, .] - /Users/agent/go-agent/pipelines/pipelineName"'

This error is repeated in all jobs that run further, until you manually reset the submodules.

Expected Results

Submodules are reset automatically.

Actual Results

Error when trying to checkout submodules.

Possible Fix

Add reset submodules step when reset working directory.

@AlenaKarp AlenaKarp linked a pull request May 28, 2024 that will close this issue
@chadlwilson
Copy link
Member

Based on those error messages, this looks like you are using a plugin material, not a built-in material for GoCD. The fix for this would have to be with the relevant plugin, not in GoCD (although GoCD may have a similar issue, would have to check).

Which plugin material are you using?

@AlenaKarp
Copy link
Author

Based on those error messages, this looks like you are using a plugin material, not a built-in material for GoCD. The fix for this would have to be with the relevant plugin, not in GoCD (although GoCD may have a similar issue, would have to check).

Which plugin material are you using?

We use this: https://github.com/ashwanthkumar/gocd-build-github-pull-requests

The error falls into the plugin, however, as far as I can tell, the problem is not in it, but in the reset algorithm:

commands

I ran these commands in the git console, outside of the GoCD, and still got an error in the git submodule foreach --recursive git checkout step. When the git submodule update is interrupted, the submodules are in an invalid state and the checkout cannot be performed. When switching branches, the submodules do not change, and still invalid. An up to date reset is needed to fix this.

Here are the steps to reproduce the situation in the console:

  1. Clone a project with submodules
  2. Run commands before git submodule update
  3. Execute git submodule update and abort it without waiting for completion
    In this step get a diff: part of the submodules is marked as modified. In fact, one half of the folders with submodules is empty, the other half has only a .git file.
git error
  1. Switch the branch. The diff remains.
  2. Run commands once again, an error occurs at the step git submodule foreach --recursive git checkout

Perhaps this case is more visual when there are a lot of submodules, in my project about 40, the update takes time. During this time, it may fall, and I have to fix it manually every time.

@chadlwilson
Copy link
Member

chadlwilson commented May 28, 2024

The clean + reset algorithm is the responsibility of the plugin. While GoCD "proper" might have had the same problem, the plugin is responsible for everything on the material including prepping it, resetting etc and the error indicates the problem is in the plugin right now.

Similar to https://groups.google.com/g/go-cd/c/tfaenMirnLs the problem needs to be fixed there.

In GoCD core Git materials, I believe this issue was fixed in #5964 by switching to -dffx by default alongside some other changes to submodule algorithms, after a long historical conversation at #1059 and other places.

Likely a similar thing needs to be done on the plugin, which uses this "reusable" library code for git interactions:

https://github.com/ashwanthkumar/git-cmd/blob/4459e79a71e34dc3e56fc1de906fd87030f1b625/src/com/tw/go/plugin/git/GitCmdHelper.java#L287-L290

Since the intent is for it to work similarly to GoCD itself, it could be changed to a similar default behaviour as GoCD itself, even relying upon the same system property "opt out".

private String gitCleanArgs() {
if ("Y".equalsIgnoreCase(System.getProperty(GIT_CLEAN_KEEP_IGNORED_FILES_FLAG))) {
LOG.info("{} = Y. Using old behaviour for clean using `-dff`", GIT_CLEAN_KEEP_IGNORED_FILES_FLAG);
return "-dff";
} else {
return "-dffx";
}
}

@chadlwilson chadlwilson linked a pull request May 28, 2024 that will close this issue
@chadlwilson
Copy link
Member

Hiya - have you checked this out empirically?

I can create PRs to the library and the plugin it uses to align with GoCD out-of-the-box semantics, but only Ashwanth can release the library to Maven and a new version, which is what it sounds like it needs to address your problem if -dffx during clean is sufficient to get the behaviour you desire without the recursive submodule reset.

@AlenaKarp
Copy link
Author

AlenaKarp commented Jun 4, 2024

Hi, Chad!

I tried to use -dffx in the console, diff is still there, it crashes exactly the same way, on the next update after the interrupted one, so I think the submodule reset is still needed.

By the way, we have decided to abandon this plugin material, waiting fixes here :)
I didn't screen out some of the steps, but followed the algorithm.

2test 3test

Hiya - have you checked this out empirically?

I can create PRs to the library and the plugin it uses to align with GoCD out-of-the-box semantics, but only Ashwanth can release the library to Maven and a new version, which is what it sounds like it needs to address your problem if -dffx during clean is sufficient to get the behaviour you desire without the recursive submodule reset.

@chadlwilson
Copy link
Member

What's the git version you reproduced this on? (just curious, and also to see if linux/Mac behave the same, or whether this is a Windows thing)

If this is the only possible fix, you should be aware that I am pretty nervous about changing the logic here, or otherwise changing the semantics of anything core within the git material. I don't have deep experience with git submodules (which I personally consider a pretty painful and unreliable feature to work with) and there aren't many other folks to help validate or give input. It would seem to have some capacity to break things, and I am not entirely sure it's the "right way" to fix it, as it would seem to impose a tax on all material updates, even for submodules in the normal state, rather than doing something to perhaps avoid leaving them broken in the first place.

@AlenaKarp
Copy link
Author

This is git version 2.37.0.windows.1, but I also reproduced this on a Mac.

I understand your concerns about this case, so it would be great if you (or maybe someone else) could reproduce it. Submodules are really not that popular, a project with a lot of submodules is a fairly rare thing, I think this is the reason why I caught it now.

About the proposed fix: one of the existing commands is git reset --hard revision at the main repository. Submodules are also repositories that also need to be reset. The git submodule foreach --recursive git checkout command, which, as I think, is contained to restore submodules to the default state, does not allow resetting local submodule changes.

What's the git version you reproduced this on? (just curious, and also to see if linux/Mac behave the same, or whether this is a Windows thing)

If this is the only possible fix, you should be aware that I am pretty nervous about changing the logic here, or otherwise changing the semantics of anything core within the git material. I don't have deep experience with git submodules (which I personally consider a pretty painful and unreliable feature to work with) and there aren't many other folks to help validate or give input. It would seem to have some capacity to break things, and I am not entirely sure it's the "right way" to fix it, as it would seem to impose a tax on all material updates, even for submodules in the normal state, rather than doing something to perhaps avoid leaving them broken in the first place.

@chadlwilson
Copy link
Member

Yeah, I understand and somewhat share the mode of thinking here, however the hard reset is necessary for the top level material to get to the specific target revision that GoCD is tracking. In the submodule case the revisions need to follow that implies by the parent repository, not anything GoCD is tracking, so it’s not necessarily true that they need to be hard reset to get there - theoretically the submodule update should be enough to get there.

it’s possibly just an accident that the hard reset on the parent happens to clean up any broken updates on the parent?

@AlenaKarp
Copy link
Author

You're right, and in valid situations, updating the submodule should be enough. However, as you can see, errors happen. This creates a local diff, and until the state of the submodule has not been reset, the following operations cannot be performed with them, including updating. I don't really know how to catch an exception in this case, so clearing the submodules looks quite logical to me.

I've been thinking about a solution now, and it might be safer to use git submodule deinit -f . instead of git submodule foreach git reset --hard. If do this before init submodules, it will clear the local changes. I checked it out, it's working for this case.

Yeah, I understand and somewhat share the mode of thinking here, however the hard reset is necessary for the top level material to get to the specific target revision that GoCD is tracking. In the submodule case the revisions need to follow that implies by the parent repository, not anything GoCD is tracking, so it’s not necessarily true that they need to be hard reset to get there - theoretically the submodule update should be enough to get there.

it’s possibly just an accident that the hard reset on the parent happens to clean up any broken updates on the parent?

@chadlwilson
Copy link
Member

You're right, and in valid situations, updating the submodule should be enough. However, as you can see, errors happen. This creates a local diff, and until the state of the submodule has not been reset, the following operations cannot be performed with them, including updating. I don't really know how to catch an exception in this case, so clearing the submodules looks quite logical to me.

Yeah, I don't disagree - just nitpicking on "the original motivation for hard reset" being the same between parent and subs. They are different, so not necessarily true to need to apply the same mechanism to the subs. No big deal though.

I've been thinking about a solution now, and it might be safer to use git submodule deinit -f . instead of git submodule foreach git reset --hard. If do this before init submodules, it will clear the local changes. I checked it out, it's working for this case.

Worth noting in #5856 (comment) that git submodule deinit --all was discussed, but only seemed to go back to git clean on the sub due to Git version support at the time?

Worth thinking about and determining which is "safer" and which is "faster". My instinct is that the hard reset is probably faster with fewer file writes to the local working copy, but if we can consolidate the commands to what reasonably modern Git can be expected to support, it might be the best of all worlds.

... and then there is git reset --hard --recurse-submodule as yet another option (adding the last flag if submodules are enabled/found, but potentially running before any have been inited in some cases) :-)

Current GoCD does all manner of stuff for submodules that I don't really understand though (removing them from git config? doing checkout manually rather than git submodule init --checkout or even git update --init --recursive?). Perhaps to support old git versions. Some of this dates from before GoCD was open-sourced, and I do have access to the internal repository from prior to open source.

The below was essentially introduced in 2009, for example: 😅

"Fixed bug related to submodule removal not working"

private void removeSubmoduleSectionsFromGitConfig(ConsoleOutputStreamConsumer outputStreamConsumer) {
log(outputStreamConsumer, "Cleaning submodule configurations in .git/config");
for (String submoduleFolder : submoduleUrls().keySet()) {
configRemoveSection("submodule." + submoduleFolder);
}
}
private void configRemoveSection(String section) {
String[] args = new String[]{"config", "--remove-section", section};
CommandLine gitCmd = gitWd().withArgs(args);
runOrBomb(gitCmd, false);
}

This joyous code also dates from March 2009:

"Added expand method into GitMaterial for submodule"

protected Map<String, String> submoduleUrls() {
String[] args = new String[]{"config", "--get-regexp", "^submodule\\..+\\.url"};
CommandLine gitCmd = gitWd().withArgs(args);
ConsoleResult result = runOrBomb(gitCmd, false);
List<String> submoduleList = result.output();
HashMap<String, String> submoduleUrls = new HashMap<>();
for (String submoduleLine : submoduleList) {
Matcher m = GIT_SUBMODULE_URL_PATTERN.matcher(submoduleLine);
if (!m.find()) {
bomb("Unable to parse git-config output line: " + result.replaceSecretInfo(submoduleLine) + "\n"
+ "From output:\n"
+ result.replaceSecretInfo(StringUtils.join(submoduleList, "\n")));
}
submoduleUrls.put(m.group(1), m.group(2));
}
return submoduleUrls;
}

At one point in 2012 it did do an update --init for submodules. Then in #631 this was changed back to a three-step process with init + sync + update!

There was also some back and forth in #5856 with some submodule stuff reverted in #6024.

Some of that "support old git" stuff can definitely be removed now, if we have confidence there is a better/safer/easier way. You can probably get an idea why I am a bit nervous to go too deep though. There are a lot of scenarios to validate, and requires some time to see how many of these the automated tests cover.

@AlenaKarp
Copy link
Author

Yes, you're right, if take into account the speed and support of older versions of git, then the hard reset looks better. The main advantage of deinit is working whatever the state of submodules, it seemed to me that this is more reliable, but in the case of a large number of submodules, the difference in processing time will probably be very unpleasant.

Worth noting in #5856 (comment) that git submodule deinit --all was discussed, but only seemed to go back to git clean on the sub due to Git version support at the time?

Worth thinking about and determining which is "safer" and which is "faster". My instinct is that the hard reset is probably faster with fewer file writes to the local working copy, but if we can consolidate the commands to what reasonably modern Git can be expected to support, it might be the best of all worlds.

Talking about git reset --hard --recurse-submodule - it work for Git 2.14+ and I personally had a bad experience with this :,3
I don't remember exactly the problem, but there were some difficulties with this command. By the way, I'll try it as soon as I have time.

... and then there is git reset --hard --recurse-submodule as yet another option (adding the last flag if submodules are enabled/>found, but potentially running before any have been inited in some cases) :-)

About removing from git config - it caches submodules data there (active flag, url), and if submodule url changed without removing from config step - we have invalid data. Unfortunately, it doesn't pull it up automatically.
Other commands are manual to support old git versions indeed. We won't increase productivity by writing these commands in one line, it will just look more concise, but some versions of the git will not support this.
It seems to me there is no need to raise the old git version just because of this.

Current GoCD does all manner of stuff for submodules that I don't really understand though (removing them from git config? >doing checkout manually rather than git submodule init --checkout or even git update --init --recursive?). Perhaps to support old >git versions. Some of this dates from before GoCD was open-sourced, and I do have access to the internal repository from prior >to open source.

Some of that "support old git" stuff can definitely be removed now, if we have confidence there is a better/safer/easier way. You >can probably get an idea why I am a bit nervous to go too deep though. There are a lot of scenarios to validate, and requires >some time to see how many of these the automated tests cover.

There is another idea - it is not so easy to implement, but it will solve this problem. Would it be possible to make a preprepare script to add custom commands before preparing the material? Perhaps this will give more flexibility to users.

@chadlwilson
Copy link
Member

chadlwilson commented Jun 6, 2024

Yeah, to be clear, I am OK dropping support for old git versions.

GoCD doesn't have the engineering input/capacity to care about this type of stuff (ancient git), so as long as it supports the versions that are available on standard repos for the oldest LTS linux distros we have container images for (Debian 11, RHEL 8, Ubuntu 20) it's fine to me. Which I think basically means Git 2.25+ from ~2020 era (from Ubuntu 20.04) would be reasonable.

@chadlwilson
Copy link
Member

There is another idea - it is not so easy to implement, but it will solve this problem. Would it be possible to make a preprepare script to add custom commands before preparing the material? Perhaps this will give more flexibility to users.

I'd rather not do this.

Some operations need to be performed on the server, and some on the agent. These have different needs, and as soon as you give control to the user you have all manners of "broken" that a repo or its working copy could be in before GoCD touches it.

That's a support nightmare and it'd make it actively harder to implement things GoCD really should support (but does not right now), such as partial clones.

@AlenaKarp
Copy link
Author

Yo!
I tried the git reset --hard --recurse-submodule command and unfortunately it didn't work - crashes with the same error as it is now.

изображение

@chadlwilson
Copy link
Member

It seems rather odd to me that that command could give not a git repository whereas doing it manually yourself inside the submodule or via git submodule foreach does not? This is insanity!

@AlenaKarp
Copy link
Author

Hey, I have not come up with new solutions, but I made a test repo with 30 submodules, you can use it. It's just a random data inside. https://github.com/AlenaKarp/submodules_owner.git

I reproduced the bug on it, there is the state of the repository after an interrupted update in the archive. If possible, please try it too. test_submodules_invalid.zip

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants