Fix git submodule tracking #80456
base: devel
The test
|
@s-hertel What do I need to do to get this merged? |
Hi, I am also interested in an update on this. I believe it was first brought up a year ago now in #77978. Thank you, |
Thanks for taking the time to work on this.
submodule_revisions_remote = {}
for submodule_name in submodule_revisions:
    submodule_path = get_submodule_config(git_path, module, dest, submodule_name, 'path')
    if not submodule_path:
It doesn't look like it's possible to hit this. I'm guessing this shouldn't be defaulting to the legacy default branch name though, since it's a path.
Lots of legacy codebases still use `master`, and many git versions still init repositories with `master` and not `main`.
I do agree though, either the default should go or this check should go.
I am leaning towards removing the default and leaving the error check, since it's less likely to cause misinterpretations.
The bug is that `if not submodule_path` cannot be truthy because the `path` config is defaulting to a branch name. Unless there's a default submodule path convention, `get_submodule_config()` should be refactored to only conditionally use a default (maybe by passing in an optional default similar to `dictionary.get('key', 'default_value')`) or to only set the default if the config == 'branch'. Using a branch name as the default submodule directory name seems like it would generally be incorrect, and is adding the legacy branch name to new code.
Changing the default for the other config (`branch`) would need a deprecation period since just changing it will break any playbooks relying on it. That would be good to add if you're interested (or I can as a follow-up), since it wasn't really possible without this fix.
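To make the suggestion concrete, here is a minimal sketch of the "optional default" idea. It is not the module's actual code: `get_config_value` and the `configs` dictionary are hypothetical stand-ins for `get_submodule_config()` and the values read from `.gitmodules`.

```python
def get_config_value(configs, submodule_name, key, default=None):
    """Look up one submodule setting; fall back to `default` only when asked to.

    `configs` is a hypothetical mapping of submodule name -> settings dict,
    standing in for whatever the real helper reads from .gitmodules.
    """
    value = configs.get(submodule_name, {}).get(key)
    return value if value else default


# Hypothetical data: one submodule pins a branch but has no path recorded.
configs = {"example": {"branch": "user/theunkn0wn1/random_branch"}}

# No default for 'path': a missing value stays falsy, so an error check can fire.
path = get_config_value(configs, "example", "path")

# The legacy default is requested explicitly, and only for 'branch'.
pinned = get_config_value(configs, "example", "branch", default="master")
fallback = get_config_value(configs, "other", "branch", default="master")
```

With this shape, `if not submodule_path:` becomes reachable again, while callers that rely on the legacy `master` branch default keep their current behaviour.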
        module.fail_json(msg='Unable to detect path of submodule: %s' % submodule_name)

    submodule_branch = get_submodule_config(git_path, module, dest, submodule_name, 'branch')
    if not submodule_branch:
Also unused, since the submodule_branch is set to a default.
        module.fail_json(msg='Unable to detect branch of submodule: %s' % submodule_name)

    submodule_path = os.path.join(dest, submodule_path)
    revision = get_version(module, git_path, submodule_path, '%s/%s' % ('origin', submodule_branch))
Why is this hardcoding 'origin' instead of using the remote specified by the user?
.gitmodules does not appear to list the origin.
[submodule "submodule3"]
path = submodule3
url = https://github.com/theunkn0wn1/ansible_test_submodules_subm3.git
branch = user/theunkn0wn1/random_branch
[submodule "submodule4"]
path = submodule4
url = https://github.com/theunkn0wn1/ansible_test_submodules_subm4.git
branch = main
It wasn't hardcoded to `'origin'` before your change; the value from the module parameter `remote` was used.
repo_submodules_b: 'https://github.com/theunkn0wn1/ansible_test_submodules_monorepo.git'
repo_submodule3: 'https://github.com/theunkn0wn1/ansible_test_submodules_subm3.git'
repo_submodule4: 'https://github.com/theunkn0wn1/ansible_test_submodules_subm4.git'
I'm not sure if we should add more external repos for these tests to depend on. It would probably be better for these to be added to an ansible-managed org.
If you want to move these to an ansible-managed org, please do.
I have no such access, and the existing test suite repositories lack the necessary coverage
Just so I'm reading this 100% correctly, the current "workaround" is to change the submodule repo to use master, and I should be good to go while we wait for this thing to be resolved? :) |
The current "workaround" is to ensure your submodules have a |
Is this just waiting on some repos to get moved to the ansible org, or can this get merged now? It's been a headache for us for a while now. |
Hello! I see this has made some great progress! At this point is there any way the community can support getting this merged? I think the next action is for someone to take the testing repositories and move/re-create them in the Ansible org, and I don't think the community can do that. Please correct me if I am wrong, and also if there is anything else we can do to help. |
I think at this point this is blocked on two issues:
|
We could probably add tests for this bug without creating any new GitHub repos, which would make it much easier to maintain the tests. The git tests use local repos to exercise some code paths: https://github.com/ansible/ansible/blob/devel/test/integration/targets/git/tasks/setup-local-repos.yml. Can we do the same for these? Otherwise the test could use a local git server. The main blocker is just the backwards-incompatible changes mentioned in my review. It does not work as-is for some cases. |
Dears, do you have any estimate for when this PR will be merged? |
This PR needs to be rebased. /azp run |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
the PR needs to be rebased |
A new method get_submodule_versions_from_remote() is added to the git module and used to detect the remote revisions of the configured submodules. The new method loops over the detected submodules and determines the remote branch of each submodule as configured in the .gitmodules file. The default remains "master".
0084f8a to f432feb
rebased. |
We probably can.
Took a swing at addressing these, but I am unable to run the test suite as it appears to be broken on devel (#83787).
SUMMARY
Due to the lack of movement on #77978, I decided to rebase that PR against devel and implement the requested unit tests.
Closes #77978
Fixes #77691
Fixes ansible/awx#12293
Fixes ansible/awx#12465
ISSUE TYPE
COMPONENT NAME
ansible.builtin.git module
ADDITIONAL INFORMATION
Issue: as implemented on `devel`, the `git` module's `track_submodules: yes` implementation is defective.
The implementation ALWAYS tracks the branch `origin/master` on the submodules when updating an existing git repository on the target.
There are two fatal assumptions made by this implementation:
1. The submodule has a `master` branch. In recent times, Git and GitHub have opted to generate new repositories with `main` branches by default.
2. The user wants to track the `master` branch. This disregards `.gitmodules`, and the assumption does not hold in all use cases. I, for one, have non-default branches pinned in `.gitmodules` in production codebases.
Combining these factors, we arrive at scenarios where downstream usages such as AWX Tower can fail because `master` does not exist for the git submodule, or the wrong branch gets deployed by Ansible.
Original additional information follows:
Before this patch, the git module executed this command to detect the remote versions:
git submodule foreach rev-parse [remote from config or "origin"]/master
With this patch the git module loops over the detected submodules and uses the configured branch name
from .gitmodules. Also the remote is now always origin because the submodules do not seem to care about
the remote name in the parent. Their remote is always called origin.
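As a rough illustration of the per-submodule lookup described above, the sketch below parses `.gitmodules` text with Python's standard `configparser` and builds the remote-tracking ref for each submodule. It is not the code in this PR: `submodule_remote_refs` is a hypothetical helper, the `remote` parameter is an assumption (the patch as described always uses `origin`), and the sample text is the `.gitmodules` content quoted in the review.

```python
import configparser
import re

# Sample .gitmodules content, as quoted in the review discussion.
GITMODULES_TEXT = """\
[submodule "submodule3"]
path = submodule3
url = https://github.com/theunkn0wn1/ansible_test_submodules_subm3.git
branch = user/theunkn0wn1/random_branch
[submodule "submodule4"]
path = submodule4
url = https://github.com/theunkn0wn1/ansible_test_submodules_subm4.git
branch = main
"""


def submodule_remote_refs(gitmodules_text, remote="origin", default_branch="master"):
    """Map each submodule path to the '<remote>/<branch>' ref it should track.

    Hypothetical helper for illustration; `remote` and `default_branch`
    are assumptions, not parameters of the real module code.
    """
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.read_string(gitmodules_text)

    refs = {}
    for section in parser.sections():
        match = re.fullmatch(r'submodule "(.+)"', section)
        if not match:
            continue  # ignore sections that do not describe a submodule
        name = match.group(1)
        path = parser.get(section, "path", fallback=None)
        if not path:
            raise ValueError("Unable to detect path of submodule: %s" % name)
        # Only the branch falls back to the legacy default.
        branch = parser.get(section, "branch", fallback=default_branch)
        refs[path] = "%s/%s" % (remote, branch)
    return refs


refs = submodule_remote_refs(GITMODULES_TEXT)
```

Each resulting ref could then be resolved inside the corresponding submodule checkout, instead of resolving a single `<remote>/master` ref for every submodule.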