Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix git submodule tracking #80456

Open
wants to merge 2 commits into
base: devel
Choose a base branch
from

Conversation

theunkn0wn1
Copy link

@theunkn0wn1 theunkn0wn1 commented Apr 8, 2023

  • Fix detection of submodule remote versions
  • Add tests for affected code path
SUMMARY

Due to the lack of movement on #77978, I decided to rebase that PR against devel and implement the requested unit tests.

Closes #77978
Fixes #77691
Fixes ansible/awx#12293
Fixes ansible/awx#12465

ISSUE TYPE
  • Bugfix Pull Request
COMPONENT NAME

ansible.builtin.git module

ADDITIONAL INFORMATION

Issue: as implemented on devel, the git module's track_submodules: yes implementation is defective.
The implementation ALWAYS tracks the branch origin/master on the submodules when updating an existing git repository on the target.
There are two fatal assumptions made by this implementation:

  1. all git remotes have a master branch. In recent times, Git and Github have opted to generate new repositories with main branches by default.
  2. All ansible users will always want to deploy the master branch. This disregards .gitmodules, and the assumption does not hold in all use cases. I, for one, have non-default branches pinned in .gitmodules in production codebases.

Combing these factors, We arrive at scenarios where downstream usages such as AWX Tower can fail in scenarios where master does not exist for the git submodule, or the wrong branch gets deployed by ansible.

Original additional information follows:

Before the git module executed this code to detect the remote versions:

git submodule foreach rev-parse [remote from config or "origin"]/master

With this patch the git module loops over the detected submodules and uses the configured branch name
from .gitmodules. Also the remote is now always origin because the submodules do not seem to care about
the remote name in the parent. Their remote is always called origin.

# in each submodule directory
git rev-parse origin/[branch name from .gitmodules or "master"]

@ansibot ansibot added affects_2.16 bug This issue/PR relates to a bug. module This issue/PR relates to a module. needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. needs_triage Needs a first human triage before being processed. new_contributor This PR is the first contribution by a new community member. labels Apr 8, 2023
@s-hertel s-hertel removed the needs_triage Needs a first human triage before being processed. label Apr 11, 2023
@ansibot ansibot added the stale_ci This PR has been tested by CI more than one week ago. Close and re-open this PR to get it retested. label Apr 20, 2023
@ansibot ansibot removed needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. stale_ci This PR has been tested by CI more than one week ago. Close and re-open this PR to get it retested. labels May 5, 2023
@ansibot
Copy link
Contributor

ansibot commented May 5, 2023

The test ansible-test sanity --test pep8 [explain] failed with 1 error:

lib/ansible/modules/git.py:575:10: E225: missing whitespace around operator

click here for bot help

@ansibot ansibot added ci_verified Changes made in this PR are causing tests to fail. needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. and removed ci_verified Changes made in this PR are causing tests to fail. needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. labels May 5, 2023
@ansibot ansibot added the stale_ci This PR has been tested by CI more than one week ago. Close and re-open this PR to get it retested. label May 18, 2023
@theunkn0wn1
Copy link
Author

@s-hertel What Do I need to do to get this merged?
(tagged since you triaged this PR)

@cberg2048
Copy link

Hi, I am also interested in an update on this I believe it was first brought up a year ago now in #77978

Thank you,

Copy link
Contributor

@s-hertel s-hertel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking the time to work on this.

submodule_revisions_remote = {}
for submodule_name in submodule_revisions:
submodule_path = get_submodule_config(git_path, module, dest, submodule_name, 'path')
if not submodule_path:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't look like it's possible to hit this. I'm guessing this shouldn't be defaulting to the legacy default branch name though, since it's a path.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lots of legacy codebases still use master, and many git versions still init repositories with master and not main.

I do agree though, either the default should go or this check should go.

I am leaning towards removing the default and leaving the error check, since its less likely to cause misinterpretations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bug is that if not submodule_path cannot be truthy because the path config is defaulting to a branch name. Unless there's a default submodule path convention, then get_submodule_config() should be refactored to only conditionally use a default (maybe by passing in an optional default similar to dictionary.get('key', 'default_value')) or only setting the default if the config == 'branch'. Using a branch name as the default submodule directory name seems like it would generally be incorrect, and is adding the legacy branch name to new code.

Changing the default for the other config (branch) would need a deprecation period since just changing it will break any playbooks relying on it. That would be good to add if you're interested (or I can as a follow-up), since it wasn't really possible without this fix.

module.fail_json(msg='Unable to detect path of submodule: %s' % submodule_name)

submodule_branch = get_submodule_config(git_path, module, dest, submodule_name, 'branch')
if not submodule_branch:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also unused, since the submodule_branch is set to a default.

module.fail_json(msg='Unable to detect branch of submodule: %s' % submodule_name)

submodule_path = os.path.join(dest, submodule_path)
revision = get_version(module, git_path, submodule_path, '%s/%s' % ('origin', submodule_branch))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this hardcoding 'origin' instead of using the remote specified by the user?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.gitmodules does not appear to list the origin.

[submodule "submodule3"]
        path = submodule3
        url = https://github.com/theunkn0wn1/ansible_test_submodules_subm3.git
        branch = user/theunkn0wn1/random_branch
[submodule "submodule4"]
        path = submodule4
        url = https://github.com/theunkn0wn1/ansible_test_submodules_subm4.git
        branch = main

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't hardcoded to 'origin' before your change, the value from the module parameter remote was used.

Comment on lines +39 to +41
repo_submodules_b: 'https://github.com/theunkn0wn1/ansible_test_submodules_monorepo.git'
repo_submodule3: 'https://github.com/theunkn0wn1/ansible_test_submodules_subm3.git'
repo_submodule4: 'https://github.com/theunkn0wn1/ansible_test_submodules_subm4.git'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we should add more external repos for these tests to depend on. It would probably be better for these to be added to an ansible-managed org.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to move these to an ansible-managed org, please do.
I have no such access, and the existing test suite repositories lack the necessary coverage

lib/ansible/modules/git.py Show resolved Hide resolved
@steinzi
Copy link

steinzi commented Aug 7, 2023

Just so I'm reading this 100% correctly, the current "workaround" change the submodule repo to use master, and I should be good to go while we wait for this thing to be resolved> :)

@theunkn0wn1
Copy link
Author

Just so I'm reading this 100% correctly, the current "workaround" change the submodule repo to use master, and I should be good to go while we wait for this thing to be resolved> :)

The current "workaround" is ensure your submodules have a master branch, and expect that branch to be deployed.

@dw-seanelliott
Copy link

Is this just waiting on some repos to get moved so to the ansible org? or can this get merged now. Its been a headache for us for awhile now..
Side note, we have been working around this by using the delete flag. Takes longer but is working.

@cberg2048
Copy link

Hello! I see this has made some great progress! At this point is there any way the community can support getting this merged? I think the next action is for someone to take the testing repositories and move.re-create them in the Ansible org, and I don't think the community can do that. Please correct me if I am wrong, and also if there is anything else we can do to help.

@theunkn0wn1
Copy link
Author

theunkn0wn1 commented Oct 19, 2023

Hello! I see this has made some great progress! At this point is there any way the community can support getting this merged? I think the next action is for someone to take the testing repositories and move.re-create them in the Ansible org, and I don't think the community can do that. Please correct me if I am wrong, and also if there is anything else we can do to help.

I think at this point this is blocked on two issues:

  1. I haven't had the bandwidth to address the review comments wrt python issues (though the PR works as-is)
  2. I can't address the comments on using non-ansible repositories, since I don't have the rights to move them under the ansible org. None of the existing ansible-org repositories can be used as a test case. ** Someone from ansible-org will need to take ownership of the repositories (I will happily have them transferred, or for Ansible to fork it into their namespace), or for them to accept my repositories as-is **.

@s-hertel
Copy link
Contributor

We could probably add tests for this bug without creating any new Github repos, which would make it much easier to maintain the tests. The git tests use local repos to exercise some code paths: https://github.com/ansible/ansible/blob/devel/test/integration/targets/git/tasks/setup-local-repos.yml, can we do the same for these? Otherwise the test could use a local git server.

The main blocker is just the backwards incompatible changes mentioned in my review. It does not works as-is for some cases.

@vamirreza
Copy link

Dears, do you have any estimate for when this PR will be merged?

@ansibot ansibot added the needs_ci This PR requires CI testing to be performed. Please close and re-open this PR to trigger CI. label Aug 3, 2024
@webknjaz
Copy link
Member

webknjaz commented Aug 5, 2024

This PR needs to be rebased.

/azp run

@webknjaz
Copy link
Member

webknjaz commented Aug 9, 2024

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@ansibot ansibot added needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. and removed needs_ci This PR requires CI testing to be performed. Please close and re-open this PR to trigger CI. stale_ci This PR has been tested by CI more than one week ago. Close and re-open this PR to get it retested. labels Aug 9, 2024
@webknjaz webknjaz added the ci_verified Changes made in this PR are causing tests to fail. label Aug 9, 2024
@webknjaz
Copy link
Member

webknjaz commented Aug 9, 2024

the PR needs to be rebased

astehlik and others added 2 commits August 13, 2024 19:24
A new method get_submodule_versions_from_remote() is added
to the git module and used to detect the remote revisions
of the configured submodules.

The new method loops over the detected submodules and determines the
remote branch of each submodule as configured in the .gitmodules
file. The default remains "master".
@theunkn0wn1 theunkn0wn1 force-pushed the user/theunkn0wn1/submodule_tracking branch from 0084f8a to f432feb Compare August 14, 2024 02:24
@theunkn0wn1
Copy link
Author

the PR needs to be rebased

rebased.

@theunkn0wn1
Copy link
Author

theunkn0wn1 commented Aug 14, 2024

We could probably add tests for this bug without creating any new Github repos, which would make it much easier to maintain the tests. The git tests use local repos to exercise some code paths: https://github.com/ansible/ansible/blob/devel/test/integration/targets/git/tasks/setup-local-repos.yml, can we do the same for these? Otherwise the test could use a local git server.

We probably can.

The main blocker is just the backwards incompatible changes mentioned in my review. It does not works as-is for some cases.

took a swing at addressing these, but am unable to run the test suite as it appears to be broken on devel (#83787 )

@ansibot ansibot removed needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. ci_verified Changes made in this PR are causing tests to fail. labels Aug 14, 2024
@ansibot ansibot added the stale_ci This PR has been tested by CI more than one week ago. Close and re-open this PR to get it retested. label Aug 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects_2.16 bug This issue/PR relates to a bug. has_issue module This issue/PR relates to a module. new_contributor This PR is the first contribution by a new community member. stale_ci This PR has been tested by CI more than one week ago. Close and re-open this PR to get it retested.
Projects
None yet
9 participants