git module: Failed to init/update submodules: warning: could not look up configuration 'remote.origin.url' #83146

Open
1 task done
wtcline-intc opened this issue Apr 26, 2024 · 9 comments
Labels
affects_2.16 bug This issue/PR relates to a bug. module This issue/PR relates to a module. needs_verified This issue needs to be verified/reproduced by maintainer

@wtcline-intc

Summary

The git module fails to properly clone submodules when using:

  1. A relative submodule path
  2. A non-standard value for remote
  3. A non-standard value for version

I am able to consistently reproduce the issue with the "Steps to Reproduce" below. My hypothesis is that something along the following lines is happening:

  1. The git module tries to clone the repo submodule, but it doesn't know what protocol to use because the git submodule URL is relative
  2. The git module is then hardcoded to find the protocol via remote.origin.url, not taking into account that the remote is github, and thus fails to find the correct protocol for cloning the repo submodule.
  3. The git module tries to clone with a relative path, which fails because it's supposed to use HTTPS.

I don't know how the version field plays into this, but the module works fine when a custom version field is not specified. It also works fine if I keep the version field and remove the remote field. It's only the combination which appears to cause the issue.

Issue Type

Bug Report

Component Name

git

Ansible Version

$ ansible --version
ansible [core 2.16.6]
  config file = None
  configured module search path = ['/home/wtcline/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/wtcline/.local/lib/python3.10/site-packages/ansible
  ansible collection location = /home/wtcline/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/bin/ansible
  python version = 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] (/usr/bin/python3)
  jinja version = 3.0.3
  libyaml = True

Configuration

# if using a version older than ansible-core 2.12 you should omit the '-t all'
$ ansible-config dump --only-changed -t all
CONFIG_FILE() = None
EDITOR(env: EDITOR) = vim

OS / Environment

Ubuntu 22.04 LTS

Steps to Reproduce

I created two public repos that are capable of reproducing the problem:

The following playbook tries to clone the main repo with its submodule and then fails.

- name: Checkout repos
  hosts: localhost
  tasks:
  - git:
      repo: https://github.com/wtcline-intc/submod-relpath-test-mainrepo.git
      dest: submod-relpath-test-mainrepo
      remote: github
      version: "v0.1"

Expected Results

I expected the main repo to be cloned with its source (remote) named github and its submodule cloned as expected (though the submodule's remote would be named origin); instead, the submodules were not cloned!

Actual Results

[WARNING]: No inventory was parsed, only implicit localhost is available
[WARNING]: provided hosts list is empty, only localhost is available. Note that
the implicit localhost does not match 'all'
ansible-playbook [core 2.16.6]
  config file = None
  configured module search path = ['/home/wtcline/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/wtcline/.local/lib/python3.10/site-packages/ansible
  ansible collection location = /home/wtcline/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/bin/ansible-playbook
  python version = 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] (/usr/bin/python3)
  jinja version = 3.0.3
  libyaml = True
No config file found; using defaults
setting up inventory plugins
Loading collection ansible.builtin from 
host_list declined parsing /etc/ansible/hosts as it did not pass its verify_file() method
Skipping due to inventory source not existing or not being readable by the current user
script declined parsing /etc/ansible/hosts as it did not pass its verify_file() method
auto declined parsing /etc/ansible/hosts as it did not pass its verify_file() method
Skipping due to inventory source not existing or not being readable by the current user
yaml declined parsing /etc/ansible/hosts as it did not pass its verify_file() method
Skipping due to inventory source not existing or not being readable by the current user
ini declined parsing /etc/ansible/hosts as it did not pass its verify_file() method
Skipping due to inventory source not existing or not being readable by the current user
toml declined parsing /etc/ansible/hosts as it did not pass its verify_file() method
Loading callback plugin default of type stdout, v2.0 from /home/wtcline/.local/lib/python3.10/site-packages/ansible/plugins/callback/default.py
Skipping callback 'default', as we already have a stdout callback.
Skipping callback 'minimal', as we already have a stdout callback.
Skipping callback 'oneline', as we already have a stdout callback.

PLAYBOOK: main.yaml ************************************************************
Positional arguments: main.yaml
verbosity: 4
connection: ssh
become_method: sudo
tags: ('all',)
inventory: ('/etc/ansible/hosts',)
forks: 5
1 plays in main.yaml

PLAY [Checkout repos] **********************************************************

TASK [Gathering Facts] *********************************************************
task path: /home/wtcline/main.yaml:1
<127.0.0.1> ESTABLISH LOCAL CONNECTION FOR USER: wtcline
<127.0.0.1> EXEC /bin/sh -c 'echo ~wtcline && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /home/wtcline/.ansible/tmp `"&& mkdir "` echo /home/wtcline/.ansible/tmp/ansible-tmp-1714095341.5799274-5820-229251185465971 `" && echo ansible-tmp-1714095341.5799274-5820-229251185465971="` echo /home/wtcline/.ansible/tmp/ansible-tmp-1714095341.5799274-5820-229251185465971 `" ) && sleep 0'
Using module file /home/wtcline/.local/lib/python3.10/site-packages/ansible/modules/setup.py
<127.0.0.1> PUT /home/wtcline/.ansible/tmp/ansible-local-5816xbdj7yow/tmp2hh3hnw7 TO /home/wtcline/.ansible/tmp/ansible-tmp-1714095341.5799274-5820-229251185465971/AnsiballZ_setup.py
<127.0.0.1> EXEC /bin/sh -c 'chmod u+x /home/wtcline/.ansible/tmp/ansible-tmp-1714095341.5799274-5820-229251185465971/ /home/wtcline/.ansible/tmp/ansible-tmp-1714095341.5799274-5820-229251185465971/AnsiballZ_setup.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '/usr/bin/python3 /home/wtcline/.ansible/tmp/ansible-tmp-1714095341.5799274-5820-229251185465971/AnsiballZ_setup.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c 'rm -f -r /home/wtcline/.ansible/tmp/ansible-tmp-1714095341.5799274-5820-229251185465971/ > /dev/null 2>&1 && sleep 0'
ok: [localhost]

TASK [git] *********************************************************************
task path: /home/wtcline/main.yaml:4
<127.0.0.1> ESTABLISH LOCAL CONNECTION FOR USER: wtcline
<127.0.0.1> EXEC /bin/sh -c 'echo ~wtcline && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /home/wtcline/.ansible/tmp `"&& mkdir "` echo /home/wtcline/.ansible/tmp/ansible-tmp-1714095342.2404609-5881-217583858493159 `" && echo ansible-tmp-1714095342.2404609-5881-217583858493159="` echo /home/wtcline/.ansible/tmp/ansible-tmp-1714095342.2404609-5881-217583858493159 `" ) && sleep 0'
Using module file /home/wtcline/.local/lib/python3.10/site-packages/ansible/modules/git.py
<127.0.0.1> PUT /home/wtcline/.ansible/tmp/ansible-local-5816xbdj7yow/tmp8xkgvqnk TO /home/wtcline/.ansible/tmp/ansible-tmp-1714095342.2404609-5881-217583858493159/AnsiballZ_git.py
<127.0.0.1> EXEC /bin/sh -c 'chmod u+x /home/wtcline/.ansible/tmp/ansible-tmp-1714095342.2404609-5881-217583858493159/ /home/wtcline/.ansible/tmp/ansible-tmp-1714095342.2404609-5881-217583858493159/AnsiballZ_git.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '/usr/bin/python3 /home/wtcline/.ansible/tmp/ansible-tmp-1714095342.2404609-5881-217583858493159/AnsiballZ_git.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c 'rm -f -r /home/wtcline/.ansible/tmp/ansible-tmp-1714095342.2404609-5881-217583858493159/ > /dev/null 2>&1 && sleep 0'
fatal: [localhost]: FAILED! => {
    "changed": false,
    "invocation": {
        "module_args": {
            "accept_hostkey": false,
            "accept_newhostkey": false,
            "archive": null,
            "archive_prefix": null,
            "bare": false,
            "clone": true,
            "depth": null,
            "dest": "submod-relpath-test-mainrepo",
            "executable": null,
            "force": false,
            "gpg_whitelist": [],
            "key_file": null,
            "recursive": true,
            "reference": null,
            "refspec": null,
            "remote": "github",
            "repo": "https://github.com/wtcline-intc/submod-relpath-test-mainrepo.git",
            "separate_git_dir": null,
            "single_branch": false,
            "ssh_opts": null,
            "track_submodules": false,
            "umask": null,
            "update": true,
            "verify_commit": false,
            "version": "v0.1"
        }
    },
    "msg": "Failed to init/update submodules: warning: could not look up configuration 'remote.origin.url'. Assuming this repository is its own authoritative upstream.\nSubmodule 'submod-relpath-test-submodule' (/home/wtcline/submod-relpath-test-submodule.git) registered for path 'submod-relpath-test-submodule'\nfatal: repository '/home/wtcline/submod-relpath-test-submodule.git' does not exist\nfatal: clone of '/home/wtcline/submod-relpath-test-submodule.git' into submodule path '/home/wtcline/submod-relpath-test-mainrepo/submod-relpath-test-submodule' failed\nFailed to clone 'submod-relpath-test-submodule'. Retry scheduled\nfatal: repository '/home/wtcline/submod-relpath-test-submodule.git' does not exist\nfatal: clone of '/home/wtcline/submod-relpath-test-submodule.git' into submodule path '/home/wtcline/submod-relpath-test-mainrepo/submod-relpath-test-submodule' failed\nFailed to clone 'submod-relpath-test-submodule' a second time, aborting\n"
}

PLAY RECAP *********************************************************************
localhost                  : ok=1    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0

Code of Conduct

  • I agree to follow the Ansible Code of Conduct
@ansibot ansibot added bug This issue/PR relates to a bug. needs_triage Needs a first human triage before being processed. affects_2.16 module This issue/PR relates to a module. labels Apr 26, 2024

@s-hertel
Contributor

It looks like the initial git clone succeeds, but eventually git submodule update --init --recursive fails.

The .git/config for the main repo looks like this before the failing call:

$ cat /tmp/submod-relpath-test-mainrepo/.git/config
[core]
	repositoryformatversion = 0
	filemode = true
	bare = false
	logallrefupdates = true
[remote "github"]
	url = https://github.com/wtcline-intc/submod-relpath-test-mainrepo.git
	fetch = +refs/heads/*:refs/remotes/github/*
[branch "master"]
	remote = github
	merge = refs/heads/master

But the checkout is not on the branch with remote = github, it's on v0.1 (detached HEAD), so git itself uses origin.
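
The remote-selection rule described here can be modeled in a few lines of Python. This is only a sketch of the documented behavior (the current branch's configured remote, falling back to origin), not git's actual implementation, and newer git versions may behave differently. The function name `default_remote` and the dictionary-based config are assumptions made for illustration.

```python
from typing import Mapping, Optional


def default_remote(current_branch: Optional[str], config: Mapping[str, str]) -> str:
    """Model of the documented rule: use branch.<name>.remote, else "origin".

    `current_branch` is None for a detached HEAD. `config` is a flat mapping of
    git config keys to values (a hypothetical stand-in for .git/config).
    """
    if current_branch is None:
        # Detached HEAD: there is no branch whose remote could be consulted.
        return "origin"
    return config.get(f"branch.{current_branch}.remote", "origin")


# Mirrors the .git/config shown above.
config = {"branch.master.remote": "github"}
print(default_remote("master", config))  # on the master branch
print(default_remote(None, config))      # detached HEAD at v0.1
```

With the config above, `master` resolves to `github`, while the detached HEAD at v0.1 falls back to `origin`, a remote that does not exist in this clone.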

I think it should work if you set track_submodules: true, and edit the .gitmodules to look like this:

[submodule "submod-relpath-test-submodule"]
        path = submod-relpath-test-submodule
        url = ../submod-relpath-test-submodule.git
        remote = github

This should add the --remote option to git submodule update --init --recursive.

@mkrizek mkrizek added needs_verified This issue needs to be verified/reproduced by maintainer and removed needs_triage Needs a first human triage before being processed. labels Apr 30, 2024
@s-hertel s-hertel added the needs_info This issue requires further information. Please answer any outstanding questions. label Apr 30, 2024
@wtcline-intc
Author

It looks like the initial git clone succeeds, but eventually git submodule update --init --recursive fails.

The .git/config for the main repo looks like this before the failing call:

...

But the checkout is not on the branch with remote = github, it's on v0.1 (detached HEAD), so git itself uses origin.

Oh, I see what you mean. From man git-submodule: "The remote used is branch's remote (branch.<name>.remote), defaulting to origin." The following series of shell commands reproduces the issue:

$ git clone -o github https://github.com/wtcline-intc/submod-relpath-test-mainrepo.git
$ git checkout v0.1
$ git submodule update --init --recursive
...
file:.git/config        submodule.submod-relpath-test-submodule.url=/home/wtcline/test1/submod-relpath-test-submodule.git

In interactive sessions, though, I've always happened to run something like the following, which works:

$ git clone -o github https://github.com/wtcline-intc/submod-relpath-test-mainrepo.git
$ git submodule update --init --recursive
$ git checkout v0.1
$ git submodule update --init --recursive
...
file:.git/config        submodule.submod-relpath-test-submodule.url=https://github.com/wtcline-intc/submod-relpath-test-submodule.git

Turns out another valid option is:

$ git clone -o github --recurse-submodules https://github.com/wtcline-intc/submod-relpath-test-mainrepo.git
$ git checkout v0.1
$ git submodule update --init --recursive

This feels more like a faulty git assumption/limitation than an Ansible bug...
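
The ordering constraint in the last working sequence can be captured as a small command builder. This is a sketch only: the function name `clone_then_checkout_commands` is hypothetical, `git -C <dest>` stands in for running the commands inside the clone directory, and it does not reflect what the Ansible git module actually executes.

```python
from typing import List


def clone_then_checkout_commands(repo: str, dest: str, remote: str, version: str) -> List[List[str]]:
    """Build the argv lists for the working sequence from the thread.

    Submodules are cloned during the initial clone, before HEAD is detached
    by checking out the tag.
    """
    return [
        # Clone with a custom remote name and fetch submodules right away.
        ["git", "clone", "-o", remote, "--recurse-submodules", repo, dest],
        # Only now move to the tag, which leaves HEAD detached.
        ["git", "-C", dest, "checkout", version],
        # Bring submodules in line with the checked-out tag.
        ["git", "-C", dest, "submodule", "update", "--init", "--recursive"],
    ]


for argv in clone_then_checkout_commands(
    "https://github.com/wtcline-intc/submod-relpath-test-mainrepo.git",
    "submod-relpath-test-mainrepo",
    "github",
    "v0.1",
):
    print(" ".join(argv))
```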

I think it should work if you set track_submodules: true, and edit the .gitmodules to look like this:

[submodule "submod-relpath-test-submodule"]
        path = submod-relpath-test-submodule
        url = ../submod-relpath-test-submodule.git
        remote = github

I don't believe it's a feasible workaround to specify the remote's name as part of the submodule metadata in the main repo since that would affect all users of the main repo. I'm also not trying to check out the latest version of a submodule; in fact, the reason for specifying a tag and being in a detached HEAD is to explicitly lock in a given version of the main repo and its submodules.

@ansibot ansibot removed the needs_info This issue requires further information. Please answer any outstanding questions. label May 1, 2024
@s-hertel
Contributor

s-hertel commented May 3, 2024

I don't believe it's a feasible workaround to specify the remote's name as part of the submodule metadata...

OK, it makes sense that setting a remote-tracking branch won't work, since you don't want the latest commit on the branch. I meant to configure a branch for the submodule (which should have a configured remote), but wrote the example incorrectly.

Turns out another valid option is:

$ git clone -o github --recurse-submodules https://github.com/wtcline-intc/submod-relpath-test-mainrepo.git

I can also reproduce the issue by cloning a repo without a submodule, and then updating to a tag/commit with a submodule. I feel like the same task should work regardless of the initial state of the repo (whether cloning or recursively updating) since a remote is specified, but it does seem like there's no way to use a non-origin remote after cloning for this corner case, and this is just a git limitation.

Adding --recurse-submodules by default when recursive: true and bare: false seems to break the existing tests for submodules. Here's a branch with a backwards-compatible fix plus a warning for the corner case: devel...s-hertel:git-fix-relative-submodules. A new option would not be backportable, though, so I'm still looking for a way to fix your scenario without changing the existing tests. It seems like the module should at least have documentation on how a non-origin remote might impact updating submodules.

@wtcline-intc
Author

I can also reproduce the issue by cloning a repo without a submodule, and then updating to a tag/commit with a submodule. I feel like the same task should work regardless of the initial state of the repo (whether cloning or recursively updating) since a remote is specified, but it does seem like there's no way to use a non-origin remote after cloning for this corner case, and this is just a git limitation.

What I would expect is something like git clone -o ${REMOTE} to set a configuration option, say, core.remote.default=${REMOTE}, and then for git submodule update --init --recursive to go through its usual resolution process except using core.remote.default before defaulting to origin if said configuration value is undefined. Or just look through remote.*.url and pick an existing entry before defaulting to origin. Or have an argument for git submodule update that allows specifying a default remote. It really feels like a git issue to me.

Adding --recurse-submodules by default when recursive: true and bare: false seems to break the existing tests for submodules. Here's a branch with a backwards-compatible fix plus a warning for the corner case: devel...s-hertel:git-fix-relative-submodules. A new option would not be backportable, though, so I'm still looking for a way to fix your scenario without changing the existing tests. It seems like the module should at least have documentation on how a non-origin remote might impact updating submodules.

That might work, though I wonder if it's a good idea to add an option for a corner-case that git should be handling. I'd change:

        description:
            - Option to clone with --recurse-submodules. By default submodules are
              are synced and updated after checking out O(version).

to something like: "Option to perform the initial clone with --recurse-submodules. This can help resolve relative submodule paths. By default submodules will then be synced and updated after the initial clone of O(version)."

It might also make sense to try and fix the configuration. A good clone sets configuration:

submodule.submod-relpath-test-submodule.active=true
submodule.submod-relpath-test-submodule.url=https://github.com/wtcline-intc/submod-relpath-test-submodule.git

while a bad one does:

submodule.submod-relpath-test-submodule.active=true
submodule.submod-relpath-test-submodule.url=/home/wtcline/test2/submod-relpath-test-submodule.git

Since the module knows what the remote is at module execution time (remote.github.url=https://github.com/wtcline-intc/submod-relpath-test-mainrepo.git), it should be able to validate the submodule URL against the configured remote's URL. Though these options aren't set until the initial submodule update fails and I'm not sure of a way to force git to set them without calling a submodule update. There could also be interesting corner-cases with recursion. So maybe clone_recurse_submodules is the best option.
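
A check along these lines might look like the following sketch. It assumes the configuration is available as `key=value` lines (the form shown above), and it uses a deliberately crude heuristic: a submodule URL without a `scheme://` prefix is flagged when the remote URL has one. The function name `suspicious_submodule_urls` is hypothetical. The heuristic would also flag scp-style URLs and intentionally local submodules, so it could only serve as a warning, not as proof of a bad clone.

```python
from typing import Dict, Iterable


def suspicious_submodule_urls(config_lines: Iterable[str], remote_url: str) -> Dict[str, str]:
    """Return {submodule name: url} for URLs that look local while the remote is not.

    Heuristic only: "looks local" means the value has no "scheme://" prefix.
    """
    suspicious: Dict[str, str] = {}
    remote_has_scheme = "://" in remote_url
    for line in config_lines:
        key, sep, value = line.strip().partition("=")
        # Only submodule.<name>.url entries are of interest.
        if not sep or not key.startswith("submodule.") or not key.endswith(".url"):
            continue
        name = key[len("submodule."):-len(".url")]
        if remote_has_scheme and "://" not in value:
            suspicious[name] = value
    return suspicious


remote = "https://github.com/wtcline-intc/submod-relpath-test-mainrepo.git"
bad_clone = [
    "submodule.submod-relpath-test-submodule.active=true",
    "submodule.submod-relpath-test-submodule.url=/home/wtcline/test2/submod-relpath-test-submodule.git",
]
print(suspicious_submodule_urls(bad_clone, remote))
```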

@s-hertel
Contributor

s-hertel commented May 4, 2024

Setting the configuration means we'd need to parse the .gitmodules configuration ourselves, unless there's a git helper we can easily use. It's also totally valid to have a submodule relative to the CWD, so I think anything we attempt may need to be done as a fallback to the existing behavior...

@wtcline-intc
Author

Setting the configuration means we'd need to parse the .gitmodules configuration ourselves, unless there's a git helper we can easily use.

The submodule.${NAME}.url can be queried and set with git config --local, but the entry doesn't appear until running git submodule update --init --recursive; there may be another way to make the entry appear...

It's also totally valid to have a submodule relative to the CWD, so I think anything we attempt may need to be done as a fallback to the existing behavior...

Well, a relative submodule URL is assumed to be relative to the default remote's URL, so if the module sets the submodule paths based off of remote's URL then this should work just fine when the remote is file:// or a path.

A side-benefit to fixing the URLs is that it might allow one to change the repo field schema and have the submodules automatically update without relying on what was set in the initial clone. Though it might not be worth attempting if the code isn't maintainable.

@s-hertel
Contributor

s-hertel commented May 7, 2024

The submodule.${NAME}.url can be queried...

I guess I wasn't clear, but I meant a utility to do the relative URL/path conversion for .gitmodules urls would help here. There was a helper before 2.34.0:

$ git submodule--helper resolve-relative-url ../submod-relpath-test-submodule.git
https://github.com/wtcline-intc/submod-relpath-test-submodule.git
$ git checkout v0.1
$ git submodule--helper resolve-relative-url ../submod-relpath-test-submodule.git
/old-git/submod-relpath-test-submodule.git

I don't see a way to do something similar on recent versions.

If the Ansible module reimplements it, it can unintentionally diverge from git. I kinda think the module should just document that the submodules are updated after switching to the version to clear up the ambiguity.
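
For reference, a reimplementation of the relative-URL resolution might look roughly like the sketch below. It is a simplified model, not git's algorithm: it only handles leading `../` and `./` components against a slash-separated base and ignores scp-style `host:path` URLs and other edge cases, which is exactly where a module-side copy could drift from git. The function name `resolve_relative_url` is borrowed from the old helper for readability; the Python signature is an assumption.

```python
def resolve_relative_url(base_url: str, relative_url: str) -> str:
    """Resolve a relative .gitmodules URL against a base URL or path (simplified)."""
    base = base_url.rstrip("/")
    rest = relative_url
    while True:
        if rest.startswith("../"):
            # Each leading "../" drops the last component of the base.
            rest = rest[3:]
            base = base.rsplit("/", 1)[0]
        elif rest.startswith("./"):
            rest = rest[2:]
        else:
            break
    return f"{base}/{rest}"


relative = "../submod-relpath-test-submodule.git"

# Base is the URL of the remote that was actually configured ("github").
print(resolve_relative_url(
    "https://github.com/wtcline-intc/submod-relpath-test-mainrepo.git", relative))

# Base is the repository's own path, which is what git falls back to when
# remote.origin.url cannot be looked up (per the warning in the error message).
print(resolve_relative_url("/home/wtcline/submod-relpath-test-mainrepo", relative))
```

The first call yields the HTTPS URL of the submodule. The second yields /home/wtcline/submod-relpath-test-submodule.git, the nonexistent path from the failure output.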

Well, a relative submodule URL is assumed to be relative to the default remote's URL, so if the module sets the submodule paths based off of remote's URL then this should work just fine when the remote is file:// or a path.

My point was that existing playbooks that currently work using relative paths should continue working as they do now.

@wtcline-intc
Author

I don't see a way to do something similar on recent versions.

That's unfortunate.

If the Ansible module reimplements it, it can unintentionally diverge from git. I kinda think the module should just document that the submodules are updated after switching to the version to clear up the ambiguity.

That's probably the best that can be done with the way git currently behaves.
