
GitSubmodules

Jeff Squyres edited this page Feb 7, 2020 · 6 revisions

The Open MPI project started using Git submodules in early 2020.

This allows us to import code from remote Git repositories without carrying it directly in our own repository. There are a number of benefits to this; Google around for expositions on why submodules are Good Things.

Philosophy

There are generally two ways we use submodules in Open MPI:

  1. To track a specific commit (e.g., a tag) in a remote repository.
    • Such tags generally refer to a specific release.
    • E.g., the hwloc-2.0.1 tag in the hwloc repository.
  2. To track a specific branch in a remote repository.
    • This allows us to keep up with development of a remote project.

The first use -- tracking a specific commit (usually a tag) -- is more common, because we tend to want stability when importing remote projects.

Use cases

When using Git submodules, there are a few differences from "traditional" (i.e., not-submodule-related) Git usage. It's easiest to talk about them in terms of specific use cases:

  1. Initial clone of the Open MPI repository
  2. Git updating the Open MPI repository
  3. Helpful day-to-day tips
  4. Adding a new submodule pointing to a specific commit
  5. Updating the commit that a submodule refers to
  6. Updating along a branch that a submodule refers to

Initial clone of the Open MPI repository

Now that we use submodules, it is not sufficient to simply git clone <OPEN_MPI_GIT_URL_REPO>.

Instead, you must add --recursive into your clone command:

$ git clone --recursive git@github.com:open-mpi/ompi.git
# or
$ git clone --recursive https://github.com/open-mpi/ompi.git

If you already cloned the Open MPI repository and didn't use --recursive, you can initialize and download all submodules like this:

$ git submodule update --init --recursive

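NOTE: This checks out, in each of Open MPI's submodules, the commit that the Open MPI repository currently records for that submodule (which is not necessarily whatever is newest upstream). If you have local work in a submodule that you do not want to lose track of (e.g., commits made on a detached HEAD), do not use this.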
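To check whether a clone is missing its submodules, git submodule status is handy: a leading "-" marks a submodule that has not been initialized. The following is a minimal sketch that builds a throwaway superproject locally (the repository names lib, super, and clone are hypothetical stand-ins, not part of Open MPI) and shows the status before and after the update command. The -c protocol.file.allow=always setting is only there because recent Git releases refuse local-path submodule URLs by default; it is not needed for HTTPS URLs.

```shell
#!/bin/sh
# Sketch: clone WITHOUT --recursive, detect the empty submodule, fix it.
# All repository names (lib, super, clone) are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A tiny "upstream" project that will act as the submodule.
git init -q lib
(cd lib && echo hello > README && git add README &&
 git commit -q -m 'lib: initial commit')

# A superproject that records lib as a submodule.
git init -q super
(cd super &&
 git -c protocol.file.allow=always submodule --quiet add "$work/lib" lib &&
 git commit -q -m 'super: add lib submodule')

# Clone without --recursive: the submodule directory exists but is empty.
git clone -q super clone
cd clone
before=$(git submodule status)   # line starts with "-" (not initialized)

# Initialize and populate every submodule.
git -c protocol.file.allow=always submodule --quiet update --init --recursive
after=$(git submodule status)    # line starts with a space (in sync)

echo "before: $before"
echo "after:  $after"
```

In a real Open MPI clone, only the last two git submodule commands are relevant; everything before them just creates something to run them against.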

Git updating the Open MPI repository

Note that git pull ... is no longer sufficient to update your entire Open MPI tree. This will still update everything inside the Open MPI repository, but it will not (by default) update any changes to submodules.

You have a few options:

  1. Use --recurse-submodules:
$ git pull --recurse-submodules
  2. Run git submodule update after your usual git pull:
$ git submodule update --init --recursive

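NOTE: This checks out, in each of Open MPI's submodules, the commit that the Open MPI repository currently records for that submodule (which is not necessarily whatever is newest upstream). If you have local work in a submodule that you do not want to lose track of (e.g., commits made on a detached HEAD), do not use this.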
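To see the difference between a plain git pull and the submodule-aware options, here is a small sketch using throwaway local repositories (lib, super, and clone are hypothetical names). After the plain pull, git submodule status prints a leading "+", which means the commit checked out in the submodule no longer matches the commit the superproject records. The submodule.recurse=false override is an assumption made only so the demo behaves the same even if your own Git config enables recursion globally.

```shell
#!/bin/sh
# Sketch: a plain "git pull" leaves the submodule working tree behind;
# "git submodule update" brings it back in sync.
# All repository names (lib, super, clone) are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q lib
(cd lib && echo v1 > VERSION && git add VERSION && git commit -q -m 'lib: v1')

git init -q super
(cd super &&
 git -c protocol.file.allow=always submodule --quiet add "$work/lib" lib &&
 git commit -q -m 'super: add lib at v1')

git -c protocol.file.allow=always clone -q --recursive super clone

# Upstream moves on: lib gains a commit and super records it.
(cd lib && echo v2 > VERSION && git commit -q -a -m 'lib: v2')
(cd super/lib && git pull -q --ff-only origin HEAD)
(cd super && git add lib && git commit -q -m 'super: move lib to v2')

# Plain pull in the clone: the recorded pointer moves, the checkout does not.
cd clone
git -c protocol.file.allow=always -c submodule.recurse=false pull -q --ff-only
stale=$(git submodule status)    # line starts with "+"

# Bring the submodule working tree in line with the recorded commit.
git -c protocol.file.allow=always submodule --quiet update --init --recursive
synced=$(git submodule status)   # line starts with a space

echo "after pull:   $stale"
echo "after update: $synced"
```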

Helpful day-to-day tips

You may find it useful to set the following two Git config variables:

$ git config --global diff.submodule log
$ git config --global status.submodulesummary 1

These will show you a bit more information about submodules in git diff and git status outputs, respectively.

Additionally, if you're lazy (like me), you may wish to make some Git aliases to include "submodule" variants of commands, such as:

$ head -n 3 $HOME/.gitconfig
[alias]
	prrs = pull --rebase --recurse-submodules
	cloner = clone --recursive

Adding a new submodule pointing to a specific commit

If you need to track a new Git submodule, keep in mind the following:

  1. If the submodule is part of an MCA component, make the submodule be a subdirectory in the component (e.g., see the current hwloc component in the opal/mca/hwloc/hwloc2 tree).
  2. Use a public, widely-available URL for the target Git repository.
    • For example, for GitHub repositories, use the HTTPS version of the URL (not the SSH form -- because not everyone has SSH keys set up on GitHub).

Here are the commands to run. The latter half is almost identical to the steps you follow when updating the commit that a submodule points to:

$ cd PATH_TO_OPEN_MPI_GIT_REPOSITORY

# Cd to the directory where the submodule will live
$ cd opal/mca/foo/bar50x

# Add the submodule, giving it a reasonable name
$ git submodule add --name bar-50x \
    https://github.com/open-mpi/bar.git

# Change into the new submodule directory, then check out the
# specific desired commit (e.g., tag)
$ cd bar
$ git checkout v5.0.3

# Now cd back up into the enclosing Open MPI directory
$ cd ..

# Make a branch in the Open MPI repo
# (because this will turn into a pull request)
$ git checkout -b pr/add-submodule-for-foo-bar-v5.0.3

# Git add the "bar" dir to record the commit you just checked out
$ git add bar
$ git commit -s -m 'bar: Add submodule to bar v5.0.3 tag'
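Before pushing, it can be reassuring to check what was actually recorded. The sketch below replays the steps above against throwaway local repositories (the bar project, its v5.0.3 tag, and the opal/mca/foo/bar50x layout are stand-ins created on the spot; nothing is fetched from a real server) and then inspects the result: the .gitmodules entry, the special 160000 "gitlink" mode of the submodule entry, and the recorded commit.

```shell
#!/bin/sh
# Sketch: inspect what "git submodule add" plus a tag checkout records.
# The "bar" project and its v5.0.3 tag are hypothetical local stand-ins.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in upstream: one tagged release followed by newer development.
git init -q bar
(cd bar &&
 echo 5.0.3 > VERSION && git add VERSION && git commit -q -m 'release' &&
 git tag v5.0.3 &&
 echo dev > VERSION && git commit -q -a -m 'development')

# Stand-in superproject with a component directory.
git init -q super
mkdir -p super/opal/mca/foo/bar50x
cd super/opal/mca/foo/bar50x
git -c protocol.file.allow=always submodule --quiet add --name bar-50x "$work/bar"
(cd bar && git checkout -q v5.0.3)
git add bar
git commit -q -m 'bar: Add submodule to bar v5.0.3 tag'

# What did the superproject record?
path=$(git config -f "$work/super/.gitmodules" --get submodule.bar-50x.path)
mode=$(git ls-tree HEAD bar | cut -d' ' -f1)
recorded=$(git rev-parse HEAD:opal/mca/foo/bar50x/bar)
tagged=$(git -C "$work/bar" rev-parse 'v5.0.3^{commit}')

echo "path in .gitmodules: $path"
echo "tree entry mode:     $mode"
echo "recorded commit:     $recorded"
echo "tag v5.0.3 commit:   $tagged"
```

The recorded commit should match the tag, not the newer development commit, which is the whole point of checking out the tag before running git add.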

Updating the commit that a submodule refers to

Let's consider a concrete case in updating a submodule to point to a new commit: let's update the OPAL hwloc component to point to a new hwloc Git tag (e.g., hwloc had a new release and we want to move our submodule to point to the tag of that release):

$ cd PATH_TO_OPEN_MPI_GIT_REPOSITORY

# Change into the directory of the submodule
$ cd opal/mca/hwloc/hwloc2/hwloc

# Fetch the latest upstream commits and tags, then check out
# the new commit (e.g., tag) that you want
$ git fetch --tags
$ git checkout hwloc-2.0.2

# Now cd back up into the enclosing Open MPI directory
$ cd ..

# Make a branch in the Open MPI repo
# (because this will turn into a pull request)
$ git checkout -b pr/update-hwloc-to-2.0.2

# Git add the "hwloc" dir to record the commit you just checked out
$ git add hwloc
$ git commit -s -m 'hwloc: Update submodule to hwloc-2.0.2 tag'

Then push the branch (pr/update-hwloc-to-2.0.2) to GitHub and make a pull request, as normal.

When that PR is merged, others will use the methods above to update their local submodule pointers to point to the change you just made.
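When moving a submodule pointer, it helps to review exactly which upstream commits the change pulls in before committing. The following sketch uses local stand-in repositories (lib and its lib-1 / lib-2 tags are hypothetical, playing the role of hwloc and its release tags) and git diff --cached --submodule=log to list the submodule commits between the old and new pointers.

```shell
#!/bin/sh
# Sketch: review a staged submodule pointer change before committing it.
# "lib" and its tags are hypothetical local stand-ins for a real project.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q lib
(cd lib &&
 echo 1 > VERSION && git add VERSION &&
 git commit -q -m 'lib: first release' && git tag lib-1 &&
 echo 2 > VERSION &&
 git commit -q -a -m 'lib: second release' && git tag lib-2)

git init -q super
cd super
git -c protocol.file.allow=always submodule --quiet add "$work/lib" lib
(cd lib && git checkout -q lib-1)
git add lib
git commit -q -m 'super: add lib at lib-1'

# Move the submodule to the newer tag and stage the new pointer.
(cd lib && git checkout -q lib-2)
git add lib

# Summarize which submodule commits the staged change pulls in.
summary=$(git diff --cached --submodule=log -- lib)
echo "$summary"
```

The summary lists one line per newly included submodule commit, which makes it easy to spot a pointer that moved further (or in a different direction) than intended.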

Updating along a branch that a submodule refers to

...to be written...
