
GitGuide:GitHub in IceCube

Erik Blaufuss edited this page May 29, 2024 · 19 revisions

GitHub in IceCube

Git and GitHub together form a powerful software version control system. Over the past few years, IceCube has been slowly moving from the older Subversion system to Git-based workflows for software development. As of early 2021, development of IceTray, via the IceTray repository, has moved to GitHub.

To see IceCube private repositories on GitHub, clone software, and make and commit changes, you need a GitHub account as well as membership in the right GitHub organizations. This page will help you get your account established and guide you through getting started with GitHub.

Note: even for GitHub experts, this page lays out account setup requirements and development workflow guidelines. Please take a careful look.

Status: As of Feb 2021, this transition is complete for icecube/icetray, and primary development of IceTray is now in GitHub. If you have any questions, ask the group in #software on Slack.

About GitHub in IceCube

IceCube has several active organizations on GitHub. Two in particular are most active:

  • icecube - where development of the IceTray software framework takes place, along with development and cataloging of analysis code and scripts, and utilities such as analysis containers.
    • Note: this organization represents what used to be IceCubeSPNO and IceCubeOpenSource GitHub organizations.
    • To join, make a request in #software on Slack
  • WIPACRepo - where development of IceCube and Upgrade projects takes place, such as the DAQ, I3Live experiment control, and other projects that support detector operations and development. If you're supporting an aspect of IceCube operations or construction, this is where your code should live.
    • To join, please talk to one of the project leads in Detector Operations.
  • User 'sandboxes' - Users are encouraged to make use of their own GitHub accounts and repositories for early development of processing scripts, new software ideas, etc.
    • Note: as an analysis matures and reaches unblinding, the working group leaders and tech leads will guide you through moving your scripts and software into releases within the IceCube GitHub organization. This will allow other IceCube members to reproduce your work both now and in the future.

Historical Subversion

For more than a decade, IceCube has used Subversion as our software version control system. This vast archive of older and retired packages is not going away. However, as development moves to GitHub, this archive will be made read-only.

Getting Started

To get started with Git and GitHub for IceCube, two things are required:

  1. A GitHub account. If you don't already have one, accounts are freely available from GitHub. Note that this does not have to be created with your icecube.wisc.edu or institutional email address. Feel free to use the account you already have.
  2. Git command line utility. This software is widely available on most Linux, Mac and Windows systems. It's likely already installed on your system, but if not, see:

Setting up Git and GitHub to work with IceCube

We have a few requirements that users need to be aware of before committing any code to the IceCube organizations. These are in place to make sure we are able to trace each change back to a real person in the collaboration.

  1. Please update your GitHub account profile and make sure:
    • Your Name and Institution are set.
      • Full name is preferred, but we require at minimum your first initial and full last name (like a paper byline)
      • List your IceCube institution in the Company Name field
    • Add additional email addresses you might commit from as well, such as your @icecube.wisc.edu email.
      • While we encourage you to make your email address publicly visible, this is not required.
  2. Set your name and email address for the git command line workflow.
    • First, configure every system you plan to develop software on (IMPORTANT), with the obvious per-user changes:
       git config --global user.name "First Last"
       git config --global user.email "user@icecube.wisc.edu"
    • Please make sure the email you use is both valid and associated with your GitHub account!
      • You do not have to use your @icecube.wisc.edu email, so long as you use one associated with your GitHub account.
  3. We require you to use 2-Factor Authentication for your GitHub account.
    • Additionally, SSH keys can be added to allow easy access to GitHub repositories on the command line.
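
One way to confirm that the name and email settings took effect is to look at what git actually stamps on a commit. The following is a minimal sketch, not an official check: it uses a throwaway repository and repository-local settings so that your real global configuration is left alone, and "First Last" and the email address are placeholders.

```shell
# Sketch: verify the identity git will record on commits.
# Everything happens in a scratch directory; the name and email are placeholders.
scratch=$(mktemp -d)
git init -q "$scratch/identity-check"
cd "$scratch/identity-check"

# Repository-local equivalents of the --global commands (no --global here,
# so your real ~/.gitconfig is not modified by this sketch).
git config user.name "First Last"
git config user.email "user@icecube.wisc.edu"

# What git is configured to use in this repository:
git config user.name
git config user.email

# What actually ends up in a commit:
git commit -q --allow-empty -m "identity check"
author=$(git log -1 --format='%an <%ae>')
echo "$author"
```

On a real development machine, running the two `git config user.*` queries without arguments inside your checkout shows the values that will be used there.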

Congratulations! You're now ready to join IceCube organizations (see above) and check out, develop and/or commit code.

Checking out IceTray

  • To initially check out the repository:
    git clone https://user@github.com/icecube/icetray.git
  • To update your checkout:
    git pull --rebase

The --rebase is important, especially if you have local changes. This reapplies any local commits you have on top of the commits that may have been pushed to the GitHub repository since you last pulled. This makes the chain of commits linear and easier to follow. If you don't want to type --rebase all the time, you can set --rebase to be the default behavior:

   git config --global pull.rebase true

To push (send to GitHub) your changes (NOTE: please see comments below about using branches and pull requests for changes):

   git diff files_to_commit      # examine this diff before committing
   git commit files_to_commit
   git push

If your changes collide with others and the push is rejected, pull (with rebase!) and then push again. For those new to git, there are two pieces here: commit and push. Commit stores the changes with a log message, but does not yet upload them to the central repository -- only push does that. If someone else pushes changes between the last time you pulled and your push, your push may be rejected since you can no longer just make the server's state match yours. A rebase takes your changes since you last pulled and reapplies ("rebases") them on top of the current state of the remote repository.
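
The rejected-push scenario can be reproduced safely without touching GitHub. The sketch below is an illustration under assumptions: a local bare repository (central.git) stands in for GitHub, the clones "mine" and "theirs" and all file names are hypothetical, and the exported identity is for the sandbox only.

```shell
# Sketch: a push is rejected, then succeeds after "git pull --rebase".
# A local bare repository plays the role of GitHub; all names are hypothetical.
export GIT_AUTHOR_NAME="First Last" GIT_AUTHOR_EMAIL="user@icecube.wisc.edu"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main      # name the first branch "main"
git -C seed commit -q --allow-empty -m "initial commit"
git clone -q --bare seed central.git               # stand-in for the GitHub repository
git clone -q central.git mine
git clone -q central.git theirs

# Someone else pushes a change after your last pull.
echo "their change" > theirs/theirs.txt
git -C theirs add theirs.txt
git -C theirs commit -q -m "Add theirs.txt"
git -C theirs push -q origin main

# You commit locally; nothing has been uploaded yet.
echo "my change" > mine/mine.txt
git -C mine add mine.txt
git -C mine commit -q -m "Add mine.txt"

# The push is rejected because the server has commits you do not have.
if git -C mine push -q origin main 2>/dev/null; then
    first_push="accepted"
else
    first_push="rejected"
fi
echo "first push: $first_push"

# Reapply your commit on top of the server's state, then push again.
git -C mine pull -q --rebase origin main
git -C mine push -q origin main
git -C mine log --format='%s'
```

After the rebase the history is linear: your commit sits on top of the other person's commit, with no merge commit in between.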

IceCube IceTray Development Rules and Guidelines

When developing software in IceTray, please use the following guidelines when committing changes, improvements and bug fixes. This workflow is also recommended for use in other repositories within IceCube.

Please ensure:

  1. We prefer PRs to direct commits. Commit directly to main only when you're sure it's appropriate.
  2. You must use your real name and real email in commits. Take special care when using shared login accounts that the author is properly set to be actually you.
  3. Do not commit any data files (or files of any kind larger than ~ 100 KB).
  4. Remember to rebase when pulling (GitHub will reject pushes if you don't).
  5. Read your diffs before committing; make sure each commit is what you want (only the files you meant to change, no extraneous whitespace changes, etc.).
  6. Check logs after commit and before pushing: make sure the author information (Name and email) is properly set and that you are about to push what you think you are pushing.
  7. Make sure each commit is a discrete unit, that the metaproject builds after every commit, and that the log message is informative (tell everyone what and why you've made this change).
  8. When in doubt, make a branch and submit a PR.
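
Guidelines 2 and 6 can be checked from the command line before anything leaves your machine. The sketch below is one possible approach, not a required procedure: it lists the commits that have not been pushed yet together with their author, and corrects the author of the latest commit. The local bare repository, the "shared-login" identity and the file name are all hypothetical.

```shell
# Sketch: inspect unpushed commits and fix a wrong author before pushing.
# A local bare repository stands in for GitHub; identities are hypothetical.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
git -C seed -c user.name="Seed" -c user.email="seed@example.org" \
    commit -q --allow-empty -m "initial commit"
git clone -q --bare seed central.git
git clone -q central.git work
cd work

# Simulate a shared login account whose git identity is not yours.
git config user.name "shared-login"
git config user.email "shared-login@example.org"

echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix typo"

# Commits that exist locally but not on origin/main, with author details:
git log origin/main..HEAD --format='%h  %an <%ae>  %s'

# The author is wrong, so rewrite it on the (still unpushed) commit.
git commit -q --amend --no-edit --author="First Last <user@icecube.wisc.edu>"

unpushed=$(git log origin/main..HEAD --format='%an <%ae>')
echo "$unpushed"
```

Amending is only appropriate for commits that have not been pushed; once a commit is on GitHub, leave its history alone.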

Backing out a commit

So you accidentally pushed something directly to main that you meant to commit to a branch and you want to fix this. The preferred method is to use git revert.

First you'll need the git revision ID of the commit you want to back out, via git log or GitHub, etc...

   commit 1058b325c63b54f6e89a0398cb2075a8e73f989e (HEAD -> main, origin/main, origin/HEAD)
   Author: Alex Olivas <aolivas@umd.edu>
   Date:   Fri May 21 09:29:33 2021 -0600
   
       committing breakage intentionally to illustrate how git revert works.

In the example above the revision ID is 1058b325c63b54f6e89a0398cb2075a8e73f989e. Running git revert with this ID automatically creates a new commit that undoes the change and, by default on the command line, allows you to edit the commit message. Leave the original message intact (for clarity), but be sure to add your reason for reverting this commit.

   $ git revert 1058b325c63b54f6e89a0398cb2075a8e73f989e
   [main 0a0c026] I meant to commit this to a branch and not main.  Oops.
    1 file changed, 24 insertions(+)

Finally don't forget to push upstream:

   $ git push
   Enumerating objects: 9, done.
   Counting objects: 100% (9/9), done.
   Delta compression using up to 8 threads
   Compressing objects: 100% (5/5), done.
   Writing objects: 100% (5/5), 1013 bytes | 1013.00 KiB/s, done.
   Total 5 (delta 4), reused 0 (delta 0)
   remote: Resolving deltas: 100% (4/4), completed with 4 local objects.
   To github.com:icecube/cicada.git
      1058b32..0a0c026  main -> main
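
To practice a revert before doing it on a shared repository, the same steps can be run in a scratch repository. This is a minimal sketch with hypothetical file names and a sandbox-only identity; it passes --no-edit so it runs unattended, whereas on a real revert you would leave that flag off and add your reason to the message.

```shell
# Sketch: back out a bad commit with "git revert" in a scratch repository.
export GIT_AUTHOR_NAME="First Last" GIT_AUTHOR_EMAIL="user@icecube.wisc.edu"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

sandbox=$(mktemp -d)
git init -q "$sandbox/demo"
cd "$sandbox/demo"
git symbolic-ref HEAD refs/heads/main

echo "good" > code.txt
git add code.txt
git commit -q -m "Add code.txt"

# The commit we will want to back out.
echo "breakage" >> code.txt
git commit -q -am "Commit breakage intentionally"
bad=$(git rev-parse HEAD)          # revision ID, as found via "git log"

# Create a new commit that undoes the bad one; history is preserved.
git revert --no-edit "$bad"

git log --format='%h %s'
cat code.txt
```

The bad commit stays in the log, followed by a commit that undoes it, which is why this approach is safe on a branch other people have already pulled.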

Developing in GitHub - Using Branches and Pull Requests (PRs)

When you should make a branch

You should do your work in a branch, and later request review, if any of the following are true:

  • You expect the work in the branch to take a while to complete (days) and may want history of your changes during your work in progress
  • It changes behavior (even for the better!) of code that anyone else is using
  • You think there is a chance it will break things
  • The change is "big" (touching multiple projects, a bunch of files, etc.)
  • You would like input from another member of the collaboration before committing and merging with the main development branch.

If you are not sure if what you are doing should be in a branch or reviewed, err on the side of caution and put it in a branch and ask for review. Simple, small PRs are easily reviewed and merged into main.

When you should not make a branch

Branches are not recommended if any of the following are true:

  • What you are doing is trivial and will not change behavior that anyone else may be relying upon (fixing typos, small and quick bug fixes)
  • You do not intend to merge what you are doing into the main branch soon (within weeks)

Branch development guidelines

When working with branches in development, be sure to:

  • Keep them short lived. We really do not want long-lived branches that are used to generate analysis results. They are bad for the collaboration, since they result in duplicated software. Additionally, they will complicate your life, since the longer they live outside of the main thrust of development, the more work merging requires. Try to divide what you are working on into smaller pieces (each piece in a branch or not, as above) so you don't drift away from the rest of the collaboration. If you have immediate processing needs, request a bugfix release from the software development team (#software).
  • Keep them as small as they can be. Try to break large changes up into several branches and PRs to keep the review and acceptance of these changes tractable.
  • Be sure to update the documentation and tests to match your changes/additions.

How to make a branch

Pick a descriptive name and then:

   git checkout -b mynewbranchname

Now you are on your branch named mynewbranchname. (Note: 'git branch' will confirm what branch you're on.) The name of the branch should be descriptive, unique, and brief. If only you are working on the branch, prefixing it with your GitHub ID is usually a good idea ("nwhitehorn/recofixes" for example). If you're fixing an existing issue, include the issue number in the branch name as well.

You can switch back to the main branch (and forth again, by changing 'main' to your branch name) like this:

   git checkout main

Once you are on the branch (confirm which you are currently using with git branch), make whatever commits you like. Then push your branch to GitHub:

   git push -u origin mynewbranchname
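
Put together, the branch steps look roughly like the following. This is a sketch under assumptions: a local bare repository stands in for GitHub, and the branch name "username/recofixes", the file name and the identity are placeholders.

```shell
# Sketch: create a branch, commit on it, and push it with upstream tracking.
# A local bare repository stands in for GitHub; all names are placeholders.
export GIT_AUTHOR_NAME="First Last" GIT_AUTHOR_EMAIL="user@icecube.wisc.edu"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
git -C seed commit -q --allow-empty -m "initial commit"
git clone -q --bare seed central.git
git clone -q central.git work
cd work

# Create the branch and switch to it in one step.
git checkout -q -b username/recofixes
git branch                               # the current branch is marked with '*'

echo "fix" > recofix.txt
git add recofix.txt
git commit -q -m "Fix reconstruction seed handling"

# -u records origin/username/recofixes as the upstream, so later
# "git push" and "git pull" on this branch need no extra arguments.
git push -q -u origin username/recofixes

# Switch back to main and forth again.
git checkout -q main
git checkout -q username/recofixes

upstream=$(git rev-parse --abbrev-ref 'username/recofixes@{upstream}')
echo "$upstream"
```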

Making a Pull Request

When you push to a branch, GitHub will send a message to your terminal with a link to make a "pull request". Once your work on the branch is complete (whatever you were trying to do is done and tested), click the link. (You can also initiate the pull request through the GitHub web interface). This will file a request for code review. Ask someone (for example in #software on Slack) to review the changes for you, update the change in response to their reviews, and, once they approve, press the "Squash and Merge" button.

If you cannot find a code reviewer, or they don't respond, feel free to keep asking. It is possible to approve your own branch merges, but please avoid doing this. Making the change small, readable, and well-described (especially including the rationale for the change), will improve the quality and speed of your review markedly. Feel free to bring this up in #software on Slack if you're feeling your PR is being ignored.

Once the pull request is approved and merged, please delete the branch (GitHub will provide a button for this).

Issues vs. Pull Requests

There's no need to open an issue that merely links to a pull request (PR) when a standalone PR is sufficient. When the information would be truly redundant, prefer a PR in isolation.

Pull Request - Best Practices

A few best practices can help reviewers evaluate your pull request and move it through to merging more quickly. When putting together your PR, please try to:

  • Separate cleanups from new functionality: As part of developing new functionality, you may also clean up or improve other code. This is normal and expected, but can complicate the review of new functionality. In these cases, make 2 separate PRs, one with the code cleanups and another with the new functionality added.
  • Be sure to include new or updated tests that exercise the changes being made in the PR, and point them out in the text of your PR. These can really help show the importance of the change you're proposing.
  • Be sure to tag critical reviewers in your PR text. GitHub will use our CODEOWNERS file to highlight all contributors that have any overlap with any files changed. For things like an interface change, this can be SEVERAL people across many projects, many of whom have very minor updates. Explicitly calling out the few people whose review is critical in your PR text makes it easier to see whose review is needed before closing a PR. If you're unsure, please feel free to ask in #software on Slack.
  • If you're finding your PR isn't moving, please feel free to bring it up in #software. We'd hope to move PRs along quickly (from review and updates through to merging within 1-2 weeks), but often other priorities come up. A gentle reminder from the PR author can help get it unstuck.

An example PR for reference can be found here: TBD

Maintaining a short-lived branch

It may happen that changes are made to the main branch (bug fixes, for example) that you would like in a working branch before merging it back to main. In that case, while on your working branch, merge main into it:

   git pull --no-rebase origin main

Then commit and push as normal (to your branch). This will usually generate a merge commit, but that is beneficial here and GitHub will not reject one on non-main branches.
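
The effect of pulling main into a working branch can be seen in a sandbox. The sketch below is illustrative only: a local bare repository stands in for GitHub, the clones "work" and "other", the branch name and the file names are hypothetical, and --no-edit is passed so the merge message is accepted without opening an editor.

```shell
# Sketch: bring new commits from main into a working branch with a merge.
# A local bare repository stands in for GitHub; all names are hypothetical.
export GIT_AUTHOR_NAME="First Last" GIT_AUTHOR_EMAIL="user@icecube.wisc.edu"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
git -C seed commit -q --allow-empty -m "initial commit"
git clone -q --bare seed central.git
git clone -q central.git work
git clone -q central.git other

# Your work in progress on a branch.
git -C work checkout -q -b mynewbranchname
echo "feature" > work/feature.txt
git -C work add feature.txt
git -C work commit -q -m "Start feature"

# Meanwhile a bug fix lands on main.
echo "bugfix" > other/bugfix.txt
git -C other add bugfix.txt
git -C other commit -q -m "Fix bug on main"
git -C other push -q origin main

# Merge main into the working branch; this creates a merge commit.
git -C work pull -q --no-rebase --no-edit origin main
git -C work log --graph --format='%s'
```

The graph output shows the two lines of development joined by a single merge commit on the working branch.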

Note that if you have to merge main into your working branch more than once or twice, it is usually a sign your branch is too long-lived and should be merged back sooner rather than later. It's also worth noting that many merge commits make a PR nearly impossible to review. See the next section for an alternative solution.

Dealing with merge conflicts with a branch

Sure, we've all been there. Your development branch:

  • has lived longer than you expected, and now main is far ahead of it
  • has had many contributors, and you're worried a 'rebase' will mess with the history of others and potentially cause problems

Git has some tools to help advance your changes to the latest version of main to prepare a PR, and this 'rebase' generally works well for isolated, private branches. Good comparisons and discussions of merging and rebasing can be found online.

But a safer way to advance your work to prepare for a PR is available. Say you've been working on a "super_feature" branch for weeks/months and it's fallen quite out of agreement with main, and many people have cloned this branch and potentially contributed. A merge commit would overwhelm the changes you're making, but a rebase could be a problem for the shared history among the developers. Instead, you can make a new branch, rebase THAT, then make a PR.

  git checkout main; git pull             # make sure local main is up to date
  git checkout super_feature              # change to my out of date dev branch
  git checkout -b super_feature_pr        # make a new branch from this one
  git rebase main                         # rebase to main, could be --interactive if needed
  git push origin super_feature_pr        # Push this new branch to GitHub

Now you can make a clean Pull Request on this new branch (super_feature_pr). Once the PR is accepted, you should also delete the original super_feature branch, since this won't happen automatically as part of the PR closing process.
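
The recipe can be tried end to end in a sandbox to see that the shared branch is left untouched. This is a sketch under assumptions: a local bare repository stands in for GitHub, and the file names and identity are placeholders; only the branch names super_feature and super_feature_pr come from the text above.

```shell
# Sketch: rebase a copy of a shared branch, leaving the original untouched.
# A local bare repository stands in for GitHub; file names are placeholders.
export GIT_AUTHOR_NAME="First Last" GIT_AUTHOR_EMAIL="user@icecube.wisc.edu"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
git -C seed commit -q --allow-empty -m "initial commit"
git clone -q --bare seed central.git
git clone -q central.git work
cd work

# A shared feature branch that others may have cloned.
git checkout -q -b super_feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"
git push -q -u origin super_feature
shared_tip=$(git rev-parse super_feature)

# main moves ahead in the meantime.
git checkout -q main
echo "bugfix" > bugfix.txt
git add bugfix.txt
git commit -q -m "Fix bug on main"
git push -q origin main

# The recipe: copy the branch, rebase the copy, push the copy.
git checkout -q main
git pull -q --rebase
git checkout -q super_feature
git checkout -q -b super_feature_pr
git rebase -q main
git push -q origin super_feature_pr

git log --format='%s' super_feature_pr
```

Because only the copy is rebased, anyone who cloned super_feature still has a branch whose history matches the one on the server.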

Using the SVN interface to GitHub

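GitHub also supports many operations through a Subversion (SVN) interface; see GitHub's documentation to read more about it. GitHub has announced the end of its Subversion support, so confirm the interface is still available before relying on it. Their SVN support also has a significant number of limitations: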
  • svn blame, mv, and merge are not implemented
  • svn cp only works for making branches
  • svn properties do not work
  • the bridge is quite slow

To initially check out the repository:

   svn co https://user@github.com/icecube/icetray/trunk

To update your checkout:

   svn up

To send your changes back:

   svn diff files_to_commit      # examine this diff before committing
   svn ci files_to_commit

Things to check when reviewing and accepting a PR

When reviewing a PR to merge into the main line of development (main), please consider:

  • Did the author correctly identify themselves in their commits (Name and email)?
  • Are tests provided for new functionality/bug fixes?
  • Is the documentation updated to match the changes to functionality or interfaces?
  • Does the code follow our code style guidelines (https://github.com/icecube/icecube.github.io/wiki/CodingStandards)?

Git and GitHub tips, pointers and gotchas to avoid

While we are early in the transition to Git and GitHub, we'll undoubtedly run into issues and conflicts that we did not anticipate before we started. We'll continue to update this section with guidance on how best to avoid them as new things come up. If you're seeing strange behavior, bring it up in #software on Slack.

Help! git lost its mind!

Feel free to ask in #software on Slack for help. If all else fails, rm -rf your repository clone (or move it aside) and re-clone. This is not a sign of cowardice, it teaches git who's boss.
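
Before throwing a clone away, it is worth checking whether it holds anything that exists nowhere else. The sketch below is a suggestion, not an established procedure: it lists uncommitted changes and unpushed commits, then moves the clone aside rather than deleting it. A local bare repository stands in for GitHub, and the directory and file names are hypothetical.

```shell
# Sketch: check for unsaved work, move the clone aside, and re-clone.
# A local bare repository stands in for GitHub; all names are hypothetical.
export GIT_AUTHOR_NAME="First Last" GIT_AUTHOR_EMAIL="user@icecube.wisc.edu"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
git -C seed commit -q --allow-empty -m "initial commit"
git clone -q --bare seed central.git
git clone -q central.git icetray

# Some local work that was committed but never pushed.
echo "wip" > icetray/wip.txt
git -C icetray add wip.txt
git -C icetray commit -q -m "Work in progress"

# Uncommitted changes (empty output means a clean working tree):
git -C icetray status --short

# Commits on local branches that are not on any remote branch:
unpushed=$(git -C icetray log --branches --not --remotes --format='%h %s')
echo "$unpushed"

# Move the clone aside instead of deleting it, then start fresh.
mv icetray icetray.broken
git clone -q central.git icetray
```

Once you have recovered anything you need from the moved-aside copy, it can be deleted.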
