Nick Lucius edited this page Mar 16, 2017 · 4 revisions

The Git flow process is important to establish the workflow for developers, both City of Chicago and external developers. It describes how the team will organize code, branches, and issues on GitHub. An ideal Git flow process will enhance the benefits of decentralized development, minimize merge conflicts, and provide clear organization of current production code, development code, and alpha code. The importance of the Git flow has been documented by others.

The proposed Git flow borrows heavily from Vincent Driessen's proposal.

Protected Branch

The master branch is a protected branch that does not accept direct commits. Instead, changes reach master only through pull requests opened from the dev branch.

Dev Branch: "Nightly" builds

The dev branch contains the most recent alpha build of the code. Once the code on an issue branch (see below) appears to be complete (i.e., it passes unit tests and completes the feature or fixes the bug), it should be merged directly into the dev branch. This should be done as soon as the feature/issue branch is completed.

If you do not have the ability to submit directly to the dev branch (e.g., outside contributors), then a pull request should be made against the dev branch.

Issue Branches: for every branch, a corresponding issue

An "issue branch" is the name for a variety of branches that correspond to parallel development of new features and bug fixes. It is called an "issue branch" because each branch should correspond to an issue in the GitHub Issue Tracker.

Each branch is a significant piece of work. Adding a new feature or fixing a bug involves conversations, emails, and micro-decisions that should be documented. Since the issue tracker allows for conversation, status updates, etc., this method allows us to tie the code to its documentation.

The core components of an issue branch are:

  • Each branch should be named with the word issue followed by the issue number. For instance, issue105 corresponds to Issue #105 found in the issue tracker.
  • Before you create a branch, create an issue which describes the work you're going to do. Label it as an enhancement, bug fix, or whatever label best fits your work. Then, create a branch that corresponds to the issue.
  • If you're working on an enhancement or bug fix opened by another developer or community member, create the branch that corresponds to the issue.
  • As you work on the issue, comment liberally in the issue tracker. This provides a link between the code and documentation/discussion.
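
The naming convention in the list above can be sketched as a small shell helper. The function name `issue_branch_name` is hypothetical and is not part of any RSocrata tooling; it only illustrates how an issue number maps to a branch name.

```bash
# Hypothetical helper: map a GitHub issue number to its branch name.
issue_branch_name() {
  case "$1" in
    ''|*[!0-9]*)
      # Reject empty input and anything that is not a plain number.
      echo "usage: issue_branch_name <issue-number>" >&2
      return 1
      ;;
  esac
  printf 'issue%s\n' "$1"
}

issue_branch_name 105   # prints "issue105"
```

The result could then be used when creating the branch from dev, for example `git checkout -b "$(issue_branch_name 4)"`.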

For instance, suppose we want to work on a bug that was noted in issue #4. We begin with the development code base:

$ git checkout dev
$ git checkout -b issue4
$ git push -u origin issue4

Once you've finished your work, note the resolution in the commit message by referencing the issue, merge it into dev, and push the changes to GitHub.com:

$ git checkout issue4
$ git commit -am 'Made final changes, closes #4'
$ git checkout dev
$ git merge issue4
$ git push origin dev

However, the one-branch-per-issue convention isn't a cardinal rule, and interpreting it strictly is sometimes impractical. Developers should apply common sense if a couple of issues are being resolved in the same branch, and simply note this in the corresponding issues.

Creating Releases

A release should be created before submitting to the CRAN repository. By definition, each merge into master should have a corresponding release. Each release should also include the build of the package source code (a tar.gz file).

New releases should be created using the following syntax: vX.Y.Z, where X denotes major version, Y denotes minor version, and Z denotes patch number.

We will follow the semantic versioning 2.0.0 process to determine the version number. In short, X should only increment when non-backward-compatible changes are made, Y denotes new features and larger bug fixes, and Z denotes small bug fixes.
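
As a rough illustration of the vX.Y.Z scheme, the sketch below bumps one component of a version string and resets the lower components to zero, as semantic versioning requires. The function name `bump_version` and the version numbers used are made up for this example.

```bash
# Hypothetical helper: bump one part of a vX.Y.Z version string.
# Usage: bump_version vX.Y.Z major|minor|patch
bump_version() {
  version="${1#v}"            # strip the leading "v"
  major="${version%%.*}"      # X
  rest="${version#*.}"
  minor="${rest%%.*}"         # Y
  patch="${rest#*.}"          # Z
  case "$2" in
    major) major=$((major + 1)); minor=0; patch=0 ;;
    minor) minor=$((minor + 1)); patch=0 ;;
    patch) patch=$((patch + 1)) ;;
    *) echo "unknown part: $2" >&2; return 1 ;;
  esac
  printf 'v%s.%s.%s\n' "$major" "$minor" "$patch"
}

bump_version v1.7.4 major   # prints "v2.0.0"
bump_version v1.7.4 minor   # prints "v1.8.0"
bump_version v1.7.4 patch   # prints "v1.7.5"
```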


### Attaching source package
When creating a release, please build the package and attach it to the release. Ideally, use the same file to submit to CRAN as you attach to the release. To create the package, run:
```bash
# Run from the package root (the directory that contains DESCRIPTION)
$ cd /path/to/RSocrata
$ R CMD build .
```

Attach the resulting RSocrata_X.Y.Z-buildNumber.tar.gz file to the release.
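
R CMD build names the tarball after the Package and Version fields in the DESCRIPTION file. The sketch below predicts that file name so it can be checked before attaching it to a release. The helper name `expected_tarball` is hypothetical, and the field values in the demo are made up.

```bash
# Sketch: predict the tarball name that R CMD build will produce,
# based on the Package and Version fields of a DESCRIPTION file.
expected_tarball() {
  pkg=$(sed -n 's/^Package:[[:space:]]*//p' "$1")
  ver=$(sed -n 's/^Version:[[:space:]]*//p' "$1")
  printf '%s_%s.tar.gz\n' "$pkg" "$ver"
}

# Demo with a throwaway DESCRIPTION file (field values are made up).
demo_dir=$(mktemp -d)
printf 'Package: RSocrata\nVersion: 1.2.3-4\n' > "$demo_dir/DESCRIPTION"
expected_tarball "$demo_dir/DESCRIPTION"   # prints "RSocrata_1.2.3-4.tar.gz"
rm -rf "$demo_dir"
```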

Major versions and CRAN

Per the semantic versioning specification, a major release (e.g., 2.0.0) is created when non-backward-compatible changes are made to the code base. Imagine re-running install.packages("RSocrata") only to find that it has broken your existing code.

Unfortunately, it's not terribly easy to determine which version you are getting when running install.packages("RSocrata"). Therefore, a new major release should result in a new package. That is, version 2.0.0 of RSocrata should result in the RSocrata2 package. However, all of the work should continue within the same GitHub repository.
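
The package-naming rule can be sketched as follows. The helper name `package_name_for` is hypothetical, and the sketch assumes that major versions 0 and 1 keep the plain RSocrata name and that later major versions follow the same pattern as RSocrata2.

```bash
# Hypothetical helper: package name implied by a vX.Y.Z release.
# Assumption: major versions 0 and 1 keep the plain "RSocrata" name.
package_name_for() {
  version="${1#v}"          # strip the leading "v"
  major="${version%%.*}"    # X
  if [ "$major" -ge 2 ]; then
    printf 'RSocrata%s\n' "$major"
  else
    printf 'RSocrata\n'
  fi
}

package_name_for v1.7.4   # prints "RSocrata"
package_name_for v2.0.0   # prints "RSocrata2"
```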

Failing Travis CI and AppVeyor builds

Passing unit tests is important for maintaining quality while changes are made to the software. However, for RSocrata, these tests often fail because of a conflict between the tests running in Travis CI and AppVeyor.

RSocrata tests the write.socrata function by writing random numbers to a data frame, uploading it to a test data set, then downloading that data again and comparing the two sets of values. If the data do not match, the test fails. This test is run 4 times (3 times in Travis CI, once in AppVeyor), and it sometimes fails because the CI systems are writing and reading simultaneously. Here is an example:

  1. Travis CI writes a data frame to the test data set.
  2. AppVeyor then also writes a data frame to the same test data set.
  3. Travis CI then downloads that data set, unaware that the data was actually written by AppVeyor.
  4. The Travis CI test then fails, since the random numbers generated by Travis CI do not match those written by AppVeyor.
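
The four steps above can be reproduced in a simplified local simulation. This is only a sketch: a temporary file stands in for the shared test data set, fixed numbers stand in for the random values, and each write overwrites the file, which is a simplification of what the real test does.

```bash
# Simulation only: a temp file stands in for the shared test data set.
dataset=$(mktemp)

travis_value=1111     # stand-in for Travis CI's random numbers
appveyor_value=2222   # stand-in for AppVeyor's random numbers

echo "$travis_value"   > "$dataset"   # 1. Travis CI writes
echo "$appveyor_value" > "$dataset"   # 2. AppVeyor writes to the same place
downloaded=$(cat "$dataset")          # 3. Travis CI downloads

# 4. Travis CI compares what it wrote with what it downloaded.
if [ "$downloaded" = "$travis_value" ]; then
  result="pass"
else
  result="fail"
fi
echo "Travis CI test: $result"        # prints "Travis CI test: fail"
rm -f "$dataset"
```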

While this is annoying, if you encounter a failed test, you should inspect the logs from each build and look for failures like the following:

1. Failure: add a row to a dataset (@test-all.R#453) 
2. Failure: add a row to a dataset (@test-all.R#454)

If the failures are caused only by the above issues, then they are simply a consequence of test conflicts. These can be ignored and will be resolved by the core development team.
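
As a rough aid for this inspection, the sketch below checks whether every failure line in a saved build log comes from the "add a row to a dataset" test. The helper name `only_known_conflicts` is hypothetical, and it assumes the log has been saved to a local file with failure lines in the format shown above.

```bash
# Hypothetical helper: succeed only if every "Failure:" line in a build log
# comes from the "add a row to a dataset" test.
only_known_conflicts() {
  # Count failure lines that are NOT the known write-conflict test.
  other=$(grep 'Failure:' "$1" | grep -vc 'add a row to a dataset')
  [ "$other" -eq 0 ]
}

# Demo with a throwaway log file containing the two lines shown above.
log=$(mktemp)
cat > "$log" <<'EOF'
1. Failure: add a row to a dataset (@test-all.R#453)
2. Failure: add a row to a dataset (@test-all.R#454)
EOF

if only_known_conflicts "$log"; then
  echo "only known test conflicts"
else
  echo "other failures present, investigate"
fi
rm -f "$log"
```

Note that a log with no failure lines at all also passes this check, so it is meant to be run only on builds that have already failed.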