
Description
Originally reported by: jaraco (Bitbucket: jaraco, GitHub: jaraco)
@RonnyPfannschmidt and others have on more than one occasion asked about migrating from Bitbucket to GitHub. I'm creating this ticket to track that proposal and its execution for Setuptools.
First, I'm generally in support of migrating to GitHub. The current two-repo system (for supporting Travis-based CI and GitHub contributions) is clumsy at best, and the dominance of GitHub is undeniable. When Mercurial was chosen for Setuptools, the choice was made to align better with the existing Distribute repo, to ease the transition for SVN users, and for some of the same reasons that Mercurial was chosen for Python itself. This rationale has minimal value moving forward.
I've previously migrated a couple of projects from Bitbucket to GitHub, including keyring and setuptools_scm. Here's the technique I used:
- Using recent versions of Mercurial, Dulwich, and hg-git, create the Git clone of the Hg repository. Verify that heads, branches, etc. are all represented properly.
- Request from GitHub a temporary exemption from the API rate throttling, as anything more than 10 or so issues/comments will hit the rate limits.
- Use bitbucket_issue_migration to migrate the issues. If one migrates the issues to a clean repository, the new issues will have the same numbering.
- Disable issue tracking on the original repo.
- Update references and links in the new and old repositories to direct users as appropriate. Cut a new release to publish these references with the package.
While this process has been adequate for smaller projects, there are some issues that I suspect cannot be simply ignored in migrating larger projects like Setuptools.
issue attribution and timestamps
My biggest concern is about issue attribution and timestamps. The migration is not lossless. Every issue and comment gets a current timestamp and is attributed to the user under which the migration runs. The migration works around this by adding a timestamp and a note of the original attribution to the body of the text. This makes the migrated tickets harder to read and review. This degradation of quality is acceptable for trivial repositories, but will be substantially less acceptable for a project like Setuptools with hundreds of open and closed tickets. Can this be improved?
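To make the workaround concrete, here is a minimal sketch of how a migrated body might carry its original attribution. The function name, the header wording, and the sample values are assumptions for illustration; this is not the actual output format of bitbucket_issue_migration.

```python
from datetime import datetime, timezone


def format_migrated_body(original_author, created_at, body):
    """Prefix a migrated issue or comment with its original author and time.

    Hypothetical helper: the header wording is an assumption, not the
    format the migration tool really emits.
    """
    # Normalize to UTC so every migrated entry reads consistently.
    stamp = created_at.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    header = f"Originally reported by: {original_author} on {stamp}"
    return f"{header}\n\n{body}"


# Made-up sample values, for illustration only.
example = format_migrated_body(
    "example-user",
    datetime(2016, 1, 1, 12, 0, tzinfo=timezone.utc),
    "Text of the original report.",
)
print(example)
```

The original author and date survive only as text in the body, so they cannot be used for filtering or sorting on the new tracker, which is the loss described above.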
open issues and pull requests remain in the old project
With hundreds of open tickets in the old project, users will be subscribed to those tickets and not to the new ones. It will take a great deal of effort to get those tickets closed and updated to refer to their migrated copies. Can this be done automatically (comment on each original ticket with a reference to its migrated copy, close the original if it is not closed already, and finally disable issues on the old repo)?
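One way to approach the automation is to separate planning from execution: first compute what should happen to each old ticket, then feed that plan to whatever API client ends up being used. The sketch below covers only the planning half. It assumes issue numbers are preserved by migrating into a clean repository; the record fields, the set of open statuses, and the placeholder URL are all assumptions.

```python
# Statuses treated as "still open" on the old tracker (an assumption).
OPEN_STATUSES = {"new", "open"}


def plan_redirects(old_issues, new_repo_url):
    """Build one action per old issue: a pointer comment, plus a close flag.

    Hypothetical helper; it performs no network calls.
    """
    actions = []
    for issue in old_issues:
        # Relies on the migrated issue keeping the same number.
        new_url = f"{new_repo_url}/issues/{issue['id']}"
        actions.append({
            "id": issue["id"],
            "comment": f"This issue has moved to {new_url}",
            "close": issue["status"] in OPEN_STATUSES,
        })
    return actions


# Made-up sample data and a placeholder URL, for illustration only.
sample = [
    {"id": 1, "status": "open"},
    {"id": 2, "status": "resolved"},
]
plan = plan_redirects(sample, "https://example.invalid/org/project")
for action in plan:
    print(action)
```

Disabling issue tracking on the old repo would come last, only after every action in the plan has been applied.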
closed, anonymous heads
Mercurial allows for closed, anonymous heads in the repository. These heads will likely be omitted from the history when pushing to a Git repo. Perhaps that's acceptable, but it will mean a loss of history.
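Before converting, it may help to list which heads are at risk. The sketch below assumes that the conversion only exports history reachable from a bookmark or tag, which should be checked against the hg-git version in use. It also assumes the head, bookmark, and tag data have already been collected (for example from `hg heads --closed`, `hg bookmarks`, and `hg tags`); the record layout and sample values are made up.

```python
def heads_without_refs(heads, bookmarks, tags):
    """Return the heads that no bookmark or tag points at.

    A head has no descendants, so if nothing names it directly,
    nothing else can reach it either.
    """
    referenced = set(bookmarks.values()) | set(tags.values())
    return [head for head in heads if head["node"] not in referenced]


# Made-up repository state, for illustration only.
heads = [
    {"node": "a1", "branch": "default", "closed": False},
    {"node": "b2", "branch": "default", "closed": True},
    {"node": "c3", "branch": "stable", "closed": False},
]
bookmarks = {"master": "a1"}
tags = {}

at_risk = heads_without_refs(heads, bookmarks, tags)
for head in at_risk:
    state = "closed" if head["closed"] else "open"
    print(f"{head['node']} ({head['branch']}, {state}) has no bookmark or tag")
```

Any head in the resulting list could be given a bookmark before the conversion if its history is worth keeping.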