Skip to content
This repository
branch: master
Fetching contributors…


Cannot retrieve contributors at this time

file 232 lines (160 sloc) 8.348 kb
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231
A placeholder for the gitPAN project so it can have an issue tracker,
wiki and all that.

There is very little code written for gitpan. It is mostly just gluing
together other people's work: Leon's Parse::BACKPAN::Packages and Yanick's
git-cpan-patch. The patched versions of these necessary to make gitpan
work are linked in as git submodules.


What is gitPAN?

gitPAN is a project to import the entire history of CPAN (known as
BackPAN) into a set of git repositories, one per distribution.

What is CPAN?

CPAN is the Comprehensive Perl Archive Network at It is an
archive of tens of thousands of Perl modules written by thousands of
authors. A good interface to it is

What is BackPAN?

In order to limit CPAN's size, authors are requested to delete old
releases. BackPAN maintains all CPAN releases, even deleted ones, and
is a complete history of CPAN. There are only a few BackPAN mirrors
such as

Why is gitPAN?

CPAN (and thus BackPAN) is a pile of tarballs organized by author. It
is difficult to get the complete history of a distribution, especially
one that has changed authors or is released by multiple authors (for
example, Moose). Because releases are regularly deleted from CPAN
even sites like provide an incomplete history. Having
the complete history of each distrubtion in its own repository makes
the full distribution history easy to access.

gitPAN also hopes to make patching CPAN modules easier. Ideally you
simply clone the gitPAN repository and work. New releases can be
pulled and merged from gitPAN.

gitPAN hopes to showcase using a repository as an archive format,
rather than a pile of tarballs. A repository is far more useful than
a pile of tarballs, and contrary to many people's expectations, the
repository is turning out smaller.

Finally, gitPAN is being created in the hope that "if you build it
they will come". Getting data out of CPAN in an automated fashion has
traditionally been difficult.

Where is gitPAN?

The repositories are on at
(watch out, it may overload your browser).

Code, discussion, and issues can be had at

How do I access a distribution on gitPAN?

Simplest way is to go to<distribution>.
For example, Acme-Pony can be found at
Instructions for futher access can be found there.

The clone URL for a given distribution is git://<distribution>.git.
You can clone without a github account.

How big is BackPAN?

BackPAN (just the modules, we're not doing perl releases) contains
about 120k archive files (mostly gzipped tarballs) representing about
21,000 distributions from 5000 authors taking up 14 gigs of space.
Tarballs consume about 12 gigs, not sure where the other 2 is going
(readme files, meta files, random non-distro junk, block size

How big is gitPAN?

gitPAN consists of over 21,000 repositories representing each CPAN
distribution. Disk usage (garbage collected repositories with no
checkout) is 4.3 gigs. It imported 120,000 files weighing in at 9.7
gigs giving a compression ratio of over 2x.

gitPAN consumes about 150 gigs on github, presumably due to indexing.

Did gitPAN skip anything?

Yes. It skipped perl, parrot and parrot-cfg. They're not really CPAN
modules and they have far more complete repositories. It may
skip more in the future, these are the ones I noticed.

Will you be adding X to gitPAN?

The primary focus is to get accurate repositories for each CPAN
distribution and to make this data available for others to use. When
you think "will gitPAN do X" instead think "how can I use gitPAN to
build X?"

That said, suggestions on how to improve the data available from
gitpan heartily accepted.

How can I merge gitPAN's history with my module?

If you are the owner of a CPAN module and have an existing, but
incomplete, repository you can fill in the history using gitPAN. The
technique is outlined in this article.

How do I update my module on gitpan?

gitpan will automatically pull new releases from CPAN.

Updates are currently happening sporatically pending setting up the
hosting for the importer and improving logging and error recovery.

Where can I get a list of all the repositories?

You can get it from github's API <> with the
URL <>. At the time of
this writing the call is timing out. Your alternative is to get a
list of distributions from BackPAN::Index.

How can I help?

See for a list of open problems.

You can also contribute by looking through imported CPAN distributions,
checking for mistakes and reporting them as issues.

I'm the author of X distribution and my real repository is over here!

gitPAN is intended to co-exist with, not compete with, the development
repository for a distribution. It provides a consistent, easy to find
interface to your releases so you don't have to.

You can make your own development repository more visible by adding
a repository resource to your release meta-data. See

I'm the author of X distribution, can I get commit access to gitPAN?

Sorry, no. gitPAN is intentionally read-only to provide a consistent
interface over all of CPAN. Allowing developers to commit directly to
gitPAN would endanger this consistency. In this sense, gitPAN is
simply a read-only view on your releases.

As the developer of the project, you should continue to develop
against your regular repo. However, it is helpful to fill in back
history should you be missing it. You can use the release tags and
dates on the gitpan repo to place tags into your development repo. If
history is completely missing, you can splice your development
repository on top of the gitpan repo. See "How can I merge gitPAN's
history with my module?" above.

You could develop off a gitpan fork, but the actual development
history of your project up to this point would be lost. Merging your
dev repo with gitpan is left as an exercise for the reader to do
usefully. If you tag your releases in a consistent manner and publish
the location of your repository, gitpan doesn't offer anything new to
the developer.

I noticed a problem with a repository

Please report it at <> or to

Who do we have to thank for gitPAN?

gitPAN exists on top of a pile of pre-existing technology and services.
Very little new code was written and the yaks were already well shorn.

Elaine Ashton for instituting BackPAN.
Jarkko, Graham Barr and the rest of the CPAN cabal.
Andreas König for tirelessly maintaining PAUSE.
brian d foy for spearheading BackPAN archeology.
Léon Brocard for Parse::BACKPAN::Packages to access the backpan index
  and maintaining the BackPAN index.
Linus and the git devs for git (this was tried before on SVN and guh...) for a generous donation of space and support and angry unicorns
Yanick Champoux for git-cpan-patch which does most of the work. for an rsync-able BackPAN mirror.
Michael Schwern glued it all together.
Integra Telecom has generously donated bandwidth and server space for
  gitpan and our own BackPAN mirror.
Joshua Keroes and Chris Hufnagel for providing server support.

How can I contact gitPAN?

Twitter: #gitpan
Something went wrong with that request. Please try again.