Hackage mirroring tool

This is a simple tool for mirroring Hackage to S3-compatible object stores (e.g. Dreamhost or AWS).

See also hackage-mirror-tool --help.

Resource requirements

Currently, using this tool to operate a http://hackage.haskell.org mirror has the following requirements:

  • ~1 GiB of local filesystem storage (used for the local 01-index.tar cache)
  • ~10 GiB of storage in the S3 bucket (at the time of writing, ~7.1 GiB were needed; this size increases monotonically over time)
  • A single-threaded hackage-mirror-tool run needs less than ~256 MiB RAM; in other words, a small VM with 512 MiB RAM suffices.
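
The local storage requirement can be checked before the first run. The following is a minimal sketch, not part of hackage-mirror-tool: the helper name has_free_kib is hypothetical, and the check assumes a POSIX df. It is run against the current directory here; point it at the actual working directory instead.

```shell
# Hypothetical helper (not part of hackage-mirror-tool): succeeds if the
# filesystem holding DIR has at least MIN_KIB kibibytes available.
# `df -Pk` prints POSIX-format output in 1 KiB units; the fourth column
# of the second line is the available space.
has_free_kib() {
  dir=$1
  min_kib=$2
  avail_kib=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
  [ "$avail_kib" -ge "$min_kib" ]
}

# 1 GiB = 1048576 KiB, matching the local 01-index.tar cache requirement.
if has_free_kib "$PWD" 1048576; then
  echo "enough local space for the 01-index.tar cache"
else
  echo "less than 1 GiB free in $PWD" >&2
fi
```

The bucket size and RAM figures are not covered by this check; the RAM limit is enforced separately by the -M256M RTS option used in the cronjob example.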

Example usages

cronjob-based

This is a simple example of how to set up a cronjob-based mirror job that is triggered every 3 minutes.

Create the following crontab(5) entry:

*/3 * * * *  ${HOME}/bin/run_mirror_job.sh

The ${HOME}/bin/run_mirror_job.sh script contains:

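#!/bin/bash

# Create the working directory (including its log directory) and enter it.
mkdir -p "${HOME}/workdir/logs"
cd "${HOME}/workdir/" || exit 1

# Run one mirror pass; the S3 credentials are passed via the environment.
# stdout and stderr are appended to a per-day log file.
S3_ACCESS_KEY="ASJKDS..." \
S3_SECRET_KEY="asdjhakjsdhadhadjhaljkdh..." \
timeout -k5 170 "${HOME}/bin/hackage-mirror-tool" +RTS -t -A2M -M256M -RTS \
  --hackage-url      http://hackage.haskell.org \
  --hackage-pkg-url  http://hackage.haskell.org/package/ \
  --s3-base-url      https://s3.amazonaws.com \
  --s3-bucket-id     my-hackage-mirror \
   &>> "${HOME}/workdir/logs/$(date -I).log"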

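The timeout -k5 170 arguments ensure that the current job is killed before the next cronjob gets started: timeout sends SIGTERM after 170 seconds and, if the job is still running 5 seconds later, SIGKILL. The job is therefore gone after at most 175 seconds, which is less than the 180-second cron interval.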
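
If the cron interval is changed, the timeout arguments need to change with it. The following sketch spells out the arithmetic; the helper name fits_in_interval is hypothetical and only illustrates the check.

```shell
# Hypothetical helper: succeeds if a job bounded by
# `timeout -k KILL_AFTER DURATION` is guaranteed to have exited before the
# next cron run, which starts INTERVAL seconds after the current one.
fits_in_interval() {
  duration=$1
  kill_after=$2
  interval=$3
  [ $(( duration + kill_after )) -lt "$interval" ]
}

# Values from the example above: 170 s + 5 s = 175 s < 180 s (every 3 minutes).
if fits_in_interval 170 5 180; then
  echo "job ends before the next cron run"
else
  echo "job may overlap with the next cron run" >&2
fi
```

For a 5-minute interval (300 seconds), timeout -k5 290 would satisfy the same condition.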

Sample AWS access policy

The bucket name in the two Resource ARNs (hackage-mirror-tool in this sample) must match the bucket passed via --s3-bucket-id.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "bucketlevel",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::hackage-mirror-tool"
            ]
        },
        {
            "Sid": "objectlevel",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:PutObjectAcl"
            ],
            "Resource": [
                "arn:aws:s3:::hackage-mirror-tool/*"
            ]
        }
    ]
}
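
The sample policy names the bucket hackage-mirror-tool, while the cronjob example uses my-hackage-mirror. As a sketch, the policy can be generated for any bucket name with a small shell function; mirror_policy is a hypothetical helper, not something shipped with the tool.

```shell
# Hypothetical helper (not part of hackage-mirror-tool): prints the sample
# policy with the given bucket name substituted into both Resource ARNs.
mirror_policy() {
  bucket=$1
  cat <<EOF
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "bucketlevel",
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": ["arn:aws:s3:::${bucket}"]
        },
        {
            "Sid": "objectlevel",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject", "s3:PutObjectAcl"],
            "Resource": ["arn:aws:s3:::${bucket}/*"]
        }
    ]
}
EOF
}

# Print a policy for the bucket used in the cronjob example.
mirror_policy my-hackage-mirror
```

The output can be redirected to a file and attached to the IAM user whose keys are passed in S3_ACCESS_KEY and S3_SECRET_KEY.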