oai2pairtree.py harvests records from an OAI-PMH repository and stores them in a pairtree on the filesystem.
oai2pairtree.py http://www.pubmedcentral.nih.gov/oai/oai.cgi
or if you want to limit to a particular set:
oai2pairtree.py http://www.pubmedcentral.nih.gov/oai/oai.cgi --set pmc-open
or if you want to also limit to a particular kind of record metadata:
oai2pairtree.py http://www.pubmedcentral.nih.gov/oai/oai.cgi --set pmc-open --metadata_prefix pmc
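To make the two halves of that workflow concrete, here is a minimal sketch of what a harvest request and a pairtree location look like. It is not taken from oai2pairtree's source: the function names (list_records_url, clean_identifier, pairtree_path), the oai_dc default, and the pairtree_root directory name are assumptions for illustration. The request follows the OAI-PMH ListRecords verb with its metadataPrefix and set arguments, and the path mapping follows the character-cleaning and two-character splitting rules of the Pairtree specification. The layout the tool actually writes may differ.

```python
from urllib.parse import urlencode

# Characters the Pairtree spec hex-encodes as ^hh (besides non-visible ASCII).
_HEX_ENCODE = set('"*+,<=>?\\^|')
# Single-character swaps applied after hex-encoding.
_SWAP = {"/": "=", ":": "+", ".": ","}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def clean_identifier(identifier):
    """Apply Pairtree character cleaning to an identifier."""
    out = []
    for byte in identifier.encode("utf-8"):
        ch = chr(byte)
        if byte < 0x21 or byte > 0x7E or ch in _HEX_ENCODE:
            out.append("^%02x" % byte)
        else:
            out.append(_SWAP.get(ch, ch))
    return "".join(out)


def pairtree_path(identifier, root="pairtree_root"):
    """Split the cleaned identifier into two-character directory names."""
    cleaned = clean_identifier(identifier)
    parts = [cleaned[i:i + 2] for i in range(0, len(cleaned), 2)]
    return "/".join([root] + parts)


if __name__ == "__main__":
    print(list_records_url(
        "http://www.pubmedcentral.nih.gov/oai/oai.cgi",
        metadata_prefix="pmc",
        set_spec="pmc-open",
    ))
    # Example identifier borrowed from the Pairtree specification.
    print(pairtree_path("ark:/13030/xt12t3"))
```

Run as a script, this prints the request URL that corresponds to the --set pmc-open --metadata_prefix pmc invocation above, followed by pairtree_root/ar/k+/=1/30/30/=x/t1/2t/3 for the sample identifier.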
oai2pairtree requires lxml and ptree to run. The best way to get these is to run:
easy_install oai2pairtree
or:
pip install oai2pairtree
or, if you prefer:
git clone https://github.com/edsu/oai2pairtree.git
cd oai2pairtree
python setup.py install
- CC0