A tool for harvesting media files from Open Access articles for upload into Wikimedia Commons


The aim of this project is to write a tool that would:
* regularly spider PubMed Central to locate audio and video files published in the supplementary materials of CC BY-licensed articles in the Open Access subset
* convert these files to OGG
* upload them to Wikimedia Commons, along with the respective metadata
* provide for easy extension to other CC BY-licensed sources, beyond PubMed Central
* (possibly) suggest Wikipedia articles for which the video might be relevant

Wiki page:

    oa-get [download-metadata|download-media] [dummy|pmc|pmc_doi]
    oa-cache [browse-database|clear-database|clear-media|convert-media|find-media|list-articles|stats] [dummy|pmc|pmc_doi]
    oa-put upload-media [dummy|pmc|pmc_doi]
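
The subcommands above can be chained into a single harvesting run. The following is a minimal sketch, not part of the project: the order of the steps (metadata, find media, download, convert, upload) is an assumption inferred from the subcommand names, and `pipeline_commands`, `run_pipeline` and the `bindir` parameter are hypothetical helpers that assume the scripts sit in the current directory.

```python
import subprocess

# Sources named in the usage synopsis.
SOURCES = ("dummy", "pmc", "pmc_doi")


def pipeline_commands(source, bindir="."):
    """Return the argv lists for one run, in an ASSUMED step order."""
    if source not in SOURCES:
        raise ValueError("unknown source: {!r}".format(source))
    # Assumption: this ordering is inferred from the subcommand names,
    # it is not documented in the synopsis itself.
    steps = [
        ("oa-get", "download-metadata"),
        ("oa-cache", "find-media"),
        ("oa-get", "download-media"),
        ("oa-cache", "convert-media"),
        ("oa-put", "upload-media"),
    ]
    return [["{}/{}".format(bindir, tool), action, source]
            for tool, action in steps]


def run_pipeline(source, bindir=".", dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in pipeline_commands(source, bindir):
        print(" ".join(argv))
        if not dry_run:
            # check_call raises CalledProcessError, so the run stops
            # at the first failing step.
            subprocess.check_call(argv)
```

Calling `run_pipeline("dummy")` only prints the five commands; passing `dry_run=False` would execute them. Trying the dummy source before pmc is likely the safer first run.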

    python-dateutil
    python-elixir
    python-gst0.10
    python-magic
    python-mutagen
    python-progressbar
    python-xdg
    python-werkzeug
    python-wikitools (python-wikitools was imported into our tree and patched to ease deployment)

    sqlitebrowser

To use the upload feature of oa-put, copy the userconfig.example file to

A screencast showing usage can be played back with “ttyplay screencast”.

To plot mimetypes occurring in sources, install python-matplotlib and pipe the output of “oa-cache stats [source]” to the included plot-helper script.
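
The piping step can also be driven from Python. This is a minimal sketch using only the standard `subprocess` module; `pipe` is a hypothetical helper, and the `./oa-cache` and `./plot-helper` paths in the usage note assume the scripts are run from the repository root.

```python
import subprocess


def pipe(producer_argv, consumer_argv):
    """Run `producer | consumer` and return the consumer's exit status."""
    producer = subprocess.Popen(producer_argv, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(consumer_argv, stdin=producer.stdout)
    # Close our copy of the pipe so the producer is notified
    # if the consumer exits early.
    producer.stdout.close()
    status = consumer.wait()
    producer.wait()
    return status
```

For example, `pipe(["./oa-cache", "stats", "pmc"], ["./plot-helper"])` would feed the statistics for the pmc source to the plot script, much like a shell pipeline.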

The Open Access Media Importer is free software: 
you can redistribute it and/or modify it 
under the terms of the GNU General Public License 
as published by the Free Software Foundation, 
either version 3 of the License, 
or (at your option) any later version.