Tools to help you work with the ORCID datadump
Latest commit 61b7dbc by Tom Demeranville, Nov 10, 2017: updated readme


ORCID datadump to Mongo

This Python 3 script takes the ORCID JSON datadump (a .tar.gz archive) and loads it directly into MongoDB. This is orders of magnitude quicker than extracting it to the file system.


You need pymongo:

python3 -m pip install pymongo

You also need a MongoDB instance. Either install one locally or use Docker:

docker run --name orcid-mongo -v /Volumes/Transcend/docker/mongostorage/:/data/db -d mongo:3.5


With a MongoDB instance running, start the import:

python3 --file filename.tar.gz --collection target-collection-name
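
The direct-load approach described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function names `iter_records` and `load_dump`, the batch size, and the assumption that the archive holds one JSON document per `.json` member are all hypothetical. The archive is read as a stream, so nothing is unpacked to disk.

```python
import json
import tarfile


def iter_records(tar_path):
    """Yield one parsed JSON document per .json member of the archive."""
    # Mode "r|gz" reads the gzip stream sequentially, so members are
    # parsed as they go by and never written to the file system.
    with tarfile.open(tar_path, mode="r|gz") as archive:
        for member in archive:
            if not member.isfile() or not member.name.endswith(".json"):
                continue
            handle = archive.extractfile(member)
            if handle is not None:
                yield json.load(handle)


def load_dump(tar_path, collection, batch_size=1000):
    """Insert every record into `collection` in batches; return the count.

    `collection` is anything with an insert_many(list) method, such as a
    pymongo Collection.
    """
    batch, total = [], 0
    for record in iter_records(tar_path):
        batch.append(record)
        if len(batch) >= batch_size:
            collection.insert_many(batch)
            total += len(batch)
            batch = []
    if batch:  # flush the final partial batch
        collection.insert_many(batch)
        total += len(batch)
    return total
```

With pymongo installed, a call might look like `load_dump("filename.tar.gz", pymongo.MongoClient()["orcid"]["target-collection-name"])`, where the database name `orcid` is only an example.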

Other scripts

dump_to_scholix_v2api.js: takes the dump (once loaded into MongoDB) and transforms it into a Scholix-like format.
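
To give a feel for what such a transform involves, here is a rough Python sketch. It does not reproduce dump_to_scholix_v2api.js: the function name `record_to_links`, the ORCID record paths it reads (`orcid-identifier`, `activities-summary`, `works`, `group`, `external-ids`) and the output field names are assumptions about the v2 record layout and about what a Scholix-like link might contain.

```python
def record_to_links(record, provider="ORCID"):
    """Build one Scholix-like link per work identifier in an ORCID record.

    All field names below are assumptions made for illustration.
    """
    orcid_id = (record.get("orcid-identifier") or {}).get("path")
    activities = record.get("activities-summary") or {}
    groups = (activities.get("works") or {}).get("group") or []

    links = []
    for group in groups:
        ext_ids = (group.get("external-ids") or {}).get("external-id") or []
        for ext in ext_ids:
            links.append({
                "LinkProvider": [{"Name": provider}],
                "RelationshipType": {"Name": "IsRelatedTo"},
                "Source": {
                    "Identifier": {"ID": orcid_id, "IDScheme": "orcid"},
                },
                "Target": {
                    "Identifier": {
                        "ID": ext.get("external-id-value"),
                        "IDScheme": ext.get("external-id-type"),
                    },
                },
            })
    return links
```

Records with no works, or with null sections, simply produce an empty list.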


The import script also works with Docker.

To start mongo:

docker run --name orcid-mongo -d -p 27017:27017 -v /Volumes/Transcend/docker/mongostorage/:/data/db mongo:3.5

To start the script:

cd docker-python
docker build -t orcid-mongo-import .
docker run --name importer --link orcid-mongo:orcid-mongo --rm orcid-mongo-import --file myFile.tar.gz --collection myCollection --resume 0 
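
The three flags passed to the importer above could be parsed along these lines. This is a hedged sketch: the flag names come from the command, but the meaning given to `--resume` (the number of records to skip because they were already imported) is an assumption, and `parse_args` and `skip_already_imported` are hypothetical names.

```python
import argparse
from itertools import islice


def parse_args(argv=None):
    """Parse the --file, --collection and --resume flags."""
    parser = argparse.ArgumentParser(
        description="Load an ORCID dump into MongoDB")
    parser.add_argument("--file", required=True,
                        help="path to the .tar.gz dump")
    parser.add_argument("--collection", required=True,
                        help="target collection name")
    # Assumption: --resume counts records already imported in an
    # earlier run, so 0 means "start from the beginning".
    parser.add_argument("--resume", type=int, default=0,
                        help="number of records to skip")
    return parser.parse_args(argv)


def skip_already_imported(records, resume):
    """Drop the first `resume` items from an iterable of records."""
    return islice(records, resume, None)
```

Skipping by count only makes sense if the archive is read in the same order on every run, which holds for a sequential read of a tar file.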

To run the script again (the container must still exist, so omit --rm from the docker run command above):

docker start importer