caltechlibrary/inveniordm-migrate

Assorted scripts to migrate content to InvenioRDM and S3 data sources

This repo holds scripts used to migrate content into InvenioRDM. They have generally been used for one-time migration activities, but may be useful in the future.

Usage

CaltechDATA

migrate_caltechdata.py was used to move records from the TIND-managed Invenio instance to InvenioRDM.

CaltechTHESIS

migrate_caltechthesis.py was used to create some minimal test records in InvenioRDM. It is not complete.

OSN Migration

For large collections of data we sometimes need to move the data first, and then create InvenioRDM records. An S3 object store like the Open Storage Network is a great option. You can bulk move records efficiently with s5cmd and the management scripts.

Run python make_command.py to generate a list of files to sync. Before running it, you'll need to set the following environment variables:

AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
S3_ENDPOINT_URL https://renc.osn.xsede.org
AWS_REGION us-east-1
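
Before starting a long transfer, it can help to confirm that these variables are actually set. The following is a minimal sketch of such a check, not part of the repository's scripts; the helper name `missing_env_vars` is hypothetical, and only the variable names come from the list above.

```python
import os

# Variable names taken from the list above.
REQUIRED_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "S3_ENDPOINT_URL",
    "AWS_REGION",
)


def missing_env_vars(environ=None):
    """Return the required variable names that are unset or empty.

    `environ` defaults to the current process environment; a plain dict
    can be passed instead, which makes the check easy to try out.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with a hand-built mapping (hypothetical values):
example = {"AWS_ACCESS_KEY_ID": "x", "AWS_REGION": "us-east-1"}
print(missing_env_vars(example))
# → ['AWS_SECRET_ACCESS_KEY', 'S3_ENDPOINT_URL']
```

Calling `missing_env_vars()` with no argument checks the real environment; an empty list means all four variables are set.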

Then run the generated commands with nohup ./s5cmd -numworkers 100 run commands.txt >> & log2017.txt ; echo Done >> & log2017.txt &, which logs to log2017.txt and appends Done once the run finishes. You may be able to adjust the numworkers value depending on the OS.
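
For orientation, the file passed to s5cmd run holds one s5cmd command per line. The sketch below shows one way such a file could be built from a local directory tree. It is an illustration only and is not the repository's make_command.py: the function names, the bucket name `my-bucket`, the prefix `2017`, and the local directory are all hypothetical, and it assumes paths contain no whitespace.

```python
from pathlib import Path


def build_commands(local_root, bucket, prefix=""):
    """Return one s5cmd 'cp' line per file found under local_root.

    Assumes file paths contain no whitespace; bucket and prefix are
    placeholders chosen by the caller.
    """
    root = Path(local_root)
    lines = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        # Object key mirrors the path relative to the local root.
        key = path.relative_to(root).as_posix()
        if prefix:
            key = f"{prefix.strip('/')}/{key}"
        lines.append(f"cp {path.as_posix()} s3://{bucket}/{key}")
    return lines


def write_commands(lines, out_path="commands.txt"):
    """Write the command lines to a file, one per line."""
    Path(out_path).write_text("\n".join(lines) + "\n")
```

A call such as `write_commands(build_commands("/data/2017", "my-bucket", "2017"))` would then produce a commands.txt in the current directory, again with hypothetical paths and names.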

Getting help

Raise an issue on the issue tracker.

License

Software produced by the Caltech Library is Copyright (C) 2023, Caltech. This software is freely distributed under a BSD/MIT type license. Please see the LICENSE file for more information.

Authors and history

These scripts were written by Tom Morrell.

Acknowledgments

This work was funded by the California Institute of Technology Library.
