Scripts, files and documentation for the NESSTAR data migration with pyDataverse.
Requirements:
- pyDataverse
- pydantic
git clone git@github.com:AUSSDA/pyDataverse_nesstar.git
cd pyDataverse_nesstar
pipenv install
Set up
pipenv shell
export INSTANCE="INSTANCE_NAME"
Place the .env files in the expected location with the required variables set; see src/settings.py for details.
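As an illustration of how such a settings module is commonly built with pydantic, the sketch below reads the exported INSTANCE together with a .env file. It is only a sketch: the class and every variable name other than INSTANCE are assumptions, so check src/settings.py for the real ones.

```python
# Illustrative sketch only: a pydantic (v1) settings class that reads a .env file.
# Everything here except INSTANCE is a placeholder, not taken from this repository.
from pydantic import BaseSettings


class Settings(BaseSettings):
    INSTANCE: str    # matches the exported INSTANCE environment variable
    BASE_URL: str    # URL of the target Dataverse installation (assumed name)
    API_TOKEN: str   # API token of the importing user (assumed name)

    class Config:
        env_file = ".env"  # remaining variables are loaded from the local .env file


settings = Settings()
```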
Workflow
In general, the workflow for data imports is as follows (a minimal pyDataverse upload sketch follows the list):
- Prepare the CSV file
- Prepare the script
- Prepare the Dataverse installations
  - Prepare Dataverse
  - Create Dataverse
- Test on the local Dataverse installation
  - Upload 1 dataset with datafiles
  - Review: Developer
  - Publish the dataset with datafiles
  - Review: Developer
  - Upload 10 datasets with datafiles
  - Publish the datasets
  - Review: Developer
  - Upload all datasets with datafiles
  - Publish the datasets
  - Review: Developer
- Review
- Test on the development Dataverse installation
  - Upload 1 dataset with datafiles
  - Review: Developer + Ingest
  - Upload all datasets with datafiles
  - Review: Developer + Ingest
- Import on the production Dataverse installation
  - Upload 1 dataset with datafiles
  - Review: Developer + Ingest
  - Upload all datasets with datafiles
  - Review: Developer + Ingest
  - Publish all datasets
  - Review: Developer + Ingest
- Clean up the Dataverse installations
  - Delete/destroy the datasets
  - Delete the Dataverses
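The upload and publish steps above map onto a handful of pyDataverse calls. The following is a minimal sketch, not the project's actual pipeline in src/nesstar.py: it assumes pyDataverse 0.3.x, and the base URL, API token, parent Dataverse alias and file names are placeholders.

```python
# Minimal sketch: create one dataset, attach one datafile, then publish it.
# BASE_URL, API_TOKEN, PARENT_ALIAS and the file names are placeholders.
from pyDataverse.api import NativeApi
from pyDataverse.models import Dataset, Datafile

BASE_URL = "https://dataverse.example.org"   # target Dataverse installation
API_TOKEN = "xxxx-xxxx-xxxx-xxxx"            # API token of the importing user
PARENT_ALIAS = "my_dataverse"                # alias of the parent Dataverse

api = NativeApi(BASE_URL, API_TOKEN)

# Load dataset metadata (pyDataverse upload JSON) and create the dataset.
ds = Dataset()
with open("dataset.json", encoding="utf-8") as f:
    ds.from_json(f.read())
resp = api.create_dataset(PARENT_ALIAS, ds.json())
pid = resp.json()["data"]["persistentId"]

# Attach one datafile to the new dataset.
df = Datafile()
df.set({"pid": pid, "filename": "data.csv"})
api.upload_datafile(pid, "data.csv", df.json())

# Publish a major version once the upload has been reviewed.
api.publish_dataset(pid, release_type="major")
```

Running a single dataset like this against the local installation first, as the workflow prescribes, keeps mistakes away from the development and production installations.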
Execute Script
Before you run the script, adapt the data pipeline control flags in src/nesstar.py.
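Such control flags are typically plain module-level booleans that switch individual pipeline steps on and off. The names below are illustrative only; use the flags actually defined in src/nesstar.py.

```python
# Illustrative flag names only, not the ones defined in src/nesstar.py.
DO_CREATE_DATASETS = True    # create datasets from the prepared CSV
DO_UPLOAD_DATAFILES = True   # attach datafiles to the created datasets
DO_PUBLISH = False           # publish datasets after review
DO_DELETE = False            # clean-up step: destroy datasets again
```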
cd src
python -m nesstar
Install
git clone git@github.com:AUSSDA/pyDataverse_nesstar.git
cd pyDataverse_nesstar
pipenv install --dev
pre-commit install