Dynamic metadata block generation and DVUploader integration #16

Merged
JR-1991 merged 111 commits into main from flexible-connect on Apr 4, 2024

@JR-1991 JR-1991 commented Mar 20, 2024

Overview

This pull request introduces dynamic generation of metadata blocks for Dataverse versions >= 5.14, which is based on IQSS/dataverse#9213. The metadata schemas are retrieved from a Dataverse instance and converted into metadata block objects that can be filled with metadata. On upload, these objects are transformed into compliant Dataverse JSON and sent through pyDataverse's standard dataset creation/update methods. This pull request also integrates python-dvuploader as the file upload solution, allowing for parallel native/direct uploads to Dataverse. File downloads are parallelized as well.
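To illustrate the idea of building block objects from a retrieved schema and serializing them again, here is a minimal sketch using only the standard library. It is not the implementation in this pull request: the schema dictionary, `build_block`, and `to_dataverse_json` are hypothetical names, and the schema is a heavily simplified stand-in for what a Dataverse instance returns. Only the output layout (`typeName`, `typeClass`, `multiple`, `value`) follows the Dataverse dataset JSON format.

```python
from dataclasses import asdict, field, make_dataclass
from typing import Any, Optional

# Hypothetical, simplified schema for one metadata block (assumption, not the
# exact response of a Dataverse instance).
CITATION_SCHEMA = {
    "name": "citation",
    "fields": {
        "title": {"typeClass": "primitive", "multiple": False},
        "subject": {"typeClass": "controlledVocabulary", "multiple": True},
    },
}


def build_block(schema: dict) -> type:
    """Create a dataclass with one attribute per field in the schema."""
    attrs = []
    for name, spec in schema["fields"].items():
        if spec["multiple"]:
            attrs.append((name, list, field(default_factory=list)))
        else:
            attrs.append((name, Optional[str], field(default=None)))
    return make_dataclass(schema["name"].capitalize(), attrs)


def to_dataverse_json(block: Any, schema: dict) -> dict:
    """Serialize the filled-in fields into the nested Dataverse JSON layout."""
    fields = []
    for name, value in asdict(block).items():
        if value is None or value == []:
            continue  # skip fields that were never filled
        spec = schema["fields"][name]
        fields.append(
            {
                "typeName": name,
                "typeClass": spec["typeClass"],
                "multiple": spec["multiple"],
                "value": value,
            }
        )
    return {schema["name"]: {"fields": fields}}


# Build the class at runtime, fill it, and serialize it.
Citation = build_block(CITATION_SCHEMA)
citation = Citation()
citation.title = "My dataset"
citation.subject = ["Other"]
payload = to_dataverse_json(citation, CITATION_SCHEMA)
```

The sketch uses dataclasses only to stay dependency-free; the same pattern carries over to Pydantic models, which additionally validate the values.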

TLDR

  • Generates metadata block objects on the fly
  • Integrates python-dvuploader for file uploads
  • Extends the unit/integration test framework (coverage: 76%)
  • Parallelizes file downloads and adds subtractive selection using patterns
  • Lists metadata block configurations for an overview
  • Migrates to Pydantic V2
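As a rough illustration of the download item above, the following sketch combines pattern-based exclusion with a thread pool. It assumes that "subtractive selection" means dropping files that match given patterns; `select_files`, `download_all`, and `fake_download` are hypothetical helpers, and the download is a stand-in that makes no network request. It does not reflect the actual easyDataverse API.

```python
from concurrent.futures import ThreadPoolExecutor
from fnmatch import fnmatch


def select_files(paths, exclude_patterns):
    """Keep every path that matches none of the exclude patterns."""
    return [
        p for p in paths
        if not any(fnmatch(p, pattern) for pattern in exclude_patterns)
    ]


def fake_download(path):
    """Stand-in for a real download; returns the path and a dummy size."""
    return path, len(path)


def download_all(paths, fetch=fake_download, max_workers=4):
    """Call fetch() for each path on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, paths))


remote_files = ["data/a.csv", "data/b.csv", "logs/run.log", "README.md"]
selected = select_files(remote_files, exclude_patterns=["*.log"])
results = download_all(selected)
```

Threads are a reasonable fit here because downloads are I/O-bound; `pool.map` returns results in the order of the input paths regardless of which download finishes first.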

Example

from easyDataverse import Dataverse

# Connect to a Dataverse installation
dataverse = Dataverse(
  server_url="https://demo.dataverse.org",
  api_token="MY_API_TOKEN",
)

# Initialize a dataset
dataset = dataverse.create_dataset()

# Fill metadata blocks
dataset.citation.title = "My dataset"
dataset.citation.subject = ["Other"]
dataset.citation.add_author(name="John Doe")
dataset.citation.add_dataset_contact(name="John Doe", email="john@doe.com")
dataset.citation.add_ds_description(value="This is a description of the dataset")

# Add files or directories to the dataset
dataset.add_file(local_path="./my.file", dv_dir="some/dir")
dataset.add_directory(dirpath="./my_directory", dv_dir="some/dir")

# Upload to the dataverse instance
dataset.upload("my_dataverse_id")

@JR-1991 JR-1991 added the enhancement New feature or request label Mar 20, 2024
@JR-1991 JR-1991 self-assigned this Mar 20, 2024
@JR-1991 JR-1991 marked this pull request as ready for review March 20, 2024 08:19
JR-1991 and others added 2 commits March 20, 2024 09:21
…egories-to-files-in-a-dataset

Add `categories` kwargs - fixes #15
@JR-1991 JR-1991 merged commit 7bc5059 into main Apr 4, 2024
9 checks passed
@JR-1991 JR-1991 deleted the flexible-connect branch May 11, 2024 10:15