Islandora Datastream Exporter

Introduction

This module provides a Drush script for bulk exporting a datastream from every object whose PID is returned by a query.

Requirements

This module requires the following modules/libraries:

Installation

Install as you would any Drupal module; see the Drupal module installation documentation for further information.

Troubleshooting/Issues

Having problems or solved a problem? Check out the Islandora Google Groups for a solution.

Usage

Output of drush islandora_datastream_export --help:

Exports a specified datastream from all objects given a fielded Solr query.

Examples:
 drush -u 1 islandora_datastream_export --export_target=/tmp --query=PID:\"islandora:9\" --dsid=DC
     Exporting datastream from object via default Solr query.

Options:
 --dsid                                    The datastream id of to be exported datastream. Required.
 --export_target                           The directory to export the datastreams to. Required.
 --query                                   The query to be ran. Required.
 --query_type                              The type of query to run. Check the output of "drush islandora_datastream_export_types" for a list. Defaults to "islandora_datastream_exporter_solr_query".

Solr Backend

Because the query is passed directly to Solr, some values must be escaped. For example, in the PID field islandora:test will not work, while "islandora:test" or islandora\:test will. For queries that use Lucene syntax, every part of the query string must be escaped. Boolean logic is allowed.

Finally, the user option (-u) must be specified; otherwise errors may occur when writing the contents of the datastream to a file.
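
The escaping rule above can be sketched as a small shell helper. This is a minimal sketch, not part of the module: the function name escape_solr_value is hypothetical, and it handles only the colon case described above. The command is printed rather than run, and it uses only the options shown in the help output. The single quotes around the query keep the shell from consuming the backslash.

```shell
#!/bin/sh
# Hypothetical helper (not provided by the module): backslash-escape every
# colon in a value so that Solr does not read it as a field separator.
escape_solr_value() {
  printf '%s\n' "$1" | sed 's/:/\\:/g'
}

pid='islandora:test'
query="PID:$(escape_solr_value "$pid")"

# Print the command for review instead of running it.
# -u 1 follows the example in the help output.
printf "drush -u 1 islandora_datastream_export --export_target=/tmp --query='%s' --dsid=DC\n" "$query"
```

Quoting the value instead (PID:"islandora:test") is the other form that the text above says will work.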

RI Backend

To use the RI Backend:

  • Queries should be written in SPARQL format
  • Queries should SELECT a ?pid
  • The contents of the query should be saved in a plaintext file, and provided to the drush script as the --query parameter
  • To facilitate cycling through objects, the query should contain the string %offset%, which will be replaced by the current offset of the batch. For example:
SELECT ?pid
FROM <#ri>
WHERE {
  ?pid <fedora-rels-ext:isMemberOfCollection> <info:fedora/some:collection>
}
OFFSET %offset%
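
The steps above can be sketched in shell as follows. This is a minimal sketch under stated assumptions: the file path and the offsets 0 and 10 are illustrative, and --query is assumed to take the path to the query file. The render_query function is hypothetical and only imitates the %offset% replacement; the module does the real substitution itself. The RI value for --query_type is not stated here, so QUERY_TYPE is a placeholder to fill in from the output of drush islandora_datastream_export_types.

```shell
#!/bin/sh
# Save the SPARQL query to a plain-text file (the path is an assumption).
query_file="${TMPDIR:-/tmp}/ri_query.sparql"
cat > "$query_file" <<'EOF'
SELECT ?pid
FROM <#ri>
WHERE {
  ?pid <fedora-rels-ext:isMemberOfCollection> <info:fedora/some:collection>
}
OFFSET %offset%
EOF

# Illustration only: show the query as it would look once %offset% has been
# replaced for a given batch. The module performs this replacement itself.
render_query() {
  sed "s/%offset%/$1/" "$query_file"
}

render_query 0 | tail -n 1    # final line of the query at offset 0
render_query 10 | tail -n 1   # final line of the query at offset 10

# Print the export command for review. QUERY_TYPE is a placeholder, not a
# real value; take the RI type from "drush islandora_datastream_export_types".
printf 'drush -u 1 islandora_datastream_export --export_target=/tmp --query=%s --query_type=%s --dsid=DC\n' "$query_file" 'QUERY_TYPE'
```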

Maintainers/Sponsors

Current maintainers:

This project has been sponsored by:

  • University of Saskatchewan: a Canadian public research university, founded in 1907 and located on the east side of the South Saskatchewan River in Saskatoon, Saskatchewan, Canada.

Development

If you would like to contribute to this module, please check out our helpful Documentation for Developers info, as well as our Developers section on the Islandora.ca site.

License

GPLv3
