Simple Data Conservancy Package Ingest Service
This package ingest service is intended to transfer the contents of file archives (i.e. “packages”) into an LDP linked data repository such as Fedora. It includes:
- A core library in Java for ingesting packages of various formats
- A simple HTTP API
- An API-X extension for exposing deposit endpoints on repository containers
An archive contains custodial content (i.e. packaged files) and possibly additional packaging-specific metadata. A profile defines how these are distinguished. For example, all content of a simple zip or tar file can be presumed to be custodial content. BagIt defines custodial content as all files underneath a /data directory, and specifies additional “tag files” that may describe the circumstances of creating a bag (its author, date, etc.), checksums for files, and so on.
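The BagIt rule described above can be illustrated with a small sketch. It is not taken from the service's Java library; the function name `split_bagit_entries` is hypothetical, and the sketch assumes that every archive entry is prefixed by a single bag base directory (for example `mybag/data/report.pdf`).

```python
from pathlib import PurePosixPath


def split_bagit_entries(entry_names):
    """Partition archive entry names into (custodial, tag) lists.

    Assumption of this sketch: each entry starts with one bag base
    directory, so the payload lives under "<base>/data/".
    """
    custodial, tags = [], []
    for name in entry_names:
        if name.endswith("/"):
            continue  # skip directory entries; only files become resources
        parts = PurePosixPath(name).parts
        # parts[0] is the bag base directory, parts[1] the top-level entry
        if len(parts) > 2 and parts[1] == "data":
            custodial.append(name)
        else:
            tags.append(name)
    return custodial, tags
```

For a plain zip or tar profile the same partition would be trivial: every file entry would count as custodial content.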
The package ingest service creates a repository resource (an LDPR) from each file in the custodial content of a package.
Additional processing rules may apply for each supported profile; these may enhance the contents of LDPRs (e.g. add metadata) or create additional LDPRs. For example, if the package relates its resources into an LDP containment or membership hierarchy, the packaging profile may provide a way to encode this information when it is not otherwise present within the resources in the package.
Depending on policy, the original package may be discarded or kept (e.g. as part of an audit trail or for authorization). At minimum, the package ingest service will provide a log of all events that occurred during ingest.
If ingesting a package succeeds, further interaction with the newly created resources may be performed as usual via Fedora’s LDP-based API.
- Accommodate arbitrarily large packages with stream-oriented processing
- Allow the use of simple command-line tools to deposit and verify success/failure (e.g. curl, grep, etc.)
- Accommodate backend workflows and policies
- Support synchronous and asynchronous paradigms in exposed APIs
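To make the stream-oriented processing goal concrete, here is a minimal sketch in Python (not the service's Java implementation) that walks a tar package strictly sequentially using the standard `tarfile` module. The function name `iter_package_files` is hypothetical.

```python
import io
import tarfile


def iter_package_files(stream):
    """Yield (name, file object) for each regular file in a tar stream.

    Mode "r|*" reads the archive sequentially with transparent
    compression, so the input does not need to be seekable and the
    package is never held in memory as a whole.
    """
    with tarfile.open(fileobj=stream, mode="r|*") as archive:
        for member in archive:
            if member.isfile():
                # In stream mode the file object must be consumed
                # before advancing to the next member.
                yield member.name, archive.extractfile(member)
```

Each yielded file could then be handed to whatever creates the corresponding repository resource before the next entry is read, which keeps memory use independent of package size.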
- Produce a package, for example by:
  - Zipping up a file system
  - Exporting from a repository
  - Generating resources by some local process (e.g. a desktop GUI, laboratory instrument, etc.)
- Choose a container in the repository to deposit into (an LDPC, identified by its URI).
  - No specific discovery mechanism is defined; it is presumed that a client can inspect repository resources and pick one to deposit into, or is given a URI for this purpose.
- Submit the package to the container.
  - A new member resource will be created, and the contents of the package placed into it.
- Follow the deposit results.
  - An event stream reports processing as it happens and indicates success or failure.
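The submit step might look roughly like the following client-side sketch, which only builds the HTTP request and does not send it. Everything specific here is an assumption rather than the service's documented API: the function name `build_deposit_request`, the use of POST against a deposit URI, and the `application/zip` content type are all hypothetical.

```python
import urllib.request


def build_deposit_request(deposit_uri, package, content_type="application/zip"):
    """Build (but do not send) a POST carrying a package.

    Assumptions of this sketch: the deposit endpoint accepts a POST
    with the package as the request body, and the media type shown
    is appropriate for the chosen packaging profile.
    """
    return urllib.request.Request(
        deposit_uri,
        data=package,  # bytes or a file-like object opened in binary mode
        method="POST",
        headers={"Content-Type": content_type},
    )
```

Passing the request to `urllib.request.urlopen` and reading the response incrementally would be one way to follow the deposit results; the format of the event stream is not specified here, so no parsing is shown.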