What's Different Between Islandora 1.x and Islandora CLAW
In the most basic terms, Islandora CLAW is the version of Islandora that works with Fedora 4. Because Fedora 4 is a vastly different platform than Fedora 3, so too is Islandora CLAW a major departure from what came before. Switching to CLAW represents not just a typical upgrade with improvements, features, and bug fixes, but rather a major shift in how objects are stored and managed.
Moving from Islandora 7.x-1.x to CLAW requires a migration of objects, which you can learn about here. It also requires some adjustments in how you think about your objects and their relationships, and how to manage them in Islandora, which we will cover below.
You can also check out some of the documentation provided by the Fedora project:
- Concept Mapping - Fedora 3 to 4
- The Fedora 4 object model
- Fedora 3 to 4 Upgration
- LDP-PCDM-F4 In Action
Fedora 3 stored all objects at the top level of the repository, although presentation of the objects could mimic a directory structure by having objects 'in' collections and collections 'in' other collections. This image is a helpful oversimplification:
Fedora 4 differs considerably in that there is an innate tree hierarchy to the repository rather than a flat structure. Put less simply, "a Fedora 4 repository consists of a directed acyclic graph of resources where edges represent a parent-child relation".
Fedora 3 objects are FOXML (Fedora Object eXtensible Markup Language) documents, with three elements:
Digital Object Identifier: A unique, persistent identifier for the digital object. Also knowns as the PID.
System Properties: A set of system-defined descriptive properties that is necessary to manage and track the object in the repository.
Datastream(s): The element in a Fedora digital object that represents a content item.
In Fedora 4 , what we would have called
objects are now referred to as
Resources (and everything in Fedora 4 is a
Resource). Instead of being composed of XML as they were in Fedora 3, they are stored in ModeShape as nodes with RDF properties.
Resources come in two flavors:
RDF Sources, which are
Resources having only RDF data, and
Non-RDF Sources, which are
Resources that are binary files (HTML, PDFs, images, audio, video, etc). The terms [
RDF Source] and [
Non-RDF Source] both come from the W3C's Linked Data Platform specification, which also defines the idea of Linked Data Platform Containers, or
LDPC is an
RDF Source that functions as a collection of
Resources, similar to the way Islandora 7.x-1.x compound objects exist only as a way to tie together its children. An
LDPC may contain
Non-RDF Sources, as well as other
RDF Sources acting as
LDPCs; you can have a container of containers just like how Islandora 7.x-1.x can have a collection of collections.
CLAW makes use of the Portland Common Data Model (PCDM) as a layer of abstraction over
LDPCs to make containment simpler to understand for users; a
pcdm:Collection may contain other
pcdm:Objects (similar to an Islandora 7.x-1.x collection content model), and a
pcdm:Object may contain other
pcdm:Objects (similar to the way an Islandora 7.x-1.x compound object has child objects) or
pcdm:Files (similar to the way Islandora 7.x-1.x objects have datastreams).
In Islandora 7.x-1.x, every object has a specific content model which defined what datastreams it could have and which were absolutely required. Some of these Islandora 7.x-1.x datastreams contained metadata about the object while others contained binary files (JPG, PDF, MP3, PNG, TIFF, etc). In Islandora CLAW, all metadata about a resource is stored as RDF attributes directly on the resource itself, whether that resource is a
pcdm:Object or a
pcdm:File, so we no longer need to separate metadata by type (RELS-EXT, RELS-INT, MODS, DC, PREMIS, etc) and store it in binary files as we did in Islandora 7.x-1.x.
Binary files, such as JPGs, PNGs, MP3s, and PDFs, are handled via
pcdm:Files which are contained by a parent
pcdm:Object, similar to how an Islandora 7.x-1.x cmodel may hold a PDF or JPG as a datastream. Unlike Islandora 7.x-1.x, these binary files can actually have their own technical metadata attached them. This is because
pcdm:Objects and even
pcdm:Files are all
RDF Sources containing only RDF data, with
pcdm:Files having links to the URL of the
Non-RDF Source (binary file) they represent as part of their RDF data in addition to whatever other metadata you may want about the file. Using this system, a
pcdm:Object can contain as many
pcdm:Files as necessary, and each
pcdm:File can have separate metadata about itself and its relationship to other
pcdm:Files attached to the parent
pcdm:Object, serving the same purpose RELS-INT datastreams served in Islandora 7.x-1.x.
Note that you can use a
pcdm:File to represent a file of metadata, such as MODS, DC, or PBCore, in case you would like to preserve a copy of an object's legacy metadata when migrating into Fedora 4. These metadata files will be treated like any other binary file in Islandora CLAW, and will not be indexed or editable through the GUI.
Every object in a Fedora 3 repository had a Persistent Identifier following the pattern
namespace:pid. Fedora 4 resources do not have PIDs. Instead, since Fedora 4 is an LDP server, their identifiers are fundamentally their URIs. The PIDs of objects migrated from a Fedora 3 repository can still be stored in Fedora 4, as additional properties on the new Fedora 4 resource.
Since resources are stored as
nodes on the Drupal side of Islandora CLAW, they also have Drupal UUIDs.