
7. Relationships

Matt Woodburn edited this page Jul 14, 2023 · 9 revisions

7.1 Linking ObjectGroups

There are several methods by which associations between different ObjectGroups, representing collections and subcollections, may be constructed.

Direct linkage using the ResourceRelationship class

The ResourceRelationship class can be used to make direct links between ObjectGroups, using the relationshipOfResource property to specify the nature of the relationship. For example, a hierarchy of parent-child relationships can be constructed by using ResourceRelationships with a relationshipOfResource of ‘part of’, or more semantic relationships can be reflected, such as between an ObjectGroup of physical collection objects, and an ObjectGroup representing digital multimedia artefacts derived from those objects.

With the application of the relationshipEstablishedDate property, this method of linkage may also be used to represent changes to physical collections over time (for example, collections that are split up or merged together), or to track changes to how the data are organised in LtC. For example, a year after the initial creation of an institution’s collection descriptions dataset, it might be decided to split the ObjectGroup representing the whole zoological collection into two ObjectGroups representing the vertebrates and invertebrates separately, in order to store more specific descriptions and metrics against them. Creating this kind of provenance relationship enables some continuity of reporting across the two versions of the dataset, even though the shape of the data has changed.



Figure 18: An example of using the ResourceRelationship class to represent ObjectGroups that are part of a larger ObjectGroup in a collections hierarchy.
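
As a rough illustration of the hierarchy described above, the following Python sketch walks a set of ‘part of’ relationships upwards from a child ObjectGroup. The identifiers and the "resourceID" / "relatedResourceID" keys are assumptions made for the example, as is the convention that the subject resource is ‘part of’ the related resource; only relationshipOfResource is taken from the text above.

```python
# Hypothetical ResourceRelationship records. The ID values and the
# "resourceID" / "relatedResourceID" keys are assumed for illustration.
relationships = [
    {"resourceID": "OG-vertebrates", "relatedResourceID": "OG-zoology",
     "relationshipOfResource": "part of"},
    {"resourceID": "OG-invertebrates", "relatedResourceID": "OG-zoology",
     "relationshipOfResource": "part of"},
    {"resourceID": "OG-zoology", "relatedResourceID": "OG-all",
     "relationshipOfResource": "part of"},
]


def ancestors(resource_id, relationships):
    """Follow 'part of' links upwards and return the chain of parent IDs."""
    parent_of = {
        r["resourceID"]: r["relatedResourceID"]
        for r in relationships
        if r["relationshipOfResource"] == "part of"
    }
    chain = []
    current = resource_id
    # Stop at the top of the hierarchy, or if a link would revisit a parent.
    while current in parent_of and parent_of[current] not in chain:
        current = parent_of[current]
        chain.append(current)
    return chain


print(ancestors("OG-vertebrates", relationships))
```

A single-parent lookup is enough here because each ObjectGroup in the example has at most one ‘part of’ link; a dataset that allows several parents would need a different structure.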

By a similar method, hierarchical structures may also be constructed in other relevant classes to represent breakdowns of collections in different contexts. For example, OrganisationalUnit instances may be chained together with ‘part of’ relationships to represent the divisions and subdivisions within an institution. Taxon instances may be linked to form a taxonomic hierarchy, or ObjectClassification instances linked to create a less formal hierarchy of object types. Using these linked classes and attaching them to ObjectGroups enables collections to be arranged in contextual hierarchies instead of (or in addition to) creating explicit relationships between the ObjectGroups themselves. Figure 19 below shows a simple example of creating parallel hierarchies in ObjectGroups and OrganisationalUnits in an LtC dataset.


Figure 19: Example of parallel hierarchies in the OrganisationalUnit and ObjectGroup classes, with some additional semantic relationships between instances of the two classes.

Simple representation of subcollections

A frequent use case of collection descriptions is to have the ability to store a list of notable, historic and important collections that may be part of a larger institutional collection. This scenario can be modelled in LtC using the approach above, by creating an ObjectGroup for the parent collection and an ObjectGroup for each of the child collections, and linking the children to the parent through the ResourceRelationship class.

While this approach provides the ability to attach rich LtC descriptions (including metrics and narratives) to each of these ‘named’ collections, assembling and maintaining this more complex dataset involves a degree of effort that may not be commensurate with the time and resources available. A more lightweight approach may also be taken in LtC, where the relatedResourceName and relationshipOfResource properties of the ResourceRelationship class are used to generate a simple list of subcollections attached to the parent ObjectGroup (see Figure 20 below).


"ResourceRelationship": [
    {
        "relatedResourceName": "Sloane Herbarium",
        "relationshipOfResource": "contains"
    },{
        "relatedResourceName": "W E Cutler Collections",
        "relationshipOfResource": "contains"
    }
]

Figure 20: Diagram and JSON example of using the ResourceRelationship class for simple representation of subcollections.

This approach also provides a methodology for starting with a fairly simple LtC dataset, and building up the detail over time as and when opportunity and resourcing permits. For example, a dataset might begin with a single ObjectGroup representing the whole institutional collection, and a set of ResourceRelationship records listing the divisional collections within that collection. At later points, those subcollections could then be broken out into their own ObjectGroups in order to describe and quantify them in more detail.
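
The incremental approach described above could be sketched as follows. This is only an illustration: the "collectionName" key, the parent's name and the break_out helper are assumptions made for the example, and the stub ObjectGroups it produces would still need their own descriptions and metrics.

```python
# Parent ObjectGroup carrying a simple list of subcollections, as in Figure 20.
# "collectionName" and its value are assumed for illustration.
parent = {
    "collectionName": "Whole institutional collection",
    "ResourceRelationship": [
        {"relatedResourceName": "Sloane Herbarium",
         "relationshipOfResource": "contains"},
        {"relatedResourceName": "W E Cutler Collections",
         "relationshipOfResource": "contains"},
    ],
}


def subcollection_names(object_group):
    """List the names of subcollections recorded with a 'contains' relationship."""
    return [
        r["relatedResourceName"]
        for r in object_group.get("ResourceRelationship", [])
        if r["relationshipOfResource"] == "contains"
    ]


def break_out(object_group):
    """Create one stub ObjectGroup per named subcollection, linked to the parent."""
    return [
        {
            "collectionName": name,
            "ResourceRelationship": [
                {"relatedResourceName": object_group["collectionName"],
                 "relationshipOfResource": "part of"}
            ],
        }
        for name in subcollection_names(object_group)
    ]


for child in break_out(parent):
    print(child["collectionName"])
```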

Using common entities and controlled vocabularies

ObjectGroups can also be indirectly related by the use of common entities and controlled vocabularies to allow associations to be made between them (although this method is more general data good practice, rather than anything specific to the LtC standard). For example, if a controlled vocabulary is used for the discipline property of the ObjectGroup class, then queries such as ‘find all the ObjectGroups representing ‘Botany’ objects’ and ‘provide the total number of ‘Botany’ objects held by the institution’ are easy to execute without the need to create explicit relationships between the multiple ObjectGroups involved.
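
To illustrate the kind of query described above, the sketch below filters and totals ObjectGroups by a shared discipline value. The records, the "name" and "objectQuantity" keys and all quantities are invented for the example; in an LtC dataset the quantity would normally be held in a MeasurementOrFact record rather than directly on the ObjectGroup.

```python
# Illustrative ObjectGroups; names and quantities are invented.
object_groups = [
    {"name": "Herbarium sheets", "discipline": "Botany", "objectQuantity": 1200},
    {"name": "Seed collection", "discipline": "Botany", "objectQuantity": 300},
    {"name": "Pinned insects", "discipline": "Zoology", "objectQuantity": 5000},
]


def groups_for(discipline, groups):
    """Find all ObjectGroups tagged with the given controlled-vocabulary value."""
    return [g for g in groups if g["discipline"] == discipline]


def total_objects(discipline, groups):
    """Total the object quantity across all ObjectGroups for a discipline."""
    return sum(g["objectQuantity"] for g in groups_for(discipline, groups))


print([g["name"] for g in groups_for("Botany", object_groups)])
print(total_objects("Botany", object_groups))
```

The exact-match comparison only works because the discipline values come from a controlled vocabulary; free-text values such as "botany" or "Botanical" would silently fall out of the results.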

Using a LatimerCoreScheme

The LatimerCoreScheme class provides a construct for grouping together ObjectGroups as part of the same LtC implementation, for a particular use case, and applying rules about how the data may be constrained, validated and interpreted by software agents. This approach is described in more detail in the Latimer Core Schemes section.

7.2 Modelling approaches

ObjectGroups and relationships

In many ways, the terms used to characterise the collection objects within an ObjectGroup are similar or identical to those used to describe an individual collection object. However, there is a fundamental difference in how those terms relate to an individual object and to an ObjectGroup. Because an ObjectGroup represents multiple objects, there is always the potential for more than one value for any of those terms.

For example, a single object will only have one preservation method (e.g. ‘dried and pinned’), whereas an ObjectGroup can represent objects with a variety of preservation methods. The former represents a one-to-many relationship between the term and the object (an object may not have more than one preservation method, but one preservation method may relate to many objects). The latter has a many-to-many relationship between the term and the ObjectGroup (an ObjectGroup may reflect more than one preservation method, and one preservation method may relate to many ObjectGroups).

Why do relationships matter?

The main impact is on the metrics (represented by the MeasurementOrFact class) that can be attached to an ObjectGroup, and how they can be used. If an ObjectGroup has more than one value for the same term, then it’s not possible to tell how metrics attached to the ObjectGroup are distributed across those values.

For example, if an ObjectGroup has a single preservation method of ‘dried and pinned’, and an ‘object quantity’ (represented by a MeasurementOrFact record) of 10,000, we know that there are 10,000 dried and pinned objects. If, however, that ObjectGroup has two preservation methods, ‘dried and pinned’ and ‘alcohol’, we know that there are 10,000 objects that are either dried and pinned OR preserved in alcohol, but we cannot calculate how many there are of each.

The only way to get an accurate assessment of the overall object quantity AND the object quantity for each preservation method is to split the ObjectGroup into two ObjectGroups: one containing just the ‘dried and pinned’ objects, and one for the ‘alcohol’ objects. This means that the preservation method maintains a one-to-many relationship with the ObjectGroups, rather than a many-to-many relationship. This is the key to being able to accurately aggregate and report metrics against the preservation method property, as well as the ObjectGroups.
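
The difference between the two cases can be sketched in Python as below. The "preservationMethod" and "objectQuantity" keys are simplifications for the example, and the 6,000 / 4,000 split is invented purely for illustration; the text above only gives the overall figure of 10,000.

```python
# One ObjectGroup with two preservation methods: the quantity cannot be apportioned.
mixed = {"preservationMethod": ["dried and pinned", "alcohol"],
         "objectQuantity": 10_000}

# The same collection split into two ObjectGroups (the split is illustrative).
split = [
    {"preservationMethod": ["dried and pinned"], "objectQuantity": 6_000},
    {"preservationMethod": ["alcohol"], "objectQuantity": 4_000},
]


def quantity_by_method(groups):
    """Total object quantity per preservation method.

    Raises ValueError if any ObjectGroup has more than one method, because
    its quantity cannot then be attributed to a single method.
    """
    totals = {}
    for group in groups:
        methods = group["preservationMethod"]
        if len(methods) != 1:
            raise ValueError(f"cannot apportion quantity across {methods}")
        totals[methods[0]] = totals.get(methods[0], 0) + group["objectQuantity"]
    return totals


print(quantity_by_method(split))
print(sum(g["objectQuantity"] for g in split))  # overall quantity is still available
```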

‘Dimensions’ and ‘associations’

We’ve established that:

  • Terms can either have a one-to-many or a many-to-many relationship with the ObjectGroup
  • One-to-many relationships between the ObjectGroup and a term are required to be able to report accurate metrics against that property

Within the LtC model concept, a term where a one-to-many relationship with the ObjectGroup has been enforced is referred to as a dimension. These are the terms that are effectively used to determine how a collection needs to be broken down into multiple ObjectGroups, in order to satisfy the requirements for numeric reporting. It’s important to note that the term or terms to be designated as dimensions will vary between implementations, depending on the use case and requirements, and so are not prescribed as part of the LtC model. They can, however, be defined for an implementation using the LatimerCoreScheme structure, as described earlier in the document.
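
One way to picture what ‘enforcing’ a dimension means is a check that every ObjectGroup holds exactly one value for each designated term. The sketch below is an assumption-laden illustration, not a description of how a LatimerCoreScheme is validated: the dimension_violations function, the "id" key and the lower-case term keys are all hypothetical.

```python
def dimension_violations(groups, dimensions):
    """Return (group id, term) pairs where a dimension does not have exactly one value."""
    problems = []
    for group in groups:
        for term in dimensions:
            values = group.get(term, [])
            if len(values) != 1:
                problems.append((group["id"], term))
    return problems


# Hypothetical ObjectGroups with their attached term values.
groups = [
    {"id": "OG-1", "taxon": ["Mammalia"], "geographicContext": ["Tanzania"]},
    {"id": "OG-2", "taxon": ["Reptilia"],
     "geographicContext": ["Tanzania", "Ethiopia"]},
]

# With only taxon as a dimension, both groups conform.
print(dimension_violations(groups, ["taxon"]))
# Adding geographicContext as a dimension flags OG-2, which would need splitting.
print(dimension_violations(groups, ["taxon", "geographicContext"]))
```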

Terms that can have a many-to-many relationship with the ObjectGroup are referred to, for the purposes of this guidance documentation, as associations. They provide information about what is in the ObjectGroup, but cannot generally be quantified using the metrics attached to the ObjectGroup (but see the metric options discussed in section X).

When should dimensions and associations be used?

There are both benefits and limitations to applying terms as either dimensions or associations.

Associations

Associations essentially allow you to use the properties like ObjectGroup tags, attaching a number of values for the same property to a single ObjectGroup (Figure 21).



Figure 21: An ObjectGroup with two terms (Taxon and GeographicContext) attached as associations.

Associations enable you to:

  • reflect the scope of your collections using a range of properties and their associated values
  • keep the data structure relatively simple and maintainable, with a small number of ObjectGroups
  • link between ObjectGroups using common properties and vocabularies

However, they will only give you basic figures for your collections, restricted to the absolute numbers in the ObjectGroups, and not reflecting the associated properties.

Dimensions

Using dimensions creates a consistent structure to the breakdown of the collection according to one or more common properties. Metrics can be used to accurately describe, analyse and visualise the collections in numeric terms.

As a dimension can only have one value for any given ObjectGroup, this effectively means the collections being described must be split into as many ObjectGroups as there are values for that dimension. If more than one property is designated as a dimension, then there must be an ObjectGroup for every valid combination of values in those dimensions.

Figure 22 shows an example where two terms - Taxon and GeographicContext - have been designated as dimensions. Each dimension has two possible values, so the collection must be split into four ObjectGroups, one for each combination of those two dimensions.



Figure 22: Breaking down a collection using two dimensions.
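
The set of ObjectGroups required by a pair of dimensions can be enumerated as the Cartesian product of their values. The sketch below assumes the two values per dimension are the mammals/reptiles and Tanzania/Ethiopia categories used in this example, and it generates every combination; filtering out combinations that are not valid for a given collection is left out.

```python
from itertools import product

# Dimension values assumed for the two-by-two example.
dimensions = {
    "Taxon": ["Mammals", "Reptiles"],
    "GeographicContext": ["Tanzania", "Ethiopia"],
}


def required_object_groups(dimensions):
    """Return one dict per combination of dimension values."""
    names = list(dimensions)
    value_lists = (dimensions[name] for name in names)
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


for group in required_object_groups(dimensions):
    print(group)
```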

Metrics (represented by the MeasurementOrFact class) are attached to each ObjectGroup, in this example ‘object quantity’. This means that it’s possible to calculate the number of objects within a dimension (e.g. 175 objects from Tanzania or 375 reptiles), across the two dimensions (e.g. 200 Ethiopian mammals, 100 Tanzanian reptiles), and for the collection as a whole by aggregating numbers from all ObjectGroups.
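
The aggregation described above can be sketched as follows. Only two of the four quantities (200 Ethiopian mammals and 100 Tanzanian reptiles) are quoted directly in the text; the other two are inferred from the quoted totals on the assumption that all the figures refer to the same four ObjectGroups. The flat "taxon", "geography" and "objectQuantity" keys are simplifications for the example.

```python
from collections import defaultdict

object_groups = [
    {"taxon": "Mammals", "geography": "Tanzania", "objectQuantity": 75},    # inferred: 175 - 100
    {"taxon": "Mammals", "geography": "Ethiopia", "objectQuantity": 200},
    {"taxon": "Reptiles", "geography": "Tanzania", "objectQuantity": 100},
    {"taxon": "Reptiles", "geography": "Ethiopia", "objectQuantity": 275},  # inferred: 375 - 100
]


def totals_by(dimension, groups):
    """Sum object quantities for each value of one dimension."""
    totals = defaultdict(int)
    for group in groups:
        totals[group[dimension]] += group["objectQuantity"]
    return dict(totals)


print(totals_by("geography", object_groups))  # per-country totals
print(totals_by("taxon", object_groups))      # per-taxon totals
print(sum(g["objectQuantity"] for g in object_groups))  # whole collection
```

Because each object falls into exactly one ObjectGroup, every one of these totals is a simple sum with no double counting.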

In the real world, however, a collection is likely to have many more taxa and many more geographic origins, so the number of ObjectGroups in the grid would probably be considerably larger. If a third dimension is introduced, the number of ObjectGroups is multiplied again by the number of values in that new dimension. There are therefore practical limitations on the number of dimensions that can be used (and the number of values in each dimension), related to the manageability of the data but primarily to the amount of time and effort required to estimate collection metrics at such a detailed and granular level.
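
A quick calculation shows how fast the grid grows. The dimension sizes below are invented for illustration, and the result is an upper bound, since only valid combinations of values need an ObjectGroup.

```python
from math import prod


def max_object_groups(values_per_dimension):
    """Upper bound on ObjectGroups: the product of the value counts per dimension."""
    return prod(values_per_dimension)


print(max_object_groups([2, 2]))       # the two-by-two example
print(max_object_groups([2, 2, 3]))    # adding a third dimension with 3 values
print(max_object_groups([20, 50, 5]))  # illustrative larger dataset
```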

Within these constraints, using dimensions is most effective in scenarios for showing collections demographics or inventories, where:

  • a structured breakdown of the collection is needed using a small number of properties
  • there is a need for dynamic, quantitative reporting across those properties
  • there is sufficient resource available to estimate or calculate metrics for a larger number of ObjectGroups, and maintain a more complex dataset

Combining dimensions and associations

In practice, it is likely that most use cases of the LtC model would be best served by a combination of terms as dimensions and terms as associations. For example, an institution might describe its collection by applying an organisational hierarchy as a dimension, so that there is one ObjectGroup for each OrganisationalUnit defined by the institution, with associated metrics. To each of those ObjectGroups, there might then be a number of other terms attached as associations to further describe and reflect the scope (e.g. taxonomic or geographic) of the objects within the group.

Model options for metrics

There are two main options for how metrics can be used within the LtC data model, each with strengths and weaknesses.

Option 1: Metrics attached to the ObjectGroup

The first option (Figure 23) is to attach the metric directly to the ObjectGroup. This means that you can get an accurate figure for the total number of objects within the ObjectGroup, but not for the number of objects according to each of the terms (unless the terms are designated as dimensions, and the ObjectGroup split into multiple ObjectGroups, as described in the previous section). This option tends to be better suited to the dimensional model, for providing precise statistics.



Figure 23: Attaching metrics directly to the ObjectGroup class.

Option 2: Metrics attached to the relationship between the ObjectGroup and each property

The second option (Figure 24) is to attach metrics to the relationships between the ObjectGroup and the terms (designated as associations). For example, an ObjectGroup has 100 objects from Tanzania, 150 objects from Ethiopia, 75 reptiles and 125 mammals. This means that you have accurate figures relating to each property, but do not know how many objects there are overall: there is no denominator. In practice, this can be achieved by embedding instances of the MeasurementOrFact class in the ResourceRelationship used to create the relationship between the ObjectGroup and the term.



Figure 24: Attaching metrics to links between classes, in order to quantify or qualify the relationship.
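
A minimal sketch of Option 2 is shown below, using the figures quoted in the example above. The nesting of MeasurementOrFact inside ResourceRelationship follows the description in the text, but the "termClass", "measurementType" and "measurementValue" keys and the relationship labels are assumptions made for the example.

```python
# One ObjectGroup whose metrics sit on its relationships to each term.
object_group = {
    "ResourceRelationship": [
        {"termClass": "GeographicContext", "relatedResourceName": "Tanzania",
         "relationshipOfResource": "has geographic context",
         "MeasurementOrFact": [{"measurementType": "object quantity",
                                "measurementValue": 100}]},
        {"termClass": "GeographicContext", "relatedResourceName": "Ethiopia",
         "relationshipOfResource": "has geographic context",
         "MeasurementOrFact": [{"measurementType": "object quantity",
                                "measurementValue": 150}]},
        {"termClass": "Taxon", "relatedResourceName": "Reptiles",
         "relationshipOfResource": "has taxon",
         "MeasurementOrFact": [{"measurementType": "object quantity",
                                "measurementValue": 75}]},
        {"termClass": "Taxon", "relatedResourceName": "Mammals",
         "relationshipOfResource": "has taxon",
         "MeasurementOrFact": [{"measurementType": "object quantity",
                                "measurementValue": 125}]},
    ]
}


def quantity_by_term_class(object_group):
    """Sum the 'object quantity' metrics attached to relationships, per term class."""
    totals = {}
    for rel in object_group["ResourceRelationship"]:
        for metric in rel.get("MeasurementOrFact", []):
            if metric["measurementType"] == "object quantity":
                key = rel["termClass"]
                totals[key] = totals.get(key, 0) + metric["measurementValue"]
    return totals


print(quantity_by_term_class(object_group))
```

The per-class sums differ (250 by geography, 200 by taxon), which illustrates why neither can be treated as the overall object quantity: objects may lack a value for a term, or be counted under more than one.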

This option is better for a less structured, more graph-like approach to modelling the collections. It does avoid the need to break down into a greater number of more granular ObjectGroups, as per the dimensional model used in Option 1, but has limitations in providing accurate quantitative data.

As with the two options of using properties as associations or dimensions, there is potential to combine these two approaches to suit the use case.

Key points

  • Most terms related to an ObjectGroup can be used as either an association or a dimension.
  • Associations require less effort and are good for descriptive purposes, but bad for quantitative reporting.
  • Dimensions are good for quantitative reporting, but require more effort, and have limitations in how many can be applied.
  • Metrics can potentially be attached either directly to the ObjectGroup, or to the relationships between the ObjectGroup and its terms.