
Metadata manipulator: SplitRepeatedValues

Mark Jordan edited this page Mar 1, 2017 · 11 revisions

Overview

This metadata manipulator splits a source metadata value on a delimiter (for example, a semicolon) and creates a corresponding MODS element for each resulting value, as defined in the metadata mappings file. For example, using the mapping Subjects,"<subject><topic>%value%</topic></subject>", it converts the value "Boats; Havana; water" into

  <subject>
    <topic>Boats</topic>
    <topic>Havana</topic>
    <topic>water</topic>
  </subject>

This manipulator is a more general alternative to the FilterModsTopic manipulator.

Toolchains

This manipulator can be used within any MODS toolchain.

Configuration

To register this manipulator in your toolchain, add the following line to the "[MANIPULATORS]" section of your .ini file:

metadatamanipulators[] = "SplitRepeatedValues|Subjects|/subject/topic|;"

In this example, the metadata mappings file must contain a row defining the <subject> container element, e.g.,

Subjects,<subject><topic>%value%</topic></subject>

More complex XPath expressions can be used as well. For example, this configuration:

metadatamanipulators[] = 'SplitRepeatedValues|Subjects_TGM1|/subject[@authority="lctgm"]/topic|;'

in combination with this mapping:

"Subjects_TGM1","<subject authority=""lctgm""><topic>%value%</topic></subject>"

will result in MODS output like this:

  <subject authority="lctgm">
    <topic>Correspondence</topic>
    <topic>Post offices</topic>
    <topic>Postal service rates -- Canada</topic>
  </subject>

Parameters

This manipulator takes three required parameters:

  • The first parameter is the name of the source field in your metadata mapping. For CSV toolchains, use the source field label; for CONTENTdm toolchains, use the CONTENTdm nickname of the source field. The field must exist in your source metadata; you cannot use "null" mappings (see below for more detail).
  • The second parameter is an XPath expression identifying the MODS element that is the target for the mapping. Do not include the 'mods' namespace.
  • The third parameter is the character or string that delimits the repeated values in the source metadata. Anything that works as the first parameter of PHP's explode() function is allowed, except for the pipe (|), which already separates the manipulator's own parameters in the .ini file.
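
The three parameters arrive in a single pipe-separated configuration string. As a rough illustration only (this is Python, not the manipulator's PHP code, and the function name parse_manipulator_setting is hypothetical), the sketch below takes the configuration value from the first example apart. It assumes the string is split on every pipe, which would explain why a pipe cannot serve as the delimiter: it yields more pieces than the three expected parameters.

```python
def parse_manipulator_setting(setting):
    """Split 'Name|source_field|xpath|delimiter' into named parts.

    Assumption: the setting is split on every pipe character.
    """
    # Unpacking raises ValueError unless there are exactly four parts.
    name, source_field, xpath, delimiter = setting.split("|")
    return {
        "name": name,
        "source_field": source_field,
        "xpath": xpath,
        "delimiter": delimiter,
    }


params = parse_manipulator_setting("SplitRepeatedValues|Subjects|/subject/topic|;")
print(params["source_field"], params["xpath"], params["delimiter"])

try:
    # Using a pipe as the delimiter produces five parts instead of four.
    parse_manipulator_setting("SplitRepeatedValues|Subjects|/subject/topic||")
except ValueError:
    print("a pipe delimiter cannot be told apart from the parameter separator")
```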

Functionality

This manipulator splits the source value on the delimiter character and creates a separate MODS element for each of the repeated values as illustrated above. Input metadata fields that are split are logged to the location defined in ['LOGGING']['path_to_manipulator_log'].
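
The splitting step can be sketched in a few lines. The following is a minimal Python illustration, not the manipulator's actual PHP implementation; the function name split_into_topics is hypothetical, and trimming whitespace and skipping empty pieces are assumptions made so that the output matches the example in the Overview.

```python
import xml.etree.ElementTree as ET


def split_into_topics(value, delimiter=";"):
    """Build one <subject> holding a <topic> for each delimited piece."""
    subject = ET.Element("subject")
    for piece in value.split(delimiter):
        piece = piece.strip()  # assumption: surrounding whitespace is dropped
        if piece:              # assumption: empty pieces are ignored
            ET.SubElement(subject, "topic").text = piece
    return subject


subject = split_into_topics("Boats; Havana; water")
print(ET.tostring(subject, encoding="unicode"))
# <subject><topic>Boats</topic><topic>Havana</topic><topic>water</topic></subject>
```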

Note that it may be necessary to add the "repeatable_wrapper_elements" option to your .ini file for elements that have a parent wrapper element; for example, if you are splitting data that will be placed in separate <subject><topic> elements like this:

  <subject>
    <topic>Boats</topic>
  </subject>
  <subject>
    <topic>Havana</topic>
  </subject>
  <subject>
    <topic>water</topic>
  </subject>

you will need to add repeatable_wrapper_elements[] = subject to the [METADATA_PARSER] section of your .ini file.

Also note that this metadata manipulator will not work with "null" mappings, that is, mappings to MODS elements that do not come from source metadata. In other words, it won't work with mappings like this:

null3,"<extension><snowflakes type=""Types of snowflakes"">Column; Plain; Irregular</snowflakes></extension>"

In order to create a separate <snowflakes> element for "Column", "Plain", and "Irregular", you would need to use a separate mapping for each value.
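
If many such values are involved, the separate mapping rows could be generated rather than typed by hand. The sketch below is a hedged Python illustration: the function name null_mapping_rows and the row names after null3 (null4, null5) are hypothetical, chosen only to continue the example above. Python's csv module handles the doubled quotation marks that the mappings file requires.

```python
import csv
import io
from xml.sax.saxutils import escape


def null_mapping_rows(values, first_index=3):
    """Return CSV text with one null mapping row per value."""
    template = (
        '<extension><snowflakes type="Types of snowflakes">'
        "{}</snowflakes></extension>"
    )
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for offset, value in enumerate(values):
        # escape() protects &, < and > inside the element text.
        snippet = template.format(escape(value))
        writer.writerow(["null{}".format(first_index + offset), snippet])
    return buffer.getvalue()


print(null_mapping_rows(["Column", "Plain", "Irregular"]))
```

The first row printed matches the form of the null3 mapping shown above, with "Column" as its only value.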
