This repository has been archived by the owner on May 24, 2022. It is now read-only.

Available Transformation Functions

Bo Ferri edited this page Oct 20, 2015 · 52 revisions

Note: d:swarm uses functions of the Metafacture framework for data transformation. You can find more detailed information about these functions at https://github.com/culturegraph/metafacture-core/wiki/Metamorph-functions.

| Function | Description | Parameter | Explanation | Example |
| --- | --- | --- | --- | --- |
| case | Letter characters are transformed to lower or upper case. | language | locale | en (for English) |
| | | upper | lower case is converted to upper case | SLUB DRESDEN |
| | | lower | upper case is converted to lower case | slub dresden |
| compose | Wraps the value in a prefix and postfix. Prefixing a mapping value “swarm” with “d:” will result in “d:swarm”. | prefix | prefix string | d: |
| | | postfix | postfix string | |
| concat | Combines the values of several attributes into one element, adding optional prefix and postfix strings, and passes the result to output.<br>Value 1: “SLUB”<br>Value 2: “Dresden”<br>delimiter: “-”<br>prefix: “Pre”<br>postfix: “Post”<br>Result: “PreSLUB-DresdenPost” | delimiter | delimiter used to separate concatenated values | |
| | | prefix | prefix string | |
| | | postfix | postfix string | |
| constant | Replaces the value with a constant string. | value | replace value | |
| count | Counts occurrences of an attribute and passes the result to output. | no parameter | | |
| equals | Filters based on equality of the input attribute and the function parameter. If they are equal, the input attribute is passed to output. | string | comparison value | |

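
The compose and concat examples from the table can be reproduced with a minimal Python sketch. The helper names `compose` and `concat` below are hypothetical and only mirror the described behaviour; they are not part of the d:swarm or Metafacture API.

```python
def compose(value, prefix="", postfix=""):
    """Wrap a single value in a prefix and a postfix."""
    return f"{prefix}{value}{postfix}"


def concat(values, delimiter="", prefix="", postfix=""):
    """Join several values with a delimiter, then add prefix and postfix."""
    return f"{prefix}{delimiter.join(values)}{postfix}"


print(compose("swarm", prefix="d:"))  # d:swarm
print(concat(["SLUB", "Dresden"], delimiter="-", prefix="Pre", postfix="Post"))  # PreSLUB-DresdenPost
```

Note that in this sketch the prefix and postfix wrap the joined result once, rather than each individual value, which matches the “PreSLUB-DresdenPost” example.
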
| Function | Description | Parameter | Explanation | Example |
| --- | --- | --- | --- | --- |
| htmlanchor | Creates an HTML anchor tag with the following pattern (without "+" and spaces):<br>`<a href=" + prefix + value + postfix + ">title</a>`<br>Example to be mapped: "slub-dresden"<br>Result: `<a href="http://www.slub-dresden.de/">Homepage SLUB Dresden</a>` | prefix | prefix string | `http://www.` |
| | | postfix | postfix string | `.de` |
| | | title | link text | Homepage SLUB Dresden |
| isbn | ISBN cleaning, check digit verification, and transformation between ISBN 10 and ISBN 13. Non-digit characters can be eliminated. ISBNs can be validated. | isbn13 | transformation to ISBN 13 | |
| | | isbn10 | transformation to ISBN 10 | |
| | | clean | elimination of non-digit characters | |
| | | verifyCheckDigit | validation | |
| normalize-utf8 | UTF-8 normalization. Transforms umlauts into canonical form. | no parameter | | |
| not-equals | Filters based on inequality. If unequal, the attribute value is passed to output. | string | comparison value | |

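
To illustrate what the isbn function's cleaning, check digit verification, and ISBN 10 to ISBN 13 transformation involve, here is a minimal Python sketch using the standard ISBN check digit rules. The helper names are hypothetical, the sample ISBN is a commonly used textbook example, and the sketch assumes that an "X" check character is kept during cleaning; the actual Metafacture implementation may differ in detail.

```python
def clean(isbn):
    """Keep only digits and a possible 'X' check character (assumption)."""
    return "".join(ch for ch in isbn.upper() if ch in "0123456789X")


def isbn10_check_digit(first9):
    """Weights 10..2 over the first nine digits, modulo 11."""
    total = sum((10 - i) * int(d) for i, d in enumerate(first9))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def isbn13_check_digit(first12):
    """Alternating weights 1 and 3 over the first twelve digits, modulo 10."""
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def verify_check_digit(isbn):
    """Return True if the last character matches the computed check digit."""
    chars = clean(isbn)
    if len(chars) == 10:
        return isbn10_check_digit(chars[:9]) == chars[9]
    if len(chars) == 13:
        return isbn13_check_digit(chars[:12]) == chars[12]
    return False


def to_isbn13(isbn10):
    """Prefix the first nine digits with 978 and recompute the check digit."""
    core = "978" + clean(isbn10)[:9]
    return core + isbn13_check_digit(core)


print(clean("0-306-40615-2"))               # 0306406152
print(verify_check_digit("0-306-40615-2"))  # True
print(to_isbn13("0-306-40615-2"))           # 9780306406157
```
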
| Function | Description | Parameter | Explanation | Example |
| --- | --- | --- | --- | --- |
| occurence | Filtering based on occurrence.<br>Values to be mapped (e.g. result of split): “SLUB” “Dresden” “d:swarm” “DMP”<br>only: “moreThen 2”<br>sameEntity: “True”<br>Result: “d:swarm” “DMP” | only | Position of element | moreThen 2<br>3<br>lessThen |
| | | sameEntity | | True<br>False |
| regexp | Regexp matching returning the first occurrence of a pattern. The pattern is a Java regex pattern. | format | order of the capturing groups | `${1}` |
| | | match | regex pattern | `^isbn\d\d\-(\d{10,13})` |
| replace | Replaces a pattern with a string. The pattern is a Java regex pattern. | pattern | regex pattern | `^isbn\d\d\-(\d{10,13})` |
| | | with | replace value | |
| split | Splitting based on a regexp.<br>Value to be split: “SLUB-Dresden”<br>delimiter: “-”<br>Result: “SLUB” and “Dresden” are passed to output | delimiter | regex pattern | |
| substring | Extracts a substring.<br>value: “SLUB Dresden”, start=0, end=7, returns “SLUB Dr” | end | index position of the last character | |
| | | start | index position of the first character | |
| trim | Trims all whitespace at the beginning and at the end of the attribute value. | no parameter | | |
| urlencode | Transforms all characters not allowed in a URL into URL-compatible characters. | no parameter | | |
| regexlookup | Performs a table lookup where keys may be regexes. | lookupString | A map or uploaded file that contains key/value pairs. | |
| | | default | Value used if no corresponding key is found. | |
| dewey | Dewey conversion and verification. | precision | A decimal number (represented in string format) giving the desired precision of the returned number, e.g. 100 to round to the nearest hundred, 10 to round to the nearest ten, 0.1 to round to the nearest tenth, etc. | |
| | | addLeadingZeros | Adds leading zeros to a Dewey number (if not present). | |
| | | errorString | Error string that is written as the value if the input string is not a valid Dewey number. | |
| http-api-request | HTTP API GET request with the input value as URI. Note: the URI should be composed in a previous component, i.e., the http-api-request function expects a valid URI. Note: the response needs to be processed in a further component, e.g., parse-json. | acceptType | The accept type of the HTTP API request. | |
| | | errorString | Error string that is written as the value if the HTTP API request fails for some reason. | |
| parse-json | Parses the input value with the help of the given JSONPath. | jsonPath | The JSONPath used to extract values from the given input JSON value. Note: the JSONPath must conform to http://goessner.net/articles/JsonPath/. | |
| | | errorString | Error string that is written as the value if the JSON parsing fails for some reason. | |
| collect | Collects all received values and concatenates them on record end. Useful for values of a field that occurs multiple times in a record. | delimiter | delimiter used to separate concatenated values | |
| | | prefix | prefix string | |
| | | postfix | postfix string | |
| numfilter | Extracts data based on matching a numeric filter. Syntax is `>` for greater than, `<` for less than, `==` for equals, `>=` for greater than or equals, and `<=` for less than or equals. Note: all `<` and `>` signs should be encoded in attributes, like `&lt;` and `&gt;`. | expression | numeric filter expression | |
| issn | ISSN conversion and verification. | format | Formats/normalizes the given ISSN with a hyphen after the 4th digit and upper-cases the last character. | |
| | | check | Checks the given ISSN with the help of the checksum character at the end of the ISSN (default = true). | |
| | | errorString | Error string that is written as the value if the input string is not a valid ISSN. | |

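
The split, substring, and occurence examples from the table can be traced with a minimal Python sketch. The helper names are hypothetical, Python regexes stand in for Java regexes, and the reading of the `only` parameter (a 1-based position, optionally preceded by `moreThen` or `lessThen`) is an assumption derived from the example above. The `sameEntity` parameter is not modelled here.

```python
import re


def split(value, delimiter):
    """Split a value on a regex delimiter."""
    return re.split(delimiter, value)


def substring(value, start, end):
    """Return the characters from index start up to, but not including, end."""
    return value[start:end]


def occurrence_filter(values, only):
    """Keep values by 1-based position: 'moreThen N', 'lessThen N', or 'N'."""
    keyword, _, number = only.rpartition(" ")
    n = int(number)
    if keyword == "moreThen":
        keep = lambda pos: pos > n
    elif keyword == "lessThen":
        keep = lambda pos: pos < n
    else:
        # A bare number selects exactly that position (assumption).
        keep = lambda pos: pos == n
    return [v for pos, v in enumerate(values, start=1) if keep(pos)]


print(split("SLUB-Dresden", "-"))        # ['SLUB', 'Dresden']
print(substring("SLUB Dresden", 0, 7))   # SLUB Dr
print(occurrence_filter(["SLUB", "Dresden", "d:swarm", "DMP"], "moreThen 2"))  # ['d:swarm', 'DMP']
```

In this sketch the `end` index of `substring` is exclusive, which is the reading that reproduces the “SLUB Dr” result for start=0 and end=7.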

