A collection of Java string transformers, suitable for use with OpenRefine. Some generic, some aimed at botany and taxonomy.


String Transformers

This is a collection of string transformers, all of which implement a transform method that takes a string and returns another.

Some are generic: CapitalLettersExtractor removes non-capital letters from a string. Others are geared towards handling scientific names, like StripBasionymAuthorTransformer.
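To illustrate the string-in, string-out contract described above, here is a minimal, self-contained sketch. It does not use the library: the `StringTransformer` interface and the `UpperCaseOnlySketch` class are hypothetical names invented for this example, and returning `null` for `null` input is an assumption. The behaviour only approximates what CapitalLettersExtractor is described as doing; consult the API documentation for the real classes.

```java
/** Hypothetical stand-in for the transformer contract: one string in, one string out. */
interface StringTransformer {
    String transform(String s);
}

/** Illustrative only, not the library's CapitalLettersExtractor: keeps uppercase letters, drops everything else. */
class UpperCaseOnlySketch implements StringTransformer {
    @Override
    public String transform(String s) {
        if (s == null) {
            return null; // assumption: null input is passed through unchanged
        }
        StringBuilder kept = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isUpperCase(c)) {
                kept.append(c);
            }
        }
        return kept.toString();
    }
}

class TransformerSketchDemo {
    public static void main(String[] args) {
        StringTransformer t = new UpperCaseOnlySketch();
        System.out.println(t.transform("Royal Botanic Gardens, Kew")); // prints RBGK
    }
}
```

Because every transformer shares the same single-method shape, callers can swap one for another without changing the surrounding code.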

See the String-Transformers Wiki on GitHub or API Documentation for further documentation.

Usage with OpenRefine

The transformers can be used with OpenRefine. Download the String-Transformers JAR and the Apache Commons-Lang3 JAR, put both in OpenRefine's webapp/extensions/jython/module/MOD-INF/lib directory, then start (or restart) OpenRefine.

Use expressions like this, after selecting Jython from the Language dropdown:

from org.kew.rmf.transformers import CapitalLettersExtractor

# Create the transformer, then apply it to the current cell ("value" is supplied by OpenRefine)
cle = CapitalLettersExtractor()
return cle.transform(value)

Usage in other software

The library is published in the Maven Central Repository:

<dependency>
    <groupId>org.kew.rmf</groupId>
    <artifactId>string-transformers</artifactId>
    <version>0.2.0</version>
</dependency>

Licensing, contributions and credits

String-Transformers is open source software licensed under the MIT license; see LICENSE.md. The software is developed and maintained by the Biodiversity Informatics team at the Royal Botanic Gardens, Kew. For details of the developers, see the Git commit history. Thanks also to the Data Improvement Team for their scientific input.

Further contributions are welcome! We’re also interested to know if you use this library, whether that’s for biodiversity data or something else. Contact the Kew Biodiversity Informatics team or use GitHub.