Core package of the Metafacture tool suite for metadata processing.
gradle/wrapper
metafacture-biblio
metafacture-commons
metafacture-csv
metafacture-elasticsearch
metafacture-files
metafacture-flowcontrol
metafacture-flux
metafacture-formatting
metafacture-formeta
metafacture-framework
metafacture-io
metafacture-javaintegration
metafacture-jdom
metafacture-json
metafacture-linkeddata
metafacture-mangling
metafacture-monitoring
metafacture-plumbing
metafacture-runner
metafacture-scripting
metafacture-statistics
metafacture-strings
metafacture-triples
metafacture-xml
metamorph-api
metamorph-test
metamorph
travis
.editorconfig
.gitattributes
.gitignore
.travis.yml
LICENSE
README.md
build.gradle
gradlew
gradlew.bat
settings.gradle

Metafacture

Metafacture is a toolkit for processing semi-structured data with a focus on library metadata. It provides a versatile set of tools for reading, writing and transforming data. Metafacture can be used as a stand-alone application or as a Java library in other applications. The name Metafacture is a portmanteau of the words metadata and manufacture.

Metafacture includes a large number of modules for operating on semi-structured data. These modules can be combined into pipelines that perform complex metadata processing tasks. The pipelines can be constructed either in Java code or with the domain-specific language Flux. One of the core features of Metafacture is the Metamorph module. Metamorph is an XML-based language for specifying transformations of semi-structured data. It can be seamlessly integrated into Java code.
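To make the idea of combining modules into a pipeline more concrete, here is a minimal, self-contained Java sketch of the general pattern. The types and names used here (`Receiver`, `MapModule`, `Collector`, `PipelineSketch`) are hypothetical and exist only for illustration; they are not Metafacture's actual API, which should be looked up in the module sources.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Illustrative only: these types are NOT part of Metafacture.
class PipelineSketch {

    /** A module that receives items of type T from the previous module. */
    interface Receiver<T> {
        void process(T item);
    }

    /** A module that transforms each item and hands the result downstream. */
    static final class MapModule<I, O> implements Receiver<I> {
        private final Function<I, O> transform;
        private Receiver<O> downstream; // must be connected before use

        MapModule(Function<I, O> transform) {
            this.transform = transform;
        }

        /** Connects the next module and returns it so that calls can be chained. */
        <R extends Receiver<O>> R setReceiver(R next) {
            this.downstream = next;
            return next;
        }

        @Override
        public void process(I item) {
            downstream.process(transform.apply(item));
        }
    }

    /** A terminal module that collects everything it receives. */
    static final class Collector<T> implements Receiver<T> {
        final List<T> items = new ArrayList<>();

        @Override
        public void process(T item) {
            items.add(item);
        }
    }

    /** Builds a three-stage pipeline (trim, upper-case, collect) and runs it. */
    static List<String> run(List<String> input) {
        MapModule<String, String> trim = new MapModule<>(String::trim);
        MapModule<String, String> upper = new MapModule<>(String::toUpperCase);
        Collector<String> sink = new Collector<>();

        // Wire the modules together: trim -> upper -> sink.
        trim.setReceiver(upper).setReceiver(sink);

        for (String line : input) {
            trim.process(line);
        }
        return sink.items;
    }
}
```

Each stage only knows about the stage directly after it, so stages can be rearranged or replaced without touching the others. That is the property that makes a module-based design easy to extend.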

At its heart, Metafacture is a framework for implementing modules for metadata processing. This makes Metafacture easy to extend with additional modules. The plugins and tools page on the wiki shows supplementary packages and projects which extend Metafacture.

Metafacture was originally developed as part of the Culturegraph platform, but it is now developed independently and is used by others as well: see who uses Metafacture.

Getting started

You can either use Metafacture as a stand-alone application or include it as a Java library in your own projects.

Metafacture as a stand-alone application

If you are only interested in running Flux scripts without doing any Java programming, this is the way to go. The instructions assume that you are using a *nix-like shell.

  1. Download the latest distribution package from the metafacture-core/releases page. Make sure that you download a distribution package and not a source code package (the file name should include -dist).

  2. Extract the downloaded archive:

    $ tar xzf metafacture-core-VERSION-dist.tar.gz

    This will create a new directory containing a ready-to-use Metafacture distribution.

  3. Change into the newly created directory:

    $ cd metafacture-core-VERSION

  4. Run one of the example scripts:

    $ ./flux.sh examples/read/marc21/read-marc21.flux

    This example will print a number of MARC 21 records to standard output.

The examples folder contains many more examples which provide a good starting point for learning Metafacture. If you have any questions, please join our mailing list or use our issue-based discussion forum over at metafacture-documentation.

Using Metafacture as a Java library

If you want to use Metafacture in your own Java projects, all you need to do is add some dependencies to your project. As of Metafacture 5, the single metafacture-core package has been replaced with a number of domain-specific packages. You can find the list of packages on Maven Central.

Alternatively, you can simply guess the package names from the top-level folders in the source code repository -- they are the same. For instance, if you want to use Metamorph in your project, simply add the following dependency to your pom.xml:

<dependency>
    <groupId>org.metafacture</groupId>
    <artifactId>metamorph</artifactId>
    <version>VERSION</version>
</dependency>

or, if Gradle is your build tool of choice, use:

dependencies {
    implementation 'org.metafacture:metamorph:VERSION'
}

Our integration server automatically publishes successful builds of all branches as snapshot versions to the Sonatype OSS Repository. The version number is derived from the branch name. Snapshot builds from the master branch always have the version "master-SNAPSHOT".
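As a small illustration of how the dependency coordinates fit together, the following Java sketch assembles a `group:artifact:version` string from a module name and a version. The class and method names are hypothetical. The rule `<branch>-SNAPSHOT` is an assumption generalised from the master example above; only "master-SNAPSHOT" is confirmed, and other branches may be handled differently.

```java
// Illustrative helper, not part of Metafacture.
class DependencyCoordinates {

    // Group id as used in the dependency snippets above.
    static final String GROUP_ID = "org.metafacture";

    /** Builds a Gradle-style coordinate; module names match the top-level folders. */
    static String coordinate(String module, String version) {
        return GROUP_ID + ":" + module + ":" + version;
    }

    /**
     * Assumption: snapshot versions follow the pattern "<branch>-SNAPSHOT".
     * Only the master case ("master-SNAPSHOT") is documented.
     */
    static String snapshotVersion(String branch) {
        return branch + "-SNAPSHOT";
    }
}
```

For example, `DependencyCoordinates.coordinate("metamorph", DependencyCoordinates.snapshotVersion("master"))` yields the string to put into a Gradle `implementation` line when tracking the master branch.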

Building metafacture-core from source

Building metafacture-core from source is easy. All you need is git and JDK 8:

  1. Clone the metafacture-core repository and change into the directory:

    $ git clone https://github.com/metafacture/metafacture-core.git
    $ cd metafacture-core

  2. Invoke the Gradle wrapper to download Gradle and build metafacture-core:

    $ ./gradlew install

    On Windows, call gradlew.bat install instead.

See Code Quality and Style on the wiki for further information on the sources.

Stay updated

For support and discussion, join the mailing list.