Metablock: Semantic Publishing Tools

Metablock reads RDF data and writes a possibly modified resource description back to an RDF target.

RDF data sources:

SPARQL service endpoints
RDF data files
OAI data sources with XSLT lifting
RDBMS with XSLT lifting

RDF data targets:

Virtuoso RDF storage
Jena TDB storage
RDF data files
Solr search index with XSLT transformation

Compile && Configure

Metablock is configured by lib/metablock.ttl.

Compile:

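    # Classpath: servlet API, the lib directory, and every jar in lib
    CPATH="war/jetty/lib/servlet-api-3.1.jar:lib:lib/*"
    # Compile the sources into lib, then package the classes as a jar
    javac -cp "$CPATH" -d lib src/org/metablock/rest/*
    jar cf lib/metablock.jar -C lib org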

Start the autobib daemon:

    abd start

Indexing: write RDF data to a Solr search index

To build a Solr search index from a SPARQL service endpoint, three steps are required:

Resource Enumeration: List all resources that should be indexed,
Resource Dump: Query everything the triple store knows about a resource,
Resource Transformation: Transform RDF/XML to Solr index format.

Steps 1 and 2 each need a SPARQL query; step 3 uses XSLT.
The SPARQL queries and XSLT transformations used so far are rather general, but the modelling of bibliographic resources may vary and require modifications.
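
As an illustration of steps 1 and 2, the following is a minimal sketch of what the two SPARQL queries and the corresponding endpoint requests might look like. It is not taken from Metablock: the class `IndexQueries`, the use of `dcterms:BibliographicResource` as the resource type, the `DESCRIBE` form of the dump query, and the endpoint URL are all assumptions chosen for the example. The queries actually used are defined in lib/metablock.ttl.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration only; not part of Metablock. */
class IndexQueries {

    // Step 1, resource enumeration: list the resources that should be indexed.
    // The rdf:type used here is an assumed modelling choice.
    static final String ENUMERATE =
        "PREFIX dcterms: <http://purl.org/dc/terms/>\n"
      + "SELECT DISTINCT ?s WHERE { ?s a dcterms:BibliographicResource }";

    // Step 2, resource dump: ask the store for its description of one resource.
    // What DESCRIBE returns is store specific; a CONSTRUCT query is an alternative.
    static String dump(String resourceUri) {
        return "DESCRIBE <" + resourceUri + ">";
    }

    // Build a SPARQL protocol GET request URL with the query as a parameter.
    static String requestUrl(String endpoint, String query) {
        return endpoint + "?query=" + URLEncoder.encode(query, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String endpoint = "http://localhost:8890/sparql"; // assumed endpoint
        System.out.println(requestUrl(endpoint, ENUMERATE));
        System.out.println(requestUrl(endpoint, dump("http://example.org/doc/1")));
    }
}
```

Each resource URI returned by the enumeration query would be passed to `dump`, and the resulting RDF/XML would then be handed to the XSLT transformation of step 3.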

All configuration is done in Turtle (see lib/metablock.ttl).
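
The XSLT transformation of step 3 can be sketched with the JDK's built-in XSLT processor. This is a simplified example under stated assumptions, not the stylesheet shipped with Metablock: the class `RdfToSolr`, the Solr field names `id` and `title`, and the sample record are hypothetical, and the stylesheet only handles RDF/XML that uses plain `rdf:Description` elements. The output follows Solr's XML update format (`<add><doc><field name="...">`).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical example for illustration only; not part of Metablock. */
class RdfToSolr {

    // Minimal stylesheet: one Solr <doc> per rdf:Description.
    // The field names "id" and "title" are assumptions about the Solr schema.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'\n"
      + "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'\n"
      + "    xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'\n"
      + "    xmlns:dcterms='http://purl.org/dc/terms/'\n"
      + "    exclude-result-prefixes='rdf dcterms'>\n"
      + "  <xsl:output method='xml' omit-xml-declaration='yes'/>\n"
      + "  <xsl:template match='/rdf:RDF'>\n"
      + "    <add><xsl:apply-templates select='rdf:Description'/></add>\n"
      + "  </xsl:template>\n"
      + "  <xsl:template match='rdf:Description'>\n"
      + "    <doc>\n"
      + "      <field name='id'><xsl:value-of select='@rdf:about'/></field>\n"
      + "      <xsl:for-each select='dcterms:title'>\n"
      + "        <field name='title'><xsl:value-of select='.'/></field>\n"
      + "      </xsl:for-each>\n"
      + "    </doc>\n"
      + "  </xsl:template>\n"
      + "</xsl:stylesheet>";

    // Made-up sample record standing in for the output of the resource dump.
    static final String SAMPLE =
        "<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'"
      + " xmlns:dcterms='http://purl.org/dc/terms/'>"
      + "<rdf:Description rdf:about='http://example.org/doc/1'>"
      + "<dcterms:title>An Example</dcterms:title>"
      + "</rdf:Description></rdf:RDF>";

    // Apply the stylesheet to an RDF/XML string and return the Solr XML.
    static String transform(String rdfXml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(rdfXml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(SAMPLE));
    }
}
```

Because RDF/XML allows several serializations of the same graph (for example typed node elements instead of `rdf:Description`), a real stylesheet has to match the shape the triple store actually emits, which is one reason the transformations may need to be adapted.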

RDF Crawling

RDF data sources configured in lib/metablock.ttl can be tested with

    java -jar metablock.jar -s [source] -t [target] -test

Copy from an RDF source to an RDF target:

    java -jar metablock.jar -s [source] -t [target]

RDF analyzers

PDF Analyzer: uses Grobid / CERMINE to extract metadata and bibliographical references from scientific articles.

    java -jar metablock.jar -s [source] -t [target] -e pdf

Reference Analysis: uses external libraries to find citation context and determine citation polarity.

    java -jar metablock.jar -s [source] -t [target] -e sen

Examples

Index a directory of PDF files, enable the pdf engine to extract metadata, and write to VuFind:

    java -jar metablock.jar -crawl -s files -t solr1 -e pdf Documents

Write DSpace metadata from the DSpace REST API to a Virtuoso triplestore (experimental):

    java -jar metablock.jar -crawl -s dspace -t virt

Crawl OAI sources to a Jena TDB store:

    java -jar metablock.jar -crawl -s oai -t tdb

Build a search index for a TDB triple store:

    java -jar metablock.jar -crawl -s tdb -t solr1

Developer Documentation

Javadoc is available online.
