Java parsers for different RDF serialisations + API + tools + JAX-RS integration

Welcome to NxParser

NxParser is an open-source, streaming, non-validating Java parser for the Nx format, where x = Triples, Quads, or any other number. For more details see the specification for the NQuads format, an extension of the N-Triples RDF format. Note that the parser handles any combination (cf. generalised triples) or number of N-Triples syntax terms on each line (the number of terms per line can also vary).

It ate 2 mil. quads (~4 GB; ~240 MB GZIPped) on a T60p (Win7, 2.16 GHz) in ~1 min 35 s (1:18 min). Overall, it's more than twice as fast as the previous version when it comes to reading Nx.

The NxParser is non-validating, meaning that, e.g., it will happily eat non-conformant N-Triples. Also, the NxParser will not parse certain valid N-Triples files where the RDF terms are not separated by whitespace. We pass all positive W3C N-Triples test cases except one, where the RDF terms are not separated by whitespace (surprise!).
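
To make the whitespace caveat concrete, here is a minimal sketch of a whitespace-based term splitter in plain Java. It is an illustration under simplifying assumptions, not NxParser's actual implementation; the class name `NaiveNxSplitter` and the method `split` are hypothetical. It accepts any number of terms per line, keeps quoted literals intact, and, like any splitter that relies on whitespace, fails to separate terms that are glued together.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is NOT the NxParser implementation.
class NaiveNxSplitter {

    // Splits one Nx line into its terms, using whitespace outside of
    // quoted literals as the only separator.
    static List<String> split(String line) {
        List<String> terms = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes && c == '\\' && i + 1 < line.length()) {
                // Keep escape sequences (e.g. an escaped quote) intact.
                cur.append(c).append(line.charAt(++i));
            } else if (c == '"') {
                inQuotes = !inQuotes;
                cur.append(c);
            } else if (!inQuotes && Character.isWhitespace(c)) {
                flush(cur, terms);
            } else {
                cur.append(c);
            }
        }
        flush(cur, terms);
        // Drop the statement terminator if it was split off as its own token.
        if (!terms.isEmpty() && terms.get(terms.size() - 1).equals(".")) {
            terms.remove(terms.size() - 1);
        }
        return terms;
    }

    // Moves the term collected so far (if any) into the result list.
    private static void flush(StringBuilder cur, List<String> terms) {
        if (cur.length() > 0) {
            terms.add(cur.toString());
            cur.setLength(0);
        }
    }
}
```

With this sketch, a line with four whitespace-separated terms followed by ` .` yields four terms, whereas a valid N-Triples line written without any whitespace between its terms comes back as a single unsplit token.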

Other formats

The NxParser family also includes an RDF/XML and a Turtle parser. Moreover, we attached a JSON-LD parser (jsonld-java) and an RDFa parser (semargl) such that they emit triples in the NxParser API.

Binaries

Compiled binaries are available on Maven Central. The groupId is org.semanticweb.yars. Choose the artifactId according to the part you need: if you only want to use the data model, use nxparser-model. If you want to make use of the parsers, use nxparser-parsers. If you want to use the RDF support for JAX-RS, use nxparser-jax-rs. The modules are linked as required.

<dependency>
  <groupId>org.semanticweb.yars</groupId>
  <artifactId>nxparser-parsers</artifactId>
  <version>2.3.3</version>
</dependency>

Legacy binaries

Old binaries can be found in the repository on Google Code, which we no longer maintain. To use them nevertheless, add

<repository>
 <id>nxparser-repo</id>
 <url>
  http://nxparser.googlecode.com/svn/repository
 </url>
</repository>
<repository>
 <id>nxparser-snapshots</id>
 <url>
  http://nxparser.googlecode.com/svn/snapshots
 </url>
</repository>

to your pom.xml.

Code Examples

Read Nx from a file

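import java.io.FileInputStream;

import org.semanticweb.yars.nx.Node;
import org.semanticweb.yars.nx.parser.NxParser;

FileInputStream is = new FileInputStream("path/to/file.nq");

NxParser nxp = new NxParser();
nxp.parse(is);

// each Node[] holds the terms of one line
for (Node[] nx : nxp) {
  // prints the subject, e.g. <http://example.org/>
  System.out.println(nx[0]);
}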

Use a blank node

// true means you are supplying proper N-Triples RDF terms that do not need to be processed
Resource subjRes = new Resource("<http://example.org/123>", true);
Resource predRes = new Resource("<http://example.org/123>", true);
BNode bn = new BNode("_:bnodeId", true);

Node[] triple = new Node[]{subjRes, predRes, bn};
// yields <http://example.org/123> <http://example.org/123> _:bnodeId
System.out.println(Arrays.toString(triple));

Use Unicode characters

String japaneseString = "祝福は、チーズのメーカーです。";
Literal japaneseLiteral = new Literal(japaneseString, "ja");

// yields "\u795D\u798F\u306F\u3001\u30C1\u30FC\u30BA\u306E\u30E1\u30FC\u30AB\u30FC\u3067\u3059\u3002"@ja
System.out.println(japaneseLiteral);

// yields 祝福は、チーズのメーカーです。
System.out.println(japaneseLiteral.getLabel());
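
The escaped form printed above follows the N-Triples convention of writing non-ASCII characters as `\uXXXX` (or `\UXXXXXXXX` outside the Basic Multilingual Plane). As a rough sketch of that escaping step in plain Java, assuming nothing about how NxParser implements it internally (the class `UnicodeEscapeSketch` and its method are hypothetical names):

```java
// Hypothetical illustration only; this is NOT the NxParser implementation.
class UnicodeEscapeSketch {

    // Replaces every non-ASCII code point with an N-Triples style
    // backslash-u (4 hex digits) or backslash-U (8 hex digits) escape.
    // Quotes, backslashes and control characters are NOT handled here.
    static String escapeNonAscii(String label) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < label.length(); ) {
            int cp = label.codePointAt(i);
            i += Character.charCount(cp);
            if (cp < 0x80) {
                sb.append((char) cp);
            } else if (cp <= 0xFFFF) {
                sb.append(String.format("\\u%04X", cp));
            } else {
                sb.append(String.format("\\U%08X", cp));
            }
        }
        return sb.toString();
    }
}
```

Applied to the Japanese label from the example, this sketch produces the same sequence of `\u` escapes as shown in the literal's printed form (without the surrounding quotes and language tag).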

Use datatyped literals

Example: Get a Calendar object from an xsd:dateTime-typed Literal

Literal dtl; // parser-generated
XSDDateTime dt = (XSDDateTime)DatatypeFactory.getDatatype(dtl); 
GregorianCalendar cal = dt.getValue();
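
For comparison, the conversion from an xsd:dateTime lexical form to a calendar object can be sketched with the JDK alone. This is only an illustration of what such a conversion amounts to, not a description of NxParser's internals; the class `XsdDateTimeSketch` is a hypothetical name, and the JDK's `javax.xml.datatype.DatatypeFactory` used here is a different class from the `DatatypeFactory` in the snippet above.

```java
import java.util.GregorianCalendar;
import javax.xml.datatype.DatatypeConfigurationException;
import javax.xml.datatype.DatatypeFactory;

// Hypothetical illustration only; uses the JDK, not the NxParser API.
class XsdDateTimeSketch {

    // Parses an xsd:dateTime lexical form such as 2016-05-13T12:30:00Z
    // and converts it to a GregorianCalendar.
    static GregorianCalendar toCalendar(String lexical) {
        try {
            return DatatypeFactory.newInstance()
                    .newXMLGregorianCalendar(lexical)
                    .toGregorianCalendar();
        } catch (DatatypeConfigurationException e) {
            // No JAXP datatype implementation available on this JVM.
            throw new IllegalStateException(e);
        }
    }
}
```

Calling `XsdDateTimeSketch.toCalendar("2016-05-13T12:30:00Z")` returns a calendar whose fields can be read with the usual `Calendar` getters; an invalid lexical form causes `newXMLGregorianCalendar` to throw an `IllegalArgumentException`.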

Use from Python

NxParser can be used from Python, provided you use the Jython implementation. (Thanks to Uldis Bojars; this example was saved from his now-offline blog.)

import sys
sys.path.append("./nxparser.jar")

from org.semanticweb.yars.nx.parser import *
from java.io import FileInputStream
from java.util.zip import GZIPInputStream

def all_triples(fname, use_gzip=False):
  in_file = FileInputStream(fname)
  if use_gzip:
    in_file = GZIPInputStream(in_file)

  nxp = NxParser()
  nxp.parse(in_file)

  while nxp.hasNext():
    triple = nxp.next()
    n3 = [i.toString() for i in triple]
    yield n3

The code above defines a generator function which will yield a stream of NQuad records. We can now add some demo code in order to see it in action:

def main():
  gzfname = "sioc-btc-2009.gz"

  for line in all_triples(gzfname, use_gzip=True):
    print line

# must be at module level, otherwise main() is never called
if __name__ == "__main__":
  main()

results in:

[u'<http://2008.blogtalk.net/node/29>', u'<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>', u'<http://rdfs.org/sioc/ns#Post>', u'<http://2008.blogtalk.net/sioc/node/29>']
[u'<http://2008.blogtalk.net/node/65>', u'<http://rdfs.org/sioc/ns#content>', u'"We\'ve created a map showing the main places of interest (event locations, restaurants, pubs, shopping locations and tourist sights) during BlogTalk 2008.  The conference venue is shown on the left-hand side of the map.  We will also have a hardcopy for all attendees. View Larger Map"', u'<http://2008.blogtalk.net/sioc/node/65>']

Issues with Eclipse

We had an issue with Eclipse not being able to create its folder structure for nxparser-parsers; running mvn eclipse:eclipse did the trick.