Parsing RDF Formats

semsol edited this page Mar 11, 2011 · 2 revisions

ARC2 comes with a built-in RDF format detector for convenient parsing. You can instantiate the generic RDF parser and let ARC2 detect the input format, which can be RDF/XML, Turtle, N-Triples, RSS 2.0, SPOG (constrained SPARQL XML results), Google's Social Graph API JSON, or HTML:

$parser = ARC2::getRDFParser();

You can also access a specific parser directly:

$parser = ARC2::getRDFXMLParser();

or

$parser = ARC2::getTurtleParser();

Parsing a remote document

$parser->parse('http://example.com/foaf.rdf');

Parsing a local document

$parser->parse('data/foaf.ttl');

Parsing data

$base = 'http://example.com/';
$data = '<rdf:RDF ...>...</rdf:RDF>';
$parser->parse($base, $data);
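
A minimal sketch of parsing an in-memory string and checking for problems before using the result. `parseString` is a hypothetical helper name, not part of ARC2. The sketch assumes the parser also exposes `getErrors()` and that it returns an array of message strings; neither is stated on this page.

```php
<?php
// Hypothetical helper: parse in-memory RDF and fail loudly on errors.
// Assumes $parser offers parse(), getErrors() and getTriples();
// getErrors() returning an array of message strings is an assumption.
function parseString($parser, $base, $data)
{
    $parser->parse($base, $data);
    $errors = $parser->getErrors();
    if (!empty($errors)) {
        throw new RuntimeException('RDF parsing failed: ' . implode('; ', $errors));
    }
    return $parser->getTriples();
}
```

With ARC2 loaded, this could be called as `parseString(ARC2::getTurtleParser(), 'http://example.com/', $data)`.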

Retrieving a flat triples array after parsing

$triples = $parser->getTriples();
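
A short sketch of walking the flat array. It assumes each triple is an associative array with the keys `s`, `p`, `o`, `s_type` and `o_type` (see Internal Structures for the authoritative layout). The sample data is a hand-written stand-in so the snippet runs without ARC2, and `formatTerm` and `formatTriple` are hypothetical helpers.

```php
<?php
// Stand-in for $parser->getTriples(); the key names are assumed.
$triples = array(
    array('s' => 'http://example.com/#me', 's_type' => 'uri',
          'p' => 'http://xmlns.com/foaf/0.1/name',
          'o' => 'Alice', 'o_type' => 'literal'),
    array('s' => 'http://example.com/#me', 's_type' => 'uri',
          'p' => 'http://xmlns.com/foaf/0.1/knows',
          'o' => 'http://example.com/#bob', 'o_type' => 'uri'),
);

// Render a single term depending on its type (no escaping; illustration only).
function formatTerm($value, $type)
{
    if ($type === 'literal') {
        return '"' . $value . '"';
    }
    if ($type === 'bnode') {
        return $value; // assumed to already carry its "_:" prefix
    }
    return '<' . $value . '>';
}

// Render one triple as an N-Triples-like line.
function formatTriple(array $t)
{
    return formatTerm($t['s'], $t['s_type']) . ' <' . $t['p'] . '> '
        . formatTerm($t['o'], $t['o_type']) . ' .';
}

$lines = array_map('formatTriple', $triples);
echo implode("\n", $lines), "\n";
```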

Retrieving triples indexed by subject -> predicates -> objects

$index = $parser->getSimpleIndex();

By default, this method flattens the object values. To keep each object's type, datatype, and language information, pass 0 as the argument:

$index = $parser->getSimpleIndex(0);
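
To make the difference between the two shapes concrete, the sketch below starts from a hand-written structured index and collapses it into the flattened form. The key names (`value`, `type`, `lang`) are assumptions about the structured shape, and `flattenIndex` is a hypothetical helper, not ARC2's own implementation.

```php
<?php
// Hand-written stand-in for $parser->getSimpleIndex(0); key names are assumed.
$index = array(
    'http://example.com/#me' => array(
        'http://xmlns.com/foaf/0.1/name' => array(
            array('value' => 'Alice', 'type' => 'literal', 'lang' => 'en'),
        ),
        'http://xmlns.com/foaf/0.1/knows' => array(
            array('value' => 'http://example.com/#bob', 'type' => 'uri'),
        ),
    ),
);

// Hypothetical helper: keep only each object's value, dropping the
// type, datatype, and language details.
function flattenIndex(array $index)
{
    $flat = array();
    foreach ($index as $s => $predicates) {
        foreach ($predicates as $p => $objects) {
            foreach ($objects as $o) {
                $flat[$s][$p][] = $o['value'];
            }
        }
    }
    return $flat;
}

$flat = flattenIndex($index);
echo $flat['http://example.com/#me']['http://xmlns.com/foaf/0.1/name'][0], "\n"; // prints "Alice"
```

The structured form is the one to use when literals need to be told apart from URIs or when language tags matter; the flattened form is handier for quick lookups.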

See also: Internal Structures