The package processes tabular data in the TSV format and converts it to RDF in the TriG serialization according to the user specifications.
- PHP >= 5.3.2
- Composer => 1.6.*
- ext-curl => *
Pull this package in through Composer.
{
    "require": {
        "pensoft/tsv4rdf": "*"
    }
}
Or run in a terminal: composer require pensoft/tsv4rdf
Laravel 5.x integration
Add the service provider to your config/app.php file:
'providers' => array(
    //...
    Pensoft\TSV4RDF\Providers\TSV4RDFServiceProvider::class,
),
Add the facade to your config/app.php file:
'aliases' => array(
    //...
    'TSV4RDF' => Pensoft\TSV4RDF\Facades\TSV4RDF::class,
),
cities.tsv
TSV is the file extension for tab-delimited files.
city | city_ascii | lat | lng | pop | country | iso2 | iso3 | province |
---|---|---|---|---|---|---|---|---|
Qal eh-ye Now | Qal eh-ye | 34.98300013 | 63.13329964 | 2997 | Afghanistan | AF | AFG | Badghis |
Chaghcharan | Chaghcharan | 34.5167011 | 65.25000063 | 15000 | Afghanistan | AF | AFG | Ghor |
Lashkar Gah | Lashkar Gah | 31.58299802 | 64.35999955 | 201546 | Afghanistan | AF | AFG | Hilmand |
Via Laravel 5.4
use Pensoft\TSV4RDF\TSV4RDF;
$tsv4rdf = new TSV4RDF();
$tsv4rdf->file('cities.tsv');
$tsv4rdf->file('cities.csv');
$tsv4rdf->setDelimeter(',');
The package can also stream CSV files, but the delimiter must then be set to a comma.
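To illustrate why the delimiter matters, here is a minimal, self-contained sketch in plain PHP. It does not use tsv4rdf; the parseDelimited helper is hypothetical and only shows that the same rows come out of a TSV and a CSV input once the right separator is passed.

```php
<?php
// Hypothetical helper, not part of tsv4rdf: parse delimited text into
// rows keyed by the header line. The default delimiter is a tab.
function parseDelimited($text, $delimiter = "\t")
{
    // Strip only trailing line breaks so empty trailing fields survive.
    $lines = preg_split('/\R/', rtrim($text, "\r\n"));
    $header = str_getcsv(array_shift($lines), $delimiter, '"', '\\');
    $rows = array();
    foreach ($lines as $line) {
        if ($line === '') {
            continue; // skip blank lines
        }
        $rows[] = array_combine($header, str_getcsv($line, $delimiter, '"', '\\'));
    }
    return $rows;
}

$tsv = "city\tcountry\tiso2\nChaghcharan\tAfghanistan\tAF\n";
$csv = "city,country,iso2\nChaghcharan,Afghanistan,AF\n";

$fromTsv = parseDelimited($tsv);      // tab-delimited input
$fromCsv = parseDelimited($csv, ','); // comma must be set explicitly

var_dump($fromTsv === $fromCsv);
```

Parsing the CSV text with the default tab delimiter would instead yield a single column per line, which is why the comma has to be set.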
$tsv4rdf->setNamespace('geo', 'http://rdf.insee.fr/def/geo#');
or
$tsv4rdf->setNamespaces( array('geo' => 'http://rdf.insee.fr/def/geo#', 'geo2' => 'http://www.opengis.net/geosparql#') );
$tsv4rdf->setBasePrefix ('geo');
The base prefix is NULL by default.
If no base prefix is set, the first registered namespace is taken and applied as the base prefix before the data starts streaming.
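The fallback rule for the base prefix can be sketched in plain PHP as follows. This is an illustration under assumptions, not the package's code: resolveBasePrefix and renderPrefixes are hypothetical names, and the @prefix lines simply follow the standard Turtle/TriG syntax.

```php
<?php
// Hypothetical sketch, not the tsv4rdf implementation.
// Returns the explicit base prefix, or the first namespace prefix if none is set.
function resolveBasePrefix(array $namespaces, $basePrefix = null)
{
    if ($basePrefix !== null) {
        return $basePrefix;
    }
    $prefixes = array_keys($namespaces);
    return count($prefixes) > 0 ? $prefixes[0] : null;
}

// Renders one Turtle/TriG prefix declaration per namespace.
function renderPrefixes(array $namespaces)
{
    $lines = array();
    foreach ($namespaces as $prefix => $iri) {
        $lines[] = sprintf('@prefix %s: <%s> .', $prefix, $iri);
    }
    return implode("\n", $lines);
}

$namespaces = array(
    'geo'  => 'http://rdf.insee.fr/def/geo#',
    'geo2' => 'http://www.opengis.net/geosparql#',
);

echo resolveBasePrefix($namespaces), "\n";         // no base prefix set
echo resolveBasePrefix($namespaces, 'geo2'), "\n"; // explicit base prefix
echo renderPrefixes($namespaces), "\n";
```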
$tsv4rdf->setPredefinedPredicates( array('city' => 'geo2:city') );
Exclude columns of the input file:
$tsv4rdf->setPredefinedPredicates( array('city_ascii' => '!') );
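A rough sketch of how predefined predicates and the '!' marker could be applied to a row is given below. It is plain PHP and independent of the package; mapPredicates is a hypothetical function, and the rule that unmapped columns become "basePrefix:column" is an assumption made for the example.

```php
<?php
// Hypothetical sketch, not the tsv4rdf implementation.
// Assumption: a column without an override becomes "<basePrefix>:<column>".
function mapPredicates(array $row, $basePrefix, array $predefined)
{
    $pairs = array();
    foreach ($row as $column => $value) {
        if (isset($predefined[$column])) {
            if ($predefined[$column] === '!') {
                continue; // '!' excludes the column from the output
            }
            $pairs[$predefined[$column]] = $value; // explicit predicate
        } else {
            $pairs[$basePrefix . ':' . $column] = $value; // assumed default
        }
    }
    return $pairs;
}

$row = array(
    'city'       => 'Lashkar Gah',
    'city_ascii' => 'Lashkar Gah',
    'iso2'       => 'AF',
);
$pairs = mapPredicates($row, 'geo', array(
    'city'       => 'geo2:city',
    'city_ascii' => '!',
));

print_r($pairs);
```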
Actions let you change or invoke triples and namespaces while the Turtle output is being streamed.
actionInitialize(function($row, $class){ })
This action is dispatched only once, before the triples and namespaces are loaded. The $class parameter is the tsv4rdf object and can invoke all of its public functions.
All actions
actionBeforeTriples(function($row, $class){ })
The actionBeforeTriples action is invoked before all triples are built from the input file.
actionAfterTriples(function($row, $class){ })
The actionAfterTriples action is invoked after all triples are built from the input file.
actionBeforeNamespaces(function($row, $class){ })
Adds prefixes before the namespaces are built in the Turtle output.
actionAfterNamespaces(function($row, $class){ })
Adds prefixes after all existing namespaces are built in the Turtle output.
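The general callback pattern behind such actions can be sketched in plain PHP. The HookRunner class below is hypothetical and is not how tsv4rdf is implemented. The order in which the hooks fire and the null passed as $row are assumptions chosen for the illustration.

```php
<?php
// Hypothetical hook runner, not the tsv4rdf implementation.
class HookRunner
{
    private $hooks = array();
    public $log = array();

    // Register a callback under a hook name.
    public function on($name, $callback)
    {
        $this->hooks[$name][] = $callback;
    }

    // Call every callback registered for $name with ($row, $this).
    private function fire($name, $row)
    {
        if (!isset($this->hooks[$name])) {
            return;
        }
        foreach ($this->hooks[$name] as $callback) {
            call_user_func($callback, $row, $this);
        }
    }

    // Assumed order: initialize, namespaces, then triples.
    public function run(array $rows)
    {
        $this->fire('initialize', null);
        $this->fire('beforeNamespaces', null);
        $this->log[] = 'namespaces';
        $this->fire('afterNamespaces', null);
        $this->fire('beforeTriples', null);
        foreach ($rows as $row) {
            $this->log[] = 'triples:' . $row['city'];
        }
        $this->fire('afterTriples', null);
    }
}

$runner = new HookRunner();
$runner->on('initialize', function ($row, $class) { $class->log[] = 'initialize'; });
$runner->on('beforeTriples', function ($row, $class) { $class->log[] = 'beforeTriples'; });
$runner->on('afterTriples', function ($row, $class) { $class->log[] = 'afterTriples'; });
$runner->run(array(array('city' => 'Chaghcharan')));

echo implode(' > ', $runner->log), "\n";
```

Passing the runner itself as the second argument mirrors the $class parameter described above, so a callback can call any public method of the object that dispatches it.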
Available methods in actions
setOneTimeTriple()
Renders a triple only once.
actionInitialize(function($row, $class){
    $class->setOneTimeTriple(
        $subject = '<http://openbiodiv.net/d3be573a-f04e-411a-adbc-45b048ded905-8708510>',
        $predicate = 'a',
        $object = 'fabio:Label',
        $is_object = true
    );
});
setTriple()
Stores a triple in the stock of triples.
actionInitialize(function($row, $class){
    $class->setTriple(
        $subject = '<http://openbiodiv.net/d3be573a-f04e-411a-adbc-45b048ded905-8708510>',
        $predicate = 'a',
        $object = 'openbiodiv:Label',
        $is_object = true
    );
});
setSubjectsSuffix()
The setSubjectsSuffix function adds a suffix to the main subject.
actionInitialize(function($row, $object){
    $object->setSubjectsSuffix('-label');
    echo $object->getSubject();
});
removeRowField()
The removeRowField function skips a value from the TSV resource.
actionInitialize(function($row, $object){
    $object->removeRowField('country', $row);
});
removeSubjectsSuffix()
The removeSubjectsSuffix function removes a suffix from the subject.
actionInitialize(function($row, $object){
    $object->removeSubjectsSuffix('-label');
    echo $object->getSubject();
});
getSubject()
The getSubject function returns the subject with an optional suffix and prefix. By default, $suffix and $prefix are empty.
actionInitialize(function($row, $object){
    $object->getSubject(
        $suffix = '',
        $prefix = ''
    );
});
getPredicate()
The getPredicate function returns the predicate for a column of the resource. The parameter can be the number or the name of the column.
actionInitialize(function($row, $object){
    $object->getPredicate('country');
});
getObject()
The getObject function returns the value of a column. The parameter can be the number or the name of the column.
actionInitialize(function($row, $object){
    $object->getObject($row, 'country', $default = null);
});
setNamespace()
The setNamespace function records a prefix and a resource in the array of namespaces.
actionInitialize(function($row, $object){
    $object->setNamespace(
        $prefix = '',
        $resource = ''
    );
});
$tsv4rdf->setLimit( 1000 );
By default the limit is 0, which reads all input data.
$tsv4rdf->toFile( 'cities.trig' );
The output data is appended to a single file, which can be given any extension.
$tsv4rdf->toFiles( 'cities_$1.trig', $1, 1000 );
The output data is split into files, each holding 1000 rows of the input data.
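The splitting can be pictured with a small plain-PHP sketch. It does not call tsv4rdf; planOutputFiles is a hypothetical helper, and replacing '$1' in the file name with a 1-based chunk number is an assumption about what the placeholder stands for.

```php
<?php
// Hypothetical sketch, not the tsv4rdf implementation.
// Assumption: '$1' in the template is replaced with a 1-based chunk number.
function planOutputFiles(array $rows, $template, $rowsPerFile)
{
    $plan = array();
    foreach (array_chunk($rows, $rowsPerFile) as $index => $chunk) {
        // Single quotes keep '$1' literal, so it is a plain placeholder here.
        $name = str_replace('$1', (string) ($index + 1), $template);
        $plan[$name] = $chunk;
    }
    return $plan;
}

$rows = range(1, 2500); // stand-in for 2500 input rows
$plan = planOutputFiles($rows, 'cities_$1.trig', 1000);

foreach ($plan as $name => $chunk) {
    echo $name, ': ', count($chunk), " rows\n";
}
```

With 2500 rows and 1000 rows per file, the last file holds the remaining 500 rows.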
$tsv4rdf->toAPI ($endpoint, $method = 'GET', $options = array(), $headers = array())
$endpoint - URL
$method - 'GET', 'POST', 'PUT', 'PATCH', etc.
$options - see curl_setopt
$headers - see CURLOPT_HEADEROPT
$tsv4rdf->toString ()
Displays the output data.