conversion:domain_name

timrdf edited this page Oct 24, 2012 · 36 revisions

What is first

What we will cover

This page discusses how to add a type to each row of a table during conversion (and thus to the subject of the triples created from that row) using the conversion:domain_name enhancement.

Let's get to it

The following table describes people and states, and it is good practice to explicitly type each kind of instance. In the naive RDF conversion shown below the table, it is difficult to recognize automatically that people and states are being mentioned.

First,Last,State
Anne,Armstrong,PA
Ben,Bailey,NY

eg:thing_2
   raw:first "Anne" ;
   raw:last "Armstrong" ;
   raw:state "PA" ;
   ov:csvRow "2"^^xsd:integer .

The conversion:domain_name enhancement can make this clearer (conversion:range_name behaves similarly for the resource objects created from cells in a table). If we look at the initial enhancement parameters [created the first time](Generating enhancement parameters) we pulled the conversion trigger, we see that the converter is told to interpret each column's values as string literals. We can do better than that.

      conversion:enhance [
         ov:csvCol          1;
         ov:csvHeader       "First";
         #conversion:label   "First";
         conversion:comment "";
         conversion:range   todo:Literal;
      ];
      conversion:enhance [
         ov:csvCol          2;
         ov:csvHeader       "Last";
         #conversion:label   "Last";
         conversion:comment "";
         conversion:range   todo:Literal;
      ];
      conversion:enhance [
         ov:csvCol          3;
         ov:csvHeader       "State";
         #conversion:label   "State";
         conversion:comment "";
         conversion:range   todo:Literal;
      ];

By adding the conversion:domain_name enhancement with the label "Person" to the first column, the instances created for each table row during conversion are typed to a class for Person.

      conversion:enhance [
         ov:csvCol          1;
         ov:csvHeader       "First";
         conversion:domain_name "Person"; # <-- Add a type to the row.
         conversion:range   todo:Literal;
      ];

eg_vocab:Person        # <-- A new class is created for "Person"
   a rdfs:Class , 
     owl:Class ;
   rdfs:label "Person" .

eg:person_2            # <-- The row subject is renamed (from "thing_2") to suggest its type.
   a eg_vocab:Person ; # <-- A new class is created for "Person" and the row is typed.
   e1:first "Anne" ;
   e1:last "Armstrong" ;
   e1:state "PA" ;
   ov:csvRow "2"^^xsd:integer .

eg:person_3 
   a eg_vocab:Person ;
   e1:first "Ben" ;
   e1:last "Bailey" ;
   e1:state "NY" ;
   ov:csvRow "3"^^xsd:integer .

Putting it all together

The conversion:domain_name enhancement is often used together with other enhancements. Applying them together produces a much more useful RDF structure from the table:

  • The subject URIs are renamed to be more informative (e.g. eg:Anne_Armstrong instead of eg:thing_2).
  • Human-readable strings are provided for each URI (e.g. "Anne Armstrong").
  • The instances of people are typed to the very common foaf:Person class.
  • The first and last names reuse the FOAF vocabulary.
  • The relation between the person and the state is expressed with a more specific form of foaf:based_near (e2:lives_in) in addition to the more general relation foaf:based_near.
  • The original URIs created for the people (e.g. eg:thing_2 and eg:person_2) reference the preferred URI (e.g. eg:Anne_Armstrong). This allows anyone who used the original URIs to find their way to the better form.
  • The vocabulary created to model these instances is defined (as owl:Class and rdfs:Class) so that it can be further described by others.

eg:Anne_Armstrong
   rdfs:label "Anne Armstrong" ; dcterms:identifier "Anne Armstrong" ; coin:slug "Anne_Armstrong" ;
   a foaf:Person , eg_vocab:Person ;
   foaf:firstName "Anne" ;
   foaf:family_name "Armstrong" ;
   e2:lives_in     typed_state:PA ;
   foaf:based_near typed_state:PA ;
   ov:csvRow "2"^^xsd:integer .
   
eg:Ben_Bailey 
   rdfs:label "Ben Bailey" ; dcterms:identifier "Ben Bailey" ; coin:slug "Ben_Bailey" ;
   a foaf:Person , eg_vocab:Person ;
   foaf:firstName "Ben" ;
   foaf:family_name "Bailey" ;
   e2:lives_in     typed_state:NY ;
   foaf:based_near typed_state:NY ;
   ov:csvRow "3"^^xsd:integer .

eg:thing_2 con:preferredURI eg:Anne_Armstrong .
eg:thing_3 con:preferredURI eg:Ben_Bailey .

typed_state:PA 
   dcterms:identifier "PA" ;
   a eg_vocab:State , wgs:SpatialThing ;
   rdfs:label "PA" .

typed_state:NY 
   dcterms:identifier "NY" ;
   a eg_vocab:State , wgs:SpatialThing ;
   rdfs:label "NY" .
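
The enhancement parameters behind this output are not listed on this page. As a rough sketch only, they might look like the fragment below, which would sit inside the existing enhancement parameters. The property names come from the conversion vocabulary's other enhancements (conversion:domain_template, conversion:class_name/conversion:subclass_of, conversion:range_name, conversion:equivalent_property, conversion:subproperty_of). The template "[#1] [#2]", the label "lives in", and the way the parameters are grouped are assumptions inferred from the output above, and the foaf: and wgs: prefixes are assumed to be declared. Check each enhancement against its own page before relying on it.

```turtle
# Sketch only: the values below are assumptions inferred from the output above.
conversion:enhance [
   conversion:domain_template "[#1] [#2]";   # assumed: name each row from First and Last
   conversion:domain_name     "Person";      # type each row as eg_vocab:Person
];
conversion:enhance [
   conversion:class_name  "Person";
   conversion:subclass_of foaf:Person;       # reuse the existing FOAF class
];
conversion:enhance [
   conversion:class_name  "State";
   conversion:subclass_of wgs:SpatialThing;
];
conversion:enhance [
   ov:csvCol          1;
   ov:csvHeader       "First";
   conversion:equivalent_property foaf:firstName;
   conversion:range   rdfs:Literal;
];
conversion:enhance [
   ov:csvCol          2;
   ov:csvHeader       "Last";
   conversion:equivalent_property foaf:family_name;
   conversion:range   rdfs:Literal;
];
conversion:enhance [
   ov:csvCol          3;
   ov:csvHeader       "State";
   conversion:label   "lives in";            # assumed label for the lives_in property
   conversion:subproperty_of foaf:based_near;
   conversion:range_name "State";            # type the cell resources as eg_vocab:State
   conversion:range   rdfs:Resource;
];
```

The first block is the only one that concerns conversion:domain_name; the remaining blocks are included to suggest how the other bullets could be achieved.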

What is next

  • conversion:range_name behaves just like conversion:domain_name, but applies to the resources created from tabular cells (instead of the table rows like with conversion:domain_name).
  • conversion:domain_template to rename the subject of the triples created during conversion (instead of naming its type, with conversion:domain_name).
  • conversion:class_name/conversion:subclass_of to subtype the class created by conversion:domain_name into more popular classes that already exist.

Older discussions

csv2rdf4lod-automation is licensed under the [Apache License, Version 2.0](https://github.com/timrdf/csv2rdf4lod-automation/wiki/License)

"What does each row represent?" is one of the first questions about an unfamiliar table of data. The conversion:domain_name enhancement answers this common question. For example, in our [quick and easy conversion](A quick and easy conversion), each row is an oil well; in the White House visitor records, each row is a person's visit to the White House.

Because many data curators commonly need to specify the conversion:domain_name, it is included in the default enhancement parameters.

      #conversion:enhance [
      #   conversion:domain_template "thing_[r]";
      #   conversion:domain_name     "Thing";
      #];
      #conversion:enhance [
      #   conversion:class_name "Thing";
      #   conversion:subclass_of <http://purl.org/...>;
      #];
      conversion:enhance [
         ov:csvCol          1;
         ov:csvHeader       "First";
         #conversion:label   "First";
         conversion:comment "";
         conversion:range   todo:Literal;
      ];
      conversion:enhance [
         ov:csvCol          2;
         ov:csvHeader       "Last";
         #conversion:label   "Last";
         conversion:comment "";
         conversion:range   todo:Literal;
      ];
      conversion:enhance [
         ov:csvCol          3;
         ov:csvHeader       "State";
         #conversion:label   "State";
         conversion:comment "";
         conversion:range   todo:Literal;
      ];

Example - Oil Wells

If conversion:domain_name is not set, you'll end up with something like:

:thing_2
   raw:quadrant_no "106" ;
   raw:tvdss_driller "0" ;
   raw:well_registration_no "106/20- 1" ;
   raw:completion_date "1990-11-01" ;
.

but after setting the enhancement parameters to:

      conversion:enhance [
         conversion:domain_name     "Oil Well";
      ];

you will get:

:oil_Well_2
   a local_vocab:Oil_Well ;
   e1:quadrant_no "106" ;
   e1:tvdss_driller "0" ;
   e1:well_registration_no "106/20- 1" ;
   e1:completion_date "1990-11-01" ;
.

local_vocab:Oil_Well 
   a rdfs:Class , owl:Class ;
   rdfs:label "Oil Well" .

:thing_2 con:preferredURI :oil_Well_2 .

Effects

  • Types the subject row (e.g. local_vocab:Oil_Well).
  • Changes the local name of the subject (e.g. oil_Well_2 instead of thing_2). For more control of the local name or the full URI of the subject, see conversion:domain_template.
  • Asserts a con:preferredURI from the thing_2 row instance to (e.g.) oil_Well_2. This leads anybody who was familiar with :thing_2 to the preferred URI, which carries a bit more human-friendly meaning.
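
The default enhancement parameters pair conversion:domain_name with a conversion:domain_template of the form thing_[r]. As a minimal sketch, the same pairing could set both the type and the local name of the oil well rows. The template value well_[r] is a hypothetical choice, and reading [r] as the row number is inferred from the thing_2 example rather than stated on this page.

```turtle
# Sketch only: "well_[r]" is a hypothetical template value.
conversion:enhance [
   conversion:domain_template "well_[r]";   # controls the local name of each row subject
   conversion:domain_name     "Oil Well";   # types each row, as in the example above
];
```

The resulting URIs are not shown here because they depend on how the converter expands the template; see conversion:domain_template for the details.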

Datasets that use this enhancement

(results):

PREFIX conversion: <http://purl.org/twc/vocab/conversion/>
SELECT distinct ?dataset ?template
WHERE {
  GRAPH <http://logd.tw.rpi.edu/vocab/Dataset>  {
    ?dataset conversion:conversion_process [
       conversion:enhance [
          conversion:domain_name ?template
       ]
    ]
  }
}
