timrdf edited this page Feb 14, 2012 · 31 revisions
csv2rdf4lod-automation is licensed under the Apache License, Version 2.0

In its simplest form, csv2rdf4lod is a quick and easy way to produce an RDF encoding of data available in Comma-Separated-Values (CSV).

In its advanced form, csv2rdf4lod is a custom reasoner tailored for some heavy-duty data integration. Although csv2rdf4lod can handle tabular data from well-structured RDBMS dumps, its forte is in handling "messier" tabular data created manually or using less rigorous information modeling strategies -- perfect for handling real data that evolved ''in the wild''.

In either case, csv2rdf4lod is designed to aggregate and integrate multiple versions of multiple datasets of multiple source organizations in an incremental and backward-compatible way. -- Tim Lebo
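To make the "simplest form" above a bit more concrete, here is a minimal sketch of what an RDF encoding of CSV can look like. It is plain Python, not csv2rdf4lod itself, and it does not reproduce the converter's actual output or URI design. The base URI `http://example.org/dataset/`, the `row/` and `vocab/` path segments, and the function name `csv_to_ntriples` are all assumptions made up for illustration.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical base URI, chosen only for this illustration.
BASE = "http://example.org/dataset/"


def csv_to_ntriples(text, base=BASE):
    """Naive encoding: one subject per row, one triple per cell."""
    triples = []
    reader = csv.DictReader(io.StringIO(text))
    for row_number, row in enumerate(reader, start=1):
        subject = f"<{base}row/{row_number}>"
        for column, value in row.items():
            # Skip ragged rows (missing cells or cells without a header).
            if column is None or value is None:
                continue
            predicate = f"<{base}vocab/{quote(column.strip())}>"
            # Escape the characters that may not appear raw in a literal.
            literal = (
                value.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r")
            )
            triples.append(f'{subject} {predicate} "{literal}" .')
    return triples


if __name__ == "__main__":
    sample = "name,state\nTroy,NY\nAlbany,NY\n"
    for line in csv_to_ntriples(sample):
        print(line)
```

Every value comes out as a plain string literal hanging off a row URI. That is roughly the kind of raw starting point that the "advanced form" improves on when the data needs to be integrated with other datasets.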

''Where to go from here?'' conversion:Enhancement is the most frequently used page for those up and running with csv2rdf4lod. It lists the enhancements that the converter recognizes, along with the beginning and end states of the RDF produced when performing the enhancement. It even cites datasets that benefited from the enhancements, so you can go check out the results ''in the wild''. (BTW, the enhancements are encoded in RDF using our conversion vocabulary, effortlessly decoupling them from the specific converter implementation and enabling cool things like querying for datasets according to how they were enhanced.) ''But first!''

Before you can make consistent, approachable, explicitly connected, integrated, ''linked'', ontology-ready, and provenance-captured semantic web instance data, you'll need to get started by Installing csv2rdf4lod automation on your favorite unix flavor. Apologies to the non-unix folks out there, but our initial focus was on the capability. We're looking for suggestions and mockups for a GUI that can effectively convey what the converter should do, but we haven't yet found anything that we are happy with. We're also looking for help setting up a RESTful service to expose the converter's capability (::nudge:: ::nudge::).

''Too much, too quick?''

If you're not quite ready to dig in, you can listen in on how csv2rdf4lod was used in a variety of Real-World Examples. The earliest is our results from a Mashathon held in Washington DC in August 2010, where three of us grad student types helped two dozen federal government types, grouped into a few teams, develop a mashup in less than two working days. Skipping a handful in between, our latest example comes from another group in the EPA trying to index third-party environmental reports.

''Go explore!''

For more, see the list of wiki pages on github. Although some of it is terse and fragmented, it will be the go-to place for me to synthesize all of the materials I've made over the past year while working on this toolset. It should give you a hint as to what is available and what is coming down the pipeline. Also, feel free to shoot me questions or make suggestions! One of my URIs is http://tw.rpi.edu/instances/TimLebo and you can find my email by following your nose. Or check out the Publications and Presentations.
