Installing csv2rdf4lod automation

Tim L edited this page Aug 8, 2013 · 105 revisions
csv2rdf4lod-automation is licensed under the [Apache License, Version 2.0](https://github.com/timrdf/csv2rdf4lod-automation/wiki/License)

If installing csv2rdf4lod-automation is not a pleasure for you, please contact me so that I can do whatever it takes to make it better. I will update this documentation and the tool to accommodate your feedback. -Tim

After installing, be sure to check out A quick and easy conversion by stepping through the Conversion process phases.

What you need

  • Unix-flavor operating system (including Mac OS X Terminal.app or Cygwin)
  • Familiarity with the command line
  • git is optional, but recommended

Step 1: Get the code

You can use git to get the code, or you can grab the tarball or zip. We recommend git, since it handles updating (and rolling back an update) more cleanly. The tarball and zip do provide a crude self-update mechanism (update-csv2rdf4lod-bin.sh), but it is not as elegant. If you're worried about not knowing git, don't be -- you only need to know two of its commands (clone and pull).

bash-3.2$ cd ~/Desktop
bash-3.2$ git clone git://github.com/timrdf/csv2rdf4lod-automation.git
Cloning into csv2rdf4lod-automation...
remote: Counting objects: 7412, done.
remote: Compressing objects: 100% (2208/2208), done.
remote: Total 7412 (delta 4841), reused 7244 (delta 4674)
Receiving objects: 100% (7412/7412), 71.47 MiB | 1.63 MiB/s, done.
Resolving deltas: 100% (4841/4841), done.
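
The transcript above covers clone; pull is the other command you need, and it is what brings an existing checkout up to date. A minimal sketch follows. The function name update_checkout is made up for this example and is not part of csv2rdf4lod-automation.

```shell
# Hypothetical helper (not shipped with csv2rdf4lod-automation):
# fetch and merge the latest published changes into an existing checkout.
# $1 = path to the working copy created by "git clone"
update_checkout() {
  # Run in a subshell so the caller's working directory is left unchanged.
  ( cd "$1" && git pull )
}

# Example, using the location from the transcript above:
# update_checkout ~/Desktop/csv2rdf4lod-automation
```

The subshell is only a convenience; running "git pull" by hand from inside the checkout does the same thing.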

Step 2: Install

Go into the directory that git set up and run install.sh:

bash-3.2$ cd ~/Desktop/csv2rdf4lod-automation/
bash-3.2$ ./install.sh 
install.sh:
   has set $CSV2RDF4LOD_HOME to /Users/timrdf/Desktop/csv2rdf4lod-automation in my-csv2rdf4lod-source-me.sh
   created my-csv2rdf4lod-source-me.sh.

What to do next:
   'source my-csv2rdf4lod-source-me.sh' to set environment variables.
    sourcing my-csv2rdf4lod-source-me.sh must be done each time you log in, so consider adding it to your ~/.bashrc.

install.sh reports what it did and what you should do next. Following its advice, source the file it created:

bash-3.2$ source my-csv2rdf4lod-source-me.sh
--
CSV2RDF4LOD_HOME                                         /Users/timrdf/Desktop/csv2rdf4lod-automation
CSV2RDF4LOD_BASE_URI                                     http://logd.tw.rpi.edu
CSV2RDF4LOD_BASE_URI_OVERRIDE                            (not required, $CSV2RDF4LOD_BASE_URI will be used.)
...

use cr-vars.sh to see the environment variables that CSV2RDF4LOD uses to control execution flow.

(NOTE: csv2rdf4lod-automation currently sets its environment variables using bash syntax; if you'd like support for another shell, vote for the issue on GitHub.)

Step 2.1: Set your CSV2RDF4LOD_BASE_URI

As you're starting, you only need to worry about the environment variable CSV2RDF4LOD_BASE_URI -- change it to your data-hosting site:

bash-3.2$ vi my-csv2rdf4lod-source-me.sh
export CSV2RDF4LOD_BASE_URI="http://your.org"
:wq

(For a description of the base URI and why it is needed, see conversion process phase: name.)

After setting CSV2RDF4LOD_BASE_URI in my-csv2rdf4lod-source-me.sh, you don't need to change anything else: install.sh has already set CSV2RDF4LOD_HOME for you, and the remaining variables have good defaults. You can tweak them later as you get more comfortable with converting and publishing.
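
If you would rather not open an editor, the same change can be scripted. The sketch below assumes the settings file holds the variable on a line of the form export CSV2RDF4LOD_BASE_URI="...", as in the vi example above. The function name set_base_uri is made up for this example and is not part of csv2rdf4lod-automation.

```shell
# Hypothetical helper (not shipped with csv2rdf4lod-automation):
# point CSV2RDF4LOD_BASE_URI in a settings file at a new base URI.
# $1 = settings file (e.g. my-csv2rdf4lod-source-me.sh), $2 = new base URI
# Assumes the URI contains no '|' or '&', which are special to the sed command below.
set_base_uri() {
  settings_file=$1
  base_uri=$2
  tmp_file=$(mktemp "${TMPDIR:-/tmp}/base-uri.XXXXXX") || return 1
  # Rewrite the existing export line; every other line passes through unchanged.
  # The trailing '=' keeps CSV2RDF4LOD_BASE_URI_OVERRIDE from matching.
  sed "s|^export CSV2RDF4LOD_BASE_URI=.*|export CSV2RDF4LOD_BASE_URI=\"$base_uri\"|" \
    "$settings_file" > "$tmp_file" || return 1
  # If the file had no such line, append one.
  grep -q '^export CSV2RDF4LOD_BASE_URI=' "$tmp_file" ||
    printf 'export CSV2RDF4LOD_BASE_URI="%s"\n' "$base_uri" >> "$tmp_file"
  # Copy the contents back so the original file keeps its permissions.
  cat "$tmp_file" > "$settings_file" && rm -f "$tmp_file"
}

# Example:
# set_base_uri my-csv2rdf4lod-source-me.sh "http://your.org"
```

Writing to a temporary file and copying back avoids "sed -i", whose syntax differs between GNU and Mac OS X sed.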

Step 2.2: Source your source-me.sh

Make sure you source csv2rdf4lod-automation/my-csv2rdf4lod-source-me.sh in every shell session in which you use csv2rdf4lod-automation -- the tools depend on the environment variables it sets.
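
As the install.sh output suggests, adding the source line to your ~/.bashrc saves you from re-sourcing by hand at every login. A minimal sketch of one way to do that without adding the line twice follows. The function name add_source_line is made up for this example and is not part of csv2rdf4lod-automation.

```shell
# Hypothetical helper (not shipped with csv2rdf4lod-automation):
# append a "source" line for the settings file to a shell startup file,
# unless that exact line is already present.
# $1 = path to my-csv2rdf4lod-source-me.sh, $2 = startup file (e.g. ~/.bashrc)
add_source_line() {
  settings_file=$1
  rc_file=$2
  # Quote the path so a directory name containing spaces still works.
  line="source \"$settings_file\""
  # Make sure the startup file exists before searching it.
  touch "$rc_file"
  # -x: match the whole line, -F: treat the text literally, -q: no output.
  if ! grep -qxF "$line" "$rc_file"; then
    printf '%s\n' "$line" >> "$rc_file"
  fi
}

# Example, using the location from the install.sh transcript (adjust to your checkout):
# add_source_line "$HOME/Desktop/csv2rdf4lod-automation/my-csv2rdf4lod-source-me.sh" "$HOME/.bashrc"
```

Running the helper again leaves the startup file unchanged, so it is safe to call from a setup script.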

cr-vars.sh (code) will show the environment variables that csv2rdf4lod-automation uses, along with their current values. If this script is on your path and runs, then your shell should be configured to run csv2rdf4lod-automation tools correctly. Inspect the variable settings to ensure that the tools will behave according to your preferences and their documentation.

bash-3.2$ cr-vars.sh
  
CSV2RDF4LOD_HOME                                         /opt/csv2rdf4lod
CSV2RDF4LOD_BASE_URI                                     http://logd.tw.rpi.edu
...
  
CSV2RDF4LOD_CONVERT_MACHINE_URI                          http://purl.org/twc/id/machine/lebot/MacBookPro6_2
CSV2RDF4LOD_CONVERT_PERSON_URI                           http://tw.rpi.edu/instances/TimLebo
  
...
  
CSV2RDF4LOD_PUBLISH                                      false
CSV2RDF4LOD_PUBLISH_DELAY_UNTIL_ENHANCED                 true
...
  
see documentation for variables in:
https://github.com/timrdf/csv2rdf4lod-automation/blob/master/bin/setup.sh
https://github.com/timrdf/csv2rdf4lod-automation/wiki/CSV2RDF4LOD-environment-variables
  
http://purl.org/twc/id/software/csv2rdf4lod
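
The two conditions described above (the variables are set, and cr-vars.sh is on your path) can also be checked from a script before it starts a conversion. The sketch below is only an illustration. The function name check_csv2rdf4lod_env is made up for this example and is not part of csv2rdf4lod-automation.

```shell
# Hypothetical helper (not shipped with csv2rdf4lod-automation):
# report whether the current shell looks ready to run the tools.
# Returns 0 when CSV2RDF4LOD_HOME names a directory and cr-vars.sh is on PATH.
check_csv2rdf4lod_env() {
  problems=0
  if [ -z "${CSV2RDF4LOD_HOME:-}" ]; then
    echo "CSV2RDF4LOD_HOME is not set; source my-csv2rdf4lod-source-me.sh first" >&2
    problems=1
  elif [ ! -d "$CSV2RDF4LOD_HOME" ]; then
    echo "CSV2RDF4LOD_HOME is not a directory: $CSV2RDF4LOD_HOME" >&2
    problems=1
  fi
  if ! command -v cr-vars.sh >/dev/null 2>&1; then
    echo "cr-vars.sh is not on PATH" >&2
    problems=1
  fi
  return "$problems"
}

# Example:
# check_csv2rdf4lod_env && cr-vars.sh
```

This confirms only that the environment is in place; you still need to read the cr-vars.sh output to see whether the values match your preferences.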

For some added features, you can walk through Installing csv2rdf4lod automation - complete, but it is not required to get started converting.

Want credit?

To get some credit for your efforts converting and enhancing data, add the URIs for your machine and yourself. This is optional but highly recommended - you could get cited in the future!

export CSV2RDF4LOD_CONVERT_MACHINE_URI="http://your.edu/web/inside/machine/gemini#"
...
export CSV2RDF4LOD_CONVERT_PERSON_URI="http://tw.rpi.edu/instances/notTimLebo"

To see where these values are used, see CSV2RDF4LOD_CONVERT_PERSON_URI.

What's next?

Historical note

(For alternative external descriptions, see csv2rdf4lod Tutorials.)