knudmoeller edited this page Aug 28, 2012 · 63 revisions

This document is a step-by-step tutorial explaining how to deploy a working LATC platform on a desktop PC or a server.

Base set up

This section covers the dependencies of LATC, which are:

  • Java
  • Hadoop 1.0.0
  • MySQL (or any other database system with a JDBC connector)
  • Jetty 7 or Tomcat 7
  • Apache
  • A PHP engine

The rest of this tutorial assumes that all of these are installed and running. The versions listed are the ones the software has been tested with; other versions may work, but they have not been tested.

Before setting the common variables listed below, we suggest declaring the following alias in /etc/hosts: 127.0.0.1 latc-platform.local. Also create a new directory, e.g. LATC, and move Tomcat into it.
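
The preparation steps above could be scripted roughly as follows. This is a minimal sketch: the function names add_host_alias and prepare_latc_dir are hypothetical, the www subdirectory is an assumption taken from the $WWW example below, and moving Tomcat is left out because its current location depends on your system.

```shell
# Append the alias to a hosts file unless it is already there.
# The function name is illustrative; pass /etc/hosts (as root) for real use.
add_host_alias() {
    hosts_file="$1"
    if ! grep -q 'latc-platform\.local' "$hosts_file" 2>/dev/null; then
        printf '127.0.0.1 latc-platform.local\n' >> "$hosts_file"
    fi
}

# Create the LATC working directory, including a www subdirectory for Apache.
prepare_latc_dir() {
    mkdir -p "$1/www"
}
```

Sourced into a root shell, add_host_alias /etc/hosts adds the alias once; calling it a second time leaves the file unchanged.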

List of common variables

  • $WEBAPPS = webapps directory of the servlet container (for instance, $WEBAPPS = /path/to/Tomcat/webapps)
  • $SERVER = server location for Servlet container (for instance, $SERVER = http://latc-platform.local:8080)
  • $APACHE = server location for Apache (for instance, http://latc-platform.local)
  • $WWW = web directory of apache (for instance, /path/to/LATC/www)
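
To avoid retyping these values, the variables can be kept in a small shell file and sourced before running the later commands. The sketch below assumes Tomcat was moved to a tomcat subdirectory of the LATC directory; LATC_HOME is a hypothetical helper variable, not something the platform itself reads.

```shell
# LATC_HOME is a hypothetical helper variable pointing at the LATC directory.
LATC_HOME="${LATC_HOME:-$HOME/LATC}"

WEBAPPS="$LATC_HOME/tomcat/webapps"        # webapps directory of the servlet container
SERVER="http://latc-platform.local:8080"   # server location of the servlet container
APACHE="http://latc-platform.local"        # server location of Apache
WWW="$LATC_HOME/www"                       # web directory of Apache
export WEBAPPS SERVER APACHE WWW
```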

TODO: Don't we want Tomcat to run behind Apache as reverse proxy?

Configure Apache virtualhost

On Mac OS X, edit /etc/apache2/extra/httpd-vhosts.conf to add the vhost config:

<VirtualHost *:80>
    ServerAdmin webmaster@dummy-host.example.com
    DocumentRoot "/Users/your_username/path/to/LATC/www"
    ServerName latc-platform.local
    ErrorLog "/private/var/log/apache2/latc-platform.local-error_log"
    CustomLog "/private/var/log/apache2/latc-platform.local-access_log" common
</VirtualHost>

<Directory "/Users/your_username/path/to/LATC/www">
    Options FollowSymLinks
    AllowOverride All
    Order allow,deny
    Allow from all
</Directory>

Then restart Apache: sudo apachectl restart

Configure MySQL

Create the database "latc" and the user "latc" with password "latc", then grant the user all permissions on that database. Start the client with mysql -u root and run:

CREATE DATABASE latc;
CREATE USER 'latc'@'localhost' IDENTIFIED BY 'latc';
GRANT ALL ON latc.* TO 'latc'@'localhost';
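
The same statements can also be fed to the client non-interactively. The sketch below only generates the SQL; the function name latc_db_sql and its optional arguments are hypothetical, and the commented commands assume a local MySQL server whose root account has no password, as in the instructions above.

```shell
# Print the SQL that creates a database and a user with full rights on it.
# Arguments (all default to "latc", as in this tutorial): database, user, password.
latc_db_sql() {
    db_name="${1:-latc}"
    db_user="${2:-latc}"
    db_pass="${3:-latc}"
    cat <<SQL
CREATE DATABASE ${db_name};
CREATE USER '${db_user}'@'localhost' IDENTIFIED BY '${db_pass}';
GRANT ALL ON ${db_name}.* TO '${db_user}'@'localhost';
SQL
}

# Typical use (requires a running MySQL server):
#   latc_db_sql | mysql -u root
#   mysql -u latc -platc latc -e 'SELECT 1;'   # check that the new user can log in
```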

Deployment of the different components

The LATC platform is divided into several components. The following sections describe, one by one, how to retrieve each of them from the project code repository and install it.

LATC Console

  1. Clone the git repository of the LATC platform into the newly created directory and change to the directory of the console
git clone git@github.com:LATC/24-7-platform.git
cd 24-7-platform/latc-platform/console/
  2. Edit the file src/main/resources/datanucleus.properties to match the parameters of your database. More specifically, edit the following lines:
javax.jdo.PersistenceManagerFactoryClass=org.datanucleus.jdo.JDOPersistenceManagerFactory
javax.jdo.option.ConnectionDriverName=com.mysql.jdbc.Driver
javax.jdo.option.ConnectionURL=jdbc:mysql://localhost/latc
javax.jdo.option.ConnectionUserName=latc
javax.jdo.option.ConnectionPassword=latc
  3. Edit the file $WEBAPPS/WEB-INF/configuration.properties to set the location of the MDS and the key used to query it
API_KEY_MDS = XXXXXXX
MDS_HOST = http://mds.lod-cloud.net/graphs
  4. Compile the source code using Maven
mvn package
  5. The result is a WAR file, LATC-Console.war, located in target. If the compilation fails, you can download the latest release instead
  6. Copy the WAR file LATC-Console.war into $WEBAPPS. Tomcat should unpack it automatically.
  7. Browse to $SERVER/LATC-Console/ to start using the Console
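
The copy step can be wrapped in a small helper that fails early when the build has not produced the WAR file. This is only a sketch; deploy_console is a hypothetical name, and the two arguments are the console source directory and the $WEBAPPS directory.

```shell
# Copy the built Console WAR into the servlet container's webapps directory.
# Arguments: console source directory, webapps directory ($WEBAPPS).
deploy_console() {
    console_dir="$1"
    webapps_dir="$2"
    war="$console_dir/target/LATC-Console.war"
    if [ ! -f "$war" ]; then
        echo "LATC-Console.war not found in $console_dir/target; run 'mvn package' first" >&2
        return 1
    fi
    cp "$war" "$webapps_dir/"
}

# Typical use, from the console directory after 'mvn package':
#   deploy_console . "$WEBAPPS"
```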

Note: To run the Console locally without installing Tomcat, you can use the Maven Tomcat plugin by typing mvn tomcat:run-war

Silk Workbench

  1. Clone Silk to your LATC directory:
git clone git://git.assembla.com/silk.git
cd LATC/silk/silk2
  2. Compile the source code using Maven
mvn clean install
  3. Copy the resulting WAR file into $WEBAPPS as workbench.war
cd LATC/silk/silk2/silk-workbench/silk-workbench-webapp
cp target/silk-workbench-webapp-2.5.3.war ~/path/to/LATC/tomcat/webapps/workbench.war
  4. Create the configuration file config.properties and set the location of the Console, the API key, and the branding. At the time of writing, the configuration file has to be located in the directory from which Tomcat is started (“./”). For example, if you cd into the Tomcat directory and start it with bin/startup.sh, the config file has to go into the Tomcat directory. This will be fixed.
workbenchName=LATC Workbench
linkSpecRepository=LATC_Console
linkSpecRepository.LATC_Console.URL=$SERVER/LATC-Console/api/
linkSpecRepository.LATC_Console.API_Key=...
enableVoidSourceButton=true

TODO: Where do we find the LATC Console API key? I saw it hardcoded inside console code ;-) in APIKey.java class - not sure this one should be used

  5. Restart Tomcat
cd LATC/tomcat
bin/shutdown.sh
bin/startup.sh
  6. Browse to $SERVER/workbench/ to start using the Workbench
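
Writing config.properties can be scripted so that the file always lands in the directory Tomcat is started from. The sketch below assumes nothing about the API key or the Console URL: both are passed in as arguments, and the function name write_workbench_config is hypothetical.

```shell
# Write config.properties into the directory Tomcat is started from.
# Arguments: target directory, Console API URL, Console API key.
write_workbench_config() {
    target_dir="$1"
    console_url="$2"
    api_key="$3"
    cat > "$target_dir/config.properties" <<EOF
workbenchName=LATC Workbench
linkSpecRepository=LATC_Console
linkSpecRepository.LATC_Console.URL=${console_url}
linkSpecRepository.LATC_Console.API_Key=${api_key}
enableVoidSourceButton=true
EOF
}

# Typical use (CONSOLE_API_URL and CONSOLE_API_KEY are placeholders you must set):
#   write_workbench_config ~/path/to/LATC/tomcat "$CONSOLE_API_URL" "$CONSOLE_API_KEY"
```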

LATC Data Source Inventory (DSI) and Meta data store (MDS)

Data Source Inventory (DSI)

Source Code

  • the DSI is based on the Puelia implementation of the Linked Data API, and implemented in PHP
  • the instructions assume that the DSI will be installed on a typical Linux server
  • the instructions assume that the URI of the DSI will be http://dsi.lod-cloud.net - replace this value in the instructions below if you want to host the DSI elsewhere
  • since the DSI is not available as its own repository, we need to clone the whole LATC 24-7-platform repository from GitHub and then create a symbolic link to the DSI folder (you don't need to repeat the git clone command if you already did this - in this case just do git pull origin master):
cd /var/www
git clone https://github.com/LATC/24-7-platform latc
ln -s latc/latc-platform/dsi/ dsi.lod-cloud.net
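
The clone-or-pull decision and the symbolic link can be combined in two small helpers. This is a sketch under assumptions: the function names are hypothetical, the HTTPS form of the repository URL is used, and the web root is passed as an argument (/var/www in these instructions).

```shell
# Clone the platform repository into <web root>/latc, or update an existing clone.
fetch_platform() {
    www_root="$1"
    if [ -d "$www_root/latc/.git" ]; then
        git -C "$www_root/latc" pull origin master
    else
        git clone https://github.com/LATC/24-7-platform "$www_root/latc"
    fi
}

# Link a component folder (e.g. dsi) to the host name it is served under.
# The link target is relative, so it resolves inside the web root.
link_component() {
    www_root="$1"
    component="$2"
    host_name="$3"
    target="latc/latc-platform/$component"
    if [ ! -d "$www_root/$target" ]; then
        echo "component folder not found: $www_root/$target" >&2
        return 1
    fi
    ln -sfn "$target" "$www_root/$host_name"
}

# Typical use:
#   fetch_platform /var/www && link_component /var/www dsi dsi.lod-cloud.net
```

The same link_component call works for other components of the repository by changing the last two arguments.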

DSI Server Configuration

  • in addition to the DSI code, the setup also requires an Apache virtual host config file, most likely located in /etc/httpd/vhosts.d
  • here is what the current config file dsi.lod-cloud.net.config looks like:
<VirtualHost *:80>
    DocumentRoot /var/www/dsi.lod-cloud.net/
    ServerName dsi.lod-cloud.net
    LogFormat "%v %h %l %u %D %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" apilogformat
    ErrorLog logs/dsi.lod-cloud.net.error.log
    CustomLog logs/dsi.lod-cloud.net.access.log apilogformat
    RewriteLog logs/dsi.lod-cloud.net.rewrite.log


    RewriteLogLevel 1
    RewriteEngine On

    RewriteRule ^/$            /datasets                                               [P,L,NE]
    RewriteRule ^/sparql$      http://api.talis.com/stores/latc-mds/services/sparql    [P,L,NE]

    LogLevel notice
    ProxyPreserveHost Off
    Options FollowSymLinks
</VirtualHost>

<Directory /var/www/dsi.lod-cloud.net/>
    Options FollowSymLinks
    AllowOverride FileInfo
</Directory>

Meta data store (MDS)

Source Code

  • the MDS is available as part of the LATC 24-7-platform on GitHub: https://github.com/LATC/24-7-platform
  • the MDS is implemented in PHP
  • the instructions assume that the MDS will be installed on a typical Linux server
  • the instructions assume that the URI of the MDS will be http://mds.lod-cloud.net - replace this value in the instructions below if you want to host the MDS elsewhere
  • since the MDS is not available as its own repository, we need to clone the whole LATC 24-7-platform repository from GitHub and then create a symbolic link to the MDS folder (you don't need to repeat the git clone command if you already did this - in this case just do git pull origin master):
cd /var/www
git clone https://github.com/LATC/24-7-platform latc
ln -s latc/latc-platform/mds/ mds.lod-cloud.net

MDS Server Configuration

  • in addition to the MDS code, the setup also requires an Apache virtual host config file, most likely located in /etc/httpd/vhosts.d
  • since the triple store used by the MDS currently is a remote store on the Talis Platform (as opposed to a local store), the config file defines a number of rewrite rules for the APIs offered by the Talis Platform (i.e., SPARQL, textual search, and describe) - if the backend store were to change, these rewrite rules would have to be adapted
  • here is what the current config file mds.lod-cloud.net.config looks like:
<VirtualHost *:80>
    DocumentRoot /var/www/mds.lod-cloud.net/htdocs
    ServerName mds.lod-cloud.net
    LogFormat "%v %h %l %u %D %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" apilogformat
    ErrorLog logs/mds.lod-cloud.net.error.log
    CustomLog logs/mds.lod-cloud.net.access.log apilogformat
    RewriteLog logs/mds.lod-cloud.net.rewrite.log

    RewriteLogLevel 1
    RewriteEngine On

    RewriteRule ^/sparql$        http://api.talis.com/stores/latc-mds/services/sparql    [P,L,NE]
    RewriteRule ^/search$        http://api.talis.com/stores/latc-mds/items              [P,L,NE]
    RewriteRule ^/describe$      http://api.talis.com/stores/latc-mds/meta               [P,L,NE]
    RewriteRule ^/data$          http://api.talis.com/stores/latc-mds/meta               [P,L,NE]

    LogLevel notice
    ProxyPreserveHost Off
    Options FollowSymLinks
</VirtualHost>

<Directory /var/www/mds.lod-cloud.net/htdocs>
    Options FollowSymLinks
    AllowOverride All
</Directory>

CKAN Import Scripts

  • the MDS is based on data imported for "lod"-tagged datasets on http://thedatahub.org (the canonical CKAN installation)
  • this data is being imported periodically by the import-ckan-json-as-void.php script in the mds/scripts folder
  • to set up the periodic data import, the script needs to be added to the crontab
  • the crontab can be edited as follows:
crontab -e
  • this opens the crontab in a text editor (typically vi)
  • add the following line, then save and quit (:wq)
  • modify the path to the script if necessary, and the time information as desired (e.g., to hourly)
@daily php /var/www/latc/latc-platform/mds/scripts/import-ckan-json-as-void.php 2>&1
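
On servers that are set up by script, the entry can also be added without opening an editor. The sketch below filters an existing crontab and appends the import job only if it is missing; add_import_job is a hypothetical name, and the path is the one used above.

```shell
# Read a crontab on stdin and print it with the CKAN import job appended,
# unless a line mentioning the import script is already present.
add_import_job() {
    job='@daily php /var/www/latc/latc-platform/mds/scripts/import-ckan-json-as-void.php 2>&1'
    existing=$(cat)
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    if ! printf '%s\n' "$existing" | grep -qF 'import-ckan-json-as-void.php'; then
        printf '%s\n' "$job"
    fi
}

# Typical use:
#   crontab -l 2>/dev/null | add_import_job | crontab -
```

Because the job is only appended when missing, running the pipeline repeatedly does not create duplicate entries.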

LATC Runtime

  1. Download the [SILK framework](http://www4.wiwiss.fu-berlin.de/bizer/silk/releases/silk_2.5.2.zip), extract it into a temporary directory, and take the file named silkmr.jar out of it
  2. Download the [LatcRuntime.jar](https://github.com/downloads/LATC/24-7-platform/LatcRuntime.jar) package and place it in the same directory
  3. Create a configuration file config.ini with the following content:
HADOOP_PATH = hadoop-1.0.0
HADOOP_USER = foo
LATC_CONSOLE_HOST = http://latc-console.few.vu.nl/
LINKS_FILE_STORE  = links.nt
RESULTS_HOST = http://demo.sindice.net/latctemp
RESULT_LOCAL_DIR = results
SPEC_FILE = spec.xml
VOID_FILE = void.ttl
API_KEY_CONSOLE = xxx
API_KEY_MDS = xxx
MDS_HOST = http://mds.lod-cloud.net/graphs
  4. Create your void template with the name [void.tmpl](https://github.com/LATC/24-7-platform/blob/master/latc-platform/runtime/trunk/voidtmpl)
  5. Create a blacklist file that contains the names of the disabled link specifications
  6. Ensure the two jar files and the configuration file are in the same directory, then execute the runtime

java -jar LatcRuntime.jar -c config.ini
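
Before launching, it can help to confirm that the directory really holds the two jar files and the configuration file. The following is a minimal sketch of such a check; check_runtime_dir is a hypothetical helper, not part of the runtime.

```shell
# Report any of the required runtime files that are missing from a directory.
# Returns 0 when all are present, 1 otherwise.
check_runtime_dir() {
    runtime_dir="$1"
    missing=0
    for f in silkmr.jar LatcRuntime.jar config.ini; do
        if [ ! -f "$runtime_dir/$f" ]; then
            echo "missing: $runtime_dir/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Typical use, from the runtime directory:
#   check_runtime_dir . && java -jar LatcRuntime.jar -c config.ini
```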