Jaana edited this page Jun 9, 2015 · 14 revisions

Tutorial

Please follow the installation guide to get started. Then open a PDF file using "Open File" on the toolbar of the Semann tool.

Now you can start annotating.

###How to add annotations?

Select text in the PDF file; it will appear in the "Subject" field.

Fill in the "Property" and "Object" fields. Properties denote the type of connection between the selected text (the subject) and the object, e.g. "is about" or "related to". Objects classify the selected text with respect to the property, e.g. "computer science" or "company". You can type your own values or use existing ones from the database. Reuse existing values where they suit the purpose, because shared values give better results with the similar publications functionality.

NB! Since you are running on a local copy of your database, you need to add some annotations first for the suggestions to become available.

When all three fields are filled, click the "Add annotation" button. The triple is then uploaded to the database.
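To make the subject, property and object idea concrete, here is a minimal Python sketch of how one annotation triple could be turned into a SPARQL update. It is an illustration only, not the tool's actual code or data model: the `example.org` IRIs and the function name `build_insert_query` are hypothetical, and the graph URI is simply the one used in the troubleshooting query further down this page.

```python
# Graph URI taken from the troubleshooting query on this page; everything
# else (function name, example IRIs) is a hypothetical illustration.
GRAPH_URI = "http://eis.iai.uni-bonn.de/semann/graph"

# Characters that must not appear inside a <...> IRI in SPARQL.
FORBIDDEN = set('<>"{}|^`\\')


def build_insert_query(subject: str, prop: str, obj: str,
                       graph: str = GRAPH_URI) -> str:
    """Return an INSERT DATA statement for one subject/property/object triple."""
    for name, iri in (("subject", subject), ("property", prop), ("object", obj)):
        if not iri:
            # Mirrors the rule that all three fields must be filled in.
            raise ValueError(f"{name} must not be empty")
        if any(ch in FORBIDDEN or ch.isspace() for ch in iri):
            raise ValueError(f"{name} contains characters not allowed in an IRI")
    return (f"INSERT DATA {{ GRAPH <{graph}> "
            f"{{ <{subject}> <{prop}> <{obj}> . }} }}")


if __name__ == "__main__":
    print(build_insert_query(
        "http://example.org/paper1#selection1",      # hypothetical subject
        "http://example.org/isAbout",                # hypothetical property
        "http://dbpedia.org/resource/Computer_science",
    ))
```

Running an update like this against the database requires the SPARQL_UPDATE role described in the installation guide.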


###How to fetch existing annotations?

You can fetch existing annotations for the open file by pressing the "Fetch annotations" button. This fetches all existing annotations for the given file, added by you or anyone else, and displays them as highlights in the PDF file.


###Find similar publications

Pressing the "Find Similar" button returns results if other files in the database have annotations that match annotations in the open file. Returned results are currently neither ordered nor filtered. Each returned suggestion comes with a list of explanations of why the match was made. If a match is found in the same structural context (SemAnn Discourse Elements Ontology), this is emphasised with a corresponding label next to the specific explanation. Similar papers are found in the following way:

  1. Finds all papers where the annotations of the currently open paper match annotations of the same type in other papers. Currently this is limited to annotations that are DBpedia resources.
  2. Finds all papers where the currently open paper and some other paper share annotations that point to the same DBpedia subject category.
  3. Checks whether any of the found papers have their annotations in the same structural context as the currently open paper.
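The three steps above can be sketched in Python on in-memory data. This is a simplified illustration under stated assumptions, not the tool's implementation (which queries the database): the `Annotation` class, the `find_similar` function, and the `dbr:`/`dbc:` example values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Annotation:
    resource: str  # DBpedia resource the annotation points to
    context: str   # structural context of the annotated text


def find_similar(open_paper: str,
                 papers: Dict[str, List[Annotation]],
                 categories: Dict[str, Set[str]]
                 ) -> Dict[str, List[Tuple[str, bool]]]:
    """Map each similar paper to a list of (explanation, same_context) pairs."""
    results: Dict[str, List[Tuple[str, bool]]] = {}
    for mine in papers[open_paper]:
        for other_id, other_annotations in papers.items():
            if other_id == open_paper:
                continue
            for theirs in other_annotations:
                if mine.resource == theirs.resource:
                    # Step 1: both papers annotate the same DBpedia resource.
                    reason = f"shared resource {mine.resource}"
                else:
                    # Step 2: the resources share a DBpedia subject category.
                    shared = (categories.get(mine.resource, set())
                              & categories.get(theirs.resource, set()))
                    if not shared:
                        continue
                    reason = f"shared category {sorted(shared)[0]}"
                # Step 3: flag matches found in the same structural context.
                same_context = mine.context == theirs.context
                results.setdefault(other_id, []).append((reason, same_context))
    return results
```

As in the tool, the returned papers are neither ranked nor filtered; each entry only carries its explanations and the same-context flag.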


###In case of issues

This is an early prototype and you might still run into issues. Please register them as new issues so they can be resolved. Include a detailed description of the behaviour and, if possible, add the logs from the console window (F12).

Installation Guide

Install Openlink Virtuoso

Download the tool to your local machine, then continue with the installation of the database. The instructions below are based on version 7.1.0.

Windows

  1. Install Openlink Virtuoso database.
  2. Follow the [basic installation](http://www.openlinksw.com/dataspace/doc/dav/wiki/Main/VOSUsageWindows#Basic Installation) guide of Virtuoso.
  3. Set the PATH system environment variable on your system.
  4. Create a [Virtuoso Windows Service](http://www.openlinksw.com/dataspace/doc/dav/wiki/Main/VOSUsageWindows#Optional -- Create and Manage Virtuoso Windows Services).
  5. Open the administration tool at http://localhost:8890/conductor in your browser. Log in with user name dba and password dba.
  6. Add write access from the administration tool for the SPARQL editor when you are logged in: System Admin > User Accounts > SPARQL edit > add role: SPARQL_UPDATE. This allows the Semann tool to insert data to the database.
  7. Add the account you log in with (e.g. "dba") to the role SPARQL_UPDATE under the tab "roles" in the administration tool.
  8. Open a command prompt, run cd %VIRTUOSO_HOME%/bin, start the SQL client with isql 1111 dba dba, and execute GRANT EXECUTE ON DB.DBA.L_O_LOOK TO "SPARQL";. This step is necessary to avoid the error "No permission to execute dpipe DB.DBA.L_O_LOOK" (a bug in the Windows edition).

Linux

The following incomplete guide, which is roughly based on the Linux From Scratch guide, explains how to build Virtuoso yourself from the sources and install it locally, without root privileges, for testing.

  1. Make sure you have installed the dependencies listed in the guide, and you have the usual Linux development environment, including gcc and the GNU autotools.
  2. Download virtuoso-opensource-VERSION.tar.gz.
  3. Unpack it, cd into that directory.
  4. Run sed -i "s|virt_iodbc_dir/include|&/iodbc|" configure
  5. Run ./configure --prefix=$PWD --exec-prefix=$PWD --with-iodbc=/usr --with-readline --program-transform-name="s/isql/isql-v/" --disable-static. This is very quick and dirty; it installs Virtuoso into the same directory as the sources. You can fine-tune all installation directories; see ./configure --help.
  6. Run make install.
  7. Run bin/virtuoso-t +foreground +configfile var/lib/virtuoso/db/virtuoso.ini. This starts Virtuoso in foreground, which allows you to see any debug output, and to stop it by simply pressing Ctrl+C.
  8. Continue by opening http://localhost:8890/conductor and following steps 5, 6 and 7 of the Windows instructions given above.

####Still having problems?

Windows

Try to isolate the problem:

  • Open your SPARQL editor: http://localhost:8890/sparql. Does it open up?
    • If the link does not open, your Windows Service might be down. Start it manually: go to Start > "Services" and look for "Openlink Virtuoso Server [instance name]". If the service is not in "Started" status, right-click it and choose "Start". If the service starts and then immediately stops (with an error message), delete the %VIRTUOSO_HOME%\database\virtuoso.lck file and create a new Windows service instance as per the installation instructions. It should now start successfully.
    • If the link opens without problems, run the default query. Does it return results?
  • If the above point seemed fine then there might be an issue in the communication between the tool and the SPARQL editor. You might need to check some variables in the Semann tool.
  • To check if you managed to successfully load any information into the database, you can run the following query in SPARQL editor http://localhost:8890/sparql: SELECT * FROM <http://eis.iai.uni-bonn.de/semann/graph> WHERE { ?s ?p ?o }
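The checks above can also be scripted. The following Python sketch sends a count query to the local SPARQL endpoint over HTTP and reads the JSON result. It assumes the endpoint and graph URI given on this page, that the endpoint honours the standard `query` parameter and the `application/sparql-results+json` Accept header, and it uses a COUNT query instead of the SELECT * shown above; the function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

ENDPOINT = "http://localhost:8890/sparql"            # endpoint from this guide
GRAPH = "http://eis.iai.uni-bonn.de/semann/graph"    # graph from this guide


def count_query(graph: str) -> str:
    """SPARQL 1.1 query that counts all triples in the given graph."""
    return f"SELECT (COUNT(*) AS ?n) FROM <{graph}> WHERE {{ ?s ?p ?o }}"


def build_url(endpoint: str, query: str) -> str:
    """Encode the query as the 'query' parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query})


def count_triples(endpoint: str = ENDPOINT, graph: str = GRAPH) -> int:
    """Ask the endpoint how many triples the graph holds (needs a running server)."""
    request = Request(build_url(endpoint, count_query(graph)),
                      headers={"Accept": "application/sparql-results+json"})
    with urlopen(request, timeout=10) as response:
        data = json.load(response)
    return int(data["results"]["bindings"][0]["n"]["value"])


if __name__ == "__main__":
    print("Triples in the SemAnn graph:", count_triples())
```

If the script fails to connect, the service is probably down; if it prints 0, the endpoint works but no annotations have been stored yet.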

Import ontologies

This step is optional for the functioning of the tool. However, if you are interested in observing the semantic capabilities of the tool, you should not skip it.

  • Create a tmp folder under your Virtuoso home folder, eg in Windows: C:\Program Files\virtuoso-opensource\tmp
  • Add the following line to "virtuoso.ini" file: DirsAllowed = ., ../vad, ../tmp. After editing this file, you need to restart Virtuoso:
    • in Windows: open Windows Services > "OpenLink Virtuoso Server", right click > "Stop" and then "Start".
  • Download the following DBpedia datasets and uncompress them to the above mentioned tmp folder: mapping-based properties, DBpedia ontology, Categories (SKOS), Article Categories
  • Save [bulk load script](http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtBulkRDFLoaderScript#Bulk Loader Procedure and Sub-procedures creation SQL script) into your Virtuoso home folder and name it rdfloader.sql.
    • In Windows, run from command line:
      • cd %VIRTUOSO_HOME%/bin
      • isql 1111 dba dba
      • SQL> LOAD rdfloader.sql;
      • Ignore this warning: Warning 01V01: [OpenLink][Virtuoso ODBC Driver][Virtuoso Server]QW004: Incompatible types VARCHAR (182) and INTEGER (189) in = for gr and in lines 325-348 of load rdfloader.sql
  • Create a file named global.graph in the tmp folder, with its entire content being the URI of the desired target graph, i.e., http://dbpedia.org
  • Load the ontologies into the database graph http://dbpedia.org:
    • on command line: SQL> ld_dir ('../tmp', '*.*', 'http://dbpedia.org');
  • Initiate the import: SQL> rdf_loader_run (); This may take some time, depending on the size of the data sets.
    • To check whether it is finished, see the Virtuoso log for the lines: 10:21:50 PL LOG: Loader started 10:21:50 PL LOG: No more files to load. Loader has finished
  • You can run a query to verify whether the import was successful; it should return a number higher than 0: SQL> sparql select count(*) from <http://dbpedia.org> where {?s ?p ?o};
  • Import SemAnn ontologies:
    • Save the following files to the above mentioned tmp folder: SemAnn Discourse Elements Ontology, SemAnn Annotation Ontology. Also, their respective graph files: semann.0.2.sdeo.ttl.graph, semann.0.2.owl.ttl.graph
    • Load the ontologies into their respective database graphs:
      • on command line: SQL> ld_dir ('../tmp', 'semann.0.2.sdeo.ttl', 'http://eis.iai.uni-bonn.de/semann/0.2/sdeo');
      • on command line: SQL> ld_dir ('../tmp', 'semann.0.2.owl.ttl', 'http://eis.iai.uni-bonn.de/semann/0.2/owl');
    • Initiate the import: SQL> rdf_loader_run ();
      • You can run queries to verify whether the import was successful; each should return a number higher than 0: SQL> sparql select count(*) from <http://eis.iai.uni-bonn.de/semann/0.2/owl> where {?s ?p ?o}; and SQL> sparql select count(*) from <http://eis.iai.uni-bonn.de/semann/0.2/sdeo> where {?s ?p ?o};
  • Run a checkpoint to commit all transactions to the database: SQL> checkpoint;
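The graph files and load statements for the SemAnn ontologies follow a regular pattern, so they can be generated. The Python sketch below writes a `.graph` file next to each ontology file and returns the matching statements to paste into isql. It only uses the file names and graph URIs listed above; the function names are hypothetical, and the sketch does not run isql itself.

```python
from pathlib import Path
from typing import List

# Ontology files and target graphs as listed in this guide.
SEMANN_FILES = {
    "semann.0.2.sdeo.ttl": "http://eis.iai.uni-bonn.de/semann/0.2/sdeo",
    "semann.0.2.owl.ttl": "http://eis.iai.uni-bonn.de/semann/0.2/owl",
}


def write_graph_file(folder: Path, data_file: str, graph_uri: str) -> Path:
    """Create '<data_file>.graph' whose entire content is the target graph URI."""
    graph_file = folder / (data_file + ".graph")
    graph_file.write_text(graph_uri, encoding="utf-8")
    return graph_file


def ld_dir_statement(folder: str, pattern: str, graph_uri: str) -> str:
    """Format one ld_dir call in the form used in this guide."""
    return f"ld_dir ('{folder}', '{pattern}', '{graph_uri}');"


def prepare_semann_import(tmp_folder: Path) -> List[str]:
    """Write the graph files and return the isql statements to run, in order."""
    statements = []
    for name, graph in SEMANN_FILES.items():
        write_graph_file(tmp_folder, name, graph)
        statements.append(ld_dir_statement("../tmp", name, graph))
    statements.append("rdf_loader_run ();")
    statements.append("checkpoint;")
    return statements
```

Calling `prepare_semann_import(Path(r"C:\Program Files\virtuoso-opensource\tmp"))` and printing the returned lines gives the statements in the order they appear in the list above.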

Install a web server

(You can skip this step if you do not need to open the links to similar publications.) The steps below install the popular lightweight nginx web server:

  • Download the latest mainline version distribution and unpack it to a location of your choice (e.g. in the root directory, i.e. C:\ in Windows).
  • Edit the configuration file in ./conf/nginx.conf.
    • Change the port number to one that is not already in use, e.g. listen 8180;. The default port 80 is often used by other programs, which causes nginx to report an error.
    • Set the root to point to the location of this tool on your system: root C:/Users/JohnDoe/Documents/GitHub/semann;
  • Start the engine from command line from the unpacked directory of nginx: start nginx
  • The tool should now be accessible under http://localhost:8180/
  • If you are still experiencing problems, please check the installation instructions:

Changelog

v1.0 - 04.2014, current milestone