fgiasson/Scones

 
 


Introduction

Scones (subject concepts or named entities) is a web service system that tags a target document with subject concepts and named entities. The tagging is performed by GATE, and a GATE XML annotation file is returned to the user.

This Scones application is meant to be used in conjunction with the Scones structWSF web service endpoint. More information about that endpoint can be found here:

  1. Scones web service endpoint documentation
  2. Scones web service files and configuration files

structScones is a conStruct Drupal module that acts as a user interface that lets people send text documents to the Scones web service endpoint to tag them, review them, and index them within the structWSF instance.

Scones uses tomcat6, PHP/Java Bridge and GATE.
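
Since Scones returns a GATE XML annotation file, a short sketch of reading one may help. The Python below is only an illustration, not part of Scones: the sample document is hand-written, the "Lookup" type and "majorType" feature are hypothetical, and it assumes the usual GATE XML layout in which each Annotation element carries Id, Type, StartNode and EndNode attributes and nested Feature/Name/Value elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written sample in the GATE XML document layout.
# A real Scones response will carry different annotation types and features.
SAMPLE = """<GateDocument>
<TextWithNodes><Node id="0"/>Scones<Node id="6"/> tags text</TextWithNodes>
<AnnotationSet>
<Annotation Id="1" Type="Lookup" StartNode="0" EndNode="6">
<Feature>
  <Name className="java.lang.String">majorType</Name>
  <Value className="java.lang.String">example</Value>
</Feature>
</Annotation>
</AnnotationSet>
</GateDocument>"""


def list_annotations(gate_xml):
    """Return one dict per <Annotation> element found in a GATE XML document."""
    root = ET.fromstring(gate_xml)
    annotations = []
    for ann in root.iter("Annotation"):
        # Collect the name/value pairs attached to this annotation.
        features = {}
        for feature in ann.findall("Feature"):
            name = feature.findtext("Name", default="")
            features[name] = feature.findtext("Value", default="")
        annotations.append({
            "id": ann.get("Id"),
            "type": ann.get("Type"),
            "start": int(ann.get("StartNode")),
            "end": int(ann.get("EndNode")),
            "features": features,
        })
    return annotations


for annotation in list_annotations(SAMPLE):
    print(annotation["type"], annotation["start"], annotation["end"])
```

The start and end values are node offsets into the document text, so they can be used to locate each tagged span.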

Package Files

This Git repository contains the package to deploy on Tomcat6. The Scones folder is the source of the Scones.war file, which is used to autodeploy Scones on a Tomcat6 instance. The scones.xgapp file is the default GATE application used by the Scones structWSF web service endpoint; its default paths are pre-configured. The scones-gate.xgapp file is the default Scones GATE application to open in GATE Developer if you want to enhance the default Scones behavior.

This package includes:

  1. GATE version 7
  2. PHP/Java Bridge version 6.2.1

Installing & Configuring the Scones Web Service Endpoint

  1. Install tomcat6
  2. Install Scones.war
    1. Scones.war is a packaged version of GATE and the PHP/Java Bridge software, configured for Scones. It includes PHP/Java Bridge version 6.2.1 and GATE version 7.
    2. To install Scones.war, make sure tomcat6 is running, locate Tomcat's webapps folder (/var/lib/tomcat6/webapps/ on a default Ubuntu installation), copy Scones.war into that folder, and wait until Tomcat installs and deploys the war file.
    3. Restart tomcat to properly handle this new web application by using this command:
      1. JAVA_OPTS="-Dgate.plugins.home=/var/lib/tomcat6/webapps/Scones/Gate-Plugins" /etc/init.d/tomcat6 restart
    4. Properly set up the 50local.policy file on your server
      1. The most liberal permission setting is to add this line to that file:
        1. grant codeBase "file:/var/lib/tomcat6/webapps/Scones/WEB-INF/-" { permission java.security.AllPermission; };
      2. Restart tomcat6
  3. Create GATE application
    1. You can use the GATE Developer user interface to create a new GATE application to use with Scones, or simply use the default scones.xgapp application file.
      1. Create a new GATE pipeline and save it to an XGAPP file.
      2. Modify the XGAPP file generated by the GATE Developer user interface for the Scones setup
        1. Edit the generated XGAPP file and make sure that all the paths to the different files (named entity dictionaries used by the gazetteer, ontologies, etc.) can be resolved on the server where Scones will be running. Make sure that the paths to the GATE plugin files (in the /webapps/Scones/Gate-Plugins/ folder) are properly defined.
  4. Configure the Scones web service endpoint
    1. Make sure the Scones web service endpoint is properly configured by editing the config.ini configuration file on the server. By default, this file is located in /usr/share/structwsf/scones/
    2. In the php.ini file, enable (turn "On") the allow_url_include directive.
    3. Make sure the port tomcat6 runs on (8080 by default) is not used by another application (such as Jetty).
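
The path check on the XGAPP file described above can be partly automated. The following Python is a minimal sketch under assumptions, not a Scones or GATE tool: it assumes that the saved application stores its resource locations in urlString elements (as GATE saved application files commonly do), and it only verifies absolute file: URLs. Relative markers such as $relpath$ or $gatehome$ are skipped because resolving them depends on where the file and GATE are installed. The function name is hypothetical.

```python
import os
import re
from urllib.parse import unquote, urlparse

# Assumption: resource locations appear as <urlString>...</urlString> elements.
URL_STRING = re.compile(r"<urlString>(.*?)</urlString>", re.DOTALL)


def missing_xgapp_paths(xgapp_text):
    """Return the absolute file: URLs in the XGAPP text whose target does not exist."""
    missing = []
    for raw in URL_STRING.findall(xgapp_text):
        url = raw.strip()
        if not url.startswith("file:"):
            # $relpath$, $gatehome$ and non-file URLs are not checked here.
            continue
        path = unquote(urlparse(url).path)
        if not os.path.exists(path):
            missing.append(url)
    return missing
```

To use it, read the application file on the server that will run Scones and pass its content to the function, for example `missing_xgapp_paths(open("scones.xgapp", encoding="utf-8").read())`. An empty list means every absolute path that was checked exists on that server.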

Ontology

Scones uses the classes defined in an ontology (or in more than one ontology if you modify the GATE application for that purpose). For better GATE performance, we suggest defining only classes in that ontology, and no properties or named individuals. In special taggers, these could be defined if needed.
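
To see whether an ontology follows the classes-only suggestion above, one can count its declarations. This Python sketch is an illustration under assumptions: it expects the ontology in RDF/XML serialization, it only counts declarations written as typed elements (for example owl:Class), and it also counts anonymous class expressions. The sample ontology and its example.org URIs are made up.

```python
import xml.etree.ElementTree as ET
from collections import Counter

OWL = "http://www.w3.org/2002/07/owl#"
KINDS = ("Class", "ObjectProperty", "DatatypeProperty",
         "AnnotationProperty", "NamedIndividual")

# Hypothetical sample ontology used only to demonstrate the function.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/onto#City"/>
  <owl:Class rdf:about="http://example.org/onto#Person"/>
  <owl:ObjectProperty rdf:about="http://example.org/onto#livesIn"/>
</rdf:RDF>"""


def count_owl_declarations(rdf_xml):
    """Count OWL typed elements (classes, properties, individuals) in RDF/XML text."""
    tags = {"{%s}%s" % (OWL, kind): kind for kind in KINDS}
    counts = Counter()
    for element in ET.fromstring(rdf_xml).iter():
        kind = tags.get(element.tag)
        if kind is not None:
            counts[kind] += 1
    return counts


counts = count_owl_declarations(SAMPLE)
extras = {kind: n for kind, n in counts.items() if kind != "Class"}
print("classes:", counts["Class"], "other declarations:", extras)
```

Any entry in the "other declarations" dictionary points at properties or individuals that could be moved out of the ontology used by Scones.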

Named Entities

Scones not only tags concepts that come from an OWL ontology, but also tags named entities that are defined in named entity dictionaries.

These named entity dictionaries are specially formatted text files that are used by GATE. The best way to create them is to let structScones generate them for you.

The only thing you have to do is create datasets of records in structWSF. Then tag all the records in these datasets that you consider to be named entities and that should be used to tag documents processed with Scones.

Tagging a named entity is simple: add a triple to each record that is a named entity. This triple uses the sco:namedEntity attribute, like this:

  1. <record-URI> sco:namedEntity "true" .

Once this is done, each time you use structScones to re-create the named entity dictionaries, it will generate all the files needed from what is defined in structWSF.
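
When many records need the sco:namedEntity triple, the statements can be generated rather than written by hand. The Python below is a small sketch: the function name and the example.org record URIs are hypothetical, and the output keeps the prefixed sco: form shown above, so the prefix declaration still has to be supplied by whatever serialization you load into structWSF.

```python
def named_entity_triples(record_uris):
    """Return one sco:namedEntity statement per record URI, in the form shown above."""
    triples = []
    for uri in record_uris:
        # Reject values that cannot be written between angle brackets.
        if not uri or any(ch in uri for ch in '<> "\n\t'):
            raise ValueError("not a usable record URI: %r" % uri)
        triples.append('<%s> sco:namedEntity "true" .' % uri)
    return triples


# Hypothetical record URIs, for illustration only.
for line in named_entity_triples([
    "http://example.org/datasets/people/1",
    "http://example.org/datasets/people/2",
]):
    print(line)
```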

Taking Named Entities Changes Into Account

Changes to the named entities used to generate the Scones named entity dictionaries are not automatically applied to the running Scones instance. This means that if you change some of the named entities used by Scones, in their own datasets, these changes won't be taken into account when you tag a new document using Scones.

Consider this scenario: you change the preferred labels of some named entities in a few datasets, and you add a new dataset containing named entities to the system. Here are the steps to perform to make these changes available to the next Scones action:

  1. Go to the structScones settings page.
  2. Check the "Recreate the Named Entities dictionaries upon settings save" checkbox and leave the others unchecked.
  3. Click the "Save Configuration" button and wait until the page reloads.
  4. Once the page has reloaded, run the two structWSF Scones administration scripts: destroy.php and init.php. This re-initializes the Scones instance with the modifications you made to the named entity dictionaries.

All the modifications you made to the named entity dictionaries will then be taken into account in subsequent Scones queries.
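
The last step, running destroy.php and then init.php, could be wrapped in a small helper. This Python sketch rests on two assumptions that the text above does not confirm: that both scripts can be run with the PHP command-line interpreter, and that they sit in one directory that you pass in as admin_dir. The function name and the runner parameter are hypothetical.

```python
import subprocess


def reinitialize_scones(admin_dir, run=subprocess.run):
    """Run destroy.php, then init.php, from the given directory.

    Assumption: both scripts are runnable with the `php` command-line
    interpreter. The order matters: existing sessions are destroyed
    before new ones are created.
    """
    for script in ("destroy.php", "init.php"):
        # check=True stops the sequence if a script exits with an error.
        run(["php", script], cwd=admin_dir, check=True)
```

The run parameter exists only so the sequence can be exercised without PHP installed; in normal use it is left at its default.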

Initialize Scones Web Service

The next step is to initialize the Scones web service endpoint. Each time tomcat6 is restarted, Scones has to be re-initialized as well. The initialization phase consists of creating the GATE threads that will be used (concurrently if needed) to analyze incoming texts.

Other Scones Web Service Tools

Two other tools are available to help manage the Scones web service endpoint.

If you want to re-initialize Scones without restarting Tomcat, you can use the destroy.php script in the admin folder to destroy all the previously created sessions.

Additionally, if you want to check which sessions are currently in use in the system, you can use the analyzeSessions.php script to get the status of the loaded sessions.

The output looks like:

Sessions ID: #1
Used: FALSE
Number of documents: 0


Sessions ID: #2
Used: FALSE
Number of documents: 0


Sessions ID: #3
Used: FALSE
Number of documents: 0

If Used is TRUE, the session is currently in use by a user. The Number of documents is the number of documents currently being tagged by that user. If Used is FALSE, the session is simply waiting to be used by a requesting user.
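
For monitoring, the analyzeSessions.php report shown above can be turned into structured data. The Python below is a sketch that assumes the output always follows the three-line layout of the sample; the function name is hypothetical.

```python
import re

# One session block: ID line, Used line, Number of documents line.
SESSION = re.compile(
    r"Sessions ID:\s*#(\d+)\s+Used:\s*(TRUE|FALSE)\s+Number of documents:\s*(\d+)"
)

REPORT = """Sessions ID: #1
Used: FALSE
Number of documents: 0


Sessions ID: #2
Used: FALSE
Number of documents: 0


Sessions ID: #3
Used: FALSE
Number of documents: 0
"""


def parse_sessions(report):
    """Return one dict per session block found in the report text."""
    return [
        {"id": int(sid), "used": used == "TRUE", "documents": int(docs)}
        for sid, used, docs in SESSION.findall(report)
    ]


sessions = parse_sessions(REPORT)
free = [s["id"] for s in sessions if not s["used"]]
print("free sessions:", free)
```

With the sample report, all three sessions are reported as free because every Used value is FALSE.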

Resources

Here are some external documentation sources that may be helpful to better understand, install and configure a Scones instance:

  1. Scones: the structWSF web service endpoint documentation
  2. Scones and structScones Installation Instructions
  3. Scones: Story Tagging - A guide to use structScones
  4. Installing Gate and Creating New Gate Application Files
