Usage Guide

Petar Petrov edited this page May 20, 2013 · 5 revisions

This is the usage guide for C3PO 0.4.0. It will help you set up and run the command-line application as well as the web app on a server of your choice. Please download the binaries from BinTray.

Requirements

  • Java 1.6
  • MongoDB 2.0.5 or higher (64-bit build required)
  • FITS (optional)

Requirements WEB

  • An application server (Tomcat, Jetty or JBoss 7) (if you want to run the web app)
  • Play Framework (if you want to run the web app)

Setup

Install Java, MongoDB (http://www.mongodb.org) and FITS (https://github.com/harvard-lts/fits), if you haven't already. Take note of the port on which the mongo daemon is listening (27017 by default). Download the command-line application from BinTray.

General

The c3po command-line application has several modes to choose from. To run c3po, use the following command:

java -jar c3po-cmd.jar

Run without a mode, this prints an error message listing the modes you can use. All available modes and their options are described below. Options marked with '*' are obligatory.

The help mode prints all the available modes and options.

Usage: c3po help

The version mode prints version information.

Usage: c3po version
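
The usage lines in this guide abbreviate the invocation as c3po, while the actual command is java -jar c3po-cmd.jar. A minimal sketch of a shell wrapper that bridges the two, assuming the downloaded jar is named c3po-cmd.jar as in the command above. The C3PO_JAR variable is a name introduced here for illustration and is not part of C3PO:

```shell
# Hypothetical convenience wrapper; C3PO_JAR is an assumed variable name
# that defaults to the jar in the current directory.
c3po() {
  java -jar "${C3PO_JAR:-c3po-cmd.jar}" "$@"
}
```

With this function defined in the current shell, c3po help and c3po version run the two modes described above.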

The gather mode is used to read meta data into the mongo database.

Usage: c3po gather [options]
  Options:
  * -c, --collection
       The name of the collection
  * -i, --inputdir
       The input directory where the meta data is stored
    -r, --recursive
       Whether or not to gather recursively
       Default: false
    -t, --type
       Optional parameter to define the meta data type. Use one of 'FITS' or
       'TIKA', to select the type of the input files. Default is FITS
       Default: FITS
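
As a sketch of a typical gather run, the following function reads FITS output recursively from a directory into a collection. The function name and its arguments are placeholders chosen for this example. Passing -r without a value assumes it is a simple switch, which its false default suggests but the help text does not confirm:

```shell
# $1 = collection name, $2 = directory holding the meta data files.
gather_fits() {
  # -c and -i are the two obligatory options; -r (assumed to be a bare
  # switch) descends into subdirectories; -t FITS is the default and is
  # spelled out only for clarity.
  java -jar c3po-cmd.jar gather -c "$1" -i "$2" -r -t FITS
}
```

For example, gather_fits mycollection "$HOME/fits-output" would load everything under that directory into a collection called mycollection.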

The profile mode is used to generate a profile in XML format.

Usage: c3po profile [options]
  Options:
    -a, --algorithm
       The algorithm that will be used for selecting the samples records.
       Supported values are: 'sizesampling', 'syssampling', 'distsampling'
       Default: sizesampling
  * -c, --collection
       The name of the collection
    -ie, --includeelements
       If this flag is present, the profile will include a list of element
       identifiers. Note, that this might be a long list.
       Default: false
    -o, --outputdir
       The output directory where the profile will be stored
       Default: <empty string>
    -props, --properties
       The list of properties for the 'distsampling' algorithm
       Default: []
    -s, --size
       The size of the samples set.
       Default: 5
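
A minimal sketch of a profile run using only the options listed above. The function name, the sample size of 10 and the choice of syssampling are arbitrary picks for illustration:

```shell
# $1 = collection name, $2 = output directory for the XML profile.
make_profile() {
  # Create the output directory first, in case it does not exist yet.
  mkdir -p "$2" || return 1
  # Systematic sampling with 10 samples; -ie adds the (possibly long)
  # list of element identifiers to the profile.
  java -jar c3po-cmd.jar profile -c "$1" -o "$2" -a syssampling -s 10 -ie
}
```

Calling make_profile mycollection ./profiles would then write the profile into ./profiles.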

The samples mode is used to select representative samples based on different strategies.

Usage: c3po samples [options]
  Options:
    -a, --algorithm
       The algorithm that will be used for selecting the samples records. Use
       one of 'sizesampling', 'syssampling', 'distsampling'
       Default: sizesampling
  * -c, --collection
       The name of the collection
    -o, --outputdir
       The output directory where the samples will be output. If nothing is
       provided the output is written to the console
    -props, --properties
       The list of properties for the 'distsampling' algorithm
       Default: []
    -s, --size
       The size of the samples set.
       Default: 5
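
The following sketch selects samples with the 'distsampling' algorithm. It passes a single property because the help text does not say how several properties are separated. Which property names exist depends on the gathered data, so the value used in the usage note is hypothetical:

```shell
# $1 = collection name, $2 = one property name for 'distsampling'.
pick_samples() {
  # No -o is given, so the samples are written to the console.
  java -jar c3po-cmd.jar samples -c "$1" -a distsampling -props "$2" -s 10
}
```

For example, pick_samples mycollection mimetype, where mimetype stands in for a property that is present in your collection.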

The export mode is used to export the data in CSV format.

Usage: c3po export [options]
  Options:
  * -c, --collection
       The name of the collection
    -o, --outputdir
       The output directory where the profile will be stored
       Default: <empty string>

The remove mode is used to remove a collection.

Usage: c3po remove [options]
  Options:
  * -c, --collection
       The name of the collection

Advanced

C3PO relies on a few simple configuration parameters, such as the db name, db host and db port. Defaults are supplied within the jar, so you don't have to do anything. However, if you want to override them, create a file called .c3poconfig in your home directory and set the properties you want to change. C3PO will use the defaults for all properties that you omit. Here are the defaults:

#Application default properties.
c3po.persistence=default                         # the class provider for the persistence layer (or default)
c3po.controller.adaptors.count=4                 # the count of the adaptors
c3po.controller.consolidators.count=2            # the count of the consolidators
c3po.rule.infer_date_from_file_name=false        # a rule that tries to infer a date from the file names
c3po.rule.html_info_processing=false             # a rule that cleans up special fits meta data
c3po.rule.format_version_resolution=true         # a rule that fixes some errors in format version parsing
c3po.rule.empty_value_processing=true            # a rule that does not allow empty values
c3po.rule.create_element_identifier=true         # a rule that creates element identifiers if none are provided by the adaptor
c3po.adaptor.tika.version="unknown"              # the tika version (if tika files were processed)

#DB default Properties
db.host=localhost                                # the host where mongo is running
db.port=27017                                    # the port where mongo is listening
db.name=c3po                                     # the name of the db
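
A minimal sketch of creating an override file that changes only the database settings. The host, port and database name are placeholder values. Comments are kept on their own lines on the assumption that the file is parsed as a standard Java properties file, where '#' starts a comment only at the beginning of a line. The optional path argument is not something C3PO knows about; it exists only so the sketch can be tried without touching the home directory:

```shell
# Write a minimal override file. C3PO reads it from ~/.c3poconfig.
write_c3po_config() {
  target="${1:-$HOME/.c3poconfig}"
  # Refuse to clobber an existing configuration.
  if [ -e "$target" ]; then
    echo "refusing to overwrite $target" >&2
    return 1
  fi
  cat > "$target" <<'EOF'
# Local overrides; every property omitted here keeps its default.
db.host=mongo.example.org
db.port=27018
db.name=c3po_test
EOF
}
```

Running write_c3po_config with no argument creates ~/.c3poconfig if it does not exist yet.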

Web Application

v0.3.0

The Web App provides a UI for the data. It allows you to filter the data, select sample records and export data (XML profile and CSV), and it integrates with tools like PLATO and SCOUT.

Build and Deploy

The Web App is developed with the Play Framework and so you will have to decide how you want to deploy the application. There are two options:

  1. natively (with play)
  2. deploy a war in a servlet container

Note that version 0.3.0 uses Play 2.0.4, so make sure you install the correct version.

Native

If you want to run C3PO natively in Netty (provided by the Play Framework), you first have to install Play on your system. Afterwards, navigate to the webapi directory with cd ~/c3po/c3po-webapi and execute play clean compile stage. This will generate everything you need. Then run the generated start script with target/start &, which starts the app in production mode. Open a browser and navigate to localhost:9000/c3po. You should see the application running.
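
The native steps above can be sketched as a single shell function. It assumes play is on the PATH and that the sources live where the guide expects them. The function name is made up for this example:

```shell
# $1 = path to the c3po-webapi directory (the guide uses ~/c3po/c3po-webapi).
# The body runs in a subshell so the caller's working directory is unchanged.
run_native() (
  cd "$1" || exit 1
  # Generate the start script; stop if the build fails.
  play clean compile stage || exit 1
  # Launch the staged application in the background.
  target/start &
)
```

After run_native ~/c3po/c3po-webapi, the application should be reachable at localhost:9000/c3po.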

This has been tested under Linux and Unix, so Windows users might need to do something different. Please refer to the Play guide or deploy in a servlet container.

JBoss/Tomcat

Normally the war should be deployable in any servlet container; however, it has been tested only in JBoss 7.0.2, 7.1.0 and 7.1.1 and in Tomcat 7. If you want to try another version or another server follow this guide, but you might need to do some things differently. For more information also check this guide. (Thanks to dlecan)

If you choose JBoss, navigate to ~/c3po/c3po-webapi and execute play package. This will generate a .war in your target folder. Take the war and rename it to ROOT.war. (Read dlecan's guide for more information on why this is necessary.) Depending on the version of JBoss you have chosen, you will also have to make some minor adjustments to the JBoss standalone.xml configuration file. Refer to dlecan's guide for the actual steps (https://github.com/dlecan/play2-war-plugin/wiki/Deployment). Once you have done all this, copy the war into the JBoss deployment folder and start the server. Open a browser and navigate to localhost:8080/c3po. You should see the application running.
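
The packaging and copy steps can be sketched as follows. The function name is made up, the deployment folder must be given as an absolute path, and the exact file name of the generated war is not assumed: the sketch simply takes the first .war it finds in target. The standalone.xml adjustments are not covered here and still have to be done by hand:

```shell
# $1 = c3po-webapi directory, $2 = absolute path of the JBoss deployment folder.
# The body runs in a subshell so the caller's working directory is unchanged.
package_war() (
  deploy_dir=$2
  cd "$1" || exit 1
  play package || exit 1
  for war in target/*.war; do
    # If the glob matched nothing, $war is the literal pattern and not a file.
    [ -f "$war" ] || { echo "no .war found in target/" >&2; exit 1; }
    # Copy under the name ROOT.war, as the guide requires.
    cp "$war" "$deploy_dir/ROOT.war"
    break
  done
)
```

After this, start the server as described above.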

If you have any questions, contact me!
