This is the usage guide for C3PO 0.4.0. It will help you set up and run the command line application as well as the web app on a server of your choice. Please download the binaries from BinTray.
- Java 1.6
- MongoDB 2.0.5 or higher (64-bit)
- FITS (optional)
- An application server (Tomcat/Jetty/JBoss 7) (if you want to run the web app)
- Play Framework (if you want to run the web app)
Install Java, MongoDB (http://www.mongodb.org) and FITS (https://github.com/harvard-lts/fits), if you haven't already. Take note of the port on which the mongo daemon is running (27017 by default). Download the command line application from BinTray.
The c3po command line has several modes to choose from. To run c3po, use the following command:
java -jar c3po-cmd.jar
Run without a mode, this outputs an error message listing the modes that you can use. All available modes and their options are described here. Options marked with '*' are mandatory.
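The usage lines in this guide are written as c3po <mode>, while the application itself is started through the jar. The following is a minimal sketch of a shell function that bridges the two. It assumes that the mode and its options are passed as arguments after the jar name, which is inferred from the usage lines rather than stated outright. The names c3po, C3PO_JAR and JAVA_BIN are hypothetical helpers and not part of the distribution.

```shell
# Hypothetical convenience wrapper; not shipped with C3PO.
# C3PO_JAR: path to the downloaded command line jar.
# JAVA_BIN: the Java launcher to use.
C3PO_JAR="${C3PO_JAR:-c3po-cmd.jar}"
JAVA_BIN="${JAVA_BIN:-java}"

c3po() {
  # Assumption: the mode (help, gather, ...) follows the jar name.
  "$JAVA_BIN" -jar "$C3PO_JAR" "$@"
}

# Example calls (commented out so that sourcing this file runs nothing):
# c3po help
# c3po version
```

With the function defined in the current shell, the usage lines can be typed as they are printed.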
The help mode prints all the available modes and options.
Usage: c3po help
The version mode prints version information.
Usage: c3po version
The gather mode is used to read metadata into the mongo database.
Usage: c3po gather [options]
  Options:
  * -c, --collection
       The name of the collection
  * -i, --inputdir
       The input directory where the meta data is stored
    -r, --recursive
       Whether or not to gather recursively
       Default: false
    -t, --type
       Optional parameter to define the meta data type. Use one of 'FITS' or 'TIKA', to select the type of the input files. Default is FITS
       Default: FITS
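As an illustration of the gather options, here is a small sketch that reads a directory of FITS output into a collection. The function name gather_fits and the JAVA_BIN and C3PO_JAR variables are hypothetical, and the collection name and path in the example call are placeholders. It also assumes that -r is a plain switch that takes no value.

```shell
JAVA_BIN="${JAVA_BIN:-java}"
C3PO_JAR="${C3PO_JAR:-c3po-cmd.jar}"

# gather_fits COLLECTION INPUT_DIR
# Reads the FITS files below INPUT_DIR into the given collection.
gather_fits() {
  if [ ! -d "$2" ]; then
    echo "input directory not found: $2" >&2
    return 1
  fi
  # -r is assumed to be a switch without a value. -t FITS is the documented
  # default and is spelled out only for readability.
  "$JAVA_BIN" -jar "$C3PO_JAR" gather -c "$1" -i "$2" -r -t FITS
}

# Example call (placeholder names):
# gather_fits mycollection /data/fits-output
```

Checking the input directory first avoids starting the JVM only to fail on a mistyped path.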
The profile mode is used to generate a profile in XML format.
Usage: c3po profile [options]
  Options:
    -a, --algorithm
       The algorithm that will be used for selecting the samples records. Supported values are: 'sizesampling', 'syssampling', 'distsampling'
       Default: sizesampling
  * -c, --collection
       The name of the collection
    -ie, --includeelements
       If this flag is present, the profile will include a list of element identifiers. Note, that this might be a long list.
       Default: false
    -o, --outputdir
       The output directory where the profile will be stored
       Default: <empty string>
    -props, --properties
       The list of properties for the 'distsampling' algorithm
       Default:
    -s, --size
       The size of the samples set.
       Default: 5
The samples mode is used to select representative samples based on different strategies.
Usage: c3po samples [options]
  Options:
    -a, --algorithm
       The algorithm that will be used for selecting the samples records. Use one of 'sizesampling', 'syssampling', 'distsampling'
       Default: sizesampling
  * -c, --collection
       The name of the collection
    -o, --outputdir
       The output directory where the samples will be output. If nothing is provided the output is written to the console
    -props, --properties
       The list of properties for the 'distsampling' algorithm
       Default:
    -s, --size
       The size of the samples set.
       Default: 5
The export mode is used to export the data in CSV format.
Usage: c3po export [options]
  Options:
  * -c, --collection
       The name of the collection
    -o, --outputdir
       The output directory where the profile will be stored
       Default: <empty string>
The remove mode is used to remove a collection.
Usage: c3po remove [options]
  Options:
  * -c, --collection
       The name of the collection
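Once a collection has been gathered, the remaining modes can be chained. The sketch below generates a profile and a CSV export into the same output directory. The function name profile_and_export and the JAVA_BIN and C3PO_JAR variables are hypothetical, and the choice of 'syssampling' with 10 samples is only an example. Whether c3po creates a missing output directory itself is not stated here, so the sketch creates it up front.

```shell
JAVA_BIN="${JAVA_BIN:-java}"
C3PO_JAR="${C3PO_JAR:-c3po-cmd.jar}"

# profile_and_export COLLECTION OUTPUT_DIR
profile_and_export() {
  mkdir -p "$2" || return 1
  # XML profile using the 'syssampling' algorithm, 10 samples,
  # including the list of element identifiers (-ie).
  "$JAVA_BIN" -jar "$C3PO_JAR" profile -c "$1" -o "$2" -a syssampling -s 10 -ie || return 1
  # CSV export of the same collection.
  "$JAVA_BIN" -jar "$C3PO_JAR" export -c "$1" -o "$2" || return 1
}

# Example call (placeholder names):
# profile_and_export mycollection ./c3po-out
# A collection that is no longer needed can then be dropped with:
# java -jar c3po-cmd.jar remove -c mycollection
```

The remove call is left as a comment on purpose, since it deletes the collection.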
C3PO relies on some simple configuration parameters, like the db name, db host, db port, etc.
Defaults are supplied within the jar, so you don't have to do anything. However, if you want to override them, create a file called .c3poconfig in your home directory and set the properties you want to change. C3PO will use the defaults for all properties that you skip. Here are the defaults:
#Application default properties.
c3po.persistence=default # the class provider for the persistence layer (or default)
c3po.controller.adaptors.count=4 # the count of the adaptors
c3po.controller.consolidators.count=2 # the count of the consolidators
c3po.rule.infer_date_from_file_name=false # a rule that tries to infer a date from the file names
c3po.rule.html_info_processing=false # a rule that cleans up special fits meta data
c3po.rule.format_version_resolution=true # a rule that fixes some errors in format version parsing
c3po.rule.empty_value_processing=true # a rule that does not allow empty values
c3po.rule.create_element_identifier=true # a rule that creates element identifiers if none are provided by the adaptor
c3po.adaptor.tika.version="unknown" # the tika version (if tika files were processed)
#DB default Properties
db.host=localhost # the host where mongo is running
db.port=27017 # the port where mongo is listening
db.name=c3po # the name of the db
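A minimal sketch of creating an override file that changes only the database settings. The host, port and database name are made-up example values, and CONFIG_FILE and write_c3po_config are hypothetical names used here so that the target path can be changed; C3PO itself looks for .c3poconfig in the home directory. The comment is kept on its own line because the standard Java properties format treats a trailing '#' as part of the value. Whether C3PO's loader behaves the same way is an assumption.

```shell
# Hypothetical variable; defaults to the location C3PO reads.
CONFIG_FILE="${CONFIG_FILE:-$HOME/.c3poconfig}"

write_c3po_config() {
  # Never clobber an existing configuration.
  if [ -e "$CONFIG_FILE" ]; then
    echo "refusing to overwrite existing $CONFIG_FILE" >&2
    return 1
  fi
  cat > "$CONFIG_FILE" <<'EOF'
# Local overrides; every property not listed here keeps its default.
db.host=mongo.example.org
db.port=27018
db.name=c3po_test
EOF
}

# Example call:
# write_c3po_config
```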
Build and Deploy
The web app is developed with the Play Framework, so you have to decide how you want to deploy the application. There are two options:
- natively (with play)
- deploy a war in a servlet container
Note that version 0.3.0 uses Play 2.0.4, so make sure you install the correct version.
If you want to run C3PO natively in Netty (provided by the Play Framework), you have to install Play on your system. Afterwards, navigate to the webapi directory:
cd ~/c3po/c3po-webapi
Then execute the following command, which will generate everything you need:
play clean compile stage
Finally, run the generated start script, which runs the app in production mode:
target/start &
Open a browser and navigate to localhost:9000/c3po. You should see the application running.
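The native build steps can be wrapped in a small function that stops early if the start script was not produced. This is only a sketch: stage_webapp, PLAY_BIN and WEBAPI_DIR are hypothetical names, and the default directory is the one used in this guide.

```shell
PLAY_BIN="${PLAY_BIN:-play}"
WEBAPI_DIR="${WEBAPI_DIR:-$HOME/c3po/c3po-webapi}"

# Runs in a subshell so the caller's working directory is left untouched.
stage_webapp() (
  cd "$WEBAPI_DIR" || exit 1
  "$PLAY_BIN" clean compile stage || exit 1
  if [ ! -x target/start ]; then
    echo "target/start was not generated" >&2
    exit 1
  fi
  echo "staged in $WEBAPI_DIR; start the app with: target/start &"
)

# Example call:
# stage_webapp
```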
This is tested under Linux and Unix, so Windows users might need to do something different. Please refer to the play guide or deploy in a servlet container.
Normally the war should be deployable in any servlet container; however, it has only been tested in JBoss 7.0.2, 7.1.0 and 7.1.1 and in Tomcat 7. If you want to try another version or another server, follow this guide, but you might need to do some things differently. For more information also check this guide. (Thanks to dlecan)
If you choose JBoss, navigate to the webapi directory and build the war:
cd ~/c3po/c3po-webapi
play package
This will generate a .war in your target folder. Take the war and rename it to ROOT.war. (Read dlecan's guide for more information on why this is necessary.)
Depending on the version of JBoss you have chosen, you will also have to make some minor adjustments to the JBoss standalone.xml configuration file. Refer to dlecan's guide for the actual steps (https://github.com/dlecan/play2-war-plugin/wiki/Deployment).
Once you have done all this, just copy the war into the JBoss deployment folder and start the server.
Open a browser and navigate to localhost:8080/c3po. You should see the application running.
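The package, rename and copy steps for JBoss can be sketched as one function. The names deploy_war, PLAY_BIN, WEBAPI_DIR and JBOSS_DEPLOY_DIR are hypothetical, and the default deployment path is an assumption that has to be adjusted to your installation. Since this guide does not name the generated file, the sketch picks the first .war it finds in the target folder. The standalone.xml adjustments are not covered here.

```shell
PLAY_BIN="${PLAY_BIN:-play}"
WEBAPI_DIR="${WEBAPI_DIR:-$HOME/c3po/c3po-webapi}"
# Assumption: adjust this to the deployment folder of your JBoss installation.
JBOSS_DEPLOY_DIR="${JBOSS_DEPLOY_DIR:-$HOME/jboss/standalone/deployments}"

# Runs in a subshell so the caller's working directory is left untouched.
deploy_war() (
  cd "$WEBAPI_DIR" || exit 1
  "$PLAY_BIN" package || exit 1
  # Take the first .war in target/, whatever it is called.
  set -- target/*.war
  if [ ! -f "$1" ]; then
    echo "no .war found in target/" >&2
    exit 1
  fi
  # Copy and rename in one step.
  cp "$1" "$JBOSS_DEPLOY_DIR/ROOT.war"
)

# Example call:
# deploy_war
```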
If you have any questions, contact me!