The toolspec describes tools and operations in a machine-readable format. It can be used to generate artefacts with the Toolwrapper and to execute a tool on a Hadoop cluster. The Photohawk toolspec defines operations for each supported algorithm. The component spec defines additional information and post-processing steps to bundle the operations as Taverna components.
The Toolwrapper uses the toolspec (and optionally the component spec) to generate artefacts for each operation:
- Wrapper scripts allow simplified invocation from the command line
- Taverna workflows allow combining tools using a graphical interface and executing them with the Taverna execution engine
- Components - enriched workflows - allow easy discovery, composition and execution
- Debian packages allow simple installation
Furthermore, the Toolwrapper supports publishing the generated components to the workflow-sharing platform myExperiment.
Generating the artefacts
- Install the Toolwrapper according to the installation instructions
- Generate the wrapper scripts and workflows using:
toolwrapper-bash-generator/bin/generate.sh -t digital-preservation-qaobject-image-photohawk-changedetection.xml -c digital-preservation-qaobject-image-photohawk-changedetection.component -o out
- Copy the photohawk-commandline jar
- Generate the Debian package containing the artefacts:
toolwrapper-bash-debian-generator/bin/generate.sh -t digital-preservation-qaobject-image-photohawk-changedetection.xml -o debian -i out/ -ch digital-preservation-qaobject-image-photohawk-changedetection.changelog -e firstname.lastname@example.org -d digital-preservation-qaobject-image-photohawk-changedetection -a -desc "Pure Java image comparison"
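Both generator commands read files by relative path, so a missing toolspec, component spec or changelog only shows up once a generator fails. The following is a minimal sketch of a pre-flight check. It assumes all three files share the base name used above and sit in one directory; the `check_inputs` helper is hypothetical and not part of the Toolwrapper.

```shell
#!/bin/sh
# Base name shared by the files passed to the generator commands.
BASE="digital-preservation-qaobject-image-photohawk-changedetection"

# Hypothetical helper: report which generator inputs are missing in the
# given directory. Succeeds only if the .xml, .component and .changelog
# files are all present.
check_inputs() {
    dir="$1"
    missing=0
    for ext in xml component changelog; do
        if [ ! -f "$dir/$BASE.$ext" ]; then
            echo "missing: $dir/$BASE.$ext" >&2
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}

if check_inputs .; then
    echo "all generator inputs found"
else
    echo "some generator inputs are missing" >&2
fi
```

Running the check from the directory in which the generators are invoked catches a misplaced file before any artefacts are written to `out` or `debian`.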
The input file specifies the toolspec and operation as well as the parameters needed for the execution, one line per invocation. For example, to run SSIM on two images, create a file input.txt containing the name of the toolspec (digital-preservation-qa-image-photohawk-changedetection), the name of the SSIM operation (digital-preservation-qaobject-image-photohawk-ssim) and the input parameters:
digital-preservation-qa-image-photohawk-changedetection digital-preservation-qaobject-image-photohawk-ssim --leftimage="hdfs:///user/you/input/image1.dng" --rightimage="hdfs:///user/you/input/image1.tiff"
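Since the input file holds one line per invocation, a larger comparison job is simply a longer file. As a rough sketch, the lines can be generated from a list of image base names. The toolspec name, operation name and HDFS directory are taken from the example line; the `make_input` helper, the second image name `image2` and the assumption that every pair is stored as `<name>.dng` and `<name>.tiff` are illustrative only.

```shell
#!/bin/sh
# Names taken from the example invocation line.
TOOLSPEC="digital-preservation-qa-image-photohawk-changedetection"
OPERATION="digital-preservation-qaobject-image-photohawk-ssim"
INPUT_DIR="hdfs:///user/you/input"

# Hypothetical helper: print one invocation line per image base name.
# Assumes each pair is stored as <name>.dng and <name>.tiff in INPUT_DIR.
make_input() {
    for name in "$@"; do
        printf '%s %s --leftimage="%s/%s.dng" --rightimage="%s/%s.tiff"\n' \
            "$TOOLSPEC" "$OPERATION" \
            "$INPUT_DIR" "$name" "$INPUT_DIR" "$name"
    done
}

# image2 is a made-up second pair for illustration.
make_input image1 image2 > input.txt
```

The first generated line is identical to the example line, and each further base name appends one more SSIM invocation.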
To execute the job, call:
hadoop jar tomar-1.4.2-SNAPSHOT-jar-with-dependencies.jar -i input.txt -r hdfs:///user/you/toolspecs
Note that the toolspec and input files must be stored on HDFS.
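One possible way to get the files onto HDFS is the standard `hadoop fs` shell, sketched below. Several points are assumptions rather than documented behaviour: the toolspec file is the XML used in the generation step and lies in the current directory, the file name expected by the job is not stated here, and `-mkdir -p` requires Hadoop 2 or later. The `run` wrapper and `DRY_RUN` variable are hypothetical. By default the script only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Dry run by default: each command is printed instead of executed.
# Set DRY_RUN=0 to actually run the commands.
DRY_RUN="${DRY_RUN:-1}"

# Directory passed to -r in the job command.
TOOLSPEC_DIR="hdfs:///user/you/toolspecs"
# Assumed toolspec file, taken from the generation step.
TOOLSPEC_FILE="digital-preservation-qaobject-image-photohawk-changedetection.xml"

# Hypothetical wrapper: print the command in dry-run mode, else run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create the toolspec directory and upload the toolspec.
run hadoop fs -mkdir -p "$TOOLSPEC_DIR"
run hadoop fs -put "$TOOLSPEC_FILE" "$TOOLSPEC_DIR/"
# A relative destination resolves against the user's HDFS home directory.
run hadoop fs -put input.txt input.txt
```

The images referenced in input.txt are already given as `hdfs://` URLs in the example, so they are not uploaded here.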