CATE

CATE helps you discover, organize, and publish taxonomic information

Features

Modern spring-boot application
Easy to install
Cloud-native design
Multi-tenant
Built around open standards like Darwin-Core Archive
Imports and exports data formats used by taxonomists, such as DELTA, SDD, BibTeX and CSV
Integrates with other web services

Installation

CATE is distributed as an rpm, hosted in a yum repository on bintray.

$ wget https://bintray.com/benjaminrclark/rpm/rpm -O /etc/yum.repos.d/bintray-benjaminrclark-rpm.repo

$ yum install cate

Starting CATE

Once CATE is installed it can be started without any further configuration, running by default in 'embedded' mode.

This means that it will use an in-memory database, a local solr server and the local filesystem. To configure CATE to use other services and/or persist data to a specific location, see Configuration below.

As a service

$ chkconfig --level 345 cate on

$ service cate start

As a java process

$ java -jar /var/lib/cate/cate.jar

Configuration

CATE requires Java Development Kit 1.7 (either Oracle JDK or OpenJDK). CATE uses FFMPEG and ImageMagick to process multimedia files, so both must be installed locally on the server. These are declared as dependencies of the package and are verified and installed automatically when you install the rpm.

CATE as a system depends upon a number of other services. The location and configuration of these services is relatively flexible. By default, CATE will run in embedded mode, meaning that no other external services are required.

CATE follows the approach used by spring boot in externalizing its configuration parameters. Many of the configuration properties are generic properties defined by spring-boot; not all of them are listed below, but they can be found in the spring-boot documentation. Others are specific to CATE.
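As an illustration of externalized configuration, the sketch below writes a properties file and prints a launch command that points at it. It assumes CATE honours Spring Boot's standard `spring.config.location` option, which seems likely given the approach described above but is not confirmed here. `CATE_CONF_DIR` is a hypothetical variable used only in this example, and `server.port` is one of the generic spring-boot properties.

```shell
# Directory for the example configuration; CATE_CONF_DIR is a hypothetical
# variable, and a temporary directory is used when it is not set.
CATE_CONF_DIR="${CATE_CONF_DIR:-$(mktemp -d)}"

# Write a minimal properties file. server.port is a generic spring-boot
# property; add any of the properties described in this README.
cat > "$CATE_CONF_DIR/application.properties" <<'EOF'
server.port=8080
EOF

# Print the launch command instead of running it, so it can be reviewed first.
echo "java -jar /var/lib/cate/cate.jar --spring.config.location=$CATE_CONF_DIR/application.properties"
```

Individual properties can also be passed directly on the command line in the form `--name=value`, which is handy for one-off overrides.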

In addition to specifying configuration properties, you can also enable optional services through the use of spring profiles. To run CATE as a single application server, the default profile is sufficient. To run multiple CATE application servers behind a load balancer (e.g. nginx / apache / elb), include the 'cluster' profile e.g.

spring.profiles.active=default,cluster

To run CATE on Amazon Web Services (http://aws.amazon.com), the 'aws' profile should be enabled.

spring.profiles.active=aws,cluster
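The profile choices above can be sketched as a small helper. `cate_profiles` is a hypothetical function, not part of CATE, and it returns `aws,cluster` for every AWS deployment only because that is the one AWS combination shown in this README. The result is exported as `SPRING_PROFILES_ACTIVE`, the environment variable Spring Boot reads for active profiles.

```shell
# Pick a value for spring.profiles.active from the deployment shape.
# Hypothetical helper; arguments:
#   $1  platform: "local" or "aws"
#   $2  number of application servers
cate_profiles() {
  platform=$1
  servers=$2
  case "$platform" in
    aws)
      # The only AWS combination shown in this README.
      echo "aws,cluster"
      ;;
    local)
      if [ "$servers" -gt 1 ]; then
        echo "default,cluster"
      else
        echo "default"
      fi
      ;;
    *)
      echo "unknown platform: $platform" >&2
      return 1
      ;;
  esac
}

# Example: three application servers behind a load balancer, not on AWS.
SPRING_PROFILES_ACTIVE=$(cate_profiles local 3)
export SPRING_PROFILES_ACTIVE
echo "$SPRING_PROFILES_ACTIVE"
```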

The aws profile uses the standard properties for the database, solr and redis but is able to make use of Amazon Relational Database Service (RDS) and Amazon Elasticache in place of non-Amazon services. In addition, the aws profile uses Amazon Simple Storage Service (S3) in place of a shared filesystem, and Amazon Simple Notification Service / Simple Queue Service in place of activemq. The parameters required to configure these components are listed under the heading AWS.

A sample Amazon CloudFormation template for a CATE cluster can be found in src/main/resources/cfn/cate.cnf.

Database

CATE uses a relational database as the canonical data store. Currently it is able to make use of H2 or MySQL. The properties used to configure it are standard spring-boot properties.

spring.datasource.url=jdbc:mysql://localhost:3306/cate
spring.datasource.driver-class-name=com.mysql.jdbc.Driver
spring.datasource.username=root
spring.datasource.password=
liquibase.contexts=mysql

Solr

CATE uses solr to power the free-text search and faceting.

solr.server.url=http://localhost:8983/solr
solr.server.class=org.apache.solr.client.solrj.impl.HttpSolrServer
solr.connection.timeout=100
solr.so.timeout=3000

Redis

CATE uses a redis database to store http session data when running in clustered mode. N.B. When running on AWS, CATE will attempt to discover the port and address of an elasticache cluster running in the same account.

spring.redis.database=0
spring.redis.host=localhost
spring.redis.port=6379
spring.redis.password=

Filesystem / Object Store

CATE stores files in a shared filesystem mounted on the application server, or in S3 when running on AWS.

Local Filesystem / Network-Attached Shared Filesystem

upload.file.directory=/mnt/cate/upload
static.file.directory=/mnt/cate/static
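A minimal sketch for preparing these directories is shown below. It assumes the directories must exist and be writable by the user that runs CATE before the application starts, which this README does not state explicitly. `prepare_cate_dirs` and `CATE_FILE_ROOT` are hypothetical names used only in this example.

```shell
# Create the upload and static directories under a root directory and
# print the matching property lines. Hypothetical helper, not part of CATE.
prepare_cate_dirs() {
  root=$1
  mkdir -p "$root/upload" "$root/static" || return 1
  # Fail early if the current user cannot write to either directory.
  [ -w "$root/upload" ] && [ -w "$root/static" ] || return 1
  printf 'upload.file.directory=%s/upload\n' "$root"
  printf 'static.file.directory=%s/static\n' "$root"
}

# Uses a temporary directory unless CATE_FILE_ROOT is set, for example
# CATE_FILE_ROOT=/mnt/cate to match the paths shown above.
prepare_cate_dirs "${CATE_FILE_ROOT:-$(mktemp -d)}"
```

In a cluster, the root would be the mount point of the shared filesystem on every application server.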

AWS S3

cloudformation.uploadBucketArn

Messaging

CATE uses messaging to distribute tenant events across nodes in the cluster and to queue batch jobs. It uses activemq, or SNS / SQS when running on AWS.

ActiveMQ

spring.activemq.broker-url=tcp://localhost:61616
spring.activemq.in-memory=false
spring.activemq.user=
spring.activemq.password=

AWS SNS & SQS

cloudformation.topicArn
cloudformation.queueArn

Email

Design

CATE is a web application which is designed to work at scale, deployed on virtual servers, and supporting many users and tenant projects.

In terms of scalability, it's worth noting that the CATE application itself, and the application server it runs on, are not stateful. State is held in the following services:

Data: The relational database holds the canonical data, and a denormalized copy is held in solr
Media: Media files are held in the object store (either NAS or S3) and are fetched to the application server as required. They are served to clients directly from the store
Session: CATE stores session state in a redis key-value store.

Events (job requests and tenant events) are distributed using a message broker. Tenant events are distributed to all instances using a topic. Job requests are distributed across application servers using a single queue which is polled by all servers.
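The difference between the two delivery patterns can be shown with a toy model. This is not CATE code: the server names are hypothetical, and the job number is used to pick a server only to keep the output predictable, whereas real servers compete for jobs by polling the queue.

```shell
# Toy model of topic and queue delivery; server names are hypothetical.
SERVERS="app1 app2 app3"

# Topic: every server receives its own copy of a tenant event.
publish_tenant_event() {
  for server in $SERVERS; do
    printf '%s received tenant event: %s\n' "$server" "$1"
  done
}

# Queue: each job is taken by exactly one server. The job number selects
# the server here; in a real deployment whichever server polls first wins.
dispatch_job() {
  job_number=$1
  set -- $SERVERS                 # positional parameters now hold the server names
  shift $(( job_number % $# ))    # move the chosen server to the front
  printf '%s took job %s\n' "$1" "$job_number"
}

publish_tenant_event "tenant-created"
dispatch_job 0
dispatch_job 1
```

The tenant event is printed once per server, while each job is printed once in total.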
