Skip to content

WaCoDiS/core-engine

Repository files navigation

WaCoDiS Core Engine

Build
The WaCoDiS Core Engine component provides core functionalities such as job scheduling, evaluation and execution.

Table of Content

  1. WaCoDiS Project Information
  2. Overview
  3. Installation / Building Information
  4. User Guide
  5. Developer Information
  6. Contact
  7. Credits and Contributing Organizations

WaCoDiS Project Information

Climate changes and the ongoing intensification of agriculture effect in increased material inputs in watercourses and dams. Thus, water industry associations, suppliers and municipalities face new challenges. To ensure an efficient and environmentally friendly water supply for the future, adjustments on changing conditions are necessary. Hence, the research project WaCoDiS aims to geo-locate and quantify material outputs from agricultural areas and to optimize models for sediment and material inputs (nutrient contamination) into watercourses and dams. Therefore, approaches for combining heterogeneous data sources, existing interoperable web based information systems and innovative domain oriented models will be explored.

Architecture Overview

The WaCoDiS project aims to exploit the great potential of Earth Observation (EO) data (e.g. as provided by the Copernicus Programme) for the development of innovative water management analytics service and the improvement of hydrological models in order to optimize monitoring processes. Existing SDI based geodata and in-situ data from the sensors that monitor water bodies will be combined with Sentinel-1 and Sentinel-2 data. Therefore, the WaCoDiS monitoring system is designed as a modular and extensible software architecture that is based on interoperable interfaces such as the Open Geospatial Consortium (OGC) Web Processing Service. This allows a sustainable and flexible way of integrating different EO processing algorithms. In addition, we consider architectural aspects like publish/subscribe patterns and messaging protocols that increase the effectiveness of processing big EO data sets. Up to now, the WaCoDiS monitoring system comprises the following components:

Job Manager
A REST API enables users to define job descriptions for planning the execution of analysis processes.

Core Engine
The Core Engine schedules jobs for planned process executions based on the job descriptions. In addition, it is responsible for triggering WPS processes as soon as all required process input data is available.

Datasource Observer
Several observing routines requests certain datastores for new available data, such as in-situ measurements, Copernicus satellite data, sdi based geodata and services or meteorological data that are required for process executions.

Data Discovery
Data Discovery comprises two components: the WaCoDiS Data Access API and the WaCoDiS Metadata Connector.
Metadata about all incoming, available datasets discovered by the Datasource Observer is handled and bundled by the WaCoDiS Data Access API. For the purpose of defining process inputs, the WaCoDiS Data Access API generates references to the required datasets from the metadata and provides these references to the Core Engine via a REST API. The WaCoDis Data Access API uses a instance of Elasticsearch search engine to store the metdata about all available datasets.
To provide an asynchronous Publish/Subscribe pattern, the WaCoDiS Metadata Connector will listen for new datasets and then interacts with the REST API of WaCoDiS Data Access API.

Web Processing Service
The execution of analysis processes provided by EO Tools is encapsulated by a OGC Web Processing Service (WPS), which provides a standardized interface for this purpose. Therefore a custom backend for the 52°North javaPS implementation provides certain preprocessing and execution features.

Product Listener
A Product Listener will be notified as soon as any analyis process has finished and a new earth observation product is available. The component will fetch the product from the WPS and routes it to one or more specific backends (e.g. GeoServer, ArcGIS Image Server) that provides a certain service for the user to access the product.

Product Importer
For each product service backend a certain helper component will import the earth observation product into the specific backend's datastore and may set up a service on top of it. The product importer can be provided as part of the Product Listener or can be provided as an external component (e.g. a python script for importing porduct into the ArcGIS Image Server).

Job Status Listener
The WaCoDiS Job Status Listener is intermediate component that consumes messages (published by the Core Engine) on execution progress of processing jobs. According to the job's progress the Job Status Listener updates the status in the job's describtion by interacting with the REST API of the Job Manager.

Diagram of WaCoDiS high level architecture

The WaCoDiS monitoring system architecture is designed in a modular fashion and follows a publish/subscribe pattern. The different components are loosely connected to each other via messages that are passed through a message broker. Each module subscribes to messages of interest at the message broker. This approach enables an independent and asynchronous handling of specific events.

The messages exchanged via message broker follow a domain model that has been defined by the OpenAPI specification. You can find these definitions and other documentation in the apis-and-workflows repo.

Overview

The WaCoDiS Core Engine is the system component that implements the core processing workflow. This processing workflow comprises the following three stepps:

  1. scheduling of processing jobs (jobs).
  2. evalutation whether all necessary input data is available to start a processing job (job evaluation)
  3. initiation of the execution of the processing job by sending a request to the processing environment or rather the web processing service (job execution)

The core engine is also responsible for keeping other WaCoDiS components updated about the progress of processing jobs. The core engine publishes messages via the sytem's message broke if a processing job is started, failt or executed sucessfully.

Core Data Types

  • Job
    A WacodisJobDefinition (Job) describes a processing that is to be executed automatically according to a defined schedule. The WacodisJobDefinition contains (among other attributes) the input data required for execution, as well as the time frame and area of interest.
  • DataEnvelope
    The metadata about an existing dataset is described by the AbstractDataEnvelope (DataEnvelope) data format. There are different subtypes for different data sources (e.g SensorWebDataEnvelope or CopernicusDataEnvelope).
  • Resource
    Access to the actual data records is described by the AbstractResource (Resources) data format. There are the subtypes GetResources and PostResources. A GetResources contains only a URL while a PostResource contains a URL and a body for a HTTP-POST request.
  • SubsetDefinition
    The required inputs of a job are described by the data format AbstractSubsetDefinition (SubsetDefinition). There are different subtypes for different types of input data (e.g SensorWebSubsetDefinition or CopernicusSubsetDefinition). There is usually a subtype of AbstractSubsetDefinition that corresponds to a subtype of AbstractDataEnvelope.

The formal definition of these data types is done with OpenAPI and is available in the apis-and-workflows repo.

Modules

The WaCoDiS Core Engine comprises three Maven modules:

  • WaCoDiS Core Engine Data Models
    This module contains Java classes that reflect the basic data model. This includes the data types specified with OpenAPI in the WaCoDiS apis-and workflows repository. All model classes were generated by the OpenAPI Generator, which is integrated and can be used as Maven Plugin within this module.
  • WaCoDiS Core Engine Job Scheduling
    This module is responsible for initiating the evaluation (and execution) of a processing job according to cron pattern provided in the job's definition (WacodisJobDefinition). The scheduling is implemented based on the Java API Quartz.
  • WaCoDiS Core Engine Evaluator
    This module is responsible for checking if a scheduled job is actually executable; that means checking if all necessary input data is available. If not, the evaluator waits until new data sets become available a re-evaluates the executably of the given processing job.
  • WaCoDiS Core Engine Executor
    If a processing job is executable the executor module is responsible for putting together a execution request (containing all input parameters) and submitting this request to the processing environment. The Core Engine Executor keeps track of the job's execution progress and publishes messages via the system's message broker when a processing job is executed sucessfully or processing has failed.
  • WaCoDiS Core Engine App
    Since WaCoDiS Core Engine is implemented as Spring Boot application, the App module provides the application runner as well as default externalized configurations.
  • WaCoDiS Core Utils
    The Core Engine Utils module contains classes that implement routines used in multiple of the above modules.

Technologies

  • Java
    WaCoDiS Core Engine is tested with Oracle JDK 8 and OpenJDK 8. Unless stated otherwise later Java versions can be used as well.
  • Maven
    This project uses the build-management tool Apache Maven
  • Spring Boot
    WaCoDiS Core Engine is a standalone application built with the Spring Boot framework. Therefore, it is not necessary to deploy WaCoDiS Data Access manually with a web server.
  • Spring Cloud
    Spring Cloud is used for exploiting some ready-to-use features in order to implement an event-driven workflow. In particular, Spring Cloud Stream is used for subscribing to asynchronous messages within the WaCoDiS system.
  • RabbitMQ
    For communication with other WaCoDiS components of the WaCoDiS system the message broker RabbitMQ is utilized. RabbitMQ is not part of WaCoDiS Core Engine and therefore must be deployed separately.
  • OpenAPI
    OpenAPI is used for the specification of data models used within this project.
  • Quartz
    Quartz is a Java API for execution of recurrent, regular tasks based on a Cron definition (crontab). Quartz is used for scheduling the processing jobs.

Installation / Building Information

Build from Source

In order to build the WaCoDiS Core Engine from source Java Development Kit (JDK) must be available. Currently, WaCoDiS Core Engine (or its dependencies) can only be built with Java JDK8. There is no guarantee that the build process works with other versions of java. Since this is a Maven project, Apache Maven must be available for building it.

Before you build the Core Engine project, it is necessary to clone and build the WaCoDiS fork of the 52°North WPS Client Lib.

  1. clone the project: git clone https://github.com/WaCoDiS/wps-client-lib.git
  2. run mvn clean install in WPS Client Lib project's root folder to build the project and install the artifacts in your local Maven repository.
    As mentioned above, the build process of the WPS Client Lib might fail with other java versions than JDK8.

After you built the WPS Client Lib project, you can build the Core Engine project by running mvn clean install from (Core Engine) root directory.

Build using Docker

The project contains a Dockerfile for building a Docker image. Simply run docker build -t wacodis/core-engine:latest . in order to build the image. You will find some detailed information about running the Core Engine as Docker container within the deployment section.

User Guide

Deployment

This section describes deployment scenarios, options and preconditions.

Preconditions

  • (without using Docker) In order to run WaCoDiS Core Engine Java Runtime Environment (JRE) (version >= 8) must be available. In order to build Job Definition API from source Java Development Kit (JDK) version >= 8) must be abailable. Core Engine is tested with Oracle JDK 8 and OpenJDK 8.
  • In order to receive message about newly available data sets (job evalutation) and to publish message about processing progress (job execution) a running instance a running instance of RabbitMQ message broker must be available.
  • A running instance of WaCoDiS Job Manager must be available because during the job scheduling the Core Engine retrieves detailed job information by consuming the Job Manager's REST API.

The server addresses are configurable.

Run with Maven

Just start the application by running mvn spring-boot:run from the root of the core-engine-app module. Make sure you have installed all dependencies with mvn clean install from the project root.

Run with Docker

  1. Build Docker Image from Dockerfile that resides in the project's root folder.
  2. Run created Docker Image.

Alternatively, latest available docker image (automatically built from master branch) can be pulled from Docker Hub. See WaCoDiS Docker repository for pre-configured Docker Compose files to run WaCoDiS system components and backend services (RabbitMQ and Elasticsearch).

Configuration

Configuration is fetched from WaCoDiS Config Server. If config server is not available, configuration values located at src/main/resources/application.yml within the Core Engine App submodule are applied instead.

Parameters

The following section contains descriptions for configuration parameters structured by configuration section.

spring/cloud/stream/bindings/input-data-envelope

configuration of message chanel for receiving messages on newly available datasets

value description note
destination topic used to receive messages on newly available datasets, must be algined with Data Access API config e.g. wacodis.test.jobs.data.accessible
binder defines the binder (message broker)
content-type content type of DataEnvelope acknowledgement messages (mime type) should always be application/json
spring/cloud/stream/bindings/job-creation

configuration of message chanel for receiving messages when a new job is created

value description note
destination topic used to publish messages about created WaCoDiS jobs e.g. wacodis.test.jobs.new
binder defines the binder (message broker)
content-type content type of the messages should always be application/json
spring/cloud/stream/bindings/job-deletion

configuration of message chanel for receiving messages when an existing job is deleted

value description note
destination topic used to publish message about deleted WaCoDiS jobs e.g. wacodis.test.jobs.deleted
spring/cloud/stream/bindings/toolFinished

configuration of message chanel for publishing messages on successfully executed processing jobs

value description note
destination topic used to publish message about deleted WaCoDiS jobs e.g. wacodis.test.tools.finished
spring/cloud/stream/bindings/toolExecution

configuration of message chanel for publishing messages when a processing job is started (submitted to the processing environment)

value description note
destination topic used to publish message about deleted WaCoDiS jobs e.g. wacodis.test.tools.execute
spring/cloud/stream/bindings/toolExecution

configuration of message chanel for publishing messages when a processing job is started (submitted to the processing environment)

value description note
destination topic used to publish message about deleted WaCoDiS jobs e.g. wacodis.test.tools.execute
spring/rabbitmq

parameters related to WaCoDis message broker

value description note
host RabbitMQ host (WaCoDiS message broker) e.g. localhost
port RabbitMQ port (WaCoDiS message broker) e.g. 5672
username RabbitMQ username (WaCoDiS message broker)
password RabbitMQ password (WaCoDiS message broker)
spring/scheduler

configuration parameters related to the job scheduling module

value description note
timezone time zone used for job scheduling e.g. Europe/Berlin
jobrepository/uri base url of the necessary instance of Job Management API e.g. http://localhost:8081
spring/evaluator

configuration parameters related to the job evaluation module

value description note
matching/preselectCandidates if true, a waiting processing job (missing input data) is only re-evaluated if a new data set with matching source type becomes available boolean, if set to false requests do Data Access API will increase because every new data set triggers re-evaluation of every waiting job
dataaccess/uri url of the resource/search endpoint of the necessary instance of Data Access API e.g. http://localhost:8082/resources/search
spring/executor/wps

configuration parameters related to the job executor module, precisely the interacting with the web processing service (WPS, processing environment)

value description note
uri base url of the WaCoDiS (OGC) WPS e.g. http://localhost:8083/wps
version version of the OGC WPS standard that is implemented by the WaCoDiS WPS e.g. 2.0.0
defaultResourceMimeType (optional) define mime type for every input resource default is text/xml
defaultResourceSchema/name (optional) define schema for every input resource default is GML3, always provide name and schemaLocation
defaultResourceSchema/schemaLocation (optional) define schema for every input resource default is http://schemas.opengis.net/gml/3.1.1/base/feature.xsd
maxParallelWPSProcessPerJob max. number of parallel wps processes started per wacodis processing job only applicable if wacodis job is splitted in multiple wps processes (see WacodisJobDefinition.executionSettings.executionMode), e.g. 3
processInputsDelayed delay between start of multiple wps processes per wacodis processing job only applicable if wacodis job is splitted in multiple wps processes (see WacodisJobDefinition.executionSettings.executionMode), boolean, e.g. false
initialDelay_Milliseconds delay before the first wps process is started applicable if processInputDelayed is true, e.g. 0
delay_Milliseconds delay between separate wps processes are started applicable if processInputDelayed is true, e.g. 900000
delay_Milliseconds if true, squential processing of multiple wps process per wacodis processing job overrides maxParallelWPSProcessPerJob (effectively same as maxParallelWPSProcessPerJob = 1), only applicable if wacodis job is splitted in multiple wps processes (see WacodisJobDefinition.executionSettings.executionMode), boolean
spring/datasource.core.quartz-data-source

Per default, Quartz scheduler makes use of an in-memory job store. In order to configure a JDBC-based store, Core Engine provides a DataSource bean which can be configured using spring.datasource.core.quartz-data-source. To configure the DataSource just follow the Spring Boot guide.

spring/quartz

Quartz scheduler related beans are provided using spring-boot-starter-quartz. Hence, the scheduler can be configured via externalized configuration using spring.quartz.properties. Just set the usual Quartz configuration properties.

Developer Information

How to Contribute

Extending Core Engine

New Types of DataEnvelope and SubsetDefinition

WaCoDiS Core Engine and WaCoDiS Data Access must be modified if new types of DataEnvelope or SubsetDefintion are added to Wacodis schemas in order to support the newly introduced data types. See the Wiki for further information.

Pending Developments

Support for POSTResources

The Core Engine Executor is currently only able to submit GetResources to the WaCoDiS processing environment. Currently the Core Engine cannot handle POSTResources.

Branching

The master branch provides sources for stable builds. The develop branch represents the latest (maybe unstable) state of development.

License

Apache License, Version 2.0

Contributing developers

Name Organization Mail
Arne Vogt 52° North GmbH arnevogt
Sebastian Drost 52° North GmbH SebaDro
Matthes Rieke 52° North GmbH matthesrieke

Contact

The WaCoDiS project is maintained by 52°North GmbH. If you have any questions about this or any other repository related to WaCoDiS, please contact wacodis-info@52north.org.

Credits and Contributing Organizations

  • Department of Geodesy, Bochum University of Applied Sciences, Bochum
  • 52° North Initiative for Geospatial Open Source Software GmbH, Münster
  • Wupperverband, Wuppertal
  • EFTAS Fernerkundung Technologietransfer GmbH, Münster

The research project WaCoDiS is funded by the BMVI as part of the mFund programme

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Packages

No packages published

Contributors 4

  •  
  •  
  •  
  •