Extending Data Access

arnevogt edited this page Mar 17, 2020 · 10 revisions

Integrate new DataEnvelopes and SubsetDefinitions

This section describes the code changes required so that WaCoDiS Data Access can handle new DataEnvelopes and SubsetDefinitions.


Precondition:
Java classes must be generated for the new data types that have been added to the schema. This can be done by executing mvn clean install -P download-generate-models for the Wacodis Data Access Models module (Maven activates profiles with an uppercase -P). To avoid conflicts, the newly generated classes should be added to the specific (feature) branch in the remote repository.


Filter Provider

All classes and methods mentioned below are part of the Wacodis Data Access DataWrapper module.

  1. Implement filter providers for new data types:
    Usually there is a corresponding DataEnvelope type for a new SubsetDefinition (and vice versa). An (Elasticsearch) filter provider must be implemented for each of the two data types (DataEnvelope and SubsetDefinition). A filter provider links a data type to Elasticsearch by expressing it as query conditions. A filter provider for DataEnvelopes contains the conditions under which two DataEnvelopes are considered equal (used for the /dataAccess/dataenvelopes/search endpoint of the Data Access API). The matching of two DataEnvelopes must not include the DataEnvelope's identifier, because the identifier is assigned by the Elasticsearch backend. A filter provider for SubsetDefinitions contains the conditions under which a DataEnvelope (available data set) fits a SubsetDefinition (input description) (used for the /dataAccess/resources/search endpoint of the Data Access API). The interfaces DataEnvelopeElasticsearchFilterProvider and SubsetDefinitionElasticsearchFilterProvider must be implemented. The methods buildFiltersForDataEnvelope and buildFiltersForSubsetDefinition must return a list of QueryBuilder objects that contains one QueryBuilder for each condition that must be matched. The existing implementations of both interfaces can be used as a reference for the new classes.
  2. Edit FilterProviderFactory:
    In class FilterProviderFactory, the methods getFilterProviderForDataEnvelope (DataEnvelope) and getFilterProviderForSubsetDefinition (SubsetDefinition) must be edited. If the subset/envelope is an instance of the new data type, the method must return an instance of the matching filter provider implemented in step 1. To do so, extend the if-else block(s).
  3. Edit DataEnvelopeJsonDeserializerFactory:
    In class DataEnvelopeJsonDeserializerFactory, the method getObjectMapper must be edited. If jsonDataEnvelope (a JSON-formatted DataEnvelope) is an instance of the new DataEnvelope type (determined by srcType), an instance of that DataEnvelope type (Java object) must be assigned to the variable typeReference. To do so, extend the switch statement. This step ensures that the data stored in Elasticsearch is returned as correctly formatted JSON.
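
The three steps above can be sketched in plain Java as follows. This is only an illustration under stated assumptions: all types are simplified stand-ins defined in the snippet itself (Condition replaces Elasticsearch's QueryBuilder, EnvelopeFilterProvider replaces DataEnvelopeElasticsearchFilterProvider), and MyNewDataEnvelope, its attribute datasetId, the field names and the srcType value are hypothetical. The real interfaces and factory methods in the DataWrapper module may have different signatures.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, heavily simplified stand-ins for the generated model classes.
abstract class AbstractDataEnvelope {
    String identifier;  // assigned by the Elasticsearch backend, never part of a match
    String sourceType;
}

class MyNewDataEnvelope extends AbstractDataEnvelope {
    String datasetId;   // hypothetical type-specific attribute
}

// Stand-in for Elasticsearch's QueryBuilder: an exact match on one field.
// Real code would return e.g. QueryBuilders.termQuery(field, value) instead.
final class Condition {
    final String field;
    final String value;

    Condition(String field, String value) {
        this.field = field;
        this.value = value;
    }

    @Override
    public String toString() {
        return field + "=" + value;
    }
}

// Simplified counterpart of DataEnvelopeElasticsearchFilterProvider.
interface EnvelopeFilterProvider {
    List<Condition> buildFiltersForDataEnvelope(AbstractDataEnvelope envelope);
}

// Step 1: one condition per attribute that decides whether two envelopes are equal.
class MyNewDataEnvelopeFilterProvider implements EnvelopeFilterProvider {
    @Override
    public List<Condition> buildFiltersForDataEnvelope(AbstractDataEnvelope envelope) {
        MyNewDataEnvelope typed = (MyNewDataEnvelope) envelope;
        List<Condition> filters = new ArrayList<>();
        filters.add(new Condition("sourceType", typed.sourceType));
        filters.add(new Condition("datasetId", typed.datasetId));
        // The identifier is deliberately left out of the conditions.
        return filters;
    }
}

// Step 2: the factory gains one more branch for the new type.
final class FilterProviderFactorySketch {
    static EnvelopeFilterProvider getFilterProviderForDataEnvelope(AbstractDataEnvelope envelope) {
        if (envelope instanceof MyNewDataEnvelope) {
            return new MyNewDataEnvelopeFilterProvider();
        }
        throw new IllegalArgumentException(
                "no filter provider for " + envelope.getClass().getSimpleName());
    }
}

// Step 3: the deserializer factory gains one more case that selects the target type.
final class DeserializerFactorySketch {
    static Class<? extends AbstractDataEnvelope> targetTypeFor(String srcType) {
        switch (srcType) {
            case "MyNewDataEnvelope":
                return MyNewDataEnvelope.class;
            default:
                throw new IllegalArgumentException("unknown source type " + srcType);
        }
    }
}
```

A filter provider for the corresponding SubsetDefinition type would follow the same pattern, with conditions that describe when a stored DataEnvelope fits the SubsetDefinition rather than when two DataEnvelopes are equal.
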
Additional Resources

Elasticsearch Java Client
Elasticsearch Java Query DSL (QueryBuilder)
Elasticsearch REST Query DSL

DataEnvelopeToResourceConverter

An (Abstract)DataEnvelope is metadata that describes a (geospatial) dataset. There are different types of DataEnvelopes (e.g. CopernicusDataEnvelope, SensorWebDataEnvelope ...) to describe different types of data. An (Abstract)Resource contains a URL (GetResource) or a URL and an HTTP POST body (PostResource) that references a dataset described by a corresponding DataEnvelope. A Resource therefore provides access to a certain dataset. WaCoDiS DataAccess (module Wacodis Data Access DataWrapper) is responsible for creating resources for specific data sets. Each type of DataEnvelope needs its own logic to convert a DataEnvelope of that specific type to a GetResource or PostResource.

  1. Each new subtype of AbstractDataEnvelope needs an implementation of the interface DataEnvelopeToResourceConverter. The interface defines two methods. The generic type T is the specific DataEnvelope type (e.g. MyNewDataEnvelope). The method convertToResource must contain the code to derive a (get or post) resource from a DataEnvelope. It is not necessary to set the attribute dataEnvelopeReferences for the new resource, because any value set here is going to be overwritten. The method supportedDataEnvelopeType should return the class object of the type T (typically return MyNewDataEnvelope.class).

Note that a resource does not necessarily provide access to a complete data set. Sometimes a resource should provide access to only a subset of a data set. For example, if the data source is provided by an OGC Web Feature Service (WFS), the WFS's filter capabilities should be used to subset the data set based on the information provided by the DataEnvelope. The created resource should include the filters to subset the data. In the case of a WFS, these filters can be part of the URL (GetResource) or be provided as an additional HTTP POST body (POST request). To subset the data and create filters, the relevant area of interest, the time frame and the (search) input ((Abstract)SubsetDefinition) can be used. This information is contained in the ResourceSearchContext.

  2. Register the new DataEnvelopeToResourceConverter implementation by extending the if-statement in class DataEnvelopeToResourceConversionHelper (method convertToResource).
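
As an illustration of both steps, the following minimal sketch converts a hypothetical WFS-backed DataEnvelope into a GetResource whose URL already carries a bounding-box filter. All types are simplified stand-ins defined in the snippet. The attributes serviceUrl and featureType, the bbox field of ResourceSearchContext and the exact signature of convertToResource are assumptions made for the example and are not taken from the WaCoDiS code base.

```java
// Hypothetical, heavily simplified stand-ins for the generated model classes.
abstract class AbstractDataEnvelope { }

class MyNewDataEnvelope extends AbstractDataEnvelope {
    String serviceUrl;   // hypothetical attribute: base URL of the WFS
    String featureType;  // hypothetical attribute: WFS type name
}

class GetResource {
    final String url;

    GetResource(String url) {
        this.url = url;
    }
}

// Stand-in for ResourceSearchContext; only the area of interest is modelled.
class ResourceSearchContext {
    final double[] bbox;  // minX, minY, maxX, maxY

    ResourceSearchContext(double[] bbox) {
        this.bbox = bbox;
    }
}

// Simplified counterpart of the converter interface with its two methods.
interface DataEnvelopeToResourceConverter<T extends AbstractDataEnvelope> {
    GetResource convertToResource(T envelope, ResourceSearchContext context);

    Class<T> supportedDataEnvelopeType();
}

// Step 1: converter for the new type; the area of interest becomes a WFS bbox filter.
class MyNewDataEnvelopeConverter implements DataEnvelopeToResourceConverter<MyNewDataEnvelope> {
    @Override
    public GetResource convertToResource(MyNewDataEnvelope envelope, ResourceSearchContext context) {
        double[] b = context.bbox;
        String url = envelope.serviceUrl
                + "?service=WFS&version=2.0.0&request=GetFeature"
                + "&typeNames=" + envelope.featureType
                + "&bbox=" + b[0] + "," + b[1] + "," + b[2] + "," + b[3];
        // dataEnvelopeReferences is not set here; it would be overwritten anyway.
        return new GetResource(url);
    }

    @Override
    public Class<MyNewDataEnvelope> supportedDataEnvelopeType() {
        return MyNewDataEnvelope.class;
    }
}

// Step 2: the conversion helper gains one more branch for the new type.
final class ConversionHelperSketch {
    static GetResource convertToResource(AbstractDataEnvelope envelope, ResourceSearchContext context) {
        if (envelope instanceof MyNewDataEnvelope) {
            return new MyNewDataEnvelopeConverter()
                    .convertToResource((MyNewDataEnvelope) envelope, context);
        }
        throw new IllegalArgumentException(
                "no converter for " + envelope.getClass().getSimpleName());
    }
}
```

A production converter would also URL-encode the parameter values and could add a time filter derived from the time frame in the search context. Both are omitted here to keep the sketch short.
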

Register corresponding DataEnvelope type for new subtype of AbstractSubsetDefinition

Usually there is a corresponding subtype of AbstractDataEnvelope for each subtype of AbstractSubsetDefinition. For example, CopernicusSubsetDefinition corresponds to CopernicusDataEnvelope, and SensorWebSubsetDefinition corresponds to SensorWebDataEnvelope. Data Access obtains the information about these relationships by calling the static method getCorrespondingDataEnvelopeSourceType of class SubsetDefinitionDataEnvelopeSourceTypeMapping. For a new type of SubsetDefinition, this method must return the source type of the associated (new) type of DataEnvelope. It is recommended to simply extend the existing HashMap (SOURCETYPEMAPPING) in the class to implement this behavior. In some cases, there is no DataEnvelope associated with a subtype of AbstractSubsetDefinition (e.g. StaticSubsetDefinition). In this case, no changes need to be made. If no corresponding subtype of AbstractDataEnvelope is registered, the Data Access API will not return any resources (empty list) for inputs of that subtype of AbstractSubsetDefinition when its /dataAccess/resources/search endpoint is called.
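
A minimal sketch of such a mapping is shown below. It assumes that the map is keyed by the SubsetDefinition class and stores the source type as a string. The actual key and value types of SOURCETYPEMAPPING, the return type of getCorrespondingDataEnvelopeSourceType and the source type strings may differ in the real class. MyNewSubsetDefinition is a hypothetical new type, and the model classes are empty stand-ins.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical, empty stand-ins for the generated model classes.
abstract class AbstractSubsetDefinition { }

class CopernicusSubsetDefinition extends AbstractSubsetDefinition { }

class MyNewSubsetDefinition extends AbstractSubsetDefinition { }

class StaticSubsetDefinition extends AbstractSubsetDefinition { }

final class SourceTypeMappingSketch {
    // Assumed layout: SubsetDefinition class -> source type of the matching DataEnvelope.
    private static final Map<Class<? extends AbstractSubsetDefinition>, String> SOURCETYPEMAPPING =
            new HashMap<>();

    static {
        SOURCETYPEMAPPING.put(CopernicusSubsetDefinition.class, "CopernicusDataEnvelope");
        // New entry for the new pair of types (illustrative source type string).
        SOURCETYPEMAPPING.put(MyNewSubsetDefinition.class, "MyNewDataEnvelope");
        // StaticSubsetDefinition has no DataEnvelope and therefore no entry.
    }

    // Empty result means: no DataEnvelope type registered for this SubsetDefinition.
    static Optional<String> getCorrespondingDataEnvelopeSourceType(AbstractSubsetDefinition subset) {
        return Optional.ofNullable(SOURCETYPEMAPPING.get(subset.getClass()));
    }
}
```

With this layout, a lookup for a MyNewSubsetDefinition yields the new source type, while a lookup for a StaticSubsetDefinition yields an empty result, which corresponds to the empty resource list described above.
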

Additional Changes

  1. In addition to the changes to WaCoDiS Data Access, the WaCoDiS Core Engine must also be extended to handle new data types.
Additional Resources

WaCoDiS Core Engine

Integrate new ProductBackend for WacodisProductDataEnvelope

The attribute serviceDefinition of WacodisProductDataEnvelope describes the service that stores the described data. The serviceDefinition is a subtype of AbstractBackend. To support a new backend type, the Data Access DataWrapper module must be extended. If the subtype of AbstractBackend is unknown to DataAccess, only the common attributes of the supertype AbstractBackend are matched (that is, only the backend type is compared but not the type-specific attributes). The changes are relevant to the API endpoint /dataAccess/dataenvelopes/search.

  1. The interface ProductBackendFilterProvider must be implemented for the new backend type. The interface defines two methods. The generic type T is the specific AbstractBackend type (e.g. MyNewBackendType). The method getFilterForBackend must return a list that contains all filter expressions (QueryBuilder) needed to compare the (relevant) attributes of the specific backend. The method supportedBackendType should return the class object of the type T (typically return MyNewBackendType.class).
  2. In the class BackendTypeFilterFactory, the method getFilterForBackend must be extended so that the appropriate implementation of the interface ProductBackendFilterProvider is called for the new backend type (extend the if-statement).
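
The two steps can be sketched as follows, including the fallback for unknown backend types. As before, this is an illustration under assumptions rather than the actual implementation: Condition is a stand-in for Elasticsearch's QueryBuilder, the attributes baseUrl and collectionName as well as the index field names are hypothetical, and the real method signatures may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, heavily simplified stand-ins for the generated model classes.
abstract class AbstractBackend {
    String backendType;     // common attribute shared by all backends
}

class MyNewBackendType extends AbstractBackend {
    String baseUrl;         // hypothetical type-specific attribute
    String collectionName;  // hypothetical type-specific attribute
}

// Stand-in for Elasticsearch's QueryBuilder: an exact match on one field.
final class Condition {
    final String field;
    final String value;

    Condition(String field, String value) {
        this.field = field;
        this.value = value;
    }
}

// Simplified counterpart of the interface with its two methods.
interface ProductBackendFilterProvider<T extends AbstractBackend> {
    List<Condition> getFilterForBackend(T backend);

    Class<T> supportedBackendType();
}

// Step 1: compare the common attribute and every relevant type-specific attribute.
class MyNewBackendFilterProvider implements ProductBackendFilterProvider<MyNewBackendType> {
    @Override
    public List<Condition> getFilterForBackend(MyNewBackendType backend) {
        List<Condition> filters = new ArrayList<>();
        filters.add(new Condition("serviceDefinition.backendType", backend.backendType));
        filters.add(new Condition("serviceDefinition.baseUrl", backend.baseUrl));
        filters.add(new Condition("serviceDefinition.collectionName", backend.collectionName));
        return filters;
    }

    @Override
    public Class<MyNewBackendType> supportedBackendType() {
        return MyNewBackendType.class;
    }
}

// Step 2: the factory gains one more branch; unknown subtypes fall back to the common attribute.
final class BackendTypeFilterFactorySketch {
    static List<Condition> getFilterForBackend(AbstractBackend backend) {
        if (backend instanceof MyNewBackendType) {
            return new MyNewBackendFilterProvider().getFilterForBackend((MyNewBackendType) backend);
        }
        List<Condition> common = new ArrayList<>();
        common.add(new Condition("serviceDefinition.backendType", backend.backendType));
        return common;
    }
}
```

The fallback branch mirrors the behavior described above for backend types that are unknown to DataAccess: only the backend type is compared.
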