This folder contains the core Java and Maven-based implementation of OpenInfRA. The project page can be found here.
LOC (version 1.4.2): approx. 25,000
To run OpenInfRA, several prerequisites must be met. This section gives a broad overview of the necessary components and their configuration. OpenInfRA consists of the following parts:
The application is written in Java and must be compiled with Java 7. It must be packaged into a WAR file and deployed on a server. Currently, the application is optimized for Apache Tomcat. Only a few steps are necessary to configure the application. The main configuration file must be adapted to the current needs. The individual configuration parameters are commented and need no further explanation. The most important settings are the database connection and file path properties; these must be set correctly in order to run the application.
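A minimal sketch of what such a configuration file could look like (all property names and values here are hypothetical placeholders, not taken from the actual OpenInfRA sources):

```properties
# Database connection (hypothetical property names)
db.url=jdbc:postgresql://localhost:5432/openinfra
db.user=openinfra
db.password=changeme

# File path for the upload system (hypothetical property name)
file.upload.path=/var/lib/openinfra/uploads
```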
The OpenInfRA application is based on the following software stack:
- JAX-RS: Jersey
- Security: Apache Shiro
- JPA: EclipseLink
- Database: PostgreSQL + PostGis
The database provides the data storage for OpenInfRA. Further instructions can be found in the appropriate repository.
To make use of Solr in OpenInfRA, two prerequisites must be met. First, the Solr server itself must be installed. Second, the core definition must be installed in the specified file path. Further instructions can be found in the appropriate repositories.
A detailed description is provided in the specific folder: GXC.
OpenInfRA provides a file upload. This upload requires ImageMagick to generate different image representations.
- PDF conversion requires Ghostscript.
- Conversion of raw file formats (such as DNG) requires UFRaw on Linux-based systems.
This section shows some starting points and describes a few details. Project and TopicCharacteristic are used as running examples. As the name states, 'Project' refers to an OpenInfRA project. A 'TopicCharacteristic' is an abstract container which groups a set of objects by describing and consolidating specific attributes. Such an object is called a TopicInstance; it defines the attribute values consolidated by a 'TopicCharacteristic'. This leads to the following correlation: a set of TopicInstances is of the type of a specific TopicCharacteristic.
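The correlation described above could be sketched in plain Java as follows (the class and field names are simplified illustrations, not the actual OpenInfRA types):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// A TopicCharacteristic groups objects by describing a set of attributes.
class TopicCharacteristic {
    final UUID id = UUID.randomUUID();
    final String description;

    TopicCharacteristic(String description) {
        this.description = description;
    }
}

// A TopicInstance holds concrete attribute values and is of exactly one type.
class TopicInstance {
    final UUID id = UUID.randomUUID();
    final TopicCharacteristic type;
    final Map<String, String> attributeValues = new HashMap<String, String>();

    TopicInstance(TopicCharacteristic type) {
        this.type = type;
    }
}
```

A set of TopicInstance objects sharing the same `type` reference would then form the group described by that TopicCharacteristic.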
The following picture shows 'Project' and 'TopicCharacteristic' as model objects. 'Model objects' belong to the persistence layer. Native SQL queries or JPA queries should be placed here. This helps to keep the code clean.
The following picture shows 'Project' and 'TopicCharacteristic' as POJO objects. 'POJO objects' are plain data containers. These containers are used to transfer data from the application core to the REST API.
Database access is implemented by means of the DAO pattern. There is a DAO class for each model object. DAO classes are used to transform 'model objects' into 'POJO objects' and vice versa. This is a little implementation-intensive but provides a maximum of data control. Thus, it is possible to hide data in the REST API and to enrich the REST API with additional information without causing side effects in the persistence layer.
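The mapping idea can be illustrated with a small sketch (the class names and fields are hypothetical simplifications, not the real OpenInfRA DAO classes):

```java
import java.util.UUID;

// Simplified model object (persistence layer) -- illustrative only.
class ProjectModel {
    UUID id;
    String name;
}

// Simplified POJO (transfer container for the REST API) -- illustrative only.
class ProjectPojo {
    UUID uuid;
    String displayName;
}

// A DAO in the style described above: it converts between the two worlds,
// so the REST API never exposes persistence-layer objects directly.
class ProjectDao {
    // model -> POJO: only the fields the API should expose are copied
    ProjectPojo mapToPojo(ProjectModel model) {
        ProjectPojo pojo = new ProjectPojo();
        pojo.uuid = model.id;
        pojo.displayName = model.name;
        return pojo;
    }

    // POJO -> model: the reverse direction for create/update operations
    ProjectModel mapToModel(ProjectPojo pojo) {
        ProjectModel model = new ProjectModel();
        model.id = pojo.uuid;
        model.name = pojo.displayName;
        return model;
    }
}
```

Because every field is copied explicitly, adding or hiding a field in the REST API is a local change in the DAO and never touches the persistence layer.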
The entity manager is very important for the DAO classes and for the response time of the application, since it handles the database access. Each DAO class uses its own 'entity manager'. In order to provide fast access, an EntityManagerFactoryCache administers the 'entity manager' objects for the DAO classes.
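The caching idea can be sketched roughly as follows (this is a Java 7 compatible illustration of the pattern, not the real EntityManagerFactoryCache; a plain String stands in for the expensive JPA EntityManagerFactory to keep the example self-contained):

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Expensive factory objects are created once per project/schema and then
// reused by all DAO classes instead of being re-created on every request.
class EntityManagerFactoryCacheSketch {
    private final ConcurrentHashMap<UUID, Object> cache =
            new ConcurrentHashMap<UUID, Object>();

    // stands in for a call like Persistence.createEntityManagerFactory(...)
    protected Object createFactory(UUID projectId) {
        return "emf-" + projectId;
    }

    Object getFactory(UUID projectId) {
        Object existing = cache.get(projectId);
        if (existing == null) {
            Object created = createFactory(projectId);
            // putIfAbsent keeps the first value if two threads race here
            existing = cache.putIfAbsent(projectId, created);
            if (existing == null) {
                existing = created;
            }
        }
        return existing;
    }
}
```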
OpenInfRA provides different database schemas. Each 'database schema' is optimized for specific needs:
- system: The 'system schema' contains abstract data and information which is used to derive project schemas.
- project: A 'project schema' contains only project-specific data without meta data. There are several project schemas; each is identified by its own UUID.
- meta data: The 'meta data schema' contains additional information about a project.
- rbac: The 'rbac schema' contains information for the role-based access control system. This includes user information, roles and permissions.
- webapp: The 'webapp schema' provides additional information for GUI applications.
- files: The 'files schema' provides data of the file upload system.
- search: The 'search schema' is not a real database schema; it only provides an access point for the search engine.
Adding a new schema can be done easily with the following steps:
- Create the schema on the database level.
- Generate the necessary model objects.
- Register the new schema in the OpenInfraSchemas enumeration.
- Create POJO, DAO and RBAC classes.
- Register the new schema in the EntityManagerFactoryCache.
- Register resources and URLs in the REST API.
- Extend the JUnit tests.
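As an illustration of the registration step, the schema enumeration might look roughly like this (the constants, fields and the added AUDIT schema are hypothetical, not taken from the real OpenInfraSchemas enum):

```java
// Hypothetical sketch of a schema enumeration: registering a new schema
// would mean adding another constant with its database schema name.
enum OpenInfraSchemasSketch {
    SYSTEM("system"),
    PROJECTS("project"),
    META_DATA("meta_data"),
    RBAC("rbac"),
    WEBAPP("webapp"),
    FILES("files"),
    // the newly added schema (hypothetical example):
    AUDIT("audit");

    private final String schemaName;

    OpenInfraSchemasSketch(String schemaName) {
        this.schemaName = schemaName;
    }

    String getSchemaName() {
        return schemaName;
    }
}
```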
- Logging isn't implemented appropriately.
- Deleting a main project (project schema) leaves dead links in the file and RBAC schemas.
- The RBAC classes contain hard-coded strings like 'project', 'schema' or 'topiccharacteristic'. These should be replaced.
- Retrieving a list of topic characteristics delivers the same list for main projects as for their sub-projects. The method that returns the list must respect the project id.
The initial database schema did not provide a UUID for the identification of associations. An association object was initially identified by the UUIDs of the related objects, e.g. the 'attribute type' to 'attribute type group' association was identified by the UUIDs of the 'attribute type' and the 'attribute type group'. Each of the aforementioned objects relates to a POJO object. Thus, it became necessary to equip each POJO (especially the association objects) with a UUID in order to provide a generic API. However, the old access strategy is still available and should be revised.
- Not all results deliver highlighting for query matches, e.g. for the search term *
- The index process saves dates in a special format that cannot be directly influenced. A custom analyzer implementation might fix this problem.
- Executing a complete index process will fail on some servers with a 'GC overhead limit exceeded' exception. The reason is the implementation of the indexing process. For more information, see the official Java documentation.
- The core uses two kinds of data sources (database & documents) for indexing and searching. The current approach is not generic and should be reworked to be more flexible.
- The index is not updated automatically when a new file is uploaded.