
AnalyzerBeans

An extensible and high-performance data processing engine.

NOTE: Beginning with DataCleaner 4.0, we have discontinued AnalyzerBeans as a separate project. AnalyzerBeans is now incorporated as the engine of DataCleaner, and we have decided to merge the two projects.

Please visit DataCleaner here: https://github.com/datacleaner/DataCleaner

Module structure

Modules are:

  • api - the API for AnalyzerBeans; contains interfaces and annotations to build processing components
  • core - the main processing engine implementation
  • testware - various utilities useful for testing
  • components - contains several submodules for concrete components that can be used in AnalyzerBeans. Some notable submodules are:
      • basic-transformers
      • basic-filters
      • basic-analyzers
      • html-rendering - framework for rendering analysis results as HTML fragments and pages
      • writers - components for inserting and updating data in target datastores
      • (...)
  • env - contains submodules for various environment configurations. Some notable submodules are:
      • cluster - framework for clustering AnalyzerBeans jobs
      • xml-config - readers and writers for jobs and configuration objects to and from XML files (conf.xml and .analysis.xml job files)
      • (...)
  • cli - a command-line interface which can be used to execute AnalyzerBeans jobs
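
The split between an api module (interfaces and annotations) and a core module (the engine that runs components) can be illustrated with a small, self-contained sketch. Everything in it is hypothetical and invented for illustration: `NamedComponent`, `ValueTransformer`, `TrimUpperCase`, `run` and `nameOf` are not part of the AnalyzerBeans API, and the real interfaces and annotations may look quite different. The sketch only shows the general pattern of a component that implements an interface, carries an annotation, and is executed by a separate piece of engine code.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only: none of these types belong to AnalyzerBeans.
class ComponentSketch {

    // "API" side: a hypothetical annotation that gives a component a display name.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface NamedComponent {
        String value();
    }

    // "API" side: a hypothetical contract mapping one input value to one output value.
    interface ValueTransformer {
        String transform(String input);
    }

    // A concrete component, comparable in spirit to a basic transformer.
    @NamedComponent("Trim and upper-case")
    static class TrimUpperCase implements ValueTransformer {
        @Override
        public String transform(String input) {
            return input == null ? null : input.trim().toUpperCase(Locale.ROOT);
        }
    }

    // "Engine" side: applies a component to every value, in order.
    static List<String> run(ValueTransformer component, List<String> values) {
        List<String> result = new ArrayList<>();
        for (String value : values) {
            result.add(component.transform(value));
        }
        return result;
    }

    // "Engine" side: reads the display name from the annotation via reflection,
    // falling back to the simple class name when the annotation is absent.
    static String nameOf(ValueTransformer component) {
        NamedComponent annotation = component.getClass().getAnnotation(NamedComponent.class);
        return annotation == null ? component.getClass().getSimpleName() : annotation.value();
    }

    public static void main(String[] args) {
        ValueTransformer transformer = new TrimUpperCase();
        // Prints: Trim and upper-case: [FOO, BAR]
        System.out.println(nameOf(transformer) + ": " + run(transformer, Arrays.asList("  foo ", "bar")));
    }
}
```

Keeping the contract and annotation separate from the code that executes them is what lets new components be added without changing the engine; for the actual interfaces and annotations, consult the api module itself.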

Continuous Integration

There's a public build of AnalyzerBeans that can be found on Travis CI:

https://travis-ci.org/datacleaner/AnalyzerBeans

License

Licensed under the GNU Lesser General Public License (LGPL); see http://www.gnu.org/licenses/lgpl.txt