Emma

A quotation-based Scala DSL for scalable data analysis.

Goals

Our goal is to improve developer productivity by hiding parallelism aspects behind a high-level, declarative API which maximises reuse of native Scala syntax and constructs.

Emma supports state-of-the-art dataflow engines such as Apache Flink and Apache Spark as backend co-processors.

Features

DSLs for scalable data analysis are typically embedded through types. Emma, in contrast, is based on quotations (similar to Quill). This approach has two benefits.

First, it lets the DSL reuse Scala-native, declarative constructs. Quoted Scala constructs such as for-comprehensions, case classes, and pattern matching are lifted to an intermediate representation called Emma Core.

Second, it allows Emma Core terms to be analyzed and optimized holistically. Subterms of type DataBag[A] are transformed and offloaded to a parallel dataflow engine such as Apache Flink or Apache Spark.
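
To make the constructs named above concrete, here is a minimal sketch in plain Scala. It does not use Emma's API: `Bag` is a hypothetical in-memory stand-in for `DataBag[A]`, and `User`, `Click`, and `ComprehensionSketch` are illustrative names. It shows a join written with a for-comprehension, case classes, and pattern matching, next to the `flatMap`/`withFilter`/`map` chain that the Scala compiler derives from it. A quotation-based DSL sees the declarative form as a whole, which is what makes holistic analysis possible.

```scala
// Hypothetical stand-in for a distributed collection; NOT Emma's DataBag API.
final case class Bag[A](elems: Seq[A]) {
  def map[B](f: A => B): Bag[B] = Bag(elems.map(f))
  def flatMap[B](f: A => Bag[B]): Bag[B] = Bag(elems.flatMap(a => f(a).elems))
  def withFilter(p: A => Boolean): Bag[A] = Bag(elems.filter(p))
}

// Case classes model the records (illustrative names).
final case class User(id: Int, name: String)
final case class Click(userId: Int, url: String)

object ComprehensionSketch {
  val users: Bag[User] = Bag(Seq(User(1, "ann"), User(2, "bob")))
  val clicks: Bag[Click] =
    Bag(Seq(Click(1, "/home"), Click(2, "/docs"), Click(1, "/docs")))

  // Declarative join: for-comprehension with pattern-matching generators.
  val sugared: Bag[(String, String)] =
    for {
      User(id, name)  <- users
      Click(uid, url) <- clicks
      if uid == id
    } yield (name, url)

  // Roughly the method chain the compiler generates for the block above.
  val desugared: Bag[(String, String)] =
    users.flatMap { case User(id, name) =>
      clicks
        .withFilter { case Click(uid, _) => uid == id }
        .map { case Click(_, url) => (name, url) }
    }

  def main(args: Array[String]): Unit = {
    // Both forms compute the same result.
    assert(sugared == desugared)
    println(sugared.elems)
  }
}
```

In this sketch the join runs eagerly in memory. The point is only the shape of the program: the for-comprehension states what to compute, and how it is evaluated (for example as a parallel join) is left to whatever interprets it.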

Examples

The emma-examples module contains examples from various fields.

Learn More

Check emma-language.org for further information.

Build

Building Emma requires:

  • JDK 7+ (preferably JDK 8)
  • Maven 3

Run

mvn clean package -DskipTests

to build Emma without running any tests.

For more advanced build options, including integration tests for the target runtimes, see the "Building Emma" section in the Wiki.