Asakusa Framework

Asakusa is a full stack framework for distributed/parallel computing. It provides a development platform and runtime libraries that support various distributed/parallel computing environments such as Hadoop, Spark, and M3 for Batch Processing. Users can get the best performance from distributed/parallel computing by transparently switching execution engines among MapReduce, Spark RDD, and C++ native, based on their data size.

Unlike query-based languages, Asakusa makes it easier to develop complicated data flow programs efficiently and comprehensively, thanks to the following components.

  • Data-flow oriented DSL

    A data-flow based approach is suitable for constructing DAGs, which are well suited to distributed/parallel computing. Asakusa offers a Java-based Domain Specific Language with a data-flow design, which is integrated with the compilers.

  • Compilers

    A multi-tier compiler is provided. Java-based source code is first compiled to an intermediate representation and then optimized for each execution environment: Hadoop (MapReduce), Spark (RDD), and M3 for Batch Processing (C++ native).

  • Data-Modeling language

    A data modeling language is provided, which supports mapping to relational models, CSV files, and other data formats.

  • Test Environment

    JUnit-based unit testing and end-to-end testing are supported, and tests are portable across execution environments. Source code, test code, and test data are fully compatible across Hadoop, Spark, M3 for Batch Processing, and others.

  • Runtime execution driver

    A transparent job execution driver is provided.

All these features have been designed and developed with expertise gained from decades of experience in enterprise-scale system development, and they promise to make large-scale systems on distributed/parallel environments more robust and stable.

How to build

Maven artifacts

./mvnw clean install -DskipTests

Gradle plug-ins

cd gradle
./gradlew clean [build] install

How to run tests

Maven artifacts

export HADOOP_CMD=/path/to/bin/hadoop
./mvnw test
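
The Maven tests rely on `HADOOP_CMD` pointing to a Hadoop launcher. As a minimal sketch, the shell function below checks that the variable is set and refers to an executable file before the tests are started. The function name `check_hadoop_cmd` is hypothetical and not part of Asakusa Framework; only `HADOOP_CMD` and `./mvnw test` come from the steps above.

```shell
# Hypothetical helper (not part of Asakusa Framework):
# succeeds only if HADOOP_CMD is set and points to an executable file.
check_hadoop_cmd() {
    if [ -z "${HADOOP_CMD:-}" ]; then
        echo "HADOOP_CMD is not set" >&2
        return 1
    fi
    if [ ! -x "$HADOOP_CMD" ]; then
        echo "HADOOP_CMD is not executable: $HADOOP_CMD" >&2
        return 1
    fi
    return 0
}

# Usage (run from the repository root):
#   export HADOOP_CMD=/path/to/bin/hadoop
#   check_hadoop_cmd && ./mvnw test
```

Failing early in this way gives a clearer message than a test failure caused by a missing Hadoop command.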

Gradle plug-ins

cd gradle
./gradlew [clean] check

How to import projects into Eclipse

Maven artifacts

./mvnw eclipse:eclipse

And then import existing projects from Eclipse.

If you run tests in Eclipse, please activate Preferences > Java > Debug > 'Only include exported classpath entries when launching'.

Gradle plug-ins

cd gradle
./gradlew eclipse

And then import existing projects from Eclipse.

Sub Projects

Related Projects

Resources

Bug reports, Patch contribution

License