Getting Started en
- Set up a development environment with Asakusa Framework.
- Build a batch application with Asakusa Framework and execute it.
This page describes how to set up a development environment for Asakusa Framework. Before you begin, check the prerequisites at Target-Platform-en.
Prepare a Linux desktop environment for Asakusa Framework (referred to below as the development environment). You will need ssh, so install it first if it is not already installed.
You can also run the Linux desktop environment in a virtual machine. We have tested Asakusa Framework on the virtualization products listed below.
- VMware Player 3.1.3 (Windows)
- VMware Fusion 3.1.2 (MacOS)
- VMware Server 2.0.2 (Linux)
First, create a new user for developing applications with Asakusa Framework. The rest of this page assumes that the user is named "asakusa" and refers to it as the asakusa user.
After creating the asakusa user, set up ssh so that this user can ssh to localhost without a passphrase. Execute the following commands as the asakusa user:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
$ chmod 600 ~/.ssh/authorized_keys
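As a quick sanity check of the key setup above, the following sketch verifies that the public key was appended to authorized_keys and that the file has mode 600. The function name check_ssh_keys is hypothetical and is not part of Asakusa Framework; the file names come from the commands above.

```shell
# Hypothetical helper (not part of Asakusa Framework).
# Usage: check_ssh_keys [ssh_dir]   (ssh_dir defaults to $HOME/.ssh)
check_ssh_keys() {
    ssh_dir="${1:-$HOME/.ssh}"
    pub="$ssh_dir/id_dsa.pub"
    auth="$ssh_dir/authorized_keys"

    # Both files must exist after running the commands above.
    [ -f "$pub" ] || { echo "missing $pub"; return 1; }
    [ -f "$auth" ] || { echo "missing $auth"; return 1; }

    # The public key must appear as a complete line in authorized_keys.
    grep -qxF -- "$(cat "$pub")" "$auth" \
        || { echo "public key not found in authorized_keys"; return 1; }

    # find prints the path only when the mode is exactly 600.
    [ -n "$(find "$auth" -perm 600)" ] \
        || { echo "authorized_keys mode is not 600"; return 1; }

    echo "ok"
}
```

For an end-to-end check, running `ssh -o BatchMode=yes localhost true` should return without asking for a password once the localhost host key has been accepted; BatchMode is a standard OpenSSH client option that disables interactive prompts.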
Install Java and Hadoop.
See the Cloudera site for instructions on installing the Cloudera distribution of Hadoop 3 (CDH3).
Confirm that the asakusa user can run the Hadoop sample jobs successfully in standalone mode. The rest of this page uses Hadoop in standalone mode; if Hadoop is configured for pseudo-distributed mode, switch it to standalone mode before continuing.
Install MySQL in the development environment.
See the MySQL site for how to install MySQL and where to download the MySQL JDBC driver.
NOTE: Asakusa Framework does not support MySQL version 5.0. Use MySQL version 5.1 or higher.
After installing MySQL, create a MySQL user and a database for use with Asakusa Framework. The application development project created from the Maven archetype (described later on this page) uses "asakusa" as the default database name, user name, and password.
$ mysql -u root
> GRANT ALL PRIVILEGES ON *.* TO 'asakusa'@'localhost' IDENTIFIED BY 'asakusa' WITH GRANT OPTION;
> GRANT ALL PRIVILEGES ON *.* TO 'asakusa'@'%' IDENTIFIED BY 'asakusa' WITH GRANT OPTION;
> CREATE DATABASE asakusa DEFAULT CHARACTER SET utf8;
This database is used when generating model classes with the model generator and when running tests with "test-driver".
NOTE: This database is recreated (DROP DATABASE followed by CREATE DATABASE) every time the model generator runs. Do not use it for anything other than development.
Install Maven in the development environment.
See the Apache Maven site for installation instructions.
Set the following environment variables for the asakusa user.
- JAVA_HOME: JDK install directory path
- HADOOP_HOME: Hadoop install directory path
- ASAKUSA_HOME: Asakusa Framework runtime modules install directory path
  - "$HOME/asakusa" is recommended. If you use a different value, you must also change the following settings in $ASAKUSA_HOME/bulkloader/conf/bulkloader-conf-db.properties:
    - import.extractor-shell-name=(relative path from $HOME)
    - export.extractor-shell-name=(relative path from $HOME)
- NS_MODELGEN_JDBC: DB configuration file used by the model generator
  - Set "$ASAKUSA_HOME/bulkloader/conf/asakusa-jdbc.properties" as the value.
- NS_TESTTOOLS_CONF: DB configuration file used by the test tool
  - Set "$ASAKUSA_HOME/bulkloader/conf/asakusa-jdbc.properties" as the value.
Also add the bin directory of the Maven installation to PATH.
The following is an example of the asakusa user's $HOME/.bash_profile:
JAVA_HOME=/usr/java/default
export JAVA_HOME
HADOOP_HOME=/usr/lib/hadoop
export HADOOP_HOME
ASAKUSA_HOME=$HOME/asakusa
export ASAKUSA_HOME
NS_MODELGEN_JDBC=$ASAKUSA_HOME/bulkloader/conf/asakusa-jdbc.properties
export NS_MODELGEN_JDBC
NS_TESTTOOLS_CONF=$ASAKUSA_HOME/bulkloader/conf/asakusa-jdbc.properties
export NS_TESTTOOLS_CONF
PATH=$JAVA_HOME/bin:$PATH:$HADOOP_HOME/bin:/opt/apache-maven-3.0.3/bin
export PATH
Load the settings above into the current shell environment.
$ source ~/.bash_profile
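After loading the profile, it can help to confirm that every variable listed above is actually set. The following is a minimal sketch; the function name check_asakusa_env is hypothetical, and it only tests that each variable is non-empty, not that the paths exist.

```shell
# Hypothetical helper (not part of Asakusa Framework).
# Prints "ok" when all variables are set, otherwise lists the missing ones.
check_asakusa_env() {
    missing=""
    for name in JAVA_HOME HADOOP_HOME ASAKUSA_HOME NS_MODELGEN_JDBC NS_TESTTOOLS_CONF; do
        # Indirectly read the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "ok"
}
```

Running `check_asakusa_env` in the shell that will start Maven or the IDE shows whether that shell inherited the settings.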
If you use an IDE such as Eclipse for development, start it from a shell in which the environment variables above are set. If you start the IDE from a desktop icon instead, log out and log in again first so that the new settings take effect.
Create a project for Asakusa Framework application development from the Maven archetype.
$ mvn archetype:generate -DarchetypeCatalog=http://asakusafw.s3.amazonaws.com/maven/archetype-catalog.xml
...
Choose archetype:
1: http://asakusafw.s3.amazonaws.com/maven/archetype-catalog.xml -> asakusa-archetype-batchapp (-)
Choose a number: : ※enter 1
...
Choose version:
1: 0.1.0
2: 0.1.1-SNAPSHOT
Choose a number: 2: ※enter 1
...
Define value for property 'groupId': : example ※Enter any string
Define value for property 'artifactId': : batchapp ※Enter any string
Define value for property 'version': 1.0-SNAPSHOT ※Enter any string
Define value for property 'package': example ※Enter any string
...
Y: : ※Enter "Y"
The following description assumes that you created a project named "batchapp".
Install the Asakusa Framework modules bundled with the project created from the Maven archetype. If the installation succeeds, the Maven log shows "BUILD SUCCESS".
$ cd batchapp
$ mvn assembly:single antrun:run
...
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
...
Run the sample application included in the project created from the Maven archetype. If it succeeds, three JUnit test cases run and pass, and the Maven log shows "BUILD SUCCESS".
$ mvn clean test
...
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
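If the Maven output is saved to a file, the two success indicators described above can be checked mechanically. The following is a sketch under that assumption; the function name check_maven_log and the log file name are hypothetical, and the matched strings are the ones shown in the log excerpt above.

```shell
# Hypothetical helper (not part of Asakusa Framework or Maven).
# Usage: check_maven_log <log_file>
check_maven_log() {
    log="$1"

    # The build itself must have finished successfully.
    grep -q "BUILD SUCCESS" "$log" || { echo "no BUILD SUCCESS"; return 1; }

    # Use the last "Tests run:" line, which is the overall summary.
    summary=$(grep "^Tests run:" "$log" | tail -n 1)
    case "$summary" in
        *"Failures: 0, Errors: 0"*) echo "ok" ;;
        *) echo "test problems: $summary"; return 1 ;;
    esac
}

# Example (assumes Maven is installed and the project exists):
#   mvn clean test 2>&1 | tee build.log
#   check_maven_log build.log
```

Maven also returns a non-zero exit status when the build fails, so checking `$?` directly after `mvn clean test` is a simpler alternative when the log is not saved.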
If you use the Eclipse IDE, generate the Eclipse configuration files with Maven, then import the project into Eclipse.
$ mvn eclipse:eclipse
The first time you use Eclipse, you must define the classpath variable "M2_REPO" in Eclipse.
If your workspace is at "$HOME/workspace", execute the following command.
$ mvn -Declipse.workspace=$HOME/workspace eclipse:add-maven-repo
The directory structure of a project generated by the archetype "asakusa-archetype-batchapp" looks like the following.
project
|-- pom.xml
|-- build.properties
|-- src
|   |-- main
|   |   `-- java
|   |       `-- example
|   |           |-- batch    : A package for batch DSL
|   |           |-- flowpart : A package for flow DSL (flow parts)
|   |           |-- jobflow  : A package for flow DSL (job flow)
|   |           `-- operator : A package for operators
|   `-- test
|       |-- java
|       |   `-- example
|       |       |-- batch    : A package for batch DSL
|       |       |-- flowpart : A package for flow DSL (flow parts)
|       |       |-- jobflow  : A package for flow DSL (job flow)
|       |       `-- operator : A package for operators
|       `-- resources
|           |-- asakusa-resources.xml : Configuration file for Asakusa Framework Core Runtime
|           |-- logback-test.xml      : Logging configuration file for testing in the development environment
|           `-- testtools.properties  : DB configuration file for the Asakusa Framework test tool
|
`-- target (only files related to Asakusa Framework are described)
    |-- ${artifactid}-batchapps-${version}.jar
    |       : Archive of batch applications compiled by the Ashigel compiler.
    |         Generated in the Maven package phase
    |-- ${artifactid}-XX.jar         : JAR file created by Maven; Asakusa Framework never uses it
    |-- ${artifactid}-XX-sources.jar : JAR file created by Maven; Asakusa Framework never uses it
    |
    |-- batchc     : Output directory for the results of batch compilation by the Ashigel compiler. Created in the Maven package phase
    |-- batchcwork : Working directory used by the Ashigel compiler while compiling batches
    |-- excel      : Directory where test data definition sheets are created. Created in the Maven process-resources phase
    |-- sql        : Directory where DDLs for ThunderGate are created. Created in the Maven process-resources phase
    |-- testdriver : Working directory used by the Asakusa Framework test driver
    |
    `-- generated-sources
        |-- annotations
        |   `-- example
        |       |-- flowpart : Package for operator factory classes generated by the annotation processor
        |       `-- operator : Package for operator factory classes and implementation classes generated by the annotation processor
        `-- modelgen
            `-- example
                `-- modelgen
                    |-- table
                    |   |-- model : Package for data model classes generated from table structures
                    |   `-- io    : Package for data model I/O drivers generated from table structures
                    `-- view
                        |-- model : Package for data model classes generated from view information
                        `-- io    : Package for data model I/O drivers generated from view information
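To confirm that a generated project matches the source layout described above, a small check such as the following can be used. This is a sketch only: the function name check_layout is hypothetical, and it checks just the four DSL and operator packages under src/main and src/test for the package name chosen during archetype generation.

```shell
# Hypothetical helper (not part of Asakusa Framework).
# Usage: check_layout <project_dir> <package>   e.g. check_layout batchapp example
check_layout() {
    project="$1"
    # Convert a dotted package name (com.example) to a path (com/example).
    pkg_path=$(printf '%s' "$2" | tr . /)
    status=0
    for kind in main test; do
        for name in batch flowpart jobflow operator; do
            rel="src/$kind/java/$pkg_path/$name"
            if [ ! -d "$project/$rel" ]; then
                echo "missing: $rel"
                status=1
            fi
        done
    done
    [ "$status" -eq 0 ] && echo "ok"
    return "$status"
}
```

The directories under target are not checked because, as noted in the tree, most of them only appear after specific Maven phases have run.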