Flume Developer Notes
Jonathan Hsieh <>
// This is in asciidoc markup
== Introduction
This is meant to be a guide to issues that occur when building,
debugging and setting up flume as a developer.
== High level directory and file structure.
./bin/ flume scripts
./conf/ flume configuration files
./lib/ libraries used by flume
./libbuild/ libraries used by flume for building
./libtest/ libraries used by flume for testing
./src/ahocorasick a library for multiple string search
./src/antlr flume config language ANTLR grammar files
./src/gen-java autogenerated java source files (from antlr/thrift)
./src/java flume java source code
./src/javaperf flume performance tests (out of date)
./src/javatest flume unit tests
./src/javatest-torture flume reliability tests (out of date)
./src/thrift flume thrift idl files (for rpc)
./src/webapps flume webapp jsp source code
Files created by build:
== Files in `.gitignore`
The exclusions in .gitignore are either autogenerated or build/eclipse
== eclipse project setup.
Run "ant eclipse", then create a new java project in eclipse with the
current directory as the base project directory.
./.eclipse default working directory for eclipse
Note: eclipse class files are not used by bin/flume. You must either
a) compile via ant for bin/flume to pick up your modified code, or b)
put the eclipse class directories on the flume classpath, e.g.:
FLUME_CLASSPATH=./.eclipse/classes-main:./.eclipse/classes-test bin/flume
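The classpath override above can be wrapped in a small helper. This is only a sketch: it assumes the two eclipse output directories named in the example, and the function name `eclipse_classpath` is hypothetical (it is not part of bin/flume).

```shell
# Hypothetical helper: build a FLUME_CLASSPATH value from the eclipse
# output directories under a given base directory (default: ./.eclipse).
eclipse_classpath() {
  base="${1:-./.eclipse}"
  printf '%s:%s\n' "$base/classes-main" "$base/classes-test"
}

# Usage, from the flume project root:
#   FLUME_CLASSPATH="$(eclipse_classpath)" bin/flume
```

Passing a different base directory as the first argument covers setups where eclipse writes its class files somewhere else.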
== Building thrift
This will create a repository in ./apache-thrift
git clone git://
cd thrift
git fetch
git checkout -b thrift-0.2.0 origin/tags/thrift-0.2.0
sudo make install
Problem: During
---- error: possibly undefined macro: AC_PROG_LIBTOOL
If this token and others are legitimate, please use m4_pattern_allow.
See the Autoconf documentation.
Solution: install libtool
sudo apt-get install libtool
== Generated source
These files should not be checked in unless their source files are modified.
== Running tests with ant.
ant test
=== Running only specified test cases from ant
ant test -Dtestcase=<TestFile>
where <TestFile> is a class name without a path or a .java or .class extension.
(How do I specify just a function?)
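Since the testcase value must be a bare class name, a small helper can derive it from a file path. This is a sketch; the function name `testcase_name` and the example path are made up for illustration.

```shell
# Hypothetical helper: turn a path to a test source or class file into the
# bare class name expected by -Dtestcase.
testcase_name() {
  name="$(basename "$1")"   # drop the directory part
  name="${name%.java}"      # drop a trailing .java, if any
  name="${name%.class}"     # drop a trailing .class, if any
  printf '%s\n' "$name"
}

# Print (rather than run) the resulting ant command.
# The path below is a made-up example.
echo "ant test -Dtestcase=$(testcase_name src/javatest/example/TestExample.java)"
```

The command is only echoed here so the sketch can be tried without a flume checkout.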
== origin/master invariants
Should always build.
Ideally, all tests pass.
== Push invariants
We should tag pushes with JIRA numbers.
== Flume's web application
The default setup for flume is to run its servlets from precompiled
jsps. The default configuration points jetty (a servlet container) to
information found in the ./webapps directory. We assume that most
developers and users will be in core, so at the git project root dir,
./webapps is a symlink that points to the build/webapps directory.
This is where static files and the files auto-generated when flume is
compiled are placed. Using this symlink makes the flume webapp use
precompiled jsp pages.
One can debug the jsp pages or have them autogenerate at runtime
(useful for development) by changing this symlink to point to
src/webapps. This directory has subapps and jsp source
code. Debugging is easier when the setup is more dynamic: our servlet
container, Jetty, can dynamically compile jsp pages, which speeds up
the debugging process.
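Repointing the symlink can be sketched as follows. The example works in a scratch directory (an assumption made so that a real checkout is not touched); in a real tree the same `ln` command would be run from the git project root.

```shell
# Demonstrate in a scratch directory so a real checkout is not touched.
demo="$(mktemp -d)"
mkdir -p "$demo/build/webapps" "$demo/src/webapps"

# Default layout: webapps points at the precompiled jsps.
ln -s build/webapps "$demo/webapps"

# Development layout: repoint webapps at the jsp sources.
# -f replaces the existing link; -n keeps ln from following the old link
# into the directory it points at.
ln -sfn src/webapps "$demo/webapps"

readlink "$demo/webapps"   # prints the current link target
```

Running `ln -sfn build/webapps webapps` switches back to the precompiled pages.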
Here are some tips for getting the web apps for Flume Master or Flume
Node running from inside eclipse.
* Make sure `tools.jar` is in your java classpath. If JAVA_HOME is
set to a JDK JVM path (as opposed to a JRE) this jar should be
included. This jar includes the java compiler which is required to
enable the compilation of jsp's so they can be served on the fly.
* Ant is used to compile jsps. Make sure some version of
ant-launcher.jar and ant-1.x.x.jar is in your build path (if you
are in eclipse, for example). These files live in ./libbuild
* The default is to point the web app at a precompiled version
of the servlets. There is a hook in flume-site.xml to point
jetty at a directory full of jsps. It assumes that the flume
directory is the base for relative paths, or it can use a fully
qualified path.
Environment variables can be set in the +bin/ script.
# bin/ for Ubuntu installs
export JAVA_HOME=/usr/lib/jvm/java-6-sun
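To confirm that JAVA_HOME points at a JDK rather than a JRE, one can look for `tools.jar` directly. This is a sketch, assuming the JDK layout of that era, where the jar sits at `lib/tools.jar` under the JDK home; the function name `has_tools_jar` is hypothetical.

```shell
# Hypothetical check: succeed if the given Java home looks like a JDK,
# i.e. it contains lib/tools.jar.
has_tools_jar() {
  [ -f "$1/lib/tools.jar" ]
}

if has_tools_jar "${JAVA_HOME:-}"; then
  echo "tools.jar found under JAVA_HOME"
else
  echo "tools.jar not found; JAVA_HOME may be unset or point at a JRE"
fi
```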
Alternatively, instead of using symlinks, one can set the following
property in the system's flume-site.conf file, like below.
<description>This is the path use to the web apps that display
flume node/master data
Use src webapps for development.
=== Problems when compiling JSPs
/home/jon/flume/build.xml:471: java.lang.ExceptionInInitializerError
at org.apache.jasper.JspCompilationContext.createCompiler(
at org.apache.jasper.JspC.processFile(
at org.apache.jasper.JspC.execute(
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(
at sun.reflect.DelegatingMethodAccessorImpl.invoke(
Caused by java.lang.NoClassDefFoundError: org/apache/log4j/Category) (Caused by org.apache.commons.logging.LogConfigurationException: No suitable Log constructor [Ljava.lang.Class;@31554233 for org.apache.commons.logging.impl.Log4JLogger (Caused by java.lang.NoClassDefFoundError: org/apache/log4j/Category))
at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(
at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(
at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(
at org.apache.commons.logging.LogFactory.getLog(
at org.apache.jasper.compiler.Compiler.<clinit>(
... 25 more
Make sure that log4j-xxx.jar is in your CLASSPATH.
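A quick way to check for the jar is to scan the classpath entries. This is a sketch only: the function name `classpath_has_log4j` is hypothetical, and it only inspects entries that name a jar explicitly (it does not look inside directories).

```shell
# Hypothetical check: succeed if any entry of a colon-separated classpath
# is a log4j jar (file name starting with "log4j-" and ending in ".jar").
classpath_has_log4j() {
  old_ifs="$IFS"
  IFS=:
  for entry in $1; do
    case "$(basename "$entry")" in
      log4j-*.jar) IFS="$old_ifs"; return 0 ;;
    esac
  done
  IFS="$old_ifs"
  return 1
}

classpath_has_log4j "${CLASSPATH:-}" || echo "no log4j jar found on CLASSPATH"
```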
== Avro.
Flume uses a post-1.2.0 version of Avro whose reflection supports
Strings, byte[]s and extracting fields defined in superclasses.
requires (in repo):
lib/avro-1.2.0-dev.jar # (trunk hash 8911c848 ; more avro 1.3 than 1.2)
lib/paranamer-1.5.jar # extra reflection stuff
lib/jackson-core-asl-1.1.1.jar # json parser
lib/jackson-mapper-asl-1.1.1.jar # json parser
== Developer mode.
This is an option in bin/flume for using eclipse-built class files
instead of ant-built class files.
In bash, one would set FLUME_DEVMODE to true:
$ declare -x FLUME_DEVMODE=true
It is assumed that the eclipse build path is build_eclipse/.
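The effect of the flag can be pictured with a small sketch. This is not the actual logic in bin/flume: the function name `select_class_dir` is hypothetical, and `build/classes` is an assumed name for the ant output directory. Only `build_eclipse` comes from the text above.

```shell
# Hypothetical sketch of a dev-mode switch; the real logic lives in bin/flume.
# build_eclipse comes from the notes; build/classes is an assumed name for
# the ant output directory.
select_class_dir() {
  if [ "${FLUME_DEVMODE:-}" = "true" ]; then
    echo "build_eclipse"
  else
    echo "build/classes"
  fi
}

# The flag can also be set for a single command only:
#   FLUME_DEVMODE=true bin/flume
```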
== Building Windows packages
Building a full windows package and installer executable requires a
few steps. A cygwin environment is currently assumed.
1) Build flume jars ('ant tar').
2) Update the installer script to add versioning information ('ant
winstall-filter'). This generates ./flume.nsi.
3) Run the nsis compiler on the generated flume.nsi. We've used v2.46.
The current cut does not deal with differences between 32-bit and
64-bit versions, proper error handling, or checks for whether the
installer is being run without administrator rights.
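The three steps can be listed as a dry run. This is a sketch: it assumes `makensis` (the NSIS command-line compiler) is on the PATH of the cygwin shell, and the function name `windows_package_steps` is hypothetical.

```shell
# Print the Windows packaging steps in order instead of running them.
# makensis is the NSIS command-line compiler; it is assumed to be on the PATH.
windows_package_steps() {
  echo "ant tar"
  echo "ant winstall-filter"
  echo "makensis flume.nsi"
}

windows_package_steps

# To run the steps for real, stopping at the first failure:
#   ant tar && ant winstall-filter && makensis flume.nsi
```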
== Building documentation
Documentation for Flume is written in asciidoc. It relies on several
libraries to generate images.
* asciidoc v8.5.2
* graphviz (dot) v2.26.3
Documents can be built by running 'ant docs'
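Before building, it can help to confirm that both tools are installed. This is a sketch; the function name `check_tools` is hypothetical, and it checks only for presence on the PATH, not for the versions listed above.

```shell
# Report which of the given tools are missing from the PATH.
# Returns non-zero if at least one is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

check_tools asciidoc dot || echo "install the missing tools before running 'ant docs'"
```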
== License
All source files must include the following header:
* Licensed to Cloudera, Inc. under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. Cloudera, Inc. licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* See the License for the specific language governing permissions and
* limitations under the License.
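A rough check for the header can be scripted. This is a sketch: the function name `has_license_header` is hypothetical, and it looks for the first line of the header only, not the full text.

```shell
# Succeed if the file contains the first line of the required header.
# -F treats the pattern as a fixed string; -q suppresses output.
has_license_header() {
  grep -qF "Licensed to Cloudera, Inc. under one" "$1"
}

# Usage, from the flume project root:
#   find src/java -name '*.java' | while read -r f; do
#     has_license_header "$f" || echo "missing header: $f"
#   done
```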