These FAQs and answers are relative to the current release unless otherwise noted.
Ensure that you have provided the JVM with sufficient heap space. By default, Flume starts the JVM with its default heap allocation, which differs depending on the JVM version, the host type (OS, 32/64-bit, etc.), the total host memory available, and other factors.
The environment variable UOPTS can be used to pass additional JVM parameters when running Flume. For example:
$ UOPTS="-Xms1g -Xmx2g" bin/flume node
starts a Flume node with an initial heap of 1 GB and a maximum heap of 2 GB. See “java -h” or “java -X” for more details on available JVM options.
We use it to make the master reliable.
For now, we suggest adding a group that the flume user is part of, and giving members of that group read access to the file.
You can use your browser to go to http://:35872/ and should get a Flume-generated web page.
If you have the JDK installed, you can use the ‘jps’ program to find out the names of the Java processes currently running. You should see something like this:
If not, you can run ‘ps aux | grep Flume’ to see if it is running:
$ ps aux | grep Flume
flume 4677 0.0 0.1 2383292 32140 ? Sl Sep19 1:42 java -Dflume.log.dir=/var/log/flume -Dflume.log.file=flume-flume-master-monster01.sf.cloudera.com.log -Dflume.root.logger=INFO,DRFA -Dzookeeper.root.logger=INFO,zookeeper -Dwatchdog.root.logger=INFO,watchdog -Djava.library.path=/usr/lib/flume/lib -Xmx2000m -Dpid=4677 -Dpidfile=/tmp/flumemaster.pid com.cloudera.flume.watchdog.FlumeWatchdog java -Dflume.log.dir=/var/log/flume -Dflume.log.file=flume-flume-master-monster01.sf.cloudera.com.log -Dflume.root.logger=INFO,DRFA -Dzookeeper.root.logger=INFO,zookeeper -Dwatchdog.root.logger=INFO,watchdog -Djava.library.path=/usr/lib/flume/lib -Xmx2000m com.cloudera.flume.master.FlumeMaster
flume 4711 0.0 0.2 2536336 55900 ? Sl Sep19 0:09 java -Dflume.log.dir=/var/log/flume -Dflume.log.file=flume-flume-master-monster01.sf.cloudera.com.log -Dflume.root.logger=INFO,DRFA -Dzookeeper.root.logger=INFO,zookeeper -Dwatchdog.root.logger=INFO,watchdog -Djava.library.path=/usr/lib/flume/lib -Xmx2000m com.cloudera.flume.master.FlumeMaster
You may have edited a copy of the file that isn’t where Flume expects it. Try starting the Flume node manually from the command line.
In the first few lines of output there should be something like:
10/07/21 10:25:20 INFO conf.FlumeConfiguration: Loading configurations
The flume-site.xml file that you edit should be in that directory.
(in this case ‘/etc/flume/conf/flume-site.xml’)
Check the log and the permissions of the files to make sure Flume has permission to read them. Log files are often owned by root or another system user. Flume usually runs as the invoking user or as the user ‘flume’, either of which may lack read permission.
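The permission check described above can be scripted and run as the same user that Flume runs as. This is a minimal sketch in Python; the helper name `can_read` and the example path are hypothetical and not part of Flume.

```python
import os


def can_read(path):
    """Return True if `path` is a regular file the current user may read."""
    # os.access tests the calling process's real uid/gid against the file mode.
    return os.path.isfile(path) and os.access(path, os.R_OK)


if __name__ == "__main__":
    # Hypothetical path; substitute the file your Flume source reads.
    path = "/var/log/messages"
    state = "readable" if can_read(path) else "NOT readable by this user"
    print(path, state)
```

Run it under the account Flume uses (for example via `sudo -u flume`), since the result depends on who executes the check.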
You need to set the output format property in your flume-site.xml file.
<property>
  <name>flume.collector.output.format</name>
  <value>raw</value>
  <description>The output format for the data written by a Flume
  collector node. There are several formats available:
    syslog - outputs events in a syslog-like format
    log4j - outputs events in a pattern similar to Hadoop's log4j pattern
    avrojson - this outputs data as json encoded by avro
    avrodata - this outputs data as a avro binary encoded data
    debug - used only for debugging
    raw - output only the event body, no metadata
  </description>
</property>
Set the flume.event.max.size.bytes property in the flume-site.xml file to a max size value.
Yes. v0.9.1 supports gzip compression, and v0.9.2 supports any compression codec that Hadoop supports.
That bug should be fixed in v0.9.1u1 and v0.9.2+.
To use E2E reliability modes, you currently must use the collectorSink at the end point! The collectorSink contains the code that checks and responds to the acking and flushing logic injected by the ackedWriteAhead decorator, which the auto/agent E2E sinks use.
Also, if you change flume.collector.roll.millis, make sure you also change flume.agent.logdir.retransmit to a value at least twice as large.
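The "at least twice as large" rule can be checked mechanically before restarting nodes. The sketch below is illustrative only: the helper names are hypothetical, the sample values are made up, the `<configuration>` root element is an assumption based on the Hadoop-style `<property>` layout, and both properties are assumed to use the same unit.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Collect <property> name/value pairs from flume-site.xml-style text."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def retransmit_is_safe(props):
    """True if the retransmit interval is at least twice the roll interval."""
    roll = int(props["flume.collector.roll.millis"])
    retransmit = int(props["flume.agent.logdir.retransmit"])
    return retransmit >= 2 * roll


# Illustrative values only; the <configuration> root is an assumption.
SAMPLE = """
<configuration>
  <property>
    <name>flume.collector.roll.millis</name>
    <value>30000</value>
  </property>
  <property>
    <name>flume.agent.logdir.retransmit</name>
    <value>60000</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    print(retransmit_is_safe(read_properties(SAMPLE)))
```

Both properties must be present for the check to run; a missing one raises `KeyError`, which is a reasonable prompt to set it explicitly.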
To guarantee data gets written, we can only send acknowledgements after we have successfully written. Sinks do the writing so only they can send the acknowledgement signals!
2136264 [pool-1-thread-3] ERROR org.apache.thrift.server.TSaneThreadPoolServer - Thrift error occurred during processing of message.
org.apache.thrift.protocol.TProtocolException: Missing version in readMessageBegin, old client?
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:201)
    at com.cloudera.flume.conf.thrift.FlumeClientServer$Processor.process(FlumeClientServer.java:290)
    at org.apache.thrift.server.TSaneThreadPoolServer$WorkerProcess.run(TSaneThreadPoolServer.java:280)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
This happens when an incorrect client attempts to talk to one of the Thrift services. You may have accidentally added a port to flume.master.servers or some other place.
The hbase-site.xml file should be put in a directory that is on Flume's classpath. This means you could put the file in ./flume/conf/ or add the path to your configuration file to FLUME_CLASSPATH.
The default setting is to precompile the JSPs into Java code. Currently these are generated when ‘ant’ is run from the command line. The Java servlets are written to ./build/src/. You need to make sure to add ‘build/src’ to your Eclipse build path.
Please let us know! We use a system called JIRA for bug reporting, tracking, and resolution. You can go here to let us know what you have found! Please let us know the version, component (if you can tell), and ideally a way to duplicate the bug!
Flume has weaker guarantees than some other systems (message queues, for example) in the interest of moving data around more quickly and enabling cheaper fault tolerance. The idea is to minimise the amount of state that Flume has to keep: replicated state is what makes fault tolerance hard and makes reasoning about failure conditions difficult. In Flume’s end-to-end reliability mode, events are delivered at least once, but with no ordering guarantees. We’ve found this sufficient for using Flume as a data conduit, since messages can be de-duplicated either at write time or by a post-hoc batch process. However, this means that Flume is harder to use as a message-passing or eventing framework unless your application is set up to be idempotent with respect to duplicate events and no causal relationship between events must be preserved upon delivery.
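As a rough illustration of de-duplicating at-least-once deliveries, the sketch below keeps the first copy of each event and skips repeats. It assumes every event carries a unique identifier under a hypothetical `id` field; Flume itself does not mandate such a field, so in practice the key might be derived from whatever attributes uniquely identify an event in your pipeline.

```python
def deduplicate(events, key=lambda e: e["id"]):
    """Yield each event once, keeping the first copy seen for each key."""
    seen = set()
    for event in events:
        k = key(event)
        if k in seen:
            continue  # a repeat caused by at-least-once retransmission
        seen.add(k)
        yield event


if __name__ == "__main__":
    # Made-up events; "id" is a hypothetical unique identifier.
    delivered = [
        {"id": "a", "body": "first"},
        {"id": "b", "body": "second"},
        {"id": "a", "body": "first"},  # retransmitted duplicate
    ]
    for event in deduplicate(delivered):
        print(event["id"], event["body"])
```

The set of seen keys grows without bound, so this shape suits a post-hoc batch pass over a bounded data set better than a long-running stream, where the key set would need to be windowed or expired.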
There are two ways that events may be re-ordered:
1. They are transmitted in DFO or E2E modes, and a failure delays them until after the successful delivery of some chronologically later events. The agent will try to retransmit unacknowledged events, but that could happen after some later events have already been delivered.
2. The network reorders the packets. That can’t happen with current TCP protocols (i.e. there’s buffering and reordering done at the receiver), but I can’t rule out us going to UDP, precisely because we don’t need those guarantees.
You can always reconstruct causal order after all events are delivered by looking at their timestamps, but at the time of delivery you don’t know if there are events that you missed, unless you attach sequence numbers to each. If you are using Flume for alerting, then you just need to track when the last interesting state was. Say you received an ERROR notification with timestamp t: just make sure you save t and silently drop any messages that arrive after it with timestamps < t.
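The alerting rule above (save t, then drop anything that arrives later with a timestamp below t) might be sketched as follows. The class and method names are hypothetical, and timestamps are assumed to be comparable numbers taken from the events themselves.

```python
class StaleAlertFilter:
    """Drop messages older than the latest interesting state seen so far."""

    def __init__(self):
        self.last_interesting_ts = None  # the saved timestamp t

    def accept(self, timestamp, interesting):
        """Return True to act on the message, False to drop it silently."""
        if (self.last_interesting_ts is not None
                and timestamp < self.last_interesting_ts):
            return False  # arrived late; a newer state is already known
        if interesting:
            self.last_interesting_ts = timestamp
        return True


if __name__ == "__main__":
    f = StaleAlertFilter()
    # (timestamp, is this an interesting state such as ERROR?)
    for ts, interesting in [(100, True), (90, False), (120, True), (110, True)]:
        print(ts, "accepted" if f.accept(ts, interesting) else "dropped")
```

A message whose timestamp equals t is kept, matching the strict `< t` comparison in the text.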
In BE mode, currently, events should arrive in order but it’s possible they could be delivered to different collectors, if you have more than one. You have to be aware of the possibility that events could be arbitrarily delayed, as well, although the delay you see for BE should be less than for DFO or E2E (i.e. events are usually delivered quickly, or not at all).
Last edited by BertrandDechoux,