Operation and Configuration

Operations

Hardware

Luxun can run on an ordinary PC with little memory. For production use, however, we recommend a server-grade machine with at least 4GB of memory, and preferably more than 8GB, so that active log pages can be cached for better performance.

Luxun throughput is limited mainly by disk and network bandwidth, so for better performance we recommend a very fast disk. A 1Gbps network is adequate and a 10Gbps network is recommended. Where network bandwidth is limited, compression can improve network utilization considerably.

OS

Luxun has been tested on Windows 7 with the NTFS filesystem, and on CentOS 6.3 and Ubuntu 12.04 with the ext4 filesystem. Luxun performed well on all of these platforms.

Java

Luxun has been tested on Oracle JDK 1.6.x.

>java -version
java version "1.6.0_38"
Java(TM) SE Runtime Environment (build 1.6.0_38-b05)
Java HotSpot(TM) 64-Bit Server VM (build 20.13-b02, mixed mode)

In our testing, Luxun ran smoothly even without additional JVM heap settings. You may give Luxun more heap memory depending on your application. To set JVM options, set the JAVA_OPTS environment variable before running the Luxun server script, for example JAVA_OPTS=-Xmx4g, or edit the Luxun server script directly.
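To confirm that a heap setting passed through JAVA_OPTS actually reached the JVM, a small check like the one below can be run with the same options. This is only a minimal sketch: the class name HeapCheck is hypothetical and not part of Luxun.

```java
// Hypothetical helper, not part of Luxun: reports the maximum heap
// size that the running JVM is allowed to use.
class HeapCheck {
    /** Returns the JVM's maximum heap size in megabytes. */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Example invocation: java -Xmx4g HeapCheck
        System.out.println("Max heap (MB): " + maxHeapMb());
    }
}
```

The reported value is usually slightly below the -Xmx figure, because the JVM reserves part of the heap for its own bookkeeping.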

Configuration

Note, according to our testing:

The most important server configuration for performance is the disk flush rate. The more often data is flushed to disk, the more "seek-bound" Luxun becomes and the lower the throughput. We recommend disabling flush on the Luxun broker: in most cases an explicit flush is unnecessary because Luxun uses memory-mapped files internally. Enable flush only if you need transactional reliability and accept the performance cost. Flush behavior can be set either as a timeout (for example, flush at most every 30 seconds) or as a number of messages (for example, flush every 1000 messages), and it can be overridden at the topic level if needed.
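The two flush modes described above might be expressed as broker properties along the following lines. This is a sketch under assumptions: the property names come from the broker property table on this page, the 1000 ms scheduler interval is an illustrative value chosen so that the flush intervals are multiples of it, and the class FlushSettingsSketch is hypothetical. How the properties are handed to the broker (for example through a properties file) is not shown.

```java
import java.util.Properties;

// Hypothetical sketch, not part of Luxun: builds the flush-related
// broker properties for the two modes described in the text.
class FlushSettingsSketch {
    /** Recommended mode: no explicit flush (these are the documented defaults). */
    static Properties flushDisabled() {
        Properties p = new Properties();
        p.setProperty("log.flush.count", "-1");
        p.setProperty("log.default.flush.scheduler.interval.ms", "-1");
        return p;
    }

    /** Explicit flush: every 1000 messages or at most every 30 seconds. */
    static Properties flushEnabled() {
        Properties p = new Properties();
        p.setProperty("log.flush.count", "1000");
        // Assumed value: the background check runs once per second.
        p.setProperty("log.default.flush.scheduler.interval.ms", "1000");
        // Should be a multiple of the scheduler interval above.
        p.setProperty("log.default.flush.interval.ms", "30000");
        // Per-topic overrides, in the topic:millis format from the table.
        p.setProperty("topic.flush.intervals.ms", "topic1:10000,topic2:20000");
        return p;
    }

    public static void main(String[] args) {
        System.out.println(flushDisabled());
        System.out.println(flushEnabled());
    }
}
```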

The most important client configurations for performance are:

compression
sync vs async producing
batch size
fetch size

You may adjust these configurations according to your real environment.
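As an illustration of the four client settings listed above, the sketch below collects them in java.util.Properties objects. The property names and default values are taken from the producer and consumer tables on this page; the class ClientTuningSketch and the broker list and group values are hypothetical placeholders. Passing the properties to ProducerConfig or ConsumerConfig is left out because their constructors are not described here.

```java
import java.util.Properties;

// Hypothetical sketch, not part of Luxun: gathers the client settings
// that matter most for performance.
class ClientTuningSketch {
    /** Producer settings: GZip compression plus asynchronous, batched sends. */
    static Properties producerProps(String brokerList) {
        Properties p = new Properties();
        p.setProperty("broker.list", brokerList);  // required
        p.setProperty("compression.codec", "1");   // 0 = none, 1 = GZip, 2 = Snappy
        p.setProperty("producer.type", "async");   // default is sync
        p.setProperty("batch.size", "200");        // documented default
        return p;
    }

    /** Consumer settings: fetch size in bytes per request. */
    static Properties consumerProps(String brokerList, String groupId) {
        Properties p = new Properties();
        p.setProperty("broker.list", brokerList);  // required
        p.setProperty("groupid", groupId);         // required
        p.setProperty("fetch.size", String.valueOf(1024 * 1024)); // documented default
        return p;
    }

    public static void main(String[] args) {
        // "0:localhost:9092" and "group1" are placeholder values.
        System.out.println(producerProps("0:localhost:9092"));
        System.out.println(consumerProps("0:localhost:9092", "group1"));
    }
}
```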

Important configuration properties for Luxun broker

More details about server configuration can be found in the class com.leansoft.luxun.server.ServerConfig.

name	default	description
brokerid	none	Each broker is uniquely identified by an id. This id serves as the broker's "name" and allows the broker to be moved to a different host/port without confusing consumers.
port	9092	the port to listen and accept connections on
monitoring.period.secs	600	the interval in which to measure performance statistics
log.dir	none	Specifies the root directory in which all log data is kept.
log.flush.count	-1	Controls the number of messages accumulated in memory for each topic before the data is flushed to disk. Note: an explicit flush is usually not needed because 1) the underlying log automatically flushes a cached page when it is evicted, and 2) the underlying log uses memory-mapped files, so the OS flushes the changes even if your process crashes. Set this property to a positive number only if you need transactional reliability and accept the performance cost.
log.retention.hours	24 * 7	The number of hours to keep a log file before deleting it
log.retention.size	-1	The maximum size of the log before it is deleted. This controls how large a log is allowed to grow; a negative number means no size limit.
topic.log.retention.hours	none	Topic-specific retention time that overrides log.retention.hours, e.g., topic1:10,topic2:20.
log.cleanup.interval.mins	10	Controls how often the log cleaner checks logs eligible for deletion. A log file is eligible for deletion if it hasn't been modified for log.retention.hours hours.
log.default.flush.scheduler.interval.ms	-1	Controls the interval at which logs are checked to see whether they need to be flushed to disk. A background thread runs at the frequency specified by this parameter and checks each log to see whether it has exceeded its flush.interval time; if so, the log is flushed. Note: an explicit flush is usually not needed because 1) the underlying log automatically flushes a cached page when it is evicted, and 2) the underlying log uses memory-mapped files, so the OS flushes the changes even if your process crashes. Set this property to a positive number only if you need transactional reliability and accept the performance cost.
log.default.flush.interval.ms	log.default.flush.scheduler.interval.ms	Controls the maximum time that a message in any topic is kept in memory before it is flushed to disk. The value only makes sense if it is a multiple of log.default.flush.scheduler.interval.ms. Note: an explicit flush is usually not needed because 1) the underlying log automatically flushes a cached page when it is evicted, and 2) the underlying log uses memory-mapped files, so the OS flushes the changes even if your process crashes. Set this property to a positive number only if you need transactional reliability and accept the performance cost.
topic.flush.intervals.ms	none	Per-topic overrides for log.default.flush.interval.ms. Controls the maximum time that a message in the selected topics is kept in memory before it is flushed to disk. A per-topic value only makes sense if it is a multiple of log.default.flush.scheduler.interval.ms. E.g., topic1:10000,topic2:20000. Note: an explicit flush is usually not needed because 1) the underlying log automatically flushes a cached page when it is evicted, and 2) the underlying log uses memory-mapped files, so the OS flushes the changes even if your process crashes. Set this property to a positive number only if you need transactional reliability and accept the performance cost.
max.message.size	1024 * 1024	maximum size of a single message that the server can receive
log.backfile.page.size	128 * 1024 * 1024	Controls the maximum size of a single log page file.
password	none	Authentication password for server administration, such as topic deleting.
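The per-topic override properties in the table above (topic.log.retention.hours and topic.flush.intervals.ms) share a topic:value,topic:value format. The sketch below shows one way such a string could be parsed and combined with the global default. It is an illustration only, not Luxun's actual parsing code, and the class TopicOverrideSketch is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not Luxun's parser: reads "topic1:10,topic2:20"
// style overrides and falls back to the global default.
class TopicOverrideSketch {
    /** Documented default for log.retention.hours (24 * 7). */
    static final int DEFAULT_RETENTION_HOURS = 24 * 7;

    /** Parses a comma-separated list of topic:value pairs. */
    static Map<String, Integer> parse(String spec) {
        Map<String, Integer> result = new LinkedHashMap<String, Integer>();
        if (spec == null || spec.trim().length() == 0) {
            return result; // no overrides configured
        }
        for (String entry : spec.split(",")) {
            String[] parts = entry.trim().split(":");
            if (parts.length != 2) {
                throw new IllegalArgumentException("Expected topic:value but got: " + entry);
            }
            result.put(parts[0].trim(), Integer.valueOf(parts[1].trim()));
        }
        return result;
    }

    /** Returns the topic-specific retention, or the global default if none is set. */
    static int retentionHours(Map<String, Integer> overrides, String topic) {
        Integer value = overrides.get(topic);
        return value != null ? value.intValue() : DEFAULT_RETENTION_HOURS;
    }

    public static void main(String[] args) {
        Map<String, Integer> overrides = parse("topic1:10,topic2:20");
        System.out.println(retentionHours(overrides, "topic1"));
        System.out.println(retentionHours(overrides, "topic3"));
    }
}
```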

Important configuration properties for the high-level consumer.

More details about consumer configuration can be found in the class com.leansoft.luxun.consumer.ConsumerConfig.

name	default	description
groupid	none	Also known as fanoutid. A string that uniquely identifies a set of consumers within the same consumer group. Must be set before a consumer can work.
broker.list	none	A list of brokers to consume from, in the format brokerid1:host1:port1,brokerid2:host2:port2. Must be set before a consumer can work.
consumerid	none	Optional consumer identifier
socket.timeout.ms	30 * 1000	controls the socket timeout for network requests
fetch.size	1024 * 1024	controls the number of bytes of messages to attempt to fetch in one request to the Luxun server
fetcher.backoff.ms	1000	Avoids repeatedly polling a broker node that has no new data. The fetcher backs off for this time period each time it gets an empty set from the broker.
fetcher.backoff.ms.max	fetcher.backoff.ms * 10	Maximum backoff time window
queuedchunks.max	100	the high level consumer buffers the messages fetched from the server internally in blocking queues. This parameter controls the size of those queues
consumer.timeout.ms	-1	By default, this value is -1 and a consumer blocks indefinitely if no new message is available for consumption. By setting the value to a positive integer, a timeout exception is thrown to the consumer if no message is available for consumption after the specified timeout value.
num.retries	0	If a consumption exception occurs, the number of times the underlying fetcher thread retries before it exits. A negative number or zero means no retry.
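The broker.list value uses the format brokerid1:host1:port1,brokerid2:host2:port2. The sketch below splits such a value into its parts, which can help when validating a configuration before starting a client. It is not Luxun's own parser; the classes BrokerListSketch and BrokerAddress are hypothetical, the broker id is kept as a string because its type is not stated here, and hosts are assumed not to contain colons.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Luxun's parser: splits a broker.list value
// into (id, host, port) entries.
class BrokerListSketch {
    /** One entry of the broker list. */
    static final class BrokerAddress {
        final String id;
        final String host;
        final int port;

        BrokerAddress(String id, String host, int port) {
            this.id = id;
            this.host = host;
            this.port = port;
        }
    }

    /** Parses "id1:host1:port1,id2:host2:port2" into a list of addresses. */
    static List<BrokerAddress> parse(String brokerList) {
        List<BrokerAddress> brokers = new ArrayList<BrokerAddress>();
        for (String entry : brokerList.split(",")) {
            String[] parts = entry.trim().split(":");
            if (parts.length != 3) {
                throw new IllegalArgumentException("Expected id:host:port but got: " + entry);
            }
            brokers.add(new BrokerAddress(parts[0], parts[1], Integer.parseInt(parts[2])));
        }
        return brokers;
    }

    public static void main(String[] args) {
        // Placeholder hosts; 9092 is the documented default broker port.
        for (BrokerAddress b : parse("1:host1:9092,2:host2:9092")) {
            System.out.println(b.id + " -> " + b.host + ":" + b.port);
        }
    }
}
```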

Important configuration properties for the producer.

More details about producer configuration can be found in the class com.leansoft.luxun.producer.ProducerConfig.

name	default	description
connect.timeout.ms	5000	the maximum time spent by com.leansoft.luxun.producer.SyncProducer trying to connect to the luxun broker. Once it elapses, the producer throws an ERROR and stops.
socket.timeout.ms	30000	The socket timeout in milliseconds.
reconnect.count	30000	the number of produce requests after which com.leansoft.luxun.producer.SyncProducer tears down the socket connection to the broker and establishes it again.
reconnect.time.interval.ms	1000 * 1000 * 10	the time window after which com.leansoft.luxun.producer.SyncProducer tears down the socket connection to the broker and establishes it again.
max.message.size	1000 * 1000	the maximum number of bytes that the com.leansoft.luxun.producer.SyncProducer can send as a single message payload
serializer.class	com.leansoft.luxun.serializer.DefaultEncoder. This is a no-op encoder. The serialization of data to Message should be handled outside the Producer	class that implements the com.leansoft.luxun.serializer.Encoder interface, used to encode data of type T into a Luxun message
broker.list	none	A list of brokers to produce messages to, in the format brokerid1:host1:port1,brokerid2:host2:port2. Must be set before a producer can work.
compression.codec	0 (No compression)	Specifies the compression codec for all data generated by this producer. Use 0 for no compression, 1 for GZip compression, 2 for Snappy compression.
compressed.topics	none	This parameter allows you to set whether compression should be turned on for particular topics. If the compression codec is anything other than NO_COMPRESSION, enable compression only for specified topics if any. If the list of compressed topics is empty, then enable the specified compression codec for all topics. If the compression codec is NO_COMPRESSION, compression is disabled for all topics.
producer.type	sync	Specifies whether messages are sent asynchronously. Valid values are async, for asynchronous batching send through com.leansoft.luxun.producer.async.AsyncProducer, and sync, for synchronous send through com.leansoft.luxun.producer.SyncProducer.
partitioner.class	com.leansoft.luxun.producer.DefaultPartitioner<T>, which uses the partitioning strategy hash(key)%num_brokers. If key is null, it picks a random broker.	class that implements the com.leansoft.luxun.producer.IPartitioner<K> interface, used to supply a custom partitioning strategy on the message key (of type K) that is specified through the ProducerData<K, V> object in the com.leansoft.luxun.producer.Producer<K, V> send API
Options for Asynchronous Producers (producer.type=async)
queue.time	5000	maximum time, in milliseconds, for buffering data on the producer queue. After it elapses, the buffered data in the producer queue is dispatched to the com.leansoft.luxun.producer.async.EventHandler.
queue.size	1000	the maximum size of the blocking queue for buffering on the com.leansoft.luxun.producer.async.AsyncProducer
queue.enqueueTimeout.ms	0	Controls the enqueue behaviour when no space is available in the blocking queue: 0 returns failure and drops the message without waiting; a negative number waits until space is available; a positive number waits up to the specified timeout before returning failure and dropping the message.
batch.size	200	the number of messages batched at the producer, before being dispatched to the com.leansoft.luxun.producer.async.EventHandler
event.handler	com.leansoft.luxun.producer.async.DefaultEventHandler<T>	the class that implements com.leansoft.luxun.producer.async.EventHandler<T> used to dispatch a batch of produce requests, using an instance of com.leansoft.luxun.producer.SyncProducer.
event.handler.props	none	the java.util.Properties() object used to initialize the custom event.handler through its init() API
callback.handler	none	the class that implements com.leansoft.luxun.producer.async<T> used to inject callbacks at various stages of the com.leansoft.luxun.producer.async.AsyncProducer pipeline.
callback.handler.props	none	the java.util.Properties() object used to initialize the custom callback.handler through its init() API
num.retries	0	If DefaultEventHandler is used, this specifies the number of times to retry if an error is encountered during send.
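The default partitioning strategy named in the partitioner.class row, hash(key)%num_brokers with a random broker for a null key, can be sketched as follows. This is a standalone illustration, not the DefaultPartitioner source: the class HashPartitionSketch is hypothetical, it does not implement IPartitioner because that interface's signature is not shown here, and using Object.hashCode() as the hash function is an assumption.

```java
import java.util.Random;

// Hypothetical sketch, not Luxun's DefaultPartitioner: picks a broker
// index using hash(key) % num_brokers, or at random for a null key.
class HashPartitionSketch {
    private static final Random RANDOM = new Random();

    /** Returns a broker index in the range [0, numBrokers). */
    static int choose(Object key, int numBrokers) {
        if (numBrokers <= 0) {
            throw new IllegalArgumentException("numBrokers must be positive");
        }
        if (key == null) {
            return RANDOM.nextInt(numBrokers); // no key: pick any broker
        }
        // Clear the sign bit so a negative hashCode still gives a valid index.
        return (key.hashCode() & 0x7fffffff) % numBrokers;
    }

    public static void main(String[] args) {
        System.out.println(choose("user-42", 3)); // same key always maps to the same broker
        System.out.println(choose(null, 3));      // random broker
    }
}
```

Because the index depends on the broker count, messages with the same key move to a different broker when brokers are added or removed.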
