New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ROCKETMQ-121]Support message filtering based on SQL92 #82
Conversation
1 similar comment
2 similar comments
@shroman @lizhanhui @lollipopjin Could you help us to review this great job ? |
public interface ConsumerIdsChangeListener { | ||
void consumerIdsChanged(final String group, final List<Channel> channels); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great refactoring
2 similar comments
|
||
import org.apache.rocketmq.common.BrokerConfig; | ||
import org.apache.rocketmq.common.constant.LoggerName; | ||
import org.apache.rocketmq.filter.util.BitsArray; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BitMap ? Could we use the roaring bitmap, https://github.com/RoaringBitmap/RoaringBitmap
|
||
ret = filterData.getCompiledExpression().evaluate(context); | ||
} catch (Throwable e) { | ||
log.error("Calc filter bit map error!commitLogOffset=" + request.getCommitLogOffset() + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use {} replace +
log.warn("Spend {} ms to calc bit map, consumerNum={}, topic={}", eclipseTime, filterDatas.size(), request.getTopic()); | ||
} | ||
} catch (Throwable e) { | ||
log.error("Calc bit map error! topic=" + request.getTopic() + ", offset=" + request.getCommitLogOffset() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Follow the previous comment
return this.deadTime >= this.bornTime; | ||
} | ||
|
||
public long deadHowLong() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is deadhowlong
} | ||
|
||
@Override | ||
public boolean equals(Object o) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we use ToStringBuilder in Commons Lang3 to build equal and hash override method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already replaced by EqualsBuilder, ToStringBuilder and HashCodeBuilder.
* Represents a constant expression | ||
* <p> | ||
* This class was taken from ActiveMQ org.apache.activemq.filter.ConstantExpression, | ||
* but: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great~
* limitations under the License. | ||
*/ | ||
|
||
/* Generated By:JavaCC: Do not edit this line. ParseException.java Version 5.0 */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why java Version 5.0
* significant. This allows a positive 32-bit number to be returned for all | ||
* cases. | ||
* <br>Don't change the order of algorithms, add new algorithm to last if you want. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recommended commons codec's hashing algorithms or guava hash
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'll check whether the third party's jar could meet the scene.
/** | ||
* Simple implement of bloom filter. | ||
*/ | ||
public class BloomFilter { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recommended existent mature BloomFilter in guava or hahoop‘s
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, I replaced HashAlgorithm to Guava Hashing.murmur3_128. Most classes of Guava's BloomFilter are private and final, which could not be included by RocketMQ, such as BitArray saved in store.So I didn't use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, murmur3_128
* @param <K> | ||
* @param <V> | ||
*/ | ||
public class LRUCache<K, V> extends LinkedHashMap<K, V> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we replace with guava CacheBuilder?
2 similar comments
2 similar comments
Will look into this PR today. |
I think only servtool‘s module has guava dependency, so there is no transitive dependency pollution on sdk, right? |
please @lizhanhui @shroman help to review this great PR :-) |
@vongosling yeah, only server's modules have guava dependency, client doesn't. |
@lizhanhui , Do you have any other thoughts about this PR? If no, I will merge this PR soon. And please @vsair help resolve the conflicting files. |
@zhouxinyu I have checked the major data flow and it looks good to me. I have not scrutinized the changes line by line yet. You may merge it first and we may discuss potential issues hereafter. |
1. Add filter module 2. Manage consumer filter expression 3. Support pre calculate filter result when build consume queue. 4. Check whether server support feature of sql when consumer start, maybe it's not the best solution (When network is not stable?).
1. Srvutil module include Guava 2. Replace LRUCache to CacheBuilder, replace HashAlgorithm to Guava Hashing.murmur3_128
2 similar comments
@zhouxinyu Conflicts have been resolved. Thanks. |
It seems that this PR could be merged now. @vongosling @zhouxinyu @lizhanhui |
I have two things to confirms,
|
Hi, Jaskey;
Thanks for your question.
1. It will decode the properties to do calculation at that case.
2. When they are changed, the consumer's subscription will be ignored when
starting, so the results int bloom filter also will be ignored. That is
mean server will do calculation again and not update the results already
existed in Store.
Thanks,
Vsair
…On Mon, Jun 12, 2017 at 2:47 PM, Jaskey ***@***.***> wrote:
@vsair <https://github.com/vsair>
I have two things to confirms,
1.
if the old messages which exists before consumer subscribe with the
filter expression must be filtered by decodng properties to do calculation ?
2.
`expectConsumerNumUseFilter = 32` and `maxErrorRateOfBloomFilter = 20` can be configured and changed, what if I change this value and restart, does the calculated results in the ConsumeQueueExt updated accordingly?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#82 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ACeujdZf1ZNbu-MycGfHW_kbIuCCvUE3ks5sDN8ZgaJpZM4Mk0bH>
.
|
Thanks for your clarification. I am just trying to go through the main logic of sql filter. For the point of
where is that code logic ? I search for usage of What I am concerned is that, if we change any configuration about the bloom filter, the bit result may be different as it was before changing configuration , so it will be different from the result persisted in the consume queue ext. I wonder that will broker works as usual to filter the expected messages as it did before. |
Support message filtering based on SQL92
So far, RocketMQ only support message filtering feature by
TAG
, but one message only can own one tag, this is too limited to meet complex business requirements.So, we want to define and implement a reasonable filter language based on a subset of the SQL 92 expression syntax to support customized message filtering.
Why subset of SQL92
Let RocketMQ has the ability of message filtering is the purpose of this issue, no matter SQL92 or any other languages.
As I know, ActiveMQ already impllement this functionality based on JavaCC, it's simple and exntensible.So I just extract it and integrate into RocketMQ, only some grammars:
Design
The implementation of SQL92 language is placed in this module which have dependency on common module.
Broker compile or evaluate expression through the interface of
FilterSpi
contained inFilterFactory
that manage allFilterSpi
and also support new one to register.Different from tag filtering, expression of SQL92 should be compiled first to check whether is leagal and then use the complied expression to compute. This procedure is designed to take place at broker.
ConsumerManager
manage the suscriptions of push consumer, andConsumerFilterManager
manage the expression info of push consumer who wish to filter message by special language, the info includes data version, expression, compiled expression, alive time and etc.I redesign the interface
getMessage
ofMessageStore
by replace the last parameterSubscriptionData
toMessageFilter
that is also refactored. The purpose is to make modulerocketmq-store
has no relation with protocol.When get message, the implementation
ExpressionMessageFilter
would check whether the message is matched byBitsArray
which will be refered later or evaluation, just as the mechanism of tag filtering.It's poor performance to do filter when pull message:
BloomFilter
and pre-calculation are adopted to optimize the situation:BloomFilter
when register to broker.CommitLog
, the consumer's filtering result would be calculated, and all resuls are assembled as aBitsArray
saved inConsumeQueueExt
.ConsumeQueueExt
is a store file linked toConsumeQueue
,ConsumeQueue
could find the data by thetagsCode
whitch is already replaced by the address(for compitable, the range is Long.MIN_VALUE to Integer.MIN_VALUE) generated byConsumeQueueExt
.ExpressionMessageFilter
could use theBitsArray
to check whether the message is matched. Because of BloomFilter's collision, it also need to decode properties to do calculation for matched message(may could be reduced by check the collision, not include in this edition).This optimization is suitable for:
Interface
Only push consumer could filter message by SQL92 expression in this edition.