-
Notifications
You must be signed in to change notification settings - Fork 116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
logical_count: improve memory usage #1341
logical_count: improve memory usage #1341
Conversation
load --table Logs_20150709 | ||
[ | ||
{"timestamp": "2015-02-05 13:49:00"} | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added this because the expected log #|d| [logical_count][select] <Logs_20150709>: no range index
is not written if Logs_20150709
is empty.
Currently, an empty table is skipped before logging that log.
https://github.com/groonga/groonga/pull/1341/files#diff-d3fb57a047ca92c9fc19d900aa1a906a31bc3519eb7c26927516b229705c27a4R290
load --table Logs_20150709 | ||
[ | ||
{"timestamp": "2015-02-05 13:49:00"} | ||
] | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added this because the expected log #|d| [logical_count][range-index] <Logs_20150709>: range index is available
is not written if Logs_20150709
is empty.
Currently, an empty table is skipped before logging that log.
https://github.com/groonga/groonga/pull/1341/files#diff-d3fb57a047ca92c9fc19d900aa1a906a31bc3519eb7c26927516b229705c27a4R290
load --table Logs_20150709 | ||
[ | ||
{"timestamp": "2015-02-05 13:49:00"} | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added this because the expected log #|d| [logical_count][select] <Logs_20150709>: covered
is not written if Logs_20150709
is empty.
Currently, an empty table is skipped before logging that log.
https://github.com/groonga/groonga/pull/1341/files#diff-d3fb57a047ca92c9fc19d900aa1a906a31bc3519eb7c26927516b229705c27a4R290
test/command/suite/sharding/logical_count/log/cover_type_all.expected
Outdated
Show resolved
Hide resolved
20279e0
to
86f788e
Compare
plugins/sharding/logical_count.rb
Outdated
raise InvalidArgument, message | ||
end | ||
counter = Counter.new(context) | ||
count_result = counter.count |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you rename this from total
?
It seems that count_result
isn't better than total
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have fixed it.
fcb3a6a
to
0df1ec3
Compare
I am testing performance but ready for review of implementation. |
plugins/sharding/logical_count.rb
Outdated
@range_index = range_index | ||
@table = shard.table | ||
class ShardCountContext < StreamExecuteContext | ||
attr_reader :order |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that we should be defined in StreamExecuteContext
.
How about passing order
to StreamExecuteContext#initialize
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please let me work on this in another PR because it also affects to logical_range_filter
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#1347 is merged
ensure_filtered | ||
|
||
if @range_index | ||
return count_n_records_in_range | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this done before return 0 if @filtered_result_sets.empty?
?
Can we return 0
here when we have no result sets?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is related to #1341 (comment).
We don't have any result set when @range_index is defined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have any result set when @range_index is defined.
In logical_count
, @range_index
is defined only when we don't need to execute filter
and we simply get index_cursor.count
(https://github.com/groonga/groonga/pull/1341/files#diff-d3fb57a047ca92c9fc19d900aa1a906a31bc3519eb7c26927516b229705c27a4R191)
plugins/sharding/logical_count.rb
Outdated
if @post_filter and @dynamic_columns.have_filtered? | ||
filtered_table = context.table.select_all | ||
def execute_filter(range_index) | ||
return if range_index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Umm... "execute_filter
does nothing if there is a range index" will confuse developers...
Can we use another approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have fixed it. How about the current approach?
- Removed this
return if range_index
. - Modified to check
@range_index
before filtering inShardCountContext.execute
.
cbb42a4
to
edb70ab
Compare
Performance test results. Condition:
Sorry for my lack of details but I cannot write detailed logs or a schema and so on because it is containing private information... case: Execution time
case:
Memory usage |
plugins/sharding/logical_count.rb
Outdated
unless @post_filter.nil? | ||
result_set = apply_post_filter(result_set) | ||
@temporary_tables << result_set | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move this to StreamShardExecutor
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have addressed this in another PR.
#1348
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#1348 is merged
This is for #1341. Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Fix test data datetime
Modify to return result before filtering when @range_index is not null in execute
- Make order attr sharable - Make postfilter sharable - Avoid needless referring
edb70ab
to
508aff1
Compare
end | ||
@contexts << ShardCountContext.new(shard, cover_type, range_index) | ||
class ShardCountExecutor < StreamShardExecutor | ||
include Loggable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be in StreamShardExecutor
because logger
is used in it.
Overall, I have changed
logical_count
to immediatelyunref
shards orclose
temporary objects that are no longer referred likelogical_range_filter
.TODO:
logical_range_filter
into a common classdetect_time_classify_types