Make text index query cache a configurable option #5176
Codecov Report
@@ Coverage Diff @@
## master #5176 +/- ##
=============================================
- Coverage 65.95% 44.50% -21.45%
=============================================
Files 1056 1057 +1
Lines 54131 54067 -64
Branches 8050 8050
=============================================
- Hits 35701 24065 -11636
- Misses 15787 27957 +12170
+ Partials 2643 2045 -598
_indexSearcher.setQueryCache(null);
if (textIndexProperties == null ||
    textIndexProperties.get(FieldConfig.LUCENE_TEXT_INDEX_ENABLE_QUERY_CACHE) == null ||
    textIndexProperties.get(FieldConfig.LUCENE_TEXT_INDEX_ENABLE_QUERY_CACHE).equalsIgnoreCase("false")) {
I'd suggest using !textIndexProperties.get(FieldConfig.LUCENE_TEXT_INDEX_ENABLE_QUERY_CACHE).equalsIgnoreCase("true") instead of comparing against "false", in case someone misspells the value.
done
public static String VAR_LENGTH_DICTIONARY_COLUMN_KEY = "field.config.var.length.dictionary";

// Lucene index properties
public static String LUCENE_TEXT_INDEX_REALTIME_READER_REFRESH_KEY = "field.config.text.index.realtime.reader.refresh";
public static String LUCENE_TEXT_INDEX_ENABLE_QUERY_CACHE = "field.config.text.index.enable.query.cache";
Put a comment above this line on how this cache will be used.
done
@@ -160,7 +162,11 @@ private void createSegment()
private void loadSegment()
    throws Exception {
  IndexLoadingConfig indexLoadingConfig = new IndexLoadingConfig();
  indexLoadingConfig.setTextIndexColumns(new HashSet<>(textIndexColumns));
  Map<String, Map<String, String>> textIndexColumnsWithProperties = new HashMap<>();
  for (String column : textIndexColumns) {
putIfAbsent
This change is not needed anymore.
public static String VAR_LENGTH_DICTIONARY_COLUMN_KEY = "field.config.var.length.dictionary";

// Lucene index properties
public static String LUCENE_TEXT_INDEX_REALTIME_READER_REFRESH_KEY = "field.config.text.index.realtime.reader.refresh";
Not related to this, but can we remove the field.config. prefix from all these keys?
done
@@ -45,7 +45,7 @@
private ReadMode _readMode = ReadMode.DEFAULT_MODE;
private List<String> _sortedColumns = Collections.emptyList();
private Set<String> _invertedIndexColumns = new HashSet<>();
private Set<String> _textIndexColumns = new HashSet<>();
private Map<String, Map<String, String>> _textIndexColumns = new HashMap<>();
Don't put the whole map here. The map contains all the properties, not just those for text columns. I think you can keep this config unchanged and check the field config when loading the text index.
Done. Btw, FieldConfig contains a properties map Map<String, String> per column
Do we still need this?
6dcae80 to ec311dc
@Jackie-Jiang @jackjlli, I have addressed the review comments. Please take another look.
LGTM otherwise
// Disable Lucene query result cache. While it helps a lot with performance for
// repeated queries, on the downside it cause heap issues.
_indexSearcher.setQueryCache(null);
if (textIndexProperties == null || textIndexProperties.get(FieldConfig.TEXT_INDEX_ENABLE_QUERY_CACHE) == null
- if (textIndexProperties == null || textIndexProperties.get(FieldConfig.TEXT_INDEX_ENABLE_QUERY_CACHE) == null
+ if (textIndexProperties == null || !Boolean.parseBoolean(textIndexProperties.get(FieldConfig.TEXT_INDEX_ENABLE_QUERY_CACHE))) {
done
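To illustrate why the suggested check is safer, here is a minimal sketch comparing the two ways of reading the flag. The class name, the method names, and the key string are assumptions made for illustration; the PR itself reads the value through FieldConfig.TEXT_INDEX_ENABLE_QUERY_CACHE. Only Boolean.parseBoolean and String.equalsIgnoreCase are real JDK calls here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not code from the PR: contrasts the two flag checks.
class QueryCacheFlag {
  // Assumed key string, for illustration only.
  static final String ENABLE_QUERY_CACHE_KEY = "text.index.enable.query.cache";

  // Original check: the cache stays off only when the properties or the value
  // are missing, or the value is literally "false" (ignoring case).
  static boolean enabledByFalseCheck(Map<String, String> props) {
    if (props == null) {
      return false;
    }
    String value = props.get(ENABLE_QUERY_CACHE_KEY);
    return value != null && !value.equalsIgnoreCase("false");
  }

  // Suggested check: the cache turns on only for "true" (ignoring case).
  // Boolean.parseBoolean returns false for null and for any other string.
  static boolean enabledByParseBoolean(Map<String, String> props) {
    return props != null && Boolean.parseBoolean(props.get(ENABLE_QUERY_CACHE_KEY));
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    props.put(ENABLE_QUERY_CACHE_KEY, "flase"); // misspelled "false"
    System.out.println("false-check enables cache: " + enabledByFalseCheck(props));
    System.out.println("parseBoolean enables cache: " + enabledByParseBoolean(props));
  }
}
```

With a misspelled value such as "flase", the original check would turn the cache on, while the parseBoolean version keeps it off, which matches the intended default of a disabled cache.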
@@ -60,6 +60,9 @@
private boolean _isDirectRealtimeOffheapAllocation;
private boolean _enableSplitCommitEndWithMetadata;

// constructed from FieldConfig
private Map<String, Map<String, String>> _columnsWithProperties;
Suggest renaming it to _columnProperties. Also, don't set it inside extractTextIndexColumnsFromTableConfig(); set it in extractFromTableConfig() so that other index types can also access it.
done
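As a rough sketch of the shape this review suggestion points to, the per-column properties can be collected once, independent of index type, and then looked up by whichever index loader needs them. Everything below is hypothetical: ColumnConfig is a stand-in for the per-column field config, and the method names and the property key in the usage are illustrative, not Pinot's actual API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not code from the PR.
class ColumnPropertiesSketch {
  // Stand-in for a per-column field config carrying a properties map.
  static final class ColumnConfig {
    final String name;
    final Map<String, String> properties; // may be null when nothing is configured

    ColumnConfig(String name, Map<String, String> properties) {
      this.name = name;
      this.properties = properties;
    }
  }

  // Built once for all columns so that any index type can consult it.
  static Map<String, Map<String, String>> extractColumnProperties(List<ColumnConfig> configs) {
    Map<String, Map<String, String>> columnProperties = new HashMap<>();
    for (ColumnConfig config : configs) {
      if (config.properties != null) {
        columnProperties.put(config.name, config.properties);
      }
    }
    return columnProperties;
  }

  // Returns null when the column or the key is not configured.
  static String getProperty(Map<String, Map<String, String>> columnProperties, String column, String key) {
    return columnProperties.getOrDefault(column, Collections.<String, String>emptyMap()).get(key);
  }

  public static void main(String[] args) {
    List<ColumnConfig> configs = Arrays.asList(
        new ColumnConfig("logLine", Collections.singletonMap("text.index.enable.query.cache", "true")),
        new ColumnConfig("userId", null));
    Map<String, Map<String, String>> columnProperties = extractColumnProperties(configs);
    System.out.println(getProperty(columnProperties, "logLine", "text.index.enable.query.cache"));
    System.out.println(getProperty(columnProperties, "userId", "text.index.enable.query.cache"));
  }
}
```

The text index loader would then read only its own keys from this map, leaving the set of text index columns unchanged.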
During the design of the text index feature, we ran some heap overhead experiments. From those we determined that while Lucene's internal query cache significantly helps performance for repeated queries, it adds heap overhead. It was therefore disabled during index loading.
It is worthwhile to make this configurable on a per-index basis so that users can enable it. They need to be aware of the downside: the additional heap overhead could negate the performance improvement through more GC.
As part of ongoing internal user acceptance testing, we learned that most of the queries are repeated: the UI more or less keeps the text search filter constant and tweaks the other filters. In such cases users can benefit from the performance improvement of enabling the query cache, and if the heap overhead is not significant, they might want to keep it enabled.
Note: The cache is still disabled by default, so existing behavior is unchanged.