New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ISPN-6395 Unify clustered queries with non clustered queries #5600
Conversation
88c95b2
to
b56a134
Compare
b56a134
to
1a3ee49
Compare
Updated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minimal stuff
ConfigurationBuilder configurationBuilder = new ConfigurationBuilder(); | ||
configurationBuilder.clustering().cacheMode(CacheMode.LOCAL); | ||
configurationBuilder.indexing().index(Index.ALL) | ||
.addProperty("default.directory_provider", "ram"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/ram/local-heap/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Muscle memory :)
* | ||
* @since 9.2 | ||
*/ | ||
public enum IndexedQueryMode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we also have an AUTO mode which chooses the best strategy depending on the underlying indexing mode ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 I was planning to add it later (another PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for introducing an AUTO mode. I think that mode should become our default. Do we actually need a value called DEFAULT?
I think the one we now call DEFAULT should be named differently. We need to find a better name that describes in a word that: 'it is executed in a single phase in the caller'. What is the opposite of clustered/distributed? Can't find a good name right now. :)
And then we could also add a DEFAULT enum value, for convenience, that is just an alias to one of the other constants (and warn users that the DEFAULT might be different in next major ispn version, as we might invent new query mechanisms). :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AUTO will come in a later PR, I possibly will be able to squeeze it in before the year ends.
Suggestions for renaming DEFAULT ?
- SINGLE_PHASE
- SINGLE_STEP
- LOCAL
- NON_BROADCAST :)
- CALLER
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been thinking and all the names above have issues, apart for NON_BROADCAST, but this is silly. I'll go with FETCH which I believe reflects accurately the fact that the query is run locally and will read the index, potentially from remote nodes.
1a3ee49
to
99e1ed1
Compare
Updated again. I removed one commit which was just about refactoring the REST search related tests, as broadcast support for REST is not 100% yet. |
99e1ed1
to
40ef6f6
Compare
updated one more time. Removed a "implements Serializable" left behind. |
I'll have a look again also. |
CI seems unstable. Triggering another build |
* Creates a Query based on an Ickle query string | ||
* @param queryMode the {@link IndexedQueryMode} dictating the indexed query execution mode if applicable. | ||
*/ | ||
Query create(String queryString, IndexedQueryMode queryMode); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm looking for the way to build a QueryBuilder-DSL based query and also specify IndexedQueryMode but could not find it. Maybe I'm missing something.
And another thing, I'm not sure whether we need to specify IndexedQueryMode at query creation time; maybe it's better to specify it at execution time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not touch the DSL, I though it'd be in maintenance mode only?
WRT IndexedQueryMode at execution time, I've chosen at creation time because:
- It's how it's working now: SearchManager.getQuery vs SearchManager.getClusteredQuery
- I tried to avoid having to change the API everywhere, i.e.,
query.list(mode), query.iterator(mode), query.getResultSize(mode)
for each of the query types
CI is green! |
return firstResult; | ||
} | ||
|
||
public void sort(Sort sort) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
setSort? to match with getSort.
return hsQuery; | ||
} | ||
|
||
public void setMaxResults(int maxResults) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd keep getter-setter pairs like getMaxResults/setMaxResults and getFirstResult/setFirstResult together without mising them with other methods like sort
.
Updated again. Exposed the |
|
||
public HSQuery getHsQuery(AdvancedCache<?, ?> cache) { | ||
if (hsQuery == null) { | ||
QueryEngine queryEngine = cache.getComponentRegistry().getComponent(EmbeddedQueryEngine.class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to pass the EmbeddedQueryEngine somehow to QueryDefinition instead of grabbing it from the component registry? Maybe in the QueryDefinition(String queryString)
constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately no, the QueryDefinition
is what is broadcast, so the serialization layers is responsible to construct it.
Ok. I'm still looking here at some aspects. Please do not merge it yet. |
133ecd9
to
a21eda6
Compare
Addressed reviews, changed |
return super.maxResults(maxResults); | ||
} | ||
|
||
@Override | ||
public CacheQuery<E> firstResult(int firstResult) { | ||
this.firstResult = firstResult; | ||
this.queryDefinition.setFirstResult(firstResult); | ||
return this; | ||
} | ||
|
||
@Override | ||
public CacheQuery<E> sort(Sort sort) { | ||
this.sort = sort; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the sort
field used at all? I see it being set but don't see it used anywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sort
is used in the ctor of DistributedIterator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad, was looking at the wrong branch. sort
is unused, I'll remove it
} | ||
|
||
public void setNamedParameters(Map<String, Object> params) { | ||
if (params != null) params.forEach(this.namedParameters::put); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is forEach better that putAll? I suppose we also need to take care of the case when params are null.
if (params == null) {
namedParameters.clear();
} else {
namedParameters.putAll(params);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
if (parsingResult.hasGroupingOrAggregations()) { | ||
throw log.groupAggregationsNotSupported(); | ||
} | ||
LuceneQueryParsingResult luceneParsingResult = transformParsingResult(parsingResult, EMPTY_MAP); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replacing EMPTY_MAP with emptyMap() spares of a generics warning.
a21eda6
to
11dcad8
Compare
Updated |
@@ -97,10 +97,14 @@ protected ModelFactory getModelFactory() { | |||
return cache; | |||
} | |||
|
|||
protected int getNodesCount() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is used. The derived class overrides createCacheManagers
anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, let me sort this out
output.writeUTF(object.getQueryString().get()); | ||
} else { | ||
output.writeBoolean(false); | ||
output.writeObject(object.getHsQuery(null)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand how this can work with null
cache parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ops, let me clean this up
e790c49
to
f926c83
Compare
addressed latest reviews |
} | ||
|
||
public HSQuery getHsQuery() { | ||
return hsQuery; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling this method should throw an IllegalStateException
if hsQuery
is null
, ie initialize(...)
was not called prior to this.
@@ -57,11 +60,26 @@ public CacheQueryImpl(Query luceneQuery, SearchIntegrator searchFactory, Advance | |||
cache, keyTransformationHandler); | |||
} | |||
|
|||
public CacheQueryImpl(Query luceneQuery, SearchIntegrator searchFactory, AdvancedCache<?, ?> cache, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the constructors in this class is no longer used.
@@ -59,10 +61,12 @@ private QueryRequest getQueryRequest() throws IOException { | |||
if (request.method() == HttpMethod.GET) { | |||
String queryString = getParameterValue(QUERY_STRING); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The whole body of this if
statement would look better if extracted as a separate getQueryFromString method in the spirit of getQueryFromJSON
public ClusteredQueryCommandWorker getCommand(Cache<?, ?> cache, HSQuery query, UUID lazyQueryId, | ||
int docIndex) { | ||
public ClusteredQueryCommandWorker getCommand(Cache<?, ?> cache, QueryDefinition queryDefinition, UUID lazyQueryId, | ||
int docIndex) { | ||
ClusteredQueryCommandWorker command = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Declaration and initialization of command
can be done in same line.
f926c83
to
967d456
Compare
967d456
to
517c9f2
Compare
} else { | ||
queryDefinition.initialize(cache); | ||
HSQuery hsQuery = queryDefinition.getHsQuery(); | ||
CacheQuery cacheQuery = new CacheQueryImpl<>(hsQuery, cache, keyTransformationHandler); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two line can become return new CacheQueryImpl<E>(hsQuery, cache, keyTransformationHandler);
to avoid the warning.
I'm happy with this refactoring. I can still spot some small design issues that we can fix later. The existence of RemoteQueryDefinition and HsQueryRequest seems to be a symptom of misplaced responsibility. It all starts with QueryDefinition.initialize, which IMO should actually be placed inside QueryEngine, not QueryDefinition. Doing that refactoring will remove the need for RemoteQueryDefinition, which now exists just to differentiate between embedded and remote case, but that differentiation can be done inside the query engine itself. Also, HsQueryRequest is just a data holder that carries the return value of QueryEngine.createHsQuery. If QueryDefinition.initialize is moved to QueryEngine we would also not need this anymore. I did not think about it in detail but maybe we would also need to make QueryDefinition mutable for QueryEngine and immutable for external parties. In that case we can extract QueryDefinition as an immutable interface (exposing getters only) and it's implementation class could have package local setters accessible to QueryEngine only. But let's leave those improvements for another day. I'll merge this today as it is after you have applied the last 2-3 minor changes I suggested. Thanks @gustavonalle ! |
517c9f2
to
6b313dd
Compare
Updated |
@anistor Created https://issues.jboss.org/browse/ISPN-8628 to further refactor it |
Integrated in master. Thanks @gustavonalle ! |
1 similar comment
Integrated in master. Thanks @gustavonalle ! |
LuceneQueryParsingResult luceneParsingResult = transformParsingResult(parsingResult, nameParameters); | ||
org.apache.lucene.search.Query luceneQuery = makeTypeQuery(luceneParsingResult.getQuery(), luceneParsingResult.getTargetEntityName()); | ||
SearchIntegrator searchFactory = getSearchFactory(); | ||
HSQuery hsQuery = metadata == null ? searchFactory.createHSQuery(luceneQuery) : searchFactory.createHSQuery(luceneQuery); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe here you intended to write HSQuery hsQuery = metadata == null ? searchFactory.createHSQuery(luceneQuery) : searchFactory.createHSQuery(luceneQuery, metadata);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed this here: 29e92ee
https://issues.jboss.org/browse/ISPN-6395
Preview, just to gather feedback (from @anistor mainly) on the backwards compatible API changes.This PR introduces a
IndexQueryQueryMode
that specifies how an indexed query is executed. BROADCAST mean the query is sent to all nodes (and results aggregated), whileCALLERFETCH (the default) executes the query directly (thus needs a distributed index to work).Why is this relevant? Because with BROADCAST, each node can index its own data on its own index, so both indexing and querying is way more scalable than having a single global index handled by the
InfinispanIndexManager