SOLR-4587: integrate lucene-monitor into solr #2382

Draft
wants to merge 37 commits into
base: main
Changes from 20 commits
Commits
bba4191
integrate lucene-monitor into solr
kotman12 Mar 30, 2024
fe3b413
move MonitorDataValues and check in license
kotman12 Apr 1, 2024
0989f1c
update versions.lock
kotman12 Apr 1, 2024
62ddcbc
add package-info to monitor packages
kotman12 Apr 1, 2024
33eccf7
extract helper method
kotman12 Apr 1, 2024
fa7fb40
apply errorprone suggestions
kotman12 Apr 3, 2024
11a1138
implement highlight matches
kotman12 Apr 4, 2024
7526b2f
AggregatingMatcher -> MatchesAggregator
kotman12 Apr 4, 2024
6ac7a5e
make monitor query cache optional
kotman12 Apr 4, 2024
2362ce7
move manySegmentsTest to ParallelMonitorSolrQueryTest
kotman12 Apr 4, 2024
e5a1382
call CandidateMatcher directly
kotman12 Apr 5, 2024
b7b58b3
remove doc forwarding callback
kotman12 Apr 6, 2024
ce40d60
instantiate decoder in outer loop
kotman12 Apr 9, 2024
a201676
ignore score for relevant match types
kotman12 Apr 12, 2024
60b0eb5
read MAX_SIZE_PARAM for maxSize
kotman12 Apr 12, 2024
fd04c4e
don't drop cause
kotman12 Apr 12, 2024
32cb6f6
remove superstitious delete calls
kotman12 Apr 12, 2024
b6d369b
add testDeleteByQueryId
kotman12 Apr 12, 2024
0ab5a6b
enable setting maxRamMB for monitor cache
kotman12 Apr 13, 2024
0b0120e
hardcoding luceneMatchVersion is bad
kotman12 Apr 15, 2024
24c1bf5
add multi-pass presearcher and optional field aliasing
kotman12 Apr 19, 2024
ee2992d
more accurate error
kotman12 Apr 19, 2024
6815ea1
redundant override
kotman12 Apr 19, 2024
995cfa2
wrap reserved field with _ and remove override behavior
kotman12 Apr 24, 2024
a2419ff
validate MonitorFields.RESERVED_MONITOR_FIELDS in schema
kotman12 Apr 24, 2024
0fde7ef
stricter validations of required fields
kotman12 Apr 27, 2024
5968754
narrow scope of __anytokenfield validation
kotman12 Apr 27, 2024
bb11982
getBool with default
kotman12 May 3, 2024
eba6e2d
initialize Presearcher in ReverseSearchComponent + add ReverseSearchH…
kotman12 May 4, 2024
d3ccf0c
remove unused constant
kotman12 May 4, 2024
e12ec6f
make SolrMonitorCache name optionally configurable
cpoerschke May 10, 2024
6d4986c
[exploratory] turn MonitorConstants.QUERY_DECOMPOSER into ReverseSear…
cpoerschke May 10, 2024
3aae9f0
remove REVERSE_SEARCH_PARAM_NAME flag in favor of dedicated path
kotman12 May 10, 2024
0915ce1
move getComponent from QCEVisitor to SolrMonitorQueryDecoder as sugge…
cpoerschke May 14, 2024
e6e25a5
avoid queryDecoder.getComponent(queryDecoder.decode(...), ...) usage …
cpoerschke May 14, 2024
8bb7151
Merge remote-tracking branch 'github_kotman12/solr-monitor' into solr…
cpoerschke May 14, 2024
e1365ad
remove now no-longer-used MonitorConstants.QUERY_DECOMPOSER
cpoerschke May 15, 2024
1 change: 1 addition & 0 deletions settings.gradle
@@ -52,6 +52,7 @@ include "solr:modules:ltr"
include "solr:modules:s3-repository"
include "solr:modules:scripting"
include "solr:modules:sql"
include "solr:modules:monitor"
include "solr:webapp"
include "solr:benchmark"
include "solr:test-framework"
1 change: 1 addition & 0 deletions solr/licenses/lucene-monitor-9.10.0.jar.sha1
@@ -0,0 +1 @@
082f2fce3e8f0cb84054eeb91d7a35e558f3d7be
33 changes: 33 additions & 0 deletions solr/modules/monitor/build.gradle
@@ -0,0 +1,33 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

apply plugin: 'java-library'

description = 'Apache Solr Monitor'
Contributor

That is so puzzling to anyone who isn't intimately familiar with Lucene Monitor. I don't even think we should be calling this "Solr Monitor"; it looks like an infrastructure monitoring thing. Possibly "Solr-Lucene-Monitor" but still... a puzzling name.

Author

This is a great point. The library used to be called luwak, which I find to be a much better name. I'll try to think of a better name (maybe solr-reverse-search or solr-query-alerting). I'll also reply in more detail to your mailing list message, touching on solr.cool and the sandbox.

Contributor

Saved Searches is a common name; I assume it is possible to list a user's saved searches too. Or Alerting, but then most people will expect there to be some functionality to ship alerts somewhere...

Author

You're right, if anything this might be a part of some larger alerting system, but "saved search" is more accurate.

Contributor

Saved searches is a pretty indicative name. Percolator is also a known name for this kind of functionality.

Author

Interesting, I thought ES invented "percolator" as more of a metaphor... I wasn't aware that this is a more generic name. I was worried that "percolator" might clash too much with ES.


dependencies {

  implementation project(":solr:core")
  implementation project(":solr:solrj")
  implementation "org.apache.lucene:lucene-core"
  implementation "org.apache.lucene:lucene-monitor"
  implementation 'com.github.ben-manes.caffeine:caffeine'
  testImplementation project(':solr:test-framework')
  testImplementation 'junit:junit'
}


@@ -0,0 +1,125 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.lucene.monitor;

import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;

/** Class used to match candidate queries selected by a Presearcher from a Monitor query index. */
public abstract class CandidateMatcher<T extends QueryMatch> {

  /** The searcher to run candidate queries against */
  protected final IndexSearcher searcher;

  private final Map<String, Exception> errors = new HashMap<>();
  private final List<MatchHolder<T>> matches;

  private long searchTime = System.nanoTime();

  private static class MatchHolder<T> {
    Map<String, T> matches = new HashMap<>();
  }

  /**
   * Creates a new CandidateMatcher for the supplied DocumentBatch
   *
   * @param searcher the IndexSearcher to run queries against
   */
  public CandidateMatcher(IndexSearcher searcher) {
    this.searcher = searcher;
    int docCount = searcher.getIndexReader().maxDoc();
    this.matches = new ArrayList<>(docCount);
    for (int i = 0; i < docCount; i++) {
      this.matches.add(new MatchHolder<>());
    }
  }

  /**
   * Runs the supplied query against this CandidateMatcher's set of documents, storing any resulting
   * match, and recording the query in the presearcher hits
   *
   * @param queryId the query id
   * @param matchQuery the query to run
   * @param metadata the query metadata
   * @throws IOException on IO errors
   */
  public abstract void matchQuery(String queryId, Query matchQuery, Map<String, String> metadata)
      throws IOException;

  /**
   * Record a match
   *
   * @param match a QueryMatch object
   */
  protected final void addMatch(T match, int doc) {
    MatchHolder<T> docMatches = matches.get(doc);
    docMatches.matches.compute(
        match.getQueryId(),
        (key, oldValue) -> {
          if (oldValue != null) {
            return resolve(match, oldValue);
          }
          return match;
        });
  }

  /**
   * If two matches from the same query are found (for example, two branches of a disjunction),
   * combine them.
   *
   * @param match1 the first match found
   * @param match2 the second match found
   * @return a Match object that combines the two
   */
  public abstract T resolve(T match1, T match2);

  /** Called by the Monitor if running a query throws an Exception */
  void reportError(String queryId, Exception e) {
    this.errors.put(queryId, e);
  }

  /**
   * @return the matches from this matcher
   */
  public final MultiMatchingQueries<T> finish(long buildTime, int queryCount) {
    doFinish();
    this.searchTime =
        TimeUnit.MILLISECONDS.convert(System.nanoTime() - searchTime, TimeUnit.NANOSECONDS);
    List<Map<String, T>> results = new ArrayList<>();
    for (MatchHolder<T> matchHolder : matches) {
      results.add(matchHolder.matches);
    }
    return new MultiMatchingQueries<>(
        results, errors, buildTime, searchTime, queryCount, matches.size());
  }

  /** Called when all monitoring of a batch of documents is complete */
  protected void doFinish() {}

  /** Copy all matches from another CandidateMatcher */
  protected void copyMatches(CandidateMatcher<T> other) {
    this.matches.clear();
    this.matches.addAll(other.matches);
  }
}
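
For readers unfamiliar with the merge behaviour in `addMatch`, here is a minimal, self-contained sketch of the same `Map.compute` pattern using only `java.util`. `MatchMergeSketch`, `Hit`, and the "sum the hit counts" merge rule are hypothetical stand-ins, not part of this PR or of Lucene; what a real `resolve` does depends on the concrete `QueryMatch` type.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

// Hypothetical stand-in for CandidateMatcher's per-document match bookkeeping.
class MatchMergeSketch {

  // Hypothetical stand-in for a QueryMatch: a query id plus a hit count.
  static final class Hit {
    final String queryId;
    final int count;

    Hit(String queryId, int count) {
      this.queryId = queryId;
      this.count = count;
    }
  }

  // One queryId -> match map per document slot, mirroring the list of MatchHolder objects.
  private final List<Map<String, Hit>> matches = new ArrayList<>();
  // Plays the role of the abstract resolve(match1, match2) method.
  private final BinaryOperator<Hit> resolver;

  MatchMergeSketch(int docCount, BinaryOperator<Hit> resolver) {
    this.resolver = resolver;
    for (int i = 0; i < docCount; i++) {
      matches.add(new HashMap<>());
    }
  }

  // Store the match, or merge it with an earlier match for the same query and document.
  void addMatch(Hit match, int doc) {
    matches
        .get(doc)
        .compute(
            match.queryId,
            (key, oldValue) -> oldValue == null ? match : resolver.apply(match, oldValue));
  }

  // Returns the stored match, or null if the query did not match that document.
  Hit get(String queryId, int doc) {
    return matches.get(doc).get(queryId);
  }

  public static void main(String[] args) {
    MatchMergeSketch sketch =
        new MatchMergeSketch(2, (a, b) -> new Hit(a.queryId, a.count + b.count));
    sketch.addMatch(new Hit("q1", 1), 0);
    sketch.addMatch(new Hit("q1", 2), 0); // same query, same document: merged via resolver
    sketch.addMatch(new Hit("q1", 5), 1); // same query, other document: kept separate
    System.out.println(sketch.get("q1", 0).count + " " + sketch.get("q1", 1).count);
  }
}
```

In this sketch the resolver is only invoked when a second match arrives for the same query id in the same document slot; matches for different documents never interact.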
@@ -0,0 +1,64 @@
/*
*
* * Licensed to the Apache Software Foundation (ASF) under one or more
* * contributor license agreements. See the NOTICE file distributed with
* * this work for additional information regarding copyright ownership.
* * The ASF licenses this file to You under the Apache License, Version 2.0
* * (the "License"); you may not use this file except in compliance with
* * the License. You may obtain a copy of the License at
* *
* * http://www.apache.org/licenses/LICENSE-2.0
* *
* * Unless required by applicable law or agreed to in writing, software
* * distributed under the License is distributed on an "AS IS" BASIS,
* * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* * See the License for the specific language governing permissions and
* * limitations under the License.
*
*/

package org.apache.lucene.monitor;

import java.io.Closeable;
import java.io.IOException;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.LeafReader;

public class DocumentBatchVisitor implements Closeable, Supplier<LeafReader> {

  private final DocumentBatch batch;
  private final List<Document> docs;

  private DocumentBatchVisitor(DocumentBatch batch, List<Document> docs) {
    this.batch = batch;
    this.docs = docs;
  }

  public static DocumentBatchVisitor of(Analyzer analyzer, List<Document> docs) {
    return new DocumentBatchVisitor(
        DocumentBatch.of(analyzer, docs.toArray(new Document[0])), docs);
  }

  @Override
  public void close() throws IOException {
    batch.close();
  }

  @Override
  public LeafReader get() {
    return batch.get();
  }

  public int size() {
    return docs.size();
  }

  @Override
  public String toString() {
    return docs.stream().map(Document::toString).collect(Collectors.joining(" "));
  }
}
@@ -0,0 +1,60 @@
/*
*
* * Licensed to the Apache Software Foundation (ASF) under one or more
* * contributor license agreements. See the NOTICE file distributed with
* * this work for additional information regarding copyright ownership.
* * The ASF licenses this file to You under the Apache License, Version 2.0
* * (the "License"); you may not use this file except in compliance with
* * the License. You may obtain a copy of the License at
* *
* * http://www.apache.org/licenses/LICENSE-2.0
* *
* * Unless required by applicable law or agreed to in writing, software
* * distributed under the License is distributed on an "AS IS" BASIS,
* * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* * See the License for the specific language governing permissions and
* * limitations under the License.
*
*/
package org.apache.lucene.monitor;

import java.util.List;
import java.util.Map;
import org.apache.lucene.search.Query;

public class MatchesAggregator<T extends QueryMatch> extends CandidateMatcher<T> {

  private final CandidateMatcher<T> resolvingMatcher;

  private MatchesAggregator(
      List<CandidateMatcher<T>> matchers, CandidateMatcher<T> resolvingMatcher) {
    super(resolvingMatcher.searcher);
    this.resolvingMatcher = resolvingMatcher;
    for (var matcher : matchers) {
      var matches = matcher.finish(Long.MIN_VALUE, -1);
      for (int doc = 0; doc < matches.getBatchSize(); doc++) {
        for (T match : matches.getMatches(doc)) {
          this.addMatch(match, doc);
        }
      }
      for (Map.Entry<String, Exception> error : matches.getErrors().entrySet()) {
        this.reportError(error.getKey(), error.getValue());
      }
    }
  }

  @Override
  public void matchQuery(String queryId, Query matchQuery, Map<String, String> metadata) {
    throw new UnsupportedOperationException("only use for aggregating other matchers");
  }

  @Override
  public T resolve(T match1, T match2) {
    return resolvingMatcher.resolve(match1, match2);
  }

  public static <T extends QueryMatch> MultiMatchingQueries<T> aggregate(
      List<CandidateMatcher<T>> matchers, CandidateMatcher<T> resolver, int queryCount) {
    return new MatchesAggregator<>(matchers, resolver).finish(Long.MIN_VALUE, queryCount);
  }
}
@@ -0,0 +1,34 @@
/*
*
* * Licensed to the Apache Software Foundation (ASF) under one or more
* * contributor license agreements. See the NOTICE file distributed with
* * this work for additional information regarding copyright ownership.
* * The ASF licenses this file to You under the Apache License, Version 2.0
* * (the "License"); you may not use this file except in compliance with
* * the License. You may obtain a copy of the License at
* *
* * http://www.apache.org/licenses/LICENSE-2.0
* *
* * Unless required by applicable law or agreed to in writing, software
* * distributed under the License is distributed on an "AS IS" BASIS,
* * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* * See the License for the specific language governing permissions and
* * limitations under the License.
*
*/

package org.apache.lucene.monitor;

import java.util.Set;

public class MonitorFields {

  public static final String QUERY_ID = QueryIndex.FIELDS.query_id;
  public static final String CACHE_ID = QueryIndex.FIELDS.cache_id;
  public static final String MONITOR_QUERY = QueryIndex.FIELDS.mq;
  public static final String PAYLOAD = QueryIndex.FIELDS.mq + "_payload";
  public static final String VERSION = "_version_";
Contributor

Author

I didn't realize this was public!


  public static final Set<String> RESERVED_MONITOR_FIELDS =
      Set.of(QUERY_ID, CACHE_ID, MONITOR_QUERY, PAYLOAD, VERSION);
}
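
The commit list mentions validating `MonitorFields.RESERVED_MONITOR_FIELDS` against the schema, but that code is not shown in this part of the diff. As a rough illustration only, a check of that kind might look like the sketch below. `ReservedFieldCheckSketch` and `missingReservedFields` are hypothetical names, and every field name string except `_version_` is an assumed placeholder, because the actual values come from `QueryIndex.FIELDS` and are not visible here.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; not the validation code from this PR.
class ReservedFieldCheckSketch {

  // Placeholder names: only "_version_" is taken from the diff above.
  static final Set<String> RESERVED =
      Set.of("_query_id", "_cache_id", "_mq", "_mq_payload", "_version_");

  // Returns the reserved fields that the given schema field names do not declare, sorted by name.
  static Set<String> missingReservedFields(Set<String> schemaFields) {
    Set<String> missing = new TreeSet<>(RESERVED);
    missing.removeAll(schemaFields);
    return missing;
  }

  public static void main(String[] args) {
    Set<String> schemaFields = Set.of("id", "title", "_version_");
    Set<String> missing = missingReservedFields(schemaFields);
    if (!missing.isEmpty()) {
      System.out.println("schema is missing reserved monitor fields: " + missing);
    }
  }
}
```

Returning the full set of missing names, rather than failing on the first one, lets a single error message list everything that needs to be added to the schema.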