Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SOLR-12722: add fl param to ChildDocTransformer #443

Closed
wants to merge 10 commits into from
Closed

SOLR-12722: add fl param to ChildDocTransformer #443

wants to merge 10 commits into from

Conversation

moshebla
Copy link
Contributor

No description provided.

Copy link
Contributor

@dsmiley dsmiley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!


assertJQ(req("q", "test_s:testing",
"sort", "id asc",
"fl", "*, [child childFilter='lonely/lonelyGrandChild/test2_s:secondTest' fl='*, _nest_path_']",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets leave _nest_path_ as internal and not mention it since testing "fl" has nothing to do with this field or nested documents. So instead use some other field, foo_s or whatever. And can you move this test to TestChildDocTransformer.java as it's not related to hierarchy?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing,
just pushed a new commit :).

@@ -137,6 +137,7 @@ When using this transformer, the `parentFilter` parameter must be specified, and

* `childFilter` - query to filter which child documents should be included, this can be particularly useful when you have multiple levels of hierarchical documents (default: all children)
* `limit` - the maximum number of child documents to be returned per parent document (default: 10)
* `fl` - the field list which the transformer is to return.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... "There is a further limitation in which the fields here should be a subset of those specified by the top level fl param." And say defaults to the top level "fl" ?

@@ -106,9 +107,12 @@ public DocTransformer create(String field, SolrParams params, SolrQueryRequest r
}
}

String childReturnFields = params.get("fl");
SolrReturnFields childSolrReturnFields = childReturnFields==null? new SolrReturnFields(): new SolrReturnFields(childReturnFields ,req);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this default to the top level "fl"? I think so. Even if it didn't do this before, it in-effect behaves this way and it makes sense (to me) that it would.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense,
just pushed a commit :).

@moshebla
Copy link
Contributor Author

Oops. There's a problem with this commit.
I'm on it!

@moshebla
Copy link
Contributor Author

I tried bypassing the stack overflow error from the original commit.
Hope this is good enough.

@dsmiley
Copy link
Contributor

dsmiley commented Aug 30, 2018

Oh right; we'd get an infinite loop as the ChildDocTransformerFactory is already being called by SolrReturnFields! Ouch!

Sorry but I don't like your solution of trying to parse/strip the child transformer "fl"; it's very error-prone to do such things.

Lets define a ThreadLocal that allows us to know if we're being asked to create this transformer from within a recursive call to SolrReturnFields. When we see this happening, we can return a no-op transformer (see RawValueTransformerFactory.NoopFieldTransformer which could be made public and moved to an inner class of DocTransformer). This thread local would be set & cleared in a try-finally, and this logic would be in some utility method to keep it from being distracting to the caller. In general I avoid ThreadLocals as it's a singleton design pattern which is usually bad, but it can be used sparingly/judiciously.

BTW it appears that doc transformers will not be executed on the child docs. The transformer is not invoking them, and I suspect they won't be invoked on the way out to the client. Can you check/test if this is true? If I'm correct, this is a pre-existing bug/limitation that would be good to mark as a TODO; or you can try to address here since it's related (since you before couldn't specify what 'fl' was therefore it wasn't possible to specify a transformer until now).

@@ -131,6 +131,12 @@ public void transform(SolrDocument rootDoc, int rootDocId) {

// load the doc
SolrDocument doc = searcher.getDocFetcher().solrDoc(docId, childReturnFields);
Copy link
Contributor Author

@moshebla moshebla Sep 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps SolrDocumentFetcher#solrDoc should run the transformers provided by SolrReturnFields in the future?
This is just a thought, and is probably out of scope for this PR anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking the same thing! But yeah; out of scope for this PR. CC @ErickErickson

@moshebla
Copy link
Contributor Author

moshebla commented Sep 2, 2018

BTW, I was skimming through DocTransformer, and noticed DocTransformer#needsSolrIndexSearcher.
Shouldn't this return true for ChildDocTransformer, since it searches the docId provided in the underlying index, and will fail if given a negative value?

Copy link
Contributor

@dsmiley dsmiley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good!

}

@Override
public String getName() {
return name;
}

@Override
public boolean needsSolrIndexSearcher() { return true; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 nice catch!

@@ -57,12 +58,34 @@

static final char PATH_SEP_CHAR = '/';
static final char NUM_SEP_CHAR = '#';
private static final ThreadLocal<Boolean> transformerInitialized = new ThreadLocal<Boolean>(){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the classic but more verbose way to declare a ThreadLocal. In Java 8 this can be a one-liner:
ThreadLocal<Boolean> recursionCheckThreadLocal = ThreadLocal.withInitial(() -> Boolean.FALSE);
And please consider another field name, not "transformerInitialized"? I suggest "recursionCheckThreadLocal" since it declares it's purpose (to detect recursion) and it's type (it's a ThreadLocal). "initialized" is dubious to me because it suggests an instance field of a class but this one is global.

transformerInitialized.set(true);
return createChildDocTransformer(field, params, req);
} finally {
transformerInitialized.remove();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm; maybe clearer to set(false)

* Trivial Impl that ensure that the specified field is requested as an "extra" field,
* but then does nothing during the transformation phase.
*/
public static final class NoopFieldTransformer extends DocTransformer {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh woops; I recommended you do this but overlooked it's not just a no-op; it's doing this extra field requesting task. Well I suppose it's okay; not a big deal. Alternatively, this could changed to be just a no-op, and then the former impl in RawValueTransformerFactory.java could exist as a subclass that is a bit simpler.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I added a default constructor to and changed another bit so it works either way.
I would prefer if this was left here, where it is easily accessible, preventing duplication of code.

@@ -91,10 +98,10 @@ private void testChildDoctransformerXML() throws Exception {
"fl", "*,[child parentFilter=\"subject:parentDocument\"]"), test1);

assertQ(req("q", "*:*", "fq", "subject:\"parentDocument\" ",
"fl", "subject,[child parentFilter=\"subject:parentDocument\" childFilter=\"title:foo\"]"), test2);
"fl", "id, subject,[child parentFilter=\"subject:parentDocument\" childFilter=\"title:foo\"]"), test2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: remove that extra space after the comma here and in the other lines you modified.

"/response/docs/[0]/_childDocuments_/[0]/intDvoDefault==42",
"/response/docs/[0]/_childDocuments_/[0]/child_fl=='child_fl_test'");

try(SolrQueryRequest req = req("q", "*:*", "fq", "subject:\"parentDocument\" ",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know XPath? assertQ takes xpath expressions that is powerful enough to do everything you are asserting here in less code. (assertQ is used in a ton of tests; plenty of examples to learn from). The assertJQ thing is less powerful. You don't have to rewrite it but it's at least worth being aware for future assertions.

@@ -1831,7 +1831,7 @@ public void testChildDoctransformer() throws IOException, SolrServerException {

q = new SolrQuery("q", "*:*", "indent", "true");
q.setFilterQueries(parentFilter);
q.setFields("id,[child parentFilter=\"" + parentFilter +
q.setFields("id, level_i, [child parentFilter=\"" + parentFilter +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused; this change suggests there is a backwards-compatibility change; is there? I think you said that the child "fl" is effectively constrained by whatever the top "fl" is; which if totally true means (I think) we needn't worry by defaulting to the top "fl". If this isn't totally correct (hence your change) then we need to be careful about changing the default behavior in a minor release; we'd have to tweak the default choice (absence of specific "fl") using the luceneMatchVersion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed these tests before we defaulted to the top level "fl" parameter.
I shall investigate further.

Copy link
Contributor Author

@moshebla moshebla Sep 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests fail since the recent changes made in this branch result the child doc transformer defaulting to the top fl. Beforehand, the default child fl was " * ", which does not filter "level_i" from child docs. Should we default to " * " in lucene version < 8, and change this behaviour only once Solr 8 arrives?

@@ -132,6 +134,12 @@ public void transform(SolrDocument rootDoc, int rootDocId) {

// load the doc
SolrDocument doc = searcher.getDocFetcher().solrDoc(docId, childReturnFields);
if(childReturnFields.getTransformer() != null) {
if(childReturnFields.getTransformer().context != this.context) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe this line should only check if context is null? By checking if the context is different, it's suggestive there may already be a context but if that were true then we ought to reinstate the former context when done. I suspect it's always going to be null. Can you look?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like your suspicion was correct.
I will push a commit asap!

SolrReturnFields childSolrReturnFields;
if(childReturnFields != null) {
childSolrReturnFields = new SolrReturnFields(childReturnFields, req);
} else if(req.getSchema().getDefaultLuceneMatchVersion().major < 8) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

@moshebla
Copy link
Contributor Author

moshebla commented Sep 4, 2018

Pushed a commit ensuring the context is set for child transformers only if it was not set previously.

@moshebla
Copy link
Contributor Author

moshebla commented Sep 6, 2018

Closing this PR as it was committed to master in commit e4f256b

@moshebla moshebla closed this Sep 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants