Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Making Lucene.Net.Join.TermsQuery public #413

Closed
icodetm opened this issue Feb 5, 2021 · 5 comments
Closed

Making Lucene.Net.Join.TermsQuery public #413

icodetm opened this issue Feb 5, 2021 · 5 comments
Labels
design is:question Further information is requested is:wontfix

Comments

@icodetm
Copy link

icodetm commented Feb 5, 2021

Would it be possible to make TermsQuery public?

It seems a lot more performant than a BooleanQuery or even a TermsFilter so I think it would be quite useful if exposed publicly. (I believe it's public or at least publicly documented in 5.0 onwards).

@rclabo
Copy link
Contributor

rclabo commented Feb 5, 2021

So I investigated this a bit. Thank you for linking to the public docs which do in fact show the method public in version 5.2. The odd thing is that when I look up the class in the Lucene Java 5.x repo I see that it does not have an access modifier. And when no access modifier is specified in Java the default is package private which means it is not publicly available.

In the latest version of Java Lucene, the TermsQuery class still has no access modifier which means that it's still not publicly available.

This causes me to think the Java documentation is wrong to show it as a public class. But I would expect the java documentation to be generated from the code base so I'm a bit baffled. Perhaps someone else can chime in about this.

@NightOwl888
Copy link
Contributor

Looks like this is a case of mistaken identity.

There is a TermsQuery class in the join package which exists in Lucene 4.8.0. This class has never been declared public in Lucene.

In Lucene 5.1.0, a new public TermsQuery class was added to the queries package, which is the one the documentation link refers to above.

In principle, I don't see any issue with making the TermsQuery class in Lucene.Net.Join public. But I would like to ask the Lucene team why it was made package private in the first place. If the Lucene team agrees that it should be made public in the current version, then I don't see any reason we can't make that change in Lucene.NET 4.8.0. Before I do, I just wanted to check which of these 2 different implementations you are referring to when you say "it performs better"?

@icodetm
Copy link
Author

icodetm commented Feb 5, 2021

Ah okay that was my mistake.

But to clarify, when using query time Joins, we find the second pass of the join (which uses Lucene.Net.Join.TermsQuery) performs a lot better than using a BooleanQuery with multiple Occur.Should clauses or a TermsFilter.

We have scenarios where we essentially construct our own multi-term queries and wanted to explore using the TermsQuery (or something comparable in terms of performance) directly.

@NightOwl888
Copy link
Contributor

@sthmathew

I finally got a chance to ask the Lucene team whether this would be something they would consider, and it turns out that TermsQuery from the queries package "provides similar functionality" and furthermore it has been transferred to the core package in apache/lucene@22940f5.

So, there are a couple of ways this could go:

  1. You could port over one of the public versions from 5.2 or from a later version for use in your application. However, in the current version there are some Automaton dependencies that don't exist in 4.8.0, so you will need to pick a version before that point.
  2. We could make Lucene.Net.Join.TermsQuery public in 4.8.0 and you will need to adjust your code when Lucene.NET is upgraded to a newer version to call the public implementation, since Lucene.Net.Join.TermsQuery will be made internal again at the time of the upgrade. It doesn't make much sense to have 2 public queries that are similar.

I suspect the public TermsQuery was probably optimized a little better, so you might want to try porting it to see how that goes and report back whether that will work for you.

@NightOwl888 NightOwl888 added is:wontfix and removed is:enhancement New feature or request labels Nov 5, 2022
@NightOwl888
Copy link
Contributor

Closing this, as it is simply more unplanned work to do for the 4.8.0 release and will eventually be available when Lucene.NET is upgraded. Porting the implementation from Lucene 5.2 is likely possible for those who wish to have that version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
design is:question Further information is requested is:wontfix
Projects
None yet
Development

No branches or pull requests

3 participants