What about a KNNQueryBuilder in Java REST Client ? #258

Closed
ivan4github opened this issue Apr 30, 2021 · 8 comments

Comments

@ivan4github

ivan4github commented Apr 30, 2021

Elastiknn is a very elegant solution for searching BERT embeddings for similarities, far more efficient than scoring alone.
It is simple to install and use, with a very light footprint on an existing cluster.
Thank you for this fantastic work.

The only thing that stops us from using it in production is that we use the Java REST Client to handle queries and retrieval.
This is where the hard part is for us.

A suggestion (and it is only a suggestion):

What about a KNNQueryBuilder in the Java REST Client?

I don't know anything about plugins and cannot tell whether it is feasible...

Thanks anyway

@alexklibisz
Owner

Hey, thanks for the kind remarks. I will look into making a Java query builder. Hopefully Elastic has provided interfaces that can be extended for this.

@alexklibisz
Owner

alexklibisz commented May 1, 2021

@ivan4github I took a first pass at this in #260. There is a snapshot release here. When you import this package, you'll get a class called ElastiknnNearestNeighborsQueryBuilder. This includes just enough functionality to execute queries through the Java Rest Client. Here's a test demonstrating the functionality here.

Let me know what you think. From my perspective it's good enough, but I don't really use the Java client.

@ivan4github
Author

ivan4github commented May 3, 2021

Alex,

Wow, very impressive: a great functional response within a single weekend day!

The problem is that I am not as lightning fast as you are...
For future happy users of the Java REST API like me, here is the translation of your Scala code into good old Java ;-)

I combined a query-string query and an angular KNN query (which looks the most appropriate for our BERT embeddings) in a multi-search request:

     
    // Query-string query builder (`query` holds the user's search text)
    QueryStringQueryBuilder queryBuilder = QueryBuilders.queryStringQuery(query)
        .defaultField("tout")
        .defaultOperator(Operator.AND)
        .quoteFieldSuffix(".exact");

    // Elastiknn angular LSH query builder, using a small test vector
    float[] myTestList = new float[] {0.1f, 0.2f, 0.3f};
    Vector myTestVect = new Vector.DenseFloat(myTestList);
    ElastiknnNearestNeighborsQuery.AngularLsh eknnQuery =
        new ElastiknnNearestNeighborsQuery.AngularLsh(myTestVect, 50);
    ElastiknnNearestNeighborsQueryBuilder eknnQueryBuilder =
        new ElastiknnNearestNeighborsQueryBuilder(eknnQuery, "Phrase_vector_knn");

    // Combine both in a multi-search request (straight from the Java REST Client docs)
    MultiSearchRequest multiRequest = new MultiSearchRequest();

    // First search: the query-string query
    SearchRequest firstSearchRequest = new SearchRequest();
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.query(queryBuilder);
    firstSearchRequest.source(searchSourceBuilder);
    multiRequest.add(firstSearchRequest);

    // Second search: the Elastiknn query
    SearchRequest secondSearchRequest = new SearchRequest();
    searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.query(eknnQueryBuilder);
    secondSearchRequest.source(searchSourceBuilder);
    multiRequest.add(secondSearchRequest);

Thank you for everything Alex.

@ivan4github
Author

ivan4github commented May 3, 2021

For the archives: to combine the two query builders in a single request, do the following:

    // Query-string query builder (`query` holds the user's search text)
    QueryStringQueryBuilder queryBuilder = QueryBuilders.queryStringQuery(query)
        .defaultField("tout")
        .defaultOperator(Operator.AND)
        .quoteFieldSuffix(".exact");

    // Elastiknn angular LSH query builder, using a small test vector
    float[] myTestList = new float[] {0.1f, 0.2f, 0.3f};
    Vector myTestVect = new Vector.DenseFloat(myTestList);
    ElastiknnNearestNeighborsQuery.AngularLsh eknnQuery =
        new ElastiknnNearestNeighborsQuery.AngularLsh(myTestVect, 50);
    ElastiknnNearestNeighborsQueryBuilder eknnQueryBuilder =
        new ElastiknnNearestNeighborsQueryBuilder(eknnQuery, "Phrase_vector_knn");

    // Combine the two query builders in one bool query
    QueryBuilder mqueryBuilder = QueryBuilders.boolQuery()
        .filter(queryBuilder)
        .must(eknnQueryBuilder);

I used "must()" for the Elastiknn query because the score must come from Elastiknn. The query string goes in "filter()", so its score is ignored.
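
For readers who want to see the filter-then-score idea in isolation, here is a minimal in-memory sketch in plain Java. It is not Elasticsearch or Elastiknn code: the class `FilterThenRankSketch`, the `Doc` type, and the sample data are all hypothetical, and exact cosine similarity stands in for the approximate angular LSH query. It only illustrates the behaviour relied on above: in a bool query, a filter clause decides which documents are kept without contributing to the score, while the must clause supplies the score used for ranking.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical toy model of "filter selects, must scores"; not the Elasticsearch API.
class FilterThenRankSketch {

    /** Toy document: some text plus an embedding vector. */
    static final class Doc {
        final String text;
        final float[] vector;

        Doc(String text, float[] vector) {
            this.text = text;
            this.vector = vector;
        }
    }

    /** Exact cosine similarity between two non-zero vectors of equal length. */
    static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Keep docs containing the keyword (filter role), then rank by similarity only (must role). */
    static List<Doc> search(List<Doc> docs, String keyword, float[] queryVector) {
        List<Doc> hits = new ArrayList<>();
        for (Doc d : docs) {
            if (d.text.contains(keyword)) {
                hits.add(d); // the text match only selects; it adds nothing to the score
            }
        }
        hits.sort(Comparator.comparingDouble((Doc d) -> cosine(d.vector, queryVector)).reversed());
        return hits;
    }

    /** Runs the sketch on three sample docs and returns the texts of the hits in ranked order. */
    static List<String> demo() {
        List<Doc> docs = Arrays.asList(
            new Doc("green apple", new float[] {0f, 1f}),
            new Doc("red car", new float[] {1f, 0.1f}),
            new Doc("red apple", new float[] {1f, 0f}));
        List<String> texts = new ArrayList<>();
        for (Doc d : search(docs, "apple", new float[] {1f, 0.2f})) {
            texts.add(d.text);
        }
        return texts;
    }

    public static void main(String[] args) {
        // "red car" is the closest vector but is dropped by the text filter.
        System.out.println(demo()); // prints [red apple, green apple]
    }
}
```

In the real query the same split applies: the query string narrows the candidate set, and the ordering of the hits comes from the nearest-neighbors clause alone.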

@alexklibisz
Owner

Sounds good, and thanks for expanding on the example! Let me know if you find any issues with it. If not I'll probably merge and release it this weekend or next.

@alexklibisz
Owner

Hi @ivan4github, do you have any more input on the implementation over in #260 ?

@ivan4github
Author

Nothing to add, it works great.
Very easy to implement, great job!
Please add the link to the Java client jar containing this builder for the current version.
The only hassle is that the ES version and the plugin version must match exactly. A small price to pay ;-)

Please close the case.

@alexklibisz
Owner

Great. This is released in 7.13.2.1, with some minimal docs here: https://elastiknn.com/libraries/#java-library-with-elasticsearch-query-builder-for-elastiknn-queries
