java.lang.NullPointerException when field not filled (7.6.2) #181

Closed
ejackson-eb opened this issue Oct 26, 2020 · 4 comments

Comments

@ejackson-eb

I attempted an Elastiknn query on a vector field that existed in my mapping but had not been populated for any documents. This triggered a java.lang.NullPointerException.

Mapping:

        "esknn_embedding" : {
          "type" : "elastiknn_dense_float_vector",
          "similarity" : "boolean",
          "elastiknn" : {
            "model" : "lsh",
            "similarity" : "angular",
            "dims" : 512,
            "L" : 99,
            "k" : 1
          }
        },

Query:

{
    '_source': ['id'],
    'size': 20,
    'query': {
        'function_score': {
            'query': {
                'bool': {
                    'must': [
                        {
                            'elastiknn_nearest_neighbors': {
                                'field': 'esknn_embedding',
                                'vec': {'values': [...]},
                                'model': 'lsh',
                                'similarity': 'angular',
                                'candidates': 50
                            }
                        }
                    ]
                }
            },
            'functions': [],
            'score_mode': 'multiply',
            'boost_mode': 'multiply'
        }
    }
}

Elasticsearch output:

Caused by: java.lang.NullPointerException

	at com.klibisz.elastiknn.query.ExactQuery$StoredVecReader.apply(ExactQuery.scala:50) ~[?:?]
	at com.klibisz.elastiknn.query.HashingQuery$.$anonfun$apply$2(HashingQuery.scala:24) ~[?:?]
	at org.apache.lucene.search.MatchHashesAndScoreQuery$1$2.score(MatchHashesAndScoreQuery.java:168) ~[?:?]

I'm not really sure what should happen, but presumably not an exception of this sort.

@alexklibisz
Owner

You're catching a lot of things I hadn't thought of. Very helpful 👍
It should probably just return no documents. I believe there's a way to compose the custom query with a query that checks for the presence of that field. Will look into it.
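
One way the composition mentioned above might look is a `bool` query that pairs the nearest-neighbors clause with Elasticsearch's standard `exists` query as a filter. This is only a sketch: the helper name `knn_with_exists_guard` is hypothetical, and whether the Elastiknn query actually skips unfilled documents when wrapped this way is an assumption that this thread does not confirm.

```python
from typing import Any, Dict, List


def knn_with_exists_guard(
    field: str, values: List[float], candidates: int = 50
) -> Dict[str, Any]:
    """Build a search body restricted to docs where `field` is present.

    Hypothetical helper: the `exists` filter is standard Elasticsearch,
    but its interaction with the Elastiknn query is an assumption.
    """
    return {
        "query": {
            "bool": {
                # Only match documents that have a value for the vector field.
                "filter": [{"exists": {"field": field}}],
                # Same nearest-neighbors clause as in the original report.
                "must": [
                    {
                        "elastiknn_nearest_neighbors": {
                            "field": field,
                            "vec": {"values": values},
                            "model": "lsh",
                            "similarity": "angular",
                            "candidates": candidates,
                        }
                    }
                ],
            }
        }
    }


body = knn_with_exists_guard("esknn_embedding", [0.1, 0.2, 0.3])
```

The resulting `body` could be passed to a search request in place of the inner `bool` query from the report.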

@alexklibisz
Owner

I was trying to reproduce this one in a test, and accidentally reproduced #180 😆
Hopefully they're the same underlying issue.

@alexklibisz
Owner

@ejackson-eb So I ended up adding a test in #184 that does the following:

  • Create an index.
  • Put a mapping that includes a simple ID keyword field and a vector field with the same AngularLsh mapping you showed above, albeit with lower-dimensional vectors (128).
  • Index 20k docs. The even-numbered docs include a valid vector. The odd-numbered docs include nothing for the vector field.
  • Run a count query for docs with the ID field; it should return 20k.
  • Run a count query for docs with the vector field; it should return 10k.
  • Run an exact query using the 0th vector as the query vector; it should return the 0th vector as the top hit.
  • Run an approx query with the same query vector; it should return the same top hit.
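
For anyone trying to reproduce this outside the test suite, the data-generation and count steps above can be sketched roughly as follows. This is not the test added in #184: the field names `id` and `vec`, the helper names, and the random Gaussian vectors are all assumptions made for illustration, and the actual indexing and search calls are left out.

```python
import random
from typing import Any, Dict, Iterator


def make_docs(n: int, dims: int, seed: int = 0) -> Iterator[Dict[str, Any]]:
    """Yield n docs; only even-numbered docs carry a vector.

    Field names `id` and `vec` are hypothetical placeholders.
    """
    rng = random.Random(seed)
    for i in range(n):
        doc: Dict[str, Any] = {"id": f"v{i}"}
        if i % 2 == 0:
            # Random values stand in for real embeddings.
            doc["vec"] = {"values": [rng.gauss(0.0, 1.0) for _ in range(dims)]}
        yield doc


def count_body(field: str) -> Dict[str, Any]:
    """Body for a count request over docs that have `field` set."""
    return {"query": {"exists": {"field": field}}}


docs = list(make_docs(n=20_000, dims=128))
with_vec = sum(1 for d in docs if "vec" in d)
```

With these inputs, `count_body("id")` would be expected to match every indexed doc and `count_body("vec")` only the even-numbered half, mirroring the 20k and 10k checks in the list.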

I found that after increasing the number of docs from 10k to 20k, I started seeing the issue you mentioned over in #180.
I never actually saw a null pointer exception.

So I'd suggest trying out the fix that I linked here: #180 (comment)
If you still get null pointer exceptions after that, it would be super helpful if you could provide a Python or Bash script that reproduces the issue. It's fine if you just email it to me, and also totally fine if the vectors are just random numbers.

alexklibisz added a commit that referenced this issue Oct 28, 2020
@alexklibisz
Owner

This seems to be resolved based on some email discussion.
