@vigyasharma
Contributor

Today, nothing stops zero vectors from being indexed. This causes problems later, when vectors are searched or segments are merged, as noted in #15540. This change prevents zero vectors from being indexed.
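For readers following along, here is a minimal sketch of what an index-time check of this kind could look like. It is not the code in this PR: the class `ZeroVectorCheck` and both method names are hypothetical, and the sketch assumes a plain `float[]` input.

```java
import java.util.Objects;

// Hypothetical helper for illustration only; not Lucene's API or this PR's code.
final class ZeroVectorCheck {
  private ZeroVectorCheck() {}

  /** Returns true when every component equals 0 (this includes -0.0f and the empty array). */
  static boolean isZeroVector(float[] vector) {
    for (float v : vector) {
      if (v != 0f) {
        return false; // any non-zero (or NaN) component means this is not a zero vector
      }
    }
    return true;
  }

  /** Throws for zero vectors; returns the input unchanged otherwise. */
  static float[] requireNonZero(float[] vector) {
    Objects.requireNonNull(vector, "vector");
    if (isZeroVector(vector)) {
      throw new IllegalArgumentException("zero vectors cannot be indexed");
    }
    return vector;
  }

  public static void main(String[] args) {
    requireNonZero(new float[] {0f, 0.5f}); // accepted
    try {
      requireNonZero(new float[] {0f, 0f}); // rejected
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Failing at indexing time, rather than at search or merge time, puts the error next to the caller that produced the bad vector.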

AI Disclosure: The change itself is trivial, but it broke quite a few tests. I fixed about half of them manually, then leveraged AI to fix the remaining tests. I've reviewed all AI test fixes. Most of them simply replaced zero/empty vectors with non-empty vector values. Some tests checked vector similarity scores, and I've fixed them accordingly.

Addresses #15540

@github-actions github-actions bot added this to the 10.4.0 milestone Jan 27, 2026
@benwtrent
Member

@vigyasharma sorry for not commenting on the other issue.

I am not sure we should prevent 0 vectors like this. It seems like we only need validation for cosine and 0 magnitudes, right?
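A rough sketch of the narrower alternative, where only cosine rejects zero-magnitude vectors, might look like the following. The `Similarity` enum and the `validate` method are hypothetical stand-ins, not Lucene's actual types.

```java
// Hypothetical sketch of cosine-only validation; names are illustrative assumptions.
final class CosineOnlyValidation {
  private CosineOnlyValidation() {}

  // Stand-in enum for illustration; not Lucene's own similarity enum.
  enum Similarity { EUCLIDEAN, DOT_PRODUCT, COSINE, MAXIMUM_INNER_PRODUCT }

  /** Rejects zero-magnitude vectors only when the similarity is COSINE. */
  static void validate(float[] vector, Similarity similarity) {
    if (similarity != Similarity.COSINE) {
      return; // other similarities accept zero vectors in this variant
    }
    double squaredNorm = 0;
    for (float v : vector) {
      squaredNorm += (double) v * v; // accumulate in double so tiny floats do not underflow to 0
    }
    if (squaredNorm == 0) {
      throw new IllegalArgumentException(
          "cosine similarity is undefined for zero-magnitude vectors");
    }
  }

  public static void main(String[] args) {
    float[] zero = new float[4];
    validate(zero, Similarity.DOT_PRODUCT); // allowed in this variant
    try {
      validate(zero, Similarity.COSINE);
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```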

@vigyasharma
Contributor Author

That's okay, I can make the changes if needed. :)

I was thinking about this, and while cosine is mathematically undefined for zero vectors, I'm not sure zero vectors are meaningful for the other functions either. Dot product and max inner product will always return 0 for them, so search can't really differentiate based on those similarity scores. It will silently affect scores, graph geometry, and the result set, which feels trappy?
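To make that concrete, here is a small standalone sketch using textbook formulas. These are not Lucene's `VectorUtil` implementations, and the class and method names are made up for illustration. With a zero vector, cosine evaluates to 0/0 and the dot product is 0 regardless of the query.

```java
// Illustrative only: textbook formulas, not Lucene's implementations.
final class ZeroVectorScores {
  private ZeroVectorScores() {}

  static float dot(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  static float cosine(float[] a, float[] b) {
    float dot = 0f, normA = 0f, normB = 0f;
    for (int i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      normA += a[i] * a[i];
      normB += b[i] * b[i];
    }
    // With a zero vector this divides 0 by 0, which is NaN in floating point.
    return (float) (dot / Math.sqrt((double) normA * normB));
  }

  public static void main(String[] args) {
    float[] zero = {0f, 0f, 0f};
    float[] q1 = {1f, 2f, 3f};
    float[] q2 = {-4f, 0.5f, 9f};
    System.out.println("cosine(zero, q1) = " + cosine(zero, q1));
    // The dot product against a zero vector is the same for every query.
    System.out.println("dot(zero, q1) = " + dot(zero, q1));
    System.out.println("dot(zero, q2) = " + dot(zero, q2));
  }
}
```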

Are there meaningful scenarios where we want to allow zero vectors? I'm not sure if bit vectors need them? Are Hamming distance or Jaccard similarity meaningful when all bits are 0?
