Maybe spurious exception (ValueError: missing dimensions) when using sparsevec with django  #80

@QBH3

I believe the check in pgvector/utils/sparsevec.py that raises

raise ValueError('missing dimensions')

might not work as intended.

When saving an updated model instance in Django, the SparseVector constructor is called from _to_db (https://github.com/pgvector/pgvector-python/blob/master/pgvector/utils/sparsevec.py#L125), which does not pass a dimensions argument.
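
The call path described above can be sketched with a toy stand-in. This is not pgvector's actual code: ToySparseVector, its to_db method, and the _NO_DEFAULT sentinel are hypothetical names, and the sketch assumes the check fires when a dict is passed without dimensions, which the traceback alone does not confirm.

```python
_NO_DEFAULT = object()  # hypothetical sentinel meaning "argument not given"


class ToySparseVector:
    """Simplified stand-in for a sparse vector type (not pgvector's real class)."""

    def __init__(self, value, dimensions=_NO_DEFAULT):
        if isinstance(value, dict):
            # Assumption: a {index: value} dict cannot reveal the vector
            # length on its own, so dimensions must be given explicitly.
            if dimensions is _NO_DEFAULT:
                raise ValueError('missing dimensions')
            self.dimensions = dimensions
            self.elements = {i: v for i, v in value.items() if v != 0}
        else:
            # A dense sequence carries its own length.
            self.dimensions = len(value)
            self.elements = {i: v for i, v in enumerate(value) if v != 0}

    @classmethod
    def to_db(cls, value):
        # Same call shape as in the traceback: cls(value), no dimensions.
        if not isinstance(value, cls):
            value = cls(value)
        return value.to_text()

    def to_text(self):
        # sparsevec text format with 1-based indices, e.g. {1:1.5,3:2.0}/5
        body = ','.join(f'{i + 1}:{v}' for i, v in sorted(self.elements.items()))
        return '{' + body + '}/' + str(self.dimensions)


if __name__ == '__main__':
    try:
        ToySparseVector.to_db({0: 1.5, 2: 2.0})
    except ValueError as exc:
        print('raised:', exc)

    # Wrapping the dict with explicit dimensions avoids the second construction.
    print(ToySparseVector.to_db(ToySparseVector({0: 1.5, 2: 2.0}, 5)))
```

Under these assumptions, a raw dict reaching the conversion step raises the error regardless of the dimensions declared on the field, while a value that already carries its dimensions passes through. Whether that matches the value assigned to the embedding field in this report is not shown here.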

Django model class:

from django.db import models
from pgvector.django import SparseVectorField
from pgvector.django import HnswIndex

class Node(models.Model):
    text = models.TextField()
    embedding = SparseVectorField(dimensions=30522, null=True, blank=True)  # for "naver/splade-cocondenser-ensembledistil"

    class Meta:
        indexes = [
            HnswIndex(
                name='my_index',
                fields=['embedding'],
                opclasses=['sparsevec_l2_ops']
            )
        ]

What the table looks like in psql:

                                                   Table "public.planetai_django_api_node"
    Column     |       Type       | Collation | Nullable |             Default              | Storage  | Compression | Stats target | Description
---------------+------------------+-----------+----------+----------------------------------+----------+-------------+--------------+-------------
 id            | bigint           |           | not null | generated by default as identity | plain    |             |              |
 text          | text             |           | not null |                                  | extended |             |              |
 embedding     | sparsevec(30522) |           |          |                                  | external |             |              |

You can see that the embedding column has the same dimensions (30522) as the field in the model class.

The exception that was thrown:

Traceback (most recent call last):
  File "PGVECTORENV/lib/python3.10/site-packages/asgiref/sync.py", line 518, in thread_handler
    raise exc_info[1]
  File "PGVECTORENV/lib/python3.10/site-packages/django/core/handlers/exception.py", line 42, in inner
    response = await get_response(request)
  File "PGVECTORENV/lib/python3.10/site-packages/django/core/handlers/base.py", line 253, in _get_response_async
    response = await wrapped_callback(
  File "PGVECTORENV/lib/python3.10/site-packages/asgiref/sync.py", line 468, in __call__
    ret = await asyncio.shield(exec_coro)
  File "PGVECTORENV/lib/python3.10/site-packages/asgiref/current_thread_executor.py", line 40, in run
    result = self.fn(*self.args, **self.kwargs)
  File "PGVECTORENV/lib/python3.10/site-packages/asgiref/sync.py", line 522, in thread_handler
    return func(*args, **kwargs)
  File "views/rag/api.py", line 69, in documents_index
    textnode.save()
  File "PGVECTORENV/lib/python3.10/site-packages/django/db/models/base.py", line 822, in save
    self.save_base(
  File "PGVECTORENV/lib/python3.10/site-packages/django/db/models/base.py", line 909, in save_base
    updated = self._save_table(
  File "PGVECTORENV/lib/python3.10/site-packages/django/db/models/base.py", line 1040, in _save_table
    updated = self._do_update(
  File "PGVECTORENV/lib/python3.10/site-packages/django/db/models/base.py", line 1105, in _do_update
    return filtered._update(values) > 0
  File "PGVECTORENV/lib/python3.10/site-packages/django/db/models/query.py", line 1278, in _update
    return query.get_compiler(self.db).execute_sql(CURSOR)
  File "PGVECTORENV/lib/python3.10/site-packages/django/db/models/sql/compiler.py", line 1990, in execute_sql
    cursor = super().execute_sql(result_type)
  File "PGVECTORENV/lib/python3.10/site-packages/django/db/models/sql/compiler.py", line 1549, in execute_sql
    sql, params = self.as_sql()
  File "PGVECTORENV/lib/python3.10/site-packages/django/db/models/sql/compiler.py", line 1953, in as_sql
    val = field.get_db_prep_save(val, connection=self.connection)
  File "PGVECTORENV/lib/python3.10/site-packages/django/db/models/fields/__init__.py", line 1013, in get_db_prep_save
    return self.get_db_prep_value(value, connection=connection, prepared=False)
  File "PGVECTORENV/lib/python3.10/site-packages/django/db/models/fields/__init__.py", line 1006, in get_db_prep_value
    value = self.get_prep_value(value)
  File "/mnt/ssd_nas_homes/chrstianbahls-10073/.cache/pypoetry/virtualenvs/planetai-django-tenderx-IbQbFZXB-py3.10/lib/python3.10/site-packages/pgvector/django/sparsevec.py", line 33, in get_prep_value
    return SparseVector._to_db(value)
  File "PGVECTORENV/lib/python3.10/site-packages/pgvector/utils/sparsevec.py", line 125, in _to_db
    value = cls(value)
  File "PGVECTORENV/lib/python3.10/site-packages/pgvector/utils/sparsevec.py", line 16, in __init__
    raise ValueError('missing dimensions')
ValueError: missing dimensions
