[Bug]: Sparse vector insertion not working #34063
Comments
What is the Milvus server version?
If you can get the Milvus log, we can take a deeper look.
I have exactly the same error on Milvus 2.4.1 when inserting a sparse vector.

id_field = FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=False)
vector = FieldSchema(name="vector", dtype=DataType.FLOAT_VECTOR, dim=768)
bm25_vector = FieldSchema(name="bm25_vector", dtype=DataType.SPARSE_FLOAT_VECTOR)
# Create schema
schema = CollectionSchema(fields=[id_field, vector, bm25_vector], enable_dynamic_field=True)
mv_client.create_collection(collection_name="documents", schema=schema)

# Build the list of points
uniq_id = 0
data = []
unique_package_names = set()

def csr_to_tuples(csr):
    return [(int(i), float(v)) for i, v in zip(csr.indices, csr.data)]

for path in PATHS[:1]:
    df_docs = pd.read_parquet(path)
    for index, row in df_docs.iterrows():
        if index > 0:
            break
        docs_embeddings = bm25_ef.encode_documents([row["content"]])
        bm25_vector = list(docs_embeddings)[0]
        sparse_iterable = csr_to_tuples(bm25_vector)
        print("bm25_vector", bm25_vector)
        print("sparse_iterable", sparse_iterable)
        uniq_id += 1
        data.append({
            "id": uniq_id,
            "content": row["content"],
            "vector": row["semantic_vector"],
            "bm25_vector": sparse_iterable,
            "page_number": row["page_number"],
            "file_name": row["file_name"],
            "source_file": row["file_path"],  # dossier is not a correct value
            "dossier": row["dossier"],
            "type": row["type"],
            "numero": row["numero"],
            "package_name": row["package_name"],
        })
        unique_package_names.add(row["package_name"])

# Loading
import tqdm

def batch_generator(lst, batch_size):
    for i in range(0, len(lst), batch_size):
        yield lst[i : i + batch_size]

for batch in tqdm.tqdm(batch_generator(data, 1000)):
    mv_client.insert(collection_name="documents", data=batch)
If I insert the csr_array directly, I get.
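For reference, a minimal sketch of another way to shape the sparse value, assuming (as the Milvus documentation describes) that a sparse vector may also be passed as a dict mapping dimension index to value. The helper name csr_row_to_dict is hypothetical, and it assumes a single-row CSR-like object exposing .indices and .data, as csr_to_tuples in the snippet above does. Nothing in this thread confirms that the input format caused the error.

```python
from types import SimpleNamespace


def csr_row_to_dict(row):
    """Convert a single-row CSR-like object into {index: value}.

    Hypothetical helper: explicit zeros are dropped, since they carry
    no information in a sparse representation.
    """
    return {int(i): float(v) for i, v in zip(row.indices, row.data) if v != 0}


# Stand-in for one row of a scipy CSR matrix (assumption: only the
# .indices and .data attributes are needed).
fake_row = SimpleNamespace(indices=[3, 17, 42], data=[0.5, 0.0, 1.25])
sparse_dict = csr_row_to_dict(fake_row)
print(sparse_dict)
```

The resulting dict would take the place of sparse_iterable in the "bm25_vector" entry of each row.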
I tried to reproduce the issue on Milvus 2.4.5 and pymilvus 2.4.4, but no luck. Could you please upgrade Milvus and pymilvus and retry? Here is the code, with a few lines hard-coded from yours:
/assign @louis-sanna-eki
I think this is saying that some field to be inserted is nil.
After upgrading to 2.4.5 the bug has disappeared. Thank you all!
Great to hear that, thank you for the update. @louis-sanna-eki
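Since the error went away after the upgrade to 2.4.5, a small guard like the following could flag an older version before inserting. This is only a sketch: the helper names parse_version and is_at_least are hypothetical, the version strings are hard-coded from this thread, and how they are read from a live client or server is left out.

```python
import re


def parse_version(text):
    """Extract (major, minor, patch) from strings such as 'v2.4.5' or '2.4.4'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def is_at_least(version_text, minimum_text):
    """Return True when version_text is the same as or newer than minimum_text."""
    return parse_version(version_text) >= parse_version(minimum_text)


# Versions mentioned in this thread: 2.4.1 showed the error, 2.4.5 did not.
print(is_at_least("2.4.1", "2.4.5"))
print(is_at_least("v2.4.5", "2.4.5"))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as "2.4.10" sorting before "2.4.5".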
Is there an existing issue for this?
Environment
Current Behavior
When I run the hello_sparse.py script, I get the following error:
"MilvusException: <MilvusException: (code=65535, message=%!s() is not supported now)>"
This error occurs right after I start inserting into the collection.
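When comparing reports like this one, it can help to pull the code and message out of the error text. The following is a sketch only: the function name parse_milvus_error and the regular expression are assumptions based on the message format quoted above, not part of pymilvus.

```python
import re

# Assumed format, taken from the error text quoted in this issue:
# "<MilvusException: (code=<int>, message=<text>)>"
ERROR_PATTERN = re.compile(r"\(code=(\d+), message=(.*?)\)>")


def parse_milvus_error(text):
    """Return (code, message) from an error string, or None if it does not match."""
    match = ERROR_PATTERN.search(text)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


reported = "MilvusException: <MilvusException: (code=65535, message=%!s() is not supported now)>"
print(parse_milvus_error(reported))
```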
Expected Behavior
Data should be inserted into the collection.
Steps To Reproduce
I am using Python 3.12.3 on a Mac, with Milvus and pymilvus version 2.4.0. I am running python3 hello_sparse.py from VS Code. The script can be found here: https://github.com/milvus-io/pymilvus/blob/2.4/examples/hello_sparse.py
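The traceback below shows the script building its sparse rows with generate_sparse_vector(dim, nnz). For readers without the script at hand, here is a hypothetical stand-in that shares only the name and arguments; it is not the script's actual implementation. It assumes a sparse vector can be represented as a dict of dimension index to value, and the seed parameter is added here for repeatability.

```python
import random


def generate_sparse_vector(dim, nnz, seed=None):
    """Return a dict with nnz distinct indices in [0, dim) mapped to random floats.

    Hypothetical stand-in for the helper of the same name in hello_sparse.py.
    """
    rng = random.Random(seed)
    indices = rng.sample(range(dim), nnz)  # distinct indices, no repeats
    return {i: rng.random() for i in sorted(indices)}


# Example: 5 non-zero entries in a 1000-dimensional space
vec = generate_sparse_vector(dim=1000, nnz=5, seed=0)
print(len(vec), min(vec), max(vec))
```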
Milvus Log
2024-06-21 13:59:34 === start connecting to Milvus ===
2024-06-21 13:59:35 Does collection hello_sparse exist in Milvus: True. Dropping
2024-06-21 13:59:35 === Create collection hello_sparse ===
2024-06-21 13:59:37 hello_sparse has 0 entities(0.0M), indexed False
2024-06-21 13:59:37 === Start creating entities to insert ===
2024-06-21 13:59:37 === Start inserting entities ===
RPC error: [batch_insert], <MilvusException: (code=65535, message=%!s() is not supported now)>, <Time:{'RPC start': '2024-06-21 13:59:37.289951', 'RPC error': '2024-06-21 13:59:39.011264'}>
MilvusException Traceback (most recent call last)
[... skipping hidden 1 frame]
Cell In[200], line 76
75 log(fmt.format("Start inserting entities"))
---> 76 insert_result = hello_sparse.insert(entities)
78 # -----------------------------------------------------------------------------
79 # create index
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/orm/collection.py:513, in Collection.insert(self, data, partition_name, timeout, **kwargs)
512 entities = Prepare.prepare_insert_data(data, self.schema)
--> 513 return conn.batch_insert(
514 self._name,
515 entities,
516 partition_name,
517 timeout=timeout,
518 schema=self._schema_dict,
519 **kwargs,
520 )
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/decorators.py:147, in error_handler.<locals>.wrapper.<locals>.handler(*args, **kwargs)
146 LOGGER.error(f"RPC error: [{inner_name}], {e}, Time:{record_dict}")
--> 147 raise e from e
148 except grpc.FutureTimeoutError as e:
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/decorators.py:143, in error_handler.<locals>.wrapper.<locals>.handler(*args, **kwargs)
142 record_dict["RPC start"] = str(datetime.datetime.now())
--> 143 return func(*args, **kwargs)
144 except MilvusException as e:
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/decorators.py:182, in tracing_request.<locals>.wrapper.<locals>.handler(self, *args, **kwargs)
181 self.set_onetime_request_id(req_id)
--> 182 return func(self, *args, **kwargs)
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/decorators.py:122, in retry_on_rpc_failure.<locals>.wrapper.<locals>.handler(*args, **kwargs)
121 else:
--> 122 raise e from e
123 except Exception as e:
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/decorators.py:87, in retry_on_rpc_failure.<locals>.wrapper.<locals>.handler(*args, **kwargs)
86 try:
---> 87 return func(*args, **kwargs)
88 except grpc.RpcError as e:
89 # Do not retry on these codes
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/client/grpc_handler.py:582, in GrpcHandler.batch_insert(self, collection_name, entities, partition_name, timeout, **kwargs)
581 return MutationFuture(None, None, err)
--> 582 raise err from err
583 else:
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/client/grpc_handler.py:576, in GrpcHandler.batch_insert(self, collection_name, entities, partition_name, timeout, **kwargs)
575 response = rf.result()
--> 576 check_status(response.status)
577 m = MutationResult(response)
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/client/utils.py:63, in check_status(status)
62 if status.code != 0 or status.error_code != 0:
---> 63 raise MilvusException(status.code, status.reason, status.error_code)
MilvusException: <MilvusException: (code=65535, message=%!s() is not supported now)>
The above exception was the direct cause of the following exception:
MilvusException Traceback (most recent call last)
[... skipping hidden 1 frame]
Cell In[200], line 76
75 log(fmt.format("Start inserting entities"))
---> 76 insert_result = hello_sparse.insert(entities)
78 # -----------------------------------------------------------------------------
79 # create index
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/orm/collection.py:513, in Collection.insert(self, data, partition_name, timeout, **kwargs)
512 entities = Prepare.prepare_insert_data(data, self.schema)
--> 513 return conn.batch_insert(
514 self._name,
515 entities,
516 partition_name,
517 timeout=timeout,
518 schema=self._schema_dict,
519 **kwargs,
520 )
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/decorators.py:147, in error_handler.<locals>.wrapper.<locals>.handler(*args, **kwargs)
146 LOGGER.error(f"RPC error: [{inner_name}], {e}, Time:{record_dict}")
--> 147 raise e from e
148 except grpc.FutureTimeoutError as e:
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/decorators.py:143, in error_handler.<locals>.wrapper.<locals>.handler(*args, **kwargs)
142 record_dict["RPC start"] = str(datetime.datetime.now())
--> 143 return func(*args, **kwargs)
144 except MilvusException as e:
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/decorators.py:182, in tracing_request.<locals>.wrapper.<locals>.handler(self, *args, **kwargs)
181 self.set_onetime_request_id(req_id)
--> 182 return func(self, *args, **kwargs)
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/decorators.py:122, in retry_on_rpc_failure.<locals>.wrapper.<locals>.handler(*args, **kwargs)
121 else:
--> 122 raise e from e
123 except Exception as e:
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/decorators.py:87, in retry_on_rpc_failure.<locals>.wrapper.<locals>.handler(*args, **kwargs)
86 try:
---> 87 return func(*args, **kwargs)
88 except grpc.RpcError as e:
89 # Do not retry on these codes
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/client/grpc_handler.py:582, in GrpcHandler.batch_insert(self, collection_name, entities, partition_name, timeout, **kwargs)
581 return MutationFuture(None, None, err)
--> 582 raise err from err
583 else:
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/client/grpc_handler.py:576, in GrpcHandler.batch_insert(self, collection_name, entities, partition_name, timeout, **kwargs)
575 response = rf.result()
--> 576 check_status(response.status)
577 m = MutationResult(response)
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/client/utils.py:63, in check_status(status)
62 if status.code != 0 or status.error_code != 0:
---> 63 raise MilvusException(status.code, status.reason, status.error_code)
MilvusException: <MilvusException: (code=65535, message=%!s() is not supported now)>
The above exception was the direct cause of the following exception:
MilvusException Traceback (most recent call last)
Cell In[200], line 76
70 entities = [
71 rng.random(num_entities).tolist(),
72 [generate_sparse_vector(dim, nnz) for _ in range(num_entities)],
73 ]
75 log(fmt.format("Start inserting entities"))
---> 76 insert_result = hello_sparse.insert(entities)
78 # -----------------------------------------------------------------------------
79 # create index
80 if not hello_sparse.has_index():
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/orm/collection.py:513, in Collection.insert(self, data, partition_name, timeout, **kwargs)
511 check_insert_schema(self.schema, data)
512 entities = Prepare.prepare_insert_data(data, self.schema)
--> 513 return conn.batch_insert(
514 self._name,
515 entities,
516 partition_name,
517 timeout=timeout,
518 schema=self._schema_dict,
519 **kwargs,
520 )
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/decorators.py:147, in error_handler.<locals>.wrapper.<locals>.handler(*args, **kwargs)
145 record_dict["RPC error"] = str(datetime.datetime.now())
146 LOGGER.error(f"RPC error: [{inner_name}], {e}, Time:{record_dict}")
--> 147 raise e from e
148 except grpc.FutureTimeoutError as e:
149 record_dict["gRPC timeout"] = str(datetime.datetime.now())
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/decorators.py:143, in error_handler.<locals>.wrapper.<locals>.handler(*args, **kwargs)
141 try:
142 record_dict["RPC start"] = str(datetime.datetime.now())
--> 143 return func(*args, **kwargs)
144 except MilvusException as e:
145 record_dict["RPC error"] = str(datetime.datetime.now())
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/decorators.py:182, in tracing_request.<locals>.wrapper.<locals>.handler(self, *args, **kwargs)
180 if req_id:
181 self.set_onetime_request_id(req_id)
--> 182 return func(self, *args, **kwargs)
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/decorators.py:122, in retry_on_rpc_failure.<locals>.wrapper.<locals>.handler(*args, **kwargs)
120 back_off = min(back_off * back_off_multiplier, max_back_off)
121 else:
--> 122 raise e from e
123 except Exception as e:
124 raise e from e
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/decorators.py:87, in retry_on_rpc_failure.<locals>.wrapper.<locals>.handler(*args, **kwargs)
85 while True:
86 try:
---> 87 return func(*args, **kwargs)
88 except grpc.RpcError as e:
89 # Do not retry on these codes
90 if e.code() in IGNORE_RETRY_CODES:
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/client/grpc_handler.py:582, in GrpcHandler.batch_insert(self, collection_name, entities, partition_name, timeout, **kwargs)
580 if kwargs.get("_async", False):
581 return MutationFuture(None, None, err)
--> 582 raise err from err
583 else:
584 return m
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/client/grpc_handler.py:576, in GrpcHandler.batch_insert(self, collection_name, entities, partition_name, timeout, **kwargs)
573 return f
575 response = rf.result()
--> 576 check_status(response.status)
577 m = MutationResult(response)
578 ts_utils.update_collection_ts(collection_name, m.timestamp)
File ~/Downloads/milvus/myenv/lib/python3.12/site-packages/pymilvus/client/utils.py:63, in check_status(status)
61 def check_status(status: Status):
62 if status.code != 0 or status.error_code != 0:
---> 63 raise MilvusException(status.code, status.reason, status.error_code)
MilvusException: <MilvusException: (code=65535, message=%!s() is not supported now)>
Anything else?
No response