
pre run case error: The specified key does not exist for Search Performance Test (100M Dataset, 768 Dim) on Milvus Standalone #267

Closed
anrahman4 opened this issue Jan 29, 2024 · 2 comments · Fixed by #287

@anrahman4

Hello. I was trying to run the Search Performance Test (100M Dataset, 768 Dim) against Milvus Standalone, but it failed with a "specified key does not exist" error. Full output:

2024-01-29 19:15:09,634 | INFO |Task summary: run_id=92ba5, task_label=2024012919_dockertes (models.py:285)
2024-01-29 19:15:09,634 | INFO |DB     | db_label case              label                | load_dur    qps          latency(p99)    recall        max_load_count | label (models.py:285)
2024-01-29 19:15:09,634 | INFO |------ | -------- ----------------- -------------------- | ----------- ------------ --------------- ------------- -------------- | ----- (models.py:285)
2024-01-29 19:15:09,634 | INFO |Milvus |          Performance768D1M 2024012919_dockertes | 562.5584    2261.2764    0.0044          0.9821        0              | :)    (models.py:285)
2024-01-29 19:15:09,634 | INFO: write results to disk /home/labuser/.local/lib/python3.11/site-packages/vectordb_bench/results/Milvus/result_20240129_2024012919_dockertes_milvus.json (models.py:143) (3442542)
2024-01-29 19:15:09,635 | INFO: Succes to finish task: label=2024012919_dockertes, run_id=92ba58d5fda64c8f88100a54d0f18c0a (interface.py:207) (3442542)
2024-01-29 19:19:01,624 | INFO: generated uuid for the tasks: 81e208a5a31844e8be6f45b9ff165e66 (interface.py:69) (3442368)
2024-01-29 19:19:01,624 | INFO | DB             | CaseType     Dataset               Filter | task_label (task_runner.py:288)
2024-01-29 19:19:01,624 | INFO | -----------    | ------------ -------------------- ------- | -------    (task_runner.py:288)
2024-01-29 19:19:01,624 | INFO | Milvus         | Performance  LAION-LARGE-100M        None | 2024012919_100Mdockertest (task_runner.py:288)
2024-01-29 19:19:01,624 | INFO: task submitted: id=81e208a5a31844e8be6f45b9ff165e66, 2024012919_100Mdockertest, case number: 1 (interface.py:235) (3442368)
2024-01-29 19:19:02,223 | INFO: [1/1] start case: {'label': <CaseLabel.Performance: 2>, 'dataset': {'data': {'name': 'LAION', 'size': 100000000, 'dim': 768, 'metric_type': <MetricType.L2: 'L2'>}}, 'db': 'Milvus'}, drop_old=True (interface.py:167) (3463933)
2024-01-29 19:19:02,490 | INFO: Milvus client drop_old collection: VectorDBBenchCollection (milvus.py:45) (3463933)
2024-01-29 19:19:02,494 | INFO: Milvus create collection: VectorDBBenchCollection (milvus.py:55) (3463933)
2024-01-29 19:19:03,162 | INFO: local dataset root path not exist, creating it: /mnt/milvus_data/laion/laion_large_100m (data_source.py:126) (3463933)
2024-01-29 19:19:03,162 | INFO: Start to downloading files, total count: 104 (data_source.py:142) (3463933)
  2%|███▊                                                                                                                                                                                                    | 2/104 [00:00<00:46,  2.19it/s]
2024-01-29 19:19:04,074 | WARNING: pre run case error: The specified key does not exist. (task_runner.py:92) (3463933)
2024-01-29 19:19:04,074 | WARNING: [1/1] case {'label': <CaseLabel.Performance: 2>, 'dataset': {'data': {'name': 'LAION', 'size': 100000000, 'dim': 768, 'metric_type': <MetricType.L2: 'L2'>}}, 'db': 'Milvus'} failed to run, reason=The specified key does not exist. (interface.py:187) (3463933)
Traceback (most recent call last):
  File "/home/labuser/.local/lib/python3.11/site-packages/vectordb_bench/interface.py", line 168, in _async_task_v2
    case_res.metrics = runner.run(drop_old)
                       ^^^^^^^^^^^^^^^^^^^^
  File "/home/labuser/.local/lib/python3.11/site-packages/vectordb_bench/backend/task_runner.py", line 96, in run
    self._pre_run(drop_old)
  File "/home/labuser/.local/lib/python3.11/site-packages/vectordb_bench/backend/task_runner.py", line 93, in _pre_run
    raise e from None
  File "/home/labuser/.local/lib/python3.11/site-packages/vectordb_bench/backend/task_runner.py", line 87, in _pre_run
    self.ca.dataset.prepare(self.dataset_source)
  File "/home/labuser/.local/lib/python3.11/site-packages/vectordb_bench/backend/dataset.py", line 202, in prepare
    source.reader().read(
  File "/home/labuser/.local/lib/python3.11/site-packages/vectordb_bench/backend/data_source.py", line 145, in read
    self.fs.download(s3_file, local_ds_root.as_posix())
  File "/home/labuser/.local/lib/python3.11/site-packages/fsspec/spec.py", line 1534, in download
    return self.get(rpath, lpath, recursive=recursive, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/labuser/.local/lib/python3.11/site-packages/fsspec/asyn.py", line 118, in wrapper
    return sync(self.loop, func, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/labuser/.local/lib/python3.11/site-packages/fsspec/asyn.py", line 103, in sync
    raise return_result
  File "/home/labuser/.local/lib/python3.11/site-packages/fsspec/asyn.py", line 56, in _runner
    result[0] = await coro
                ^^^^^^^^^^
  File "/home/labuser/.local/lib/python3.11/site-packages/fsspec/asyn.py", line 650, in _get
    return await _run_coros_in_chunks(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/labuser/.local/lib/python3.11/site-packages/fsspec/asyn.py", line 254, in _run_coros_in_chunks
    await asyncio.gather(*chunk, return_exceptions=return_exceptions),
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/tasks.py", line 452, in wait_for
    return await fut
           ^^^^^^^^^
  File "/home/labuser/.local/lib/python3.11/site-packages/s3fs/core.py", line 1224, in _get_file
    body, content_length = await _open_file(range=0)
                           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/labuser/.local/lib/python3.11/site-packages/s3fs/core.py", line 1215, in _open_file
    resp = await self._call_s3(
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/labuser/.local/lib/python3.11/site-packages/s3fs/core.py", line 348, in _call_s3
    return await _error_wrapper(
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/labuser/.local/lib/python3.11/site-packages/s3fs/core.py", line 140, in _error_wrapper
    raise err
FileNotFoundError: The specified key does not exist.
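The traceback shows `s3fs` raising `FileNotFoundError` while fetching one of the 104 dataset shards, i.e. the downloader asked for an object key that is not present at the remote path. One rough way to narrow this down is to list the remote directory once and diff it against the expected shard names. The helper below is a hypothetical sketch, not part of VectorDBBench, and the bucket path in the comment is an assumption:

```python
# Hypothetical helper: given the shard files a dataset expects and the keys
# actually present in a bucket listing, report which shards are missing.
# A non-empty result would explain "The specified key does not exist."

def diff_expected_files(expected: list[str], available: list[str]) -> list[str]:
    """Return the expected file names that are absent from the listing."""
    available_names = {key.rsplit("/", 1)[-1] for key in available}
    return [name for name in expected if name not in available_names]

# In practice `available` would come from an anonymous s3fs listing, e.g.:
#   import s3fs
#   fs = s3fs.S3FileSystem(anon=True)
#   available = fs.ls("some-bucket/laion_large_100m")  # path is an assumption
expected = [f"train-{i:02d}-of-104.parquet" for i in range(3)]
available = [
    "some-bucket/laion_large_100m/train-00-of-104.parquet",
    "some-bucket/laion_large_100m/train-02-of-104.parquet",
]
print(diff_expected_files(expected, available))  # → ['train-01-of-104.parquet']
```

If the diff is non-empty, the remote dataset is incomplete (or the client is building the wrong key prefix), which matches the failure at `data_source.py:145` above.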

I am currently starting Milvus Standalone through Docker Compose with the following docker-compose.yml:

version: '3.5'

services:
  etcd:
    container_name: milvus-etcd
    image: quay.io/coreos/etcd:v3.5.5
    environment:
      - ETCD_AUTO_COMPACTION_MODE=revision
      - ETCD_AUTO_COMPACTION_RETENTION=1000
      - ETCD_QUOTA_BACKEND_BYTES=4294967296
      - ETCD_SNAPSHOT_COUNT=50000
    volumes:
      - /mnt/milvus_data:/etcd
    command: etcd -advertise-client-urls=http://127.0.0.1:2379 -listen-client-urls http://0.0.0.0:2379 --data-dir /etcd

  minio:
    container_name: milvus-minio
    image: minio/minio:RELEASE.2023-03-20T20-16-18Z
    environment:
      MINIO_ACCESS_KEY: minioadmin
      MINIO_SECRET_KEY: minioadmin
    volumes:
      - /mnt/milvus_data:/minio_data
    command: minio server /minio_data
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
      interval: 30s
      timeout: 20s
      retries: 3

  standalone:
    container_name: milvus-standalone
    image: milvusdb/milvus:v2.3.5
    command: ["milvus", "run", "standalone"]
    environment:
      ETCD_ENDPOINTS: etcd:2379
      MINIO_ADDRESS: minio:9000
    volumes:
      - /mnt/milvus_data:/var/lib/milvus
    ports:
      - "19530:19530"
      - "9091:9091"
    depends_on:
      - "etcd"
      - "minio"

networks:
  default:
    name: milvus
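Unrelated to the download error itself: the `standalone` service above defines no healthcheck, so `depends_on` only orders container start, not readiness. A healthcheck along the lines of the official Milvus compose files could be added (a sketch, assuming the default metrics/web port 9091 and that `curl` is available in the image):

```yaml
  standalone:
    # ...existing settings as above...
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9091/healthz"]
      interval: 30s
      start_period: 90s
      timeout: 20s
      retries: 3
```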
@alwayslove2013
Collaborator

@XuanYang-cn looks like some problem with the data download, please help ~

@anrahman4
Author

Any fix to this yet?

XuanYang-cn added commits to XuanYang-cn/VectorDBBench that referenced this issue (Mar 4–5, 2024)
alwayslove2013 pushed a commit that referenced this issue Mar 5, 2024
Fixes: #267, #275, #285

Signed-off-by: yangxuan <xuan.yang@zilliz.com>