
# Bayesian Optimization with Retrieval Optimizer

This notebook demonstrates how to use Bayesian optimization to tune Redis-based retrieval pipelines. Unlike a grid study, which tests every combination, Bayesian optimization uses the results of earlier trials to decide which configuration to try next, so it spends its budget on promising settings. This is especially useful when the number of possible configurations is large and an exhaustive search would be too costly.

You'll define a study configuration, specify embedding models and search methods, and let the optimizer guide the search toward the best-performing retrieval setup.


## Installation

In [None]:
%pip install redis-retrieval-optimizer

## Dataset

We'll import a dataset from the [BEIR information retrieval benchmark](https://github.com/beir-cellar/beir) to get going quickly.

In [1]:
# Load data
from redis_retrieval_optimizer.corpus_processors import eval_beir

# check the link above for different datasets to try
beir_dataset_name = "nfcorpus"
data_folder = "data"

# Load sample data
corpus, queries, qrels = eval_beir.get_beir_dataset(beir_dataset_name)


10:14:32 beir.datasets.data_loader INFO   Loading Corpus...


100%|██████████| 3633/3633 [00:00<00:00, 260170.17it/s]

10:14:32 beir.datasets.data_loader INFO   Loaded 3633 TEST Documents.
10:14:32 beir.datasets.data_loader INFO   Doc Example: {'text': 'Recent studies have suggested that statins, an established drug group in the prevention of cardiovascular mortality, could delay or prevent breast cancer recurrence but the effect on disease-specific mortality remains unclear. We evaluated risk of breast cancer death among statin users in a population-based cohort of breast cancer patients. The study cohort included all newly diagnosed breast cancer patients in Finland during 1995–2003 (31,236 cases), identified from the Finnish Cancer Registry. Information on statin use before and after the diagnosis was obtained from a national prescription database. We used the Cox proportional hazards regression method to estimate mortality among statin users with statin use as time-dependent variable. A total of 4,151 participants had used statins. During the median follow-up of 3.25 years after the diagnosis (rang




Now that we have our data, we will save it locally to the gitignored `data/` folder.

In [2]:
import os

os.makedirs(data_folder, exist_ok=True)

In [3]:
import json

# Write each BEIR component to its own JSON file under data_folder
with open(f"{data_folder}/{beir_dataset_name}_corpus.json", "w") as f:
    json.dump(corpus, f)

with open(f"{data_folder}/{beir_dataset_name}_queries.json", "w") as f:
    json.dump(queries, f)

with open(f"{data_folder}/{beir_dataset_name}_qrels.json", "w") as f:
    json.dump(qrels, f)
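
As a quick sanity check, the saved files can be read back to confirm they round-trip through JSON. The sketch below is self-contained: it uses tiny stand-in dictionaries in place of the real BEIR objects and a temporary directory in place of `data/`. The helper names `save_split` and `load_split` are hypothetical and not part of the package.

```python
import json
import tempfile
from pathlib import Path


def save_split(folder, dataset, split, obj):
    """Write one dataset component (corpus, queries, or qrels) as JSON."""
    path = Path(folder) / f"{dataset}_{split}.json"
    with open(path, "w") as f:
        json.dump(obj, f)
    return path


def load_split(folder, dataset, split):
    """Read one dataset component back from its JSON file."""
    with open(Path(folder) / f"{dataset}_{split}.json") as f:
        return json.load(f)


# Stand-in data shaped like the BEIR objects (illustrative values only)
corpus = {"doc1": {"title": "Example", "text": "Example document text."}}
queries = {"q1": "example query"}
qrels = {"q1": {"doc1": 1}}
originals = {"corpus": corpus, "queries": queries, "qrels": qrels}

with tempfile.TemporaryDirectory() as folder:
    for split, obj in originals.items():
        save_split(folder, "nfcorpus", split, obj)
    reloaded = {split: load_split(folder, "nfcorpus", split) for split in originals}

round_trip_ok = reloaded == originals
print(round_trip_ok)
```

Note that JSON object keys are always strings, so a structure with integer keys would not compare equal after reloading.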

## Study config

This directory contains a YAML file, `bayes_study_config.yaml`, with the configuration for a Bayesian study. It looks like this:

```yaml
# path to data files for easy read
corpus: "data/nfcorpus_corpus.json"
queries: "data/nfcorpus_queries.json"
qrels: "data/nfcorpus_qrels.json"

index_settings:
  name: "optimize"
  vector_field_name: "vector" # name of the vector field to search on
  text_field_name: "text" # name of the text field for lexical search
  from_existing: false
  vector_dim: 384 # should match first embedding model or from_existing
  additional_fields:
      - name: "title"
        type: "text"

optimization_settings:
  # defines the options optimization can take
  metric_weights:
    f1_at_k: 1
    embedding_latency: 1
    total_indexing_time: 1
  algorithms: ["hnsw"]
  vector_data_types: ["float16", "float32"]
  distance_metrics: ["cosine"]
  n_trials: 10
  n_jobs: 1
  ret_k: [1, 10] # range of values to sample from during the study
  ef_runtime: [10, 20, 30, 50]
  ef_construction: [100, 150, 200, 250, 300]
  m: [8, 16, 64]


search_methods: ["vector", "lin_combo"]
embedding_models:
  - type: "hf"
    model: "sentence-transformers/all-MiniLM-L6-v2"
    dim: 384
    embedding_cache_name: "vec-cache" # avoid names that include 'ret-opt', as this can cause collisions
    dtype: "float32"
```
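
To see why an exhaustive grid would be costly here, the sketch below counts the distinct configurations this file describes. It copies the option lists into a plain dictionary rather than parsing the YAML, and it assumes that `ret_k: [1, 10]` is an inclusive integer range while every other list is a set of discrete choices. The function name `grid_size` is hypothetical and not part of the package.

```python
from math import prod

# Discrete choices copied from the config above
settings = {
    "algorithms": ["hnsw"],
    "vector_data_types": ["float16", "float32"],
    "distance_metrics": ["cosine"],
    "ef_runtime": [10, 20, 30, 50],
    "ef_construction": [100, 150, 200, 250, 300],
    "m": [8, 16, 64],
    "search_methods": ["vector", "lin_combo"],
    "embedding_models": ["sentence-transformers/all-MiniLM-L6-v2"],
}

# Assumption: ret_k is sampled as an integer from an inclusive [low, high] range
ret_k_range = (1, 10)


def grid_size(choices, int_range):
    """Count the configurations a full grid over these options would visit."""
    low, high = int_range
    n_range_values = high - low + 1
    return n_range_values * prod(len(options) for options in choices.values())


total = grid_size(settings, ret_k_range)
print(total)  # 2400
```

Under these assumptions a full grid has 2,400 configurations, while `n_trials: 10` asks the optimizer to evaluate only a handful of them.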

## Running a study

To run a study, simply pass the config path, the Redis URL, and a corpus processing function to `run_bayes_study`; the package takes care of the rest.

In [4]:
import os
from redis_retrieval_optimizer.bayes_study import run_bayes_study
from redis_retrieval_optimizer.corpus_processors import eval_beir
from dotenv import load_dotenv

# load environment variables containing necessary credentials
load_dotenv()

redis_url = os.environ.get("REDIS_URL", "redis://localhost:6379/0")

metrics = run_bayes_study(
    config_path="bayes_study_config.yaml",
    redis_url=redis_url,
    corpus_processor=eval_beir.process_corpus
)
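
The log output of this cell prints a `METRICS` dictionary after every trial, with one list per metric and one entry per completed trial. A minimal sketch of turning such a column-oriented dictionary into per-trial rows and ranking them by `f1@k` follows. The values are copied from the first three trials logged in this notebook run. The helper `to_rows` is hypothetical, and the sketch makes no assumption about the type that `run_bayes_study` returns.

```python
# Subset of the logged METRICS dictionary after three trials (copied from the log)
logged = {
    "search_method": ["hybrid", "vector", "vector"],
    "ret_k": [7, 3, 6],
    "f1@k": [0.13071176354799932, 0.12689261701063015, 0.12689261701063015],
    "avg_query_time": [
        -0.0030586490690154558,
        -0.0008422095709171827,
        -0.0009241443669463828,
    ],
}


def to_rows(columns):
    """Convert a dict of equal-length lists into a list of per-trial dicts."""
    keys = list(columns)
    return [dict(zip(keys, values)) for values in zip(*columns.values())]


rows = to_rows(logged)

# Pick the trial with the highest f1@k
best = max(rows, key=lambda row: row["f1@k"])
print(best["search_method"], best["ret_k"])
```

The timing metrics appear with a negative sign in the log, so sorting them in descending order puts the fastest trials first.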

[I 2025-06-18 10:14:47,411] A new study created in memory with name: test


10:14:49 datasets INFO   PyTorch version 2.7.0 available.
10:14:49 sentence_transformers.SentenceTransformer INFO   Use pytorch device_name: mps
10:14:49 sentence_transformers.SentenceTransformer INFO   Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2


Batches: 100%|██████████| 1/1 [00:00<00:00,  6.96it/s]

10:14:51 root INFO   Recreating index...





10:14:51 root INFO   Corpus size: 3633
10:14:54 root INFO   Data indexed total_indexing_time=2.938s, num_docs=3633
10:14:57 root INFO   Saving metrics for study: 02c5c52b-9f83-4b87-93cf-3548f377d958, METRICS={'search_method': ['hybrid'], 'total_indexing_time': [-2.938], 'avg_query_time': [-0.0030586490690154558], 'model': ['sentence-transformers/all-MiniLM-L6-v2'], 'model_dim': [384], 'ret_k': [7], 'recall@k': [0.15498797328057845], 'ndcg@k': [0.20277849135854045], 'f1@k': [0.13071176354799932], 'precision': [0.24334365325077406], 'algorithm': ['hnsw'], 'ef_construction': [150], 'ef_runtime': [20], 'm': [64], 'distance_metric': ['cosine'], 'vector_data_type': ['float16']}


[I 2025-06-18 10:14:57,955] Trial 0 finished with value: 2.938 and parameters: {'model_info': {'type': 'hf', 'model': 'sentence-transformers/all-MiniLM-L6-v2', 'dim': 384, 'embedding_cache_name': 'vec-cache', 'dtype': 'float32'}, 'search_method': 'hybrid', 'algorithm': 'hnsw', 'var_dtype': 'float16', 'distance_metric': 'cosine', 'ret_k': 7, 'ef_runtime': 20, 'ef_construction': 150, 'm': 64}. Best is trial 0 with value: 2.938.


10:14:57 redisvl.index.index INFO   Index already exists, not overwriting.
10:14:57 sentence_transformers.SentenceTransformer INFO   Use pytorch device_name: mps
10:14:57 sentence_transformers.SentenceTransformer INFO   Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2


Batches: 100%|██████████| 1/1 [00:00<00:00, 69.92it/s]

10:14:58 root INFO   Data indexed total_indexing_time=2.938s, num_docs=3633





10:14:58 root INFO   Saving metrics for study: 02c5c52b-9f83-4b87-93cf-3548f377d958, METRICS={'search_method': ['hybrid', 'vector'], 'total_indexing_time': [-2.938, -2.938], 'avg_query_time': [-0.0030586490690154558, -0.0008422095709171827], 'model': ['sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2'], 'model_dim': [384, 384], 'ret_k': [7, 3], 'recall@k': [0.15498797328057845, 0.14786315482520818], 'ndcg@k': [0.20277849135854045, 0.18650304495321093], 'f1@k': [0.13071176354799932, 0.12689261701063015], 'precision': [0.24334365325077406, 0.24179566563467494], 'algorithm': ['hnsw', 'hnsw'], 'ef_construction': [150, 300], 'ef_runtime': [20, 20], 'm': [64, 64], 'distance_metric': ['cosine', 'cosine'], 'vector_data_type': ['float16', 'float16']}


[I 2025-06-18 10:14:58,967] Trial 1 finished with value: 2.938 and parameters: {'model_info': {'type': 'hf', 'model': 'sentence-transformers/all-MiniLM-L6-v2', 'dim': 384, 'embedding_cache_name': 'vec-cache', 'dtype': 'float32'}, 'search_method': 'vector', 'algorithm': 'hnsw', 'var_dtype': 'float16', 'distance_metric': 'cosine', 'ret_k': 3, 'ef_runtime': 20, 'ef_construction': 300, 'm': 64}. Best is trial 0 with value: 2.938.


10:14:58 redisvl.index.index INFO   Index already exists, not overwriting.
10:14:58 sentence_transformers.SentenceTransformer INFO   Use pytorch device_name: mps
10:14:58 sentence_transformers.SentenceTransformer INFO   Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2


Batches: 100%|██████████| 1/1 [00:00<00:00, 61.35it/s]

10:14:59 root INFO   Data indexed total_indexing_time=2.938s, num_docs=3633





10:15:00 root INFO   Saving metrics for study: 02c5c52b-9f83-4b87-93cf-3548f377d958, METRICS={'search_method': ['hybrid', 'vector', 'vector'], 'total_indexing_time': [-2.938, -2.938, -2.938], 'avg_query_time': [-0.0030586490690154558, -0.0008422095709171827, -0.0009241443669463828], 'model': ['sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2'], 'model_dim': [384, 384, 384], 'ret_k': [7, 3, 6], 'recall@k': [0.15498797328057845, 0.14786315482520818, 0.14786315482520818], 'ndcg@k': [0.20277849135854045, 0.18650304495321093, 0.18650304495321093], 'f1@k': [0.13071176354799932, 0.12689261701063015, 0.12689261701063015], 'precision': [0.24334365325077406, 0.24179566563467494, 0.24179566563467494], 'algorithm': ['hnsw', 'hnsw', 'hnsw'], 'ef_construction': [150, 300, 200], 'ef_runtime': [20, 20, 10], 'm': [64, 64, 8], 'distance_metric': ['cosine', 'cosine', 'cosine'], 'vector_data_type': ['float16', 'float16', 'float16']}

[I 2025-06-18 10:15:00,292] Trial 2 finished with value: 2.938 and parameters: {'model_info': {'type': 'hf', 'model': 'sentence-transformers/all-MiniLM-L6-v2', 'dim': 384, 'embedding_cache_name': 'vec-cache', 'dtype': 'float32'}, 'search_method': 'vector', 'algorithm': 'hnsw', 'var_dtype': 'float16', 'distance_metric': 'cosine', 'ret_k': 6, 'ef_runtime': 10, 'ef_construction': 200, 'm': 8}. Best is trial 0 with value: 2.938.


10:15:00 redisvl.index.index INFO   Index already exists, not overwriting.
10:15:00 sentence_transformers.SentenceTransformer INFO   Use pytorch device_name: mps
10:15:00 sentence_transformers.SentenceTransformer INFO   Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2


Batches: 100%|██████████| 1/1 [00:00<00:00, 67.43it/s]

10:15:00 root INFO   Data indexed total_indexing_time=2.938s, num_docs=3633





10:15:01 root INFO   Saving metrics for study: 02c5c52b-9f83-4b87-93cf-3548f377d958, METRICS={'search_method': ['hybrid', 'vector', 'vector', 'vector'], 'total_indexing_time': [-2.938, -2.938, -2.938, -2.938], 'avg_query_time': [-0.0030586490690154558, -0.0008422095709171827, -0.0009241443669463828, -0.000957826342745093], 'model': ['sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2'], 'model_dim': [384, 384, 384, 384], 'ret_k': [7, 3, 6, 4], 'recall@k': [0.15498797328057845, 0.14786315482520818, 0.14786315482520818, 0.14786315482520818], 'ndcg@k': [0.20277849135854045, 0.18650304495321093, 0.18650304495321093, 0.18650304495321093], 'f1@k': [0.13071176354799932, 0.12689261701063015, 0.12689261701063015, 0.12689261701063015], 'precision': [0.24334365325077406, 0.24179566563467494, 0.24179566563467494, 0.24179566563467494], 'algorithm': ['hnsw', 'hnsw', 'hnsw', 'hnsw'], 'ef_

[I 2025-06-18 10:15:01,261] Trial 3 finished with value: 2.938 and parameters: {'model_info': {'type': 'hf', 'model': 'sentence-transformers/all-MiniLM-L6-v2', 'dim': 384, 'embedding_cache_name': 'vec-cache', 'dtype': 'float32'}, 'search_method': 'vector', 'algorithm': 'hnsw', 'var_dtype': 'float16', 'distance_metric': 'cosine', 'ret_k': 4, 'ef_runtime': 30, 'ef_construction': 200, 'm': 16}. Best is trial 0 with value: 2.938.


10:15:01 redisvl.index.index INFO   Index already exists, not overwriting.
10:15:01 sentence_transformers.SentenceTransformer INFO   Use pytorch device_name: mps
10:15:01 sentence_transformers.SentenceTransformer INFO   Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2


Batches: 100%|██████████| 1/1 [00:00<00:00, 57.23it/s]

10:15:01 root INFO   Data indexed total_indexing_time=2.938s, num_docs=3633





10:15:02 root INFO   Saving metrics for study: 02c5c52b-9f83-4b87-93cf-3548f377d958, METRICS={'search_method': ['hybrid', 'vector', 'vector', 'vector', 'vector'], 'total_indexing_time': [-2.938, -2.938, -2.938, -2.938, -2.938], 'avg_query_time': [-0.0030586490690154558, -0.0008422095709171827, -0.0009241443669463828, -0.000957826342745093, -0.0009460153963543682], 'model': ['sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2'], 'model_dim': [384, 384, 384, 384, 384], 'ret_k': [7, 3, 6, 4, 7], 'recall@k': [0.15498797328057845, 0.14786315482520818, 0.14786315482520818, 0.14786315482520818, 0.14786315482520818], 'ndcg@k': [0.20277849135854045, 0.18650304495321093, 0.18650304495321093, 0.18650304495321093, 0.18650304495321093], 'f1@k': [0.13071176354799932, 0.12689261701063015, 0.12689261701063015, 0.12689261701063015, 0.126892617010630

[I 2025-06-18 10:15:02,258] Trial 4 finished with value: 2.938 and parameters: {'model_info': {'type': 'hf', 'model': 'sentence-transformers/all-MiniLM-L6-v2', 'dim': 384, 'embedding_cache_name': 'vec-cache', 'dtype': 'float32'}, 'search_method': 'vector', 'algorithm': 'hnsw', 'var_dtype': 'float16', 'distance_metric': 'cosine', 'ret_k': 7, 'ef_runtime': 20, 'ef_construction': 100, 'm': 8}. Best is trial 0 with value: 2.938.


10:15:02 redisvl.index.index INFO   Index already exists, not overwriting.
10:15:02 sentence_transformers.SentenceTransformer INFO   Use pytorch device_name: mps
10:15:02 sentence_transformers.SentenceTransformer INFO   Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2


Batches: 100%|██████████| 1/1 [00:00<00:00, 59.36it/s]

10:15:02 root INFO   Data indexed total_indexing_time=2.938s, num_docs=3633





10:15:03 root INFO   Saving metrics for study: 02c5c52b-9f83-4b87-93cf-3548f377d958, METRICS={'search_method': ['hybrid', 'vector', 'vector', 'vector', 'vector', 'vector'], 'total_indexing_time': [-2.938, -2.938, -2.938, -2.938, -2.938, -2.938], 'avg_query_time': [-0.0030586490690154558, -0.0008422095709171827, -0.0009241443669463828, -0.000957826342745093, -0.0009460153963543682, -0.0009069782292510703], 'model': ['sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2'], 'model_dim': [384, 384, 384, 384, 384, 384], 'ret_k': [7, 3, 6, 4, 7, 3], 'recall@k': [0.15498797328057845, 0.14786315482520818, 0.14786315482520818, 0.14786315482520818, 0.14786315482520818, 0.14786315482520818], 'ndcg@k': [0.20277849135854045, 0.18650304495321093, 0.18650304495321093, 0.18650304495321093, 0.18650304495321093

[I 2025-06-18 10:15:03,321] Trial 5 finished with value: 2.938 and parameters: {'model_info': {'type': 'hf', 'model': 'sentence-transformers/all-MiniLM-L6-v2', 'dim': 384, 'embedding_cache_name': 'vec-cache', 'dtype': 'float32'}, 'search_method': 'vector', 'algorithm': 'hnsw', 'var_dtype': 'float16', 'distance_metric': 'cosine', 'ret_k': 3, 'ef_runtime': 10, 'ef_construction': 300, 'm': 16}. Best is trial 0 with value: 2.938.


10:15:03 sentence_transformers.SentenceTransformer INFO   Use pytorch device_name: mps
10:15:03 sentence_transformers.SentenceTransformer INFO   Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2


Batches: 100%|██████████| 1/1 [00:00<00:00, 69.26it/s]

10:15:03 root INFO   Recreating index...





10:15:04 root INFO   Corpus size: 3633
10:15:05 root INFO   Data indexed total_indexing_time=1.368s, num_docs=3633
10:15:06 root INFO   Saving metrics for study: 02c5c52b-9f83-4b87-93cf-3548f377d958, METRICS={'search_method': ['hybrid', 'vector', 'vector', 'vector', 'vector', 'vector', 'vector'], 'total_indexing_time': [-2.938, -2.938, -2.938, -2.938, -2.938, -2.938, -1.368], 'avg_query_time': [-0.0030586490690154558, -0.0008422095709171827, -0.0009241443669463828, -0.000957826342745093, -0.0009460153963543682, -0.0009069782292510703, -0.0008249304981054537], 'model': ['sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2'], 'model_dim': [384, 384, 384, 384, 384, 384, 384], 'ret_k': [7, 3, 6, 4, 7, 3, 10], 'recall@k': [0.15498797328057845, 0.1478631548

[I 2025-06-18 10:15:06,282] Trial 6 finished with value: 1.368 and parameters: {'model_info': {'type': 'hf', 'model': 'sentence-transformers/all-MiniLM-L6-v2', 'dim': 384, 'embedding_cache_name': 'vec-cache', 'dtype': 'float32'}, 'search_method': 'vector', 'algorithm': 'hnsw', 'var_dtype': 'float32', 'distance_metric': 'cosine', 'ret_k': 10, 'ef_runtime': 50, 'ef_construction': 100, 'm': 8}. Best is trial 0 with value: 2.938.


10:15:06 sentence_transformers.SentenceTransformer INFO   Use pytorch device_name: mps
10:15:06 sentence_transformers.SentenceTransformer INFO   Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2


Batches: 100%|██████████| 1/1 [00:00<00:00, 60.96it/s]

10:15:06 root INFO   Recreating index...





10:15:07 root INFO   Corpus size: 3633
10:15:11 root INFO   Data indexed total_indexing_time=4.256s, num_docs=3633
10:15:13 root INFO   Saving metrics for study: 02c5c52b-9f83-4b87-93cf-3548f377d958, METRICS={'search_method': ['hybrid', 'vector', 'vector', 'vector', 'vector', 'vector', 'vector', 'hybrid'], 'total_indexing_time': [-2.938, -2.938, -2.938, -2.938, -2.938, -2.938, -1.368, -4.256], 'avg_query_time': [-0.0030586490690154558, -0.0008422095709171827, -0.0009241443669463828, -0.000957826342745093, -0.0009460153963543682, -0.0009069782292510703, -0.0008249304981054537, -0.002800116967121514], 'model': ['sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2'], 'model_dim': [384, 384, 384, 384, 384, 384, 38

[I 2025-06-18 10:15:13,023] Trial 7 finished with value: 4.256 and parameters: {'model_info': {'type': 'hf', 'model': 'sentence-transformers/all-MiniLM-L6-v2', 'dim': 384, 'embedding_cache_name': 'vec-cache', 'dtype': 'float32'}, 'search_method': 'hybrid', 'algorithm': 'hnsw', 'var_dtype': 'float16', 'distance_metric': 'cosine', 'ret_k': 2, 'ef_runtime': 50, 'ef_construction': 300, 'm': 64}. Best is trial 7 with value: 4.256.


10:15:13 sentence_transformers.SentenceTransformer INFO   Use pytorch device_name: mps
10:15:13 sentence_transformers.SentenceTransformer INFO   Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2


Batches: 100%|██████████| 1/1 [00:00<00:00, 67.16it/s]

10:15:13 root INFO   Recreating index...





10:15:13 root INFO   Corpus size: 3633
10:15:16 root INFO   Data indexed total_indexing_time=2.851s, num_docs=3633
10:15:17 root INFO   Saving metrics for study: 02c5c52b-9f83-4b87-93cf-3548f377d958, METRICS={'search_method': ['hybrid', 'vector', 'vector', 'vector', 'vector', 'vector', 'vector', 'hybrid', 'hybrid'], 'total_indexing_time': [-2.938, -2.938, -2.938, -2.938, -2.938, -2.938, -1.368, -4.256, -2.851], 'avg_query_time': [-0.0030586490690154558, -0.0008422095709171827, -0.0009241443669463828, -0.000957826342745093, -0.0009460153963543682, -0.0009069782292510703, -0.0008249304981054537, -0.002800116967121514, -0.001982953895356264], 'model': ['sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sente

[I 2025-06-18 10:15:17,902] Trial 8 finished with value: 2.851 and parameters: {'model_info': {'type': 'hf', 'model': 'sentence-transformers/all-MiniLM-L6-v2', 'dim': 384, 'embedding_cache_name': 'vec-cache', 'dtype': 'float32'}, 'search_method': 'hybrid', 'algorithm': 'hnsw', 'var_dtype': 'float32', 'distance_metric': 'cosine', 'ret_k': 6, 'ef_runtime': 30, 'ef_construction': 300, 'm': 64}. Best is trial 7 with value: 4.256.


10:15:17 redisvl.index.index INFO   Index already exists, not overwriting.
10:15:17 sentence_transformers.SentenceTransformer INFO   Use pytorch device_name: mps
10:15:17 sentence_transformers.SentenceTransformer INFO   Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2


Batches: 100%|██████████| 1/1 [00:00<00:00, 62.83it/s]

10:15:18 root INFO   Data indexed total_indexing_time=2.851s, num_docs=3633





10:15:19 root INFO   Saving metrics for study: 02c5c52b-9f83-4b87-93cf-3548f377d958, METRICS={'search_method': ['hybrid', 'vector', 'vector', 'vector', 'vector', 'vector', 'vector', 'hybrid', 'hybrid', 'hybrid'], 'total_indexing_time': [-2.938, -2.938, -2.938, -2.938, -2.938, -2.938, -1.368, -4.256, -2.851, -2.851], 'avg_query_time': [-0.0030586490690154558, -0.0008422095709171827, -0.0009241443669463828, -0.000957826342745093, -0.0009460153963543682, -0.0009069782292510703, -0.0008249304981054537, -0.002800116967121514, -0.001982953895356264, -0.002098676959058449], 'model': ['sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v

[I 2025-06-18 10:15:19,683] Trial 9 finished with value: 2.851 and parameters: {'model_info': {'type': 'hf', 'model': 'sentence-transformers/all-MiniLM-L6-v2', 'dim': 384, 'embedding_cache_name': 'vec-cache', 'dtype': 'float32'}, 'search_method': 'hybrid', 'algorithm': 'hnsw', 'var_dtype': 'float32', 'distance_metric': 'cosine', 'ret_k': 8, 'ef_runtime': 20, 'ef_construction': 250, 'm': 16}. Best is trial 7 with value: 4.256.


10:15:19 redisvl.index.index INFO   Index already exists, not overwriting.
10:15:19 sentence_transformers.SentenceTransformer INFO   Use pytorch device_name: mps
10:15:19 sentence_transformers.SentenceTransformer INFO   Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2


Batches: 100%|██████████| 1/1 [00:00<00:00, 58.17it/s]

10:15:20 root INFO   Data indexed total_indexing_time=2.851s, num_docs=3633





10:15:21 root INFO   Saving metrics for study: 02c5c52b-9f83-4b87-93cf-3548f377d958, METRICS={'search_method': ['hybrid', 'vector', 'vector', 'vector', 'vector', 'vector', 'vector', 'hybrid', 'hybrid', 'hybrid', 'hybrid'], 'total_indexing_time': [-2.938, -2.938, -2.938, -2.938, -2.938, -2.938, -1.368, -4.256, -2.851, -2.851, -2.851], 'avg_query_time': [-0.0030586490690154558, -0.0008422095709171827, -0.0009241443669463828, -0.000957826342745093, -0.0009460153963543682, -0.0009069782292510703, -0.0008249304981054537, -0.002800116967121514, -0.001982953895356264, -0.002098676959058449, -0.0022983004803258937], 'model': ['sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v

[I 2025-06-18 10:15:21,295] Trial 10 finished with value: 2.851 and parameters: {'model_info': {'type': 'hf', 'model': 'sentence-transformers/all-MiniLM-L6-v2', 'dim': 384, 'embedding_cache_name': 'vec-cache', 'dtype': 'float32'}, 'search_method': 'hybrid', 'algorithm': 'hnsw', 'var_dtype': 'float32', 'distance_metric': 'cosine', 'ret_k': 1, 'ef_runtime': 50, 'ef_construction': 150, 'm': 64}. Best is trial 7 with value: 4.256.


10:15:21 sentence_transformers.SentenceTransformer INFO   Use pytorch device_name: mps
10:15:21 sentence_transformers.SentenceTransformer INFO   Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2


Batches: 100%|██████████| 1/1 [00:00<00:00, 66.75it/s]

10:15:21 root INFO   Recreating index...





10:15:22 root INFO   Corpus size: 3633
10:15:25 root INFO   Data indexed total_indexing_time=2.99s, num_docs=3633
10:15:27 root INFO   Saving metrics for study: 02c5c52b-9f83-4b87-93cf-3548f377d958, METRICS={'search_method': ['hybrid', 'vector', 'vector', 'vector', 'vector', 'vector', 'vector', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid'], 'total_indexing_time': [-2.938, -2.938, -2.938, -2.938, -2.938, -2.938, -1.368, -4.256, -2.851, -2.851, -2.851, -2.99], 'avg_query_time': [-0.0030586490690154558, -0.0008422095709171827, -0.0009241443669463828, -0.000957826342745093, -0.0009460153963543682, -0.0009069782292510703, -0.0008249304981054537, -0.002800116967121514, -0.001982953895356264, -0.002098676959058449, -0.0022983004803258937, -0.003134949657570098], 'model': ['sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-

[I 2025-06-18 10:15:27,112] Trial 11 finished with value: 2.99 and parameters: {'model_info': {'type': 'hf', 'model': 'sentence-transformers/all-MiniLM-L6-v2', 'dim': 384, 'embedding_cache_name': 'vec-cache', 'dtype': 'float32'}, 'search_method': 'hybrid', 'algorithm': 'hnsw', 'var_dtype': 'float16', 'distance_metric': 'cosine', 'ret_k': 1, 'ef_runtime': 50, 'ef_construction': 150, 'm': 64}. Best is trial 7 with value: 4.256.


10:15:27 sentence_transformers.SentenceTransformer INFO   Use pytorch device_name: mps
10:15:27 sentence_transformers.SentenceTransformer INFO   Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2


Batches: 100%|██████████| 1/1 [00:00<00:00, 58.20it/s]

10:15:27 root INFO   Data indexed total_indexing_time=2.99s, num_docs=3633





10:15:29 root INFO   Saving metrics for study: 02c5c52b-9f83-4b87-93cf-3548f377d958, METRICS={'search_method': ['hybrid', 'vector', 'vector', 'vector', 'vector', 'vector', 'vector', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid'], 'total_indexing_time': [-2.938, -2.938, -2.938, -2.938, -2.938, -2.938, -1.368, -4.256, -2.851, -2.851, -2.851, -2.99, -2.99], 'avg_query_time': [-0.0030586490690154558, -0.0008422095709171827, -0.0009241443669463828, -0.000957826342745093, -0.0009460153963543682, -0.0009069782292510703, -0.0008249304981054537, -0.002800116967121514, -0.001982953895356264, -0.002098676959058449, -0.0022983004803258937, -0.003134949657570098, -0.0031798587125890397], 'model': ['sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2',

[I 2025-06-18 10:15:29,226] Trial 12 finished with value: 2.99 and parameters: {'model_info': {'type': 'hf', 'model': 'sentence-transformers/all-MiniLM-L6-v2', 'dim': 384, 'embedding_cache_name': 'vec-cache', 'dtype': 'float32'}, 'search_method': 'hybrid', 'algorithm': 'hnsw', 'var_dtype': 'float16', 'distance_metric': 'cosine', 'ret_k': 1, 'ef_runtime': 50, 'ef_construction': 150, 'm': 64}. Best is trial 7 with value: 4.256.


10:15:29 redisvl.index.index INFO   Index already exists, not overwriting.
10:15:29 sentence_transformers.SentenceTransformer INFO   Use pytorch device_name: mps
10:15:29 sentence_transformers.SentenceTransformer INFO   Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2


Batches: 100%|██████████| 1/1 [00:00<00:00, 54.92it/s]

10:15:29 root INFO   Data indexed total_indexing_time=2.99s, num_docs=3633





10:15:31 root INFO   Saving metrics for study: 02c5c52b-9f83-4b87-93cf-3548f377d958, METRICS={'search_method': ['hybrid', 'vector', 'vector', 'vector', 'vector', 'vector', 'vector', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid'], 'total_indexing_time': [-2.938, -2.938, -2.938, -2.938, -2.938, -2.938, -1.368, -4.256, -2.851, -2.851, -2.851, -2.99, -2.99, -2.99], 'avg_query_time': [-0.0030586490690154558, -0.0008422095709171827, -0.0009241443669463828, -0.000957826342745093, -0.0009460153963543682, -0.0009069782292510703, -0.0008249304981054537, -0.002800116967121514, -0.001982953895356264, -0.002098676959058449, -0.0022983004803258937, -0.003134949657570098, -0.0031798587125890397, -0.0031702208445168133], 'model': ['sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 

[I 2025-06-18 10:15:31,243] Trial 13 finished with value: 2.99 and parameters: {'model_info': {'type': 'hf', 'model': 'sentence-transformers/all-MiniLM-L6-v2', 'dim': 384, 'embedding_cache_name': 'vec-cache', 'dtype': 'float32'}, 'search_method': 'hybrid', 'algorithm': 'hnsw', 'var_dtype': 'float16', 'distance_metric': 'cosine', 'ret_k': 2, 'ef_runtime': 50, 'ef_construction': 250, 'm': 64}. Best is trial 7 with value: 4.256.


10:15:31 redisvl.index.index INFO   Index already exists, not overwriting.
10:15:31 sentence_transformers.SentenceTransformer INFO   Use pytorch device_name: mps
10:15:31 sentence_transformers.SentenceTransformer INFO   Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2


Batches: 100%|██████████| 1/1 [00:00<00:00, 69.75it/s]

10:15:31 root INFO   Data indexed total_indexing_time=2.99s, num_docs=3633





10:15:33 root INFO   Saving metrics for study: 02c5c52b-9f83-4b87-93cf-3548f377d958, METRICS={'search_method': ['hybrid', 'vector', 'vector', 'vector', 'vector', 'vector', 'vector', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid'], 'total_indexing_time': [-2.938, -2.938, -2.938, -2.938, -2.938, -2.938, -1.368, -4.256, -2.851, -2.851, -2.851, -2.99, -2.99, -2.99, -2.99], 'avg_query_time': [-0.0030586490690154558, -0.0008422095709171827, -0.0009241443669463828, -0.000957826342745093, -0.0009460153963543682, -0.0009069782292510703, -0.0008249304981054537, -0.002800116967121514, -0.001982953895356264, -0.002098676959058449, -0.0022983004803258937, -0.003134949657570098, -0.0031798587125890397, -0.0031702208445168133, -0.003212623920972133], 'model': ['sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 's

[I 2025-06-18 10:15:33,338] Trial 14 finished with value: 2.99 and parameters: {'model_info': {'type': 'hf', 'model': 'sentence-transformers/all-MiniLM-L6-v2', 'dim': 384, 'embedding_cache_name': 'vec-cache', 'dtype': 'float32'}, 'search_method': 'hybrid', 'algorithm': 'hnsw', 'var_dtype': 'float16', 'distance_metric': 'cosine', 'ret_k': 4, 'ef_runtime': 50, 'ef_construction': 150, 'm': 64}. Best is trial 7 with value: 4.256.


10:15:33 redisvl.index.index INFO   Index already exists, not overwriting.
10:15:33 sentence_transformers.SentenceTransformer INFO   Use pytorch device_name: mps
10:15:33 sentence_transformers.SentenceTransformer INFO   Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2


Batches: 100%|██████████| 1/1 [00:00<00:00, 60.02it/s]

10:15:33 root INFO   Data indexed total_indexing_time=2.99s, num_docs=3633





10:15:35 root INFO   Saving metrics for study: 02c5c52b-9f83-4b87-93cf-3548f377d958, METRICS={'search_method': ['hybrid', 'vector', 'vector', 'vector', 'vector', 'vector', 'vector', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid'], 'total_indexing_time': [-2.938, -2.938, -2.938, -2.938, -2.938, -2.938, -1.368, -4.256, -2.851, -2.851, -2.851, -2.99, -2.99, -2.99, -2.99, -2.99], 'avg_query_time': [-0.0030586490690154558, -0.0008422095709171827, -0.0009241443669463828, -0.000957826342745093, -0.0009460153963543682, -0.0009069782292510703, -0.0008249304981054537, -0.002800116967121514, -0.001982953895356264, -0.002098676959058449, -0.0022983004803258937, -0.003134949657570098, -0.0031798587125890397, -0.0031702208445168133, -0.003212623920972133, -0.003174193503317818], 'model': ['sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sen

[I 2025-06-18 10:15:35,427] Trial 15 finished with value: 2.99 and parameters: {'model_info': {'type': 'hf', 'model': 'sentence-transformers/all-MiniLM-L6-v2', 'dim': 384, 'embedding_cache_name': 'vec-cache', 'dtype': 'float32'}, 'search_method': 'hybrid', 'algorithm': 'hnsw', 'var_dtype': 'float16', 'distance_metric': 'cosine', 'ret_k': 1, 'ef_runtime': 50, 'ef_construction': 300, 'm': 64}. Best is trial 7 with value: 4.256.


10:15:35 redisvl.index.index INFO   Index already exists, not overwriting.
10:15:35 sentence_transformers.SentenceTransformer INFO   Use pytorch device_name: mps
10:15:35 sentence_transformers.SentenceTransformer INFO   Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2


Batches: 100%|██████████| 1/1 [00:00<00:00, 63.00it/s]

10:15:35 root INFO   Data indexed total_indexing_time=2.99s, num_docs=3633





10:15:37 root INFO   Saving metrics for study: 02c5c52b-9f83-4b87-93cf-3548f377d958, METRICS={'search_method': ['hybrid', 'vector', 'vector', 'vector', 'vector', 'vector', 'vector', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid'], 'total_indexing_time': [-2.938, -2.938, -2.938, -2.938, -2.938, -2.938, -1.368, -4.256, -2.851, -2.851, -2.851, -2.99, -2.99, -2.99, -2.99, -2.99, -2.99], 'avg_query_time': [-0.0030586490690154558, -0.0008422095709171827, -0.0009241443669463828, -0.000957826342745093, -0.0009460153963543682, -0.0009069782292510703, -0.0008249304981054537, -0.002800116967121514, -0.001982953895356264, -0.002098676959058449, -0.0022983004803258937, -0.003134949657570098, -0.0031798587125890397, -0.0031702208445168133, -0.003212623920972133, -0.003174193503317818, -0.003027929979212144], 'model': ['sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'sente

[I 2025-06-18 10:15:37,299] Trial 16 finished with value: 2.99 and parameters: {'model_info': {'type': 'hf', 'model': 'sentence-transformers/all-MiniLM-L6-v2', 'dim': 384, 'embedding_cache_name': 'vec-cache', 'dtype': 'float32'}, 'search_method': 'hybrid', 'algorithm': 'hnsw', 'var_dtype': 'float16', 'distance_metric': 'cosine', 'ret_k': 3, 'ef_runtime': 50, 'ef_construction': 150, 'm': 64}. Best is trial 7 with value: 4.256.


10:15:37 redisvl.index.index INFO   Index already exists, not overwriting.
10:15:37 sentence_transformers.SentenceTransformer INFO   Use pytorch device_name: mps
10:15:37 sentence_transformers.SentenceTransformer INFO   Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2


Batches: 100%|██████████| 1/1 [00:00<00:00, 57.45it/s]

10:15:37 root INFO   Data indexed total_indexing_time=2.99s, num_docs=3633





10:15:39 root INFO   Saving metrics for study: 02c5c52b-9f83-4b87-93cf-3548f377d958, METRICS={'search_method': ['hybrid', 'vector', 'vector', 'vector', 'vector', 'vector', 'vector', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid'], 'total_indexing_time': [-2.938, -2.938, -2.938, -2.938, -2.938, -2.938, -1.368, -4.256, -2.851, -2.851, -2.851, -2.99, -2.99, -2.99, -2.99, -2.99, -2.99, -2.99], 'avg_query_time': [-0.0030586490690154558, -0.0008422095709171827, -0.0009241443669463828, -0.000957826342745093, -0.0009460153963543682, -0.0009069782292510703, -0.0008249304981054537, -0.002800116967121514, -0.001982953895356264, -0.002098676959058449, -0.0022983004803258937, -0.003134949657570098, -0.0031798587125890397, -0.0031702208445168133, -0.003212623920972133, -0.003174193503317818, -0.003027929979212144, -0.0032318738222860332], 'model': ['sentence-transformers/all-MiniLM-L6-v2', 'sentence-transformers/all-MiniLM-L6-v2', 'senten

[I 2025-06-18 10:15:39,382] Trial 17 finished with value: 2.99 and parameters: {'model_info': {'type': 'hf', 'model': 'sentence-transformers/all-MiniLM-L6-v2', 'dim': 384, 'embedding_cache_name': 'vec-cache', 'dtype': 'float32'}, 'search_method': 'hybrid', 'algorithm': 'hnsw', 'var_dtype': 'float16', 'distance_metric': 'cosine', 'ret_k': 2, 'ef_runtime': 50, 'ef_construction': 300, 'm': 64}. Best is trial 7 with value: 4.256.


10:15:39 sentence_transformers.SentenceTransformer INFO   Use pytorch device_name: mps
10:15:39 sentence_transformers.SentenceTransformer INFO   Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2


Batches: 100%|██████████| 1/1 [00:00<00:00, 53.13it/s]

10:15:39 root INFO   Recreating index...





10:15:40 root INFO   Corpus size: 3633
10:15:41 root INFO   Data indexed total_indexing_time=1.365s, num_docs=3633
10:15:43 root INFO   Saving metrics for study: 02c5c52b-9f83-4b87-93cf-3548f377d958, METRICS={'search_method': ['hybrid', 'vector', 'vector', 'vector', 'vector', 'vector', 'vector', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid'], 'total_indexing_time': [-2.938, -2.938, -2.938, -2.938, -2.938, -2.938, -1.368, -4.256, -2.851, -2.851, -2.851, -2.99, -2.99, -2.99, -2.99, -2.99, -2.99, -2.99, -1.365], 'avg_query_time': [-0.0030586490690154558, -0.0008422095709171827, -0.0009241443669463828, -0.000957826342745093, -0.0009460153963543682, -0.0009069782292510703, -0.0008249304981054537, -0.002800116967121514, -0.001982953895356264, -0.002098676959058449, -0.0022983004803258937, -0.003134949657570098, -0.0031798587125890397, -0.0031702208445168133, -0.003212623920972133, -0.003174193503317818, -0.0030279299792

[I 2025-06-18 10:15:43,164] Trial 18 finished with value: 1.365 and parameters: {'model_info': {'type': 'hf', 'model': 'sentence-transformers/all-MiniLM-L6-v2', 'dim': 384, 'embedding_cache_name': 'vec-cache', 'dtype': 'float32'}, 'search_method': 'hybrid', 'algorithm': 'hnsw', 'var_dtype': 'float32', 'distance_metric': 'cosine', 'ret_k': 4, 'ef_runtime': 30, 'ef_construction': 100, 'm': 8}. Best is trial 7 with value: 4.256.


10:15:43 sentence_transformers.SentenceTransformer INFO   Use pytorch device_name: mps
10:15:43 sentence_transformers.SentenceTransformer INFO   Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2


Batches: 100%|██████████| 1/1 [00:00<00:00, 59.60it/s]

10:15:43 root INFO   Recreating index...





10:15:44 root INFO   Corpus size: 3633
10:15:47 root INFO   Data indexed total_indexing_time=3.312s, num_docs=3633
10:15:49 root INFO   Saving metrics for study: 02c5c52b-9f83-4b87-93cf-3548f377d958, METRICS={'search_method': ['hybrid', 'vector', 'vector', 'vector', 'vector', 'vector', 'vector', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid', 'hybrid'], 'total_indexing_time': [-2.938, -2.938, -2.938, -2.938, -2.938, -2.938, -1.368, -4.256, -2.851, -2.851, -2.851, -2.99, -2.99, -2.99, -2.99, -2.99, -2.99, -2.99, -1.365, -3.312], 'avg_query_time': [-0.0030586490690154558, -0.0008422095709171827, -0.0009241443669463828, -0.000957826342745093, -0.0009460153963543682, -0.0009069782292510703, -0.0008249304981054537, -0.002800116967121514, -0.001982953895356264, -0.002098676959058449, -0.0022983004803258937, -0.003134949657570098, -0.0031798587125890397, -0.0031702208445168133, -0.003212623920972133, -0.003174193503317818

[I 2025-06-18 10:15:49,176] Trial 19 finished with value: 3.312 and parameters: {'model_info': {'type': 'hf', 'model': 'sentence-transformers/all-MiniLM-L6-v2', 'dim': 384, 'embedding_cache_name': 'vec-cache', 'dtype': 'float32'}, 'search_method': 'hybrid', 'algorithm': 'hnsw', 'var_dtype': 'float16', 'distance_metric': 'cosine', 'ret_k': 5, 'ef_runtime': 10, 'ef_construction': 200, 'm': 16}. Best is trial 7 with value: 4.256.


10:15:49 root INFO   Completed Bayesian optimization... 


10:15:49 root INFO   Best Configuration: 7: {'model_info': {'type': 'hf', 'model': 'sentence-transformers/all-MiniLM-L6-v2', 'dim': 384, 'embedding_cache_name': 'vec-cache', 'dtype': 'float32'}, 'search_method': 'hybrid', 'algorithm': 'hnsw', 'var_dtype': 'float16', 'distance_metric': 'cosine', 'ret_k': 2, 'ef_runtime': 50, 'ef_construction': 300, 'm': 64}:


10:15:49 root INFO   Best Score: [4.256]
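
The "Best is trial 7" lines above can be checked by hand. This is a minimal sketch, not the optimizer's own code. It assumes each trial's value is the magnitude of its `total_indexing_time` entry in the final metrics log, which agrees with the values logged for trials 14 to 19, and that the study keeps the largest value as best. The `best_trial` helper is hypothetical and exists only for this illustration.

```python
# Per-trial values for trials 0-19, taken as the magnitudes of the
# total_indexing_time list in the last "Saving metrics" log line.
# Assumption: these are the values the study compares.
trial_values = [
    2.938, 2.938, 2.938, 2.938, 2.938, 2.938, 1.368, 4.256,
    2.851, 2.851, 2.851, 2.99, 2.99, 2.99, 2.99, 2.99,
    2.99, 2.99, 1.365, 3.312,
]


def best_trial(values, maximize=True):
    """Return (index, value) of the best trial (hypothetical helper)."""
    pick = max if maximize else min
    # On ties, max/min return the first index that reaches the extreme.
    best_index = pick(range(len(values)), key=values.__getitem__)
    return best_index, values[best_index]


index, value = best_trial(trial_values)
print(f"Best is trial {index} with value: {value}")
```

Under these assumptions the selected trial matches the logged best configuration. Passing `maximize=False` shows which trial a minimizing study would have kept instead.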




In [6]:
# Show configuration and metric columns per trial, best F1@k first.
cols = ["search_method", "algorithm", "vector_data_type", "ef_construction", "ef_runtime", "m",
        "avg_query_time", "total_indexing_time", "f1@k"]
metrics[cols].sort_values(by="f1@k", ascending=False)

| | search_method | algorithm | vector_data_type | ef_construction | ef_runtime | m | avg_query_time | total_indexing_time | f1@k |
|---|---|---|---|---|---|---|---|---|---|
| 0 | hybrid | hnsw | float16 | 150 | 20 | 64 | -0.003059 | -2.938 | 0.130712 |
| 11 | hybrid | hnsw | float16 | 150 | 50 | 64 | -0.003135 | -2.99 | 0.130712 |
| 18 | hybrid | hnsw | float32 | 100 | 30 | 8 | -0.00238 | -1.365 | 0.130712 |
| 17 | hybrid | hnsw | float16 | 300 | 50 | 64 | -0.003232 | -2.99 | 0.130712 |
| 16 | hybrid | hnsw | float16 | 150 | 50 | 64 | -0.003028 | -2.99 | 0.130712 |
| 15 | hybrid | hnsw | float16 | 300 | 50 | 64 | -0.003174 | -2.99 | 0.130712 |
| 14 | hybrid | hnsw | float16 | 150 | 50 | 64 | -0.003213 | -2.99 | 0.130712 |
| 13 | hybrid | hnsw | float16 | 250 | 50 | 64 | -0.00317 | -2.99 | 0.130712 |
| 12 | hybrid | hnsw | float16 | 150 | 50 | 64 | -0.00318 | -2.99 | 0.130712 |
| 10 | hybrid | hnsw | float32 | 150 | 50 | 64 | -0.002298 | -2.851 | 0.130712 |
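
Every row shown above has the same `f1@k`, so sorting on that column alone does not separate the configurations. The sketch below breaks the tie on latency and then on indexing time, using only the standard library and the ten rows displayed. It assumes the timing columns are stored negated, so a smaller magnitude means faster. The `trial` column name and the `rank_key` helper are introduced here for illustration.

```python
import csv
import io

# The ten rows displayed above; "trial" is an assumed name for the index column.
TABLE = """trial,search_method,algorithm,vector_data_type,ef_construction,ef_runtime,m,avg_query_time,total_indexing_time,f1@k
0,hybrid,hnsw,float16,150,20,64,-0.003059,-2.938,0.130712
11,hybrid,hnsw,float16,150,50,64,-0.003135,-2.99,0.130712
18,hybrid,hnsw,float32,100,30,8,-0.00238,-1.365,0.130712
17,hybrid,hnsw,float16,300,50,64,-0.003232,-2.99,0.130712
16,hybrid,hnsw,float16,150,50,64,-0.003028,-2.99,0.130712
15,hybrid,hnsw,float16,300,50,64,-0.003174,-2.99,0.130712
14,hybrid,hnsw,float16,150,50,64,-0.003213,-2.99,0.130712
13,hybrid,hnsw,float16,250,50,64,-0.00317,-2.99,0.130712
12,hybrid,hnsw,float16,150,50,64,-0.00318,-2.99,0.130712
10,hybrid,hnsw,float32,150,50,64,-0.002298,-2.851,0.130712
"""

rows = list(csv.DictReader(io.StringIO(TABLE)))


def rank_key(row):
    """Sort key: highest f1@k first, then fastest query, then fastest indexing."""
    return (
        -float(row["f1@k"]),
        abs(float(row["avg_query_time"])),       # assumed negated seconds
        abs(float(row["total_indexing_time"])),  # assumed negated seconds
    )


ranked = sorted(rows, key=rank_key)
for row in ranked[:3]:
    print(row["trial"], row["vector_data_type"], row["m"], row["avg_query_time"])
```

In the notebook itself, a comparable ordering would likely come from passing lists to `sort_values`, for example `by=["f1@k", "avg_query_time"]` with `ascending=[False, False]`, since the negated times sort fastest-first in descending order.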
