In [1]:
import pandas as pd

## Ingestion

In [2]:
df = pd.read_json('data/AWSBedrockRAG.json')

In [67]:
df

Unnamed: 0,id,service,category,title,content,tags
0,0,Amazon Bedrock,Terminology,Foundation model (FM),An AI model with a large number of parameters ...,"[foundation model, FM, AI model]"
1,1,Amazon Bedrock,Terminology,Base model,A foundation model that is packaged by a provi...,"[base model, foundation model]"
2,2,Amazon Bedrock,Terminology,Model inference,The process of a foundation model generating a...,"[model inference, inference, response generation]"
3,3,Amazon Bedrock,Terminology,Prompt,An input provided to a model to guide it to ge...,"[prompt, input, text prompt]"
4,4,Amazon Bedrock,Terminology,Inference parameters,Values that can be adjusted during model infer...,"[inference parameters, model control]"
...,...,...,...,...,...,...
643,643,Amazon Bedrock,Batch Inference,Monitoring Token Usage and Cost,To effectively monitor token usage and associa...,"[monitoring, usage, cost, token counts]"
644,644,Amazon Bedrock,Batch Inference,Token Counting Support Across Models and Regions,Token counting is a feature supported by all m...,"[supported models, regions, token counting, av..."
645,645,Amazon Bedrock,Batch Inference,Retrieving Token Counts from Model Responses,"When you interact with Amazon Bedrock models, ...","[token count, API response, input tokens, outp..."
646,646,Amazon Bedrock,Batch Inference,Example: Retrieving Token Counts with Python S...,You can retrieve token counts using the AWS SD...,"[example, python, boto3, API]"


In [5]:
df.columns = df.columns.str.lower()

In [6]:
df.columns

Index(['id', 'service', 'category', 'title', 'content', 'tags'], dtype='object')

In [7]:
documents = df.to_dict(orient="records")

In [8]:
import minsearch

In [9]:
index = minsearch.Index(
    text_fields = ['service', 'category', 'title', 'content'],
    keyword_fields = ['tags']
)

In [10]:
index.fit(documents)

<minsearch.minsearch.Index at 0x24b33f75810>

In [11]:
temp_query = "temperature in llm. answer in 2 lines"

In [12]:
index.search(temp_query, num_results=5)

[{'id': 164,
  'service': 'Amazon Bedrock',
  'category': 'Prompt engineering concepts',
  'title': 'Question-answer, with context',
  'content': 'In a question-answer, with context task, the user provides an input text followed by a specific question. The Large Language Model (LLM) is then instructed to answer the question solely based on the information contained within that provided text. Placing the question at the end of the prompt after the context is often beneficial for LLMs on Amazon Bedrock to better focus on the task. Model encouragement and wrapping input text in XML tags (for Anthropic Claude) can further enhance the accuracy of the generated responses.',
  'tags': ['question-answer',
   'with context',
   'information extraction',
   'prompting',
   'llm tasks']},
 {'id': 163,
  'service': 'Amazon Bedrock',
  'category': 'Prompt engineering concepts',
  'title': 'Question-answer, without context',
  'content': "In a question-answer, without context task, the Large Languag

## RAG Flow

In [21]:
import boto3
import os

In [22]:
from dotenv import load_dotenv
load_dotenv()

True

In [23]:
bearer_token = os.environ.get("AWS_BEARER_TOKEN_BEDROCK")

In [25]:
client =  boto3.client(
    service_name = "bedrock-runtime",
    region_name = "us-east-1"
)

In [26]:
model_id = "amazon.nova-micro-v1:0"
temp_query = "bedrock agent use. answer in 4 lines"

In [27]:
messages = [{ "role": "user", "content": [{"text": temp_query}] }]

In [28]:
response = client.converse(
 modelId=model_id,
 messages=messages,
)

In [29]:
print(response['output']['message']['content'][0]['text'])

Bedrock agents facilitate complex tasks by processing data and making decisions. They utilize machine learning to adapt and improve over time. These agents can automate repetitive processes, enhancing efficiency. Their applications span various fields, from healthcare to finance.


In [21]:
#search("what is prompt engineeing?")

In [22]:
def search(query):
    boost = {}

    results = index.search(
                query=query,
                filter_dict={},
                boost_dict=boost,
                num_results=3
            )

    return results

In [23]:
prompt_template = """
You're a aws expert. Answer the QUESTION based on the CONTEXT from our aws core service database.
Use only the facts from the CONTEXT when answering the QUESTION.

QUESTION: {question}

CONTEXT:
{context}
""".strip()

entry_template = """
id: {id}
service: {service}
category: {category}
title: {title}
content: {content}
tags: {tags}
""".strip()

def build_prompt(query, search_results):
    context =""

    for doc in search_results:
        context = context + entry_template.format(**doc) + "\n\n"

    prompt = prompt_template.format(question=query, context=context).strip()
    return prompt

In [24]:
def llm(prompt, model):
    client = boto3.client("bedrock-runtime", region_name="us-east-1")

    messages = [{"role": "user", "content": [{"text": prompt}] }]

    inference_config = {"temperature": 0.1, "topP": 0.9}

    response = client.converse(modelId=model, messages=messages, inferenceConfig=inference_config)
    
    try:
        return response["output"]["message"]["content"][0]["text"]
    except (KeyError, IndexError, TypeError):
        return ""

In [36]:
def rag(query, model):
    search_results = search(query)
    prompt = build_prompt(query, search_results)
    #print(prompt)
    answer = llm(prompt, model=model)
    return answer

In [37]:
question = "how much it cost for fine tunning model? answr in concise way."
model_id = "amazon.nova-micro-v1:0"
answer = rag(question, model_id)
print(answer)

The cost for fine-tuning a model in Amazon Bedrock is not explicitly mentioned in the provided context. However, the context does detail the cost structure for running custom models based on the volume of input and output tokens processed during inference, and the need to purchase Provisioned Throughput. For specific fine-tuning costs, detailed pricing information is available on the Model providers page in the Amazon Bedrock console.


## ElasticSearch 

In [27]:
from elasticsearch import Elasticsearch

In [28]:
es_client = Elasticsearch('http://localhost:9200') 

In [29]:
index_settings = {
    "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0
    },
    "mappings": {
        "properties": {
            "service": {"type": "text"},
            "category": {"type": "text"},
            "title": {"type": "text"},
            "content": {"type": "text"},
            "tags": {"type": "keyword"}
        }
    }
}

index_name = "bedrock_knowledge_base"

es_client.indices.create(index=index_name, body=index_settings)

ObjectApiResponse({'acknowledged': True, 'shards_acknowledged': True, 'index': 'bedrock_knowledge_base'})

In [30]:
documents[0]

{'id': 0,
 'service': 'Amazon Bedrock',
 'category': 'Terminology',
 'title': 'Foundation model (FM)',
 'content': 'An AI model with a large number of parameters and trained on a massive amount of diverse data. A foundation model can generate a variety of responses for a wide range of use cases. Foundation models can generate text or image, and can also convert input into embeddings. Before you can use an Amazon Bedrock foundation model, you must request access.',
 'tags': ['foundation model', 'FM', 'AI model']}

In [31]:
from tqdm.auto import tqdm

In [32]:
for doc in tqdm(documents):
    es_client.index(index=index_name, document=doc)

  0%|          | 0/648 [00:00<?, ?it/s]

In [33]:
def elastic_search(query):
    search_query = {
        "size": 5,
        "query": {
            "bool": {
                "must": {
                    "multi_match": {
                        "query": query,
                        "fields": ["category^3", "title^2", "content", "tags^3"],
                        "type": "best_fields"
                    }
                },
                "filter": {
                    "match": {
                        "service": "Amazon Bedrock"
                    }
                }
            }
        }
    }

    response = es_client.search(index=index_name, body=search_query)
    
    result_docs = []
    
    for hit in response['hits']['hits']:
        result_docs.append(hit['_source'])
    
    return result_docs

In [34]:
def rag(query, model_id):
    search_results = elastic_search(query)
    prompt = build_prompt(query, search_results)
    #print(prompt)
    answer = llm(prompt, model_id)
    return answer

In [35]:
query = "how much it cost for fine tunning model? answer in concise way."
model_id = "amazon.nova-micro-v1:0"
answer = rag(query, model_id)
print(answer)

The cost for fine-tuning a model in Amazon Bedrock is not explicitly mentioned in the provided context. The context details the cost of running a custom model based on input and output tokens, provisioned throughput, and optimization strategies, but does not specify the cost associated with fine-tuning. For detailed pricing information, refer to the Model providers page in the Amazon Bedrock console.


## Retrieval Evaluation

In [38]:
df_question = pd.read_csv('data/ground-truth-retrieval.csv')

In [39]:
df_question.head()

Unnamed: 0,id,question
0,0,What is the definition of a foundation model i...
1,0,How can a foundation model generate a variety ...
2,0,What types of data can a foundation model conv...
3,0,What is required before using an Amazon Bedroc...
4,0,What are the different use cases for a foundat...


In [40]:
ground_truth = df_question.to_dict(orient='records')

In [41]:
ground_truth[0]

{'id': 0,
 'question': 'What is the definition of a foundation model in Amazon Bedrock?'}

In [42]:
def hit_rate(relevance_total):
    cnt = 0

    for line in relevance_total:
        if True in line:
            cnt = cnt + 1

    return cnt / len(relevance_total)

def mrr(relevance_total):
    total_score = 0.0

    for line in relevance_total:
        for rank in range(len(line)):
            if line[rank] == True:
                total_score = total_score + 1 / (rank + 1)

    return total_score / len(relevance_total)

In [43]:
def minsearch_search(query):
    boost = {
        "category": 2.0,
        "tags": 1.5,
        "title": 1.0 
    }

    results = index.search(
        query=query,
        filter_dict={},
        boost_dict=boost,
        num_results=10
    )

    return results

In [44]:
relevance_total = []

In [45]:
def evaluate(ground_truth, search_function):

    for q in tqdm(ground_truth):
        doc_id = q['id']
        results = search_function(q)
        relevance = [d['id'] == doc_id for d in results]
        relevance_total.append(relevance)

    return {
        'hit_rate': hit_rate(relevance_total),
        'mrr': mrr(relevance_total),
    }

In [46]:
from tqdm.auto import tqdm

In [47]:
evaluate(ground_truth, lambda q: minsearch_search(q['question']))

  0%|          | 0/1620 [00:00<?, ?it/s]

{'hit_rate': 0.341358024691358, 'mrr': 0.24078287281990976}

In [48]:
evaluate(ground_truth, lambda q: elastic_search(q['question']))

  0%|          | 0/1620 [00:00<?, ?it/s]

{'hit_rate': 0.5098765432098765, 'mrr': 0.38671859690378224}

## Finding the best parameters for minsearch

In [49]:
df_validation = df_question[:100]
df_test = df_question[100:]

In [50]:
gt_val = df_validation.to_dict(orient='records')

In [51]:
def minsearch_search(query, boost=None):
    if boost is None:
        boost = {}

    results = index.search(
        query=query,
        filter_dict={},
        boost_dict=boost,
        num_results=10
    )

    return results

In [53]:
import optuna

In [54]:
SEED = 42  # use only this seed as requested

param_ranges = {
    'category': (0.0, 2.0),
    'tags': (0.0, 1.5),
    'title': (0.0, 1.0),
}

def optuna_objective(trial):
    # Suggest floats in the same ranges as your original random search
    boost = {}
    for name, (low, high) in param_ranges.items():
        boost[name] = trial.suggest_uniform(name, low, high)

    # keep your evaluation plumbing unchanged
    def search_function(q):
        return minsearch_search(q['question'], boost)

    results = evaluate(gt_val, search_function)
    # maximize MRR
    return results['mrr']

def run_optuna_search(n_trials=20):
    sampler = optuna.samplers.TPESampler(seed=SEED)
    study = optuna.create_study(direction='maximize', sampler=sampler)
    study.optimize(optuna_objective, n_trials=n_trials)

    best_params = study.best_params
    best_value = study.best_value

    # Prepare a boost dict identical to what your old code returned
    best_boost = {k: float(v) for k, v in best_params.items()}

    print("Optuna best MRR:", best_value)
    print("Optuna best boost:", best_boost)

    return best_boost, best_value, study

# Example call (keeps n_trials same as your previous n_iterations)
best_boost, best_value, study = run_optuna_search(n_trials=20)


[I 2025-09-15 16:47:39,284] A new study created in memory with name: no-name-2b644404-00f7-492f-8e81-4216df585156
  boost[name] = trial.suggest_uniform(name, low, high)


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:47:42,796] Trial 0 finished with value: 0.3883877483128981 and parameters: {'category': 0.749080237694725, 'tags': 1.4260714596148742, 'title': 0.7319939418114051}. Best is trial 0 with value: 0.3883877483128981.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:47:46,528] Trial 1 finished with value: 0.3859426910299003 and parameters: {'category': 1.1973169683940732, 'tags': 0.23402796066365478, 'title': 0.15599452033620265}. Best is trial 0 with value: 0.3883877483128981.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:47:49,487] Trial 2 finished with value: 0.39262857591247413 and parameters: {'category': 0.11616722433639892, 'tags': 1.2992642186624028, 'title': 0.6011150117432088}. Best is trial 2 with value: 0.39262857591247413.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:47:52,597] Trial 3 finished with value: 0.3921869004011861 and parameters: {'category': 1.416145155592091, 'tags': 0.03087674144370367, 'title': 0.9699098521619943}. Best is trial 2 with value: 0.39262857591247413.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:47:55,652] Trial 4 finished with value: 0.3874718826924708 and parameters: {'category': 1.6648852816008435, 'tags': 0.31850866601741423, 'title': 0.18182496720710062}. Best is trial 2 with value: 0.39262857591247413.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:47:58,665] Trial 5 finished with value: 0.39083333333333314 and parameters: {'category': 0.36680901970686763, 'tags': 0.4563633644393066, 'title': 0.5247564316322378}. Best is trial 2 with value: 0.39262857591247413.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:48:01,496] Trial 6 finished with value: 0.3917265933446134 and parameters: {'category': 0.8638900372842315, 'tags': 0.43684371029706287, 'title': 0.6118528947223795}. Best is trial 2 with value: 0.39262857591247413.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:48:04,292] Trial 7 finished with value: 0.39591780606632054 and parameters: {'category': 0.27898772130408367, 'tags': 0.43821697280282723, 'title': 0.3663618432936917}. Best is trial 7 with value: 0.39591780606632054.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:48:07,241] Trial 8 finished with value: 0.3951824054903761 and parameters: {'category': 0.9121399684340719, 'tags': 1.1777639420895203, 'title': 0.19967378215835974}. Best is trial 7 with value: 0.39591780606632054.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:48:11,248] Trial 9 finished with value: 0.3929911650194664 and parameters: {'category': 1.0284688768272232, 'tags': 0.8886218532930636, 'title': 0.046450412719997725}. Best is trial 7 with value: 0.39591780606632054.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:48:17,550] Trial 10 finished with value: 0.39589102845439184 and parameters: {'category': 0.41523768324303534, 'tags': 0.7099909821820028, 'title': 0.35349387565816887}. Best is trial 7 with value: 0.39591780606632054.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:48:23,090] Trial 11 finished with value: 0.3986383705133697 and parameters: {'category': 0.4620399792515078, 'tags': 0.7784266882280907, 'title': 0.3670717954717113}. Best is trial 11 with value: 0.3986383705133697.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:48:26,470] Trial 12 finished with value: 0.40414429060904744 and parameters: {'category': 0.026035792562728532, 'tags': 0.7370529066411016, 'title': 0.3577123916785768}. Best is trial 12 with value: 0.40414429060904744.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:48:30,007] Trial 13 finished with value: 0.4095445915435131 and parameters: {'category': 0.007417207619000346, 'tags': 0.877397308352559, 'title': 0.3560912233349938}. Best is trial 13 with value: 0.4095445915435131.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:48:32,949] Trial 14 finished with value: 0.41409433393610506 and parameters: {'category': 0.0469671912922755, 'tags': 0.9738221775963493, 'title': 0.4263422836814598}. Best is trial 14 with value: 0.41409433393610506.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:48:39,309] Trial 15 finished with value: 0.4150452577725295 and parameters: {'category': 0.6365725002496158, 'tags': 1.034835635024538, 'title': 0.7676259977007911}. Best is trial 15 with value: 0.4150452577725295.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:48:43,377] Trial 16 finished with value: 0.4160535955272788 and parameters: {'category': 0.6630897845793549, 'tags': 1.0497103944025048, 'title': 0.8297184613542987}. Best is trial 16 with value: 0.4160535955272788.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:48:46,759] Trial 17 finished with value: 0.41706963340891806 and parameters: {'category': 0.6435343552005294, 'tags': 1.1113941097838445, 'title': 0.8825505476032763}. Best is trial 17 with value: 0.41706963340891806.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:48:49,538] Trial 18 finished with value: 0.41660073497622035 and parameters: {'category': 1.2552858905118995, 'tags': 1.153034814070398, 'title': 0.9913504578900646}. Best is trial 17 with value: 0.41706963340891806.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:48:52,443] Trial 19 finished with value: 0.41443618987034925 and parameters: {'category': 1.9920330634948704, 'tags': 1.4674120420139587, 'title': 0.9918275293482719}. Best is trial 17 with value: 0.41706963340891806.


Optuna best MRR: 0.41706963340891806
Optuna best boost: {'category': 0.6435343552005294, 'tags': 1.1113941097838445, 'title': 0.8825505476032763}


In [55]:
def minsearch_search(query):
    boost = {
        "category": 0.20,
        "tags": 0.11,
        "title": 0.92
    }

    results = index.search(
        query=query,
        filter_dict={},
        boost_dict=boost,
        num_results=10
    )

    return results

In [56]:
evaluate(ground_truth, lambda q: minsearch_search(q['question']))

  0%|          | 0/1620 [00:00<?, ?it/s]

{'hit_rate': 0.6293002915451895, 'mrr': 0.4619597737054016}

## Finding the best parameters for elasticsearch

In [57]:
df_validation = df_question[:100]
df_test = df_question[100:]

In [58]:
gt_val = df_validation.to_dict(orient='records')

In [62]:
def elastic_search(query, boost=None):
    if boost is None:
        boost = {'category': 3.0, 'title': 2.0, 'content': 1.0, 'tags': 3.0}

    fields = []
    for field, weight in boost.items():
        if weight > 0:
            fields.append(f"{field}^{weight}")
        else:
            fields.append(field)
    
    search_query = {
        "size": 5,
        "query": {
            "bool": {
                "must": {
                    "multi_match": {
                        "query": query,
                        "fields": fields,
                        "type": "best_fields"
                    }
                },
                "filter": {
                    "match": {
                        "service": "Amazon Bedrock"
                    }
                }
            }
        }
    }
    response = es_client.search(index=index_name, body=search_query)
    
    result_docs = []
    for hit in response['hits']['hits']:
        result_docs.append(hit['_source'])
    
    return result_docs

In [60]:
import optuna

In [61]:
SEED = 42  # use only this seed as requested

param_ranges = {
    'category': (0.0, 4.0),   
    'title': (0.0, 4.0),        
    'content': (0.0, 4.0),     
    'tags': (0.0, 4.0),
}

def optuna_objective(trial):
    # Suggest floats in the same ranges as your original random search
    boost = {}
    for name, (low, high) in param_ranges.items():
        boost[name] = trial.suggest_uniform(name, low, high)

    # keep your evaluation plumbing unchanged
    def search_function(q):
        return elastic_search(q['question'], boost)

    results = evaluate(gt_val, search_function)
    # maximize MRR
    return results['mrr']

def run_optuna_search(n_trials=20):
    sampler = optuna.samplers.TPESampler(seed=SEED)
    study = optuna.create_study(direction='maximize', sampler=sampler)
    study.optimize(optuna_objective, n_trials=n_trials)

    best_params = study.best_params
    best_value = study.best_value

    # Prepare a boost dict identical to what your old code returned
    best_boost = {k: float(v) for k, v in best_params.items()}

    print("Optuna best MRR:", best_value)
    print("Optuna best boost:", best_boost)

    return best_boost, best_value, study

# Example call (keeps n_trials same as your previous n_iterations)
best_boost, best_value, study = run_optuna_search(n_trials=20)

[I 2025-09-15 16:58:53,388] A new study created in memory with name: no-name-852c8c55-482a-4273-8936-0fc787b4f7d2
  boost[name] = trial.suggest_uniform(name, low, high)


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:59:00,580] Trial 0 finished with value: 0.46388084975369565 and parameters: {'category': 1.49816047538945, 'title': 3.8028572256396647, 'content': 2.9279757672456204, 'tags': 2.3946339367881464}. Best is trial 0 with value: 0.46388084975369565.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:59:07,486] Trial 1 finished with value: 0.4630208754890068 and parameters: {'category': 0.6240745617697461, 'title': 0.6239780813448106, 'content': 0.23233444867279784, 'tags': 3.4647045830997407}. Best is trial 0 with value: 0.46388084975369565.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:59:14,409] Trial 2 finished with value: 0.46180782787975616 and parameters: {'category': 2.404460046972835, 'title': 2.832290311184182, 'content': 0.08233797718320979, 'tags': 3.8796394086479773}. Best is trial 0 with value: 0.46388084975369565.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:59:21,412] Trial 3 finished with value: 0.4598315951725052 and parameters: {'category': 3.329770563201687, 'title': 0.8493564427131046, 'content': 0.7272998688284025, 'tags': 0.7336180394137353}. Best is trial 0 with value: 0.46388084975369565.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:59:28,436] Trial 4 finished with value: 0.4619715644409948 and parameters: {'category': 1.216968971838151, 'title': 2.0990257265289514, 'content': 1.727780074568463, 'tags': 1.1649165607921677}. Best is trial 0 with value: 0.46388084975369565.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:59:35,301] Trial 5 finished with value: 0.46375925571301013 and parameters: {'category': 2.447411578889518, 'title': 0.5579754426081673, 'content': 1.1685785941408726, 'tags': 1.4654473731747668}. Best is trial 0 with value: 0.46388084975369565.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:59:42,652] Trial 6 finished with value: 0.4627373078861184 and parameters: {'category': 1.8242799368681437, 'title': 3.1407038455720544, 'content': 0.7986951286334389, 'tags': 2.0569377536544464}. Best is trial 0 with value: 0.46388084975369565.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:59:49,830] Trial 7 finished with value: 0.4647076650503554 and parameters: {'category': 2.36965827544817, 'title': 0.1858016508799909, 'content': 2.4301794076057535, 'tags': 0.6820964947491661}. Best is trial 7 with value: 0.4647076650503554.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 16:59:57,164] Trial 8 finished with value: 0.4673768102601876 and parameters: {'category': 0.26020637194111806, 'title': 3.795542149013333, 'content': 3.8625281322982374, 'tags': 3.2335893924658445}. Best is trial 8 with value: 0.4673768102601876.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 17:00:03,983] Trial 9 finished with value: 0.46923800436205126 and parameters: {'category': 1.2184550766934827, 'title': 0.3906884560255355, 'content': 2.7369321060486276, 'tags': 1.7606099749584052}. Best is trial 9 with value: 0.46923800436205126.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 17:00:10,931] Trial 10 finished with value: 0.47105243479301384 and parameters: {'category': 3.7805461769215456, 'title': 1.4549106944865244, 'content': 3.4138884150769346, 'tags': 0.07184750245927995}. Best is trial 10 with value: 0.47105243479301384.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 17:00:17,796] Trial 11 finished with value: 0.47282184213635947 and parameters: {'category': 3.743031258080954, 'title': 1.3754701764520596, 'content': 3.494642402901741, 'tags': 0.0978537724175319}. Best is trial 11 with value: 0.47282184213635947.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 17:00:24,648] Trial 12 finished with value: 0.4745478816526624 and parameters: {'category': 3.9253051520013473, 'title': 1.4669605055263262, 'content': 3.975038528143521, 'tags': 0.0193091799165684}. Best is trial 12 with value: 0.4745478816526624.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 17:00:31,449] Trial 13 finished with value: 0.4762321284445995 and parameters: {'category': 3.894857815269665, 'title': 1.545578232165592, 'content': 3.975269225366084, 'tags': 0.0426745558849756}. Best is trial 13 with value: 0.4762321284445995.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 17:00:38,565] Trial 14 finished with value: 0.47787608225108363 and parameters: {'category': 3.0330095954911385, 'title': 2.0116409406313647, 'content': 3.8673100466218453, 'tags': 0.5763650708863748}. Best is trial 14 with value: 0.47787608225108363.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 17:00:45,319] Trial 15 finished with value: 0.47952057300461703 and parameters: {'category': 3.001542641636441, 'title': 2.1780809075385528, 'content': 3.3362927917656244, 'tags': 0.799602586720596}. Best is trial 15 with value: 0.47952057300461703.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 17:00:52,175] Trial 16 finished with value: 0.4812960336003575 and parameters: {'category': 3.0776966655866285, 'title': 2.34771477449443, 'content': 3.170568439631311, 'tags': 0.9085560466035147}. Best is trial 16 with value: 0.4812960336003575.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 17:00:59,606] Trial 17 finished with value: 0.4830978499945024 and parameters: {'category': 2.785225730790395, 'title': 2.6297953705047887, 'content': 2.2574800507322546, 'tags': 1.153828334261493}. Best is trial 17 with value: 0.4830978499945024.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 17:01:06,555] Trial 18 finished with value: 0.48408798108284506 and parameters: {'category': 2.9514209580170374, 'title': 2.686051220520369, 'content': 1.9878776004546268, 'tags': 2.578694741072342}. Best is trial 18 with value: 0.48408798108284506.


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-09-15 17:01:13,310] Trial 19 finished with value: 0.4845892454047088 and parameters: {'category': 2.667831003311455, 'title': 2.8137511819215306, 'content': 1.848459033604512, 'tags': 2.6123431320671635}. Best is trial 19 with value: 0.4845892454047088.


Optuna best MRR: 0.4845892454047088
Optuna best boost: {'category': 2.667831003311455, 'title': 2.8137511819215306, 'content': 1.848459033604512, 'tags': 2.6123431320671635}


In [63]:
def elastic_search(query):
    search_query = {
        "size": 5,
        "query": {
            "bool": {
                "must": {
                    "multi_match": {
                        "query": query,
                        "fields": ["category^2.6", "title^2.8", "content^1.8", "tags^2.6"],
                        "type": "best_fields"
                    }
                },
                "filter": {
                    "match": {
                        "service": "Amazon Bedrock"
                    }
                }
            }
        }
    }

    response = es_client.search(index=index_name, body=search_query)
    
    result_docs = []
    
    for hit in response['hits']['hits']:
        result_docs.append(hit['_source'])
    
    return result_docs

In [64]:
evaluate(ground_truth, lambda q: elastic_search(q['question']))

  0%|          | 0/1620 [00:00<?, ?it/s]

{'hit_rate': 0.6721374045801527, 'mrr': 0.515851849327514}