# Definition of Combined_metric

**Given**

* Lexical columns: $L_i$, $i=1,\dots,n_{\rm lex}$
* Semantic columns: $S_j$, $j=1,\dots,n_{\rm sem}$
* Internal weights:

  $$
    A1_i\quad\text{for each }L_i,\quad\sum_{i=1}^{n_{\rm lex}}A1_i = 1,
    \quad
    A2_j\quad\text{for each }S_j,\quad\sum_{j=1}^{n_{\rm sem}}A2_j = 1
  $$
* Global weights:

  $$
    B_1, B_2,\quad B_1 + B_2 = 1
  $$

**1. Computing the partial scores**

$$
\begin{aligned}
\text{lexical\_score}
&= \sum_{i=1}^{n_{\rm lex}} A1_i \;L_i,\\
\text{semantic\_score}
&= \sum_{j=1}^{n_{\rm sem}} A2_j \;S_j.
\end{aligned}
$$

**2. Final combination**

$$
\text{combined\_metric}
= B_1 \,\times\, \text{lexical\_score}
\;+\;
B_2 \,\times\, \text{semantic\_score}.
$$
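The two steps above can be sketched in NumPy as follows. This is a minimal illustration, not the project's implementation: the function name `combined_metric`, its signature, and the toy values are assumptions made for the example. Each row of `L` and `S` holds the lexical and semantic column values for one item.

```python
import numpy as np


def combined_metric(L, S, A1, A2, B1, B2):
    """Weighted combination of lexical and semantic columns (illustrative helper)."""
    L = np.asarray(L, dtype=float)    # shape (n_rows, n_lex)
    S = np.asarray(S, dtype=float)    # shape (n_rows, n_sem)
    A1 = np.asarray(A1, dtype=float)  # shape (n_lex,), assumed to sum to 1
    A2 = np.asarray(A2, dtype=float)  # shape (n_sem,), assumed to sum to 1
    # Step 1: partial scores, one weighted sum per row and column group.
    lexical_score = L @ A1
    semantic_score = S @ A2
    # Step 2: final combination with the global weights (B1 + B2 = 1).
    return B1 * lexical_score + B2 * semantic_score


# Toy values, made up for illustration: two rows, two columns per group.
L_toy = [[0.2, 0.6], [0.4, 0.8]]
S_toy = [[0.5, 0.9], [0.7, 0.1]]
result = combined_metric(L_toy, S_toy, A1=[0.5, 0.5], A2=[0.25, 0.75], B1=0.3, B2=0.7)
print(result)
```

For the first row, the lexical score is 0.4 and the semantic score is 0.8, so the combined value is 0.3 × 0.4 + 0.7 × 0.8 = 0.68.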

---

# Generating the Base Agg Frame

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
from research.src.functions import get_final_corr, get_agg_frame


In [3]:
avg_summeval_metrics,df_agg = get_agg_frame(n=2)

100%|██████████| 22/22 [00:00<00:00, 577.86it/s]
100%|██████████| 22/22 [00:40<00:00,  1.84s/it]


## Viewing the correlation with default weights

In [4]:
get_final_corr(avg_summeval_metrics,df_agg)

np.float64(0.9999999999999999)
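
A correlation this close to 1 deserves a sanity check. One possible cause is that the correlation is computed over only two aggregated points, given that the frame was built with `n=2`. This is an assumption: the source does not say how `get_final_corr` computes its result. With exactly two distinct points, Pearson's coefficient is always +1 or -1, whatever the weights are, as this small sketch shows:

```python
import numpy as np

# Two distinct points always lie exactly on a line, so Pearson's r
# is +1 when both series move in the same direction and -1 otherwise.
r_up = np.corrcoef([0.1, 0.9], [3.0, 7.0])[0, 1]
r_down = np.corrcoef([0.1, 0.9], [7.0, 3.0])[0, 1]
print(r_up, r_down)
```

If that is what is happening here, the objective only carries a sign, and more points would be needed to compare weight choices.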

## Testing custom weights

Here we only check the lengths that the weight vectors A1 and A2 must have.

In [5]:
from research.src.functions import get_metric_columns

lexical_cols, semantic_cols = get_metric_columns(df_agg)
# Count the columns in each group; A1 and A2 must have these lengths
n_lex = len(lexical_cols)
n_sem = len(semantic_cols)
print(f"n_lex: {n_lex}, n_sem: {n_sem}")

n_lex: 3, n_sem: 3


In [6]:
A1 = [.2,.3,.5]
A2 = [.2,.2,.6]
B1 = .3
B2 = .7
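
Before passing custom weights on, it can help to check them against the constraints from the definition: matching lengths, and sums equal to 1. The helper below is a hypothetical sketch (`check_weights` is not part of the project's code) that uses a tolerance because of floating-point rounding:

```python
import math


def check_weights(A1, A2, B1, B2, n_lex, n_sem):
    """Raise ValueError if the weights violate the constraints (illustrative helper)."""
    if len(A1) != n_lex or len(A2) != n_sem:
        raise ValueError("A1 and A2 must have n_lex and n_sem entries, respectively")
    for name, total in (("A1", sum(A1)), ("A2", sum(A2)), ("B1 + B2", B1 + B2)):
        # Compare with a tolerance instead of == to absorb rounding error.
        if not math.isclose(total, 1.0, abs_tol=1e-9):
            raise ValueError(f"{name} must sum to 1, got {total}")


# The weights used in this notebook satisfy all three constraints.
check_weights([.2, .3, .5], [.2, .2, .6], .3, .7, n_lex=3, n_sem=3)
```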

In [7]:
teste = get_final_corr(avg_summeval_metrics,df_agg,A1=A1,A2=A2,B1=B1,B2=B2)

In [8]:
teste

np.float64(-0.9999999999999999)

# Creating the Study - Optuna

In [9]:
import optuna
import pandas as pd
from functools import partial
from src.model import objective, get_best_weights

## Defining the objective function

In [10]:
objective_fn = partial(
    objective,
    avg_summeval_metrics=avg_summeval_metrics,
    df_agg=df_agg
)
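
The real `objective` lives in `src.model` and is not shown in this notebook. As a rough idea of what such a function might do, here is a hypothetical sketch: it samples one raw weight per column with `trial.suggest_float`, using the `A1_i` / `A2_j` parameter names that appear in the trial log, normalizes each group to sum to 1, and returns a score. The names `sketch_objective`, `normalize`, `score_fn`, and `StubTrial` are all made up for the example. `StubTrial` only exists so that the sketch runs without Optuna installed.

```python
def normalize(raw):
    """Scale non-negative raw weights to sum to 1 (equal weights if all are 0)."""
    total = sum(raw)
    if total == 0:
        return [1.0 / len(raw)] * len(raw)
    return [w / total for w in raw]


def sketch_objective(trial, score_fn, n_lex=3, n_sem=3):
    """Hypothetical objective: sample raw weights, normalize them, return a score."""
    raw_a1 = [trial.suggest_float(f"A1_{i}", 0.0, 1.0) for i in range(n_lex)]
    raw_a2 = [trial.suggest_float(f"A2_{j}", 0.0, 1.0) for j in range(n_sem)]
    # score_fn stands in for whatever computes the correlation in the real code.
    return score_fn(normalize(raw_a1), normalize(raw_a2))


class StubTrial:
    """Stand-in for an Optuna trial: returns fixed values instead of sampling."""

    def __init__(self, params):
        self.params = params

    def suggest_float(self, name, low, high):
        return self.params[name]


stub = StubTrial({"A1_0": 0.2, "A1_1": 0.2, "A1_2": 0.4,
                  "A2_0": 0.1, "A2_1": 0.1, "A2_2": 0.2})
# Toy score: sum of the first normalized weight of each group.
demo_value = sketch_objective(stub, lambda a1, a2: a1[0] + a2[0])
print(demo_value)
```

With Optuna available, a function of this shape could be bound with `partial(sketch_objective, score_fn=...)` and passed to `study.optimize`, in the same way `objective_fn` is built above.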

In [11]:
study = optuna.create_study(direction="maximize")

[I 2025-05-07 09:57:40,240] A new study created in memory with name: no-name-79a9fd60-b609-49a6-a363-f02f6771471f


In [12]:
study.optimize(objective_fn, n_trials=50) 

[I 2025-05-07 09:57:40,778] Trial 0 finished with value: 0.9999999999999999 and parameters: {'A1_0': 0.25785443552981857, 'A1_1': 0.011140974193092235, 'A1_2': 0.8004729511223438, 'A2_0': 0.1636959235748594, 'A2_1': 0.05763655713417615, 'A2_2': 0.1256710165654752}. Best is trial 0 with value: 0.9999999999999999.
[I 2025-05-07 09:57:40,912] Trial 1 finished with value: -0.9999999999999999 and parameters: {'A1_0': 0.6585323037775125, 'A1_1': 0.25554335839694353, 'A1_2': 0.39682593480450556, 'A2_0': 0.161607223913661, 'A2_1': 0.8025401073076451, 'A2_2': 0.2812584053548074}. Best is trial 0 with value: 0.9999999999999999.
[I 2025-05-07 09:57:40,946] Trial 2 finished with value: 0.9999999999999999 and parameters: {'A1_0': 0.35867216053599205, 'A1_1': 0.2832215392182925, 'A1_2': 0.4880074802957508, 'A2_0': 0.9266280010057905, 'A2_1': 0.9423934614870204, 'A2_2': 0.803223525609393}. Best is trial 0 with value: 0.9999999999999999.
[I 2025-05-07 09:57:40,985] Trial 3 finished with value: 0.99999

## Analyzing the study

In [13]:
get_best_weights(study,df_agg)


🧪 Weight Optimization Result
───────────────────────────────────
▶️ Lexical weights (A1):
    • lexical_rouge1_f1   : 0.2411
    • lexical_rougeL_f1   : 0.0104
    • lexical_bleu        : 0.7485

▶️ Semantic weights (A2):
    • semantic_bert_score_precision: 0.4717
    • semantic_bert_score_recall: 0.1661
    • semantic_bert_score_f1: 0.3622

▶️ Best correlation (objective):
    1.0000
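
The reported weights are not the raw trial parameters. They appear to be the parameters of trial 0, the best trial in the log, divided by the sum of their group. That `get_best_weights` normalizes this way is an inference from the numbers, not something the source states. The check below reproduces it with the values copied from the log:

```python
# Raw parameters of trial 0, copied from the optimization log.
raw_a1 = [0.25785443552981857, 0.011140974193092235, 0.8004729511223438]
raw_a2 = [0.1636959235748594, 0.05763655713417615, 0.1256710165654752]


def normalize(raw):
    """Divide each raw weight by the group total so the group sums to 1."""
    total = sum(raw)
    return [w / total for w in raw]


a1 = normalize(raw_a1)
a2 = normalize(raw_a2)
print([round(w, 4) for w in a1])
print([round(w, 4) for w in a2])
```

To four decimals the results match the printed A1 and A2 weights, which is consistent with the raw parameters being rescaled so that each group sums to 1.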



In [14]:
from plotly.io import show


# Plot every tuned parameter; `params` expects a list of parameter names
fig = optuna.visualization.plot_parallel_coordinate(study, params=list(study.best_params.keys()))
show(fig)