### op16 新增部分
    # 固定定义6个隐藏层参数
    hd0 = trial.suggest_categorical("NEW_hidden_dim_0", candidate_values)
    hd1 = trial.suggest_categorical("NEW_hidden_dim_1", candidate_values)
    hd2 = trial.suggest_categorical("NEW_hidden_dim_2", candidate_values)
    hd3 = trial.suggest_categorical("NEW_hidden_dim_3", candidate_values)
    hd4 = trial.suggest_categorical("NEW_hidden_dim_4", candidate_values)
    hd5 = trial.suggest_categorical("NEW_hidden_dim_5", candidate_values)
    hidden_dims_all = [hd0, hd1, hd2, hd3, hd4, hd5]

    # 只使用前 num_layers 个隐藏层
    hidden_dims = hidden_dims_all[:num_layers]


### Below is the complete code with all the modifications integrated. In this version:

**The number of layers is restricted to 2–6.
For each trial, the hidden dimensions are sampled one by one so that they form a strictly decreasing (trapezoidal) sequence. For example, if you sample 128 for layer 0, then layer 1 will be chosen from values strictly less than 128.
Hyperparameters (learning rate, dropout, num_layers, and each hidden dimension) are all logged so that they appear in the Optuna dashboard.
During training, intermediate validation losses are reported to enable early-stopping tracking in the “Intermediate Values” plot of the dashboard.**


### GNN层遍历的架构逻辑

1. **Use a Fixed Candidate Set:**  
   Instead of sampling any integer between 16 and 256, we'll define a fixed set of allowable hidden dimensions. For example, the candidate set will be:  
   ```python
   candidate_values = [256, 128, 64, 32, 16]
   ```  
   This set contains exactly the values you mentioned in your examples.

2. **Enforce a Non-Increasing (Trapezoidal) Structure:**  
   For each layer, we will choose a hidden dimension from the candidate set with the following rules:
   - **First Layer:**  
     For layer 0, we simply choose one value from the entire candidate set.
   - **Subsequent Layers:**  
     For layer _i_ (where _i_ > 0), we restrict the available options to only those values that are less than or equal to the hidden dimension chosen for layer _i-1_.  
     This means if the first layer was chosen as 256, then for the next layer, the allowed options will be [256, 128, 64, 32, 16].  
     If the first layer is 128, then the second layer can only be one of [128, 64, 32, 16].  
     This restriction ensures that the sequence is non-increasing (i.e., it "gets narrower" as you go deeper) and that only valid choices like [256,256,32,32] or [128,64,64] are considered.

3. **Hyperparameter Sampling with Optuna:**  
   - We use `trial.suggest_categorical` to sample the hidden dimension for each layer from the appropriate candidate list.
   - The number of layers (between 2 and 6) is also a hyperparameter. For each layer in the chosen architecture, we apply the above logic to build the list of hidden dimensions.
   - This way, every trial in the optimization will have a candidate architecture with a discrete, non-increasing set of hidden dimensions (for example, [256,256,32,32] or [128,64,64]) and will exclude choices like [256,155,17,16] because 155 and 17 are not in the candidate set.

4. **Result Logging and Dashboard Integration:**  
   - All hyperparameters (including the number of layers and each layer's hidden dimension) will be logged.
   - Intermediate validation loss values will be reported during training so that you can track the early-stopping behavior on the Optuna dashboard (via the "Intermediate Values" plot).

This logic guarantees that the search space will consist only of architectures that conform to a trapezoidal (non-increasing) pattern using your predefined candidate values. Once you confirm this approach, we can proceed to implement it in code.

### 具体逻辑

1. Fixed Candidate Set of Hidden Dimensions:
We use candidate_values = [256, 128, 64, 32, 16].

2. Non-Increasing (Trapezoidal) Architecture:

- The first layer’s dimension is chosen from the full candidate set.
- Each subsequent layer is chosen from those values less than or equal to the previous layer’s dimension, supporting valid patterns like [256, 256, 32, 32] or [128, 64, 64].

3. Number of Layers Restriction (2–6):
Controlled via trial.suggest_int("num_layers", 2, 6).

4. Hyperparameter Logging:

- Learning rate, dropout, number of layers, and each hidden_dim_i (via trial.suggest_categorical).
- All parameters automatically appear in the Optuna dashboard.

5. Intermediate Validation Loss Reporting:

trial.report(val_loss, step=epoch) is called each epoch, letting you track early stopping in the “Intermediate Values” plot.

### How This Solves the Dynamic Value Space Error如何解决动态值空间错误
Fixed Distribution:固定分布：
Every hidden_dim_{i} parameter is sampled from the same list [256, 128, 64, 32, 16]. Thus, Optuna never sees a changing candidate list, which avoids the “CategoricalDistribution does not support dynamic value space.” error.每个hidden_dim_{i}参数都从同一个列表[256, 128, 64, 32, 16]中采样。因此，Optuna永远不会看到变化的候选列表，从而避免了“CategoricalDistribution 不支持动态值空间。”错误。

Trapezoidal Constraint:梯形约束：
After sampling each layer’s width, a quick check ensures subsequent layers’ widths do not exceed the previous layer’s. If no suitable options are found, the default is the smallest dimension (16). The net result is a non-increasing (trapezoidal) architecture, and Optuna does not complain because each parameter’s distribution remains constant.在对每层的宽度进行采样后，会进行快速检查，确保后续层的宽度不超过前一层的宽度。如果未找到合适的选项，则默认为最小尺寸（16）。最终结果是非递增（梯形）架构，Optuna 不会抱怨，因为每个参数的分布保持不变。

Pruning & Logging:修剪和伐木：

If needed, you can prune trials that violate the non-increasing rule—though the code above simply narrows the candidate options.如果需要，您可以修剪违反非增加规则的试验 - 尽管上述代码只是缩小了候选选项的范围。
Hyperparameters (learning_rate, dropout, num_layers, hidden_dim_i) are all logged for dashboard inspection.超参数（ learning_rate 、 dropout 、 num_layers 、 hidden_dim_i ）均已记录以供仪表板检查。
trial.report(val_loss, step=epoch) allows you to visualize intermediate values for early-stopping.trial.report(val_loss, step=epoch) 可让您直观地看到提前停止的中间值。
By combining a fixed candidate set for every layer with a post-check that filters invalid configurations, you get your desired trapezoidal GNN structure and avoid the Optuna dynamic-distribution error.通过将每一层的固定候选集与过滤无效配置的后检查相结合，您可以获得所需的梯形 GNN 结构并避免 Optuna 动态分布错误。

In [2]:
import os
import pickle
import pandas as pd
import numpy as np

import networkx as nx

import torch
import torch.nn.functional as F
from torch import nn
from torch.utils.data import Dataset, DataLoader, random_split
from torch_geometric.data import Data, DataLoader as PyGDataLoader
from torch_geometric.utils import from_networkx
from torch_geometric.nn import GINEConv, global_mean_pool, BatchNorm
from torch_geometric.nn.conv.gcn_conv import gcn_norm

from sklearn.metrics import r2_score, mean_squared_error
from sklearn.model_selection import KFold

import warnings
warnings.filterwarnings('ignore')

# =============================================================================
#                             DATASET & PREP
# =============================================================================

class MolDataset(Dataset):
    """
    A custom dataset that:
      - Reads external factors from CSV
      - Loads the corresponding pickle for the molecule's graph
      - Converts it into a PyG Data object
    """
    def __init__(self,
                 raw_dataframe: pd.DataFrame,
                 nx_graph_dict: dict,
                 *,
                 component_col: str,
                 global_state_cols: list[str],
                 label_col: str,
                 transform=None):
        """
        Args:
            raw_dataframe: The input dataframe containing molecule info.
            nx_graph_dict: Dictionary mapping component names to networkx graphs.
            component_col: Column name for the component.
            global_state_cols: List of columns representing external factors.
            label_col: Column name for the regression target.
            transform: Any transform to apply to each PyG Data object.
        """
        self.raw_dataframe = raw_dataframe
        self.nx_graph_dict = nx_graph_dict
        self.component_col = [component_col] if type(component_col) is str else component_col
        self.global_state_cols = global_state_cols
        self.label_col = [label_col] if type(label_col) is str else label_col
        self.transform = transform
        
        required_cols = set(self.global_state_cols + self.label_col + self.component_col)
        for col in required_cols:
            if col not in self.raw_dataframe.columns:
                raise ValueError(f"Missing column in DataFrame: '{col}'")

    def __len__(self):
        return len(self.raw_dataframe)

    def __getitem__(self, idx):
        row = self.raw_dataframe.iloc[idx]
        
        # 1. Load the molecule graph
        component_name = row[self.component_col[0]]  # e.g. "C23"
        pyg_data = self.nx_graph_dict[component_name]

        # 2. Prepare the external factors
        externals = torch.tensor(row[self.global_state_cols].values.astype(float), dtype=torch.float)
        externals = externals.unsqueeze(0)

        # 3. Prepare the label (regression target)
        label = torch.tensor([row[self.label_col][0]], dtype=torch.float)

        # 4. Attach externals & label to the Data object
        pyg_data.externals = externals  # shape [1, external_in_dim]
        pyg_data.y = label  # shape [1]

        if self.transform:
            pyg_data = self.transform(pyg_data)

        return pyg_data


def networkx_to_pyg(nx_graph):
    """
    Convert a networkx graph to a torch_geometric.data.Data object.
    This is a basic template; adjust for your actual node/edge features.
    """
    # Sort nodes to ensure consistent ordering
    node_mapping = {node: i for i, node in enumerate(nx_graph.nodes())}

    x_list = []
    edge_index_list = []
    edge_attr_list = []

    # Node features
    for node in nx_graph.nodes(data=True):
        original_id = node[0]
        attrs = node[1]
        symbol = attrs.get("symbol", "C")
        symbol_id = 0 if symbol == "C" else 1 if symbol == "H" else 2
        x_list.append([symbol_id])

    # Edge features
    for u, v, edge_attrs in nx_graph.edges(data=True):
        u_idx = node_mapping[u]
        v_idx = node_mapping[v]
        edge_index_list.append((u_idx, v_idx))
        bde_pred = edge_attrs.get("bde_pred", 0.0) or 0.0
        bdfe_pred = edge_attrs.get("bdfe_pred", 0.0) or 0.0
        edge_attr_list.append([bde_pred, bdfe_pred])

    x = torch.tensor(x_list, dtype=torch.float)
    edge_index = torch.tensor(edge_index_list, dtype=torch.long).t().contiguous()
    edge_attr = torch.tensor(edge_attr_list, dtype=torch.float)

    data = Data(x=x, edge_index=edge_index, edge_attr=edge_attr)
    return data


# =============================================================================
#                     BASE GNN MODEL (WITH DIM MATCH)
# =============================================================================

class GINE_Regression(nn.Module):
    """
    A GNN for regression using GINEConv layers + edge attributes,
    where all layers have the same hidden_dim (no dimension mismatch).
    """
    def __init__(self,
                 node_in_dim: int,
                 edge_in_dim: int,
                 external_in_dim: int,
                 hidden_dim: int = 128,
                 num_layers: int = 3,
                 dropout: float = 0.1):
        super().__init__()
        
        # Encode edges from edge_in_dim to hidden_dim
        self.edge_encoder = nn.Sequential(
            nn.Linear(edge_in_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim)
        )
        
        # Encode nodes from node_in_dim to hidden_dim
        self.node_encoder = nn.Linear(node_in_dim, hidden_dim)
        
        # Multiple GINEConv layers & corresponding BatchNorm
        self.convs = nn.ModuleList()
        self.bns = nn.ModuleList()
        
        for _ in range(num_layers):
            net = nn.Sequential(
                nn.Linear(hidden_dim, hidden_dim),
                nn.ReLU(),
                nn.Linear(hidden_dim, hidden_dim)
            )
            conv = GINEConv(nn=net)
            self.convs.append(conv)
            self.bns.append(BatchNorm(hidden_dim))

        self.dropout = nn.Dropout(p=dropout)

        # Process external factors
        self.externals_mlp = nn.Sequential(
            nn.Linear(external_in_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(p=dropout),
            nn.Linear(hidden_dim, hidden_dim)
        )

        # Final regression
        self.final_regressor = nn.Sequential(
            nn.Linear(hidden_dim + hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(p=dropout),
            nn.Linear(hidden_dim, 1)
        )

    def forward(self, data):
        x, edge_index, edge_attr, batch = data.x, data.edge_index, data.edge_attr, data.batch

        # Encode
        x = self.node_encoder(x)
        edge_emb = self.edge_encoder(edge_attr)
        
        # Pass through GINEConv layers
        for conv, bn in zip(self.convs, self.bns):
            x = conv(x, edge_index, edge_emb)
            x = bn(x)
            x = F.relu(x)
            x = self.dropout(x)

        # Global pooling
        graph_emb = global_mean_pool(x, batch)

        # Process external factors
        ext_emb = self.externals_mlp(data.externals)

        # Combine & regress
        combined = torch.cat([graph_emb, ext_emb], dim=-1)
        out = self.final_regressor(combined).squeeze(-1)
        return out


# =============================================================================
#                   TRAIN/VALID/EVALUATION UTILS
# =============================================================================

def train_one_epoch(model, loader, optimizer, criterion, device):
    model.train()
    total_loss = 0.0
    count = 0
    for batch_data in loader:
        batch_data = batch_data.to(device)
        optimizer.zero_grad()
        preds = model(batch_data)
        y = batch_data.y.to(device).view(-1)
        loss = criterion(preds, y)
        loss.backward()
        optimizer.step()
        total_loss += loss.item() * batch_data.num_graphs
        count += batch_data.num_graphs
    return total_loss / count if count > 0 else 0.0


def validate(model, loader, criterion, device):
    model.eval()
    total_loss = 0.0
    count = 0
    with torch.no_grad():
        for batch_data in loader:
            batch_data = batch_data.to(device)
            preds = model(batch_data)
            y = batch_data.y.to(device).view(-1)
            loss = criterion(preds, y)
            total_loss += loss.item() * batch_data.num_graphs
            count += batch_data.num_graphs
    return total_loss / count if count > 0 else 0.0


def evaluate_model(model, loader, device):
    """
    Evaluate the model on a dataset loader and compute R² and RMSE.
    """
    model.eval()
    y_true, y_pred = [], []

    with torch.no_grad():
        for batch in loader:
            batch = batch.to(device)
            preds = model(batch)
            y_true.append(batch.y.cpu())
            y_pred.append(preds.cpu())

    y_true = torch.cat(y_true).numpy().squeeze()
    y_pred = torch.cat(y_pred).numpy().squeeze()

    r2 = r2_score(y_true, y_pred)
    rmse = np.sqrt(mean_squared_error(y_true, y_pred))

    print(f"R²: {r2:.4f}")
    print(f"RMSE: {rmse:.4f}")

    return r2, rmse


# =============================================================================
#                             DATA PREPARATION
# =============================================================================

env_file = r"F:\2025 energing\PYTHON\GNN_chemicalENV-main\GNN molecules\graph_pickles\dataset02.xlsx"
data = pd.read_excel(env_file, engine='openpyxl').dropna(subset=['degradation_rate'])
data['seawater'] = data['seawater'].map({'art': 1, 'sea': 0})

folder_path = r"F:\2025 energing\PYTHON\GNN_chemicalENV-main\GNN molecules\graph_pickles\molecules"
graph_pickles = [f for f in os.listdir(folder_path) if f.endswith(".pkl")]

base_dir = r"F:\2025 energing\PYTHON\GNN_chemicalENV-main\GNN molecules\graph_pickles\molecules"
if os.path.exists(base_dir):
    print("Directory exists:", base_dir)
    print("Files in directory:", os.listdir(base_dir))
else:
    print(f"Error: Directory {base_dir} does not exist!")

compounds = data.component.unique()
graphs_dict = {}
for compound, graph_pickle in zip(compounds, graph_pickles):
    with open(os.path.join(base_dir, graph_pickle), 'rb') as f:
        graph = pickle.load(f)
        graphs_dict[compound] = networkx_to_pyg(graph)

dataset = MolDataset(
    raw_dataframe=data,
    nx_graph_dict=graphs_dict,
    component_col="component",
    global_state_cols=["temperature", "concentration", "time", "seawater"],
    label_col="degradation_rate",
    transform=None
)

# =============================================================================
#                      CROSS-VALIDATION (FIXED MODEL)
# =============================================================================
from torch_geometric.data import DataLoader as PyGDataLoader

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
k = 5  # number of folds
from sklearn.model_selection import KFold
kf = KFold(n_splits=k, shuffle=True, random_state=42)

fold_results = []

for fold, (train_idx, val_idx) in enumerate(kf.split(dataset)):
    print(f"\n--- Fold {fold + 1} ---")

    train_subset = torch.utils.data.Subset(dataset, train_idx)
    val_subset = torch.utils.data.Subset(dataset, val_idx)

    train_loader = PyGDataLoader(train_subset, batch_size=16, shuffle=True)
    val_loader = PyGDataLoader(val_subset, batch_size=16, shuffle=False)

    model = GINE_Regression(
        node_in_dim=1,
        edge_in_dim=2,
        external_in_dim=4,
        hidden_dim=16,  # Example hidden_dim
        num_layers=5,
        dropout=0.1
    ).to(device)

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-5)
    criterion = torch.nn.MSELoss()

    num_epochs = 500
    for epoch in range(1, num_epochs + 1):
        train_loss = train_one_epoch(model, train_loader, optimizer, criterion, device)
        val_loss = validate(model, val_loader, criterion, device)

        if epoch % 10 == 0:
            print(f"[Fold {fold + 1} Epoch {epoch}] Train Loss: {train_loss:.4f} | Val Loss: {val_loss:.4f}")

    print(f"Evaluating fold {fold + 1} ...")
    r2, rmse = evaluate_model(model, val_loader, device)
    fold_results.append({"fold": fold + 1, "r2": r2, "rmse": rmse})

r2_scores = [res["r2"] for res in fold_results]
rmse_scores = [res["rmse"] for res in fold_results]

print("\n--- Cross-Validation Summary ---")
for res in fold_results:
    print(f"Fold {res['fold']}: R² = {res['r2']:.4f}, RMSE = {res['rmse']:.4f}")

print(f"\nAverage R²: {np.mean(r2_scores):.4f} ± {np.std(r2_scores):.4f}")
print(f"Average RMSE: {np.mean(rmse_scores):.4f} ± {np.std(rmse_scores):.4f}")


# =============================================================================
#             IMPROVED MODEL WITH TRAPEZOID DIMENSIONS & PROJECTIONS
# =============================================================================
import optuna
import optuna.visualization as vis

class GINE_RegressionTrapezoid(nn.Module):
    """
    A GINEConv-based regression model that uses a list of hidden dimensions
    to build layers with decreasing size (trapezoid architecture),
    ensuring dimension consistency with projection layers between convs.
    """
    def __init__(self,
                 node_in_dim: int,
                 edge_in_dim: int,
                 external_in_dim: int,
                 hidden_dims: list,
                 dropout: float = 0.1):
        super().__init__()

        # For the first layer, encode edges to hidden_dims[0], and encode nodes as well
        self.initial_edge_encoder = nn.Linear(edge_in_dim, hidden_dims[0])
        self.initial_node_encoder = nn.Linear(node_in_dim, hidden_dims[0])

        # We'll build each GINEConv to transform dimension: hidden_dims[i] -> hidden_dims[i].
        # After each conv i, if i < len(hidden_dims)-1, we project to hidden_dims[i+1].
        self.convs = nn.ModuleList()
        self.bns = nn.ModuleList()
        self.projections = nn.ModuleList()  # for node features
        self.edge_projections = nn.ModuleList()  # for edge features

        for i in range(len(hidden_dims)):
            net = nn.Sequential(
                nn.Linear(hidden_dims[i], hidden_dims[i]),
                nn.ReLU(),
                nn.Linear(hidden_dims[i], hidden_dims[i])
            )
            conv = GINEConv(nn=net)
            self.convs.append(conv)
            self.bns.append(BatchNorm(hidden_dims[i]))

            if i < len(hidden_dims) - 1:
                self.projections.append(nn.Linear(hidden_dims[i], hidden_dims[i+1]))
                self.edge_projections.append(nn.Linear(hidden_dims[i], hidden_dims[i+1]))
            else:
                self.projections.append(None)
                self.edge_projections.append(None)

        self.dropout_layer = nn.Dropout(p=dropout)

        final_dim = hidden_dims[-1]
        self.externals_mlp = nn.Sequential(
            nn.Linear(external_in_dim, final_dim),
            nn.ReLU(),
            nn.Dropout(p=dropout),
            nn.Linear(final_dim, final_dim)
        )

        self.final_regressor = nn.Sequential(
            nn.Linear(final_dim + final_dim, final_dim),
            nn.ReLU(),
            nn.Dropout(p=dropout),
            nn.Linear(final_dim, 1)
        )

    def forward(self, data):
        x, edge_index, edge_attr, batch = data.x, data.edge_index, data.edge_attr, data.batch

        x = self.initial_node_encoder(x)
        edge_emb = self.initial_edge_encoder(edge_attr)

        for i, (conv, bn) in enumerate(zip(self.convs, self.bns)):
            x = conv(x, edge_index, edge_emb)
            x = bn(x)
            x = F.relu(x)
            x = self.dropout_layer(x)

            if i < len(self.projections) - 1 and self.projections[i] is not None:
                x = self.projections[i](x)
                edge_emb = self.edge_projections[i](edge_emb)

        graph_emb = global_mean_pool(x, batch)
        ext_emb = self.externals_mlp(data.externals)
        combined = torch.cat([graph_emb, ext_emb], dim=-1)
        out = self.final_regressor(combined).squeeze(-1)
        return out


def objective(trial):
    """
    Objective function for Optuna.
    We do k-fold cross validation using a new GNN model with a non-increasing
    (trapezoidal) architecture. The hyperparameters include learning rate, dropout,
    number of layers, and for each layer, a hidden dimension chosen from a fixed
    candidate set so that each subsequent layer's dimension is <= the previous layer's.
    We return the average RMSE (lower = better). R² is stored in user_attrs.
    Intermediate validation losses are reported for early stopping & dashboard visualization.
    """
    # Hyperparameter search space
    lr = trial.suggest_categorical("learning_rate", [1e-3, 3e-3, 1e-4, 3e-4])
    dropout = trial.suggest_categorical("dropout", [0.1, 0.5])
    num_layers = trial.suggest_int("num_layers", 2, 6)  # 层数限制

    # 固定候选集：所有层均使用相同的候选集
    candidate_values = [200, 128, 64, 32, 16]

    # 固定定义6个隐藏层参数
    hd0 = trial.suggest_categorical("NEW_hidden_dim_0", candidate_values)
    hd1 = trial.suggest_categorical("NEW_hidden_dim_1", candidate_values)
    hd2 = trial.suggest_categorical("NEW_hidden_dim_2", candidate_values)
    hd3 = trial.suggest_categorical("NEW_hidden_dim_3", candidate_values)
    hd4 = trial.suggest_categorical("NEW_hidden_dim_4", candidate_values)
    hd5 = trial.suggest_categorical("NEW_hidden_dim_5", candidate_values)
    hidden_dims_all = [hd0, hd1, hd2, hd3, hd4, hd5]

    # 只使用前 num_layers 个隐藏层
    hidden_dims = hidden_dims_all[:num_layers]

    # 检查是否满足非递增（梯形）结构
    for i in range(1, num_layers):
        if hidden_dims[i] > hidden_dims[i-1]:
            raise optuna.TrialPruned()

    # Early stopping 参数
    max_epochs = 200
    patience = 10
    min_delta = 1e-5

    kf_local = KFold(n_splits=5, shuffle=True, random_state=42)
    rmse_scores = []
    r2_scores = []

    for fold_idx, (train_idx, val_idx) in enumerate(kf_local.split(dataset)):
        train_subset = torch.utils.data.Subset(dataset, train_idx)
        val_subset = torch.utils.data.Subset(dataset, val_idx)

        train_loader = PyGDataLoader(train_subset, batch_size=16, shuffle=True)
        val_loader = PyGDataLoader(val_subset, batch_size=16, shuffle=False)

        # 构建使用选定梯形隐藏维度的 GNN 模型
        model = GINE_RegressionTrapezoid(
            node_in_dim=1,
            edge_in_dim=2,
            external_in_dim=4,
            hidden_dims=hidden_dims,
            dropout=dropout
        ).to(device)

        optimizer = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=1e-5)
        criterion = nn.MSELoss()

        best_val_loss = float("inf")
        patience_counter = 0

        for epoch in range(1, max_epochs + 1):
            train_loss = train_one_epoch(model, train_loader, optimizer, criterion, device)
            val_loss = validate(model, val_loader, criterion, device)

            trial.report(val_loss, step=epoch)

            if trial.should_prune():
                raise optuna.TrialPruned()

            if val_loss < (best_val_loss - min_delta):
                best_val_loss = val_loss
                patience_counter = 0
            else:
                patience_counter += 1
                if patience_counter >= patience:
                    break

        r2_fold, rmse_fold = evaluate_model(model, val_loader, device)
        rmse_scores.append(rmse_fold)
        r2_scores.append(r2_fold)

    avg_rmse = float(np.mean(rmse_scores))
    avg_r2 = float(np.mean(r2_scores))

    trial.set_user_attr("avg_r2", avg_r2)
    return avg_rmse


# =============================================================================
#                           OPTUNA STUDY & DASHBOARD
# =============================================================================
if __name__ == "__main__":
    # 使用 load_if_exists=False 以确保使用全新 study，避免旧数据冲突
    study = optuna.create_study(
        storage="sqlite:///gnn_mix_op09.sqlite3",
        study_name="GNN-mixed model different layers02",
        direction="minimize",
        load_if_exists=False
    )

    study.optimize(objective, n_trials=100, show_progress_bar=True)

    print("\n================= Optuna Study Results =================")
    best_trial = study.best_trial
    print(f"Best Trial Value (RMSE): {best_trial.value}")
    print("Best Hyperparameters:")
    for key, val in best_trial.params.items():
        print(f"  {key}: {val}")
    print(f"User Attrs (R², etc.): {best_trial.user_attrs}")

    try:
        fig1 = vis.plot_optimization_history(study)
        fig1.show()
    except Exception as e:
        print(f"Could not generate optimization history plot: {e}")

    try:
        fig2 = vis.plot_param_importances(study)
        fig2.show()
    except Exception as e:
        print(f"Could not generate hyperparameter importance plot: {e}")

    try:
        fig3 = vis.plot_intermediate_values(study)
        fig3.show()
    except Exception as e:
        print(f"Could not generate intermediate values plot: {e}")

    print("\n================= End of Optuna Tuning =================")


Directory exists: F:\2025 energing\PYTHON\GNN_chemicalENV-main\GNN molecules\graph_pickles\molecules
Files in directory: ['gpickle_graph_0.pkl', 'gpickle_graph_1.pkl', 'gpickle_graph_10.pkl', 'gpickle_graph_11.pkl', 'gpickle_graph_12.pkl', 'gpickle_graph_13.pkl', 'gpickle_graph_14.pkl', 'gpickle_graph_15.pkl', 'gpickle_graph_16.pkl', 'gpickle_graph_17.pkl', 'gpickle_graph_18.pkl', 'gpickle_graph_19.pkl', 'gpickle_graph_2.pkl', 'gpickle_graph_3.pkl', 'gpickle_graph_4.pkl', 'gpickle_graph_5.pkl', 'gpickle_graph_6.pkl', 'gpickle_graph_7.pkl', 'gpickle_graph_8.pkl', 'gpickle_graph_9.pkl']

--- Fold 1 ---
[Fold 1 Epoch 10] Train Loss: 0.2102 | Val Loss: 0.0691
[Fold 1 Epoch 20] Train Loss: 0.1101 | Val Loss: 0.0513
[Fold 1 Epoch 30] Train Loss: 0.0830 | Val Loss: 0.0509
[Fold 1 Epoch 40] Train Loss: 0.0788 | Val Loss: 0.0476
[Fold 1 Epoch 50] Train Loss: 0.0925 | Val Loss: 0.0548
[Fold 1 Epoch 60] Train Loss: 0.0706 | Val Loss: 0.0571
[Fold 1 Epoch 70] Train Loss: 0.0722 | Val Loss: 0.0557


[I 2025-04-02 17:19:18,971] A new study created in RDB with name: GNN-mixed model different layers02


  0%|          | 0/100 [00:00<?, ?it/s]

[I 2025-04-02 17:19:19,136] Trial 0 pruned. 
[I 2025-04-02 17:19:19,299] Trial 1 pruned. 
[I 2025-04-02 17:19:19,466] Trial 2 pruned. 
[I 2025-04-02 17:19:19,625] Trial 3 pruned. 
[I 2025-04-02 17:19:19,786] Trial 4 pruned. 
[I 2025-04-02 17:19:19,955] Trial 5 pruned. 
R²: -0.4289
RMSE: 0.2833
R²: -0.0930
RMSE: 0.3378
R²: -0.5349
RMSE: 0.3273
R²: -0.9162
RMSE: 0.4148
R²: 0.0558
RMSE: 0.2558
[I 2025-04-02 17:22:53,338] Trial 6 finished with value: 0.3238190534668016 and parameters: {'learning_rate': 0.0001, 'dropout': 0.1, 'num_layers': 2, 'NEW_hidden_dim_0': 32, 'NEW_hidden_dim_1': 16, 'NEW_hidden_dim_2': 256, 'NEW_hidden_dim_3': 16, 'NEW_hidden_dim_4': 32, 'NEW_hidden_dim_5': 256}. Best is trial 6 with value: 0.3238190534668016.
[I 2025-04-02 17:22:53,497] Trial 7 pruned. 
[I 2025-04-02 17:22:53,656] Trial 8 pruned. 
R²: -0.7431
RMSE: 0.3130
R²: -2.6059
RMSE: 0.6136
R²: -34.0888
RMSE: 1.5651
R²: -18.1290
RMSE: 1.3104
R²: -2.9611
RMSE: 0.5239
[I 2025-04-02 17:33:17,731] Trial 9 finishe

KeyboardInterrupt: 