# # OpenAD Code-Generation Benchmark Notebook
# This notebook benchmarks the OpenAD code-generation pipeline across multiple libraries (PyOD, PyGOD, Darts, sktime).
# It measures success rate, total runtime, InfoMiner durations, and LLM token usage, then exports results.json and summary tables.
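#
# As a rough illustration of the export step described above, the sketch below
# aggregates per-run records into a success rate, total runtime, and token count,
# then writes them to a results.json file. The record fields (`success`,
# `runtime_s`, `tokens`) and the `summarize` helper are hypothetical names chosen
# for this example, and the two records are placeholder values, not benchmark results.

```python
import json
import os
import tempfile


def summarize(records):
    """Aggregate per-run records (hypothetical schema) into headline metrics."""
    total = len(records)
    successes = sum(1 for r in records if r["success"])
    return {
        "runs": total,
        "success_rate": successes / total if total else 0.0,
        "total_runtime_s": sum(r["runtime_s"] for r in records),
        "total_tokens": sum(r["tokens"] for r in records),
    }


# Placeholder records for illustration only; these are not measured values.
records = [
    {"library": "pyod", "algorithm": "LOF", "success": True, "runtime_s": 1.5, "tokens": 100},
    {"library": "pyod", "algorithm": "ABOD", "success": False, "runtime_s": 2.5, "tokens": 300},
]

summary = summarize(records)

# Write to a temporary directory so the example does not overwrite real results.
out_path = os.path.join(tempfile.mkdtemp(), "results.json")
with open(out_path, "w") as fh:
    json.dump({"records": records, "summary": summary}, fh, indent=2)
```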

In [None]:
%pip install tiktoken faiss-cpu pandas matplotlib pygod
import os
import sys
# ensure project root is on path
sys.path.append(os.getcwd())
sys.path.append(os.path.dirname(os.getcwd()))


In [88]:
# ## 1. Setup Imports and Instrumentation
# Install required packages (if needed) and import modules



import time
import json
import pandas as pd
import matplotlib.pyplot as plt

# Import the instrumentation wrappers
import langchain_openai
from benchmark.instrumentation import InstrumentedChatOpenAI, InstrumentedInfoMiner, InstrumentedCoder

# Monkey-patch ChatOpenAI with the instrumented version before importing the
# pipeline, so modules that bind ChatOpenAI at import time pick up the patch
langchain_openai.ChatOpenAI = InstrumentedChatOpenAI

from main import compiled_full_graph, FullToolState
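#
# Replacing a module attribute only affects code that looks the name up after the
# replacement. The sketch below shows this with a throwaway module; `fake_lib`,
# `Client`, and `InstrumentedClient` are hypothetical stand-ins for
# `langchain_openai`, `ChatOpenAI`, and `InstrumentedChatOpenAI`, so it says nothing
# about how the real pipeline imports its model class.

```python
import sys
import types

# Hypothetical stand-in for a third-party module.
fake_lib = types.ModuleType("fake_lib")


class Client:
    def name(self):
        return "original"


fake_lib.Client = Client
sys.modules["fake_lib"] = fake_lib

# A consumer that binds the class by name BEFORE the patch is applied.
from fake_lib import Client as EarlyClient


class InstrumentedClient(Client):
    def name(self):
        return "instrumented"


# Monkey-patch the module attribute.
fake_lib.Client = InstrumentedClient

# A consumer that looks the class up AFTER the patch.
from fake_lib import Client as LateClient

early = EarlyClient().name()
late = LateClient().name()
print(early, late)
```

# If the pipeline module binds the class at import time, the patch therefore has
# to be applied before that module is imported for the first time.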


In [51]:
%pip install torch_geometric

from pygod.utils import load_data
import os
import torch

os.makedirs('pygod_data', exist_ok=True)
for name in ['weibo']:
    path = f'pygod_data/{name}.pt'
    if not os.path.exists(path):
        print(f"Downloading '{name}' dataset...")
        data = load_data(name)
        torch.save(data, path)


In [52]:

# ## 3. Define Experiment Configurations
# Provide dataset paths for each library
exp_configs = {
    'pyod': {
        'algorithm': ['ABOD','LOF','IForest'],
        'dataset_train': './data/glass_train.mat',
        'dataset_test': './data/glass_test.mat',
        'parameters': {'contamination': 0.1}
    },
    'pygod': {
        'algorithm': ['OCGNN','GCN','SCAN'],
        'dataset_train': './pygod_data/graph1.pt',  # clone https://github.com/pygod-team/data
        'dataset_test': './pygod_data/graph2.pt',
        'parameters': {}
    },
    # 'darts': {
    #     'algorithm': ['DifferenceScorer','NormScorer'],
    #     'dataset_train': './data/yahoo_train.csv',
    #     'dataset_test': './data/yahoo_test.csv',
    #     'parameters': {}
    # },
    # 'sktime': {
    #     'algorithm': ['KMeansScorer'],
    #     'dataset_train': './data/yahoo_train.csv',
    #     'dataset_test': './data/yahoo_test.csv',
    #     'parameters': {}
    # }
}
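
# One way to consume a configuration of this shape is to flatten it into one entry
# per (library, algorithm) pair, so each benchmark run receives a single algorithm
# together with its datasets and parameters. `expand_runs` is a hypothetical helper
# written for this sketch, and `demo_configs` repeats a subset of the entries above
# so the example runs on its own.

```python
def expand_runs(configs):
    """Flatten {library: config} into one dict per (library, algorithm) run."""
    runs = []
    for library, cfg in configs.items():
        for algo in cfg["algorithm"]:
            runs.append({
                "library": library,
                "algorithm": algo,
                "dataset_train": cfg["dataset_train"],
                "dataset_test": cfg["dataset_test"],
                # Copy so that one run cannot mutate another run's parameters.
                "parameters": dict(cfg["parameters"]),
            })
    return runs


demo_configs = {
    "pyod": {
        "algorithm": ["ABOD", "LOF", "IForest"],
        "dataset_train": "./data/glass_train.mat",
        "dataset_test": "./data/glass_test.mat",
        "parameters": {"contamination": 0.1},
    },
    "pygod": {
        "algorithm": ["OCGNN", "GCN", "SCAN"],
        "dataset_train": "./pygod_data/graph1.pt",
        "dataset_test": "./pygod_data/graph2.pt",
        "parameters": {},
    },
}

runs = expand_runs(demo_configs)
for run in runs:
    print(run["library"], run["algorithm"])
```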

In [147]:
# ## 2. Helper Function for PyOD InfoMiner

def run_pyod_infominer(algorithms, train_path, test_path, params):
    """Time one InfoMiner documentation query per PyOD algorithm.

    Calls InfoMiner.query_docs directly instead of running the full pipeline.
    train_path, test_path, and params are accepted but not used here.
    """
    infom = InstrumentedInfoMiner()
    results = []
    for algo in algorithms:
        # Run a single documentation query; the result itself is discarded
        infom.query_docs(algo, None, 'pyod')
        results.append({
            'algorithm': algo,
            'infominer_time': infom.last_query_duration
        })
    return results
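
# The helper above relies on the miner exposing `last_query_duration` after each
# call. A minimal sketch of how such a timing wrapper could work is shown below.
# `TimedMiner`, its in-memory cache, and `slow_lookup` are hypothetical and are not
# the project's `InstrumentedInfoMiner`; only the three-argument `query_docs` call
# shape and the `last_query_duration` attribute are taken from the code above.

```python
import time


class TimedMiner:
    """Hypothetical miner that records how long the last query took."""

    def __init__(self, lookup):
        self._lookup = lookup          # callable(algorithm, package_name) -> str
        self._cache = {}               # simple in-memory cache (an assumption)
        self.last_query_duration = None

    def query_docs(self, algorithm, vectorstore, package_name):
        # `vectorstore` mirrors the second positional argument in the call
        # above; it is ignored in this sketch.
        start = time.perf_counter()
        try:
            key = (package_name, algorithm)
            if key not in self._cache:
                self._cache[key] = self._lookup(algorithm, package_name)
            return self._cache[key]
        finally:
            # Record the duration even if the lookup raises.
            self.last_query_duration = time.perf_counter() - start


def slow_lookup(algorithm, package_name):
    time.sleep(0.05)  # simulate an expensive documentation fetch
    return f"docs for {package_name}.{algorithm}"


miner = TimedMiner(slow_lookup)
first_doc = miner.query_docs("LOF", None, "pyod")
cold = miner.last_query_duration
second_doc = miner.query_docs("LOF", None, "pyod")
warm = miner.last_query_duration
print(f"cold={cold:.6f}s warm={warm:.6f}s")
```

# With a cache in place, a repeated query measures lookup overhead rather than
# retrieval work, which is worth keeping in mind when reading cached timings.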

# %% [markdown]
# ## 3. Run Benchmark for Selected PyOD Algorithms
algos = [
    'MO-GAAL','SO-GAAL','AutoEncoder','VAE','AnoGAN',
    'DeepSVDD','ALAD','AE1SVM','DevNet','LUNAR'
]
train_file = './data/glass_train.mat'
test_file  = './data/glass_test.mat'
params = {'contamination': 0.1}

metrics = run_pyod_infominer(algos, train_file, test_file, params)

# Convert to DataFrame
df = pd.DataFrame(metrics)
df.to_json('pyod_infominer_times.json', orient='records', indent=2)

display(df)

# %% [markdown]
# ## 4. Summary of InfoMiner Time
summary = df['infominer_time'].agg(['mean','std'])
display(summary)


[Cache Hit] Using recent cache for MO-GAAL
The `MO_GAAL` class in PyOD is designed for Multi-Objective Generative Adversarial Active Learning, which generates potential outliers to help classifiers effectively distinguish between normal data and outliers. To prevent mode collapse, it employs multiple generators with different objectives.

**Initialization Function (`__init__`):**

The `__init__` method initializes the `MO_GAAL` class with the following parameters:

- **contamination**: float in (0., 0.5), optional (default=0.1)
  - The proportion of outliers in the dataset. Used to define the threshold on the decision function.

- **k**: int, optional (default=10)
  - The number of sub-generators.

- **stop_epochs**: int, optional (default=20)
  - The number of training epochs. The total number of epochs equals three times this value.

- **lr_d**: float, optional (default=0.01)
  - Learning rate of the discriminator.

- **lr_g**: float, optional (default=0.0001)
  - Learning rate of th

     algorithm  infominer_time
0      MO-GAAL        0.004609
1      SO-GAAL        0.000713
2  AutoEncoder        0.000570
3          VAE        0.000506
4       AnoGAN        0.000483
5     DeepSVDD        0.000439
6         ALAD        0.000404
7       AE1SVM        0.000387
8       DevNet        0.000334
9        LUNAR        0.000349


mean    0.000879
std     0.001315
Name: infominer_time, dtype: float64
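
# The standard deviation above exceeds the mean because the first query (MO-GAAL)
# took roughly ten times longer than the rest. The sketch below recomputes the
# summary from the values in the table and adds the median and the mean without the
# slowest entry as more robust figures. Treating the first query as carrying
# one-time overhead is an assumption; the output above only shows a cache-hit
# message for it.

```python
import statistics

# InfoMiner query durations copied from the table above.
durations = {
    "MO-GAAL": 0.004609,
    "SO-GAAL": 0.000713,
    "AutoEncoder": 0.000570,
    "VAE": 0.000506,
    "AnoGAN": 0.000483,
    "DeepSVDD": 0.000439,
    "ALAD": 0.000404,
    "AE1SVM": 0.000387,
    "DevNet": 0.000334,
    "LUNAR": 0.000349,
}

values = list(durations.values())
mean_all = statistics.mean(values)
std_all = statistics.stdev(values)       # sample standard deviation (n - 1)
median_all = statistics.median(values)

# Drop the single slowest query and recompute the mean.
slowest = max(durations, key=durations.get)
rest = [v for k, v in durations.items() if k != slowest]
mean_rest = statistics.mean(rest)

print(f"mean={mean_all:.6f} std={std_all:.6f} median={median_all:.6f}")
print(f"slowest={slowest} mean_without_slowest={mean_rest:.6f}")
```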