# Agent Comparison Analysis

This notebook compares the agent types in the LNES system on PnL, win rate, directional accuracy, decision patterns, and risk-adjusted metrics.

## Table of Contents

1. [Setup](#setup)
2. [Agent Overview](#overview)
3. [Performance Metrics](#metrics)
4. [Statistical Significance](#statistics)
5. [Decision Pattern Analysis](#patterns)
6. [Risk-Adjusted Performance](#risk)
7. [Key Takeaways](#takeaways)

**Goal**: Identify the best-performing agents and check whether the differences between them are statistically meaningful.

<a id='setup'></a>
## 1. Setup

In [None]:
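# The wildcard import is expected to provide metrics, quick_experiment,
# and the print/plot helpers used below.
from notebook_utils import *
# Explicit imports for the names this notebook uses directly.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats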

np.random.seed(42)
print_section("Agent Comparison Analysis")

In [None]:
# Run experiment with all agents
results = quick_experiment('small_dataset', verbose=True)

<a id='overview'></a>
## 2. Agent Overview

In [None]:
# Agent descriptions
agent_descriptions = {
    'Random': 'Randomly chooses buy/sell/hold with equal probability',
    'Momentum': 'Buys when price increases, sells when price decreases',
    'Contrarian': 'Sells when price increases, buys when price decreases',
    'NewsReactive': 'Makes decisions based on news cluster sentiment',
    'FinBERT': 'Uses pre-trained FinBERT for sentiment analysis',
    'Groq': 'Uses LLM (Groq API) for decision-making'
}

print("Agent Strategies:\n")
for agent, desc in agent_descriptions.items():
    if agent in results['action_log']:
        print(f"• {agent:15s}: {desc}")

<a id='metrics'></a>
## 3. Performance Metrics

In [None]:
# Calculate comprehensive metrics
action_log = results['action_log']
ref_prices = results['ref_prices']

# Compute all metrics
pnl = metrics.agent_pnl(action_log, ref_prices)
win_rates = metrics.win_rate(action_log, ref_prices)
dir_acc = metrics.per_agent_directional_accuracy(action_log, ref_prices)

# Create comparison table
agent_names = list(action_log.keys())
comparison_data = []

for agent in agent_names:
    comparison_data.append({
        'Agent': agent,
        'PnL': pnl[agent],
        'Win Rate': win_rates[agent],
        'Dir. Accuracy': dir_acc[agent],
    })

comparison_df = pd.DataFrame(comparison_data)
comparison_df = comparison_df.sort_values('PnL', ascending=False)

print_subsection("Agent Performance Summary")
display(comparison_df.style.format({
    'PnL': '{:.2f}',
    'Win Rate': '{:.2%}',
    'Dir. Accuracy': '{:.2%}'
}))

In [None]:
# Visual comparison
fig = plot_agent_comparison(action_log, ref_prices)
plt.show()

<a id='statistics'></a>
## 4. Statistical Significance

In [None]:
# Perform pairwise t-tests on directional accuracy
print_subsection("Pairwise Comparisons (Directional Accuracy)")

from itertools import combinations

# Calculate per-period correctness for each agent
correctness = {}
for agent in agent_names:
    agent_correct = []
    actions = action_log[agent]
    
    for t in range(len(ref_prices) - 1):
        actual_direction = np.sign(ref_prices[t+1] - ref_prices[t])
        
        if actions[t] == 'buy':
            predicted_direction = 1
        elif actions[t] == 'sell':
            predicted_direction = -1
        else:
            # 'hold' is scored as correct only when the price is unchanged
            predicted_direction = 0
        
        agent_correct.append(1 if predicted_direction == actual_direction else 0)
    
    correctness[agent] = agent_correct

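# Pairwise paired t-tests: each unordered pair is tested once and mirrored.
# Note: ttest_rel on 0/1 outcomes is only an approximation, and it returns
# NaN when two agents have identical correctness sequences.
p_values = {}
for agent1, agent2 in combinations(agent_names, 2):
    _, p_value = stats.ttest_rel(correctness[agent1], correctness[agent2])
    p_values[(agent1, agent2)] = p_values[(agent2, agent1)] = p_value

print("\nPairwise t-tests (p-values):")
print("\n{:<15s}".format(""), end="")
for agent in agent_names:
    print(f"{agent:<15s}", end="")
print()

for agent1 in agent_names:
    print(f"{agent1:<15s}", end="")
    for agent2 in agent_names:
        cell = "---" if agent1 == agent2 else f"{p_values[(agent1, agent2)]:.3f}"
        print(f"{cell:<15s}", end="")
    print()

n_pairs = len(agent_names) * (len(agent_names) - 1) // 2
print("\nInterpretation: p < 0.05 suggests a significant difference for a single pair.")
print(f"With {n_pairs} pairs tested, a Bonferroni-corrected threshold is {0.05 / max(n_pairs, 1):.4f}.")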
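
The correctness lists built above are paired 0/1 outcomes. For data of that shape, McNemar's test is a common alternative to the paired t-test, because it looks only at the periods where the two agents disagree. The sketch below is a minimal exact version using only the standard library. The function names `mcnemar_exact` and `bonferroni_alpha` and the toy data are illustrative assumptions, not part of `notebook_utils`.

```python
from math import comb


def mcnemar_exact(correct_a, correct_b):
    """Two-sided exact McNemar p-value for paired 0/1 outcomes (hypothetical helper)."""
    if len(correct_a) != len(correct_b):
        raise ValueError("paired outcome lists must have the same length")
    # Count discordant periods: exactly one of the two agents was correct.
    only_a = sum(1 for a, b in zip(correct_a, correct_b) if a and not b)
    only_b = sum(1 for a, b in zip(correct_a, correct_b) if b and not a)
    n = only_a + only_b
    if n == 0:
        return 1.0  # the agents never disagree, so there is no evidence of a difference
    # Under H0 each discordant period favours either agent with probability 0.5.
    k = min(only_a, only_b)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def bonferroni_alpha(n_agents, alpha=0.05):
    """Per-comparison threshold when every pair of agents is tested."""
    n_pairs = n_agents * (n_agents - 1) // 2
    return alpha / max(n_pairs, 1)


# Toy data (made up for illustration): agent A is correct in three periods where B is not.
demo_a = [1, 1, 1, 0, 1, 1, 0, 1]
demo_b = [0, 1, 0, 0, 1, 0, 0, 1]
demo_p = mcnemar_exact(demo_a, demo_b)
print(f"McNemar p-value: {demo_p:.3f}")
print(f"Bonferroni threshold for 6 agents: {bonferroni_alpha(6):.4f}")
```

In the notebook, `mcnemar_exact(correctness[agent1], correctness[agent2])` could stand in for the `ttest_rel` call, with the resulting p-values compared against the Bonferroni threshold.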

<a id='patterns'></a>
## 5. Decision Pattern Analysis

In [None]:
# Decision correlation matrix
print_subsection("Agent Decision Correlation Matrix")

decision_corr = metrics.decision_correlation_matrix(action_log)

fig, ax = plt.subplots(figsize=(10, 8))
im = ax.imshow(decision_corr, cmap='RdBu_r', vmin=-1, vmax=1)

ax.set_xticks(range(len(agent_names)))
ax.set_yticks(range(len(agent_names)))
ax.set_xticklabels(agent_names, rotation=45, ha='right')
ax.set_yticklabels(agent_names)

# Add correlation values
for i in range(len(agent_names)):
    for j in range(len(agent_names)):
        ax.text(j, i, f"{decision_corr[i, j]:.2f}",
                ha="center", va="center", color="black", fontsize=10)

ax.set_title('Agent Decision Correlation Matrix')
plt.colorbar(im, ax=ax, label='Correlation')
plt.tight_layout()
plt.show()

print("\nInterpretation:")
print("  • 1.0: Perfect agreement")
print("  • 0.0: No correlation")
print("  • -1.0: Perfect disagreement")
print("\nLow correlation indicates diverse strategies (desirable for ensemble)")
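
How `metrics.decision_correlation_matrix` encodes actions is not shown in this notebook. One plausible approach, sketched here purely as an assumption, maps buy/hold/sell to +1/0/-1 and takes the Pearson correlation between agents. The names `ACTION_CODES`, `encode_actions`, and `decision_correlation` are hypothetical.

```python
import numpy as np

# Assumed numeric encoding; the real metrics module may use a different one.
ACTION_CODES = {"buy": 1, "hold": 0, "sell": -1}


def encode_actions(actions):
    """Turn a list of action strings into a float vector."""
    return np.array([ACTION_CODES[a] for a in actions], dtype=float)


def decision_correlation(action_log):
    """Return agent names and the Pearson correlation matrix of their encoded actions."""
    names = list(action_log)
    encoded = np.vstack([encode_actions(action_log[name]) for name in names])
    # np.corrcoef treats each row as one variable. An agent that never changes
    # its action has zero variance, so its row and column come out as NaN.
    return names, np.corrcoef(encoded)


# Toy log (made up): B copies A, C always does the opposite of A.
toy_log = {
    "A": ["buy", "sell", "buy", "hold"],
    "B": ["buy", "sell", "buy", "hold"],
    "C": ["sell", "buy", "sell", "hold"],
}
toy_names, toy_corr = decision_correlation(toy_log)
print(toy_names)
print(np.round(toy_corr, 2))
```

Under this encoding a hold sits halfway between buy and sell, which is a modelling choice: two agents that both hold often will look correlated even if their buy and sell timing differs.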

In [None]:
# Action frequency analysis
print_subsection("Action Frequency Analysis")

action_counts = {}
for agent in agent_names:
    actions = action_log[agent]
    action_counts[agent] = {
        'buy': actions.count('buy'),
        'sell': actions.count('sell'),
        'hold': actions.count('hold')
    }

action_df = pd.DataFrame(action_counts).T
action_df['total'] = action_df.sum(axis=1)
action_df['buy%'] = (action_df['buy'] / action_df['total'] * 100).round(1)
action_df['sell%'] = (action_df['sell'] / action_df['total'] * 100).round(1)
action_df['hold%'] = (action_df['hold'] / action_df['total'] * 100).round(1)

display(action_df[['buy', 'sell', 'hold', 'buy%', 'sell%', 'hold%']])

print("\nObservations:")
print("  • Balanced agents: ~33% for each action")
print("  • Aggressive traders: High buy/sell, low hold")
print("  • Conservative traders: High hold percentage")

<a id='risk'></a>
## 6. Risk-Adjusted Performance

In [None]:
# Calculate risk metrics per agent
print_subsection("Risk-Adjusted Performance")

try:
    profit_factors = metrics.profit_factor(action_log, ref_prices)
    expectancy = metrics.trade_expectancy(action_log, ref_prices)
    
    risk_df = pd.DataFrame({
        'Agent': agent_names,
        'PnL': [pnl[a] for a in agent_names],
        'Win Rate': [win_rates[a] for a in agent_names],
        'Profit Factor': [profit_factors[a] for a in agent_names],
        'Expectancy': [expectancy[a] for a in agent_names]
    })
    
    risk_df = risk_df.sort_values('Profit Factor', ascending=False)
    
    display(risk_df.style.format({
        'PnL': '{:.2f}',
        'Win Rate': '{:.2%}',
        'Profit Factor': '{:.2f}',
        'Expectancy': '{:.2f}'
    }))
    
    print("\nMetric Definitions:")
    print("  • Profit Factor: Gross profits / Gross losses (>1 is profitable)")
    print("  • Expectancy: Average profit per trade")
    
except Exception as e:
    print(f"Could not compute risk metrics: {e}")
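
To make the two metric definitions above concrete, the sketch below computes them from a plain list of per-trade profits. It is not the implementation in `metrics`. The function names and the handling of edge cases (no losing trades, no trades at all) are assumptions chosen for illustration.

```python
def profit_factor_from_trades(trade_pnls):
    """Gross profits divided by gross losses for a list of per-trade PnL values."""
    gross_profit = sum(p for p in trade_pnls if p > 0)
    gross_loss = -sum(p for p in trade_pnls if p < 0)
    if gross_loss == 0:
        # Assumed convention: infinite if there are only winners, NaN if nothing was won or lost.
        return float("inf") if gross_profit > 0 else float("nan")
    return gross_profit / gross_loss


def expectancy_from_trades(trade_pnls):
    """Average profit per trade; NaN when there are no trades."""
    if not trade_pnls:
        return float("nan")
    return sum(trade_pnls) / len(trade_pnls)


# Toy trades (made up): gross profit 5.0, gross loss 2.0, net 3.0 over 4 trades.
toy_trades = [2.0, -1.0, 3.0, -1.0]
print(profit_factor_from_trades(toy_trades))  # 5.0 / 2.0
print(expectancy_from_trades(toy_trades))     # 3.0 / 4
```

A profit factor above 1 and a positive expectancy both indicate that wins outweigh losses in aggregate, but neither accounts for the variability of returns.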

<a id='takeaways'></a>
## 7. Key Takeaways

### Performance Ranking

Based on the analysis:
1. **Best Performer**: Agent with highest PnL and win rate
2. **Most Consistent**: Agent with lowest volatility in decisions
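3. **Risk-Adjusted Winner**: Agent with the highest profit factor and expectancy (Section 6); Sharpe or Sortino ratios would require per-period returns, which this notebook does not compute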

### Strategy Insights

- **Momentum vs Contrarian**: Which strategy works better in this market?
- **News-Based Agents**: Do they outperform pure technical strategies?
- **Random Baseline**: How much better are intelligent strategies?

### Decision Diversity

- Low correlation between agents suggests diverse strategies
- Ensemble of diverse agents may provide better overall performance
- Herding behavior (high correlation) could increase systemic risk
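
As a starting point for the ensemble idea, a simple majority vote over the agents' actions at each time step can be sketched as follows. This is one possible combination rule among many. The function names and the choice to fall back to `hold` on ties are assumptions.

```python
from collections import Counter


def majority_vote(actions_at_t, tie_action="hold"):
    """Return the most common action, or tie_action when the top counts are equal."""
    counts = Counter(actions_at_t).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return tie_action
    return counts[0][0]


def ensemble_actions(action_log):
    """Combine per-agent action lists into one list by voting at each time step."""
    # zip stops at the shortest list, so agents with extra steps are truncated.
    return [majority_vote(step) for step in zip(*action_log.values())]


# Toy log (made up): clear majorities at steps 0 and 1, a three-way tie at step 2.
toy_votes = {
    "A": ["buy", "sell", "hold"],
    "B": ["buy", "hold", "sell"],
    "C": ["sell", "sell", "buy"],
}
combined = ensemble_actions(toy_votes)
print(combined)
```

The combined list has the same format as an entry in `action_log`, so it could be scored with the same metrics as the individual agents.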

### Next Steps

1. Test on the FNSPID dataset for real-world validation
2. Include the AI agents (FinBERT, Groq) in the comparison if they were not part of this run
3. Perform sensitivity analysis on parameters
4. Create ensemble strategies