# Adversarial MARL for Robustness

## Introduction

In this notebook, we explore adversarial Multi-Agent Reinforcement Learning (MARL) as a way to make agents more robust. Training against an opponent that actively works against them prepares agents for uncertain or hostile environments. In app modernization, the adversary can stand in for legacy code vulnerabilities, unexpected system behavior, or security threats that surface during the modernization process.

## Setup

First, let's import the necessary libraries and set up our environment.

In [16]:
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import numpy as np
import matplotlib.pyplot as plt

# Set random seed for reproducibility
torch.manual_seed(42)
np.random.seed(42)

## Implementing Adversarial Agents

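We'll create a simple environment with two types of agents: protagonists and adversaries. The protagonists try to accomplish a task while the adversaries try to hinder them, in the spirit of robust adversarial RL (Pinto et al., 2017). The environment is deliberately a toy: next states are random noise, and the protagonists' shared reward is their summed actions minus the adversaries' summed actions, so the game is zero-sum.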
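
One way to formalize this setup, borrowing the two-player zero-sum view of robust adversarial RL (Pinto et al., 2017), is as a minimax objective. The symbols below are assumptions introduced for illustration and do not appear in the code: $\pi_P$ and $\pi_A$ denote the protagonist and adversary policies, $r_t$ the protagonists' reward at step $t$, $r_t^{A}$ the adversaries' reward, and $T$ the episode length.

```latex
\max_{\pi_P} \; \min_{\pi_A} \; \mathbb{E}\left[ \sum_{t=1}^{T} r_t \right],
\qquad r_t^{A} = -\, r_t
```

The protagonists maximize the expected return while the adversaries, whose reward is its exact negative, minimize it.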

In [17]:
class Agent(nn.Module):
    """Three-layer MLP policy: maps a state vector to a continuous action vector."""
    def __init__(self, input_dim, hidden_dim, output_dim):
        super().__init__()
        self.fc1 = nn.Linear(input_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, hidden_dim)
        self.fc3 = nn.Linear(hidden_dim, output_dim)
        
    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)

class AdversarialEnvironment:
    def __init__(self, num_protagonists, num_adversaries, state_dim, action_dim):
        self.num_protagonists = num_protagonists
        self.num_adversaries = num_adversaries
        self.state_dim = state_dim
        self.action_dim = action_dim
        
    def reset(self):
        return torch.randn(self.num_protagonists + self.num_adversaries, self.state_dim)
    
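    def step(self, protagonist_actions, adversary_actions):
        # Simplified dynamics: sum each side's actions over its agents
        protagonist_effect = torch.sum(protagonist_actions, dim=0)
        adversary_effect = torch.sum(adversary_actions, dim=0)
        
        # Next states are random noise; actions do not influence them
        next_states = torch.randn(self.num_protagonists + self.num_adversaries, self.state_dim)
        
        # Zero-sum rewards: one shared scalar for the protagonists, its negative
        # for every adversary (also correct when adversaries outnumber protagonists)
        team_reward = torch.sum(protagonist_effect - adversary_effect, dim=-1)
        protagonist_rewards = team_reward.repeat(self.num_protagonists)
        adversary_rewards = (-team_reward).repeat(self.num_adversaries)
        
        done = False  # episodes end only when max_steps is reached
        return next_states, protagonist_rewards, adversary_rewards, done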

# Hyperparameters
num_protagonists = 2
num_adversaries = 1
state_dim = 10
hidden_dim = 64
action_dim = 5

# Initialize agents and environment
protagonists = [Agent(state_dim, hidden_dim, action_dim) for _ in range(num_protagonists)]
adversaries = [Agent(state_dim, hidden_dim, action_dim) for _ in range(num_adversaries)]
env = AdversarialEnvironment(num_protagonists, num_adversaries, state_dim, action_dim)
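
Before training, it can help to check the reward bookkeeping by hand. The sketch below uses a standalone helper, `zero_sum_rewards` (a hypothetical name, not part of the notebook), that reproduces the reward arithmetic of `step` for the two-protagonist, one-adversary configuration and confirms that the adversary receives the exact negative of the shared protagonist reward.

```python
import torch

def zero_sum_rewards(protagonist_actions, adversary_actions):
    """Hypothetical helper mirroring the reward arithmetic in `step`."""
    # Net effect per action dimension, summed over the agents on each side
    protagonist_effect = protagonist_actions.sum(dim=0)
    adversary_effect = adversary_actions.sum(dim=0)
    # One scalar shared by all protagonists; adversaries get its negative
    team_reward = (protagonist_effect - adversary_effect).sum()
    protagonist_rewards = team_reward.repeat(protagonist_actions.shape[0])
    adversary_rewards = (-team_reward).repeat(adversary_actions.shape[0])
    return protagonist_rewards, adversary_rewards

# Two protagonists and one adversary with 5-dimensional actions,
# matching the hyperparameters above
protagonist_actions = torch.ones(2, 5)       # total effect: 2 * 5 = 10
adversary_actions = 0.5 * torch.ones(1, 5)   # total effect: 0.5 * 5 = 2.5
p_rewards, a_rewards = zero_sum_rewards(protagonist_actions, adversary_actions)
print(p_rewards)  # tensor([7.5000, 7.5000])
print(a_rewards)  # tensor([-7.5000])
```

Because the reward is linear in the actions, whichever side produces the larger summed action wins; nothing in this toy environment bounds that sum.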

## Training Loop

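Now, let's implement a training loop in which protagonists learn their task while adversaries learn to disrupt it. Because the toy reward is a differentiable function of the actions, the loop backpropagates directly through the reward rather than using a policy-gradient estimator.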

In [18]:
def train_agents(num_episodes, max_steps):
    protagonist_optimizers = [optim.Adam(agent.parameters(), lr=0.001) for agent in protagonists]
    adversary_optimizers = [optim.Adam(agent.parameters(), lr=0.001) for agent in adversaries]
    episode_rewards = []

    for episode in range(num_episodes):
        states = env.reset()
        episode_reward = 0

        for step in range(max_steps):
            protagonist_actions = torch.stack([agent(states[i].unsqueeze(0)).squeeze(0) 
                                               for i, agent in enumerate(protagonists)])
            adversary_actions = torch.stack([agent(states[i+num_protagonists].unsqueeze(0)).squeeze(0) 
                                             for i, agent in enumerate(adversaries)])

            next_states, protagonist_rewards, adversary_rewards, done = env.step(protagonist_actions, adversary_actions)
            episode_reward += protagonist_rewards.sum().item()

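            # Separate objectives: each side maximizes its own reward.
            # Summing both into a single loss lets the zero-sum terms cancel, so
            # with two protagonists and one adversary every agent, adversary
            # included, would be pushed to raise the protagonists' reward.
            protagonist_loss = -protagonist_rewards.sum()
            adversary_loss = -adversary_rewards.sum()

            # Zero gradients for all optimizers
            for optimizer in protagonist_optimizers + adversary_optimizers:
                optimizer.zero_grad()

            # Both losses share one graph, so retain it for the second pass;
            # `inputs` restricts each pass to that side's own parameters
            protagonist_params = [p for agent in protagonists for p in agent.parameters()]
            adversary_params = [p for agent in adversaries for p in agent.parameters()]
            protagonist_loss.backward(inputs=protagonist_params, retain_graph=True)
            adversary_loss.backward(inputs=adversary_params)

            # Step optimizers
            for optimizer in protagonist_optimizers + adversary_optimizers:
                optimizer.step()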

            states = next_states
            if done:
                break

        episode_rewards.append(episode_reward)
        if episode % 100 == 0:
            print(f"Episode {episode}, Avg Reward: {np.mean(episode_rewards[-100:]):.2f}")

    return episode_rewards



# Train the agents
num_episodes = 1000
max_steps = 50
rewards = train_agents(num_episodes, max_steps)

# Plot the learning curve
plt.plot(rewards)
plt.title("Learning Curve (Protagonist Rewards)")
plt.xlabel("Episode")
plt.ylabel("Total Reward")
plt.show()
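
Robust adversarial RL (Pinto et al., 2017) trains the two sides in alternating phases rather than simultaneously: the adversary is frozen while the protagonist improves, then the roles swap. The following is a minimal sketch of that schedule, not the notebook's training loop. It assumes single-layer stand-in policies with `tanh`-bounded actions, direct gradient steps on the differentiable toy reward instead of a policy-gradient estimator, and arbitrary phase lengths (10 updates each).

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
state_dim, action_dim = 10, 5

# Small stand-in policies; tanh keeps actions in [-1, 1] so the linear
# toy reward cannot grow without bound (an assumption of this sketch)
protagonist = nn.Sequential(nn.Linear(state_dim, action_dim), nn.Tanh())
adversary = nn.Sequential(nn.Linear(state_dim, action_dim), nn.Tanh())
protagonist_opt = optim.Adam(protagonist.parameters(), lr=0.01)
adversary_opt = optim.Adam(adversary.parameters(), lr=0.01)

def update(optimizer, loss):
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

history = []
for phase in range(20):
    # Phase A: improve the protagonist; the adversary's output is detached,
    # so it acts as a fixed opponent
    for _ in range(10):
        state = torch.randn(1, state_dim)
        loss = -(protagonist(state) - adversary(state).detach()).sum()
        update(protagonist_opt, loss)
    # Phase B: improve the adversary against the now-fixed protagonist;
    # it minimizes the protagonist's reward
    for _ in range(10):
        state = torch.randn(1, state_dim)
        loss = (protagonist(state).detach() - adversary(state)).sum()
        update(adversary_opt, loss)
    # Track the protagonist's mean reward on a fresh batch of states
    with torch.no_grad():
        batch = torch.randn(256, state_dim)
        reward = (protagonist(batch) - adversary(batch)).sum(dim=1).mean().item()
        history.append(reward)

print(f"mean protagonist reward after training: {history[-1]:.3f}")
```

Detaching the opponent's output makes the "held fixed" part explicit: only the side being trained receives gradients in each phase.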


## Evaluating Robustness

Let's evaluate the robustness of our trained protagonists by testing them against an adversary they did not train against: one that takes random actions.

In [None]:
@torch.no_grad()  # evaluation only: skip building autograd graphs
def evaluate_robustness(num_evaluations, max_steps):
    evaluation_rewards = []

    for _ in range(num_evaluations):
        states = env.reset()
        episode_reward = 0

        for _ in range(max_steps):
            protagonist_actions = torch.stack([agent(states[i].unsqueeze(0)).squeeze(0) 
                                               for i, agent in enumerate(protagonists)])
            
            # Use random actions for adversaries to test robustness
            adversary_actions = torch.randn(num_adversaries, action_dim)

            next_states, protagonist_rewards, _, done = env.step(protagonist_actions, adversary_actions)
            episode_reward += protagonist_rewards.sum().item()

            states = next_states
            if done:
                break

        evaluation_rewards.append(episode_reward)

    return np.mean(evaluation_rewards), np.std(evaluation_rewards)

mean_reward, std_reward = evaluate_robustness(100, max_steps)
print(f"Robustness Evaluation - Mean Reward: {mean_reward:.2f}, Std Reward: {std_reward:.2f}")
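
A random adversary is only one point of comparison. As a sketch of a broader evaluation, the block below scores a policy against three hypothetical strategies: no adversary, clipped random noise, and the worst case for this toy reward under the assumption that adversary actions are bounded to [-1, 1]. The policy here is an untrained stand-in so that the block runs on its own; in the notebook you would substitute the trained `protagonists`.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
state_dim, action_dim, num_adversaries = 10, 5, 1

# Untrained stand-in for a trained protagonist (assumption of this sketch)
protagonist = nn.Sequential(nn.Linear(state_dim, action_dim), nn.Tanh())

# Hypothetical adversary strategies; each returns (num_adversaries, action_dim)
strategies = {
    "no_adversary": lambda: torch.zeros(num_adversaries, action_dim),
    "random": lambda: torch.randn(num_adversaries, action_dim).clamp(-1, 1),
    # For this linear reward, the most damaging action within [-1, 1] is +1 everywhere
    "worst_case": lambda: torch.ones(num_adversaries, action_dim),
}

# Shared evaluation states so every strategy is compared on equal footing
eval_states = torch.randn(200, state_dim)

@torch.no_grad()
def mean_reward(adversary_fn):
    rewards = []
    for state in eval_states:
        protagonist_effect = protagonist(state.unsqueeze(0)).sum()
        adversary_effect = adversary_fn().sum()
        rewards.append((protagonist_effect - adversary_effect).item())
    return sum(rewards) / len(rewards)

results = {name: mean_reward(fn) for name, fn in strategies.items()}
for name, value in results.items():
    print(f"{name:>12}: {value:.2f}")
```

The gap between the `no_adversary` and `worst_case` rows gives a rough upper bound on how much a bounded adversary can cost the protagonist in this toy setting.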

## Conclusion

In this notebook, we implemented a simple adversarial MARL setup aimed at improving agent robustness: protagonists are trained against learned adversaries and then evaluated against a random adversary. This approach has several potential applications in app modernization:

1. Improving the resilience of modernized applications against potential security threats
2. Developing robust migration strategies that can handle unexpected issues during the modernization process
3. Testing the reliability of modernized systems under various adverse conditions

Future work could involve:
- Implementing more sophisticated adversarial strategies
- Applying this approach to specific app modernization challenges, such as handling legacy code vulnerabilities
- Exploring the trade-offs between performance and robustness in the context of app modernization

## References

1. Pinto, L., Davidson, J., Sukthankar, R., & Gupta, A. (2017). Robust adversarial reinforcement learning. ICML.
2. Gleave, A., Dennis, M., Wild, C., Kant, N., Levine, S., & Russell, S. (2020). Adversarial policies: Attacking deep reinforcement learning. ICLR.