This repository contains materials for a multi-agent reinforcement learning assignment using the PettingZoo Pong environment.
Students will implement a reinforcement learning agent that plays competitive Pong against other agents. The assignment focuses on:
- Processing RGB image observations (210, 160, 3)
- Implementing neural networks for visual reinforcement learning
- Training agents using any multi-agent RL algorithm
- Understanding competitive multi-agent scenarios
The following files are provided for you to complete the assignment:
├── pettingzoo_pong_wrapper.py # PettingZoo Pong environment wrapper
├── student_agent.py # Agent template (implement this)
├── train_baseline_agent.py # Reference training script (study this)
├── validate_student_submission.py # Validation script (use this)
└── README.md # This instruction file
Requirements:
- Python 3.8 or higher
- Required packages: torch, numpy, gymnasium, pettingzoo[atari]
```bash
# Install required packages
pip install torch numpy gymnasium pettingzoo[atari]
```
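Note: depending on your PettingZoo/ALE versions, the Atari ROMs may need to be installed separately (for example, via the AutoROM package); consult the PettingZoo Atari documentation if environment creation fails.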
Each student must implement their agent and submit 3 files:
- <kerberos_id>_agent.py - Your agent implementation
  - Must follow the template in student_agent.py
  - Implement the StudentAgent class with required methods
  - Handle RGB observations (210, 160, 3)
  - Output 6 actions (0=NOOP, 1=FIRE, 2=RIGHT, 3=LEFT, 4=FIRE+RIGHT, 5=FIRE+LEFT)
- <kerberos_id>_model.pt - Your trained PyTorch model
  - Saved model weights from your training
  - Must be compatible with your agent architecture
- <kerberos_id>_train.py - Your training script
  - Demonstrates how you trained your agent
Example submission directory (for entry number aib242295):
aib242295_ail821_assignment2/
├── aib242295_agent.py # Your agent implementation
├── aib242295_model.pt # Your trained model
└── aib242295_train.py # Your training script
Each agent receives an RGB image observation:
- Shape: (210, 160, 3) - Height, Width, RGB channels
- Type: Raw pixel values from the Atari Pong game
- Range: 0-255 for each color channel
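Raw frames are large, so a common preprocessing step (optional, not required by the assignment) is to convert to grayscale, downsample, and rescale pixel values to [0, 1] before feeding a network. A minimal sketch; the luminance weights and stride-2 downsample are illustrative choices:

```python
import numpy as np
import torch

def preprocess(obs: np.ndarray) -> torch.Tensor:
    """Convert a (210, 160, 3) uint8 frame to a (1, 105, 80) float tensor in [0, 1]."""
    # Luminance-weighted grayscale conversion -> (210, 160)
    gray = obs.astype(np.float32) @ np.array([0.299, 0.587, 0.114], dtype=np.float32)
    # Stride-2 downsample -> (105, 80)
    small = gray[::2, ::2]
    # Rescale to [0, 1] and add a channel dimension
    return torch.from_numpy(small / 255.0).unsqueeze(0)
```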
Each agent can take 6 actions (Atari Pong controls):
- 0: NOOP (no operation)
- 1: FIRE (serve the ball)
- 2: RIGHT (move paddle right)
- 3: LEFT (move paddle left)
- 4: FIRE+RIGHT (serve and move right)
- 5: FIRE+LEFT (serve and move left)
Each agent receives the following rewards:
- +1: Score a point (opponent misses the ball)
- -1: Concede a point (miss the ball)
- 0: Otherwise
Tournament format:
- Each agent plays against every other agent
- Multiple episodes per match
- Points per match:
  - Win: 3 points
  - Draw: 1 point
  - Loss: 0 points
The assignment uses weighted scoring for final evaluation:
- 30% weight: Performance against baseline agent
- 70% weight: Performance in tournament against other students
Formula:
Final Score = (0.3 × Baseline_Score) + (0.7 × Tournament_Score)
Where:
- Baseline_Score: Points earned against the baseline agent (max 3 points per game)
- Tournament_Score: Points earned against other student agents (max 3 points per game)
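For example, an agent that beats the baseline (3 points) and averages 2 points per tournament game earns 0.3 × 3 + 0.7 × 2 = 2.3.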
- Fair Evaluation: Rewards both absolute performance and competitive skill
- Comprehensive Assessment: Tests agents against known baseline and unknown opponents
You must implement the StudentAgent class in <kerberos_id>_agent.py:
```python
class StudentAgent:
    def __init__(self, agent_id, action_space):
        """
        Initialize your multi-agent RL algorithm.

        Args:
            agent_id: ID for this agent (0 or 1)
            action_space: Gym action space (6 discrete actions)
        """
        self.agent_id = agent_id
        self.action_space = action_space
        self.n_actions = 6
        # TODO: Initialize your algorithm (DQN, PPO, MADDPG, etc.)

    def act(self, obs):
        """
        Select an action given an RGB observation.

        Args:
            obs: RGB image of shape (210, 160, 3)

        Returns:
            action: Integer (0-5) representing the action
        """
        # TODO: Implement your action selection
        # This function will be used during evaluation
        return 0  # Placeholder

    def update(self, obs, action, reward, next_obs, done):
        """
        Update your agent using a transition.

        Args:
            obs: Current RGB observation
            action: Action taken
            reward: Reward received
            next_obs: Next RGB observation
            done: Whether the episode finished
        """
        # TODO: Implement your learning update
        pass

    def save_model(self, filepath):
        """Save your trained model."""
        # TODO: Implement model saving
        pass

    def load_model(self, model_data):
        """Load your trained model."""
        # TODO: Implement model loading
        pass
```
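As one illustration of how these hooks could be filled in, here is a hedged sketch that reuses the preprocess helper sketched earlier. The placeholder linear network, the exploration rate, and the assumption that model_data may be either a filepath or an already-loaded state dict are all illustrative, not the required implementation:

```python
import random
import torch
import torch.nn as nn

class SketchAgent(StudentAgent):
    """Illustrative subclass only -- your StudentAgent can be structured differently."""

    def __init__(self, agent_id, action_space):
        super().__init__(agent_id, action_space)
        # Placeholder network on preprocessed (1, 105, 80) frames
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(105 * 80, self.n_actions))
        self.epsilon = 0.05  # exploration rate (assumed value)

    def act(self, obs):
        # Epsilon-greedy action selection over the network's action values
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        with torch.no_grad():
            q = self.net(preprocess(obs).unsqueeze(0))  # (1, 1, 105, 80) -> (1, 6)
        return int(q.argmax(dim=1).item())

    def save_model(self, filepath):
        torch.save(self.net.state_dict(), filepath)

    def load_model(self, model_data):
        # model_data may be a filepath or a state dict, depending on how
        # the validator calls this hook (assumption -- verify against the script)
        state = torch.load(model_data) if isinstance(model_data, str) else model_data
        self.net.load_state_dict(state)
```

The single linear layer keeps the sketch short; swapping in a CNN like the one sketched in the next section is the natural next step.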
You need to process RGB images (210, 160, 3). Consider using:
- CNN layers for visual feature extraction
- Fully connected layers for action values
- Any architecture that works with RGB input
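For example, a small convolutional network in the spirit of the classic Atari DQN architecture can map a preprocessed frame to 6 action values. The layer sizes below are illustrative assumptions, and the (1, 105, 80) input matches the preprocessing sketch above:

```python
import torch
import torch.nn as nn

class PongCNN(nn.Module):
    """Small CNN mapping a (1, 105, 80) preprocessed frame to 6 action values."""

    def __init__(self, n_actions: int = 6):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=8, stride=4), nn.ReLU(),   # -> (16, 25, 19)
            nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),  # -> (32, 11, 8)
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 11 * 8, 256), nn.ReLU(),
            nn.Linear(256, n_actions),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.conv(x))

# Sanity check: one preprocessed frame in, six action values out
values = PongCNN()(torch.zeros(1, 1, 105, 80))
assert values.shape == (1, 6)
```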
Before submitting, use the validation script to check your implementation:

```bash
python validate_student_submission.py ./<kerberos_id>_ail821_assignment2
```
This script checks:
- All required files are present
- StudentAgent class is properly implemented
- Model file is valid
- Agent can select actions correctly
Study train_baseline_agent.py to understand:
- How to set up the PettingZoo environment
- How to implement training loops
- How to use the agent interface
- How to save and load models
```python
# Test your agent with the environment
from pettingzoo_pong_wrapper import PettingZooPongWrapper
from student_agent import StudentAgent

env = PettingZooPongWrapper()
obs, _ = env.reset()

# Create your agent
agent = StudentAgent(agent_id=0, action_space=env.action_space[0])

# Test action selection
action = agent.act(obs[0])
print(f"Selected action: {action}")

env.close()
```
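Building on this snippet, a rough self-play training loop is sketched below. It assumes env.step takes a dict of actions keyed by agent id and returns per-agent observations, rewards, and done flags with the same keys; verify the actual signature in pettingzoo_pong_wrapper.py:

```python
from pettingzoo_pong_wrapper import PettingZooPongWrapper
from student_agent import StudentAgent

env = PettingZooPongWrapper()
agents = [StudentAgent(agent_id=i, action_space=env.action_space[i]) for i in range(2)]

for episode in range(10):  # small episode count, for illustration only
    obs, _ = env.reset()
    done = False
    while not done:
        # Each agent acts on its own observation
        actions = {i: agents[i].act(obs[i]) for i in range(2)}
        next_obs, rewards, dones, infos = env.step(actions)  # assumed signature
        # Feed each agent its own transition for learning
        for i in range(2):
            agents[i].update(obs[i], actions[i], rewards[i], next_obs[i], dones[i])
        obs = next_obs
        done = all(dones[i] for i in range(2))

env.close()
```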
- Study train_baseline_agent.py - Complete training example
- Study student_agent.py - Template with required interface
- Study pettingzoo_pong_wrapper.py - Environment usage
Tips:
- RGB Processing
  - Remember to handle (210, 160, 3) RGB images
  - Use CNN layers or preprocessing for visual features
  - Consider normalization or scaling
- 6-Action Space
  - Actions 0-5 correspond to Atari Pong controls
  - Make sure your agent outputs valid actions (0-5)
- Training Stability
  - Use appropriate learning rates for CNN training
  - Consider experience replay for stable learning (see the sketch after this list)
  - Monitor training progress and loss values
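For the experience-replay suggestion above, a minimal buffer could look like this sketch; the capacity and batch size are illustrative:

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size FIFO buffer of (obs, action, reward, next_obs, done) transitions."""

    def __init__(self, capacity: int = 50_000):
        self.buffer = deque(maxlen=capacity)  # old transitions drop off automatically

    def push(self, obs, action, reward, next_obs, done):
        self.buffer.append((obs, action, reward, next_obs, done))

    def sample(self, batch_size: int = 32):
        # Uniform random minibatch; returns a list of transition tuples
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```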
A well-implemented agent should:
- Process RGB observations effectively
- Learn meaningful policies from visual input
- Compete in multi-agent scenarios
- Show improvement during training
Your implementation is successful if:
- <kerberos_id>_agent.py implements the required interface
- <kerberos_id>_model.pt loads without errors
- <kerberos_id>_train.py demonstrates the training process
- Agent can select valid actions (0-5)
- Code runs without crashes
Package your files in a directory named:
<kerberos_id>_ail821_assignment2/
├── <kerberos_id>_agent.py # Your agent implementation
├── <kerberos_id>_model.pt # Your trained model
└── <kerberos_id>_train.py # Your training script