Multi-Agent Reinforcement Learning Soccer Environment

This project implements a multi-agent reinforcement learning system for simulated robot soccer using PyTorch and PettingZoo.

Features

  • Custom soccer environment with realistic physics and game dynamics
  • Multiple agents per team with individual policies
  • PPO (Proximal Policy Optimization) implementation for policy learning
  • Team coordination and opponent adaptation
  • Visual rendering of the soccer matches
  • Performance logging and visualization using TensorBoard

Installation

  1. Create a virtual environment and activate it:
python -m venv .venv
.venv\Scripts\activate    # On Windows
source .venv/bin/activate  # On macOS/Linux
  2. Install the required packages:
pip install torch gymnasium pettingzoo numpy matplotlib tensorboard pygame

Project Structure

  • soccer_env.py: Defines the soccer environment using PettingZoo's ParallelEnv
  • agent.py: Implements the PPO agent with ActorCritic architecture
  • train.py: Contains the training loop for the agents
  • visualize.py: Provides visualization of trained agents playing soccer
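Since soccer_env.py builds on PettingZoo's ParallelEnv, every agent acts simultaneously and the environment communicates through dicts keyed by agent name. The stand-in class below (MiniSoccerEnv is a hypothetical name, not the project's actual class) is a minimal sketch of that convention, assuming the current five-tuple return of the Parallel API:

```python
class MiniSoccerEnv:
    """Hypothetical stand-in illustrating PettingZoo's ParallelEnv convention:
    reset() and step() exchange dicts keyed by agent name."""

    def __init__(self, players_per_team=3):
        # 3 players per team, matching the environment details below
        self.agents = [f"{team}_{i}" for team in ("red", "blue")
                       for i in range(players_per_team)]

    def reset(self):
        # One observation per agent (zeros as placeholders here)
        return {a: [0.0, 0.0] for a in self.agents}

    def step(self, actions):
        # actions: {agent_name: [move_x, move_y, kick_power, kick_direction]}
        observations = {a: [0.0, 0.0] for a in self.agents}
        rewards = {a: 0.0 for a in self.agents}
        terminations = {a: False for a in self.agents}
        truncations = {a: False for a in self.agents}
        infos = {a: {} for a in self.agents}
        return observations, rewards, terminations, truncations, infos


env = MiniSoccerEnv()
obs = env.reset()
actions = {a: [0.0, 0.0, 0.0, 0.0] for a in env.agents}
obs, rewards, terminations, truncations, infos = env.step(actions)
```

The dict-per-agent structure is what lets train.py run one PPO update per agent while stepping all six players in a single call.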

Usage

  1. Train the agents:
python train.py

This creates a runs directory containing the TensorBoard logs.

  2. Monitor training progress with TensorBoard:
tensorboard --logdir=runs
  3. Once training is complete, visualize the trained agents:
python visualize.py

Environment Details

  • Field size: 800x600 pixels
  • 3 players per team
  • Continuous action space: [move_x, move_y, kick_power, kick_direction]
  • Observation space includes:
    • Agent's position and velocity
    • Ball's relative position
    • Goal's relative position
    • Teammates' relative positions
    • Opponents' relative positions
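With egocentric (relative) coordinates for everything except the agent's own state, the observation above flattens to an 18-dimensional vector per agent (2 position + 2 velocity + 2 ball + 2 goal + 2x2 teammates + 3x2 opponents). The helper below is an illustrative sketch of that assembly; the function name and dict layout are assumptions, not the project's actual code:

```python
def build_observation(agent, ball, goal, teammates, opponents):
    """Flatten one agent's view into a single observation vector.

    agent: dict with 'pos' and 'vel' as (x, y) tuples;
    ball, goal, teammates, opponents: (x, y) tuples in field coordinates.
    """
    ax, ay = agent["pos"]
    obs = [ax, ay, *agent["vel"]]          # own absolute position and velocity
    for x, y in (ball, goal, *teammates, *opponents):
        obs.extend([x - ax, y - ay])       # everything else relative to the agent
    return obs


obs = build_observation(
    {"pos": (10.0, 20.0), "vel": (1.0, -1.0)},
    ball=(30.0, 20.0),
    goal=(800.0, 300.0),
    teammates=[(0.0, 0.0), (5.0, 5.0)],    # 2 teammates (3 per team minus self)
    opponents=[(100.0, 100.0)] * 3,
)
```

Relative coordinates keep the policy translation-invariant, so the same network weights work anywhere on the 800x600 field.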

Agent Architecture

  • Shared feature extractor
  • Actor network (policy) with Gaussian distribution
  • Critic network (value function)
  • PPO algorithm with:
    • GAE (Generalized Advantage Estimation)
    • Value function clipping
    • Adaptive learning rate

Rewards

  • Goal scoring: +10 for scoring team, -10 for conceding team
  • Ball possession: proximity-based continuous reward
  • Field position: encourages offensive/defensive positioning
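Combining the goal and possession terms above, one possible per-step reward looks like the sketch below. The function name, the 0.1 possession scale, and the normalization by the field diagonal (1000 px for an 800x600 field) are illustrative assumptions; only the +/-10 goal reward and the proximity-based possession shaping come from the list above:

```python
import math

def step_reward(agent_pos, ball_pos, scored, conceded,
                goal_reward=10.0, possession_scale=0.1, field_diag=1000.0):
    """Per-step reward: +/-10 for goals, plus a small bonus for being
    near the ball (1.0 at the ball, decaying to 0.0 at field-diagonal range)."""
    reward = 0.0
    if scored:
        reward += goal_reward
    if conceded:
        reward -= goal_reward
    dist = math.hypot(ball_pos[0] - agent_pos[0], ball_pos[1] - agent_pos[1])
    reward += possession_scale * (1.0 - min(dist / field_diag, 1.0))
    return reward
```

Keeping the shaping terms small relative to the goal reward matters: if proximity pays too well, agents learn to crowd the ball instead of scoring.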
