
Last-Iterate Convergence with Full and Noisy Feedback in Two-Player Zero-Sum Games

Code for reproducing results in the paper "Last-Iterate Convergence with Full and Noisy Feedback in Two-Player Zero-Sum Games".

About

This paper proposes Mutation-Driven Multiplicative Weights Update (M2WU) for learning an equilibrium in two-player zero-sum normal-form games and proves that it exhibits the last-iterate convergence property under both full and noisy feedback. In the former setting, players observe the exact gradient vectors of their utility functions; in the latter, they observe only noisy gradient vectors. Even the celebrated Multiplicative Weights Update (MWU) and Optimistic MWU (OMWU) algorithms may fail to converge to a Nash equilibrium with noisy feedback. In contrast, M2WU exhibits last-iterate convergence to a stationary point near a Nash equilibrium in both feedback settings. We then prove that it converges to an exact Nash equilibrium by iteratively adapting the mutation term. We empirically confirm that M2WU outperforms MWU and OMWU in exploitability and convergence rate.
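
For intuition, below is a minimal sketch of a mutation-perturbed MWU step in a matrix game. The update rule, the payoff matrix values, and all parameter names here are illustrative assumptions based on the description above, not the paper's exact algorithm.

# Illustrative sketch only: a mutation-perturbed MWU step for one player in a
# matrix game. The exact M2WU update used in the paper may differ; eta, mu,
# and the reference strategy c are assumptions for illustration.
import numpy as np

def m2wu_step(x, grad, c, eta=0.1, mu=0.1):
    """One multiplicative-weights step with a mutation term pulling x toward c."""
    # The mutation term perturbs the observed gradient toward the reference c.
    perturbed = grad + mu * (c / x - 1.0)
    logits = np.log(x) + eta * perturbed
    w = np.exp(logits - logits.max())  # subtract max for numerical stability
    return w / w.sum()

# Usage with a biased Rock-Paper-Scissors-like payoff matrix (values hypothetical).
A = np.array([[0.0, -1.0, 3.0], [1.0, 0.0, -3.0], [-3.0, 3.0, 0.0]])
x = np.ones(3) / 3        # row player's mixed strategy
y = np.ones(3) / 3        # column player's mixed strategy
c_x, c_y = x.copy(), y.copy()  # reference strategies (here: uniform)
for t in range(10000):
    gx, gy = A @ y, -A.T @ x  # full feedback: exact utility gradients
    x, y = m2wu_step(x, gx, c_x), m2wu_step(y, gy, c_y)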

Installation

This code is written in Python 3. To install the required dependencies, execute the following command:

$ pip install -r requirements.txt

For Docker Users

Build the container:

$ docker build -t m2wu .

After the build finishes, run the container:

$ docker run -it m2wu

Run Experiments

To investigate the performance of M2WU in biased Rock-Paper-Scissors with full feedback, execute the following commands (the second one additionally updates the reference strategies every 100 iterations):

$ python run_experiment.py --num_trials 10 --T 100000 --random_init_strategy --algorithm m2wu
$ python run_experiment.py --num_trials 10 --T 100000 --random_init_strategy --algorithm m2wu --update_freq 100

In this experiment, the following options can be specified (an illustrative CLI sketch follows the list):

  • --game: Name of a matrix game. The default value is biased_rps.
  • --algorithm: Learning algorithm.
  • --num_trials: Number of trials to run experiments. The default value is 1.
  • --T: Number of iterations. The default value is 10000.
  • --feedback: Feedback type given to players. The default value is full.
  • --seed: Random seed. The default value is 0.
  • --random_init_strategy: Whether to generate the initial strategy uniformly at random. The default value is False.
  • --eta: Learning rate. The default value is 0.1.
  • --decay: Whether to use decreasing learning rates. The default value is False.
  • --mu: Mutation rate for M2WU. The default value is 0.1.
  • --update_freq: Update the reference strategies in M2WU every N iterations. The value of None means that the reference strategies are not updated. The default value is None.
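
For reference, here is a hypothetical argparse sketch mirroring the options above; the actual run_experiment.py may declare these flags differently.

# Hypothetical sketch of the CLI mirroring the options listed above; the real
# run_experiment.py may define these flags with different types or defaults.
import argparse

parser = argparse.ArgumentParser(description="Run a matrix-game experiment.")
parser.add_argument("--game", default="biased_rps", help="Name of a matrix game.")
parser.add_argument("--algorithm", help="Learning algorithm (e.g., m2wu).")
parser.add_argument("--num_trials", type=int, default=1, help="Number of trials.")
parser.add_argument("--T", type=int, default=10000, help="Number of iterations.")
parser.add_argument("--feedback", default="full", choices=["full", "noisy"],
                    help="Feedback type given to players.")
parser.add_argument("--seed", type=int, default=0, help="Random seed.")
parser.add_argument("--random_init_strategy", action="store_true",
                    help="Sample the initial strategy uniformly at random.")
parser.add_argument("--eta", type=float, default=0.1, help="Learning rate.")
parser.add_argument("--decay", action="store_true",
                    help="Use decreasing learning rates.")
parser.add_argument("--mu", type=float, default=0.1, help="Mutation rate for M2WU.")
parser.add_argument("--update_freq", type=int, default=None,
                    help="Update the reference strategies every N iterations.")
args = parser.parse_args()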

To evaluate M2WU in biased Rock-Paper-Scissors with noisy feedback, execute the following commands:

$ python run_experiment.py --num_trials 10 --T 1000000 --feedback noisy --algorithm m2wu --eta 0.001 --mu 0.1
$ python run_experiment.py --num_trials 10 --T 1000000 --feedback noisy --algorithm m2wu --eta 0.001 --mu 0.5 --update_freq 20000
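
With noisy feedback, each player observes a perturbed gradient rather than the exact one. A minimal illustration, assuming additive zero-mean Gaussian noise (the paper's actual noise model may differ):

# Illustrative only: noisy feedback modeled as the exact gradient plus
# zero-mean Gaussian noise. The noise model and scale are assumptions,
# not the paper's specification.
import numpy as np

rng = np.random.default_rng(0)

def noisy_gradient(A, y, sigma=1.0):
    """Return the row player's utility gradient corrupted by additive noise."""
    exact = A @ y  # the full-feedback gradient
    return exact + rng.normal(scale=sigma, size=exact.shape)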

Citation

If you use our code in your work, please cite our paper:

@article{abe2022m2wu,
  title={Last-Iterate Convergence with Full and Noisy Feedback in Two-Player Zero-Sum Games},
  author={Abe, Kenshi and Ariu, Kaito and Sakamoto, Mitsuki and Toyoshima, Kentaro and Iwasaki, Atsushi},
  journal={arXiv preprint arXiv:2208.09855},
  year={2022}
}
