This is a Scala project that provides an implementation of the Multiplicative Weights Algorithm (MWA) for online learning.
The algorithm, also called the experts algorithm, has many applications, but this project is intrested in particular in applications to game theory and repeated games, including routing games. This implementation is useful in particular for observing trajecories of player strategies, as well as studying the effects of tha algorithm parameters on convergence.
Repeated game and no regret-learning
In the most general form, we consider a game in which many populations of players interact. On each day, every player chooses an action, and observes a loss (a cost) at the end of the day. The loss of a player may depend on the actions of other players, as well as external factors that do not directly depend on players' actions. In other words, nature may affect the losses of players. Each player seekes to minimize her cumulative loss.
The idea of the MW algorithm is to maintain a probability distribution over the action set of the player, such that on each day, the player draws an action from that distribution, and at the end of that day, after observing the losses, updates the probability distribution by putting more weight on the actions that performed well. The distribution on a given day is called the strategy of the player on that day.
The MW algorithm generates a sequence of strategies that has the following guarantee: its expected cumlative loss is within a small distance of the cumulative loss of the best constant strategy in hindsight. In other words, once the sequence of losses is revealed, we can compute, after the fact, the best constant strategy, and compare it to the performace of the algorithm. The difference in average cumulative loss (called the regret of the algorithm), is guaraneed to have a small upper bound. The upper bound can be made arbitrarily small with an appropriate choice of two important parameters: the number of iterations (or number of days the repeated game is played), and a learning rate which specifies how much one puts weight on past observations.
For a very nice introduction to the MW algorithm, see Professor Rao's notes.
For an extensive survey on the MW algorithm and its applications, see [Arora, Hazan, Kale: The Multiplicative Weights Update Method: A Meta-Algorithm and its Applications
One particularly interesting application of the MW algorithm is the selfish routing game. In this game, we are given a graph and a set of players, where each player seeks to send one unit of flow from a given source node to a given destination node (sources and destinations may differ from player to player). Each player seeks to minimize the total latency, i.e. the total time it takes to send the flow. The total latency on a path is the sum of edge latencies, and the edge latency depends on the total flow on that edge (including flow of other players). This game can be used to model drivers on a road network, for example.
On each day, the strategy of a player is an allocation of the unit flow over possible paths. If every player uses the MW algorithm, and if the sequence of learning rates satisfies a mild assumption, we show that the combined strategies of the players converges to the Nash equilibrium of the game. This proves in particular that players can converge to the Nash equilibrium
- in a decentralized way (no coordination required)
- using a simple to implement algorithm
- with minimal information: they do not need to know the latency functions, only to observe the resulting path latency at the end of each day.
More details on the topic will be posted here soon: http://www.eecs.berkeley.edu/~walid/projects.html
A talk that I gave on multiplictive weights algorithms and application to Routing: http://www.eecs.berkeley.edu/~walid/talks/selfish-routing-and-experts.pdf
First, you should install the Scala Simple Build Tool, or
sbt. If you are on Mac and have
brew, you can simply
brew install sbt. For other options see http://www.scala-sbt.org/release/docs/Getting-Started/Setup.html.
You can then build
MWRouting by running
The first time you run
sbt, it make take a couple of minutes since it will fetch the dependencies.
Then you can either run the project from commandline
or run it from your favorite IDE. For example to open in Eclipse with the Scala plugin
sbt eclipseto generate the project files
- In Eclipse: file>import>general>existing projects and select the root directory where you cloned MWRouting.
The main classes are:
Game: this is an abstract class, that represents the game in which players interact. This class provides two important methods:
update, which updates the state of the game given the previous state and the strategies of the
MWAlgorithminstances playing the game.
loss, which specifies the loss of each expert (action), given the current game state.
MWAlgorithm: the class that stores the set of
experts(actions), and the logic for updating the strategy given a vector of losses. Other parameters of the class include
- an instance of the
LearningRateclass, which specifies a sequence of learning rates (a decreasing sequence of learning rates means that the algorithm will be less agressive in its updates as time increases. This helps guarantee convergence of the algorithm)
- an instance of the
UpdateRuleclass, which specifies which multiplicative update rule should be used. Several rules are implemented in the
- an instance of the
MWCoordinator: this class takes an instance of
Gameand an array of
MWAlgorithm, and coordinates the algorithms with the game. This consists in iteratively applying the following steps:
- update the game state using the previous strategies of the algorithms.
- get the losses from the game given the current game state
- update the strategies of each algorithm instance given the current losses
MWCoordinator defines a number of streams that contain the sequence of strategies and game states. The most important ones are:
Note about the use of
Streams: defining the sequences of states as
Streams allows us to easily express each sequence as functional transformations of other sequences. This does not incur any significant increase in time complexity, since each Stream is immutable and is not recomputed. Also Stream's are evaluated lazily, so theses Streams are only evaluated when needed.
To implement a routing game, we create a class
extends Game. The description of the game is contained in
network, an instance of
LatencyNetowrk, which contains in particular
- the topology of the graph
- the edge latency functions
- the commodities (source-sink pairs and flow demands)
- and methods for computing the flows and latencies.
The RoutingGame class then specifies
Statetype that extends
GameType(as required by the abstract class
Statetype is simply a case class that stores the pathFlows and pathLatencies.
updatemethod. This calls the appropriate methods in
Creating and running an instance of the RoutingGame
Here we walk through an example, given in
To create an instance of the routing game, we first need to create
- a Graph instance
- the corresponding latencies, stored in a
HashMap[Int, LatencyFunction]this can be done using a helper function as follows:
val (graph, latencyFunctions) = DirectedGraph.graphAndLatenciesFromAdjMap(adj)
adj is an adjacency map, that specifies, for each node, its neighboring nodes, and the latency on the edge connecting it to that neighbor.
The following example describes a parallel network:
val adj: Map[Int, List[(Int, LatencyFunction)]] = Map(1->Nil, 0->List( (1, SLF(x => 2 + 2 * x)), (1, SLF(x => x * x)), (1, SLF(x => 2 * (x + 1) * (x + 1) - 1)) ))
0 is connected to node
1 by multiple edges. The first edge has an affine latency function,
SLF is an alias for
StaticLatencyFunction, and as its name suggests, creates a latency function that is constant in time.
A note on latency functions: an instance
LatencyFunction can be applied to a
lat(f) will return the latency for a flow value
f. Lateny functions can depend not only on flow, but also on time (see the class
TimeDependentLatencyFunction for details)
Once we have a
graph and its
latencies, we need to desrcibe the commodity, i.e. the source-destination pair, how much flow is to be sent, and algorithm parameters:
val commodity = Commodity(0, 1, flowDemand, epsilon, updateRule, paths)
- the first two fields specify the source and the sink.
updateRule = ExponentialUpdate()specifies the update rule
learningRate = HarmonicLearningRate(1.)specifies that we use a sequence of learning rates that decreases as 1/t
flowDemand = ConstantFlowDemand(2.)is the flow that we need to send from the source to the sink (an instance of
FlowDemand, which behaves essentially as a
paths = graph.findLooplessPaths(0, 1)returns all the paths that connect the source to the sink. One could potentially restrict the set of paths that are offered to the players by manually giving a set of paths.
Finally, a helper class,
RoutingGameSim, takes care of creating the game, an instance of MWAlgorithm and an instance of MWCoordinator
val sim = new RoutingGameSim(graph, latencyFunctions, Array(commodity), randomizedStart)
sim.runFor(T) will run the game for
T iterations, and generate a number of plots:
- the trajectory of the algorithm strategy in the simplex
- path latencies
- path flows
running the simulation with different learning rate sequences and update rules shows the impact on convergence. Running the above example will generate the following plots:
Trajectory of strategies
breezefor linear algebra and visualization https://github.com/scalanlp/breeze/wiki/Breeze-Linear-Algebra