Add new game: 2048 #894

Jazeem · 2022-07-29T20:39:45Z

Clone of https://github.com/gabrielecirulli/2048

lanctot · 2022-07-30T10:34:33Z

Very nice, thanks @Jazeem !!

I'm on vacation at the moment but will take a look when I'm back next week.

Jazeem · 2022-07-30T19:35:45Z

Sure @lanctot

I'm curious to see if any algorithms implemented in open_spiel would be able to reach 2048. I tried the default MCTS implementation, but it did not fare better than a random player.

lanctot · 2022-07-31T09:43:09Z

Hi @Jazeem,

Hmm, MCTS performing similarly to random, even MCTS with random rollouts? I'm surprised. I suspect a bug in the game implementation. How many simulations did MCTS have? And what is the maximum length of each game?

Other algorithms: you could also try some RL algorithms like DQN. Or, expectimax with a heuristic evaluation function.

lanctot · 2022-07-31T13:54:38Z

Oh wait, depends on what you decided for the reward structure. If there are no intermediate rewards then I can see why it would be super hard for vanilla MCTS but still I would expect it to be better than random.

Jazeem · 2022-07-31T19:49:49Z

@lanctot yeah, you are correct. I think it's because there are no intermediate rewards.

When I ran the simulation over 100 games, MCTS and random couldn't win once.
But when I reduced the winning condition to tile 128 instead of 2048, MCTS could win 100 games vs 48 games for random.

steventrouble · 2022-08-02T23:11:49Z

This is really impressive!

I suspect the terminal condition and reward function still need some tweaking. 2048 doesn't end when the player reaches 2048, it keeps going forever. And the score isn't win/lose, it increases based on the number and size of the merged tiles.

The reason I mention this is that current SOTA algorithms can get scores of ~500k, which IIUC is far beyond reaching 2048.

Disclaimer: I am not an owner of OpenSpiel

Jazeem · 2022-08-03T06:07:17Z

Hi @steventrouble

I was experimenting with the terminal condition and reward function. My initial assumption was that setting the game to end at 2048 might make it easier to solve.

I've since modified the reward system to give a reward for every new tile unlocked. But like you suggested, it might be better to let the game run forever like the original, with Rewards() returning the score from the current action and Returns() returning the total score.
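To make the scoring idea above concrete, here is a minimal sketch of a per-move reward in the style of the original game, where each merge adds the value of the newly created tile to the score. `MergeRowLeft` is a hypothetical helper written for illustration and is not taken from the PR; it only handles a single row sliding left.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper (not from the PR): slides one row to the left, merging
// equal neighbours at most once per move. Returns the new row together with
// the score gained, i.e. the sum of the values of the newly created tiles.
std::pair<std::vector<int>, int> MergeRowLeft(const std::vector<int>& row) {
  std::vector<int> merged;
  merged.reserve(row.size());
  int reward = 0;
  bool last_is_merged = false;  // A freshly merged tile cannot merge again.
  for (int value : row) {
    assert(value >= 0);        // 0 denotes an empty cell.
    if (value == 0) continue;  // Empty cells are squeezed out.
    if (!merged.empty() && merged.back() == value && !last_is_merged) {
      merged.back() *= 2;
      reward += merged.back();
      last_is_merged = true;
    } else {
      merged.push_back(value);
      last_is_merged = false;
    }
  }
  merged.resize(row.size(), 0);  // Pad with empty cells on the right.
  return {merged, reward};
}
```

Under this scheme, a Rewards()-style value would be the sum of these per-row rewards for the last move, and a Returns()-style value would be the running total over the episode.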

lanctot · 2022-08-11T10:27:20Z

> Hi @steventrouble
>
> I was experimenting with the terminal condition and reward function. My initial assumption was that setting the game to end at 2048 might make it easier to solve.

I recommend we make this as consistent with the original scoring (or with other RL implementations) as possible.

> I've since modified the reward system to give a reward for every new tile unlocked. But like you suggested, it might be better to let the game run forever like the original, with Rewards() returning the score from the current action and Returns() returning the total score.

I think this should be configurable by a game parameter. All games in OpenSpiel are episodic: the tests assume this, so games should end in finite time (e.g. the game simulation tests can't go on forever). I recommend adding a parameter like "max_steps" or "max_game_length" that plays 2048 for a finite number of steps, or one that ends the episode after reaching 2048 as the default, and then supporting the infinite game as a non-default parameter value.
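As a rough illustration of the configurable ending discussed above, the sketch below combines a tile threshold with a step cap. The struct, its field names, and its default values are assumptions made for this example; they are not the parameters of the merged implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical limits; names and defaults are illustrative only.
struct EpisodeLimits {
  int max_tile = 2048;    // End the episode once any tile reaches this value.
  int max_steps = 10000;  // Hard cap so that every episode is finite.
};

// Returns true if the episode should end. `board` holds tile values with 0
// for empty cells; `has_legal_move` is computed elsewhere by the game logic.
bool IsEpisodeOver(const std::vector<int>& board, int steps_taken,
                   bool has_legal_move, const EpisodeLimits& limits) {
  assert(steps_taken >= 0);
  if (!has_legal_move) return true;                     // Board is stuck.
  if (steps_taken >= limits.max_steps) return true;     // Step cap reached.
  const auto largest = std::max_element(board.begin(), board.end());
  return largest != board.end() && *largest >= limits.max_tile;
}
```

Setting `max_tile` to a very large value leaves only the step cap and the stuck-board check, which approximates the open-ended original game while keeping episodes finite.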

lanctot · 2022-08-11T12:32:13Z

> Clone of https://github.com/gabrielecirulli/2048

Can you clarify how much of the code in here is based on this implementation? If you actually copied code from that repo, we likely have to put their copyright notice in the file before we can import. (It'd also be a nice courtesy to tag the authors to let them know; they might have some comments or preferences.)

tewalds · 2022-08-15T12:02:05Z

open_spiel/games/2048.cc

+  return BoardAt(r, c).value == 0;
+}
+
+Coordinate GetVector(int direction) {


I'm pretty sure you can annotate this function with constexpr (https://en.cppreference.com/w/cpp/language/constexpr), which allows it to be evaluated at compile time instead of at run time. It might be simple enough that the compiler will do that anyway, but if not, this could speed things up slightly.
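For illustration, a constexpr direction lookup could look roughly like the sketch below. The `Coordinate` layout, the value of `kMoveUp`, and the convention that rows grow downward are assumptions for this example; only the other three direction constants appear in the diff. The `switch` inside a constexpr function requires C++14 or later.

```cpp
#include <cassert>

// Assumed layout; the PR's Coordinate type may differ.
struct Coordinate {
  int row;
  int column;
};

constexpr int kMoveUp = 0;  // Assumed; not shown in the reviewed hunk.
constexpr int kMoveRight = 1;
constexpr int kMoveDown = 2;
constexpr int kMoveLeft = 3;

// Maps a direction to a unit step, assuming row 0 is the top of the board.
constexpr Coordinate GetVector(int direction) {
  switch (direction) {
    case kMoveUp:    return {-1, 0};
    case kMoveRight: return {0, 1};
    case kMoveDown:  return {1, 0};
    case kMoveLeft:  return {0, -1};
    default:         return {0, 0};
  }
}

// Usable in constant expressions, so this check costs nothing at run time.
static_assert(GetVector(kMoveUp).row == -1, "up decreases the row index");

// Ordinary run-time use: the cell one step away in the given direction.
Coordinate Neighbour(Coordinate from, int direction) {
  assert(direction >= kMoveUp && direction <= kMoveLeft);
  const Coordinate step = GetVector(direction);
  return {from.row + step.row, from.column + step.column};
}
```

Note that constexpr only permits compile-time evaluation; a call with a run-time argument, as in `Neighbour`, is still an ordinary function call that the optimizer may or may not inline.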

tewalds · 2022-08-15T13:52:02Z

open_spiel/games/2048.cc

+                         GameType::ChanceMode::kExplicitStochastic,
+                         GameType::Information::kPerfectInformation,
+                         GameType::Utility::kGeneralSum,
+                         GameType::RewardModel::kTerminal,


This is no longer a terminal reward, right?

Jazeem · 2022-08-15

My bad, I forgot to change this. The code was initially written as a win/loss game. Fixed now.

tewalds · 2022-08-15T14:00:12Z

open_spiel/games/2048.cc

+void TwoZeroFourEightState::DoApplyAction(Action action) {
+  if (IsChanceNode()) {
+    // The original 2048 game starts with two random tiles
+    if (!extra_chance_turn_) {


Ah, I think you're right. This is needed due to the explicit stochastic actions. Can you add that as a comment? If the action was applied implicitly then you wouldn't need this.

tewalds · 2022-08-15T14:23:08Z

open_spiel/games/2048.cc

+constexpr int kMoveRight = 1;
+constexpr int kMoveDown = 2;
+constexpr int kMoveLeft = 3;
+inline const std::vector<Action> kPlayerActions() {


You could statically define a std::array<Action, 4> instead of generating a vector each time this is called (which happens in a tight loop in a few places!)

Jazeem · 2022-08-15

LegalActions() returns a vector, which is why I kept this as a vector too.
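The two positions above are compatible. One possible arrangement, sketched here under assumptions, keeps a compile-time array for internal loops and copies it into a vector only where a vector must be returned. The `Action` alias is a stand-in for OpenSpiel's type, and `kPlayerActionArray`, `IsPlayerAction`, and `LegalActionsSketch` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

using Action = std::int64_t;  // Stand-in for OpenSpiel's Action type.

// Built once at compile time; iterating over it never allocates.
constexpr std::array<Action, 4> kPlayerActionArray = {0, 1, 2, 3};

// Hot-path style helper: loops over the array directly.
bool IsPlayerAction(Action action) {
  for (Action candidate : kPlayerActionArray) {
    if (candidate == action) return true;
  }
  return false;
}

// Copies into a vector only at the boundary where one has to be returned.
std::vector<Action> LegalActionsSketch(bool is_terminal) {
  if (is_terminal) return {};
  return std::vector<Action>(kPlayerActionArray.begin(),
                             kPlayerActionArray.end());
}
```

The single copy in `LegalActionsSketch` still allocates, but inner loops that only need to enumerate the four moves avoid building a temporary vector on each iteration.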

tewalds · 2022-08-15T14:24:44Z

open_spiel/games/2048.cc

+  current_player_ = 0;
+  for (int r = 0; r < kRows; r++) {
+    for (int c = 0; c < kColumns; c++) {
+      SetBoard(r, c, Tile(board_seq[r * kRows + c], false));


I'm pretty sure this should be r * kColumns + c. It doesn't matter now because kRows == kColumns, but this is a bug waiting to happen.

Jazeem · 2022-08-15

Good catch. Yes, you are correct. Fixed it now.
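A small sketch shows why the row stride must be the number of columns. The board here is deliberately non-square (2 rows by 3 columns, values chosen only for this example) so that the difference becomes visible; `BuggyIndex` and `FlatIndex` are hypothetical names.

```cpp
#include <cassert>

constexpr int kRows = 2;     // Example values, chosen to be non-square.
constexpr int kColumns = 3;

// Stride of kRows: only correct when the board happens to be square.
constexpr int BuggyIndex(int r, int c) { return r * kRows + c; }

// Row-major layout: each row occupies kColumns consecutive slots.
int FlatIndex(int r, int c) {
  assert(r >= 0 && r < kRows && c >= 0 && c < kColumns);
  return r * kColumns + c;
}

// With the wrong stride, two different cells land on the same slot.
static_assert(BuggyIndex(0, 2) == BuggyIndex(1, 0), "cells collide");
```

With the correct stride, the six cells map to the indices 0 through 5 with no overlap.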

tewalds · 2022-08-15T14:30:09Z

open_spiel/games/2048.cc

+      reverse(y.begin(), y.end());
+      break;
+    case kMoveLeft:
+    case kMoveDown:


Is this right? Shouldn't kMoveLeft and kMoveDown do different things? As in shouldn't one reverse x and the other y? Or can you add a comment about what this does?

Jazeem · 2022-08-15

True. Simplified this.

tewalds · 2022-08-15T15:27:48Z

open_spiel/games/2048.cc

+    return {};
+  }
+  if (IsChanceNode()) {
+    return LegalChanceOutcomes();


LegalChanceOutcomes doesn't seem to be defined anywhere. Is this ChanceOutcomes()?

Jazeem · 2022-08-15

I was referencing backgammon.cc when writing this. As I understand it, LegalChanceOutcomes() is defined in spiel.h and uses the output of ChanceOutcomes() to return a vector<Action>.
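The relationship described above can be sketched as follows. This is not the code from spiel.h; the type aliases are stand-ins and `ActionsFromOutcomes` is a hypothetical name. The sketch only shows the idea of dropping the probabilities from a list of (action, probability) pairs.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using Action = std::int64_t;  // Stand-in for OpenSpiel's Action type.
using ActionsAndProbs = std::vector<std::pair<Action, double>>;

// Keeps only the action of each (action, probability) pair, in order.
std::vector<Action> ActionsFromOutcomes(const ActionsAndProbs& outcomes) {
  std::vector<Action> actions;
  actions.reserve(outcomes.size());
  for (const auto& outcome : outcomes) {
    assert(outcome.second >= 0.0 && outcome.second <= 1.0);  // Sanity check.
    actions.push_back(outcome.first);
  }
  return actions;
}
```

With such a default in the base class, a game only needs to define its chance outcomes once, and the list of legal actions at a chance node follows from it.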

lanctot · 2022-08-17T10:57:50Z

Hi @Jazeem, something changed in the macOS behavior on GitHub Actions. Can you apply these changes to trigger the tests again: https://github.com/deepmind/open_spiel/pull/907/files

lanctot · 2022-08-17T11:38:34Z

Yeah, it'll take more than just that one line; now there seem to be issues with the Jax versions... :(

Keep an eye on #907, might only be resolved tomorrow.

Jazeem · 2022-08-17T11:51:43Z

> Keep an eye on #907, might only be resolved tomorrow.

Sure. Will do 👍

lanctot · 2022-08-17T16:00:00Z

> > Keep an eye on #907, might only be resolved tomorrow.
>
> Sure. Will do 👍

OK, tests are passing, so you just need to update the Jax etc. package versions (a one-line change in install.sh).

lanctot · 2022-08-18T08:31:55Z

@Jazeem thanks, looking good! Are you done fixing things up for the moment?

I've contacted @tewalds who will review the import, would be good to pull this one in now if you're done making changes.

Jazeem · 2022-08-18T09:48:19Z

@lanctot
Yes, done fixing things up for now.

lanctot · 2022-08-20T11:44:22Z

Hi @Jazeem, it's now been merged, but a fair bit changed during internal review. Please take a look through it and let us know what you think. You can still do the almost-infinite version by setting max_tile to a very high value.

Jazeem · 2022-08-22T10:31:04Z

@lanctot It looks great 👍

The only thing I noticed is that if a player/agent makes a move that doesn't change anything on the board, then additional chance tiles are not added. There could be a few inconsequential moves like that, so I'm not sure whether those should be counted in:

int MaxGameLength() const override { return 2 * 2 * max_tile_; }

Jazeem added 16 commits July 27, 2022 16:35

2048 cloned from Checkers

d23b557

ActionToString modified

3c6b520

2048 game logic added

2414598

Removed unused code

4c22e10

Playthrough added

ef4445e

Return empty list of actions if terminal state is reached

e79fa27

2048 added in pyspiel_test

88bacca

Bugfix: Multiple merges happening in a single move by introducing struct Tile

8319c8e

Bugfix: TileMatchesAvailable was looking out of bounds and causing the game to never end

c51f4d0

Rows and columns made constant

6e956da

Fixed existing tests

bf1509b

Removed TurnHistoryInfo

1b67333

New test cases added

c33e7b1

ObservationTensor added, Bugfix that allowed multiple mergers in one turn

1fac050

Code made readable

3f17e38

Line length limited to 80

1c73ea9

lanctot mentioned this pull request Jul 30, 2022

Call for New Games #843

Open

2048 added to games.md

a3f41eb

Jazeem added 3 commits August 3, 2022 15:47

Bugfix: Random tiles were appearing even when board is unchanged after player move

25c226e

Start the game with 2 random tiles

256ecc3

Intermediate rewards added

1d5cbce

Jazeem added 5 commits August 13, 2022 12:19

Removed unnecessary StrCat

6fcc37e

SetTileIsMerged method added

d24b4ed

Minor data type changes

ef5aa22

direction_diff initialisation taken out of loop

016950b

Introduced BoardAt method that takes Coordinate

397a43d

tewalds requested changes Aug 15, 2022

Jazeem added 6 commits August 16, 2022 00:03

GetVector made constexpr

7453a7c

RewardModel changed from kTerminal to kRewards

b9024c4

Comments added

86d13d1

Fixed logic in SetCustomBoard

e004db0

BuildTraversals simplified

4d692e2

UndoAction removed

8a5ab94

lanctot mentioned this pull request Aug 17, 2022

Update install.sh to avoid CI failure #907

Merged

Update install.sh to avoid CI failure

e1029f3

Jazeem added 2 commits August 18, 2022 11:11

Update jax versions

8efaa64

Update python_extra_deps.sh

df84e89

lanctot approved these changes Aug 18, 2022

OpenSpiel merged commit f4121ea into google-deepmind:master Aug 20, 2022
