
rl_env.py


File description

Contains the abstract Environment and Agent classes, as well as the HanabiEnv class.

HanabiEnv class

This class is the reinforcement learning interface to DeepMind's Hanabi environment.

Usage

import rl_env

environment = rl_env.make()
config = {'players': 5}
observation = environment.reset(config)
done = False
while not done:
    # Agent takes an action
    action = ...
    # Environment takes a step
    observation, reward, done, info = environment.step(action)

Observation

Each player's observation is a dictionary; the specific keys and values are documented in rl_env.py.
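
Below is a minimal sketch of inspecting the returned observation. The top-level keys 'current_player' and 'player_observations' are assumptions based on rl_env.py; verify them against your copy of the file.

import rl_env

environment = rl_env.make()
observations = environment.reset({'players': 2})
# Assumed top-level keys: 'current_player' and 'player_observations'.
print(observations['current_player'])
for player_obs in observations['player_observations']:
    # Each entry is one player's observation dictionary.
    print(sorted(player_obs.keys()))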

Action History

Stored in the HanabiEnv class's self.hist attribute.
The action history format for two players is as follows:
[Agent0's history, Agent1's history]

Agent0's history:
[first_move_done_by_agent0, second_... , ... , most_recent_move_done_by_agent0]
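
As a sketch (assuming environment is a HanabiEnv instance and that self.hist follows the layout above), the most recent move recorded for each agent can be read like this:

# Sketch: read each agent's most recent recorded move from self.hist.
# Assumes `environment` is a HanabiEnv and self.hist is
# [Agent0's history, Agent1's history] as described above.
for agent_id, agent_history in enumerate(environment.hist):
    if agent_history:  # the agent has made at least one move
        print('Agent {} last move: {}'.format(agent_id, agent_history[-1]))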

Move encoding for a 2-player game:
Moves are encoded as dictionaries with the following fields:
action_type = {'PLAY', 'DISCARD', 'REVEAL_COLOR', 'REVEAL_RANK'}
card_index = {0, 1, 2, 3, 4} # Index of the card that was played or discarded.
color = {'B', 'G', 'R', 'W', 'Y'} # Color of the hinted card(s).
rank = {0, 1, 2, 3, 4} # Rank (number) of the hinted card(s).
target_offset = {1} # Offset of the agent targeted by the hint; in a 2-player game the only other player is at offset 1.
indices_affected = {0, 1, 2, 3, 4} # Positions in the hinted player's hand affected by the hint; can be more than one.
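
For illustration, here are two hypothetical history entries built from the fields above; the key set of an entry depends on its action_type:

# Hypothetical example entries (values chosen for illustration only).
play_move = {'action_type': 'PLAY', 'card_index': 2}
hint_move = {'action_type': 'REVEAL_COLOR',
             'color': 'R',
             'target_offset': 1,
             'indices_affected': [0, 3]}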