Check details in Jidi Environments
Tags: discrete action space, discrete observation space
Environment Introduction: Agents play a football game. In this series of games, agents from two sides play football against each other, and the aim is to score more goals than the opponent to win.
Environment Rules:
- This game has two sides (teams). Each side controls all 11 players in 11-player teams (11 vs 11 scenario), or the 4 outfield players, excluding the goalkeeper, in 5-player teams (5 vs 5 scenario). The rules are similar to official football (soccer) rules, including offsides and yellow and red cards, with some small differences.
- The game consists of two halves of 45 minutes (1500 steps) each, 3000 steps in total. The kick-off at the beginning of each half is taken by a different team, but there is no side swap (the game is fully symmetrical).
- Teams do not switch sides during the game. Left/right sides are assigned randomly.
- Reward: each team obtains a +1 reward when scoring a goal, and a −1 reward when conceding one to the opposing team.
- Non-cup scoring rules apply, i.e. the team that scores more goals wins; otherwise it is a draw.
- No walkover is applied when the number of players on a team drops below 7. There are no substitute players and no extra time. The game ends after 3000 steps.
Action space: a list of length n_action_dim (here n_action_dim=1), where each element is a Gym Discrete object, e.g. [Discrete(19)]. The action input to the environment is a matrix of size 1*19, where 1 is the dimension of the action space and 19 is the number of possible action values (the action is supposed to be a one-hot vector inside a list, like [action_list], with action_list=[1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]). At each step, the agent chooses one of the 19 actions (numbered 0 to 18) from the default action set and sets the corresponding index of action_list to 1.
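The one-hot encoding described above can be sketched as a small helper; the function name `encode_action` is ours for illustration, not part of the environment API:

```python
def encode_action(action_id, n_actions=19):
    """Wrap a discrete action id (0-18) into the [action_list] format
    expected by the environment: a list containing one one-hot vector."""
    if not 0 <= action_id < n_actions:
        raise ValueError(f"action_id must be in [0, {n_actions - 1}]")
    action_list = [0] * n_actions
    action_list[action_id] = 1
    return [action_list]  # size 1 * 19

# encode_action(0) -> [[1, 0, 0, ..., 0]]
```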
Observation space: the observation is a dictionary containing the keys "obs" and "controlled_player_index". The value of "obs" is itself a dictionary. "controlled_player_index" is the index of the controlled agent: agents in the left team are integers from 0 to 10 and agents in the right team are integers from 11 to 21.
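As a minimal sketch of reading this dictionary (the helper name `which_team` is an assumption; only the keys "obs" and "controlled_player_index" are stated above):

```python
def which_team(observation):
    """Return 'left' or 'right' for the controlled agent,
    based on its index: 0-10 is the left team, 11-21 the right."""
    idx = observation["controlled_player_index"]
    return "left" if 0 <= idx <= 10 else "right"
```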
Reward: each team obtains a +1 reward when scoring a goal, and a −1 reward when conceding one to the opposing team.
Environment end conditions: Game ends after 3000 steps.
Evaluation Guide: During verification and evaluation, the platform runs the user code on a single-core CPU (GPU is currently not supported), limits the time to return an action in each step to at most 2 s, and limits memory to at most 500 MB. Leaderboard scores are calculated and ranked according to the average score of the latest 30 games. In competition evaluation, all submissions are evaluated in Swiss-system rounds.
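Since the platform enforces the 2-second per-step limit, it can be useful to watch your policy's latency locally. This wrapper is purely illustrative (not part of the platform API) and assumes any callable policy:

```python
import time

def timed_step(policy_fn, observation, budget=2.0):
    """Call the policy and warn if it approaches the per-step time limit."""
    start = time.monotonic()
    action = policy_fn(observation)
    elapsed = time.monotonic() - start
    if elapsed > 0.8 * budget:
        print(f"warning: step took {elapsed:.2f}s of the {budget:.0f}s budget")
    return action
```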
conda create -n football python=3.8.5
conda activate football
pip install -r requirements.txt
Install the gfootball environment, then copy the
env/football_scenarios/malib_5_vs_5.py
file into a folder like ~/anaconda3/envs/env_name/lib/python3.x/site-packages/gfootball/scenarios
when using environment env_name,
or ~/anaconda3/lib/python3.x/site-packages/gfootball/scenarios
when using the base environment.
python run_log.py
You can test your submission locally. On the Jidi platform, we evaluate your submission the same way as run_log.py.
For example,
python run_log.py --env "football_11_vs_11_stochastic" --my_ai "random" --opponent "random"
in which you control agent 1, shown in green.
- Random policy --> agents/random/submission.py
- RL policy --> all files in agents/football_5v5_mappo or all files in agents/football_11v11_mappo
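A random policy in the spirit of agents/random/submission.py can be sketched as below. The `my_controller` entry-point name and signature are assumptions based on the Jidi submission convention; check the sample agent for the exact interface:

```python
import random

def my_controller(observation, action_space, is_act_continuous=False):
    """Return a random one-hot action in the [action_list] format
    (19 discrete actions, exactly one index set to 1)."""
    n_actions = 19
    action_list = [0] * n_actions
    action_list[random.randrange(n_actions)] = 1
    return [action_list]
```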