# Our Library

### Neural Network Training Library

We made a Python library that lets us train AI by actually playing Pokémon! The library handles everything: watching the game, picking moves, and learning from mistakes—so the AI gets better all by itself.

### Rust-based Emulator with PyO3 Bindings

We souped up a super-fast Game Boy Advance emulator (written in Rust) so it can chat with Python. Thanks to PyO3, Python can now:
- **See what’s happening:** Grab all the juicy game details—Pokémon stats, battle info, and more—and send them to the AI.
- **Control the action:** Let the AI pick moves or switch Pokémon, and actually make those things happen in-game.


### Custom Pokémon Disassembly for RL

We also hacked the Pokémon Emerald game code so it can:
- **Understand the AI’s commands** sent from Python.
- **Skip all the boring stuff** (like graphics and text) so training is way, way faster.

---

# The tutorial
In this tutorial, we will train a neural network step by step with our library.
## Goals
 - Train a small model with MARL (multi-agent reinforcement learning) to be the best at 1v1 battles
 - Watch the performance of our model
 - Export the model to ONNX (needed for the next tutorial: running the model on the GBA)
 

## Imports
Make sure that you followed the install steps in README.md correctly.

In [2]:
# Imports for training and interacting with the environment
import sys
sys.path.append("..")  

import numpy as np
import random

# PettingZoo for multi-agent RL environments
from pettingzoo.utils import parallel_to_aec
from pettingzoo.test import parallel_api_test

# Main environment and core components
from pkmn_rl_arena.env.battle_core import BattleCore
from pkmn_rl_arena.env.battle_arena import BattleArena, RenderMode, ReplayBuffer
from pkmn_rl_arena.env.pkmn_team_factory import PkmnTeamFactory
from pkmn_rl_arena.env.observation import ObservationFactory, ObsIdx
from pkmn_rl_arena.paths import PATHS

# Logging and debugging
from pkmn_rl_arena import log

# For RL algorithms and neural networks
import torch
import torch.nn as nn
import torch.optim as optim



## Instantiate
With PyTorch, we can easily create a model, and as you can see, this one is really small. But why keep it so small?
On a regular PC, model size is rarely a problem: the limit mostly depends on your hardware.
However, if you want to export and run the model on the GBA, memory becomes a huge limitation. Here’s the memory usage reported when building **pokeemerald**:
```bash
Memory region         Used Size  Region Size  %age Used
           EWRAM:      251688 B       256 KB     96.01%
           IWRAM:       30416 B        32 KB     92.82%
             ROM:    13334028 B        32 MB     39.74%
```
What does this mean in practice?
- EWRAM → 10,456 bytes (about 10.2 KB) left. This is your real RAM (read–write), similar to system RAM on a PC.
- ROM → about 19.28 MB free. This is slow, read-only memory, but it’s where we can store the model’s weights.
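
The free-space figures follow directly from the report. Here is a small sketch of the arithmetic, with the numbers copied from the report above (`free_bytes` is just an illustrative helper name, not part of the library):

```python
# Sizes taken from the memory report (1 KB = 1024 B, 1 MB = 1024 KB).
KB = 1024
MB = 1024 * KB

regions = {
    # name: (used bytes, region size in bytes)
    "EWRAM": (251_688, 256 * KB),
    "IWRAM": (30_416, 32 * KB),
    "ROM": (13_334_028, 32 * MB),
}

def free_bytes(used: int, size: int) -> int:
    """Bytes still available in a memory region."""
    return size - used

for name, (used, size) in regions.items():
    free = free_bytes(used, size)
    print(f"{name}: {free} B free ({100 * used / size:.2f}% used)")
```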

If we quantize the model to int8 (one byte per value), then:
- We can store up to ~20 million parameters in ROM.
- But RAM is extremely limited: only 10,456 int8 values can fit into EWRAM.
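
This tutorial does not show how the quantization itself is done. As an illustration only, here is a minimal sketch of one common scheme, symmetric per-tensor int8 quantization, written in plain Python. The function names and the example weights are assumptions for the sake of the example, not the library's API:

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> (int8 values, scale)."""
    max_abs = max(abs(w) for w in weights)
    # Map the largest magnitude to 127; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Approximate reconstruction of the original floats."""
    return [v * scale for v in q]

# Made-up example weights (not taken from a trained model).
weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
print(q, scale)
print(dequantize(q, scale))
```

Each quantized value takes one byte, which is where the "one parameter per byte of ROM" estimate comes from. The single float `scale` per tensor is the only extra storage in this scheme.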

That’s why we need to be very careful. For each node n in the model, the input and output buffers both sit in RAM while the node runs, so the sum of their sizes (counted in int8 values) must stay below this RAM limit:
$$
\text{input}_n + \text{output}_n < 10{,}456
$$
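
To make the constraint concrete, here is a rough sketch that checks it layer by layer. It assumes one byte per value (int8) and uses hypothetical layer sizes: the real `obs_size` and `action_size` come from the environment and are not known here.

```python
EWRAM_FREE_BYTES = 10_456  # free EWRAM from the memory report, 1 byte per int8 value

def check_layers(layers, budget=EWRAM_FREE_BYTES):
    """Check input_n + output_n < budget for each (input_size, output_size) pair.

    Returns a list of (index, input_size, output_size, total, fits) tuples.
    """
    report = []
    for i, (n_in, n_out) in enumerate(layers):
        total = n_in + n_out
        report.append((i, n_in, n_out, total, total < budget))
    return report

# Hypothetical sizes: 200 observation features, 128 hidden units, 10 actions.
layers = [
    (200, 128),  # first Linear layer
    (128, 128),  # ReLU (same size in and out)
    (128, 10),   # output Linear layer
]

for i, n_in, n_out, total, fits in check_layers(layers):
    status = "ok" if fits else "TOO BIG"
    print(f"node {i}: {n_in} + {n_out} = {total} -> {status}")
```

For a PyTorch `nn.Linear` layer, the two sizes can be read from its `in_features` and `out_features` attributes.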

![My model architecture](./assets/gba-archi-model.png)


In [None]:
# Instantiate the environment
core = BattleCore(PATHS["ROM"], PATHS["BIOS"], PATHS["MAP"])
env = BattleArena(core)

# Example: get observation and action space sizes
obs = env.reset()[0]
obs_size = obs["player"]["observation"].shape[0]
action_size = env.action_manager.action_space_size

# Define a simple DQN agent for each player
class DQN(nn.Module):
    def __init__(self, obs_size, action_size):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_size, 128),
            nn.ReLU(),
            nn.Linear(128, action_size)
        )
    def forward(self, x): 
        return self.net(x)

agents = {
    agent: DQN(obs_size, action_size)
    for agent in env.possible_agents
}

# Example: set up optimizers for each agent
optimizers = {
    agent: optim.Adam(agents[agent].parameters(), lr=1e-3)
    for agent in env.possible_agents
}

print("Environment and agents initialized.")

WARN [rustboyadvance_utils::elf] ELF: skipping program header ProgramHeader { p_type: "PT_LOAD", p_flags: 0x6, p_offset: 0x1000, p_vaddr: 0x2000000, p_paddr: 0x2000000, p_filesz: 0x3d728, p_memsz: 0x3d728, p_align: 4096 }
WARN [rustboyadvance_utils::elf] ELF: skipping program header ProgramHeader { p_type: "PT_LOAD", p_flags: 0x6, p_offset: 0x0, p_vaddr: 0x3000000, p_paddr: 0x3000000, p_filesz: 0x0, p_memsz: 0x76d0, p_align: 4096 }
INFO [rustboyadvance_utils::elf] ELF: loading segment phdr: ProgramHeader { p_type: "PT_LOAD", p_flags: 0x7, p_offset: 0x3f000, p_vaddr: 0x8000000, p_paddr: 0x8000000, p_filesz: 0xcb760c, p_memsz: 0xcb760c, p_align: 4096 } range 0x3f000..0xcf660c vec range 0x8000000..0x8cb760c
INFO [rustboyadvance_core::cartridge::builder] Loaded ROM: CartridgeHeader { game_title: "POKEMON EMER", game_code: "BPEE", maker_code: "01", software_version: 0, checksum: 114 }
INFO [rustboyadvance_core::cartridge::b

Adding stop address: addr=33556836, value=1, is_active=true, name=stopHandleTurnCreateTeam, id=0
Adding stop address: addr=33556838, value=1, is_active=true, name=stopHandleTurn, id=1
Adding stop address: addr=33556832, value=1, is_active=true, name=stopHandleTurnPlayer, id=2
Adding stop address: addr=33556834, value=1, is_active=true, name=stopHandleTurnEnemy, id=3
Adding stop address: addr=33556840, value=1, is_active=true, name=stopHandleTurnEnd, id=4


INFO [rustboyadvance_core::sound] bias - setting sample frequency to 32768hz
INFO [rustboyadvance_core::sound] bias - setting sample frequency to 32768hz
INFO [rustboyadvance_core::sound] MSE enabled!
INFO [rustboyadvance_core::sound] bias - setting sample frequency to 65536hz
INFO [rustboyadvance_core::sound] bias - setting sample frequency to 65536hz
INFO [rustboyadvance_core::mgba_debug] mGBA log enabled: true
INFO    [2628238138::<module>]  Created save_state : ['boot_state_turntype:0_step:1_id:0.savestate']
DEBUG   [2628238138::<module>]  Resetting env with options {'save_state': 'boot_state', 'teams': None}
INFO    [battle_arena::load_save_state]  Loading save state : boot_state
DEBUG   [battle_arena::load_save_state]  Given state without file ext, attempting to load 1st state whose name matches regex boot_state.+
INFO    [save_state::load_state]  Loading following save state : /home/wboussella/Documents/rl_new_pokemon_ai/rl_new_pokemo

Environment and agents initialized.


DEBUG   [763603381::<module>]  Resetting env with options {'save_state': 'boot_state', 'teams': None}
INFO    [battle_arena::load_save_state]  Loading save state : boot_state
DEBUG   [battle_arena::load_save_state]  Given state without file ext, attempting to load 1st state whose name matches regex boot_state.+
INFO    [save_state::load_state]  Loading following save state : /home/wboussella/Documents/rl_new_pokemon_ai/rl_new_pokemon_ai/pkmn_rl_arena/../savestate/boot_state_turntype:0_step:1_id:0.savestate
WARN [rustboyadvance_utils::elf] ELF: skipping program header ProgramHeader { p_type: "PT_LOAD", p_flags: 0x6, p_offset: 0x1000, p_vaddr: 0x2000000, p_paddr: 0x2000000, p_filesz: 0x3d728, p_memsz: 0x3d728, p_align: 4096 }
DEBUG   [battle_core::load_savestate]  Creating battlestate from /home/wboussella/Documents/rl_new_pokemon_ai/rl_new_pokemon_ai/pkmn_rl_arena/../savestate/boot_state_turntype:0_step:1_id:

Using device: cpu

DEBUG   [763603381::<module>]  BattleState(id=0, step=2, turn=<TurnType.GENERAL: 1>)
DEBUG   [763603381::<module>]  BattleState(id=0, step=3, turn=<TurnType.GENERAL: 1>)
DEBUG   [763603381::<module>]  BattleState(id=0, step=4, turn=<TurnType.ENEMY: 3>)
DEBUG   [763603381::<module>]  BattleState(id=0, step=5, turn=<TurnType.GENERAL: 1>)
DEBUG   [763603381::<module>]  BattleState(id=0, step=6, turn=<TurnType.GENERAL: 1>)
DEBUG   [763603381::<module>]  BattleState(id=0, step=7, turn=<TurnType.GENERAL: 1>)
DEBUG   [763603381::<module>]  BattleState(id=0, step=8, turn=<TurnType.GENERAL: 1>)
DEBUG   [763603381::<module>]  BattleState(id=0, step=9, turn=<TurnType.GENERAL: 1>)
DEBUG   [763603381::<module>]  BattleState(id=0, step=10, turn=<TurnType.GENERAL: 1>)
DEBUG   [763603381::<module>]  BattleState(id=0, step=11, turn=<TurnType.GENERAL: 1>)
DEBUG   [763603381::<module>]  BattleState(id=0, st


Episode 10/10, Win rate: 0.00, Epsilon: 0.95
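
The `Epsilon: 0.95` in the last log line is what epsilon-greedy exploration would show after a few episodes of decay. The training cell itself is not shown here, so the following is only a sketch of that technique under assumed values: a start of 1.0, a multiplicative decay of 0.995 per episode, and a floor of 0.05. `select_action` and `decay_epsilon` are hypothetical helpers, not part of `pkmn_rl_arena`.

```python
import random

def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: random action with probability epsilon, else the best one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def decay_epsilon(epsilon, decay=0.995, minimum=0.05):
    """Shrink epsilon a little after each episode, never going below `minimum`."""
    return max(minimum, epsilon * decay)

epsilon = 1.0  # assumed starting value: fully random at first
for episode in range(10):
    # ... one episode of play would go here, calling select_action each turn ...
    epsilon = decay_epsilon(epsilon)

print(f"Epsilon after 10 episodes: {epsilon:.2f}")
```

With these assumed values, the agent still acts almost entirely at random after 10 episodes, so a win rate near zero this early is not surprising.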
