# __`POKE-ENV` RL MAIN NOTEBOOK__

This notebook serves as an easy way to connect with the `pokemon-showdown-master` module. It implements a reinforcement learning (RL) approach to Pokemon battling on a local Pokemon Showdown server.

> Note: A local Pokemon Showdown server is required to run this code. For more details, see the [server README](https://github.com/smogon/pokemon-showdown/blob/master/server/README.md).

## These are the imports needed for the notebook to function as intended.
Use `pip install poke-env` to install poke-env.

`Create_teams.py` should be included in the repository and can be found [here](https://github.com/TrevorKWalker/Poke-AI)

In [1]:
# Imports needed for the notebook to function as intended.
from poke_env.player.player import Player
from poke_env import RandomPlayer
from poke_env.ps_client.server_configuration import ServerConfiguration
from poke_env import AccountConfiguration
import asyncio
import nest_asyncio 
import Create_teams
import simple_rl_player

import numpy as np
from gymnasium.spaces import Box, Space
from poke_env.data import GenData
import stable_baselines3
from stable_baselines3.common.env_checker import check_env

from stable_baselines3.common.callbacks import CheckpointCallback
from poke_env.environment.abstract_battle import AbstractBattle
from poke_env.player.openai_api import OpenAIGymEnv
from poke_env.player.env_player import Gen8EnvSinglePlayer

### Global Constants
These globals mainly relate to the server you are hosting. Check that they are correct for your system.

In [2]:
# Name of the account that the bot uses. Should only be necessary for challenging human players.
my_account_config = AccountConfiguration("175bot", "pokeai")

# The address of the server that you are hosting. 
server_config = ServerConfiguration(
        websocket_url="ws://localhost:8000/showdown/websocket",  # WebSocket URL for your local server
        authentication_url="http://localhost:8000",           # Authentication URL (often the same as server URL)
    )
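Before running the battle cells, it can help to confirm that something is actually listening at the configured address. The following is a minimal sketch using only the standard library; the helper name `server_is_reachable` is hypothetical and not part of poke-env, and it only checks that a TCP connection can be opened, not that the process behind the port is really a Pokemon Showdown server.

```python
import socket
from urllib.parse import urlparse


def server_is_reachable(websocket_url: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds.

    Hypothetical helper for illustration only; it does not speak the
    Showdown protocol, it just checks that the port accepts connections.
    """
    parsed = urlparse(websocket_url)
    host = parsed.hostname or "localhost"
    port = parsed.port or 80  # fall back to 80 if the URL has no explicit port
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Same URL as the websocket_url used in server_config above.
    print(server_is_reachable("ws://localhost:8000/showdown/websocket"))
```

If this prints `False`, the local server is probably not running or is listening on a different port than the one in `server_config`.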

# Making a player for a gen 9 random battle

We must make two players before they can battle each other. Here we create two `RandomPlayer` instances from poke-env.
`battle_against` lets one bot send another bot a challenge and battle it. It takes the parameters `opponent` (the other player object, as in the cell below) and `n_battles: int`.
Because both players are `RandomPlayer` instances created without a `battle_format`, they will play a Gen 9 random battle (at the time of writing).

If the code runs, the outcome will be visible on the local server that is being hosted.

In [4]:
player_1 = RandomPlayer()
player_2 = RandomPlayer()
await player_1.battle_against(player_2, n_battles=1)

# Creating teams using `Create_teams.py`
`create_teams` takes one parameter, `directory`, which is the path to the folder containing the teams. Each team is stored in a separate .txt file in Pokemon Showdown export format.
The easiest way to make a new team is to build it in the [Pokemon Showdown teambuilder](https://play.pokemonshowdown.com/teambuilder) and then export it as text.
Each team must be in the Pokemon Showdown export format, or the server will stall later when the validator rejects the team.
Depending on the teams that you make, you may need to change the `battle_format` in later sections.
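To make the expected folder layout concrete, here is a minimal sketch of a loader with the behavior described above: one team per .txt file, returned as a list of strings. The function name `load_teams` is hypothetical, and the real `Create_teams.create_teams` may be implemented differently (for example, it may order or post-process the teams in another way).

```python
from pathlib import Path


def load_teams(directory: str) -> list[str]:
    """Read every .txt file in `directory` and return its text as one team.

    Illustrative stand-in only; not the actual Create_teams implementation.
    Files are read in sorted name order so that indices are reproducible.
    """
    teams = []
    for path in sorted(Path(directory).glob("*.txt")):
        text = path.read_text(encoding="utf-8").strip()
        if text:  # skip empty files instead of passing an empty team along
            teams.append(text)
    return teams
```

Each returned string can then be passed as the `team` argument of a player, in the same way the notebook passes `Early_game_teams[0]`.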

In [3]:
# Competitive teams taken from past top-placing teams in format H
Competitive_teams = Create_teams.create_teams("./Teams/Competitive")
Num_competitive_teams = len(Competitive_teams)


# Teams of Pokemon that are set to lvl 50. All are teams of 3 from different generations, with the movesets they would have at lvl 15 (lvl 15 was chosen because that is the average lvl cap of the first gym in nuzlockes).
Early_game_teams = Create_teams.create_teams("./Teams/In-Game/Early_game")
Num_early_game_teams = len(Early_game_teams)

## Making Custom Player Class

To start, we will use `MaxDamagePlayer`, a simple player that always chooses the move with the highest base power. This is about the most basic strategy possible and is used here only to better understand the poke-env module.

In [4]:

class MaxDamagePlayer(Player):
    def choose_move(self, battle):
        # Choose the move with the highest base power
        if battle.available_moves:
            best_move = max(battle.available_moves, key=lambda move: move.base_power)

            # Terastallize if possible
            if battle.can_tera:
                return self.create_order(best_move, terastallize=True)

            return self.create_order(best_move)
        else:
            return self.choose_random_move(battle)

    def choose_team_preview(self, battle):
        # For simplicity, send the first Pokémon in the team
        return "/team 1"
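The selection rule inside `choose_move` can be tried without a server. The sketch below isolates it using a stand-in `FakeMove` dataclass (hypothetical, not a poke-env class) that carries only the `base_power` attribute the rule reads.

```python
from dataclasses import dataclass


@dataclass
class FakeMove:
    """Stand-in for a poke-env move; only the fields used here are modeled."""
    name: str
    base_power: int


def pick_highest_base_power(moves):
    """Return the move with the highest base power, or None if there are none."""
    if not moves:
        return None
    return max(moves, key=lambda move: move.base_power)


moves = [
    FakeMove("Protect", 0),
    FakeMove("Earthquake", 100),
    FakeMove("Dragon Claw", 80),
]
print(pick_highest_base_power(moves).name)  # Earthquake
```

Note that this rule looks only at base power. It ignores type effectiveness, same-type attack bonus, and status moves, which is part of why it is useful mainly as a baseline.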


### Creating a player for `MaxDamagePlayer`

To create a player that uses `MaxDamagePlayer` and battles with one of our teams, we must assign it a battle format other than random battles. Available battle formats are listed in `config/formats.ts`.
All `Competitive` teams can be played in Gen 9 OU, but because the `In-Game` teams need `gen9balancedhackmons`, we will use that format.

In [8]:
# Create player_1 with the right battle_format and give them a team
player_1 = MaxDamagePlayer(battle_format="gen9balancedhackmons", team=Early_game_teams[0])


# Create player_2 with the right battle_format and give them a team
player_2 = MaxDamagePlayer(battle_format="gen9balancedhackmons", team=Early_game_teams[1])


# Have them battle. Check the local server to see the results.
await player_1.battle_against(player_2, n_battles=1)


# Challenging the Human

This section covers how to send a challenge to a human player. The human must also be on an account that is connected to the local server. We will use `send_challenges`, which is similar to `battle_against` but for humans.

IMPORTANT: The human is required to have a team that is the same size as the agent's, or the game will hang indefinitely.
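Because a team-size mismatch hangs the game, it may be worth counting the Pokemon in each team before sending a challenge. This is a minimal sketch, assuming teams are in Showdown export format with entries separated by blank lines; `count_team_members` is a hypothetical helper, not part of poke-env or `Create_teams.py`.

```python
import re


def count_team_members(team_text: str) -> int:
    """Count Pokemon in a Showdown export string.

    Assumes each Pokemon is a block of lines and blocks are separated
    by one or more blank (or whitespace-only) lines.
    """
    blocks = re.split(r"\n\s*\n", team_text.strip())
    return len([block for block in blocks if block.strip()])


example_team = """Porygon2 @ Eviolite
Ability: Download
- Recover

Sneasler @ Focus Sash
Ability: Poison Touch
- Close Combat
"""
print(count_team_members(example_team))  # 2
```

Comparing this count for the agent's team against the human's exported team is a cheap check to run before calling `send_challenges`.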

In [None]:
# Create the agent in the same way
player_1 = MaxDamagePlayer(battle_format="gen9balancedhackmons", team=Early_game_teams[0])


# Change opponent to your Pokemon Showdown account name.
await player_1.send_challenges(opponent="KingKylan", n_challenges=1)

### Implementing RL policies
First, we need to make a new custom player class, as before, that will be our RL agent. It is defined in `simple_rl_player.py`.

Next, we need an opponent so that we can test the env.


In [5]:
team = """Porygon2 @ Eviolite  
Ability: Download  
Level: 50  
Tera Type: Ground  
EVs: 252 HP / 4 Atk / 100 Def / 116 SpA / 36 SpD  
Quiet Nature  
- Tera Blast  
- Ice Beam  
- Recover  
- Trick Room  

Sneasler @ Focus Sash  
Ability: Poison Touch  
Level: 50  
Tera Type: Flying  
EVs: 252 Atk / 4 SpD / 252 Spe  
Jolly Nature  
- Close Combat 
- Coaching  
- Protect  

Gholdengo @ Life Orb  
Ability: Good as Gold  
Level: 50  
Tera Type: Water  
EVs: 212 HP / 148 Def / 132 SpA / 12 SpD  
Modest Nature  
IVs: 0 Atk  
- Make It Rain  
- Shadow Ball  
- Nasty Plot  
- Protect  

Talonflame @ Sharp Beak  
Ability: Gale Wings  
Level: 50  
Tera Type: Ghost  
EVs: 4 HP / 252 Atk / 252 Spe  
Jolly Nature  
- Dual Wingbeat  
- Tailwind  
- Will-O-Wisp  
- Protect  

Garchomp @ Clear Amulet  
Ability: Rough Skin  
Level: 50  
Tera Type: Fire  
EVs: 44 HP / 204 Atk / 4 Def / 4 SpD / 252 Spe  
Jolly Nature  
- Earthquake  
- Stomping Tantrum  
- Dragon Claw  
- Protect  

Incineroar @ Safety Goggles  
Ability: Intimidate  
Level: 50  
Tera Type: Ghost  
EVs: 252 HP / 36 Atk / 76 Def / 140 SpD  
Brave Nature  
IVs: 29 Spe  
- Flare Blitz  
- Knock Off  
- Fake Out  
- Parting Shot  
"""

In [None]:
player_1 = MaxDamagePlayer(battle_format="gen9balancedhackmons", team = team)

env = simple_rl_player.SimpleRLPlayer(
        battle_format="gen9balancedhackmons", start_challenging=True, opponent=player_1, team = team
    )

# Define Stable-Baselines3 model
model = stable_baselines3.PPO(
    "MlpPolicy",
    env,
    verbose=1,
    learning_rate=1e-3,
    batch_size=32,
    n_epochs=10,
    ent_coef=0.01
)

checkpoint_callback = CheckpointCallback(save_freq=10000, save_path="./models/", name_prefix="poke_ppo")
model.learn(total_timesteps=10000, callback=checkpoint_callback)

# Save the final model
model.save("poke_ppo_final")

Using cpu device
Wrapping the env with a `Monitor` wrapper
Wrapping the env in a DummyVecEnv.
---------------------------------
| rollout/           |          |
|    ep_len_mean     | 31.8     |
|    ep_rew_mean     | -30      |
| time/              |          |
|    fps             | 221      |
|    iterations      | 1        |
|    time_elapsed    | 9        |
|    total_timesteps | 2048     |
---------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 30.6        |
|    ep_rew_mean          | -30         |
| time/                   |             |
|    fps                  | 235         |
|    iterations           | 2           |
|    time_elapsed         | 17          |
|    total_timesteps      | 4096        |
| train/                  |             |
|    approx_kl            | 0.039827824 |
|    clip_fraction        | 0.441       |
|    clip_range           | 0.2         |
|    entropy_loss   

KeyboardInterrupt: 
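For reference, the training budget configured in the PPO cell can be worked out by hand. The sketch below assumes Stable-Baselines3's default rollout length of `n_steps=2048` (consistent with the 2048 timesteps per iteration in the log above) and a single environment; the other values are copied from the cell. It describes a run that completes, whereas the run shown here was interrupted.

```python
import math

# Values taken from the PPO cell above.
total_timesteps = 10_000
batch_size = 32
n_epochs = 10
save_freq = 10_000

# Assumptions: SB3's default rollout length and one (non-vectorized) env.
n_steps = 2048
n_envs = 1

# PPO collects whole rollouts, so it overshoots total_timesteps slightly.
rollout_size = n_steps * n_envs
iterations = math.ceil(total_timesteps / rollout_size)
steps_collected = iterations * rollout_size

# Each rollout is split into minibatches and reused for n_epochs passes.
minibatches_per_epoch = rollout_size // batch_size
gradient_steps_per_iteration = minibatches_per_epoch * n_epochs

# CheckpointCallback saves every save_freq callback calls (one per env step here).
checkpoints_saved = steps_collected // save_freq

print(iterations, steps_collected, gradient_steps_per_iteration, checkpoints_saved)
# 5 10240 640 1
```

With these settings only one intermediate checkpoint is written, close to the end of training, so a smaller `save_freq` may be preferable if runs are likely to be interrupted.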