# Using DQN to Train Atari Donkey Kong

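This notebook trains a DQN (Deep Q-Network) agent to play the Atari game *Donkey Kong* using Stable-Baselines3 and Gymnasium. It covers:
- A vectorized game environment (a single environment here; more can be added for parallel data collection)
- Preprocessing of game frames (grayscale conversion, 84x84 resize, 4-frame stacking) to improve training efficiency
- Experience replay using the uniform replay buffer built into the Stable-Baselines3 `DQN` class (prioritized replay is not provided by that class out of the box)
- Logging of training statistics to stdout and CSV
- Periodic saving of model checkpoints
- Evaluation of the trained agent and recording of a gameplay video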


## 1. Install Required Dependencies

In [None]:
# Install necessary libraries
# Uncomment and run the following line if you haven't installed the dependencies.
# moviepy is listed because gymnasium's RecordVideo wrapper (Section 7) uses it to write video files.
# %pip install stable-baselines3[extra] gymnasium[atari] numpy matplotlib opencv-python tensorboard autorom[accept-rom-license] moviepy

## 2. Import Libraries

In [None]:
import os
import random
import time
from datetime import datetime
import numpy as np
import matplotlib.pyplot as plt
import cv2
from collections import deque

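# Additional imports for the environment and DQN algorithm
import gymnasium as gym
# ale_py is installed by gymnasium[atari]; recent Gymnasium/ale-py releases
# need this import so that the "ALE/..." environment IDs are registered
import ale_py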
from stable_baselines3 import DQN
from stable_baselines3.common.vec_env import DummyVecEnv, VecFrameStack
from stable_baselines3.common.callbacks import CheckpointCallback
from stable_baselines3.common.logger import configure

## 3. Set Up Environment and Preprocessing

In [None]:
# Define a function to preprocess game frames
def preprocess_frame(frame):
    # Convert to grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_RGB2GRAY)
    # Resize to a fixed resolution (e.g., 84x84)
    resized = cv2.resize(gray, (84, 84))
    return resized

# Create a custom wrapper for preprocessing frames
from gymnasium import ObservationWrapper

class PreprocessFrame(ObservationWrapper):
    def __init__(self, env):
        super().__init__(env)
        self.observation_space = gym.spaces.Box(low=0, high=255, shape=(84, 84, 1), dtype=np.uint8)

    def observation(self, obs):
        processed = preprocess_frame(obs)
        return np.expand_dims(processed, axis=-1)

# Set up the environment with the preprocessing wrapper
def create_env():
    env = gym.make("ALE/DonkeyKong-v5", render_mode="rgb_array")
    env = PreprocessFrame(env)
    return env

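# Wrap the environment in DummyVecEnv, which steps its environments sequentially in one process.
# Only one environment is created here; for truly parallel rollouts, pass several
# factories (e.g. [create_env] * 4) to SubprocVecEnv instead.
env = DummyVecEnv([create_env])
# Stack 4 consecutive frames so the agent can infer motion from a single observation
env = VecFrameStack(env, n_stack=4)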
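
To make the effect of frame stacking concrete, here is a minimal, library-free sketch of the idea behind `VecFrameStack(env, n_stack=4)`: keep the most recent frames and hand them to the agent together, so that one observation carries motion information. The `FrameStacker` class and its padding rule (repeating the first frame after a reset) are illustrative assumptions, not the Stable-Baselines3 implementation, which works on NumPy arrays and concatenates along the channel axis.

```python
from collections import deque


class FrameStacker:
    """Hypothetical helper: keeps the last `n_stack` frames (illustration only)."""

    def __init__(self, n_stack=4):
        self.n_stack = n_stack
        self.frames = deque(maxlen=n_stack)

    def reset(self, first_frame):
        # Assumption: pad the stack by repeating the first frame of the episode
        self.frames.clear()
        for _ in range(self.n_stack):
            self.frames.append(first_frame)
        return list(self.frames)

    def step(self, new_frame):
        # Appending to a full deque drops the oldest frame automatically
        self.frames.append(new_frame)
        return list(self.frames)


# String labels stand in for 84x84 image frames
stacker = FrameStacker(n_stack=4)
print(stacker.reset("f0"))  # ['f0', 'f0', 'f0', 'f0']
print(stacker.step("f1"))   # ['f0', 'f0', 'f0', 'f1']
print(stacker.step("f2"))   # ['f0', 'f0', 'f1', 'f2']
```

With the 84x84x1 frames produced by `PreprocessFrame`, stacking four frames gives the policy an 84x84x4 observation.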

## 4. Configure Training Callbacks and Logger

In [None]:
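# Create a checkpoint callback that saves the model every 10,000 environment steps.
# save_freq is counted per environment, so with n parallel environments a
# checkpoint is written every save_freq * n total timesteps.
checkpoint_callback = CheckpointCallback(save_freq=10000, save_path='./models/',
                                         name_prefix='dqn_donkeykong')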

# Configure the logger to record training metrics
new_logger = configure('./logs/', ["stdout", "csv"])
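
As a rough sanity check on the checkpoint settings, the sketch below lists the files one would expect in `./models/` after a 200,000-step run with `save_freq=10000`. The helper `expected_checkpoints` is hypothetical, and the `<prefix>_<steps>_steps.zip` naming pattern is an assumption about how `CheckpointCallback` names its files; verify it against the files your installed version actually writes.

```python
import os


def expected_checkpoints(total_timesteps, save_freq, save_path, name_prefix, n_envs=1):
    """List the checkpoint paths expected after training (illustrative helper)."""
    # save_freq counts callback calls; each call advances n_envs timesteps
    steps_per_save = save_freq * n_envs
    return [
        # Assumed naming pattern: <prefix>_<timesteps>_steps.zip
        os.path.join(save_path, f"{name_prefix}_{steps}_steps.zip")
        for steps in range(steps_per_save, total_timesteps + 1, steps_per_save)
    ]


paths = expected_checkpoints(200_000, 10_000, "./models/", "dqn_donkeykong")
print(len(paths))  # 20
print(paths[0])
print(paths[-1])
```

Each checkpoint is a full copy of the model, so a smaller `save_freq` trades disk space for finer-grained recovery points.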

## 5. Define and Train the DQN Agent

In [None]:
# Define the DQN agent with the desired parameters
model = DQN('CnnPolicy', env, learning_rate=1e-4, buffer_size=100000, learning_starts=1000,
            batch_size=32, tau=1.0, gamma=0.99, train_freq=4, target_update_interval=1000,
            exploration_fraction=0.1, exploration_final_eps=0.01, verbose=1)

# Set the new logger to the model
model.set_logger(new_logger)

# Train the agent. The total_timesteps parameter can be adjusted as needed
model.learn(total_timesteps=200000, callback=checkpoint_callback)

# Save the final model
model.save("dqn_donkeykong_final")
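
The `exploration_fraction` and `exploration_final_eps` arguments control how the probability of taking a random action (epsilon) decays during training. The sketch below is an illustrative re-implementation of a linear schedule under the settings used above, assuming the default starting value of 1.0; `epsilon_at` is a hypothetical helper, not a Stable-Baselines3 function.

```python
def epsilon_at(step, total_timesteps=200_000, exploration_fraction=0.1,
               initial_eps=1.0, final_eps=0.01):
    """Linearly annealed exploration rate (illustrative sketch)."""
    progress = step / total_timesteps
    if progress > exploration_fraction:
        # After the exploration phase, epsilon stays at its final value
        return final_eps
    # Interpolate linearly between initial_eps and final_eps
    return initial_eps + progress * (final_eps - initial_eps) / exploration_fraction


for step in (0, 10_000, 20_000, 100_000):
    print(step, round(epsilon_at(step), 3))
# 0 1.0
# 10000 0.505
# 20000 0.01
# 100000 0.01
```

With these values the agent explores heavily only during the first 20,000 steps (10% of 200,000). If `total_timesteps` is increased, the exploration phase lengthens proportionally unless `exploration_fraction` is lowered.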

## 6. Evaluation

In [None]:
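# Define a function to evaluate the trained model
def evaluate(model, num_episodes=5):
    # Rebuild the same pipeline used for training (vectorized + 4-frame stack);
    # a bare create_env() yields 84x84x1 observations that the policy cannot accept
    eval_env = VecFrameStack(DummyVecEnv([create_env]), n_stack=4)
    total_rewards = []
    for episode in range(num_episodes):
        obs = eval_env.reset()
        done = False
        episode_reward = 0.0
        while not done:
            action, _states = model.predict(obs)
            # VecEnv API: returns arrays, and `dones` covers both termination and truncation
            obs, rewards, dones, infos = eval_env.step(action)
            episode_reward += float(rewards[0])
            done = bool(dones[0])
        total_rewards.append(episode_reward)
        print(f"Episode {episode+1}: Reward = {episode_reward}")
    eval_env.close()
    return total_rewards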

# Run evaluation
evaluate(model, num_episodes=5)
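
Because `evaluate` returns a plain list of per-episode rewards, the results can be condensed into a few summary statistics. The sketch below uses only the standard library; `summarize_rewards` is a hypothetical helper, and the reward values are placeholders for illustration, not results from this notebook.

```python
import statistics


def summarize_rewards(rewards):
    """Return mean, population standard deviation, min and max of episode rewards."""
    return {
        "mean": statistics.fmean(rewards),
        "std": statistics.pstdev(rewards),
        "min": min(rewards),
        "max": max(rewards),
    }


# Placeholder values for illustration only, not real results
example_rewards = [100.0, 200.0, 300.0]
print(summarize_rewards(example_rewards))
```

In the notebook this could be called as `summarize_rewards(evaluate(model, num_episodes=5))`. Five episodes give a noisy estimate, so a larger `num_episodes` is advisable when comparing checkpoints.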

## 7. Recording Gameplay Videos

In [None]:
# Optionally, record a video of the trained agent playing the game
from gymnasium.wrappers import RecordVideo

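def record_video(model, video_folder='./videos/', episode_length=500):
    # Wrap the raw environment with RecordVideo, then apply the same vectorization
    # and 4-frame stacking used in training so observations match the policy input.
    # Only the first episode is recorded, because a vectorized env resets automatically.
    def make_recorded_env():
        return RecordVideo(create_env(), video_folder=video_folder,
                           episode_trigger=lambda episode_id: episode_id == 0)

    env = VecFrameStack(DummyVecEnv([make_recorded_env]), n_stack=4)
    obs = env.reset()
    done = False
    step = 0
    while not done and step < episode_length:
        action, _states = model.predict(obs)
        obs, rewards, dones, infos = env.step(action)
        done = bool(dones[0])
        step += 1
    # Closing the environment finalizes the video file
    env.close()
    print(f"Video recorded in {video_folder}")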

# Record a video for one episode
record_video(model, episode_length=500)

## 8. Conclusion

In this notebook, we implemented a DQN agent to play Atari *Donkey Kong* with Stable-Baselines3. We preprocessed game frames, set up a vectorized environment with frame stacking, configured checkpointing and logging, trained the agent, evaluated its performance, and recorded a gameplay video. The 200,000 training steps used here make for a short run by Atari standards; adjust the hyperparameters and increase the training duration for better performance.