<a href="https://colab.research.google.com/github/Randyflourish/Intro2AI-Final/blob/main/AI_Final_BASALT.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<div style="text-align: center">
  <img src="https://github.com/KarolisRam/MineRL2021-Intro-baselines/blob/main/img/colab_banner.png?raw=true">
</div>

# Introduction
This notebook is the installation part for the [MineRL 2022](https://minerl.io/) competition, building on the original introductory notebooks created for the MineRL 2021 competition.

## Note: About this file

This file was updated by NYCU 2024 Spring Intro2AI Team 11: まふまふ.
The original file comes from [here](https://colab.research.google.com/drive/1rJ3lGy-bG7kJRe_wYBWg7fjSaD9oOMDw?usp=sharing).

## There's a video to explain...
Please visit [this intro YouTube video](https://youtu.be/8yIrWcyWGek) to see some background information. Hopefully, this will lead to a number of additional videos that explore what can be done in this environment.

And if you see me (@mdda) online, then please say "Hi!"

## Software 2.0
The approach we are going to use, where we take some human-written code and replace it with an AI component, is quite similar to how Tesla approaches self-driving cars. See this talk by Andrej Karpathy, Director of AI at Tesla:  
[Building the Software 2.0 Stack](https://databricks.com/session/keynote-from-tesla)


# Setup

In [1]:
%%capture
!sudo add-apt-repository -y ppa:openjdk-r/ppa
!sudo apt-get purge -y 'openjdk-*'
!sudo apt-get install -y openjdk-8-jdk
!sudo apt-get install -y xvfb xserver-xephyr vnc4server python-opengl ffmpeg
# Takes ~1min to run this
# Newly added: make sure Xvfb is installed
!sudo apt-get install -y xvfb

In [2]:
# This takes ~22mins - which would hit us every time we start Colab
#   So we'll do it once, and store a '.tar.gz' of the installation into our
#   Google Drive, so that we can get it back much quicker the second time!

##%%capture
##!pip3 install --upgrade minerl # Default is 0.4.4, we want 1.0.0 for VPT
##!pip3 uninstall minerl
#!pip3 install git+https://github.com/minerllabs/minerl@v1.0.0
#
#!pip3 install pyvirtualdisplay
#!pip3 install -U colabgymrender

In [None]:
import os, sys, time

mine_env = 'mine_env'
mine_env_full = f'/content/{mine_env}'
mine_tar = f'{mine_env}.tar.gz'

if mine_env_full not in sys.path:
  sys.path.insert(0, mine_env_full)
  # PYTHONPATH may be unset, so fall back to an empty string
  os.environ['PYTHONPATH'] = os.environ.get('PYTHONPATH', '') + f':{mine_env_full}'

mine_env, mine_env_full, mine_tar

In [None]:
# We'll connect to our Google Drive here, and see whether we've already saved off a copy
#   This will ask permission to 'connect to your drive' : The answer is 'Yes'!
MINE_ENV_IS_NEW = True

from google.colab import drive  # google.colab contains functions specifically for interacting with Google Colab's environment.
drive.mount('/content/drive')    # mounts your Google Drive as a local file system
if os.path.isfile(f'/content/drive/MyDrive/pythonLib/{mine_tar}'): # check if "mine_env.tar.gz" is in your Google Drive
  ! cp /content/drive/MyDrive/pythonLib/$mine_tar ./$mine_tar  # ! means the command is to be executed in the shell rather than as Python code.
                                              # This command copies the file from your Google Drive to the current working directory of the Colab notebook.

  ! ls -l ./$mine_tar                         # This lists the file details such as permissions, owner, size, and modification date for the copied file in the current directory.
                                              # It helps verify that the file has been copied correctly and shows its properties.
  # e.g.: -rw------- 1 root root 1510118446 Jun 26 08:48 ./mine_env.tar.gz

  # ! tar -tzf ./$mine_tar | grep minerl | head -5    # list some contents of the compressed tar file without extracting it
  ! tar -xzf ./$mine_tar    # This extracts the contents of the tar file into the current directory

  MINE_ENV_IS_NEW = False
  # Takes 1min too (huge saving!)

sys.path.append('/content/drive/MyDrive/pythonLib')
sys.path.append('/content/drive/MyDrive/pythonLib/VPT')

"DONE"

In [None]:
# Build the mine_env if necessary
"""
try:
  from pyvirtualdisplay import Display
except :
  !pip3 install --target=$mine_env git+https://github.com/minerllabs/minerl@v1.0.2   # 21 mins
  # https://stackoverflow.com/questions/55833509/attributeerror-type-object-callable-has-no-attribute-abc-registry
  !mv $mine_env/typing.py $mine_env/MEH-typing.py  # Fix for Python3.7 ...

  !pip3 install --target=$mine_env pyvirtualdisplay  # 4 secs  # Note: Display creates a virtual framebuffer that graphical applications can use to render output as if they were using a real monitor.
                                                              # Note: This allows you to run applications that require a GUI without having an actual GUI environment installed on the system.
  !pip3 install --target=$mine_env --upgrade colabgymrender # 22 secs  # Note: colabgymrender provides a workaround by capturing the graphical output of the environment and displaying it within the notebook.

  MINE_ENV_IS_NEW = True
  # NB: some restart notices in the output ... but there's no need to restart!
  #     In any case, please wait for the 'DONE' message to print out
f"DONE, with MINE_ENV_IS_NEW={MINE_ENV_IS_NEW}"
"""

In [None]:
# check content of mine_env (execute if needed)
'''
! du -b mine_env | tail -5  # mine_env = ~ 2,094,031,775 bytes overall (a little bit less)
'''

In [None]:
# Build new env.tar.gz file in google drive (execute if needed)
'''
if MINE_ENV_IS_NEW: #  or True
  # ! ls -l /gdrive/MyDrive/mine*
  ! rm -f ./$mine_tar   # Note: removes the existing tar.gz archive of the environment, if any, from the current directory.
  ! tar -czf ./$mine_tar $mine_env  # Note: creates a new compressed (gzipped) tar archive of the environment directory named by the $mine_env variable.
  ! ls -l ./$mine_tar
  # Without running the env...
  # -rw-r--r-- 1 root root 1505020174 Jun 26 07:26 ./mine_env.tar.gz
  # Once the minerl env has been reset once (i.e. java has built...)
  # -rw------- 1 root root 1511976116 Jun 26 08:43 ./mine_env.tar.gz
  ! tar -tzf ./$mine_tar | head
  ! cp ./$mine_tar /content/drive/MyDrive/pythonLib/  # Note: copies the newly created archive to a Google Drive directory.
  ! ls -l /content/drive/MyDrive/pythonLib/$mine_tar
"DONE"
'''
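
The cache-and-restore idea used in the cells above (build the environment once, archive it to Google Drive, unpack it in later sessions) can be sketched in pure Python with the standard `tarfile` module. This is only an illustration: `save_env` and `restore_env` are hypothetical helper names, and temporary directories stand in for `/content` and the Drive folder.

```python
import os
import tarfile
import tempfile

def save_env(env_dir: str, tar_path: str) -> None:
    """Pack env_dir into a gzipped tar archive at tar_path."""
    with tarfile.open(tar_path, "w:gz") as tar:
        # arcname keeps only the directory's own name inside the archive
        tar.add(env_dir, arcname=os.path.basename(env_dir))

def restore_env(tar_path: str, dest: str) -> bool:
    """Unpack the archive into dest; return False if no archive exists yet."""
    if not os.path.isfile(tar_path):
        return False
    with tarfile.open(tar_path, "r:gz") as tar:
        tar.extractall(dest)
    return True

# Throwaway directories standing in for /content and Google Drive (assumption)
work = tempfile.mkdtemp()
drive = tempfile.mkdtemp()
env_dir = os.path.join(work, "mine_env")
os.makedirs(env_dir)
with open(os.path.join(env_dir, "marker.txt"), "w") as f:
    f.write("installed")

tar_path = os.path.join(drive, "mine_env.tar.gz")
first_run = not restore_env(tar_path, work)   # nothing cached yet
save_env(env_dir, tar_path)                   # slow step, done once

fresh = tempfile.mkdtemp()                    # stands in for a new Colab session
restored = restore_env(tar_path, fresh)       # fast path on later runs
```

The first call to `restore_env` fails because no archive exists, which corresponds to `MINE_ENV_IS_NEW = True` in the notebook.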

# Import Libraries

In [None]:
import os   # For interacting with the operating system.

import numpy as np  # For numerical operations.

import gym    # To create and manage environments based on the OpenAI Gym toolkit.
import minerl

from tqdm.notebook import tqdm  # For displaying progress bars in Jupyter notebooks.
from colabgymrender.recorder import Recorder # To facilitate rendering of Gym environments in Google Colab.
from pyvirtualdisplay import Display # To create a virtual display to render environments in a headless server or environment like Google Colab.

import logging
logging.disable(logging.ERROR) # reduce clutter, remove if something doesn't work to see the error logs.

np.__version__  # '1.21.6' => NumPy is being read from our ~/mine_env directory
# The NumPy version you see may differ from the one above
# If a warning raised inside a locally installed package stops execution, comment out the offending line in that package

import cv2
#from google.colab.patches import cv2_imshow
#from PIL import Image
import matplotlib.pylab as plt


import json
import glob
from run_inverse_dynamics_model import json_action_to_env_action
from agent import ENV_KWARGS # need to modify

import torch as th

# Download Dataset
Download part of the BASALT find-cave dataset; the files to fetch are listed in 'find-cave-Jul-28.json'.

The downloaded data includes the videos (.mp4) and the corresponding action logs (.jsonl).

Everything is saved to the target directory /content/MineRLBasaltFindCave-v0 (on the Colab disk).

The function 'download_file()' is adapted, with small changes, from 'utils/download_dataset.py' in [basalt-2022-behavioural-cloning-baseline](https://github.com/minerllabs/basalt-2022-behavioural-cloning-baseline).
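
Since every video is expected to come with an action log of the same name, a quick sanity check after downloading is to list the recordings that have both files. The sketch below assumes that naming convention; `paired_recordings` is a hypothetical helper (not part of the baseline repository) and the demo directory is a stand-in for the real dataset folder.

```python
import os
import tempfile

def paired_recordings(data_dir: str) -> list:
    """Return (video, actions) path pairs for every .mp4 that has a matching .jsonl."""
    pairs = []
    for name in sorted(os.listdir(data_dir)):
        if name.endswith(".mp4"):
            video = os.path.join(data_dir, name)
            actions = video[: -len(".mp4")] + ".jsonl"
            if os.path.isfile(actions):
                pairs.append((video, actions))
    return pairs

# Demo directory: "a" is complete, "b" is missing its action log
demo = tempfile.mkdtemp()
for fname in ("a.mp4", "a.jsonl", "b.mp4"):
    open(os.path.join(demo, fname), "w").close()

pairs = paired_recordings(demo)
```

On the real data, the same function could be pointed at /content/MineRLBasaltFindCave-v0.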


In [None]:
n_videos = 400

from download_dataset import download_file
download_file(n_videos) # default is 400, about 40 GB?
# !ls /content/MineRLBasaltFindCave-v0


# Construct Inverse Dynamic Model Agent
Load the 4x_idm model and its weights.

The function 'load_IDM_agent()' is based on 'main()' in './VPT/run_inverse_dynamics_model.py'.

In [None]:
'''
from inverse_dynamics_model import load_IDM_agent
IDMAgent = load_IDM_agent()
'''

# Test the IDM
Check whether the IDM's predictions closely match the ground-truth actions.

This cell is also based on 'main()' in './VPT/run_inverse_dynamics_model.py'.

A higher n_frames gives more accurate predictions but also uses more memory.
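
One way to turn "closely match" into a number is a per-action agreement rate. The sketch below assumes a simplified layout in which both predictions and recordings are dictionaries mapping an action name to a list of 0/1 values, one per frame; the real IDM output is shaped differently, and `agreement_rate` is a hypothetical helper rather than part of VPT.

```python
def agreement_rate(predicted: dict, recorded: dict) -> dict:
    """Fraction of frames on which prediction and ground truth agree, per action."""
    rates = {}
    for name, pred in predicted.items():
        truth = recorded[name]
        n = min(len(pred), len(truth))
        matches = sum(1 for p, t in zip(pred, truth) if p == t)
        rates[name] = matches / n if n else 0.0
    return rates

# Toy data: four frames, two buttons
predicted = {"forward": [1, 1, 0, 1], "jump": [0, 0, 1, 0]}
recorded = {"forward": [1, 0, 0, 1], "jump": [0, 0, 1, 0]}
rates = agreement_rate(predicted, recorded)
```

In this toy example "forward" agrees on three of four frames and "jump" on all four.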


In [None]:
'''
# Test for IDMAgent
import json
import glob
from run_inverse_dynamics_model import json_action_to_env_action
from agent import ENV_KWARGS # need to modify

import torch as th

required_resolution = ENV_KWARGS["resolution"] # video required resolution
files = glob.glob("/content/MineRLBasaltFindCave-v0/*.mp4")
video_path = files[0] # pick the first video to test
json_path = video_path.replace(".mp4", ".jsonl")

cap = cv2.VideoCapture(video_path) # create a video capture object that can be used to read video files or streams

json_index = 0
with open(json_path) as json_file:
  json_lines = json_file.readlines()
  json_data = "[" + ",".join(json_lines) + "]"
  json_data = json.loads(json_data)

# can be modified
n_frames = 1000  # how many frames of the video are loaded and processed in a single batch.
          # Processing a specific number of frames at a time, instead of loading an entire video into memory, helps manage memory usage and computational load.
# can be modified
n_batches = 0  # This term describes the total number of these batches that will be processed during the script’s run
          # If the video has more frames, they won't be processed unless the n_batches count is increased.
for _ in range(n_batches):
  th.cuda.empty_cache() # clear the unused memory from the GPU's cache
  print("=== Loading up frames ===")
  frames = []
  recorded_actions = []
  for _ in range(n_frames):
    ret, frame = cap.read() # capture a frame from the video, 'ret' is a boolean indicating if the frame was successfully read
    if not ret: # end of the video
      break
    assert frame.shape[0] == required_resolution[1] and frame.shape[1] == required_resolution[0], "Video must be of resolution {}".format(required_resolution)
    # BGR -> RGB
    frames.append(frame[..., ::-1]) # Converts the frame from BGR (Blue, Green, Red - default color format in OpenCV) to RGB format and stores it in the frames list
    env_action, _ = json_action_to_env_action(json_data[json_index]) # extract the ground truth action
    recorded_actions.append(env_action)
    json_index += 1
  frames = np.stack(frames) # stacks the list of frames into a numpy array, making it suitable for batch processing in the model prediction
  print("=== Predicting actions ===")
  predicted_actions = IDMAgent.predict_actions(frames) # pass a batch of frames into IDM, and get the output actions

  for i in range(len(frames)):
    for y, (action_name, action_array) in enumerate(predicted_actions.items()):
      print(f"{action_name}: {action_array[0, i]} ({recorded_actions[i][action_name]}), ", end = "")
    print("\n")
'''
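
The cell above turns the `.jsonl` file into one JSON array string before parsing it. An equivalent approach, sketched here, parses each line on its own and skips blank lines. The record fields in the sample (`tick`, `keys`) are made up for the illustration and are not the real dataset schema; `read_jsonl` is a hypothetical helper.

```python
import io
import json

def read_jsonl(fh) -> list:
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in fh if line.strip()]

# In-memory stand-in for an opened .jsonl file (field names are hypothetical)
sample = io.StringIO('{"tick": 0, "keys": ["w"]}\n\n{"tick": 1, "keys": []}\n')
records = read_jsonl(sample)
```

Parsing line by line also tolerates a trailing empty line, which would break the joined-array approach.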

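# Preprocessing
env_action_to_agent: converts an environment action into the agent's targets (an 8-bit button code and the camera vector)

agent_action_to_env: converts the agent's outputs back into an environment action

img_to_tensor: resizes frames and stacks them into a float tensor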

In [12]:
import glob
import torch as th
import torchvision.transforms.functional as TF
from torch import nn
from torch.nn import functional as F
from torch import optim

# transform of env action and agent action
env = gym.make("MineRLBasaltFindCave-v0")

NOOP = env.action_space.no_op()

# binary encoding of env_action
# forward, back, left, right, sneak, sprint(run), jump, ESC = 2^7, ..., 2^0
ACTION_LIST = ["forward", "back", "left", "right", "sneak", "sprint", "jump", "ESC"]
ACTION_LIST_Rev = ACTION_LIST.copy()
ACTION_LIST_Rev.reverse()

def env_action_to_agent(env_action: dict):
  target_action_C = int(0) # for classification
  target_action_R = env_action["camera"] # for regression
  for act in ACTION_LIST:
    target_action_C *= 2
    target_action_C += 1 if env_action[act] == 1 else 0
  if target_action_C == 0 and np.array_equal(target_action_R, np.zeros(2)):
      isNoop = True
  else:
      isNoop = False
  return [target_action_C, target_action_R, isNoop]

def agent_action_to_env(agent_action_C, agent_action_R):
  target_action = NOOP.copy()  # copy so that the shared NOOP template is not modified
  for act in ACTION_LIST_Rev:
    target_action[act] = 1 if agent_action_C % 2 == 1 else 0
    agent_action_C //= 2  # Use integer division to keep agent_action_C as an integer
  target_action["camera"] = agent_action_R
  return target_action

def img_to_tensor(frames, size): # cv2 read in (H,W,C)=(360,640,3)
  target_tensor = th.empty((0, 3, size[1], size[0]), dtype = th.float32) # create an uninitialized tensor of the specified shape and data type (float32).
  for frame in frames:
    #frame = frame[:, 140:500, :] # because video resolution is 360*640*3, crop the size of W from 640 to 360 to make it a square
                     # but I think this will lose some information, maybe test it later?
    frame = cv2.resize(frame, size)
    frame = TF.to_tensor(frame)  # Convert to tensor and scale to [0, 1], also convert to (3, H, W), e.g. (3, 227, 227)
    frame = frame.unsqueeze(0)  # Add a batch dimension # (1, 3, H, W)
    target_tensor = th.cat((target_tensor, frame), dim = 0) # (i + 1, 3, H, W)
  return target_tensor
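
The button encoding used by `env_action_to_agent` and `agent_action_to_env` above treats the 8 buttons as the bits of one integer, with "forward" as the most significant bit. The following standalone sketch shows the round trip without needing gym or MineRL; `encode_buttons` and `decode_buttons` are hypothetical names that mirror the logic of the two functions.

```python
ACTION_LIST = ["forward", "back", "left", "right", "sneak", "sprint", "jump", "ESC"]

def encode_buttons(action: dict) -> int:
    """Pack the 8 buttons into one integer in [0, 255]; 'forward' is the top bit."""
    code = 0
    for name in ACTION_LIST:
        code = code * 2 + (1 if action.get(name, 0) == 1 else 0)
    return code

def decode_buttons(code: int) -> dict:
    """Unpack an integer in [0, 255] back into a button dictionary."""
    action = {}
    for name in reversed(ACTION_LIST):   # read bits from least significant upward
        action[name] = code % 2
        code //= 2
    return action

example = {"forward": 1, "sprint": 1, "jump": 1}
code = encode_buttons(example)    # 128 + 4 + 2
decoded = decode_buttons(code)
```

Here `code` is 134, and decoding it sets exactly "forward", "sprint" and "jump" to 1. Treating each of the 256 codes as a class is what allows the button prediction to be trained with a cross-entropy loss.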


# SimpleNet

In [13]:
# NN architecture

class SimpleNet(nn.Module):

  def __init__(self, output_size = 256):
    super().__init__()
    self.conv1 = nn.Conv2d(in_channels = 3, out_channels = 64, kernel_size = 10, stride = 5) # 17 * 31
    self.conv2 = nn.Conv2d(in_channels = 64, out_channels = 64, kernel_size = 2, padding = 1) # 18 * 32

    self.fc1 = nn.Linear(in_features = 9 * 16 * 64, out_features = 4096)
    self.fc2 = nn.Linear(in_features = 4096, out_features = output_size)

  def forward(self, input): # batch_size * 3 * 90 * 160 RGB tensor (160 * 90 image)
    x = F.relu(self.conv1(input))
    x = F.relu(self.conv2(x))
    x = F.max_pool2d(x, kernel_size = 2, stride = 2) # 9 * 16
    x = th.flatten(x, start_dim = 1)
    x = F.relu(self.fc1(x))
    x = self.fc2(x)
    return x # raw outputs of shape (batch_size, output_size); take argmax to get a class in [0, 255]

# AlexNet
The same architecture is used for classification and for regression; only output_size differs.

In [14]:
import torch
import torch.nn as nn

class AlexNet(nn.Module):
    def __init__(self, output_size = 256): # input: batch_size * 3 * 227 * 227 tensor
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels = 3, out_channels = 96, kernel_size = 11, stride = 4),
            nn.PReLU(),
            #nn.ReLU(inplace = True),
            nn.MaxPool2d(kernel_size = 3, stride = 2),
            nn.Conv2d(in_channels = 96, out_channels = 256, kernel_size = 5, padding = 2),
            nn.PReLU(),
            #nn.ReLU(inplace = True),
            nn.MaxPool2d(kernel_size = 3, stride = 2),
            nn.Conv2d(in_channels = 256, out_channels = 384, kernel_size = 3, padding = 1),
            nn.PReLU(),
            #nn.ReLU(inplace = True),
            nn.Conv2d(in_channels = 384, out_channels = 384, kernel_size = 3, padding = 1),
            nn.PReLU(),
            #nn.ReLU(inplace = True),
            nn.Conv2d(in_channels = 384, out_channels = 256, kernel_size = 3, padding = 1),
            nn.PReLU(),
            #nn.ReLU(inplace = True),
            nn.MaxPool2d(kernel_size = 3, stride = 2),
        )
        self.classifier = nn.Sequential(
            nn.Linear(in_features = 256*6*6, out_features = 4096),
            nn.PReLU(),
            #nn.ReLU(inplace = True),
            nn.Dropout(0.5),
            nn.Linear(in_features = 4096, out_features = 4096),
            nn.PReLU(),
            #nn.ReLU(inplace = True),
            nn.Dropout(0.5),
            nn.Linear(in_features = 4096, out_features = output_size),
            # nn.Softmax(dim = 1) # Typically softmax is not applied here when using nn.CrossEntropyLoss
        )

    def forward(self, x): # input: batch_size * 3 * 227 * 227 tensor
        x = self.features(x)
        x = th.flatten(x, start_dim = 1)
        x = self.classifier(x)
        return x
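
The `in_features` values of the first fully connected layer in both networks follow from the usual output-size formula for convolution and pooling layers, floor((size + 2 * padding - kernel) / stride) + 1. The sketch below traces the spatial sizes in plain Python; `out_size` is a hypothetical helper, and the layer parameters are copied from the two class definitions above.

```python
def out_size(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output length of a conv or max-pool layer along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

# SimpleNet on a 90 x 160 (H x W) input
h, w = 90, 160
h, w = out_size(h, 10, 5), out_size(w, 10, 5)        # conv1
simple_conv1 = (h, w)
h, w = out_size(h, 2, 1, 1), out_size(w, 2, 1, 1)    # conv2
simple_conv2 = (h, w)
h, w = out_size(h, 2, 2), out_size(w, 2, 2)          # max_pool2d
simple_flat = h * w * 64                             # 64 output channels

# AlexNet on a 227 x 227 input (square, so one dimension is enough)
s = 227
s = out_size(s, 11, 4)        # conv 11x11, stride 4
s = out_size(s, 3, 2)         # max-pool
s = out_size(s, 5, 1, 2)      # conv 5x5, padding 2
s = out_size(s, 3, 2)         # max-pool
for _ in range(3):
    s = out_size(s, 3, 1, 1)  # three 3x3 convs, padding 1
s = out_size(s, 3, 2)         # max-pool
alex_flat = 256 * s * s       # 256 output channels
```

SimpleNet goes from 17 x 31 to 18 x 32 to 9 x 16, and AlexNet ends at 6 x 6, so both flatten to 9216 features, matching `9 * 16 * 64` and `256*6*6` in the code.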

# FindCave Agent
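Training: one network classifies the 8-button combination (256 classes, cross-entropy loss) and a second network regresses the camera movement (MSE loss).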

In [None]:
import random
import queue

class FindCaveAgent():

  def __init__(self, learning_rate = 0.0001, NetC = "", NetR = ""):

    # For Classification (8 button)
    if NetC == "AlexNet":
      self.policyC = AlexNet(output_size = 256).cuda()
      self.sizeC = (227, 227)
    else:
      self.policyC = SimpleNet(output_size = 256).cuda()
      self.sizeC = (160, 90)
    self.optimizerC = optim.Adam(self.policyC.parameters(), lr = learning_rate)
    self.criterionC = nn.CrossEntropyLoss()

    # For regression (camera)
    if NetR == "AlexNet":
      self.policyR = AlexNet(output_size = 2).cuda()
      self.sizeR = (227, 227)
    else:
      self.policyR = SimpleNet(output_size = 2).cuda()
      self.sizeR = (160, 90)
    self.optimizerR = optim.Adam(self.policyR.parameters(), lr = learning_rate)
    self.criterionR = nn.MSELoss()

  def train(self, batch_size = 64, n_epochs = 10):
    self.policyC.train()
    self.policyR.train()

    video_paths = glob.glob("/content/MineRLBasaltFindCave-v0/*.mp4")
    json_paths = [vp.replace(".mp4", ".jsonl") for vp in video_paths]


    batch_count = 0
    batch_loss_C = 0
    batch_loss_R = 0

    # testing
    test_prev_batch_target_actions_R = []
    test_prev_batch_output_R = []
    test_prev_batch_frames = []

    for epoch in range(n_epochs):
      print(f"Epoch {epoch+1}")
      cap_list = []
      json_data_list = []
      cur_json_index = []
      for video_path, json_path in zip(video_paths, json_paths):
        cap_list.append(cv2.VideoCapture(video_path))
        with open(json_path) as jf:
          json_lines = jf.readlines()
          json_data = "[" + ",".join(json_lines) + "]"
          json_data = json.loads(json_data)
          json_data_list.append(json_data)
        cur_json_index.append(0)

      while len(cap_list) >= batch_size:
        batch_cap = random.sample(cap_list, batch_size)
        batch_index = [cap_list.index(cap) for cap in batch_cap]

        batch_frames = []
        batch_cur_json_index = []
        batch_target_actions_C = []
        batch_target_actions_R = []
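        for index in batch_index:
          ret, frame = cap_list[index].read()
          if not ret: # the video ran out of frames before its action log did
            continue
          env_action = json_action_to_env_action(json_data_list[index][cur_json_index[index]])[0]
          target_action = env_action_to_agent(env_action)
          if target_action[2]: # skip no-op frames
            continue
          batch_frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)) # OpenCV reads BGR; the env's "pov" is RGB
          batch_target_actions_C.append(target_action[0])
          batch_target_actions_R.append(target_action[1])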

        if len(batch_frames) > 0:

          batch_frames_C = img_to_tensor(batch_frames, self.sizeC).cuda()
          batch_frames_R = img_to_tensor(batch_frames, self.sizeR).cuda()

          batch_target_actions_C = th.tensor(batch_target_actions_C, dtype=torch.long).cuda()
          batch_target_actions_R = th.tensor(batch_target_actions_R, dtype=torch.float).cuda()

          # training (kept inside the 'if' so that an empty batch is skipped instead of reusing stale tensors)
          self.optimizerC.zero_grad()
          batch_output_C = self.policyC(batch_frames_C)
          loss_C = self.criterionC(batch_output_C, batch_target_actions_C)
          loss_C.backward()
          self.optimizerC.step()

          self.optimizerR.zero_grad()
          batch_output_R = self.policyR(batch_frames_R)
          loss_R = self.criterionR(batch_output_R, batch_target_actions_R)
          loss_R.backward()
          self.optimizerR.step()

          # print the average lossC and lossR over the last 100 batches
          batch_loss_C += loss_C.item()
          batch_loss_R += loss_R.item()
          batch_count += 1
          if batch_count % 100 == 0:
            #if batch_count % 1000 == 0:
            #  print(batch_output_R)
            print(f'Batch {batch_count-100} to {batch_count-1}, LossC: {batch_loss_C / 100}, LossR: {batch_loss_R / 100}')
            batch_loss_C = 0
            batch_loss_R = 0

        # deleting if index exceed
        del_indices = []
        for index in batch_index:
          cur_json_index[index] += 1
          if cur_json_index[index] >= len(json_data_list[index]):
            del_indices.append(index)

        del_indices.sort(reverse=True)

        for index in del_indices:
          cap_list[index].release()
          del cap_list[index]
          del json_data_list[index]
          del cur_json_index[index]

      # release the remaining less than batch_size captures in cap_list
      for cap in cap_list:
        cap.release()

  def predict(self, observe, verbose = False):

    # switch to inference mode so that Dropout (used in AlexNet) is disabled
    self.policyC.eval()
    self.policyR.eval()

    with th.no_grad():

      obs_tensor = img_to_tensor([observe, ], self.sizeC).cuda()
      resultC = self.policyC(obs_tensor).squeeze().argmax().item() # button code as a plain int
      obs_tensor = img_to_tensor([observe, ], self.sizeR).cuda()
      resultR = self.policyR(obs_tensor).squeeze().cpu().numpy()
      env_action = agent_action_to_env(resultC, resultR)

      if verbose:
        print(env_action)

    return env_action

  def save_model_weights(self, path=""):
    # Save the state dictionaries of models and optimizers
    th.save({
        'policyC_state_dict': self.policyC.state_dict(),
        'optimizerC_state_dict': self.optimizerC.state_dict(),
        'policyR_state_dict': self.policyR.state_dict(),
        'optimizerR_state_dict': self.optimizerR.state_dict(),
    }, path)

  def load_model_weights(self, path=""):
    # Load the state dictionaries of models and optimizers
    checkpoint = th.load(path)
    self.policyC.load_state_dict(checkpoint['policyC_state_dict'])
    self.optimizerC.load_state_dict(checkpoint['optimizerC_state_dict'])
    self.policyR.load_state_dict(checkpoint['policyR_state_dict'])
    self.optimizerR.load_state_dict(checkpoint['optimizerR_state_dict'])
'''
TA = FindCaveAgent(learning_rate = 0.001, NetC = "AlexNet", NetR = "AlexNet")
TA.train(batch_size = 64, n_epochs = 10)
TA.save_model_weights(path="/content/drive/MyDrive/pythonLib/minerl_weights_lr_0_001_NetC_AlexNet_NetR_AlexNet.pth")
TA = FindCaveAgent(learning_rate = 0.001, NetC = "SimpleNet", NetR = "SimpleNet")
TA.train(batch_size = 64, n_epochs = 10)
TA.save_model_weights(path="/content/drive/MyDrive/pythonLib/minerl_weights_lr_0_001_NetC_SimpleNet_NetR_SimpleNet.pth")
TA = FindCaveAgent(learning_rate = 0.00055, NetC = "SimpleNet", NetR = "SimpleNet")
TA.train(batch_size = 64, n_epochs = 10)
TA.save_model_weights(path="/content/drive/MyDrive/pythonLib/minerl_weights_lr_0_00055_NetC_SimpleNet_NetR_SimpleNet.pth")
'''
TA = FindCaveAgent(learning_rate = 0.0001, NetC = "SimpleNet", NetR = "SimpleNet")
TA.train(batch_size = 64, n_epochs = 10)
TA.save_model_weights(path="/content/drive/MyDrive/pythonLib/minerl_weights_lr_0_0001_NetC_SimpleNet_NetR_SimpleNet.pth")
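
The batching scheme in `train()` draws one frame from each of `batch_size` randomly chosen videos, reads every video strictly in order, and drops a video once it is exhausted. The toy sketch below reproduces only that sampling pattern with plain lists standing in for videos; `interleaved_batches` is a hypothetical name, and no frames or networks are involved.

```python
import random

def interleaved_batches(streams, batch_size, seed=0):
    """Yield batches holding the next unread item from batch_size distinct streams."""
    rng = random.Random(seed)
    cursors = [0] * len(streams)
    # indices of streams that still have unread items
    active = [i for i, s in enumerate(streams) if s]
    while len(active) >= batch_size:
        chosen = rng.sample(active, batch_size)
        batch = []
        for s in chosen:
            batch.append((s, streams[s][cursors[s]]))
            cursors[s] += 1
        # retire streams that have been fully consumed
        active = [s for s in active if cursors[s] < len(streams[s])]
        yield batch

streams = [list(range(n)) for n in (3, 5, 4, 6)]   # four toy "videos"
batches = list(interleaved_batches(streams, batch_size=2))
```

Because a batch never contains two frames from the same video, consecutive (and therefore highly similar) frames are spread across different batches. The loop stops as soon as fewer than `batch_size` videos remain, so some frames at the end of long videos are never used within an epoch.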

In [None]:
#from google.colab import runtime
#runtime.unassign()

# Testing

In [21]:
def create_video(frames, count, path = "/content/drive/MyDrive/AIFinalResult", name = "myVideo"):
  os.makedirs(path, exist_ok = True) # also creates missing parent directories
  video_name = f"{name}{count}"
  width = frames[0].shape[1]
  height = frames[0].shape[0]
  fourcc = cv2.VideoWriter_fourcc(*'mp4v') # lowercase tag avoids OpenCV's unsupported-tag fallback warning for .mp4
  out = cv2.VideoWriter(os.path.join(path, video_name + ".mp4"), fourcc, 20, (width, height))
  for frame in frames:
    out.write(cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)) # frames are RGB; OpenCV writes BGR
  out.release()

In [None]:

import gym
import matplotlib.pyplot as plt

# Start the virtual display before the environment needs to render
disp = Display(visible=0, backend="xvfb")
disp.start();

env = gym.make("MineRLBasaltFindCave-v0")

def testing(agent, env, render=False, video=True, video_count=0, max_steps=2000):

    obs = env.reset()
    pov = obs["pov"]

    img_buffer = []

    done = False
    cumulative_reward = 0
    n_steps = 0

    # Count steps explicitly so that the limit also applies when no video is recorded
    while not done and n_steps < max_steps:
        ac = agent.predict(pov)
        obs, reward, done, info = env.step(ac)
        pov = obs["pov"]
        n_steps += 1

        cumulative_reward += reward

        if render:
            plt.imshow(pov)
            plt.show()
            plt.clf()  # Important to reduce the usage of RAM

        if video:
            img_buffer.append(cv2.resize(pov, (480, 270))) # Can adjust the size of frame to reduce RAM, original size is (640, 360)


    if video and img_buffer:
        create_video(img_buffer, video_count)
    del img_buffer

    print(f"Total Cumulative Reward: {cumulative_reward}")
    return cumulative_reward


render = False
video = True
num_runs = 10
cumulative_rewards = []

for i in range(num_runs):
    cumulative_reward = testing(TA, env, render=render, video=video, video_count=i)
    cumulative_rewards.append(cumulative_reward)

average_cumulative_reward = sum(cumulative_rewards) / num_runs
print(f"Average Cumulative Reward over {num_runs} runs: {average_cumulative_reward}")


# Others

In [None]:
disp = Display(visible=0, backend="xvfb") # Note: set up a virtual display so that the Minecraft backend has something to write to,
                       #  even though Colab does not have an actual display
disp.start();

In [None]:
env = gym.make("MineRLBasaltFindCave-v0") # Note: one of the environments: find a cave

In [None]:
env.action_space.sample().keys()  # Note: the actions available to the Minecraft agent

In [None]:
# Have a look at a few actions we might do:
for _ in range(10):
  print( env.action_space.sample() )

In [None]:
t0=time.time()
obs = env.reset()  # Resets the environment; the first obs is thrown away...
print(f"{(time.time()-t0):.2f}sec for env.reset")
# 275.65sec = 4mins for first time, 80.73sec second time (due to compilation of java files?)

In [None]:
# Now that Steve has been spawned, do some actions...
t0=time.time()

done, n_steps = False, 0  # 'n_steps' avoids shadowing the built-in iter()
while not done:
    ac = env.action_space.noop() # Note: 'ac' is initialized with a no-operation action ('noop')
    # Camera takes (pitch, yaw) deltas in degrees, e.g. [0, +30] turns to the right;
    #   here it stays at [0, 0], so Steve attacks without turning
    ac["camera"] = [0, 0]
    ac['attack'] = [1]

    t1=time.time()
    obs, reward, done, info = env.step(ac) # Note: obs is the 'next observation'
    #print(obs, reward, info)  # NB: Yikes : obs is only the image!
    #  obs = Dict(pov:Box(low=0, high=255, shape=(360, 640, 3)))
    #print(pov.shape) # (360, 640, 3)  Image spec agrees with docs!
    print(f"{(time.time()-t1):.2f}sec for env.step")  # Approx 0.25sec per step

    pov = obs["pov"] # Note: pov (point of view) is the image extracted from obs (observation)

    #env.render()  # This does an internal cv2.imshow that colab rejects
    #cv2_imshow(pov[:, :, ::-1])
    #cv2.waitKey(1)

    plt.imshow(pov) # Note: draw the pov image on the figure
    plt.show() # Note: show the figure
    n_steps += 1
    if n_steps > 12: done=True

f"{(time.time()-t0):.2f}sec for the whole loop"

In [None]:
# Set up a simple testing function
def action_step(action):
  ac = env.action_space.noop()
  ac.update(action)
  obs, reward, done, info = env.step(ac)
  plt.imshow(obs["pov"])
  plt.show()

In [None]:
action_step({})
action_step(dict(inventory=[1]))
action_step(dict(camera=[0, +30]))
action_step(dict(camera=[-10, -30]))
action_step(dict(camera=[+10, 0]))
action_step(dict(inventory=[1]))  # Put inventory away? = Yes, if it is showing

In [None]:
#action_step({'inventory':[1]})  # Put inventory away? = NOT jump, sneak, use, hotbar.X, back
action_step({})  # NOOP

In [None]:
# Set up a simple calibration function
import cv2
from google.colab.patches import cv2_imshow

def action_step_calibrate(x_off,y_off):
  ac = env.action_space.noop()
  ac.update(dict(camera=[y_off, x_off]))
  obs, reward, done, info = env.step(ac)
  im = obs["pov"][100:250, 200:400,:]
  cv2_imshow(cv2.cvtColor(im, cv2.COLOR_RGB2BGR))
  ac = env.action_space.noop()
  ac.update(dict(camera=[-y_off, -x_off]))  # Move back
  obs, reward, done, info = env.step(ac)

In [None]:
action_step({})
action_step(dict(inventory=[1]))

action_step_calibrate(0, 0)
for x_off in [+0.62, +1.61, +3.22, +5.81, +10.0]:
  print(f"x_off={x_off}")
  action_step_calibrate(x_off,0)
  action_step_calibrate(-x_off,0)
for y_off in [+0.62, +1.61, +3.22, +5.81, +10.0]:
  print(f"y_off={y_off}")
  action_step_calibrate(0, y_off)
  action_step_calibrate(0, -y_off)

action_step(dict(inventory=[1]))  # Put inventory away? = Yes, if it is showing

In [None]:
! nvidia-smi

In [None]:
env.close()

In [None]:
disp.stop();

In [None]:
# THE END! - We'll be using this set-up in the future!