# Octopath Traveler Weakness Analysis

## Motivation

Octopath Traveler is a video game. Every enemy has a set of weaknesses, and hitting a weakness deals 1.5x damage (when the enemy is not broken). Enemies also have shields, which can only be broken by hitting weaknesses. When an enemy's shield is broken, the enemy loses its remaining action(s) for the current turn and all of its actions for the next turn. Until it recovers its shield, the enemy is also in a "broken" state, in which it takes 2x damage.
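
As a rough sketch of the damage rules described above, the multipliers could be written as a small function. The function name and the base damage value are made up for illustration, and the sketch assumes the broken-state bonus replaces the weakness bonus instead of stacking with it, which is how the description reads.

```python
def damage_multiplier(hits_weakness: bool, is_broken: bool) -> float:
    """Return the damage multiplier for one hit.

    Assumption: the 2x broken bonus replaces the 1.5x weakness bonus
    rather than stacking with it.
    """
    if is_broken:
        return 2.0
    if hits_weakness:
        return 1.5
    return 1.0


base_damage = 100  # hypothetical base damage, for illustration only
for hits_weakness, is_broken in [(False, False), (True, False), (True, True)]:
    total = base_damage * damage_multiplier(hits_weakness, is_broken)
    print(hits_weakness, is_broken, total)
```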

In this project, I will create a CNN to predict enemy weaknesses from their sprites.

Of course, this is hardly the most pragmatic topic, and I am not arguing that this is going to be extremely helpful (considering datamines for every enemy weakness already exist), but this can serve as a good introduction to CNNs and transfer learning.

## Code - Part 1: Representing data

I will take EnemyDB, join it with the CharacterResourceDB table, and associate each enemy with a set of weaknesses and a sprite.

In [1]:
import pandas as pd
import sklearn
import numpy as np
import sys
import os
import json
import re

In [2]:
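def load_db(path):
    # each export is a one-element list; close the file once it is parsed
    with open(path, encoding="utf8") as f:
        return json.load(f)[0]

enemyDB = load_db('EnemyDB.json')
resourceDB = load_db('CharacterResourceDB.json')
textDB = load_db('GameTextEN.json')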

del enemyDB['export_type']
del resourceDB['export_type']
del textDB['export_type']

In [3]:
enemydf = pd.DataFrame.from_dict(enemyDB).transpose()
resourcedf = pd.DataFrame.from_dict(resourceDB).transpose()
textdf = pd.DataFrame.from_dict(textDB).transpose()

In [4]:
cols = enemydf.columns

Every column except the enemy name, the resource label, and the two resistance columns should be dropped.

In [5]:
enemydf = enemydf.drop(cols.drop(['DisplayNameID_151_570F1B1240B0BB42C31AA7A4F6CFAF72','ResourceLabel_295_2D43957442A6F48343BB639C79635FAE','AttributeResist_282_87881AFB4F4DD148EA692D8F63B73E16','WeaponResist_283_0BB9356646F9B928F5FC1584A458ABBC']), axis=1)

Now we remove every row in which any of the remaining columns is 'None'.

In [6]:
enemydf = enemydf.mask(enemydf.eq('None')).dropna()

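Now we encode each weakness as its own 0/1 column. Strictly speaking this is a multi-hot encoding, since one enemy can have several weaknesses at once.

Attributes go: Fire, Ice, Lightning, Wind, Light, Dark.

Weapons go: Sword, Spear, Dagger, Axe, Bow, Staff.

Each resistance list also has a seventh slot (first in the attribute list, last in the weapon list). The code below labels these Magic and Physical and drops them.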

In [7]:
enemydf.index.name = 'Enemy_ID'
enemydf.head()

Enemy_ID,DisplayNameID_151_570F1B1240B0BB42C31AA7A4F6CFAF72,ResourceLabel_295_2D43957442A6F48343BB639C79635FAE,AttributeResist_282_87881AFB4F4DD148EA692D8F63B73E16,WeaponResist_283_0BB9356646F9B928F5FC1584A458ABBC
ENE_BOS_WAR_C01_010,ENN_BOS_WAR_C01_010,ENE_BOS_WAR_C01_010,"[EATTRIBUTE_RESIST::eNONE, EATTRIBUTE_RESIST::...","[EATTRIBUTE_RESIST::eWEAK, EATTRIBUTE_RESIST::..."
ENE_BOS_WAR_C01_020,ENN_BOS_WAR_C01_020,ENE_BOS_WAR_C01_020,"[EATTRIBUTE_RESIST::eNONE, EATTRIBUTE_RESIST::...","[EATTRIBUTE_RESIST::eWEAK, EATTRIBUTE_RESIST::..."
ENE_BOS_WAR_C02_010,ENN_BOS_WAR_C02_010,ENE_BOS_WAR_C02_010,"[EATTRIBUTE_RESIST::eNONE, EATTRIBUTE_RESIST::...","[EATTRIBUTE_RESIST::eNONE, EATTRIBUTE_RESIST::..."
ENE_BOS_WAR_C02_020,ENN_BOS_WAR_C02_020,ENE_BOS_WAR_C02_020,"[EATTRIBUTE_RESIST::eNONE, EATTRIBUTE_RESIST::...","[EATTRIBUTE_RESIST::eNONE, EATTRIBUTE_RESIST::..."
ENE_BOS_WAR_C04_010,ENN_BOS_WAR_C04_010,ENE_BOS_WAR_C04_010,"[EATTRIBUTE_RESIST::eNONE, EATTRIBUTE_RESIST::...","[EATTRIBUTE_RESIST::eWEAK, EATTRIBUTE_RESIST::..."


In [8]:
enemydf[['Magic', 'Fire','Ice','Lightning','Wind','Light','Dark']] = pd.DataFrame(enemydf['AttributeResist_282_87881AFB4F4DD148EA692D8F63B73E16'].tolist(), index=enemydf.index)
enemydf[['Sword', 'Spear', 'Dagger', 'Axe', 'Bow', 'Staff','Physical']] = pd.DataFrame(enemydf['WeaponResist_283_0BB9356646F9B928F5FC1584A458ABBC'].tolist(), index=enemydf.index)

enemydf = enemydf.drop(['AttributeResist_282_87881AFB4F4DD148EA692D8F63B73E16',	'WeaponResist_283_0BB9356646F9B928F5FC1584A458ABBC','Magic','Physical'],axis=1)
enemydf = enemydf.rename({'DisplayNameID_151_570F1B1240B0BB42C31AA7A4F6CFAF72':'DisplayName','ResourceLabel_295_2D43957442A6F48343BB639C79635FAE':'ResourceLabel'},axis=1)

In [9]:
def find_if_possible(name):
    # look up the English display text; fall back to 'None' when it is missing
    try:
        return textdf.loc[name]['Text']['string']
    except (KeyError, TypeError):
        return 'None'

enemydf['DisplayName'] = enemydf['DisplayName'].apply(find_if_possible)
enemydf = enemydf.drop(enemydf[enemydf['DisplayName'].str.contains('None')].index)

In [10]:
mapping = {"EATTRIBUTE_RESIST::eWEAK":1,"EATTRIBUTE_RESIST::eNONE":0}
enemydf = enemydf.replace(mapping)
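
To make the encoding concrete, here is a pure-Python sketch of what cells 8 and 10 do to a single enemy. The `encode` helper and the sample resistance lists are illustrative and do not come from the game data. Unlike the `replace` call above, this sketch maps every value other than `eWEAK` to 0.

```python
ATTRIBUTES = ["Magic", "Fire", "Ice", "Lightning", "Wind", "Light", "Dark"]
WEAPONS = ["Sword", "Spear", "Dagger", "Axe", "Bow", "Staff", "Physical"]
WEAK = "EATTRIBUTE_RESIST::eWEAK"
NONE = "EATTRIBUTE_RESIST::eNONE"


def encode(resists, names, skip):
    """Map a 7-entry resist list to {name: 0 or 1}, leaving out one slot."""
    return {
        name: int(value == WEAK)  # anything that is not eWEAK becomes 0
        for name, value in zip(names, resists)
        if name != skip
    }


# Hypothetical enemy that is weak to Fire, Sword and Bow.
attribute_resist = [NONE, WEAK, NONE, NONE, NONE, NONE, NONE]
weapon_resist = [WEAK, NONE, NONE, NONE, WEAK, NONE, NONE]

labels = {
    **encode(attribute_resist, ATTRIBUTES, skip="Magic"),
    **encode(weapon_resist, WEAPONS, skip="Physical"),
}
print(labels)
```

The resulting 12 values are in the same column order as the table that follows.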

In [11]:
enemydf = enemydf.reset_index()
enemydf.head()

,Enemy_ID,DisplayName,ResourceLabel,Fire,Ice,Lightning,Wind,Light,Dark,Sword,Spear,Dagger,Axe,Bow,Staff
0,ENE_BOS_WAR_C01_010,Ritsu Mishuyo,ENE_BOS_WAR_C01_010,1,0,0,0,0,0,1,0,0,0,1,0
1,ENE_BOS_WAR_C01_020,Ritsu's Footman,ENE_BOS_WAR_C01_020,1,0,0,0,0,0,1,1,0,0,1,0
2,ENE_BOS_WAR_C02_010,Bandelam the Reaper,ENE_BOS_WAR_C02_010,1,0,0,0,1,0,0,1,1,1,0,0
3,ENE_BOS_WAR_C02_020,Borneau,ENE_BOS_WAR_C02_020,0,0,0,0,0,0,0,0,0,0,0,0
4,ENE_BOS_WAR_C04_010,Rai Mei,ENE_BOS_WAR_C04_010,1,0,0,0,1,0,1,0,0,0,0,1


Now we combine this with resourcedf to look up the icon path for each enemy.

In [12]:
def find_asset_path(name):
    # look up the icon asset path; fall back to 'None' when it is missing
    try:
        return resourcedf.loc[name]['ActionOrderIconL']['asset_path_name']
    except (KeyError, TypeError):
        return 'None'

enemydf['asset_path'] = enemydf['ResourceLabel'].apply(find_asset_path)
enemydf = enemydf.drop(enemydf[enemydf['asset_path'] == 'None'].index)
enemydf = enemydf.drop('ResourceLabel',axis=1)

In [13]:
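# The asset path appears to end in 'Name.Name' after a 57-character directory prefix:
# drop the prefix, then keep the first half of the remainder to get 'Name'.
enemydf['asset_path'] = enemydf['asset_path'].apply(lambda x: ".\\enemyicons\\" + (x[57:])[:int(len(x[57:])/2)] + '.png')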
# edge case:
enemydf = enemydf.drop(enemydf[enemydf['asset_path']=='.\\enemyicons\\UiTX_Battle_Oder_Select_Ritsu.png'].index)
enemydf.reset_index(drop=True, inplace=True)
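
The slicing in cell 13 depends on a fixed prefix length of 57. A sketch of the same idea without the magic number is shown below. It assumes the asset path has the form `Dir/Name.Name`, which is inferred from the "keep the first half" step and not confirmed by the data shown here. The helper name and the example path are hypothetical.

```python
def icon_file(asset_path: str, icon_dir: str = ".\\enemyicons\\") -> str:
    """Turn a 'Dir/Name.Name'-style asset path into a local PNG path."""
    last_segment = asset_path.rsplit("/", 1)[-1]  # 'Name.Name'
    asset_name = last_segment.split(".", 1)[0]    # 'Name'
    return icon_dir + asset_name + ".png"


# Hypothetical asset path; the real directory prefix is not shown here.
example = "/Game/Hypothetical/Dir/UiTX_Example_Icon.UiTX_Example_Icon"
print(icon_file(example))
```

Splitting on the separators keeps working if the directory prefix ever changes length, which the fixed offset does not.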

In [14]:
enemydf.to_csv('dataset.csv')

## Code - Part 2: CNN creation

In [28]:
import pandas as pd
import sklearn
import numpy as np
import sys
import os
import json
import re
from PIL import Image

import tensorflow as tf
import keras

df = pd.read_csv('dataset.csv')

Confirming all images are of the same size

In [8]:
size = (0, 0)
for i in df['asset_path']:
    if os.path.exists(i):
        # open each image once and close it again afterwards
        with Image.open(i) as img:
            if img.size != size:
                print(img.size)
                size = img.size
    else:
        # report any icon file that is missing on disk
        print(i)

(128, 128)


In [9]:
tf.test.is_built_with_cuda()

False

In [10]:
tf.config.list_physical_devices()

[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]

In [11]:
tf.config.list_logical_devices()

[LogicalDevice(name='/device:CPU:0', device_type='CPU')]

I have an AMD GPU, so I can't do much about this ):

For now I will continue writing the code as if CUDA were available, and at a later date I will train on a GPU from UMIACS once I figure out how to connect to it.

In [29]:
import keras.utils as image

def arrange_data(dfin):
    
    image_data = []
    img_paths = np.asarray(dfin['asset_path'])
    
    for i in range(len(img_paths)):
        img = image.load_img(img_paths[i], target_size=(128, 128))  # target_size is (height, width)
        img = image.img_to_array(img)
        img = img/255
        image_data.append(img)
        
        
    X = np.array(image_data)
    Y = np.array(dfin[['Fire','Ice','Lightning','Wind','Light','Dark','Sword','Spear','Dagger','Axe','Bow','Staff']])
    
    print("Shape of images:", X.shape)
    print("Shape of labels:", Y.shape)
    
    return X, Y

from sklearn.model_selection import train_test_split
train, test = train_test_split(df, test_size=0.1)



x_train, y_train = arrange_data(dfin = train)


x_test, y_test = arrange_data(dfin = test)

Shape of images: (640, 128, 128, 3)
Shape of labels: (640, 12)
Shape of images: (72, 128, 128, 3)
Shape of labels: (72, 12)
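
The printed shapes can be checked by hand. The sketch below assumes `train_test_split` rounds the test share up to a whole sample, which matches the 640/72 split above. It also works out the number of steps per epoch for the batch size of 64 used in the training cell.

```python
import math

n_samples = 712   # 640 + 72, from the shapes printed above
test_size = 0.1
batch_size = 64   # the batch size used in the training cell

n_test = math.ceil(test_size * n_samples)  # test share is rounded up
n_train = n_samples - n_test
steps_per_epoch = n_train // batch_size    # full batches per epoch

print(n_train, n_test, steps_per_epoch)
```

With only 72 held-out sprites, any validation metric will be fairly noisy.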


In [31]:
from keras.applications import VGG16
from keras import models, layers, optimizers
from keras.preprocessing.image import ImageDataGenerator

vgg_conv = VGG16(weights='imagenet', include_top=False, input_shape=(128, 128, 3))

# freeze every layer except the last 4
for layer in vgg_conv.layers[:-4]:
    layer.trainable = False


model = models.Sequential()

model.add(vgg_conv)

num_classes = 12

model.add(layers.Flatten())
model.add(layers.Dense(1024, activation='relu'))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(num_classes, activation='sigmoid'))

model.summary()

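# multi-label output: track per-label binary accuracy instead of the default categorical accuracy
model.compile(optimizer=optimizers.RMSprop(learning_rate=1e-4), loss='binary_crossentropy', metrics=['binary_accuracy'])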

EPOCHS=50
BS = 64

# data augmentation to reduce overfit
aug = ImageDataGenerator(rotation_range=20, zoom_range=0.15,width_shift_range=0.2, height_shift_range=0.2, shear_range=0.15,horizontal_flip=True, fill_mode="nearest")

# model.fit accepts generators directly; fit_generator is deprecated
history = model.fit(aug.flow(x_train, y_train, batch_size=BS), validation_data=(x_test, y_test), steps_per_epoch=len(x_train) // BS, epochs=EPOCHS)

model.save('Model_4d.h5')

Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 vgg16 (Functional)          (None, 4, 4, 512)         14714688  
                                                                 
 flatten_2 (Flatten)         (None, 8192)              0         
                                                                 
 dense_4 (Dense)             (None, 1024)              8389632   
                                                                 
 dropout_2 (Dropout)         (None, 1024)              0         
                                                                 
 dense_5 (Dense)             (None, 12)                12300     
                                                                 
Total params: 23,116,620
Trainable params: 15,481,356
Non-trainable params: 7,635,264
_________________________________________________________________
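
The parameter counts in this summary can be reproduced with a little arithmetic. The sketch below assumes the four unfrozen VGG16 layers are three 3x3 convolutions with 512 input and output channels followed by a pooling layer with no parameters. The helper functions are illustrative.

```python
def conv_params(kernel, in_channels, out_channels):
    """Weights plus one bias per output channel of a 2D convolution."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(inputs, units):
    """Weights plus one bias per unit of a fully connected layer."""
    return inputs * units + units


flatten_size = 4 * 4 * 512  # VGG16 output shape from the summary above
head = dense_params(flatten_size, 1024) + dense_params(1024, 12)

# Assumption: the last four VGG16 layers are three 3x3, 512-channel
# convolutions followed by a parameter-free pooling layer.
unfrozen_vgg = 3 * conv_params(3, 512, 512)

vgg_total = 14_714_688      # from the summary above
trainable = head + unfrozen_vgg
total = vgg_total + head

print(trainable, total - trainable, total)
```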
Epoch 1/50
...
Epoch 50/50
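
Since the model ends in 12 sigmoid units trained with `binary_crossentropy`, each weakness is treated as an independent yes/no label. The following pure-Python sketch shows how that loss is computed for one sample and how a prediction could be turned back into weakness names. The 0.5 threshold, the helper names, and the sample numbers are assumptions for illustration and are not real model output.

```python
import math

LABELS = ["Fire", "Ice", "Lightning", "Wind", "Light", "Dark",
          "Sword", "Spear", "Dagger", "Axe", "Bow", "Staff"]


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over all labels of one sample."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)


def predicted_weaknesses(y_pred, threshold=0.5):
    """Names of the labels whose sigmoid output reaches the threshold."""
    return [name for name, p in zip(LABELS, y_pred) if p >= threshold]


# Hypothetical sigmoid outputs for one sprite, not real model output.
y_true = [1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0]
y_pred = [0.9, 0.1, 0.2, 0.1, 0.4, 0.1, 0.8, 0.3, 0.1, 0.2, 0.6, 0.1]

print(predicted_weaknesses(y_pred))
print(round(binary_crossentropy(y_true, y_pred), 4))
```

Because most labels are 0 for most enemies, a model that predicts "no weakness" everywhere can still reach a low loss, so the per-label predictions are worth inspecting directly.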


I'll let this run to completion for the sake of presentation, but it trains ONLY on the CPU (no CUDA), which is very slow, and the results are absolutely terrible.