# BASALT 2022 Inventory Item Classifier

## Background
I am working on potential solutions for the NeurIPS 2022 MineRL BASALT competition (https://www.aicrowd.com/challenges/neurips-2022-minerl-basalt-competition). The goal of this classifier is to extract information about the inventory: specifically, to fill a dictionary with every item the inventory contains. When I play Minecraft, knowing what is in my inventory as I play is very important. Without such a record, my model would have no way of knowing what is in its inventory unless the inventory screen is open.

## Data Mining
To get the data to train this model, I used a list of the items available in Minecraft 1.16. I looped through the list and, for each item at each quantity, started a new gym environment that spawned the player with that item in the inventory. Once I had screenshots of all the inventories from this loop, I cut each one into its individual inventory squares and saved each square as an image file named after the item and its quantity. That dataset feeds directly into the code below.
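
The slicing and naming step above can be sketched as follows. This is a minimal illustration, not the code used to build the dataset: the grid origin, slot size, and grid shape are hypothetical values, and `slot_boxes` and `slot_filename` are names introduced here. The `item-quantity.jpg` pattern is inferred from the loader further down, which keeps only `.jpg` files and takes the label from the text before the first `-`.

```python
def slot_boxes(origin_x, origin_y, slot_size, rows, cols):
    """Return (left, upper, right, lower) boxes for a rows x cols slot grid.

    The boxes follow the convention of PIL's Image.crop, so each one can be
    passed directly to `screenshot.crop(box)`.
    """
    boxes = []
    for row in range(rows):
        for col in range(cols):
            left = origin_x + col * slot_size
            upper = origin_y + row * slot_size
            boxes.append((left, upper, left + slot_size, upper + slot_size))
    return boxes


def slot_filename(item, quantity):
    """Build the file name for one cropped slot, e.g. 'chest-3.jpg'."""
    return f"{item}-{quantity}.jpg"


# Hypothetical layout: a 3 x 9 grid of 18-pixel slots starting at (8, 84).
boxes = slot_boxes(origin_x=8, origin_y=84, slot_size=18, rows=3, cols=9)
name = slot_filename("chest", 3)
label = name.split("-")[0]  # the same parsing the loader applies
```

Keeping the item name free of `-` characters matters here, because the loader splits on the first `-` to recover the label.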

## Model Code

### Imports

In [1]:
import os
import numpy as np
from PIL import Image
from sklearn.naive_bayes import GaussianNB
from detecto import core, utils, visualize
import matplotlib.pyplot as plt
import cv2
import gdown

### Helper Functions

In [2]:
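def create_dataset_items(img_folder):
    """Load every .jpg in img_folder as a float32 array scaled to [0, 1]."""
    img_data_array = []
    class_name = []
    for file in os.listdir(img_folder):
        if file.endswith(".jpg"):
            image_path = os.path.join(img_folder, file)
            image = np.array(Image.open(image_path))
            image = image.astype('float32')
            image /= 255  # scale pixel values from [0, 255] to [0, 1]
            img_data_array.append(image)
            # The label is the part of the file name before the first "-"
            class_name.append(file.split("-")[0])
    return img_data_array, class_name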

In [3]:
def load_item_train_data():
    trainPath = "../assets/datasets/Item Classifier Data/train/"

    xTrainData, yTrainData = create_dataset_items(trainPath)

    # Collect the distinct item names in first-seen order
    uniqueOutputs = []
    for y in yTrainData:
        if y not in uniqueOutputs:
            uniqueOutputs.append(y)

    # Map item names to integer class ids and back
    toNumDict = {name: i for i, name in enumerate(uniqueOutputs)}
    fromNumDict = {i: name for i, name in enumerate(uniqueOutputs)}

    yNumTrain = [toNumDict[y] for y in yTrainData]

    xTrainData = np.array(xTrainData, np.float32)
    y_train = np.array(yNumTrain, np.int64)

    # Flatten each image into a single feature vector
    n_samples_train = len(xTrainData)
    x_train = xTrainData.reshape((n_samples_train, -1))

    return x_train, y_train, fromNumDict, toNumDict
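
To make the two lookup tables concrete, here is a small self-contained sketch of the same encoding on toy labels. `build_label_maps` is a name introduced for illustration; it relies on `dict.fromkeys` keeping first-seen order, which matches the loop in `load_item_train_data`.

```python
def build_label_maps(labels):
    """Return (to_num, from_num) dictionaries for a list of string labels."""
    unique = list(dict.fromkeys(labels))  # de-duplicate, keep first-seen order
    to_num = {name: i for i, name in enumerate(unique)}
    from_num = {i: name for i, name in enumerate(unique)}
    return to_num, from_num


# Toy labels for illustration only
labels = ["chest", "dropper", "chest", "empty"]
to_num, from_num = build_label_maps(labels)
encoded = [to_num[name] for name in labels]    # names -> class ids
decoded = [from_num[i] for i in encoded]       # class ids -> names
```

The `from_num` table is what lets the test cell turn the model's integer predictions back into item names.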

In [4]:
def load_item_test_data():
    testPath = "../assets/datasets/Item Classifier Data/test/"

    xTestData, yTestData = create_dataset_items(testPath)

    xTestData = np.array(xTestData, np.float32)
    y_test = yTestData

    n_samples_test = len(xTestData)
    x_test = xTestData.reshape((n_samples_test, -1))

    return x_test, y_test

### Build Model
Note that I tested four or five different machine learning models outside of this notebook. For this problem, scikit-learn's Gaussian Naive Bayes model worked best, so that is the one used here.

In [5]:
model = GaussianNB()

### Train Model

In [6]:
x_train, y_train, fromNumDict, toNumDict = load_item_train_data()

model.fit(x_train, y_train)
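
Once the model is fitted, its per-slot predictions can be folded into the inventory dictionary described in the Background. The sketch below assumes the predictions have already been decoded to item names, one per inventory slot, and that empty slots carry the `empty` label. `build_inventory` is a hypothetical helper, and it counts occupied slots rather than stack sizes, since the classifier here predicts only the item name.

```python
from collections import Counter


def build_inventory(slot_predictions, empty_label="empty"):
    """Count how many slots hold each predicted item, skipping empty slots."""
    counts = Counter(p for p in slot_predictions if p != empty_label)
    return dict(counts)


# Toy predictions for illustration only
slots = ["chest", "empty", "dropper", "chest", "empty"]
inventory = build_inventory(slots)
```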

### Test Model

In [7]:
x_test, y_test = load_item_test_data()

# Convert predicted class ids back to item names
predicted = []
for pred in model.predict(x_test):
    predicted.append(fromNumDict[pred])

# Print every (true, predicted) mismatch and count the correct predictions
numCorrect = 0
for i in range(0, len(y_test)):
    if y_test[i] != predicted[i]:
        print(y_test[i], predicted[i])
    else:
        numCorrect += 1

print("\nAccuracy of Item Classifier: ", numCorrect/len(y_test))

red_mushroom brown_mushroom
empty stone_pressure_plate
red_mushroom brown_mushroom
tripwire_hook red_mushroom
brown_mushroom red_mushroom
brown_mushroom red_mushroom
dispenser dropper
dispenser dropper
empty tripwire_hook
stone_pressure_plate red_mushroom
dispenser dropper
dispenser dropper
empty stone_pressure_plate
stone_pressure_plate red_mushroom
chest trapped_chest
chest trapped_chest
dispenser dropper
stone_pressure_plate red_mushroom
brown_mushroom red_mushroom
chest trapped_chest
dispenser dropper
dispenser dropper
chest trapped_chest
dispenser dropper
chest trapped_chest
stone_pressure_plate red_mushroom
chest trapped_chest
dispenser dropper
empty tripwire_hook
heavy_weighted_pressure_plate light_weighted_pressure_plate
empty tripwire_hook
brown_mushroom red_mushroom
red_mushroom brown_mushroom
empty stone_pressure_plate
stone_pressure_plate red_mushroom
dispenser dropper
empty stone_pressure_plate
dispenser dropper
dispenser dropper
tripwire_hook red_mushroom
tripwire_hook re
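
A long listing like the one above is easier to read as a tally of (true, predicted) pairs. Here is a minimal sketch using `collections.Counter`. The `mismatches` list holds a few pairs copied from the output above as sample input; in the notebook it would be collected inside the test loop instead of printed.

```python
from collections import Counter

# Sample input: a few (true, predicted) pairs taken from the listing above
mismatches = [
    ("red_mushroom", "brown_mushroom"),
    ("empty", "stone_pressure_plate"),
    ("red_mushroom", "brown_mushroom"),
    ("dispenser", "dropper"),
    ("dispenser", "dropper"),
    ("chest", "trapped_chest"),
]

# Count identical pairs and print them from most to least frequent
pair_counts = Counter(mismatches)
for (true_label, predicted_label), count in pair_counts.most_common():
    print(f"{true_label} -> {predicted_label}: {count}")
```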

### Conclusion
This model was very effective at solving the item classification problem. It scored 98% accuracy in testing, and its practical accuracy is arguably higher, because some of the items it confuses look the same, or nearly so, in the inventory. For example, chests and trapped chests have nearly identical icons, so it isn't fair to penalize the model for mixing them up.
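
One way to put a number on this is to map look-alike items to a shared label before scoring. The sketch below is an illustration on toy data, not a result from this notebook; the `LOOKALIKES` mapping and the `merged_accuracy` helper are assumptions introduced here.

```python
# Hypothetical mapping of visually near-identical items to one shared label
LOOKALIKES = {"trapped_chest": "chest"}


def merged_accuracy(y_true, y_pred, aliases=LOOKALIKES):
    """Accuracy after replacing aliased labels with their shared label."""
    if not y_true:
        raise ValueError("y_true is empty")
    hits = sum(
        aliases.get(t, t) == aliases.get(p, p) for t, p in zip(y_true, y_pred)
    )
    return hits / len(y_true)


# Toy data for illustration only
y_true = ["chest", "dropper", "chest", "empty"]
y_pred = ["trapped_chest", "dropper", "chest", "tripwire_hook"]
score = merged_accuracy(y_true, y_pred)
```

Passing the notebook's `y_test` and `predicted` lists to the same function would give the adjusted score for the real test set.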