## Setup

### Loading
We first import `nbtschematic` to load `.schematic` files. It uses `nbtlib` to parse the NBT file and stores the data in classes that inherit from `numpy` classes.

In [2]:
from nbtschematic import SchematicFile
sf = SchematicFile.load("schematics/apple.schematic")
sf

<SchematicFile 'Schematic': Schematic({'Blocks': ByteArray([Byte(0), Byte(-97), Byte(0), Byte(-97), Byte(-97), Byte(-97), Byte(0), Byte(-97), Byte(0), Byte(-97), Byte(-97), Byte(-97), Byte(-97), Byte(-97), Byte(-97), Byte(-97), Byte(-97), Byte(-97), Byte(0), Byte(-97), Byte(0), Byte(-97), Byte(-97), Byte(-97), Byte(0), Byte(-97), Byte(0), Byte(0), Byte(0), Byte(0), Byte(0), Byte(35), Byte(0), Byte(0), Byte(0), Byte(0)]), 'Materials': String('Alpha'), 'Data': ByteArray([Byte(0), Byte(14), Byte(0), Byte(14), Byte(14), Byte(14), Byte(0), Byte(14), Byte(0), Byte(14), Byte(14), Byte(14), Byte(14), Byte(14), Byte(14), Byte(14), Byte(14), Byte(14), Byte(0), Byte(14), Byte(0), Byte(14), Byte(14), Byte(14), Byte(0), Byte(14), Byte(0), Byte(0), Byte(0), Byte(0), Byte(0), Byte(13), Byte(0), Byte(0), Byte(0), Byte(0)]), 'TileEntities': List[BlockEntity]([]), 'Entities': List[Entity]([]), 'Length': Short(3), 'WEOffsetX': Int(0), 'WEOffsetY': Int(-2), 'WEOriginZ': Int(26), 'WEOffsetZ': Int(-2), 'Hei

### Parsing
What matters to us is the `blocks` property of the `SchematicFile` object. It is a 3D array of `Block` objects, each of which is a Byte holding the block id. To use it in our model, we convert it into a plain NumPy array (`np.ndarray`). Because the bytes are signed, block ids above 127 show up as negative numbers (for example, 159 is printed as -97).

In [3]:
import numpy as np

np.asarray(sf.blocks)

array([[[  0, -97,   0],
        [-97, -97, -97],
        [  0, -97,   0]],

       [[-97, -97, -97],
        [-97, -97, -97],
        [-97, -97, -97]],

       [[  0, -97,   0],
        [-97, -97, -97],
        [  0, -97,   0]],

       [[  0,   0,   0],
        [  0,  35,   0],
        [  0,   0,   0]]], dtype=int8)
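
If non-negative ids are easier to work with, the same bytes can be reinterpreted as unsigned. The following is a minimal sketch, not part of the original notebook: it uses a small literal array as a stand-in for `np.asarray(sf.blocks)`, and the names `signed_blocks` and `unsigned_blocks` are illustrative.

```python
import numpy as np

# Stand-in for np.asarray(sf.blocks): the first and last slices of the array above.
signed_blocks = np.array(
    [[[0, -97, 0], [-97, -97, -97], [0, -97, 0]],
     [[0, 0, 0], [0, 35, 0], [0, 0, 0]]],
    dtype=np.int8,
)

# Reinterpret the same bytes as unsigned 8-bit integers, so -97 becomes 159.
unsigned_blocks = signed_blocks.view(np.uint8)
print(unsigned_blocks[0, 0].tolist())  # [0, 159, 0]
```

`view` does not copy the data; use `astype(np.uint8)` instead if an independent array is preferred.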

## First model - cube with 9^3 blocks and 16 block types

The first model is a simple cube with 9^3 blocks and 16 block types. 

**Objective:** Get the model to classify blocks correctly. We will only provide schematics of cubes that contain a single block type at random positions: we fill the cube with one block type, then remove some blocks at random.

**Why:** This confirms that we can train the model to classify blocks correctly before we move on to more complex models that classify multi-block structures.
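
The training data described above could be generated along these lines. This is a rough sketch under stated assumptions: `make_sample` and its `keep_prob` parameter are hypothetical names introduced here, removed blocks are replaced with air (id 0), and the fraction of blocks to keep is an arbitrary choice.

```python
import numpy as np

def make_sample(block_id, size=9, keep_prob=0.5, rng=None):
    """Fill a size^3 cube with block_id, then replace random cells with air (0)."""
    rng = np.random.default_rng() if rng is None else rng
    cube = np.full((size, size, size), block_id, dtype=np.uint8)
    # Each cell is kept independently with probability keep_prob (an assumption).
    keep = rng.random(cube.shape) < keep_prob
    cube[~keep] = 0
    return cube

# Example: a cube of dirt (id 3) with roughly half of its blocks removed.
sample = make_sample(block_id=3, rng=np.random.default_rng(0))
print(sample.shape)  # (9, 9, 9)
```

Passing a seeded generator keeps the samples reproducible between runs.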

### Valid blocks

We will only be using the following blocks:
**Air, Dirt, Oak Log, Oak Leaves, Stone Brick, Cobblestone, Glass, Sandstone, Redstone Lamp, Iron Bars, Bricks, Block of Quartz, White Wool, Bookshelf, White Terracotta, Nether Brick**


I already have a schematic file with all these blocks lined up in a row, so we will just load that and use it to get the block ids.

In [8]:
all_blocks_schematic = SchematicFile.load("schematics/allblocks.schematic")
# Reverse the array so that the air block comes first; the order doesn't really matter
blocks = np.asarray(all_blocks_schematic.blocks).flatten()[::-1]
blocks

array([   0,    3,   17,   18,   97,    4,   20,   24,  123,  101,   45,
       -101,   35,   47,  -97,  112], dtype=int8)

### One Hot Encoding the blocks

We will use one-hot encoding to represent the blocks. Each block becomes a vector of length 16 in which every index stands for one block type. The entry at the index of the block's type is set to 1, and all other entries are 0.

**Why:** Block type is a categorical variable, and we need to represent it in a way the model can understand. One-hot encoding is a good way to do this. With a simple integer encoding, the model would treat the block types as ordinal, as if a block type with a larger integer were somehow "greater" than one with a smaller integer. No such ordering exists, so we use one-hot encoding.
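
One possible way to implement this encoding is sketched below. It assumes the block ids have been converted to unsigned values (so -101 and -97 from the `int8` output become 155 and 159); the names `block_ids`, `id_to_class`, and `one_hot` are illustrative and not taken from the notebook.

```python
import numpy as np

# Unsigned ids of the 16 valid blocks, in the order of the `blocks` array above.
block_ids = np.array(
    [0, 3, 17, 18, 97, 4, 20, 24, 123, 101, 45, 155, 35, 47, 159, 112],
    dtype=np.uint8,
)

# Lookup table from block id (0-255) to class index (0-15); -1 marks ids we do not support.
id_to_class = np.full(256, -1, dtype=np.int64)
id_to_class[block_ids] = np.arange(len(block_ids))

def one_hot(cube):
    """Map a cube of unsigned block ids to an array of shape cube.shape + (16,)."""
    classes = id_to_class[cube]
    if (classes < 0).any():
        raise ValueError("cube contains a block id outside the valid set")
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(len(block_ids), dtype=np.float32)[classes]

cube = np.array([[[0, 3], [159, 0]]], dtype=np.uint8)
encoded = one_hot(cube)
print(encoded.shape)  # (1, 2, 2, 16)
```

Going through a class index first keeps the vectors at length 16 instead of 256, and the explicit check surfaces any block that is not in the valid set.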