This notebook establishes a reliable method for transferring batches of data to and from Google Drive and for storing metadata in an accessible format.

John Marangola
11/2/2021

We begin by sketching out how the data should be stored clearly and efficiently:

To standardize on a simple convention, we define an enum for the pieces on the chess board.

In [28]:
import pandas as pd
import numpy as np
from enum import Enum

class ChessPiece(Enum):
    PAWN = 1
    ROOK = 2
    KNIGHT = 3
    KING = 4
    QUEEN = 5
    BISHOP = 6
    EMPTY = 7
     
piece = ChessPiece.PAWN
if piece is ChessPiece.PAWN:
    print("This is a pawn!")
if piece != ChessPiece.KNIGHT:
    print("Not a knight!")




This is a pawn!
Not a knight!


To avoid relying on ambiguous conventions such as T/F for the color of a square or a piece (which take time to remember and make debugging hard), we use a similar enum for the colors of things (pieces and squares).

In [9]:
class Color(Enum):
    ORANGE = 1
    BLUE = 2
    BLACK = 3
    WHITE = 4

piece_1_color = Color.ORANGE
piece_2_color = Color.BLUE
# Put the conditional inside print() so that both branches are printed
print("pieces are opponents" if piece_1_color != piece_2_color else "pieces are allies")

pieces are opponents


Now we need a clear convention for labelling positions on the board. If you are unfamiliar with chess, take a look at this image, which visually explains so-called "algebraic" notation:


In [20]:
import urllib.request
from PIL import Image

# The URL serves a PNG rendering of the SVG, so save it with a .png extension
urllib.request.urlretrieve(
    "https://upload.wikimedia.org/wikipedia/commons/thumb/b/b6/SCD_algebraic_notation.svg/1200px-SCD_algebraic_notation.svg.png",
    "SCD_algebraic_notation.png",
)

img = Image.open("SCD_algebraic_notation.png")
img.show()

For the sake of simplicity, we will define positions as "LN", where L is the letter associated with the position and N is the number associated with the position, for example:

In [25]:
position1 = "e2"
position1_alt = "E2"
position1_alt = position1_alt.lower()
print(f"automatic case conversion works: {position1 == position1_alt}")

position2 = "g1"
print(f"position2 equals position1: {position2 == position1}")

automatic case conversion works: True
position2 equals position1: False


This appears to be robust. Since the convention in chess is always <letter><number>, there is no need to worry about forms such as 2e and e2 not being equivalent. Now let's move on to storing all the metadata for a single piece. We decided that the metadata fields that should be recorded for each image are:
    1. Piece type
    2. Piece color (or lack of)
    3. Position
    4. Color of tile

We can therefore define a function that receives these fields as parameters:

In [32]:

# (Skip type validation for now)
def print_metadata(piece_type, piece_color, position, tile_color):
    print(piece_type.name)
    print(f"piece color: {piece_color.name}")
    print(f"position: {position.lower()}")
    print(f"tile color: {tile_color.name}")

piece_color = Color.ORANGE
piece_type = ChessPiece.ROOK
position = "E5"
tile_color = Color.BLACK

print_metadata(piece_type, piece_color, position, tile_color)


ROOK
piece color: ORANGE
position: e5
tile color: BLACK
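
The cell above skips type validation. As a rough sketch of what that validation might look like, the four fields could be bundled into a small dataclass that checks them on construction. The name PieceMetadata, the a-h/1-8 position pattern, and the use of None for a missing piece color are assumptions made for illustration, not something this notebook has settled on. The enums are repeated so that the cell runs on its own.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ChessPiece(Enum):
    PAWN = 1
    ROOK = 2
    KNIGHT = 3
    KING = 4
    QUEEN = 5
    BISHOP = 6
    EMPTY = 7


class Color(Enum):
    ORANGE = 1
    BLUE = 2
    BLACK = 3
    WHITE = 4


@dataclass
class PieceMetadata:  # hypothetical name, for illustration only
    piece_type: ChessPiece
    piece_color: Optional[Color]  # assumption: None means "no piece color"
    position: str
    tile_color: Color

    def __post_init__(self):
        if not isinstance(self.piece_type, ChessPiece):
            raise TypeError("piece_type must be a ChessPiece")
        if self.piece_color is not None and not isinstance(self.piece_color, Color):
            raise TypeError("piece_color must be a Color or None")
        if not isinstance(self.tile_color, Color):
            raise TypeError("tile_color must be a Color")
        if not isinstance(self.position, str):
            raise TypeError("position must be a string")
        # Normalize case, then require <letter a-h><number 1-8>
        normalized = self.position.lower()
        if not re.fullmatch(r"[a-h][1-8]", normalized):
            raise ValueError(f"invalid position: {self.position!r}")
        self.position = normalized


record = PieceMetadata(ChessPiece.ROOK, Color.ORANGE, "E5", Color.BLACK)
print(record.position)
```

With this in place, a swapped position such as "5e" or a plain string passed as a color raises an error at construction time instead of ending up in the saved metadata.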


Because these fields are enums, we can never have any pieces other than {ROOK, KING, QUEEN, KNIGHT, ..., BISHOP} or any colors outside the allowed set. Every record is therefore in a consistent format when saved, and we save space by writing only integers to the csv instead of numerous strings. For instance:

In [37]:
demo_color = Color.BLACK
# "write" operation:
print(demo_color.value)
# Recover the demo color from the integer it was written as:
print(Color(3))


3
Color.BLACK
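
To make the integer round trip concrete, here is a minimal sketch that writes one metadata row to a csv and reads it back. It uses an in-memory buffer as a stand-in for a real file, and the column names and the helper names metadata_to_row and row_to_metadata are hypothetical. The enums are repeated so that the cell runs on its own.

```python
import csv
import io
from enum import Enum


class ChessPiece(Enum):
    PAWN = 1
    ROOK = 2
    KNIGHT = 3
    KING = 4
    QUEEN = 5
    BISHOP = 6
    EMPTY = 7


class Color(Enum):
    ORANGE = 1
    BLUE = 2
    BLACK = 3
    WHITE = 4


# Hypothetical column names, chosen to match the four metadata fields
FIELDS = ["piece_type", "piece_color", "position", "tile_color"]


def metadata_to_row(piece_type, piece_color, position, tile_color):
    """Convert enum members to their integer values for writing."""
    return [piece_type.value, piece_color.value, position.lower(), tile_color.value]


def row_to_metadata(row):
    """Rebuild enum members from a row read back from the csv."""
    piece_type, piece_color, position, tile_color = row
    # The csv reader yields strings, so cast to int before the enum lookup
    return (
        ChessPiece(int(piece_type)),
        Color(int(piece_color)),
        position,
        Color(int(tile_color)),
    )


buffer = io.StringIO()  # in-memory stand-in for a csv file on disk
writer = csv.writer(buffer)
writer.writerow(FIELDS)
writer.writerow(metadata_to_row(ChessPiece.ROOK, Color.ORANGE, "E5", Color.BLACK))

buffer.seek(0)
reader = csv.reader(buffer)
next(reader)  # skip the header row
restored = row_to_metadata(next(reader))
print(restored)
```

An integer that does not correspond to any member (for example Color(9)) raises a ValueError on read, so a corrupted row is caught when it is loaded rather than silently accepted.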
