# CrystalVision

Computer Vision of the *Final Fantasy Trading Card Game (FFTCG)* by Allonte Barakat


## Proposal

Card games have been around since the 9th century AD, a boon of the then-new technology of woodblock printing [(Wilkinson, 1895)](https://zenodo.org/record/1448960). These games have made it across centuries and continents, giving us the recognizable French-suited 52-card deck. Its cards have a relatively simple anatomy: each one has a rank and a suit, and looks relatively the same if rotated 180 degrees.

Introduced in 1993 with *Magic: The Gathering*, trading card games are a great advancement on card games and a great tool for learning deck building, strategy, and dynamic responses [(Turkay et al., 2012)](https://doi.org/10.1016%2Fj.sbspro.2012.06.130). Today, the anatomy of cards in games like the *Final Fantasy Trading Card Game* is complex yet easily recognizable to humans.

As outlined in David Forsyth and Jean Ponce's *Computer Vision: A Modern Approach*, recognition is one of Computer Vision's typical tasks. Recognition comes in several varieties: object classification, identification, and detection. For this project, I will tackle recognition as fully as possible. Beginning with the classification step, I will build different models that use the same inputs but map to different outputs based on the card's anatomy, such as *element*, *type*, *power*, *artist*, etc. I will also research multi-class classification and see what techniques (if any exist publicly) could be utilized. I hypothesize that the identity of a card can be derived by taking all these features and traversing a graph database. The larger stretch goal of performing object detection on a video (webcam) feed would enable the ultimate real-world applications of playing games across languages or displaying current prices for the purpose of trading.

Another interesting potential, if integrated with AR, could provide more experiential novelties by overlaying clips from the game or an AR avatar over the card. Full-scale recognition, through the use of a CNN (likely a ResNet model), could help with an in medias res AI analysis of potential strategies to aid a player in making the best plays. The environment is so stochastic, at least in comparison to a chess board, that such applications are interesting even in terms of model representations.

## Inspiration


Taking the standard deck as the basis, this person used non-ML detection to identify the cards. Using a simple, stable background, they apply a Gaussian blur to a grayscale image and then find contours after some thresholding. Each contour found is then warped to a standard front-facing card, and its rank and suit are manually compared against stored reference images for the deck to determine which card it is.


This is a very manual effort that is only feasible for a specific deck printing and a small, fixed set of cards. In TCGs, hundreds of new cards are released every few months.

<iframe width="560" height="315" src="https://www.youtube.com/embed/m-QPjO-2IkA" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>

This person has attempted detection in a similar way, applying it to a small pool of *Magic: The Gathering* cards (one set at a time). In essence, by greatly reducing the pool of possible cards, taking the top rows of pixels that would contain the card's *name*, and applying OCR, the card can be identified. This approach relies heavily on legibility (high enough DPI/megapixels) and can be prone to error depending on the OCR technique and/or library.


Once again, this is a very manual effort that cannot differentiate between languages, art styles, or printing processes -- all of which directly affect a card's value.

<iframe width="560" height="315" src="https://www.youtube.com/embed/BZGhRSajybk" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>

## Card Anatomy

The anatomy of a card is its set of identifiable features. With good graphic design, the major features are the most visually prominent. *Final Fantasy Trading Card Game* (FFTCG) has over 2600 unique cards and at least 11 features, of which I would consider at least 5 to be major: *name*, *element*, *type*, *cost*, and *power*. These would be the primary features I would train on, while more minor features such as *illustrator* would be useful to break identification ties.

### Name
In the upper portion of the card, on top of stylized boxes to mirror the card type, is the name of the card.

### Element
In FFTCG, there are 8 primary elements: *lightning*, *fire*, *wind*, *ice*, *earth*, *water*, *light*, and *dark*. Element is often one of the biggest considerations in deck building and in evolving a strategy. Some cards can be more than one element at the same time; for simplicity, I will consider leaving these out of the dataset for this project.

### Cost
While you can see the element from the color of the frame, the symbol behind the text, and the color of the crystal in the upper-left corner, the cost can only be seen in the number in the upper-left corner.

<table>
    <tr>
        <td><img src="https://fftcg.cdn.sewest.net/images/cards/full/5-019L_eg.jpg" /></td>
        <td><img src="https://fftcg.cdn.sewest.net/images/cards/full/8-035H_eg.jpg" /></td>
        <td><img src="https://fftcg.cdn.sewest.net/images/cards/full/18-040H_eg.jpg" /></td>
        <td><img src="https://fftcg.cdn.sewest.net/images/cards/full/2-093H_eg.jpg" /></td>
    </tr>
    <tr>
        <td><img src="https://fftcg.cdn.sewest.net/images/cards/full/6-096C_eg.jpg" /></td>
        <td><img src="https://fftcg.cdn.sewest.net/images/cards/full/18-097R_eg.jpg" /></td>
        <td><img src="https://fftcg.cdn.sewest.net/images/cards/full/8-133H_eg.jpg" /></td>
        <td><img src="https://fftcg.cdn.sewest.net/images/cards/full/11-130L_eg.jpg" /></td>
    </tr>
</table>

### Type
The main card *types* are *forward*, *backup*, *summon*, and *monster*. In general, no cards currently exist that are more than one type at the same time. The closest are *monsters* that can become a *forward* through some effect written on the card.

<table>
    <tr>
        <td><img src="https://fftcg.cdn.sewest.net/images/cards/full/18-026L_eg.jpg" /></td>
        <td><img src="https://fftcg.cdn.sewest.net/images/cards/full/2-041H_eg.jpg" /></td>
        <td><img src="https://fftcg.cdn.sewest.net/images/cards/full/9-025H_eg.jpg" /></td>
        <td><img src="https://fftcg.cdn.sewest.net/images/cards/full/16-038H_eg.jpg" /></td>
    </tr>
</table>

### Power

Power always exists on *forward* types and on some *monster* types. The two use different font effects to signify whether the type always has the feature or only sometimes has it.

<table>
    <tr>
        <td><img src="https://fftcg.cdn.sewest.net/images/cards/full/16-007R_eg.jpg" /></td>
        <td><img src="https://fftcg.cdn.sewest.net/images/cards/full/16-010H_eg.jpg" /></td>
    </tr>
</table>


### Corner Cases
As the game progresses, it is entirely possible to run into the corner case of identifying all the same features yet being unable to do the final step of resolving to a unique ID (such as 1-210S and 10-101L below). Some other uniqueness identifier would then need to be used, perhaps a minor feature such as *illustrator* or *job*.

<table>
    <tr>
        <td><img src="https://fftcg.cdn.sewest.net/images/cards/full/1-210S_eg.jpg" /></td>
        <td><img src="https://fftcg.cdn.sewest.net/images/cards/full/10-101L_eg.jpg" /></td>
    </tr>
</table>
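The tie-breaking idea can be sketched as a two-stage filter: match on the major features first, and consult a minor feature only when more than one card remains. This is a minimal sketch under stated assumptions. The field names follow the card browser's record format, but the records themselves are hypothetical placeholders (not real cards), the `resolve_codes` helper is invented for illustration, and *job* is used as the tie-breaker simply because it appears in the record format.

```python
MAJOR_FEATURES = ("Name_EN", "Element", "Type_EN", "Cost", "Power")


def resolve_codes(records, predicted, tie_breakers=("Job_EN",)):
    """Return the codes of all records consistent with the predicted features."""
    candidates = [
        record for record in records
        if all(record.get(key) == predicted[key]
               for key in MAJOR_FEATURES if key in predicted)
    ]
    for key in tie_breakers:
        if len(candidates) <= 1:
            break  # already unique (or nothing matched)
        if key in predicted:
            narrowed = [r for r in candidates if r.get(key) == predicted[key]]
            # Keep the wider set if the minor prediction matches nothing (likely a misread).
            candidates = narrowed or candidates
    return [record["Code"] for record in candidates]


# Hypothetical records for illustration only; element names are simplified to English.
records = [
    {"Code": "X-001", "Name_EN": "Example Hero", "Element": "fire",
     "Type_EN": "Forward", "Cost": "3", "Power": "7000", "Job_EN": "Warrior"},
    {"Code": "X-002", "Name_EN": "Example Hero", "Element": "fire",
     "Type_EN": "Forward", "Cost": "3", "Power": "7000", "Job_EN": "Knight"},
    {"Code": "X-003", "Name_EN": "Example Mage", "Element": "ice",
     "Type_EN": "Backup", "Cost": "2", "Power": "", "Job_EN": "Mage"},
]

major_only = {"Name_EN": "Example Hero", "Element": "fire",
              "Type_EN": "Forward", "Cost": "3", "Power": "7000"}
print(resolve_codes(records, major_only))                          # ['X-001', 'X-002']
print(resolve_codes(records, {**major_only, "Job_EN": "Knight"}))  # ['X-002']
```

Returning a list rather than a single code keeps the ambiguity visible, so a caller can decide whether to ask for another minor feature or report several candidates.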

## Data

To support FFTCG, Square Enix has a [*Card Browser*](https://fftcg.square-enix-games.com/na/card-browser) on its site. From here we can pull official images (in 429x600 full-size or 179x250 thumb-size) in various languages and printings. By pooling and training on such data, it is possible to stay current with each new set or special printing. Moreover, the exposed API allows us to use their feature-query data. For example:

```json
{
    "Code": "12-002H",
    "Element": "\u706b",
    "Rarity": "H",
    "Cost": "3",
    "Power": "",
    "Category_1": "FFEX",
    "Category_2": "",
    "Multicard": "",
    "Ex_Burst": "",
    "Name": "\u30a2\u30de\u30c6\u30e9\u30b9",
    "Type": "\u53ec\u559a\u7363",
    "Job": "",
    "Text": "\u30aa\u30fc\u30c8\u30a2\u30d3\u30ea\u30c6\u30a31\u3064\u3092\u9078\u3076\u3002\u305d\u308c\u306e\u52b9\u679c\u3092\u7121\u52b9\u306b\u3059\u308b\u3002\u305d\u308c\u3092\u767a\u52d5\u3057\u3066\u3044\u308b\u306e\u304c\u30d5\u30a9\u30ef\u30fc\u30c9\u306e\u5834\u5408\u3001\u305d\u306e\u30d5\u30a9\u30ef\u30fc\u30c9\u306b8000\u30c0\u30e1\u30fc\u30b8\u3092\u4e0e\u3048\u308b\u3002",
    "Name_EN": "Amaterasu",
    "Type_EN": "Summon",
    "Job_EN": "",
    "Text_EN": "Choose 1 auto-ability. Cancel its effect. If the cancelled auto-ability triggered from a Forward, deal that Forward 8000 damage.",
    "Name_DE": "Amaterasu",
    "Type_DE": "Beschw\u00f6rung",
    "Job_DE": "",
    "Text_DE": "W\u00e4hle 1 Auto-F\u00e4higkeit aus. Annulliere deren Effekt. Falls die annullierte Auto-F\u00e4higkeit die eines St\u00fcrmers war, f\u00fcge diesem St\u00fcrmer 8000 Schaden zu.",
    "Name_ES": "Amaterasu",
    "Type_ES": "Invocaci\u00f3n",
    "Job_ES": "",
    "Text_ES": "Elige 1 habilidad de apoyo. Cancela su efecto. Si la habilidad de apoyo cancelada pertenec\u00eda a un Delantero, infl\u00edgele a ese Delantero 8000 puntos de da\u00f1o.",
    "Name_FR": "Amaterasu",
    "Type_FR": "Invocation",
    "Job_FR": "",
    "Text_FR": "Choisissez 1 comp\u00e9tence auto. Annulez son effet. Si la comp\u00e9tence auto appartenait \u00e0 un Avant, infligez 8000 points de d\u00e9g\u00e2ts \u00e0 cet Avant.",
    "Name_IT": "Amaterasu",
    "Type_IT": "Evocazione",
    "Job_IT": "",
    "Text_IT": "Scegli 1 autoabilit\u00e0. Annullane l'effetto. Se l'autoabilit\u00e0 annullata apparteneva a un'Avanguardia, infliggi 8000 punti di danno a quell'Avanguardia.",
    "Set": "Opus XII",
    "Text_NA": "Choose 1 auto-ability. Cancel its effect. If the cancelled auto-ability triggered from a Forward, deal that Forward 8000 damage.",
    "Job_NA": "",
    "Type_NA": "Summon",
    "Name_NA": "Amaterasu",
    "images": {
        "thumbs": [
            "https://fftcg.cdn.sewest.net/images/cards/thumbs/12-002H_eg.jpg",
            "https://fftcg.cdn.sewest.net/images/cards/thumbs/12-002H_eg_FL.jpg",
            "https://fftcg.cdn.sewest.net/images/cards/thumbs/12-002H_de.jpg",
            "https://fftcg.cdn.sewest.net/images/cards/thumbs/12-002H_es.jpg",
            "https://fftcg.cdn.sewest.net/images/cards/thumbs/12-002H_fr.jpg",
            "https://fftcg.cdn.sewest.net/images/cards/thumbs/12-002H_it.jpg",
            "https://fftcg.cdn.sewest.net/images/cards/thumbs/12-002H_de_FL.jpg",
            "https://fftcg.cdn.sewest.net/images/cards/thumbs/12-002H_es_FL.jpg",
            "https://fftcg.cdn.sewest.net/images/cards/thumbs/12-002H_fr_FL.jpg",
            "https://fftcg.cdn.sewest.net/images/cards/thumbs/12-002H_it_FL.jpg"
        ],
        "full": [
            "https://fftcg.cdn.sewest.net/images/cards/full/12-002H_eg.jpg",
            "https://fftcg.cdn.sewest.net/images/cards/full/12-002H_eg_FL.jpg"
        ]
    }
}
```
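A record like the one above could be turned into training labels with a small helper such as the following. This is only a sketch, and several things in it are assumptions rather than documented API behaviour. The kanji-to-English element table is confirmed by the sample only for 火 (fire). The image-name pattern `<code>_<language>[_<VARIANT>].jpg` is inferred from the sample URLs. The reading of `FL` as a foil printing is a guess. The helper names `labels_from_record` and `describe_image` are hypothetical.

```python
import json
import re

# Assumption: only "火" (fire) is confirmed by the sample record; the other
# entries are the usual kanji for each element and may not match the API.
ELEMENT_EN = {
    "火": "fire", "氷": "ice", "風": "wind", "土": "earth",
    "雷": "lightning", "水": "water", "光": "light", "闇": "dark",
}

# Assumption: image names follow <code>_<language>[_<VARIANT>].jpg, as in the sample URLs.
IMAGE_NAME = re.compile(
    r"/(?P<code>[^/_]+)_(?P<lang>[a-z]+)(?:_(?P<variant>[A-Z]+))?\.jpg$"
)


def labels_from_record(record):
    """Extract the major-feature labels from one card record."""
    return {
        "code": record["Code"],
        # Fall back to the raw value for anything not in the table.
        "element": ELEMENT_EN.get(record["Element"], record["Element"]),
        "type": record["Type_EN"],
        "cost": int(record["Cost"]),
        # Cards without power (such as summons) carry an empty string.
        "power": int(record["Power"]) if record["Power"] else None,
    }


def describe_image(url):
    """Split an image URL into (card code, language code, variant or None)."""
    match = IMAGE_NAME.search(url)
    if match is None:
        raise ValueError(f"unrecognized image URL: {url}")
    return match.group("code"), match.group("lang"), match.group("variant")


# A trimmed copy of the sample record above.
sample = json.loads(
    r'{"Code": "12-002H", "Element": "\u706b", "Cost": "3", "Power": "", "Type_EN": "Summon"}'
)
labels = labels_from_record(sample)
print(labels["element"], labels["cost"], labels["power"])  # fire 3 None
print(describe_image("https://fftcg.cdn.sewest.net/images/cards/thumbs/12-002H_de_FL.jpg"))
```

Keeping the language and variant from the URL alongside the labels would let one card code map to several training images, one per language or printing.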

## Technologies

The programming language to be used in this project is *Python 3.10.x*.

For data analytics, I will use *pandas* (and *numpy*) libraries as a way to query the API data and traverse identifier information to a unique entry.

For image processing, I will use *opencv* (and *pillow*) libraries to preprocess image data if needed (such as downsampling).

For machine learning, I will use *tensorflow* and *keras* libraries to develop a simple CNN for the various identifier models. If the models prove correct (and time allows), a ResNet model would be used so that it can be easily plugged into OpenCV, making object detection possible. *sklearn* will be useful in splitting training and test data evenly amongst all (or as many as possible) unique major identifying features.
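To illustrate what an even split across a feature means, here is a minimal pure-Python sketch of a stratified split on a single label (the *element*). It is a stand-in for illustration, not the project's actual code: in practice `sklearn.model_selection.train_test_split` with its `stratify` argument serves the same purpose. The `stratified_split` helper and the sample file names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_split(samples, label_of, test_fraction=0.2, seed=0):
    """Split samples so each label keeps roughly the same share in train and test."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for sample in samples:
        groups[label_of(sample)].append(sample)

    train, test = [], []
    for label in sorted(groups):
        group = groups[label]
        rng.shuffle(group)
        # Labels with a single sample stay in training; otherwise hold out at least one.
        n_test = 0 if len(group) < 2 else max(1, round(len(group) * test_fraction))
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


ELEMENTS = ["lightning", "fire", "wind", "ice", "earth", "water", "light", "dark"]
# Hypothetical (file name, element) pairs: 10 images per element.
samples = [(f"{element}_{i}.jpg", element) for element in ELEMENTS for i in range(10)]

train, test = stratified_split(samples, label_of=lambda sample: sample[1])
print(len(train), len(test))  # 64 16
print(Counter(label for _, label in test))
```

Stratifying on a combination of features (for example element and type together) works the same way by having `label_of` return a tuple, although rare combinations will then end up with very few held-out samples.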

To explore new unseen data (i.e. real-world cropped images), I will need to pull images from the internet. Some could be in languages, art, and/or printings (such as foil printings) that I have not trained on.
