
Unicode Analogies

This repo contains code for our CVPR 2023 paper.

Unicode Analogies: An Anti-Objectivist Visual Reasoning Challenge
Steven Spratley, Krista Ehinger, Tim Miller
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023

Analogical reasoning enables agents to extract relevant information from scenes and to navigate them efficiently in familiar ways. While progressive-matrix problems (PMPs) are becoming popular for the development and evaluation of analogical reasoning in computer vision, we argue that the dominant methodology in this area struggles to expose the lack of meaningful generalisation in solvers, and reinforces an objectivist stance on perception – that objects can only be seen one way – which we believe to be counterproductive. In this paper, we introduce the Unicode Analogies (UA) challenge, consisting of polysemic, character-based PMPs to benchmark fluid conceptualisation ability in vision systems. Writing systems have evolved characters at multiple levels of abstraction, from iconic through to symbolic representations, producing images that are visually interrelated yet exceptionally diverse compared to those in existing PMP datasets. Our framework has been designed to challenge models by presenting tasks that are much harder to complete without robust feature extraction, while remaining largely solvable by human participants. We therefore argue that Unicode Analogies elegantly captures and tests for a facet of human visual reasoning that is severely lacking in current-generation AI.

An example of a progressive matrix problem generated by UA.

Repository organisation

In the experiments folder, you will find the official dataset splits, plus the code required to train models and to reproduce all experiments in our paper. In the generation folder, you will find folders containing Unicode characters (fonts, images, and annotations), as well as code to generate new splits.

Running experiments

The dataset_splits folder contains subfolders for each of the five experiments, as they appear in the paper. Each experiment folder contains subfolders for the different splits generated, and each of those folders in turn contains five folds. To reproduce experiments from the paper, we provide .sh files containing all relevant commands. For example, running ./experiment_1.sh will train context-blind, ResNet, and Rel-Base models on all splits and their folds for experiment 1. Once complete, it will write all results to the test_results folder, including the number of problems trained and tested, test accuracies during training, training duration, and a full breakdown of model accuracy over the different problem types/concepts encountered (e.g. gestalt closure, relational position, and ink level, to name a few). Note: if you want to train the SCL model, please extract the two folders in scl.zip to the root folder; the archive includes the version of the authors' library used to implement their model.
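
A minimal driver sketch for running every experiment in sequence, assuming the scripts are named experiment_1.sh through experiment_5.sh and sit in the directory you run from (per the example command above; adjust paths to your checkout):

# Sketch only: invokes each experiment script in turn.
# Assumes five scripts, experiment_1.sh .. experiment_5.sh, in the
# current directory; each writes its results to test_results.
import subprocess

for i in range(1, 6):
    print(f"Running experiment {i} ...")
    subprocess.run(["bash", f"./experiment_{i}.sh"], check=True)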

Creating new datasets

In generation/UA.py, you will find code to import annotated images and generate new UA splits. You have control over the random seed, image size (all Unicode characters have been rendered at 500x500 pixels and are downsampled to 80x80 by default), train-test split ratio, number of problems to generate, hold-out, context shift, extrapolation, k folds, and more. All code for generating the split types used in our paper is included. Once you have familiarised yourself with this code, you should be able to customise it for your own splits. Please refer to our paper for further details on the parameters for generating splits.
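
As a rough illustration only, a custom split might be driven by something like the following; the function name and keyword arguments here are hypothetical stand-ins for the knobs listed above, so consult generation/UA.py for the real interface:

# Illustrative pseudocode: generate_split and its keyword arguments are
# hypothetical names for the parameters described above; see
# generation/UA.py for the actual entry point and argument names.
from generation import UA  # assumed import path

UA.generate_split(
    seed=0,                # random seed
    img_size=80,           # 500x500 renders downsampled to 80x80 by default
    train_ratio=0.8,       # train-test split ratio
    n_problems=10_000,     # number of problems to generate
    k_folds=5,             # number of folds per split
    hold_out=None,         # optional hold-out
    context_shift=False,   # context-shift split type
    extrapolation=False,   # extrapolation split type
)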

Extending the UA framework

UA assembles problems from annotation folders, which means it isn't limited to the images we've sorted. You can generate images of new characters in different fonts (see generation/generate_imgs.py for an example of how to do this), or create your own schema and label other types of images (see the generation/Annotations folder for an example of how to organise images for the program).
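
For instance, a single character can be rendered at 500x500 pixels with Pillow along the following lines. This is a minimal sketch in the spirit of generation/generate_imgs.py, which remains the canonical reference; the font path and output filename are placeholders:

# Minimal sketch: render one Unicode character to a 500x500 greyscale image.
from PIL import Image, ImageDraw, ImageFont

char = "\u16A0"  # RUNIC LETTER FEHU, an arbitrary example
font = ImageFont.truetype("path/to/font.ttf", size=400)  # placeholder path

img = Image.new("L", (500, 500), color=255)  # white canvas
draw = ImageDraw.Draw(img)
# anchor="mm" centres the glyph on the canvas (requires Pillow >= 8.0)
draw.text((250, 250), char, font=font, fill=0, anchor="mm")
img.save("16A0.png")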

Supplementary material

Additional implementation details, as well as sample problems, can be found in the supplementary material.

Citation

If you find our work helpful, please cite us :)

@InProceedings{Spratley_2023_CVPR,
    author    = {Spratley, Steven and Ehinger, Krista A. and Miller, Tim},
    title     = {Unicode Analogies: An Anti-Objectivist Visual Reasoning Challenge},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
    pages     = {19082-19091}
}
