gCOG

Generic COG (gCOG): A Compositional Generalization Dataset to Evaluate Multimodal Reasoning

Contact: takuya.ito@ibm.com

Reference: Ito T, Dan S, Rigotti M, Kozloski J, Campbell M (2024). On the generalization capacity of neural networks during generic multimodal reasoning. International Conference on Learning Representations (ICLR). http://arxiv.org/abs/2401.15030.

Last update: 2/12/2024

Dependencies

Dependencies: environment.yml (conda) and requirements.txt (pip)

Installation and setup

Clone the GitHub repository
Set up the environment and dependencies, if needed (conda env create -f environment.yml and pip install -r requirements.txt)
Install the package from the base directory: pip install -e .

Overview

gCOG is a classification task paradigm for evaluating compositional generalization. Its purpose is to measure how models generalize compositionally over precise task features, such as:

Systematic compositional generalization: Task operations (e.g., re-using previously seen task operations in novel settings)
Productive compositional generalization: Task complexity (e.g., depth / complexity of a task tree).
Distractor generalization: Stimulus/noise distractors (inclusion of irrelevant task information in the stimulus image).

This task was originally derived from the COG dataset and its predecessors (Yang et al., 2018), but includes different task operators and specific dataloaders with explicit training/testing splits. The dataset and splits are provided in two categories:

Abstract/categorical tokens (as evaluated in the paper)
Pixel images with natural-language instructions (see demos/PixelWordDataset/ for examples)

The primary code associated with the task is housed in gcog/task/.

Demo notebooks with details on specific splits

There are four demo notebooks, one for each compositional split presented in the paper (distractor, systematicity (depth 1), systematicity (depth 3), productive generalization), as well as a generic notebook that demonstrates the underlying task framework. The demos provide sample code for generating train/test splits.

1) Task framework: demos/Demo_TaskGeneration.ipynb

2) Distractor generalization demo: demos/Demo_DistractorGeneralization_Fig3split.ipynb

3) Systematic generalization (task tree depth 1) demo: demos/Demo_Systematicity_OpSys_Fig4Asplit.ipynb

4) Systematic generalization (task tree depth 3) demo: demos/Demo_Systematicity_CompTreeSubsets_Fig4Dsplit.ipynb

5) Productive generalization demo: demos/Demo_Productivity_CompTree_Fig5split.ipynb

In addition, there is a separate set of demo notebooks that show how to import and use dataloaders that generate task samples as image pixels and language instructions (i.e., not categorical tokens): demos/PixelWordDataset/

Description of task operators

Exist: Asks whether a specific object exists. The object is specified by a color and a shape (letter).

Example: "Is there a 'red c'?"
Returns: True or False.
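
As a rough illustration of the Exist semantics, here is a minimal sketch. It is not the gcog implementation: the stimulus layout (a list of (color, shape, (x, y)) tuples) and the function name exist are assumptions made for this example only.

```python
# Hypothetical stimulus: each object is (color, shape, (x, y)).
# This layout is an assumption for illustration, not the gcog data format.
stimulus = [
    ("red", "c", (4, 6)),
    ("blue", "k", (1, 4)),
]

def exist(stimulus, color, shape):
    """Return True if an object with this color and shape is present."""
    return any(c == color and s == shape for c, s, _ in stimulus)

print(exist(stimulus, "red", "c"))    # True
print(exist(stimulus, "green", "c"))  # False
```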

GetColor/GetShape: Asks the agent to return either the color or the shape of an object with a specified attribute.

Example: "Get the shape of the green object"
Example: "Get the color of the letter 'a'"
Returns: A string attribute (e.g., "a" or "red").
N.B.: There are checks in the task program to ensure that this question is not ill-posed (i.e., that there is only a single correct answer).
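
The sketch below shows one way such a query and its well-posedness check could look. It is an assumption-laden illustration, not the checks used in gcog: the stimulus layout, the function name get_attribute, and the choice to raise ValueError are all hypothetical.

```python
# Hypothetical stimulus: each object is (color, shape, (x, y)).
# This layout is an assumption for illustration, not the gcog data format.
stimulus = [
    ("red", "c", (4, 6)),
    ("green", "a", (2, 3)),
]

# Position of each attribute inside the hypothetical object tuple.
ATTRIBUTE_INDEX = {"color": 0, "shape": 1}

def get_attribute(stimulus, query_key, query_value, target_key):
    """Return the target attribute of the object(s) matching the query.

    Raises ValueError when the query has no answer or more than one
    distinct answer, i.e. when the question is ill-posed.
    """
    q = ATTRIBUTE_INDEX[query_key]
    t = ATTRIBUTE_INDEX[target_key]
    answers = {obj[t] for obj in stimulus if obj[q] == query_value}
    if len(answers) != 1:
        raise ValueError(f"ill-posed query: {len(answers)} possible answers")
    return answers.pop()

print(get_attribute(stimulus, "color", "green", "shape"))  # a
print(get_attribute(stimulus, "shape", "a", "color"))      # green
```

In this sketch a query is rejected both when nothing matches and when the matching objects disagree on the requested attribute.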

Go: Asks the agent to return the location of a specified object.

Example: "Get the location of the 'red c'"
Returns: A tuple of (x, y) coordinates.

AddEven: Asks the agent to add the location values of one or more objects, and asks whether the sum is even.

Example: "Is the sum of the coordinate values of the 'red c' even?"
Answer: If the 'red c' is on coordinate (4, 6), then the correct answer is True (4 + 6 = 10 is even)
Example: "Is the sum of the coordinate values of the 'red c' and the 'blue k' even?"
Answer: If the 'red c' is on coordinate (4, 6), and the 'blue k' is on (1, 4), then the correct answer is False (4 + 6 + 1 + 4 = 15 is odd)

AddOdd: Asks the agent to add the location values of one or more objects, and asks whether the sum is odd.

Example: "Is the sum of the coordinate values of the 'red c' odd?"
Answer: If the 'red c' is on coordinate (4, 6), then the correct answer is False (4 + 6 = 10 is even)
Example: "Is the sum of the coordinate values of the 'red c' and the 'blue k' odd?"
Answer: If the 'red c' is on coordinate (4, 6), and the 'blue k' is on (1, 4), then the correct answer is True (4 + 6 + 1 + 4 = 15 is odd)
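
The AddEven and AddOdd examples above can be reproduced with a short sketch. The lookup table from (color, shape) to (x, y) and the function names are assumptions for illustration; they do not reflect the gcog API.

```python
# Hypothetical lookup from (color, shape) to (x, y) coordinates.
# This representation is an assumption for illustration only.
locations = {
    ("red", "c"): (4, 6),
    ("blue", "k"): (1, 4),
}

def coordinate_sum(locations, objects):
    """Sum every x and y value of the queried objects."""
    return sum(value for obj in objects for value in locations[obj])

def add_even(locations, objects):
    """True if the summed coordinate values are even."""
    return coordinate_sum(locations, objects) % 2 == 0

def add_odd(locations, objects):
    """True if the summed coordinate values are odd."""
    return coordinate_sum(locations, objects) % 2 == 1

both = [("red", "c"), ("blue", "k")]
print(add_even(locations, [("red", "c")]))  # True  (4 + 6 = 10)
print(add_even(locations, both))            # False (4 + 6 + 1 + 4 = 15)
print(add_odd(locations, both))             # True
```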

MultiplyEven: Asks the agent to multiply the location values of one or more objects, and asks whether the product is even.

Example: "Is the product of the coordinate values of the 'red c' even?"
Answer: If the 'red c' is on coordinate (4, 6), then the correct answer is True (4 * 6 = 24 is even)
Example: "Is the product of the coordinate values of the 'red c' and the 'blue k' even?"
Answer: If the 'red c' is on coordinate (4, 6), and the 'blue k' is on (1, 4), then the correct answer is True (4 * 6 * 1 * 4 = 96 is even)

MultiplyOdd: Asks the agent to multiply the location values of one or more objects, and asks whether the product is odd.

Example: "Is the product of the coordinate values of the 'red c' odd?"
Answer: If the 'red c' is on coordinate (4, 6), then the correct answer is False (4 * 6 = 24 is even)
Example: "Is the product of the coordinate values of the 'red c' and the 'blue k' odd?"
Answer: If the 'red c' is on coordinate (4, 6), and the 'blue k' is on (1, 4), then the correct answer is False (4 * 6 * 1 * 4 = 96 is even)
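
A comparable sketch for MultiplyEven and MultiplyOdd follows, using a 'red c' at (4, 6) and a 'blue k' at (1, 4). As before, the lookup table and function names are hypothetical and are not taken from the gcog code.

```python
import math

# Hypothetical lookup from (color, shape) to (x, y) coordinates.
# This representation is an assumption for illustration only.
locations = {
    ("red", "c"): (4, 6),
    ("blue", "k"): (1, 4),
}

def coordinate_product(locations, objects):
    """Multiply every x and y value of the queried objects."""
    return math.prod(value for obj in objects for value in locations[obj])

def multiply_even(locations, objects):
    """True if the product of the coordinate values is even."""
    return coordinate_product(locations, objects) % 2 == 0

def multiply_odd(locations, objects):
    """True if the product of the coordinate values is odd."""
    return coordinate_product(locations, objects) % 2 == 1

both = [("red", "c"), ("blue", "k")]
print(multiply_even(locations, [("red", "c")]))  # True  (4 * 6 = 24)
print(multiply_even(locations, both))            # True  (4 * 6 * 1 * 4 = 96)
print(multiply_odd(locations, both))             # False
```

Note that a product is odd only when every coordinate value involved is odd, so a single even coordinate is enough to make MultiplyOdd return False.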

If-Else: A connector operator that joins two subtrees. It takes a boolean as input (e.g., the output of an Exist operator) and branches in one of two directions: if the boolean is True, the agent follows the left branch; if it is False, the agent follows the right branch. This connector operator enables queries to be arbitrarily long.
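
To make the branching concrete, here is a minimal sketch of how an If-Else connector could chain operators into a deeper query. Everything in it is an assumption for illustration (the stimulus layout, the helper names exist, go, and if_else, and the use of lambdas to defer evaluation of each branch); it is not how gcog builds task trees.

```python
# Hypothetical stimulus: each object is (color, shape, (x, y)).
# This layout is an assumption for illustration, not the gcog data format.
stimulus = [
    ("red", "c", (4, 6)),
    ("blue", "k", (1, 4)),
]

def exist(stimulus, color, shape):
    """Return True if an object with this color and shape is present."""
    return any(c == color and s == shape for c, s, _ in stimulus)

def go(stimulus, color, shape):
    """Return the (x, y) location of the specified object."""
    for c, s, location in stimulus:
        if c == color and s == shape:
            return location
    raise ValueError("object not found")

def if_else(condition, left_branch, right_branch):
    """Follow the left branch if condition is True, else the right branch.

    Branches are zero-argument callables so only the chosen one is evaluated.
    """
    return left_branch() if condition else right_branch()

# "If a 'green a' exists, get its location; otherwise, if a 'blue k'
# exists, get its location; otherwise get the location of the 'red c'."
answer = if_else(
    exist(stimulus, "green", "a"),
    lambda: go(stimulus, "green", "a"),
    lambda: if_else(
        exist(stimulus, "blue", "k"),
        lambda: go(stimulus, "blue", "k"),
        lambda: go(stimulus, "red", "c"),
    ),
)
print(answer)  # (1, 4)
```

Nesting one if_else inside a branch of another is what lets a query grow to arbitrary depth in this sketch.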
