# Generating Generics

In [14]:
import torch
from semantic_memory import memory

from collections import defaultdict

In [15]:
# Create world with the desired files (refer to README for details about the files)

world = memory.Memory(
    concept_path="../data/concept_senses.csv",
    feature_path="../data/xcslb_compressed.csv",
    matrix_path="../data/concept_matrix.txt",
    feature_metadata="../data/feature_lexicon.csv",
)
world.create()

521it [00:00, 2929.70it/s]


## A function to retrieve generics

Most generics in the eXtended CSLB dataset (XCSLB) can be treated as majority-characteristic: they hold for a majority of the members of a category. Some of these might also be L-principled, but it is difficult to characterize them automatically at the moment.

Logic: consider features that occur in at least a `threshold` fraction of the concepts in a given category, but not in all of them (the code below requires coverage strictly below 1.0). The concepts that possess the property become **instances** of the generic, while those that do not become **exceptions**.

Higher-order categories from the taxonomy that comes with the world can be used as the `category`.
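
To make the coverage rule concrete, here is a minimal pure-Python sketch on a made-up matrix of 5 concepts by 3 features. The concept names, feature names, and the helper `split_by_coverage` are hypothetical and not part of `semantic_memory`; only the rule itself (coverage of at least `threshold`, but below 1.0) is taken from this notebook.

```python
def split_by_coverage(matrix, concepts, features, threshold=0.85):
    """Return {feature: {'instances': [...], 'exceptions': [...]}} for toy data."""
    result = {}
    for j, feature in enumerate(features):
        instances = [c for c, row in zip(concepts, matrix) if row[j] != 0]
        exceptions = [c for c, row in zip(concepts, matrix) if row[j] == 0]
        coverage = len(instances) / len(concepts)
        # Keep features held by most members, but not by all of them
        if threshold <= coverage < 1.0:
            result[feature] = {"instances": instances, "exceptions": exceptions}
    return result


# Hypothetical toy data: rows are concepts, columns are features
concepts = ["robin", "sparrow", "crow", "eagle", "penguin"]
features = ["can fly", "has feathers", "can swim"]
matrix = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
]

toy = split_by_coverage(matrix, concepts, features, threshold=0.8)
print(toy)
```

In this toy case only `can fly` survives: `has feathers` is held by every member, so it has no exceptions, and `can swim` falls below the threshold.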

In [16]:
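def get_generics(category, threshold=0.85):
    """Map each majority feature of `category` to its instances and exceptions."""
    generics = defaultdict(lambda: defaultdict(list))
    members = world.taxonomy[category].descendants()

    # Concept-by-feature matrix for the members, fetched once and reused below
    vectors = world.vectors(members)

    # Fraction of members that possess each feature
    coverage = vectors.sum(0) / len(members)

    # Keep features held by at least `threshold` of the members but not by all
    # of them, so that every generic has at least one exception
    candidate_features = torch.bitwise_and(coverage >= threshold, coverage < 1.0).nonzero().flatten()

    subspace = vectors[:, candidate_features]

    # (member, feature) positions where the feature is present / absent
    idx = {
        'instances': (subspace != 0.0).nonzero().tolist(),
        'exceptions': (subspace == 0.0).nonzero().tolist()
    }

    for k, v in idx.items():
        for concept, feature in v:
            feature = world.features[candidate_features[feature].item()]
            concept = members[concept]
            generics[feature][k].append(concept)

    # Freeze the defaultdicts so that missing keys raise KeyError
    # instead of silently creating empty entries
    for v in generics.values():
        v.default_factory = None
    generics.default_factory = None

    return generics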

Consider higher-order categories that are meaningful, have more than 5 members, and aren't too broad (fewer than 200 members). Here's how to retrieve them from the taxonomy (a subset of WordNet):

In [17]:
categories = defaultdict(dict)

for k, v in world.taxonomy.items():
    category_name = k.split(".")[0].replace("_", " ").strip()
    num_descendants = len(v.descendants())
    if category_name not in world.concepts and num_descendants > 5 and num_descendants < 200:
        categories[category_name] = {
            'taxonomy_node': k,
            'descendants': num_descendants
        }
        
categories.default_factory = None
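
The same filter can be tried without loading the world. The sketch below uses a hypothetical stand-in for the taxonomy (a plain dict from node name to a list of descendants, rather than nodes with a `descendants()` method) and a made-up concept set; `select_categories`, `toy_taxonomy`, and `toy_concepts` are illustrative names only.

```python
# Hypothetical stand-in for world.taxonomy: node name -> list of descendant concepts
toy_taxonomy = {
    "bird.n.01": ["robin", "sparrow", "crow", "eagle", "penguin", "emu"],
    "musical_instrument.n.01": ["piano", "violin", "flute", "drum", "harp", "cello", "oboe"],
    "robin.n.02": [],
    "fruit.n.01": ["apple", "pear"],
}

# Hypothetical stand-in for world.concepts
toy_concepts = {"robin", "sparrow", "crow", "eagle", "penguin", "emu", "apple", "pear"}


def select_categories(taxonomy, concepts, min_size=5, max_size=200):
    """Keep nodes that are not concepts themselves and have a mid-sized set of descendants."""
    selected = {}
    for node, descendants in taxonomy.items():
        # "musical_instrument.n.01" -> "musical instrument"
        name = node.split(".")[0].replace("_", " ").strip()
        if name not in concepts and min_size < len(descendants) < max_size:
            selected[name] = {"taxonomy_node": node, "descendants": len(descendants)}
    return selected


selected = select_categories(toy_taxonomy, toy_concepts)
for name, info in selected.items():
    print(name, info)
```

Here `robin` is dropped because it is itself a concept, and `fruit` is dropped because it has too few descendants.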

## Actually generating the generics!

Here is a demo for generating generics for birds with 0.8 as the threshold.

I have no particular recommendation on how to choose a threshold, and am open to changes if they are reasonable.
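
One possible way to get a feel for the threshold, offered only as a sketch, is to sweep it and count how many features would qualify at each value. The coverage values below are made up for illustration and do not come from XCSLB; `count_generics` is a hypothetical helper that applies the same rule as `get_generics` (at least `threshold`, but below 1.0).

```python
# Hypothetical per-feature coverage values (fraction of members with the feature)
toy_coverage = {
    "can fly": 0.80,
    "can build nests": 0.95,
    "has feathers": 1.00,
    "can swim": 0.20,
    "can sing": 0.60,
}


def count_generics(coverage, threshold):
    """Count features held by at least `threshold` of the members but not by all."""
    return sum(1 for c in coverage.values() if threshold <= c < 1.0)


sweep = {t: count_generics(toy_coverage, t) for t in (0.5, 0.7, 0.8, 0.9)}
for t, n in sweep.items():
    print(f"threshold={t:.2f}: {n} candidate generics")
```

A lower threshold admits more candidate generics, each with more exceptions; whether that trade-off is acceptable depends on the intended use.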

In [19]:
gen = get_generics('bird.n.01', 0.8)

for feature, generics in gen.items():
    print('-'*80)
    print(f"GENERIC: A bird {feature}")
    print("EXCEPTIONS:")
    for e in generics['exceptions']:
        print(f'{world.lexicon[e].article} {world.feature_lexicon[feature].negation}')
    # Optionally, you can access the instances by running 
    # the above loop for generics['instances'] as well
    print('-'*80)

--------------------------------------------------------------------------------
GENERIC: A bird can be airborne
EXCEPTIONS:
a cockerel cannot be airborne
a penguin cannot be airborne
an emu cannot be airborne
an ostrich cannot be airborne
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
GENERIC: A bird can build nests
EXCEPTIONS:
a peacock cannot build nests
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
GENERIC: A bird can fly
EXCEPTIONS:
a cockerel cannot fly
a penguin cannot fly
an emu cannot fly
an ostrich cannot fly
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
GENERIC: A bird can stand on one leg
EXCEPTIONS:
a penguin cannot stand on one leg