In [1]:
import copy
import random

# Exploring Nested Features in KerML

This is a notebook where I am fooling around with possibilities for using the sequence semantics of SysML v2 to explore different kinds of relationships between things in a universe. The "user model" of SysML v2 (M1, where libraries live) is the source of constraints and rules for combining the things, denoted by individual characters, into sequences.

One reason I used characters as symbols for these individual things is that it lets me look at both the sequences and the individual items, which is exactly what strings (sequences of characters treated as a single object) provide.

If Greek characters, capital letters, and numbers are added, this allows for a universe of atoms with about 80 members. This is probably plenty to work with, given how quickly combinatorial spaces expand.
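To get a feel for how many one-character atoms are actually on hand, here is a rough sketch. It assumes the base alphabet is the 26 lowercase Latin letters and that "Greek characters" means the 24 lowercase Greek letters; both are my guesses, and the name `candidate_universe` is made up for illustration.

```python
import string

# Assumption: the base alphabet is the 26 lowercase Latin letters.
lowercase = list(string.ascii_lowercase)
capitals = list(string.ascii_uppercase)
digits = list(string.digits)
# Lowercase Greek alpha..omega, skipping the final-sigma code point (U+03C2).
greek = [chr(cp) for cp in range(0x03B1, 0x03CA) if cp != 0x03C2]

candidate_universe = lowercase + capitals + digits + greek
print(len(candidate_universe))  # 86
```

Under those assumptions there are 86 distinct symbols, which is in the neighborhood of the 80 used below.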

## Section 1 - Exploring the size of spaces

This is a section to just play around with the combinatorics of arranging the items of a universe into sequences to see how quickly things get out of hand.

### Section 1.1 - Universe Setup

Define the universe of things. The key to keeping models under control in terms of size is to try to keep this universe small. Populating the universe is thus probably better done as a bottom-up effort than a top-down one.

In [2]:
number_things = 80

Maximum depth of feature to consider.

In [3]:
feature_depth = 5

To make it easier to deal with some of the conceptual issues around labeling sequences, it may help to assign each sequence an index number.

### Section 1.2 - Completely default types

The most basic calculations are for unconstrained results.

In [4]:
anything_possibilities = 0
for i in range(1, feature_depth + 1):
    anything_possibilities = anything_possibilities + number_things ** i
    
print("{:,d}".format(anything_possibilities))

3,318,278,480
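The loop above is a geometric series, so the same count can be checked with the closed form $\sum_{i=1}^{d} n^i = n(n^d - 1)/(n - 1)$. A small sketch (the helper name `count_sequences` is mine, not part of the notebook):

```python
def count_sequences(n_atoms, max_depth):
    """Closed form of sum(n_atoms ** i for i in 1..max_depth); assumes n_atoms > 1."""
    return n_atoms * (n_atoms ** max_depth - 1) // (n_atoms - 1)

print("{:,d}".format(count_sequences(80, 5)))  # 3,318,278,480

# How fast the space grows as the feature depth increases.
for depth in range(1, 8):
    print(depth, "{:,d}".format(count_sequences(80, depth)))
```

The integer division is exact because $n^d - 1$ is always divisible by $n - 1$.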


### Section 1.3 - Exploring Library Elements

After the initial play with the sizes of sequences, this section looks at how the constraints on sequences bring the size of the space down.

#### Section 1.3.1 - Classifiers and Minimal Interpretations

The minimum interpretations of classifiers are of length 1 (any sequence in a Classifier has its 1-tail inside also, which means only the 1-tails are not tails of other sequences).
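One way to make the 1-tail rule concrete is a toy check. This sketch assumes a "1-tail" is just the last atom of a sequence taken on its own, and that the minimal interpretation is the set of length-1 members; the function names and the toy classifier are hypothetical.

```python
def one_tail(seq):
    """Return the last atom of a sequence as a length-1 sequence."""
    return seq[-1:]

def is_closed_under_one_tails(interpretation):
    """True if every sequence's 1-tail is itself in the interpretation."""
    return all(one_tail(seq) in interpretation for seq in interpretation)

def minimal_interpretation(interpretation):
    """Assumed reading: the length-1 sequences of the interpretation."""
    return {seq for seq in interpretation if len(seq) == 1}

toy_classifier = {'b', 'c', 'ab', 'dcb', 'ec'}
print(is_closed_under_one_tails(toy_classifier))       # True
print(sorted(minimal_interpretation(toy_classifier)))  # ['b', 'c']
print(is_closed_under_one_tails({'ab'}))               # False
```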

In [5]:
number_things_in_class = 20
assert number_things_in_class < number_things

classifier_possibilities = number_things_in_class

for i in range(2, feature_depth + 1):
    classifier_possibilities = classifier_possibilities + number_things ** i
    
print("{:,d}".format(classifier_possibilities))

3,318,278,420


#### Section 1.3.2 - Basics of Nesting Features

One key consideration is that "Anything" has no minimal interpretation, so it cannot be used in defining either side of a FeaturePair. "Things" can be used, however. Since "Things" is all possible sequences of things in the universe, its minimal interpretation is a sequence of length 2.

In [6]:
things_size = number_things ** 2
print("{:,d}".format(things_size))

6,400


##### Section 1.3.2.1 - Defaults for nesting features in pure KerML

The rules above lead to a "default" nested feature (one feature with another as its featuringType, and neither with a specific Type applied) of sequences of lengths no less than 4, all concatenations of sequences from the minimal interpretation of "things."

The base rules for feature sequences then allow for the insertion of any atom from the universe between numbers of the minimal 2-sequences from the minimal interpretation of things.

In [7]:
number_of_5_length_sequences = things_size * number_things
print("{:,d}".format(number_of_5_length_sequences))

512,000


##### Section 1.3.2.2 - Defaults for nesting features in SysML

In SysML, all part properties must be typed by Part Definition or a specialization. Since Part Definitions are Classifiers, their minimal interpretations are of length one. So the nesting will result in sequences of length no less than 2, simply combinations of all 1-sequences of Part Definition.
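As a rough illustration of that last point, the sketch below pairs up the 1-sequences of a made-up Part Definition. It reads "combinations" as ordered pairs with repeats allowed, which is my assumption; the three atoms are hypothetical.

```python
from itertools import product

# Hypothetical minimal interpretation of Part Definition: three 1-sequences.
part_definition_min = ['x', 'y', 'z']

# Every nested part usage pairs an outer 1-sequence with an inner one.
nested_min = [outer + inner
              for outer, inner in product(part_definition_min, repeat=2)]
print(nested_min)
print(len(nested_min))  # 9

# With a 20-member classifier, as assumed for number_things_in_class earlier:
print(20 ** 2)  # 400
```

Compared with the 512,000 sequences of the pure KerML default, typing by a Classifier shrinks the starting space considerably.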

## Section 2 - Sets from the Library

Start capturing library sets and specialize appropriately.

### Section 2.1 - Occurrences

Occurrences specialize Classifier, and obey the same rules. Minimal interpretations are of length 1.

### Section 2.2 - Links

Links specialize objects which specialize occurrences. The links that are often the most useful are the binary links as they do a lot of heavy lifting in libraries to link entities.
Generic links have attributes that are two or more participants, which can be pulled from the Anything set of sequences.

#### Section 2.2.2 - BinaryLinks

The binary links are simpler to deal with in that there are exactly two participants. Each binary link effectively pairs sequences.

##### Section 2.2.2.1 - Nesting and ends

The sequences within the interpretations of different features of the Association are useful for puzzling out end semantics.

Here we generate the list of possibilities. Only 5 atoms are used to keep the numbers small enough for exploration.

In [8]:
nesting_universe_instances = ['a', 'b', 'c', 'd', 'e']

Construct the full set of sequences possible with these atoms to be the Anything set.

In [9]:
# nesting_universe_anything[k - 1] holds every length-k sequence over the atoms.
nesting_universe_anything = [[]]
for i in range(1, 5):
    if i > 1:
        # Copy the previous level so it can be extended one atom to the left.
        old_template = copy.deepcopy(nesting_universe_anything[i - 2])
        nesting_universe_anything.append([])
    for instance in nesting_universe_instances:
        if i == 1:
            nesting_universe_anything[0].append(instance)
        else:
            # Prepend the atom to every sequence of the previous length.
            for el in old_template:
                nesting_universe_anything[i - 1].append(instance + el)

len(nesting_universe_anything[3])

625
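For comparison, the same levels can be built with `itertools.product`. This is only an alternative sketch, not what the cell above does; the names `atoms` and `levels` are mine.

```python
from itertools import product

atoms = ['a', 'b', 'c', 'd', 'e']

# levels[k - 1] holds every length-k string, leftmost atom varying slowest.
levels = [[''.join(chars) for chars in product(atoms, repeat=length)]
          for length in range(1, 5)]

print([len(level) for level in levels])  # [5, 25, 125, 625]
print(levels[1][:3])                     # ['aa', 'ab', 'ac']
```

The ordering matches the hand-rolled loop because both vary the leftmost atom slowest.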

Select all sequences that go with the notional binary Association classifier (ending in 'a'):

In [10]:
assoc_seqs = []
for level in nesting_universe_anything:
    for seq in level:
        if seq[-1] == 'a':
            assoc_seqs.append(seq)
            
len(assoc_seqs)

156

Let the participant feature 1 have a Type that is a Classifier with minimal interpretation of {\[b\], \[c\]} and participant feature 2 have a Type that is a Classifier with minimal interpretation of {\[d\], \[e\]}.

In [11]:
BC_seqs = []
for level in nesting_universe_anything:
    for seq in level:
        if seq[-1] == 'b' or seq[-1] == 'c':
            BC_seqs.append(seq)
            
len(BC_seqs)

312

In [12]:
DE_seqs = []
for level in nesting_universe_anything:
    for seq in level:
        if seq[-1] == 'd' or seq[-1] == 'e':
            DE_seqs.append(seq)
            
len(DE_seqs)

312
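The three selections above all filter on the last atom, so they can be folded into one helper. A minimal sketch, rebuilding the sequences so it runs on its own (`select_by_tail` and `all_seqs` are illustrative names):

```python
from itertools import product

atoms = ['a', 'b', 'c', 'd', 'e']
all_seqs = [''.join(chars)
            for length in range(1, 5)
            for chars in product(atoms, repeat=length)]

def select_by_tail(seqs, tails):
    """Keep the sequences whose last atom is one of the given tails."""
    return [seq for seq in seqs if seq[-1] in tails]

print(len(select_by_tail(all_seqs, {'a'})))       # 156
print(len(select_by_tail(all_seqs, {'b', 'c'})))  # 312
print(len(select_by_tail(all_seqs, {'d', 'e'})))  # 312
```

Each tail atom accounts for 1 + 5 + 25 + 125 = 156 sequences, which is where the 156 and 312 come from.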

Select all feature-length sequences that aren't classified by our Association but have it as a featuring type, meaning that 'a' is the left-most item in the sequence.

In [13]:
part1_seqs = []
for level in nesting_universe_anything:
    for seq in level:
        if len(seq) > 1 and (seq[-1] == 'b' or seq[-1] == 'c') and seq[0] == 'a':
            part1_seqs.append(seq)
            
len(part1_seqs)

62

In [14]:
part2_seqs = []
for level in nesting_universe_anything:
    for seq in level:
        if len(seq) > 1 and (seq[-1] == 'd' or seq[-1] == 'e') and seq[0] == 'a':
            part2_seqs.append(seq)
            
len(part2_seqs)

62

In [15]:
part1_min_seqs = []
for seq in part1_seqs:
    if (len(seq) == 2):
        part1_min_seqs.append(seq)

In [16]:
part2_min_seqs = []
for seq in part2_seqs:
    if (len(seq) == 2):
        part2_min_seqs.append(seq)

In [17]:
part2_min_seqs

['ad', 'ae']
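The participant selections follow one pattern: a fixed head atom for the featuring type and a set of allowed tail atoms for the feature's Type. A hedged sketch of that pattern (`select_featured` is a name I made up):

```python
from itertools import product

atoms = ['a', 'b', 'c', 'd', 'e']
all_seqs = [''.join(chars)
            for length in range(1, 5)
            for chars in product(atoms, repeat=length)]

def select_featured(seqs, head, tails):
    """Sequences longer than 1 that start at `head` and end in one of `tails`."""
    return [seq for seq in seqs
            if len(seq) > 1 and seq[0] == head and seq[-1] in tails]

part1 = select_featured(all_seqs, 'a', {'b', 'c'})
part2 = select_featured(all_seqs, 'a', {'d', 'e'})
print(len(part1), len(part2))                 # 62 62
print([seq for seq in part1 if len(seq) == 2])  # ['ab', 'ac']
print([seq for seq in part2 if len(seq) == 2])  # ['ad', 'ae']
```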

##### Section 2.2.2.2 - Continued example for ends

The leap from the participant feature (part1 and part2) sequences to the end sequences is to further sub-select the sequences that fall under *both* the link and the other end item.

The above model has effectively been:

Association A
   participant part1 : BC (maps to end de : DE under BC)
   participant part2 : DE (maps to end bc : BC under DE)

The end1 sequences select out from part1 those sequences with d or e above them.

In [41]:
end1_seqs = []

for part1 in part1_seqs:
    if len(part1) > 2:
        if part1[1] == 'd' or part1[1] == 'e' or part1[2] == 'd' or part1[2] == 'e':
            end1_seqs.append(part1)
            
end1_seqs

['adb',
 'adc',
 'aeb',
 'aec',
 'aadb',
 'aadc',
 'aaeb',
 'aaec',
 'abdb',
 'abdc',
 'abeb',
 'abec',
 'acdb',
 'acdc',
 'aceb',
 'acec',
 'adab',
 'adac',
 'adbb',
 'adbc',
 'adcb',
 'adcc',
 'addb',
 'addc',
 'adeb',
 'adec',
 'aeab',
 'aeac',
 'aebb',
 'aebc',
 'aecb',
 'aecc',
 'aedb',
 'aedc',
 'aeeb',
 'aeec']
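The index checks in the cell above are tied to sequences of length 4 or less. A depth-independent version can test the interior of the sequence (everything between the head and the tail) instead. This is a sketch under that reading; `has_interior_atom` is a hypothetical helper, and the end2 selection is my mirror image of end1 rather than something computed in the notebook.

```python
from itertools import product

atoms = ['a', 'b', 'c', 'd', 'e']
all_seqs = [''.join(chars)
            for length in range(1, 5)
            for chars in product(atoms, repeat=length)]

def has_interior_atom(seq, wanted):
    """True if any atom strictly between the head and the tail is in `wanted`."""
    return any(ch in wanted for ch in seq[1:-1])

part1 = [s for s in all_seqs if len(s) > 1 and s[0] == 'a' and s[-1] in 'bc']
part2 = [s for s in all_seqs if len(s) > 1 and s[0] == 'a' and s[-1] in 'de']

end1 = [s for s in part1 if has_interior_atom(s, {'d', 'e'})]
end2 = [s for s in part2 if has_interior_atom(s, {'b', 'c'})]  # assumed mirror

print(len(end1), end1[:4])  # 36 ['adb', 'adc', 'aeb', 'aec']
print(len(end2), end2[:4])  # 36 ['abd', 'abe', 'acd', 'ace']
```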

#### Section 2.2.3 - Links

For general links, there can be more than two participants. The rules for mapping from the instances of the links and the participants to instances matched to features of end types are less clear. Here we can explore that a bit with an example inspired by *The Last Dance*: the 1996 NBA Finals between the Seattle SuperSonics and Chicago Bulls. Then, for fun, we add a follow-on game in which a key matchup between Gary "The Glove" Payton and Michael Jordan is repeated when Jordan is on another team in another year.

To use the team, year, player example, here are some abbreviations:

- b: Bulls
- s: Sonics
- p: Gary Payton
- j: Michael Jordan
- 6: 1996
- 2: 2002
- w: Wizards
- a: Association
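Since the sequences get hard to read quickly, a small decoder helps when eyeballing them. This is just a convenience sketch; `ABBREVIATIONS` and `spell_out` are names I introduced, and the ` > ` separator is arbitrary.

```python
ABBREVIATIONS = {
    'b': 'Bulls',
    's': 'Sonics',
    'p': 'Gary Payton',
    'j': 'Michael Jordan',
    '6': '1996',
    '2': '2002',
    'w': 'Wizards',
    'a': 'Association',
}

def spell_out(seq):
    """Expand each atom of a sequence, head first, for easier reading."""
    return ' > '.join(ABBREVIATIONS[ch] for ch in seq)

print(spell_out('b6j'))  # Bulls > 1996 > Michael Jordan
print(spell_out('aw2'))  # Association > Wizards > 2002
```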

In [23]:
link_universe_instances = ['a', 'b', 's', 'p', 'j', '6', '2', 'w']

In [24]:
link_universe_anything = [[]]
link_universe_flat = []
for i in range(1, 5):
    if i > 1:
        old_template = copy.deepcopy(link_universe_anything[i - 2])
        link_universe_anything.append([])
    for j, instance in enumerate(link_universe_instances):
        if i == 1:
            link_universe_anything[0].append(instance)
            link_universe_flat.append(instance)
        else:
            for el in old_template:
                link_universe_anything[i - 1].append(instance + el)
                link_universe_flat.append(instance + el)

In [25]:
def is_team(instance):
    if instance[-1] == 'b' or instance[-1] == 's' or instance[-1] == 'w':
        return True
    return False

In [26]:
def is_player(instance):
    if instance[-1] == 'p' or instance[-1] == 'j':
        return True
    return False

In [27]:
def is_year(instance):
    if instance[-1] == '2' or instance[-1] == '6':
        return True
    return False

In [28]:
def is_link(instance):
    if instance[-1] == 'a':
        return True
    return False

In [29]:
teams = [instance for instance in link_universe_flat if is_team(instance)]
players = [instance for instance in link_universe_flat if is_player(instance)]
years = [instance for instance in link_universe_flat if is_year(instance)]
links = [instance for instance in link_universe_flat if is_link(instance)]
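The four predicates split the universe by tail atom, so every sequence should land in exactly one bucket. The sketch below checks that with a single lookup table; `TAIL_KINDS` and `kind_of` are illustrative names, and the sequences are rebuilt here so the block runs on its own.

```python
from collections import Counter
from itertools import product

atoms = ['a', 'b', 's', 'p', 'j', '6', '2', 'w']
flat = [''.join(chars)
        for length in range(1, 5)
        for chars in product(atoms, repeat=length)]

# Which tail atoms belong to which notional classifier.
TAIL_KINDS = {
    'team': {'b', 's', 'w'},
    'player': {'p', 'j'},
    'year': {'6', '2'},
    'link': {'a'},
}

def kind_of(seq):
    """Return the classifier name selected by the sequence's last atom."""
    for kind, tails in TAIL_KINDS.items():
        if seq[-1] in tails:
            return kind
    raise ValueError("unclassified tail atom in " + seq)

counts = Counter(kind_of(seq) for seq in flat)
print(counts['team'], counts['player'], counts['year'], counts['link'])
# 1755 1170 1170 585
print(sum(counts.values()) == len(flat))  # True
```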

The nesting for the features above is interesting in how we can interpret it (depending on features provided in the larger user model). For example, 'b6j' could represent Michael Jordan as he was a player in 1996 for the Bulls.
We should note that there are a number of sequences that may not make sense, like 'bjj' (Jordan referencing himself while on the Bulls?). This might point to some useful default semantics for SysML part usage and the like.
Note that we have no restrictions yet, so there are several combinations where players appear to be linked to the wrong team. But we have not yet said what the legal combinations are; these may represent playing against the team rather than playing on it.

In [30]:
players[40:50]

['bpp', 'bpj', 'bjp', 'bjj', 'b6p', 'b6j', 'b2p', 'b2j', 'bwp', 'bwj']

As before, the participants of our association restrict the universe of sequences to those that have the association minimum interpretation ('a') as the head and a minimal interpretation ('p', 'j' for players) as the tail.

In [31]:
linked_players = [linked_player for linked_player in players if linked_player[0] == 'a']
linked_teams = [linked_team for linked_team in teams if linked_team[0] == 'a']
linked_years = [linked_year for linked_year in years if linked_year[0] == 'a']

In [32]:
linked_years[10:20]

['aj6', 'aj2', 'a66', 'a62', 'a26', 'a22', 'aw6', 'aw2', 'aaa6', 'aaa2']

If we look at any one of these ends, they basically have a way to navigate to things of a given classifier either directly or through the other dimension. We can start by making a series of random links.

In [33]:
trilinks = []
for _ in range(100):
    # Each row is [link, player, team, year], drawn uniformly at random.
    rand_trilink = [random.choice(links),
                    random.choice(linked_players),
                    random.choice(linked_teams),
                    random.choice(linked_years)]
    trilinks.append(rand_trilink)

In [34]:
trilinks

[['pjja', 'aapj', 'aaas', 'as66'],
 ['bawa', 'aawj', 'a2ss', 'ap26'],
 ['b2pa', 'a2bj', 'ajsw', 'aaw6'],
 ['2jba', 'ajpp', 'a26b', 'aw6'],
 ['awwa', 'aswj', 'awbw', 'a666'],
 ['p2pa', 'a26p', 'awb', 'a6p6'],
 ['pbpa', 'aajj', 'a2ss', 'ap6'],
 ['wpwa', 'aasp', 'aw2w', 'asw6'],
 ['ppja', 'asbp', 'asps', 'asp6'],
 ['swaa', 'apaj', 'as2s', 'abs2'],
 ['26a', 'aapj', 'asbb', 'aba2'],
 ['jba', 'ajbp', 'a2js', 'a6a6'],
 ['62sa', 'ajp', 'abws', 'aaa6'],
 ['jp6a', 'abpp', 'aabw', 'aba6'],
 ['w2aa', 'aw2p', 'ajwb', 'a2p2'],
 ['pbpa', 'aspj', 'aspb', 'aa2'],
 ['26pa', 'awp', 'asaw', 'a6s6'],
 ['266a', 'aajp', 'ap6w', 'aja2'],
 ['ppaa', 'aap', 'aabw', 'ap26'],
 ['sbwa', 'abwj', 'apjb', 'aba2'],
 ['jssa', 'as6p', 'as6w', 'ab66'],
 ['japa', 'aaj', 'asbs', 'awa2'],
 ['pja', 'a6wp', 'asb', 'a6a2'],
 ['wpaa', 'asbj', 'a6b', 'asa2'],
 ['2waa', 'aa2p', 'ajjs', 'abj2'],
 ['6wpa', 'a6bp', 'aw6w', 'apw6'],
 ['aasa', 'a6jj', 'a2bb', 'awa2'],
 ['jbba', 'ajjp', 'aa2w', 'aj2'],
 ['s2wa', 'a6ap', 'asjw', 'aaa2'],
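Because the links are random, the listing changes on every run. If repeatable output is wanted, one option is a private, seeded generator. The sketch below uses small hand-picked pools so it stands alone; the function name, the pools, and the seed value are all assumptions for illustration.

```python
import random

def make_random_links(links, players, teams, years, count, seed):
    """Draw `count` rows of [link, player, team, year] reproducibly."""
    rng = random.Random(seed)  # private generator; leaves the global RNG alone
    return [[rng.choice(links), rng.choice(players),
             rng.choice(teams), rng.choice(years)]
            for _ in range(count)]

# Small hypothetical pools, just to keep the demo self-contained.
demo_links = ['pja', 'jba', '26a']
demo_players = ['ap', 'aj', 'ajp']
demo_teams = ['ab', 'as', 'aw']
demo_years = ['a6', 'a2']

first = make_random_links(demo_links, demo_players, demo_teams, demo_years,
                          count=5, seed=1996)
second = make_random_links(demo_links, demo_players, demo_teams, demo_years,
                           count=5, seed=1996)
print(first == second)  # True
```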

Next we consider how we might specify one of the ends for this link. It would make sense for the Teams classifier to be able to have a Feature that calls out all players associated with the team, and perhaps to further filter this by year.
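One possible shape for such a Feature is a grouping function with an optional year filter. This sketch groups by the team's tail atom (the actual team) rather than by the whole team sequence, which is my assumption about what "associated with the team" should mean; `players_by_team` is a hypothetical name, and the sample rows are four of the random links listed above.

```python
from collections import defaultdict

def players_by_team(link_rows, year=None):
    """Group player sequences by team atom, optionally keeping one year only."""
    grouped = defaultdict(list)
    for _link, player, team, yr in link_rows:
        if year is not None and yr[-1] != year:
            continue  # skip links whose year tail does not match the filter
        grouped[team[-1]].append(player)
    return dict(grouped)

sample_rows = [
    ['pjja', 'aapj', 'aaas', 'as66'],
    ['bawa', 'aawj', 'a2ss', 'ap26'],
    ['b2pa', 'a2bj', 'ajsw', 'aaw6'],
    ['swaa', 'apaj', 'as2s', 'abs2'],
]

print(players_by_team(sample_rows))
# {'s': ['aapj', 'aawj', 'apaj'], 'w': ['a2bj']}
print(players_by_team(sample_rows, year='2'))
# {'s': ['apaj']}
```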

In [35]:
# Sort the random links by their team sequence (third column).
team_trilinks = sorted(trilinks, key=lambda x: x[2])
team_trilinks

[['sjba', 'aap', 'a22w', 'a62'],
 ['2jba', 'ajpp', 'a26b', 'aw6'],
 ['sj2a', 'a2j', 'a2b', 'aaa6'],
 ['s2ja', 'abpj', 'a2b', 'ap22'],
 ['aasa', 'a6jj', 'a2bb', 'awa2'],
 ['aawa', 'abbp', 'a2bb', 'ap6'],
 ['jbja', 'apwp', 'a2bs', 'a6j2'],
 ['baaa', 'aj6p', 'a2bw', 'aa2'],
 ['paba', 'aajj', 'a2bw', 'a2j6'],
 ['jba', 'ajbp', 'a2js', 'a6a6'],
 ['2aa', 'assp', 'a2jw', 'aaa6'],
 ['p2aa', 'a6sj', 'a2ps', 'a622'],
 ['asja', 'a6wj', 'a2s', 'a6a6'],
 ['jaja', 'apsp', 'a2sb', 'ap2'],
 ['bawa', 'aawj', 'a2ss', 'ap26'],
 ['pbpa', 'aajj', 'a2ss', 'ap6'],
 ['bbsa', 'a26p', 'a2sw', 'aa6'],
 ['2ssa', 'aabj', 'a2w', 'a2b2'],
 ['pjaa', 'ajbj', 'a2ww', 'a266'],
 ['p22a', 'awbj', 'a62b', 'abb2'],
 ['bpaa', 'aa2j', 'a62b', 'abs2'],
 ['wswa', 'aasj', 'a62b', 'abp2'],
 ['sjpa', 'a2pp', 'a66b', 'a6p2'],
 ['wpaa', 'asbj', 'a6b', 'asa2'],
 ['s2wa', 'asjj', 'a6bs', 'aw6'],
 ['ppwa', 'a6pj', 'a6bw', 'a6j2'],
 ['wppa', 'awsp', 'a6jb', 'a666'],
 ['jbja', 'asaj', 'a6js', 'ab26'],
 ['apwa', 'awpp', 'a6jw', 'a626'],
 [

In [36]:
team_sliced = {}
for uni in set([p[2] for p in team_trilinks]):
    local = []
    for trilink in trilinks:
        if trilink[2] == uni:
            local.append(trilink[1])
    team_sliced.update({uni: local})
    
team_sliced

{'awjs': ['ajpj'],
 'aaww': ['assj'],
 'a2ps': ['a6sj'],
 'a6bw': ['a6pj'],
 'a62b': ['awbj', 'aa2j', 'aasj'],
 'awb': ['a26p'],
 'awbb': ['a22j'],
 'as2w': ['aj6j'],
 'abwb': ['aw6j'],
 'abps': ['abap'],
 'a6ws': ['appp'],
 'a6w': ['ap'],
 'asaw': ['awp'],
 'a26b': ['ajpp'],
 'asb': ['a6wp'],
 'aww': ['a2p'],
 'apjb': ['abwj'],
 'awab': ['aw6p'],
 'a2ww': ['ajbj'],
 'appb': ['appp', 'ap6j'],
 'a22w': ['aap'],
 'aa2b': ['ajp'],
 'ajjs': ['aa2p'],
 'a2s': ['a6wj'],
 'abab': ['ap6p'],
 'ap6w': ['aajp', 'ap2p'],
 'aa2w': ['ajjp', 'abbp'],
 'aaw': ['a6ap'],
 'ajwb': ['aw2p'],
 'a2sw': ['a26p'],
 'awbw': ['aswj'],
 'a2b': ['a2j', 'abpj'],
 'ajsb': ['a2ap'],
 'asjw': ['a6ap'],
 'asab': ['aaj'],
 'a2sb': ['apsp'],
 'a6b': ['asbj'],
 'abbs': ['a6jj'],
 'aasb': ['asaj'],
 'ajww': ['awbp'],
 'apsb': ['awp'],
 'a2bw': ['aj6p', 'aajj'],
 'a6jw': ['awpp'],
 'aa6b': ['a6bp'],
 'abws': ['ajp'],
 'asps': ['asbp'],
 'a2bb': ['a6jj', 'abbp'],
 'a6js': ['asaj'],
 'aabw': ['abpp', 'aap', 'a2aj'],
 'ajs': 