# PLAsTiCC v2 taxonomy

_Alex Malz (GCCL@RUB)_, _Rob Knop (raknop@lbl.gov)_

The purpose of this notebook is to outline a "bitmask" (really, its decimal equivalent) schema for hierarchical classes of LSST alerts.
The bitmask corresponds to a "best" classification to be included in the alert.
Each digit in the bitmask, however, corresponds to a vector of classification probabilities, confidence flags, or scores that can be used to subsample the alert stream.
Persistent features could be queried for the subsampled objects from a separate database, which could be used for further selection.

In [1]:
from treelib import Node, Tree
import string

## Housekeeping

We need to think about how to sort through the classification information.
`directory` and `index` are very simplistic starting points.
It'll be easier when we have a better idea of what subsampling operations we'll perform.

In [2]:
directory = {}
index = {}

### Generating the integer codes

The idea is that every level of the tree corresponds to one digit in the bitmask.
The number of objects in the

In [3]:
### map real to 1 and bogus to 0 - doesn't need a 0 for alert digit

digs = string.digits + string.ascii_letters

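def int2base(x, base):
    """Return the digits of integer `x` in the given `base` as a string.

    Note: digits come out least-significant first, because the
    `digits.reverse()` call below is disabled. Single-digit values
    (x < base) are unaffected.
    """
    if x < 0:
        sign = -1
    elif x == 0:
        return digs[0]
    else:
        sign = 1

    x *= sign
    digits = []

    while x:
        digits.append(digs[int(x % base)])
        x = int(x // base)  # floor division avoids float rounding for large x

    if sign < 0:
        digits.append('-')

    # digits.reverse()

    return ''.join(digits)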

## Building a phylogenetic tree

Given the hierarchical class relationships, make a tree diagram (and record some hopefully useful information).

In [4]:
# def branch(tree, parent, children, prepend=["Other"], append=None, directory=directory, index=index):
#     level = 0
#     tmp = parent
#     while tree.ancestor(tmp) is not None:
#         level += 1
#         tmp = tree.ancestor(tmp)
#     directory[parent] = {}
#     if prepend is not None:
#         proc_pre = [parent + "/" + pre for pre in prepend]
#         children = proc_pre + children
#     if append is not None:
#         proc_app = [parent + "/" + appe for app in append]
#         children = children + proc_app
#     bigbase = len(children)
#     for i, child in enumerate(children):
#         directory[parent][child] = i
#         index[child] = index[parent] + int2base(i, bigbase)# broken + index[parent]
#         tree.create_node(index[child]+" "+child, child, parent=parent)
#     print(f"{parent} {level}")
# #     index[parent] = '0' + index[parent]
#     return(bigbase, directory, index)

In [5]:
maxdep = 2  # deepest level below the root; codes have maxdep + 1 digits
def branch(tree, parent, children, prepend=["Other"], append=None, directory=directory, index=index):
    """Attach `children` to `parent` in `tree`, assigning each a decimal code."""
    # Catch-all classes are namespaced by the parent, e.g. "SN-like/Other"
    if prepend is not None:
        proc_pre = [parent + "/" + pre for pre in prepend]
        children = proc_pre + children
    if append is not None:
        proc_app = [parent + "/" + app for app in append]
        children = children + proc_app
    # Depth of the parent: number of steps up to the root
    tmp = parent
    level = 0
    while tree.ancestor(tmp) is not None:
        level += 1
        tmp = tree.ancestor(tmp)
    directory[parent] = {}
    for i, child in enumerate(children):
        directory[parent][child] = i
        # Child i gets digit (i + 1) in the decimal place belonging to this level
        if index[parent] != '':
            index[child] = str(int(index[parent]) + (i+1) * 10 ** (maxdep-level))
        else:
            index[child] = str((i+1) * 10 ** (maxdep-level))
        tree.create_node(index[child] + " " + child, child, parent=parent)

It would be better to start with something like `directory` than to build it as we go along, but, hey, this is a hack.
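As a rough illustration of what "starting with something like `directory`" could look like, here is a minimal sketch that declares part of the taxonomy as a nested dict and derives the codes recursively. The names `TAXONOMY` and `assign_codes` are hypothetical and not part of this notebook; the sketch assumes the same convention as `branch` (an "Other" child first, digit `i + 1` in the decimal place for the level) and leaves out the Meta branch.

```python
# Hypothetical nested description of the taxonomy; leaves map to None.
TAXONOMY = {
    "Non-Recurring": {
        "SN-like": {"Ia": None, "Ib/c": None, "II": None, "Iax": None, "91bg": None},
        "Fast": {"KN": None, "M-dwarf Flare": None, "Dwarf Novae": None, "uLens": None},
        "Long": {"SLSN": None, "TDE": None, "ILOT": None, "CART": None, "PISN": None},
    },
    "Recurring": {
        "Periodic": {"Cepheid": None, "RR Lyrae": None, "Delta Scuti": None,
                     "EB": None, "LPV/Mira": None},
        "Non-Periodic": {"AGN": None},
    },
}


def assign_codes(subtree, parent_name="Static", parent_code=0, level=0, maxdep=2, codes=None):
    """Walk the nested dict and give every class a decimal code."""
    if codes is None:
        codes = {}
    # Prepend the catch-all class, mirroring prepend=["Other"] in branch()
    names = [parent_name + "/Other"] + list(subtree)
    for i, name in enumerate(names):
        code = parent_code + (i + 1) * 10 ** (maxdep - level)
        codes[name] = code
        children = subtree.get(name)  # None for leaves and for ".../Other"
        if children:
            assign_codes(children, name, code, level + 1, maxdep, codes)
    return codes


codes = assign_codes(TAXONOMY)
print(codes["Ia"], codes["AGN"], codes["Static/Other"])
```

Declaring the hierarchy up front keeps the class relationships in one place, at the cost of having to handle special branches such as Meta separately.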

In [6]:
tree = Tree()

# index["Alert"] = ''#int2base(0, 1)
# tree.create_node(index["Alert"] + " " + "Alert", "Alert")

# branch(tree, "Alert", ["Bogus", "Real"])#, prepend=["Unclassified"])

# index["Real"] = ''#int2base(0, 1)
# tree.create_node(index["Real"] + " " + "Alert/Real", "Real")

# branch(tree, "Real", ["Static", "Moving"])#, prepend=['Unclassified'])

index["Static"] = ''#int2base(0, 1)
tree.create_node(index["Static"] + " " + "Alert/Real/Static", "Static")

# need spot for residual, choose not to classify -- metacategory? possibly rename to "Flagged"?
index["Meta"] = "0"# * (maxdep+1)
tree.create_node(index["Meta"] + " Meta", "Meta", parent = "Static")
branch(tree, "Meta", ["Residual", "NotClassified"])

branch(tree, "Static", ["Non-Recurring", "Recurring"])

branch(tree, "Recurring", ["Periodic", "Non-Periodic"])

branch(tree, "Periodic", ["Cepheid", "RR Lyrae", "Delta Scuti", "EB", "LPV/Mira"])

branch(tree, "Non-Periodic", ["AGN"])

branch(tree, "Non-Recurring", ["SN-like", "Fast", "Long"])

branch(tree, "SN-like", ["Ia", "Ib/c", "II", "Iax", "91bg"])

branch(tree, "Fast", ["KN", "M-dwarf Flare", "Dwarf Novae", "uLens"])

branch(tree, "Long", ["SLSN", "TDE", "ILOT", "CART", "PISN"])

tree.show()

 Alert/Real/Static
├── 0 Meta
│   ├── 10 Meta/Other
│   ├── 20 Residual
│   └── 30 NotClassified
├── 100 Static/Other
├── 200 Non-Recurring
│   ├── 210 Non-Recurring/Other
│   ├── 220 SN-like
│   │   ├── 221 SN-like/Other
│   │   ├── 222 Ia
│   │   ├── 223 Ib/c
│   │   ├── 224 II
│   │   ├── 225 Iax
│   │   └── 226 91bg
│   ├── 230 Fast
│   │   ├── 231 Fast/Other
│   │   ├── 232 KN
│   │   ├── 233 M-dwarf Flare
│   │   ├── 234 Dwarf Novae
│   │   └── 235 uLens
│   └── 240 Long
│       ├── 241 Long/Other
│       ├── 242 SLSN
│       ├── 243 TDE
│       ├── 244 ILOT
│       ├── 245 CART
│       └── 246 PISN
└── 300 Recurring
    ├── 310 Recurring/Other
    ├── 320 Periodic
    │   ├── 321 Periodic/Other
    │   ├── 322 Cepheid
    │   ├── 323 RR Lyrae
    │   ├── 324 Delta Scuti
    │   ├── 325 EB
    │   └── 326 LPV/Mira
    └── 330 Non-Periodic
        ├── 331 Non-Periodic/Other
        └── 332 AGN



Yeah, not sure `directory` and `index` are really useful...

In [7]:
print(directory)

{'Meta': {'Meta/Other': 0, 'Residual': 1, 'NotClassified': 2}, 'Static': {'Static/Other': 0, 'Non-Recurring': 1, 'Recurring': 2}, 'Recurring': {'Recurring/Other': 0, 'Periodic': 1, 'Non-Periodic': 2}, 'Periodic': {'Periodic/Other': 0, 'Cepheid': 1, 'RR Lyrae': 2, 'Delta Scuti': 3, 'EB': 4, 'LPV/Mira': 5}, 'Non-Periodic': {'Non-Periodic/Other': 0, 'AGN': 1}, 'Non-Recurring': {'Non-Recurring/Other': 0, 'SN-like': 1, 'Fast': 2, 'Long': 3}, 'SN-like': {'SN-like/Other': 0, 'Ia': 1, 'Ib/c': 2, 'II': 3, 'Iax': 4, '91bg': 5}, 'Fast': {'Fast/Other': 0, 'KN': 1, 'M-dwarf Flare': 2, 'Dwarf Novae': 3, 'uLens': 4}, 'Long': {'Long/Other': 0, 'SLSN': 1, 'TDE': 2, 'ILOT': 3, 'CART': 4, 'PISN': 5}}


In [8]:
print(index)

{'Static': '', 'Meta': '0', 'Meta/Other': '10', 'Residual': '20', 'NotClassified': '30', 'Static/Other': '100', 'Non-Recurring': '200', 'Recurring': '300', 'Recurring/Other': '310', 'Periodic': '320', 'Non-Periodic': '330', 'Periodic/Other': '321', 'Cepheid': '322', 'RR Lyrae': '323', 'Delta Scuti': '324', 'EB': '325', 'LPV/Mira': '326', 'Non-Periodic/Other': '331', 'AGN': '332', 'Non-Recurring/Other': '210', 'SN-like': '220', 'Fast': '230', 'Long': '240', 'SN-like/Other': '221', 'Ia': '222', 'Ib/c': '223', 'II': '224', 'Iax': '225', '91bg': '226', 'Fast/Other': '231', 'KN': '232', 'M-dwarf Flare': '233', 'Dwarf Novae': '234', 'uLens': '235', 'Long/Other': '241', 'SLSN': '242', 'TDE': '243', 'ILOT': '244', 'CART': '245', 'PISN': '246'}
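One thing the codes do make easy is selecting a whole branch of the tree. The sketch below is only an illustration under assumptions: `code_digits` and `in_subtree` are hypothetical helpers, the alert codes are made up, and a trailing 0 digit is read as "level not specified". The Meta branch, whose codes start with 0, would need separate handling.

```python
MAXDEP = 2  # three digits, one per tree level


def code_digits(code, maxdep=MAXDEP):
    """Split a decimal class code into one digit per tree level."""
    return [int(ch) for ch in str(code).zfill(maxdep + 1)]


def in_subtree(code, ancestor, maxdep=MAXDEP):
    """True if `code` equals `ancestor` or lies below it in the hierarchy."""
    for c, a in zip(code_digits(code, maxdep), code_digits(ancestor, maxdep)):
        if a == 0:  # remaining ancestor digits are unspecified
            return True
        if c != a:
            return False
    return True


alert_codes = [222, 225, 232, 243, 322, 332]  # made-up "best" classifications
sn_like = [c for c in alert_codes if in_subtree(c, 220)]
non_recurring = [c for c in alert_codes if in_subtree(c, 200)]
print(sn_like)
print(non_recurring)
```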


## Building a structure for hierarchical classification

The whole point of this, for me, is for the classification to have corresponding posterior probabilities, or at least confidence flags or scores, because I'd want to use them to rapidly select follow-up candidates.
[This LSST Community thread on irregularly shaped data](https://community.lsst.org/t/projects-involving-irregularly-shaped-data/4466) looks potentially relevant.
I guess it could also be used for packaging up additional features into an alert without bloating it up too much.
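To make the idea a bit more concrete, here is a minimal sketch of one possible layout, not a proposed alert format. It assumes each internal node carries a vector of conditional probabilities over its children; the numbers are invented for illustration, and `cond_probs` and `best_path` are hypothetical names. The "best" classification is then the greedy path through the tree, and its joint probability is the product of the conditionals along that path.

```python
# Invented conditional probabilities P(child | parent) for a single alert
cond_probs = {
    "Static": {"Static/Other": 0.05, "Non-Recurring": 0.80, "Recurring": 0.15},
    "Non-Recurring": {"Non-Recurring/Other": 0.05, "SN-like": 0.70,
                      "Fast": 0.10, "Long": 0.15},
    "SN-like": {"SN-like/Other": 0.05, "Ia": 0.60, "Ib/c": 0.15,
                "II": 0.10, "Iax": 0.05, "91bg": 0.05},
}


def best_path(cond_probs, root="Static"):
    """Greedily follow the most probable child at each level."""
    path, joint = [], 1.0
    node = root
    while node in cond_probs:
        child, p = max(cond_probs[node].items(), key=lambda kv: kv[1])
        path.append(child)
        joint *= p
        node = child
    return path, joint


path, joint = best_path(cond_probs)
print(path, round(joint, 3))
```

A follow-up selection could then threshold on the probability at any level, for example keeping alerts whose SN-like probability is high even when the subtype is uncertain.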