# PLAsTiCC v2 taxonomy

_Alex Malz (GCCL@RUB)_ (add your name here)

The purpose of this notebook is to outline a bitmask schema for hierarchical classes of LSST alerts.
The bitmask corresponds to a "best" classification to be included in the alert.
Each digit in the bitmask, however, corresponds to a vector of classification probabilities, confidence flags, or scores that can be used to subsample the alert stream.
Persistent features for the subsampled objects could then be queried from a separate database and used for further selection.

In [None]:
from treelib import Node, Tree
import string

## Housekeeping

We need to think about how to sort through the classification information.
`directory` and `index` are very simplistic starting points.
It'll be easier when we have a better idea of what subsampling operations we'll perform.

In [None]:
directory = {}
index = {}

### Generating the integer codes

The idea is that every level of the tree corresponds to one digit in the bitmask.
The number of objects in the

In [None]:
### map real to 1 and bogus to 0 - doesn't need a 0 for alert digit

digs = string.digits + string.ascii_letters

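def int2base(x, base):
    """Encode the integer x in the given base using the `digs` alphabet.

    Digits come out least-significant first, because the final
    reversal below is left commented out.
    """
    if x < 0:
        sign = -1
    elif x == 0:
        return digs[0]
    else:
        sign = 1

    x *= sign
    digits = []

    while x:
        # divmod stays in integer arithmetic, so large x is not rounded by float division
        x, remainder = divmod(x, base)
        digits.append(digs[remainder])

    if sign < 0:
        digits.append('-')

    # digits.reverse()

    return ''.join(digits)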
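
As written, `int2base` emits the least-significant digit first and puts any minus sign last. A minimal sketch of the inverse under that assumption may help when reading codes back; `base2int` is a hypothetical helper, not part of the notebook.

```python
import string

# Same alphabet as the notebook's `digs`: 0-9, then a-z, then A-Z.
digs = string.digits + string.ascii_letters

def base2int(code, base):
    """Hypothetical inverse of int2base for least-significant-first strings."""
    sign = 1
    if code.endswith('-'):  # int2base appends the sign after the digits
        sign, code = -1, code[:-1]
    value = 0
    for position, char in enumerate(code):
        value += digs.index(char) * base ** position
    return sign * value

print(base2int('011', 2))   # 0*1 + 1*2 + 1*4 = 6
print(base2int('f1', 16))   # 15*1 + 1*16 = 31
```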

## Building a phylogenetic tree

Given the hierarchical class relationships, make a tree diagram (and record some hopefully useful information).

In [None]:
# def branch(tree, parent, children, prepend=["Other"], append=None, directory=directory, index=index):
#     level = 0
#     tmp = parent
#     while tree.ancestor(tmp) is not None:
#         level += 1
#         tmp = tree.ancestor(tmp)
#     directory[parent] = {}
#     if prepend is not None:
#         proc_pre = [parent + "/" + pre for pre in prepend]
#         children = proc_pre + children
#     if append is not None:
#         proc_app = [parent + "/" + app for app in append]
#         children = children + proc_app
#     bigbase = len(children)
#     for i, child in enumerate(children):
#         directory[parent][child] = i
#         index[child] = index[parent] + int2base(i, bigbase)# broken + index[parent]
#         tree.create_node(index[child]+" "+child, child, parent=parent)
#     print(f"{parent} {level}")
# #     index[parent] = '0' + index[parent]
#     return(bigbase, directory, index)

In [None]:
maxdep = 2
def branch(tree, parent, children, prepend=["Other"], append=None, directory=directory, index=index):
    if prepend is not None:
        proc_pre = [parent + "/" + pre for pre in prepend]
        children = proc_pre + children
    if append is not None:
        # the loop variable is `app`; the original `appe` raised a NameError
        proc_app = [parent + "/" + app for app in append]
        children = children + proc_app
    # Depth of `parent` below the root (root is level 0); this picks the digit its children fill.
    level = tree.level(parent)
    directory[parent] = {}
    for i, child in enumerate( children ):
        directory[parent][child] = i
#         print(index[parent], type(index[parent]))
        if index[parent] != '':
            index[child] = str(int(index[parent]) + (i+1)* 10 ** (maxdep-level))
        else:
            index[child] = str((i+1)* 10 ** (maxdep-level))
        tree.create_node(index[child]+" "+child, child, parent=parent)
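
The code arithmetic in `branch` can be isolated as a small worked example. This is a sketch only: `child_code` and `MAXDEP` are hypothetical names that mirror the expression used above, with the tree bookkeeping left out.

```python
MAXDEP = 2  # mirrors the notebook's `maxdep`; full codes have MAXDEP + 1 digits

def child_code(parent_code, i, level, maxdep=MAXDEP):
    """Code of the i-th child (0-based) of a parent sitting at depth `level`."""
    base = int(parent_code) if parent_code != '' else 0
    return str(base + (i + 1) * 10 ** (maxdep - level))

# Children of the root fill the hundreds digit.
print([child_code('', i, level=0) for i in range(3)])     # ['100', '200', '300']
# Children of '300', one level down, fill the tens digit.
print([child_code('300', i, level=1) for i in range(3)])  # ['310', '320', '330']
# Children of '320', two levels down, fill the units digit.
print([child_code('320', i, level=2) for i in range(2)])  # ['321', '322']
```

One consequence of using decimal digits: a parent can hold at most nine children (including any prepended "Other") before the sum carries into the next digit. For example, `child_code('300', 9, level=1)` gives `'400'`, which collides with a different branch.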

It would be better to start with something like `directory` than to build it as we go along, but, hey, this is a hack.
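
A declarative starting point might look like the sketch below: the hierarchy is written once as a nested dict and the codes are derived by walking it. The `taxonomy` layout and the `assign_codes` helper are assumptions for illustration. Unlike `branch`, this version does not prepend "Other" and does not zero-pad, so the codes are plain digit prefixes.

```python
# Hypothetical nested-dict form of part of the taxonomy; leaves map to {}.
taxonomy = {
    "Non-Recurring": {"SN-like": {}, "Fast": {}, "Long": {}},
    "Recurring": {"Periodic": {}, "Non-Periodic": {}},
}

def assign_codes(subtree, prefix=""):
    """Walk the nested dict depth-first, yielding (name, digit string) pairs."""
    # Digits start at 1 so that 0 stays free for a reserved slot at each level.
    for i, (name, children) in enumerate(subtree.items(), start=1):
        code = prefix + str(i)
        yield name, code
        yield from assign_codes(children, code)

codes = dict(assign_codes(taxonomy))
print(codes["Non-Recurring"], codes["Long"])   # 1 13
print(codes["Recurring"], codes["Periodic"])   # 2 21
```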

In [None]:
tree = Tree()

# index["Alert"] = ''#int2base(0, 1)
# tree.create_node(index["Alert"] + " " + "Alert", "Alert")

# branch(tree, "Alert", ["Bogus", "Real"])#, prepend=["Unclassified"])

# index["Real"] = ''#int2base(0, 1)
# tree.create_node(index["Real"] + " " + "Alert/Real", "Real")

# branch(tree, "Real", ["Static", "Moving"])#, prepend=['Unclassified'])

index["Static"] = ''#int2base(0, 1)
tree.create_node(index["Static"] + " " + "Alert/Real/Static", "Static")

# need spot for residual, choose not to classify -- metacategory? possibly rename to "Flagged"?
index["Meta"] = "0"# * (maxdep+1)
tree.create_node(index["Meta"] + " Meta", "Meta", parent = "Static")
branch(tree, "Meta", ["Residual", "NotClassified"])

branch(tree, "Static", ["Non-Recurring", "Recurring"])

branch(tree, "Recurring", ["Periodic", "Non-Periodic"])

branch(tree, "Periodic", ["Cepheid", "RR Lyrae", "Delta Scuti", "EB", "LPV/Mira"])

branch(tree, "Non-Periodic", ["AGN"])

branch(tree, "Non-Recurring", ["SN-like", "Fast", "Long"])

branch(tree, "SN-like", ["Ia", "Ib/c", "II", "Iax", "91bg"])

branch(tree, "Fast", ["KN", "M-dwarf Flare", "Dwarf Novae", "uLens"])

branch(tree, "Long", ["SLSN", "TDE", "ILOT", "CART", "PISN"])

tree.show()

Yeah, not sure these are really useful...

In [None]:
print(directory)

In [None]:
print(index)
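
One thing the codes in `index` do allow is subtree selection by digit prefix. The sketch below assumes three-digit codes in which trailing zeros mark unfilled levels, which holds for everything except the "Meta" branch. The alert records and `in_subtree` are hypothetical, and the codes are written out by hand following the scheme above rather than taken from a run.

```python
def in_subtree(code, ancestor_code):
    """True if `code` lies in the subtree rooted at `ancestor_code`."""
    prefix = ancestor_code.rstrip("0")  # trailing zeros mark unfilled levels
    return code.startswith(prefix)

# Made-up alerts carrying a "best" classification code.
alerts = [
    {"alert_id": 1, "code": "322"},  # Cepheid
    {"alert_id": 2, "code": "222"},  # SN Ia
    {"alert_id": 3, "code": "330"},  # Non-Periodic
    {"alert_id": 4, "code": "323"},  # RR Lyrae
]

recurring = [a["alert_id"] for a in alerts if in_subtree(a["code"], "300")]
periodic = [a["alert_id"] for a in alerts if in_subtree(a["code"], "320")]
print(recurring)  # [1, 3, 4]
print(periodic)   # [1, 4]
```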

## Building a structure for hierarchical classification

The whole point of this, for me, is for the classification to have corresponding posterior probabilities, or at least confidence flags or scores, because I'd want to use them to rapidly select follow-up candidates.
[This](https://community.lsst.org/t/projects-involving-irregularly-shaped-data/4466) looks potentially relevant.
I guess it could also be used for packaging up additional features into an alert without bloating it up too much.
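
As a minimal sketch of what such a structure could hold, assume each internal node stores a vector of conditional probabilities over its children. The joint probability of a leaf is then the product of the conditionals along its path. The numbers and the `conditional` layout below are made up for illustration.

```python
# Made-up conditional probabilities P(child | parent) for a single alert.
conditional = {
    "Static": {"Non-Recurring": 0.7, "Recurring": 0.3},
    "Non-Recurring": {"SN-like": 0.8, "Fast": 0.1, "Long": 0.1},
    "Recurring": {"Periodic": 0.9, "Non-Periodic": 0.1},
}

def leaf_posteriors(node, prob=1.0):
    """Multiply conditionals down each path; yield (leaf, joint probability)."""
    children = conditional.get(node)
    if not children:  # no entry means the node is a leaf in this sketch
        yield node, prob
        return
    for child, p in children.items():
        yield from leaf_posteriors(child, prob * p)

posteriors = dict(leaf_posteriors("Static"))
best = max(posteriors, key=posteriors.get)
print(best, round(posteriors[best], 2))  # SN-like 0.56

# Follow-up selection could then threshold on any level's probability.
candidates = [name for name, p in posteriors.items() if p > 0.2]
print(candidates)  # ['SN-like', 'Periodic']
```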