# Hierarchical Tagging

## Tags

Create a collection of tags, each a string of fields separated by ``::``.

In [126]:
from collections import defaultdict

tags = [
    "author::Felder-Rousseau",
    "author::Jones",
    "author::Jones::Chapter 3",
    "exercise::differential equations",
    "exercise",
    "author::Jones::Chapter 1",
    "exercise::differential equations",
    "author::Smith::page 23",
]

for k, tag in enumerate(tags):
    print(k, tag)

0 author::Felder-Rousseau
1 author::Jones
2 author::Jones::Chapter 3
3 exercise::differential equations
4 exercise
5 author::Jones::Chapter 1
6 exercise::differential equations
7 author::Smith::page 23


## Indexing

Scan the tags, split each one on ``::``, and use the resulting tuple as a dictionary key. Each dictionary value is a list of items identifying where the tag occurred. Here we use line numbers (the tag's position in the list); in an indexing application we could use a link to a notebook cell.

In [52]:
dd = defaultdict(list)

# key: tuple of tag fields; value: line numbers on which the tag appears
for cnt, tag in enumerate(tags):
    key = tuple(tag.split("::"))
    dd[key].append(cnt)
dd

defaultdict(list,
            {('author', 'Felder-Rousseau'): [0],
             ('author', 'Jones'): [1],
             ('author', 'Jones', 'Chapter 3'): [2],
             ('exercise', 'differential equations'): [3, 6],
             ('exercise',): [4],
             ('author', 'Jones', 'Chapter 1'): [5],
             ('author', 'Smith', 'page 23'): [7]})

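Because the keys are tuples, every entry under a branch of the hierarchy can be found by comparing key prefixes. The following is a minimal sketch of that idea. The helper name `find_under` is hypothetical and not part of the original notebook, and the index is rebuilt under the name `index` so the block runs on its own.

```python
from collections import defaultdict

tags = [
    "author::Felder-Rousseau",
    "author::Jones",
    "author::Jones::Chapter 3",
    "exercise::differential equations",
    "exercise",
    "author::Jones::Chapter 1",
    "exercise::differential equations",
    "author::Smith::page 23",
]

# Rebuild the index: tuple of fields -> positions of the tag in `tags`.
index = defaultdict(list)
for k, tag in enumerate(tags):
    index[tuple(tag.split("::"))].append(k)


def find_under(index, prefix):
    """Return the sorted positions of all tags whose key starts with `prefix`."""
    prefix = tuple(prefix)
    hits = []
    for key, positions in index.items():
        if key[:len(prefix)] == prefix:
            hits.extend(positions)
    return sorted(hits)


print(find_under(index, ("author", "Jones")))  # [1, 2, 5]
print(find_under(index, ("exercise",)))        # [3, 4, 6]
```

A linear scan over the keys is adequate for a small index like this one; a nested dictionary would be one alternative if lookups by prefix became frequent.
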
## Sorting

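The dictionary keys can be sorted level by level. Python compares sequences element by element, so a sort key that lowercases each field gives a case-insensitive ordering in which a parent tag always precedes its children.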

In [114]:
def sort_fcn(key):
    return [s.lower() for s in key]
for key in sorted(dd.keys(), key=sort_fcn):
    print(key)

('author', 'Felder-Rousseau')
('author', 'Jones')
('author', 'Jones', 'Chapter 1')
('author', 'Jones', 'Chapter 3')
('author', 'Smith', 'page 23')
('exercise',)
('exercise', 'differential equations')

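To see what the lowercasing buys, the sketch below compares the default tuple ordering with the case-insensitive one. The keys here are made up for illustration (the tags above happen to use consistent capitalization), and `sort_key` is a hypothetical stand-in for the notebook's sort function.

```python
# Hypothetical keys with mixed capitalization, chosen only for this demo.
keys = [("author", "jones"), ("author", "Smith"), ("author",), ("Exercise",)]


def sort_key(key):
    # Compare field by field, ignoring case.
    return [s.lower() for s in key]


# Default ordering: uppercase letters sort before lowercase ones.
print(sorted(keys))
# [('Exercise',), ('author',), ('author', 'Smith'), ('author', 'jones')]

# Case-insensitive ordering: fields sort alphabetically regardless of case.
print(sorted(keys, key=sort_key))
# [('author',), ('author', 'jones'), ('author', 'Smith'), ('Exercise',)]
```
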

## Printing

In [139]:
n = max(len(key) for key in dd.keys())
print(n, "deep hierarchy\n")

# prev_fields[lvl] holds the field most recently printed at each level
prev_fields = [""]*n
for key in sorted(dd.keys(), key=sort_fcn):
    for lvl, field in enumerate(key):
        if field != prev_fields[lvl]:
            print("    "*lvl + f"* {field}")
            # a new field starts a new branch, so forget the deeper levels
            prev_fields[lvl:] = [field] + [""]*(n - lvl - 1)
        if lvl == len(key) - 1:
            # last field of the key: list where the tag occurs
            for val in dd[key]:
                print("    "*(lvl+1) + f"[tagged on line {val}]()")
            print("")

3 deep hierarchy

* author
    * Felder-Rousseau
        [tagged on line 0]()

    * Jones
        [tagged on line 1]()

        * Chapter 1
            [tagged on line 5]()

        * Chapter 3
            [tagged on line 2]()

    * Smith
        * page 23
            [tagged on line 7]()

* exercise
    [tagged on line 4]()

    * differential equations
        [tagged on line 3]()
        [tagged on line 6]()


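An alternative to tracking the previously printed fields is to compare each key with the one before it and emit only the fields past their common prefix. The sketch below takes that approach and returns the lines instead of printing them. The function name `render_outline` and the three demo tags are assumptions made for this example, not part of the original notebook.

```python
from collections import defaultdict


def render_outline(index, indent="    "):
    """Return the nested-list lines for an index keyed by tuples of fields."""
    lines = []
    prev = ()
    for key in sorted(index, key=lambda k: [s.lower() for s in k]):
        # Count the leading fields this key shares with the previous key.
        shared = 0
        for a, b in zip(prev, key):
            if a != b:
                break
            shared += 1
        # Emit only the fields that were not already printed.
        for lvl in range(shared, len(key)):
            lines.append(indent * lvl + f"* {key[lvl]}")
        # List the tag locations one level below the last field.
        for pos in index[key]:
            lines.append(indent * len(key) + f"[tagged on line {pos}]()")
        prev = key
    return lines


# Demo tags (hypothetical): the same chapter name under two authors.
demo_tags = ["author::Jones::Chapter 1", "author::Smith::Chapter 1", "author::Jones"]
demo_index = defaultdict(list)
for k, tag in enumerate(demo_tags):
    demo_index[tuple(tag.split("::"))].append(k)

print("\n".join(render_outline(demo_index)))
```

Returning a list of lines keeps the function easy to test and lets the caller decide whether to print the outline or write it into a markdown cell.
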

## Linking to references

In [158]:
references = {
    ("author", "Felder-Rousseau"): "All you ever wanted to know about chemical engineering",
    ("author", "Jones"): "Even more about chemical engineering",
    ("author", "Smith"): "[Comic Relief](https://xkcd.com)"
}

In [159]:
n = max(len(key) for key in dd.keys())
print(n, "deep hierarchy\n")

# prev_fields[lvl] holds the field most recently printed at each level
prev_fields = [""]*n
for key in sorted(dd.keys(), key=sort_fcn):
    for lvl, field in enumerate(key):
        if field != prev_fields[lvl]:
            # a new field starts a new branch, so forget the deeper levels
            prev_fields[lvl:] = [field] + [""]*(n - lvl - 1)
            # look up a reference for the path from the root to this field
            search_key = key[:lvl+1]
            if search_key in references:
                print(search_key, references[search_key])
            print("    "*lvl + f"* {field}")
        if lvl == len(key) - 1:
            # last field of the key: list where the tag occurs
            for val in dd[key]:
                print("    "*(lvl+1) + f"[tagged on line {val}]()")
            print("")

3 deep hierarchy

* author
('author', 'Felder-Rousseau') All you ever wanted to know about chemical engineering
    * Felder-Rousseau
        [tagged on line 0]()

('author', 'Jones') Even more about chemical engineering
    * Jones
        [tagged on line 1]()

        * Chapter 1
            [tagged on line 5]()

        * Chapter 3
            [tagged on line 2]()

('author', 'Smith') [Comic Relief](https://xkcd.com)
    * Smith
        * page 23
            [tagged on line 7]()

* exercise
    [tagged on line 4]()

    * differential equations
        [tagged on line 3]()
        [tagged on line 6]()

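The output above prints each reference on its own line, ahead of the list entry it belongs to. One possible refinement, sketched here under the assumption that the reference text should sit on the same line as the entry, is a small helper that builds the entry label. The name `entry_label` is hypothetical.

```python
references = {
    ("author", "Felder-Rousseau"): "All you ever wanted to know about chemical engineering",
    ("author", "Jones"): "Even more about chemical engineering",
    ("author", "Smith"): "[Comic Relief](https://xkcd.com)",
}


def entry_label(path, references):
    """Build the list entry for the last field of `path`, with its reference if any."""
    note = references.get(tuple(path))
    if note is None:
        return f"* {path[-1]}"
    return f"* {path[-1]}: {note}"


print(entry_label(("author", "Smith"), references))
# * Smith: [Comic Relief](https://xkcd.com)
print(entry_label(("author",), references))
# * author
```

In the printing loop, the label would replace the plain `* {field}` text, with `path` being the fields from the root down to the current level.
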
