# The Cylc GraphParser

The GraphParser class extracts dependency information from Cylc graph strings. This notebook prints its docstring and walks through a few examples of how it can be used.

In [1]:
from cylc.graph_parser import GraphParser
from cylc.version import CYLC_VERSION

print("Cylc version     : {}".format(CYLC_VERSION))
print("GraphParser docs :")
print(GraphParser.__doc__)

Cylc version     : 7.7.1-243-gc7c11
GraphParser docs :
Class for extracting dependency information from cylc graph strings.

    For each task in the graph string, results are stored as:
        self.triggers[task_name][expression] = ([expr_task_names], suicide)
        self.original[task_name][expression] = original_expression

    (original_expression is separated out to allow comparison of triggers
    from different equivalent expressions, e.g. family vs member).

    This is currently intended to process a single multi-line graph string
    (i.e. the content of a single graph section). But it could be extended to
    store dependencies for the whole suite (call parse_graph multiple times
    and key results by graph section).

    The general form of a dependency is "EXPRESSION => NODE", where:
        * On the right, NODE is a task or family name
        * On the left, an EXPRESSION of nodes involving parentheses, and
          logical operators '&' (AND), and '|' (OR).
        *

In [2]:
graph_parser = GraphParser()
graph_parser.parse_graph("a => b")
print(graph_parser.original)

{'a': {'': ''}, 'b': {'a:succeed': 'a:succeed'}}


A GraphParser instance stores state, so calling `parse_graph` more than once on the same instance may mix in results from earlier calls. The safest option is to create a new instance for each graph string.

In [3]:
graph_parser = GraphParser()
graph_parser.parse_graph("a:finish => b:succeed => c")
print(graph_parser.original)

{'a': {'': ''}, 'c': {'b:succeed': 'b:succeed'}, 'b': {'(a:succeed|a:fail)': 'a:finish'}}


The graph created by GraphParser is a [directed acyclic graph](https://en.wikipedia.org/wiki/Directed_acyclic_graph) composed of nodes, e.g. *a*, *b*, and *c*. The arrows (i.e. *=>*) denote the edges, so there is an edge from *a* to *b*, and another edge from *b* to *c*.

The colon (i.e. *:*) qualifies the dependency between two nodes with a trigger. *a:finish => b* can be read as *"a triggers b only when it finishes"*.
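
To make the edge structure concrete, here is a minimal sketch in plain Python that splits a chained graph line into (left expression, right task) pairs. It does not use Cylc, and `chain_to_edges` is a hypothetical helper written for this notebook, not part of the GraphParser API. It only handles a single chain without `&`, `|`, or parentheses.

```python
def chain_to_edges(line):
    """Split "x => y => z" into a list of (left, right_task) pairs."""
    parts = [part.strip() for part in line.split("=>")]
    edges = []
    for left, right in zip(parts[:-1], parts[1:]):
        # A qualifier on the right-hand node belongs to the next edge,
        # so only the task name is kept on the right.
        edges.append((left, right.split(":")[0]))
    return edges


edges = chain_to_edges("a:finish => b:succeed => c")
print(edges)
```

The pairs mirror the keys seen in `graph_parser.original` above: *b* depends on *a:finish*, and *c* depends on *b:succeed*.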

For the current version of Cylc the supported triggers are:

In [4]:
[getattr(graph_parser, trigger)[1:] for trigger in dir(graph_parser) if trigger.startswith('TRIG')]

['fail', 'finish', 'succeed']

As you may have noticed from the previous example, *finish* is equivalent to either *succeed* or *fail*. Trigger names are not limited to these three, however. GraphParser treats any other name as a custom value and still parses the graph without failing.
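
The rules seen so far can be summarised in a small sketch: a bare node defaults to *succeed*, *finish* expands to *succeed* or *fail*, and any other name is kept as written. This is plain Python that only mimics the outputs shown in this notebook. `expand_trigger` is a hypothetical name, and the real GraphParser may implement this differently.

```python
def expand_trigger(node):
    """Return the expression a left-hand node stands for (illustrative only)."""
    # Split "name:trigger"; a bare name is treated as "name:succeed".
    name, _, trigger = node.partition(":")
    trigger = trigger or "succeed"
    if trigger == "finish":
        # "finish" means the task ended, whether it succeeded or failed.
        return "({0}:succeed|{0}:fail)".format(name)
    # Any other qualifier, built-in or custom, is kept unchanged.
    return "{}:{}".format(name, trigger)


for node in ["a", "a:finish", "a:fail", "a:fire"]:
    print("{:10} -> {}".format(node, expand_trigger(node)))
```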

In [5]:
graph_parser = GraphParser()

graph_parser.parse_graph("a:fire => b:wakeup => c")
print(graph_parser.original)
print("")
print(graph_parser.triggers)
print("")
# TODO: print_triggers() is never used in Cylc, and its output is not very useful. It could probably be removed.
graph_parser.print_triggers()

{'a': {'': ''}, 'c': {'b:wakeup': 'b:wakeup'}, 'b': {'a:fire': 'a:fire'}}

{'a': {'': ([], False)}, 'c': {'b:wakeup': (['b:wakeup'], False)}, 'b': {'a:fire': (['a:fire'], False)}}

('\nTASK:', 'a')
(' ', 'TRIGGER:', '')
('  from', '')
('\nTASK:', 'c')
(' ', 'TRIGGER:', 'b:wakeup')
('    +', 'b:wakeup')
('  from', 'b:wakeup')
('\nTASK:', 'b')
(' ', 'TRIGGER:', 'a:fire')
('    +', 'a:fire')
('  from', 'a:fire')
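
The `triggers` dictionary is easier to work with directly than through `print_triggers()`. The sketch below copies the dictionary printed above into a literal, so it runs without Cylc, and lists the upstream tasks of each task. `upstream_tasks` is a hypothetical helper, and the tuple layout `([expr_task_names], suicide)` is taken from the class docstring.

```python
# Copied from the output of print(graph_parser.triggers) above.
triggers = {
    'a': {'': ([], False)},
    'c': {'b:wakeup': (['b:wakeup'], False)},
    'b': {'a:fire': (['a:fire'], False)},
}


def upstream_tasks(triggers):
    """Map each task to the sorted names of the tasks it waits on."""
    result = {}
    for task, expressions in triggers.items():
        names = set()
        for expr_task_names, suicide in expressions.values():
            for qualified in expr_task_names:
                # Drop the trigger qualifier: "b:wakeup" -> "b".
                names.add(qualified.split(":")[0])
        result[task] = sorted(names)
    return result


deps = upstream_tasks(triggers)
for task in sorted(deps):
    print("{} <- {}".format(task, deps[task]))
```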


## Task Families

We can group tasks (i.e. nodes) in logical families. This allows us to trigger multiple nodes at once.

In [6]:
fam_map = {'FAM': ['m1', 'm2'], 'BAM': ['b1', 'b2']}

graph_parser = GraphParser(fam_map)
graph_parser.parse_graph("FAM:succeed-all => BAM")

print(graph_parser.original)
print("")
print(graph_parser.triggers)

{'m1': {'': ''}, 'b1': {'(m1:succeed&m2:succeed)': 'FAM:succeed-all'}, 'b2': {'(m1:succeed&m2:succeed)': 'FAM:succeed-all'}, 'm2': {'': ''}}

{'m1': {'': ([], False)}, 'b1': {'(m1:succeed&m2:succeed)': (['m1:succeed', 'm2:succeed'], False)}, 'b2': {'(m1:succeed&m2:succeed)': (['m1:succeed', 'm2:succeed'], False)}, 'm2': {'': ([], False)}}


Cylc's GraphParser expands the task family above, so the same graph can equally be written as:

In [7]:
graph_parser = GraphParser()
graph_parser.parse_graph("""
    (m1 & m2) => b1
    (m1 & m2) => b2
""")

print(graph_parser.original)
print("")
print(graph_parser.triggers)

{'m1': {'': ''}, 'b1': {'(m1:succeed&m2:succeed)': '(m1:succeed&m2:succeed)'}, 'b2': {'(m1:succeed&m2:succeed)': '(m1:succeed&m2:succeed)'}, 'm2': {'': ''}}

{'m1': {'': ([], False)}, 'b1': {'(m1:succeed&m2:succeed)': (['m1:succeed', 'm2:succeed'], False)}, 'b2': {'(m1:succeed&m2:succeed)': (['m1:succeed', 'm2:succeed'], False)}, 'm2': {'': ([], False)}}


In the example above, `original` is different (it records the expanded expression rather than the family trigger *FAM:succeed-all*), but the resulting triggers are the same. So Cylc would evaluate the graph in the same manner, with or without families.

There are special trigger modifiers that can be used only with families:

In [8]:
[getattr(graph_parser, trigger)[1:] for trigger in dir(graph_parser) if trigger.startswith('FAM_TRIG_EXT')]

['all', 'any']

So you can combine any of the triggers displayed previously with these modifiers, as long as the node refers to a family.

In [9]:
triggers  = [getattr(graph_parser, trigger)[1:] for trigger in dir(graph_parser) if trigger.startswith('TRIG')]
modifiers = [getattr(graph_parser, trigger)[1:] for trigger in dir(graph_parser) if trigger.startswith('FAM_TRIG_EXT')]

for trigger in triggers:
    for modifier in modifiers:
        print("{}-{}".format(trigger, modifier))

fail-all
fail-any
finish-all
finish-any
succeed-all
succeed-any
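
As a rough illustration of what these modifiers do, the sketch below expands a family trigger over its members in plain Python. The *-all* result matches the expression shown earlier for *FAM:succeed-all*. Joining members with `|` for *-any* is an assumption made by analogy, and `expand_family_trigger` is a hypothetical helper rather than Cylc code. It also leaves *finish* unexpanded.

```python
def expand_family_trigger(node, fam_map):
    """Expand "FAM:trigger-all" or "FAM:trigger-any" over the family members."""
    family, _, qualifier = node.partition(":")
    trigger, _, modifier = qualifier.rpartition("-")
    # "-all" joins the members with AND; "-any" is assumed to join them with OR.
    operator = {"all": "&", "any": "|"}[modifier]
    members = ["{}:{}".format(member, trigger) for member in fam_map[family]]
    return "(" + operator.join(members) + ")"


fam_map = {'FAM': ['m1', 'm2'], 'BAM': ['b1', 'b2']}
print(expand_family_trigger("FAM:succeed-all", fam_map))
print(expand_family_trigger("FAM:succeed-any", fam_map))
```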


## Parameterized graphs

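Graphs in Cylc also support parameters. A node such as `b<city>` is expanded using the parameter's values and its name template, as the example below shows.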

In [10]:
graph_parser = GraphParser(None, ({'city': ['tokyo']}, {'city': '_%(city)s'}))
graph_parser.parse_graph("a => b<city>")
# Note there is no b, but b_tokyo instead
print(graph_parser.original)

{'a': {'': ''}, 'b_tokyo': {'a:succeed': 'a:succeed'}}
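
The template `'_%(city)s'` is an ordinary Python `%`-format string, which is how *b<city>* becomes *b_tokyo*. The sketch below reproduces that substitution without Cylc for a node with a single parameter. `expand_parameters` is a hypothetical helper, and the second value `'lima'` is added purely for illustration.

```python
import re


def expand_parameters(node, values, templates):
    """Return one task name per value of the "<param>" found in node."""
    match = re.search(r"<(\w+)>", node)
    if match is None:
        # No parameter in the node name: nothing to expand.
        return [node]
    param = match.group(1)
    names = []
    for value in values[param]:
        # e.g. '_%(city)s' % {'city': 'tokyo'} gives '_tokyo'
        suffix = templates[param] % {param: value}
        names.append(node.replace(match.group(0), suffix))
    return names


values = {'city': ['tokyo', 'lima']}  # 'lima' is a made-up extra value
templates = {'city': '_%(city)s'}
print(expand_parameters("b<city>", values, templates))
```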
