# Rules

In [None]:
#| default_exp rules

In [None]:
#| hide
from nbdev.showdoc import show_doc
%load_ext autoreload
%autoreload 2


### Overview
Up until now, we've covered the parsing and matching phases of graph transformation. The user provides specifications for 3 NetworkX graphs which, when combined, denote a single **rule** of graph transformation:

1. **LHS** - A **pattern** graph which is searched for in the input graph. For each match, we apply the transformation. Previous notebooks covered how LHS graphs are used to find matches in the input graph.

2. **P** (optional) - Denotes the parts of the pattern which we want to **preserve** during the transformation. Nodes, edges and attributes that we want to preserve from LHS appear in P as well; anything that appears in LHS but not in P is thus removed during the transformation.
    - When combined with LHS, P can also denote the **cloning** of LHS nodes. In the context of this library, the cloning of an LHS node $n$ is the creation of a new node (that appears in P) which has the same attributes as $n$, and whose edges are clones of the edges connected to $n$ in LHS.

3. **RHS** (optional) - Denotes the **interface of the graph after the transformation**. As a result, we can infer from it the nodes, edges and attributes that we add to the preserved parts, resulting in the final transformation. Nodes, edges and attributes that appear in RHS but not in P are added.
    - When combined with P, RHS can also denote the **merging** of P nodes. In the context of this library, the merging of P nodes $x_1, ..., x_k$ is the creation of a new node (that appears in RHS) whose attributes and edges are the merge of the corresponding components in $x_1,...,x_k$.

The following module defines the **Rule** class which, given these three graphs, finds the nodes, edges and attributes to add, remove, clone or merge as dictated by the rule. These findings are used in the final module, which executes the transformation itself.
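
As a rough illustration of how the three graphs determine the operations, the sketch below works on node names only and represents the P->LHS and P->RHS mappings as plain dictionaries. The function name `classify_nodes` and the sample node names are hypothetical and not part of the library; the `Rule` class defined later works on NetworkX graphs and also handles edges and attributes.

```python
def classify_nodes(lhs_nodes, rhs_nodes, p_to_lhs, p_to_rhs):
    """Classify nodes by counting how many P nodes map to each LHS / RHS node."""
    lhs_preimages, rhs_preimages = {}, {}
    for p_node, lhs_node in p_to_lhs.items():
        lhs_preimages.setdefault(lhs_node, set()).add(p_node)
    for p_node, rhs_node in p_to_rhs.items():
        rhs_preimages.setdefault(rhs_node, set()).add(p_node)

    return {
        # LHS nodes with no P node mapped to them are removed
        "remove": {n for n in lhs_nodes if n not in lhs_preimages},
        # LHS nodes with more than one P node mapped to them are cloned
        "clone": {n: lhs_preimages[n] for n in lhs_nodes
                  if len(lhs_preimages.get(n, set())) > 1},
        # RHS nodes with more than one P node mapped to them are merges
        "merge": {n: rhs_preimages[n] for n in rhs_nodes
                  if len(rhs_preimages.get(n, set())) > 1},
        # RHS nodes with no P node mapped to them are added
        "add": {n for n in rhs_nodes if n not in rhs_preimages},
    }


# Hypothetical rule: clone a, merge b and c, remove d, add e
result = classify_nodes(
    lhs_nodes={"a", "b", "c", "d"},
    rhs_nodes={"a*1", "a*2", "b&c", "e"},
    p_to_lhs={"a*1": "a", "a*2": "a", "b": "b", "c": "c"},
    p_to_rhs={"a*1": "a*1", "a*2": "a*2", "b": "b&c", "c": "b&c"},
)
print(result["remove"])  # {'d'}
print(result["add"])     # {'e'}
```

In this sketch, `result["clone"]` maps `a` to its two P copies and `result["merge"]` maps `b&c` to the P nodes `b` and `c`.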

### Requirements

In [None]:
#| export
from typing import *
from networkx import DiGraph

from graph_rewrite.core import GraphRewriteException, NodeName, EdgeName, _create_graph, draw
from graph_rewrite.match_class import *
from graph_rewrite.lhs import lhs_to_graph
from graph_rewrite.p_rhs_parse import RenderFunc, p_to_graph, rhs_to_graph

### Merge Policy
One of the features our rules allow is merging nodes. Nodes which appear in P are merged into new RHS nodes, along with their attributes. This might cause an ambiguity when merging P nodes which share one or more attributes with the same name (but different values): what should be the value of that attribute in the new merged RHS node?

We offer two policies for resolving this conflict:
1. **choose_last** (default) - if the merged nodes with the shared attribute are $x_1,...,x_n$, then the value of the attribute in the merged node will be that of $x_n$.
2. **union** - if the merged nodes with the shared attribute are $x_1,...,x_n$, with the attribute values $v_1,...,v_n$ respectively, then the merged node keeps all of these values. Note that `MergePolicy.union` below combines two colliding values $v_1, v_2$ into the list `[v_1, v_2]`, not into a set.
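
To see the two policies side by side, here is a small standalone sketch that mirrors the `MergePolicy` class defined below using plain functions. The attribute dictionaries `x1`, `x2`, `x3` are made-up examples, and folding the policy over three dictionaries with `functools.reduce` is an assumption about how a caller might chain it.

```python
from functools import reduce
from typing import Any, Callable


def merge_dicts(d1: dict, d2: dict, on_collision: Callable[[Any, Any], Any]) -> dict:
    """Merge two dicts, resolving shared keys with the given collision policy."""
    merged = dict(d1)
    for key, value in d2.items():
        merged[key] = on_collision(d1[key], value) if key in d1 else value
    return merged


def choose_last(d1: dict, d2: dict) -> dict:
    # On collision, keep the value of the later dictionary
    return merge_dicts(d1, d2, lambda v1, v2: v2)


def union(d1: dict, d2: dict) -> dict:
    # On collision, keep both values in a two-element list
    return merge_dicts(d1, d2, lambda v1, v2: [v1, v2])


x1 = {"color": "red", "size": 1}
x2 = {"color": "blue"}
x3 = {"color": "green", "shape": "round"}

print(choose_last(x1, x2))  # {'color': 'blue', 'size': 1}
print(union(x1, x2))        # {'color': ['red', 'blue'], 'size': 1}
print(reduce(union, [x1, x2, x3]))
# {'color': [['red', 'blue'], 'green'], 'size': 1, 'shape': 'round'}
```

Because the policy is applied pairwise, folding `union` over more than two dictionaries nests the lists, as the last line shows. A caller who wants a flat collection would need a different collision function.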

In [None]:
#| export
class MergePolicy:
    """Static class for policies for solving conflicts when merging nodes with shared attributes.
    """
    @staticmethod
    def _merge_dicts(dict1: dict, dict2: dict, collision_policy: Callable[[Any, Any], Any]) -> dict:
        """A generic dictionary merger, which solves conflicts with a given collision policy.

        Args:
            dict1 (dict): A dictionary
            dict2 (dict): A dictionary
            collision_policy (Callable[[Any, Any], Any]): A function that receives two values and returns a new value (for solving the conflict).

        Returns:
            dict: The merged dictionary according to the collision policy.
        """
        merged = {}
        for key in dict1.keys():
            if key in dict2.keys():
                merged[key] = collision_policy(dict1[key], dict2[key])
            else:
                merged[key] = dict1[key]
        for key in dict2.keys():
            if key not in dict1.keys():
                merged[key] = dict2[key]
        return merged
    
    @staticmethod
    def choose_last(dict1: dict, dict2: dict) -> dict:
        """Merge two dictionaries, such that for each attribute x they share, its merged value is dict2[x].

        Args:
            dict1 (dict): A dictionary
            dict2 (dict): A dictionary

        Returns:
            dict: The merged dictionary
        """
        return MergePolicy._merge_dicts(dict1, dict2, lambda v1, v2: v2)
    
    @staticmethod
    def union(dict1: dict, dict2: dict) -> dict:
        """Merge two dictionaries, such that for each attribute x they share, its merged value is a list that contains both dict1[x] and dict2[x].

        Args:
            dict1 (dict): A dictionary
            dict2 (dict): A dictionary

        Returns:
            dict: The merged dictionary.
        """
        return MergePolicy._merge_dicts(dict1, dict2, 
                                           lambda v1, v2: [v1, v2])

### Rules Definition
The following is the complete definition of the Rule class (and the exceptions it might raise).
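
The class encodes clones and merges in node names: a P node named `{node}*{copy_num}` is a clone of the LHS node `{node}`, and an RHS node named `{x1}&{x2}&...` merges the listed P nodes. The sketch below reproduces only that name parsing on plain strings. The helper names `clone_origin` and `merge_origins` are hypothetical, and the sketch raises `ValueError` where the class raises `GraphRewriteException`.

```python
CLONE_SYM, MERGE_SYM = "*", "&"


def clone_origin(p_node: str, lhs_nodes: set) -> str:
    """Return the LHS node that a P node preserves or clones."""
    if CLONE_SYM in p_node:
        parts = p_node.split(CLONE_SYM)
        if len(parts) != 2:
            raise ValueError(f"Node {p_node} has a badly formatted name.")
        lhs_node, copy_num = parts
        if lhs_node not in lhs_nodes:
            raise ValueError(f"Node {p_node} clones a non-existing node {lhs_node}.")
        if not copy_num.isnumeric():
            raise ValueError(f"Node {p_node} clone id {copy_num} is illegal.")
        return lhs_node
    if p_node in lhs_nodes:
        return p_node  # plain preservation: same name in P and LHS
    raise ValueError(f"Node {p_node} in P does not exist in LHS.")


def merge_origins(rhs_node: str, p_nodes: set) -> list:
    """Return the P nodes merged by an RHS node (empty list if it is not a merge)."""
    if MERGE_SYM not in rhs_node:
        return []
    origins = rhs_node.split(MERGE_SYM)
    if not all(origin in p_nodes for origin in origins):
        raise ValueError(f"Node {rhs_node} merges at least one non-existing P node.")
    return origins


print(clone_origin("a*1", {"a", "b"}))           # a
print(merge_origins("a*1&b", {"a*1", "a*2", "b"}))  # ['a*1', 'b']
```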

In [None]:
#| export
_exception_msgs = {
    "clone_non_existing": lambda p_node, lhs_node: f"Node {p_node} clones a non-existing node {lhs_node}.",
    "clone_illegal_id": lambda p_node, copy_num: f"Node {p_node} clone id {copy_num} is illegal.",
    "p_bad_format": lambda p_node: f"Node {p_node} has a badly formatted name.",
    "p_not_in_lhs": lambda p_node: f"Node {p_node} in P does not exist in LHS.",
    "p_edge_not_in_lhs": lambda p_s, p_t: f"Edge {(p_s, p_t)} in P does not exist (and doesn't clone any edge) in LHS.",
    "rhs_illegal_name": lambda rhs_node: f"Node {rhs_node} merges at least one non-existing P node.",
    "rhs_not_in_p": lambda p_node: f"Node {p_node} in P does not exist in RHS, nor merges into an RHS node.",
    "add_attrs_in_p_node": lambda p_node: f"P node {p_node} cannot add attributes.",
    "add_attrs_in_p_edge": lambda s_copy, t_copy: f"P edge ({s_copy},{t_copy}) cannot add attributes.",
    "remove_attrs_in_rhs_node": lambda rhs_node: f"RHS node {rhs_node} cannot remove attributes.",
    "remove_attrs_in_rhs_edge": lambda s, t: f"RHS edge ({s},{t}) cannot remove attributes.",
    "attrs_in_cloned_node": lambda p_node: f"Cloned node {p_node} in P should not explicitly mention attributes",
    "attrs_in_cloned_edge": lambda s_copy, t_copy: f"Cloned edge ({s_copy},{t_copy}) in P should not explicitly mention attributes"
}

In [None]:
#| export
class Rule:
    """A transformation rule, defined by 1-3 graphs:
    - LHS - defines the pattern to search for in the graph. It is built from the single-node and collection patterns, using the match object to go over all nodes and edges in the match.
    - P - defines what parts to preserve (and also defines clones).
    - RHS - defines what parts to add (and also defines merges).
    """
    def __init__(self, match: Match, single_nodes_lhs: DiGraph, collections_lhs: DiGraph = None, 
                 p: DiGraph = None, rhs: DiGraph = None,
                 merge_policy = MergePolicy.choose_last):
        self.match = match
        self.single_nodes_lhs = single_nodes_lhs
        self.collections_lhs = collections_lhs
        self.lhs = self._create_lhs_graph()
        self.p = p if p else self.lhs_p_copy(self.lhs)
        self.rhs = rhs if rhs else self.p.copy()
        self.merge_policy = merge_policy

        self._p_to_lhs, self._p_to_rhs = {}, {}
        self._merge_sym, self._clone_sym = '&', '*'
        self._create_p_lhs_hom()
        self._create_p_rhs_hom()

        self._rev_p_lhs = self._reversed_dictionary(self._p_to_lhs)
        self._rev_p_rhs = self._reversed_dictionary(self._p_to_rhs)
        self._validate_rule()

    # Utils
    def _create_lhs_graph(self):
        g = DiGraph()
        for node in self.match.get_pattern_nodes():
            g.add_node(node)
            if self.match.is_single(node):
                node_attrs = self.single_nodes_lhs.nodes[node]
            else:
                node_attrs = self.collections_lhs.nodes[node]
            g.nodes[node].update(node_attrs)
        for src, dst in self.match.get_pattern_edges():
            g.add_edge(src, dst)
            if self.match.is_single(src) and self.match.is_single(dst):
                edge_attrs = self.single_nodes_lhs.edges[src, dst]
            else:
                edge_attrs = self.collections_lhs.edges[src, dst]
            g.edges[src, dst].update(edge_attrs)
        return g

    def _create_p_lhs_hom(self):
        """Construct the homomorphism from P to LHS based on the rule.
        Handles cloned nodes.
        """
        for p_node in self.p.nodes():
            # If the p_node contains the cloning symbol, then it's a clone. Extract the clone it denotes (if any)
            if self._clone_sym in str(p_node):
                if len(str(p_node).split(self._clone_sym)) == 2:
                    lhs_node, copy_num = str(p_node).split(self._clone_sym)
                    # Clones must have the format "{node}*{copy_num}" where node is in LHS, copy_num is a number
                    if lhs_node not in self.lhs.nodes():
                        raise GraphRewriteException(_exception_msgs["clone_non_existing"](p_node, lhs_node))
                    elif not copy_num.isnumeric():
                        raise GraphRewriteException(_exception_msgs["clone_illegal_id"](p_node, copy_num))
                    else:
                        # Map the clone p node to the cloned lhs node
                        self._p_to_lhs[p_node] = lhs_node
                # Clones must have the format "{node}*{copy_num}"
                else:
                    raise GraphRewriteException(_exception_msgs["p_bad_format"](p_node))
            # Else, p_node is a preservation of an lhs_node or a collection with the same name
            elif p_node in self.lhs.nodes():
                self._p_to_lhs[p_node] = p_node
            # If it's neither, then the p_node is illegal (does not preserve / clone)
            else:
                raise GraphRewriteException(_exception_msgs["p_not_in_lhs"](p_node))
             
    def _create_p_rhs_hom(self):
        """Construct the homomorphism from P to RHS based on the rule.
        Handles merged nodes.
        """
        for rhs_node in self.rhs.nodes():
            # If the rhs_node has the merging symbol, then it's a merge. Extract the P nodes it merges (if any)
            if self._merge_sym in str(rhs_node):
                # Merged node must have the format "{}&{}&...&{}" where each argument is a P node
                if len(str(rhs_node).split(self._merge_sym)) > 1:
                    p_nodes = str(rhs_node).split(self._merge_sym)
                    # Check that the merge references only existing p nodes
                    if all([p_node in self.p.nodes() for p_node in p_nodes]):
                        # If so, map each p_node to the new merged rhs node
                        for p_node in p_nodes:
                            self._p_to_rhs[p_node] = rhs_node 
                    else:
                        raise GraphRewriteException(_exception_msgs["rhs_illegal_name"](rhs_node))
        for p_node in self.p.nodes():
            # Every node in P must be mapped to its preserved RHS node, or to the node that merges it (this case is already handled at this point)
            if p_node not in self._p_to_rhs.keys():
                if p_node in self.rhs.nodes():
                    self._p_to_rhs[p_node] = p_node
                else:
                    raise GraphRewriteException(_exception_msgs["rhs_not_in_p"](p_node))

    def _reversed_dictionary(self, dictionary: dict) -> dict[Any, set]:
        """Given a dictionary, return a dictionary which maps every
        value from the original dictionary to the set of keys 
        that are mapped to it.

        E.g., for {1: 'a', 2: 'a', 3: 'b'}, the reversing function
        returns {'a': {1,2}, 'b': {3}}.

        Args:
            dictionary (dict): A dictionary to reverse

        Returns:
            dict[Any, set]: A reversed dictionary as described
        """
        rev_dict: dict[Any, set] = {}
        for key, value in dictionary.items():
            if value not in rev_dict:
                rev_dict[value] = set()
            rev_dict[value].add(key)
        return rev_dict

    def _dict_difference(self, target: dict, other: dict) -> dict:
        """Given two dictionaries, create a new dictionary which "subtracts" the other dictionary from the target one:
        For each key in target, if it does not appear in the other dictionary, or appears there with a different value,
        then we map the key to the target value in the new dictionary. Otherwise, the key is not included in the new dictionary.

        Args:
            target (dict): A dictionary
            other (dict): A dictionary to subtract from target

        Returns:
            dict: The difference dictionary as explained
        """
        new_dict = {}
        for key in target:
            if key in other and target[key] != other[key]:
                new_dict[key] = target[key]
            elif key not in other:
                new_dict[key] = target[key]
        return new_dict
    
    def lhs_p_copy(self, G: DiGraph):
        '''Copy a graph (specifically lhs graph), to the format of a p graph.
        When the input p is None we choose p to be exactly the lhs, except the attributes are set to (None, None), to support 
        input of the shape NodeName[attribute1]-[attribute2]->NodeName2 with no need to specify the attribute value'''
        H = DiGraph()
        for node, attrs in G.nodes(data=True):
            H.add_node(node, **{key: (None, None) for key in attrs})
        for u, v, attrs in G.edges(data=True):
            H.add_edge(u, v, **{key: (None, None) for key in attrs})

        return H
    
    def format_value_attribute(self, d: dict):
        '''Change an input dictionary of {attribute_name:(type,value)} format to {attribute_name:value} format
        '''
        return {key:d[key][1] for key in d}

    # TODO: need to add a check that there are no contradictions caused by mapping the same input node to a single pattern node and a collection pattern node
    # For example: 
    # 1. Removing a node/edge as the single pattern node, but keeping it as the collection pattern node, or vice versa
    # 2. Removing an attribute from a node/edge in the single pattern node, but keeping it in the collection pattern node, or vice versa
    def _validate_lhs_p(self):
        """Validates the P->LHS homomorphism, and raises appropriate exceptions if it's invalid.
        """
        # We ensure that the attributes of the nodes and edges in P are a subset of the attributes of the corresponding nodes and edges in LHS.
        # This is done to ensure that we don't add attributes to nodes or edges in P that are not present in the corresponding LHS nodes or edges.
        
        
        lhs_nodes_attr_dict = {pattern_node: self.lhs.nodes[pattern_node] for pattern_node in self.lhs.nodes()}

        for lhs_pattern_node in lhs_nodes_attr_dict:
            lhs_attr_names = set(lhs_nodes_attr_dict[lhs_pattern_node].keys())
            p_copies_of_node = self._rev_p_lhs.get(lhs_pattern_node, set())
            for p_node in p_copies_of_node:
                p_node_attr_names = set(self.p.nodes(data=True)[p_node].keys())
                if not p_node_attr_names.issubset(lhs_attr_names):
                    raise GraphRewriteException(_exception_msgs["add_attrs_in_p_node"](p_node))

        # We ensure that the attributes of the edges in P are a subset of the attributes of the corresponding edges in LHS.
        # The reason for using *pattern_edge in the next line is to unpack the pattern_edge tuple into two variables
        lhs_edges_attr_dict = {pattern_edge: self.lhs.get_edge_data(*pattern_edge) for pattern_edge in self.lhs.edges()}

        for (src,dst) in lhs_edges_attr_dict:
            lhs_attr_names = lhs_edges_attr_dict[(src, dst)]
            p_copies_of_src, p_copies_of_dst = self._rev_p_lhs.get(src, set()), self._rev_p_lhs.get(dst, set())
            for s_copy in p_copies_of_src:
                for t_copy in p_copies_of_dst:
                    if (s_copy, t_copy) in self.p.edges():
                        p_attrs = set(self.p.get_edge_data(s_copy, t_copy).keys())
                        if not p_attrs.issubset(lhs_attr_names):
                            raise GraphRewriteException(_exception_msgs["add_attrs_in_p_edge"](s_copy, t_copy))

        # Edges in P must have a corresponding LHS edge
        for p_s, p_t in self.p.edges():
            if (self._p_to_lhs[p_s], self._p_to_lhs[p_t]) not in self.lhs.edges():
                raise GraphRewriteException(_exception_msgs["p_edge_not_in_lhs"](p_s, p_t))

    # TODO: need to add a check that there are no contradictions caused by mapping the same input node to a single pattern node and a collection pattern node
    # For example:
    # 1. Adding an attribute to a node in the single pattern node, and also adding it in the collection pattern node
    # 2. Assigning different values to the same attribute in the single pattern node and the collection pattern node
    def _validate_rhs_p(self):
        """Validates the P->RHS homomorphism, and raises appropriate exceptions if it's invalid.
        """

        # Nodes in RHS do NOT remove attributes that are in the corresponding P node(s).
        # Note that we ignore merged nodes here (which follow this rule automatically).
        for node_rhs in self.rhs.nodes():
            if node_rhs not in self.nodes_to_merge().keys():
                rhs_attrs = set(self.rhs.nodes(data=True)[node_rhs].keys())
                p_origins = self._rev_p_rhs.get(node_rhs, set())
                for node_p in p_origins:
                    p_attrs = set(self.p.nodes(data=True)[node_p].keys())
                    if not p_attrs.issubset(rhs_attrs):
                        raise GraphRewriteException(_exception_msgs["remove_attrs_in_rhs_node"](node_rhs))
        
        # Edges in RHS do NOT remove attributes that are in the corresponding P edge(s).
        # Note that we ignore edges created by a merge here (they follow this rule automatically).
        for s, t in self.rhs.edges():
            if s not in self.nodes_to_merge().keys() and t not in self.nodes_to_merge().keys():
                rhs_attrs = set(self.rhs.get_edge_data(s, t).keys())
                s_origins, t_origins = self._rev_p_rhs.get(s, set()), self._rev_p_rhs.get(t, set())
                for s_origin in s_origins:
                    for t_origin in t_origins:
                        if (s_origin, t_origin) in self.p.edges():
                            origin_attrs = set(self.p.get_edge_data(s_origin, t_origin).keys())
                            if not origin_attrs.issubset(rhs_attrs):
                                raise GraphRewriteException(_exception_msgs["remove_attrs_in_rhs_edge"](s,t))

    def _validate_rule(self):
        """Validates the rule - that is, checking that the homomorphisms are valid, and that clones mentioned in the rule
        are valid (if any exist).
        """

        self._validate_lhs_p()
        self._validate_rhs_p()
        
        clones = {item for clones_list in self.nodes_to_clone().values() for item in clones_list}
        # validate cloned nodes in P have no attributes mentioned (all attributes are copied automatically)
        for clone in clones:
            if self.p.nodes(data=True)[clone] != {}:
                raise GraphRewriteException(_exception_msgs["attrs_in_cloned_node"](clone))
        
        # validate cloned edges in P (edges with cloned endpoint) have no attributes mentioned
        for s, t, attrs in self.p.edges(data=True):
            if (s in clones or t in clones) and attrs != {}:
                raise GraphRewriteException(_exception_msgs["attrs_in_cloned_edge"](s, t))

    def _merge_node_attrs(self, rhs_node: NodeName, p_origins: list[NodeName]) -> dict:
        """Given a node in RHS that is a copy / a merge of one or more nodes in P,
        and the P nodes which it copies / merges, returns the dictionary of new attributes
        added to the merged node in RHS (That is, not including the attributes which stem from the merge, other than merged attributes which were overridden in RHS).

        Args:
            rhs_node (NodeName): A node in RHS
            p_origins (list[NodeName]): A list of P nodes which rhs_node merges.

        Returns:
            dict: A dictionary of added attributes (keys and values) of the merged RHS node.
        """
        merge_rhs_attrs = {}
        for p_origin in p_origins:
            new_rhs_attrs = self._dict_difference(self.rhs.nodes[rhs_node], self.p.nodes[p_origin])
            new_rhs_attrs = self.format_value_attribute(new_rhs_attrs)
            merge_rhs_attrs = self.merge_policy(merge_rhs_attrs, new_rhs_attrs)
        return merge_rhs_attrs

    def _merge_edge_attrs(self, rhs_edge: EdgeName, s_origins: list[NodeName], t_origins: list[NodeName]) -> dict:
        """Given an edge in RHS that is possibly a copy / a merge of one or more edges,
        and the copies / merges of both its endpoints, returns the dictionary of new attributes
        added to the merged edge in RHS (That is, not including the attributes which stem from the merge, other than merged attributes which were overridden in RHS).

        Args:
            rhs_edge (EdgeName): An edge in RHS
            s_origins (list[NodeName]): A list of P nodes which the source endpoint of the edge merges.
            t_origins (list[NodeName]): A list of P nodes which the target endpoint of the edge merges.

        Returns:
            dict: A dictionary of added attributes (keys and values) of the merged RHS edge.
        """
        merge_rhs_attrs = {}
        for s_origin in s_origins:
            for t_origin in t_origins:
                if (s_origin, t_origin) in self.p.edges():
                    new_rhs_attrs = self._dict_difference(
                        self.rhs.get_edge_data(*rhs_edge),
                        self.p.get_edge_data(s_origin, t_origin)
                    )
                    new_rhs_attrs = self.format_value_attribute(new_rhs_attrs)
                    merge_rhs_attrs = self.merge_policy(merge_rhs_attrs, new_rhs_attrs)
        return merge_rhs_attrs

    # The following functions are presented in the order of transformation.
    def nodes_to_clone(self) -> dict[NodeName, set[NodeName]]:
        """Find all LHS nodes that should be cloned in P, and for each node, find all its P clones.

        Returns:
            dict[NodeName, set[NodeName]]:
                A dictionary which maps each cloned node in LHS to a set
                of all nodes in P which are its clones.
        """
        # Find all LHS nodes which are mapped by more than one node in P (in the P->LHS Hom.)
        return {lhs_node: self._rev_p_lhs[lhs_node] for lhs_node in self.lhs.nodes() \
                            if len(self._rev_p_lhs.get(lhs_node, set())) > 1}
    
    # TODO: Ensure there is no double deletion of nodes/edges or attributes because of mapping the same input node to a single pattern node and a collection pattern node
    def nodes_to_remove(self) -> set[NodeName]:
        """Find all LHS nodes that should be removed.

        Returns:
            set[NodeName]: Nodes in LHS which should be removed.
        """
        # Find all LHS nodes which are not mapped by any node in P (in the P->LHS Hom.)
        return {lhs_node for lhs_node in self.lhs.nodes() if len(self._rev_p_lhs.get(lhs_node, set())) == 0}

    def edges_to_remove(self) -> set[EdgeName]:
        """Find all P edges that should be removed.

        Note: Does not include edges which one of their endpoints was removed by the rule,
        as during transformation, we begin by removing all removed nodes along with the connected edges.

        Returns:
            set[EdgeName]: Edges in P which should be removed.
        """
        edges_to_remove = set()
        candidate_edges = list(self.lhs.edges())
        for s, t in candidate_edges:
            # If one of the edge endpoints was removed, the edge was removed automatically so we skip it here
            if s not in self.nodes_to_remove() and t not in self.nodes_to_remove():
                s_copies, t_copies = self._rev_p_lhs.get(s, set()), self._rev_p_lhs.get(t, set())
                for s_copy in s_copies:
                    for t_copy in t_copies:
                        # For each "clone of edge (s,t)" that shouldn't be in P, remove it
                        if (s_copy, t_copy) not in self.p.edges():
                            edges_to_remove.add((s_copy, t_copy))
        return edges_to_remove

    def node_attrs_to_remove(self) -> dict[NodeName, set]:
        """For each P node, find all attributes of its corresponding LHS node
        which should be removed from it in P.

        Returns:
            dict[NodeName, set]: A dictionary from P nodes to attributes that should be
                removed from their corresponding LHS nodes.
        """
        attrs_to_remove = {}
        # Add LHS nodes_attr_to_remove
        for node_lhs in self.lhs.nodes():
            if node_lhs not in self.nodes_to_clone().keys(): # cloned nodes do not remove attrs
                p_copies = self._rev_p_lhs.get(node_lhs, set())
                for node_p in p_copies:
                    # Find all attributes that are in the LHS node but not in the new P node
                    lhs_attrs = set(self.lhs.nodes[node_lhs].keys())
                    p_attrs = set(self.p.nodes[node_p].keys())
                    diff_attrs = lhs_attrs - p_attrs
                    if len(diff_attrs) != 0:
                        # Remove all such attributes from the P node
                        attrs_to_remove[node_p] = diff_attrs

        return attrs_to_remove
    
    def edge_attrs_to_remove(self) -> dict[EdgeName, set]:
        """For each P edge, find all attributes of its corresponding LHS edge
        which should be removed from it in P.

        Returns:
            dict[EdgeName, set]: A dictionary from P edges to attributes that should be
                removed from their corresponding LHS edges.
        """
        attrs_to_remove = {}
        for s, t in self.lhs.edges():
            s_copies, t_copies = self._rev_p_lhs.get(s, set()), self._rev_p_lhs.get(t, set())
            for s_copy in s_copies:
                for t_copy in t_copies:
                    # For each "clone of edge (s, t)" that is in P
                    if (s_copy, t_copy) in self.p.edges():
                        # Find all attribute names that are in the LHS edge but not in the new P edge
                        lhs_attrs = set(self.lhs.get_edge_data(s, t).keys())
                        p_attrs = set(self.p.get_edge_data(s_copy, t_copy).keys())
                        diff_attrs = lhs_attrs - p_attrs
                        if len(diff_attrs) != 0:
                            # Remove all such attributes
                            attrs_to_remove[(s_copy, t_copy)] = diff_attrs
        return attrs_to_remove

    def nodes_to_merge(self) -> dict[NodeName, set[NodeName]]:
        """Find all RHS nodes which are a merge of nodes in P, and for each node, find all P nodes that merge into it.

        Returns:
            dict[NodeName, set[NodeName]]: 
                A dictionary which maps each node in RHS that is a merge of P nodes,
                to a set of nodes in P which it merges.
        """

        # Find all RHS nodes that are mapped by more than one node in P (in the P->RHS Hom.)
        return {rhs_node: self._rev_p_rhs[rhs_node] for rhs_node in self.rhs.nodes() \
                            if len(self._rev_p_rhs.get(rhs_node, set())) > 1}

    def nodes_to_add(self) -> set[NodeName]:
        """Find all RHS nodes which should be added.

        Note: Does not include nodes in RHS that are created as a merge of P nodes.

        Returns:
            set[NodeName]: Nodes which should be added to RHS.
        """

        # Find all RHS nodes which are not mapped by any node in P (in the P->RHS Hom.)
        return {rhs_node for rhs_node in self.rhs.nodes() if len(self._rev_p_rhs.get(rhs_node, set())) == 0}

    def edges_to_add(self) -> set[EdgeName]:
        """Find all RHS edges that should be added. 

        Note: Does not include edges added to merged nodes.

        Returns:
            set[EdgeName]: Edges which should be added to RHS.
        """
        edges_to_add = set()
        for s, t in self.rhs.edges():
            # New edges from at least one new node (not including merged nodes)
            if s in self.nodes_to_add() or t in self.nodes_to_add():
                edges_to_add.add((s,t)) # surely a new edge
            else:
                s_origins, t_origins = self._rev_p_rhs.get(s, set()), self._rev_p_rhs.get(t, set())
                # New edges from existing P nodes
                if all([(s_origin, t_origin) not in self.p.edges() for s_origin in s_origins for t_origin in t_origins]):
                    edges_to_add.add((s,t))
        return edges_to_add

    # TODO: Ensure there is no double addition of attributes because of mapping the same input node to a single pattern node and a collection pattern node
    def node_attrs_to_add(self) -> dict[NodeName, dict]:
        """For each RHS node, find all attributes (and values) of its corresponding P node(s)
        which should be added to the RHS node.
        
        Returns:
            dict[NodeName, dict]: A dictionary that maps RHS nodes to their added attributes and values.
        """
        attrs_to_add = {}
        # Add RHS node attributes to add
        for node_rhs in self.rhs.nodes():
            if node_rhs in self.nodes_to_add():
                rhs_attrs = self.rhs.nodes(data=True)[node_rhs]
                if len(rhs_attrs) != 0:
                    attrs_to_add[node_rhs] = self.format_value_attribute(rhs_attrs)
            else:
                p_origins = self._rev_p_rhs.get(node_rhs, set())
                merged_p_attrs = self._merge_node_attrs(node_rhs, p_origins)
                if len(merged_p_attrs) != 0:
                    attrs_to_add[node_rhs] = merged_p_attrs
        return attrs_to_add

    def edge_attrs_to_add(self) -> dict[EdgeName, dict]:
        """For each RHS edge, find all attributes (and values) of its corresponding P edge(s)
        which should be added to the RHS edge.

        Returns:
            dict[EdgeName, dict]: A dictionary that maps RHS edges to their added attributes and values.
        """
        attrs_to_add = {}
        for s, t in self.rhs.edges():
            if s in self.nodes_to_add() or t in self.nodes_to_add():
                rhs_attrs = self.rhs.get_edge_data(s, t)
                if len(rhs_attrs) != 0:
                    attrs_to_add[(s, t)] = self.format_value_attribute(rhs_attrs)
            else:
                s_origins, t_origins = self._rev_p_rhs.get(s, set()), self._rev_p_rhs.get(t, set())
                merged_p_attrs = self._merge_edge_attrs((s, t), s_origins, t_origins)
                if len(merged_p_attrs) != 0:
                    attrs_to_add[(s, t)] = merged_p_attrs
        return attrs_to_add


### Tests and Examples
The following section overviews the different use cases of graph transformation, and the rules that the user should construct in order to execute them. Before we dive in, here are some test utilities:

#### Test Utils

In [None]:
def _assert_rule(match: Match, single_nodes_lhs: DiGraph = None, collections_lhs: DiGraph = None, 
                 p: DiGraph = None, rhs: DiGraph = None, merge_policy = None,
                 error: str = None,
                 nodes_clone: dict[NodeName, set[NodeName]] = None,
                 nodes_remove: set[NodeName] = None,
                 edges_remove: set[EdgeName] = None,
                 node_attrs_remove: dict[NodeName, set] = None, 
                 edge_attrs_remove: dict[EdgeName, set] = None, 
                 nodes_merge: dict[NodeName, set[NodeName]] = None, 
                 nodes_add: set[NodeName] = None, 
                 edges_add: set[EdgeName] = None,
                 node_attrs_add: dict[NodeName, dict] = None, 
                 edge_attrs_add: dict[EdgeName, dict] = None):
    merge_policy = merge_policy if merge_policy else MergePolicy.choose_last
    nodes_clone = nodes_clone if nodes_clone else {}
    nodes_remove = nodes_remove if nodes_remove else set()
    edges_remove = edges_remove if edges_remove else set()
    node_attrs_remove = node_attrs_remove if node_attrs_remove else {}
    edge_attrs_remove = edge_attrs_remove if edge_attrs_remove else {}
    nodes_merge = nodes_merge if nodes_merge else {}
    nodes_add = nodes_add if nodes_add else set()
    edges_add = edges_add if edges_add else set()
    node_attrs_add = node_attrs_add if node_attrs_add else {}
    edge_attrs_add = edge_attrs_add if edge_attrs_add else {}

    try:
        rule = Rule(match=match, single_nodes_lhs=single_nodes_lhs, collections_lhs=collections_lhs, p=p, rhs=rhs, merge_policy=merge_policy)
        assert rule.nodes_to_clone() == nodes_clone
        assert rule.nodes_to_remove() == nodes_remove
        assert rule.edges_to_remove() == edges_remove
        assert rule.node_attrs_to_remove() == node_attrs_remove
        assert rule.edge_attrs_to_remove() == edge_attrs_remove
        assert rule.nodes_to_merge() == nodes_merge
        assert rule.nodes_to_add() == nodes_add
        assert rule.edges_to_add() == edges_add
        assert rule.node_attrs_to_add() == node_attrs_add
        assert rule.edge_attrs_to_add() == edge_attrs_add
        assert error is None
    except GraphRewriteException as e:
        assert e.message == error

#### Preserve and Remove (Without Attributes)

We begin by defining a simple LHS graph, a pattern without attributes at all:

In [None]:
from graph_rewrite.lhs import lhs_to_graph
from graph_rewrite.matcher import find_matches
from graph_rewrite.match_class import Match

def _generate_match(input_graph: DiGraph, LHS: str, condition=lambda x: True, debug=False):
    """For testing purposes, generate a single match for a given input graph and LHS pattern.
    It returns a single match that satisfies the condition.
    
    Args:
        input_graph (DiGraph): The input graph
        LHS (str): The LHS pattern
        condition (Callable[[dict], bool], optional): A condition for the matches. Defaults to lambda x: True.
        debug (bool, optional): Whether to print the match. Defaults to False.
    """

    pattern, collection_pattern = lhs_to_graph(LHS)

    matches = list(find_matches(input_graph, pattern, collection_pattern))

    if len(matches) == 0:
        raise ValueError("No matches found")
    
    if debug:
        print(matches[0])
    
    return matches[0]

In [None]:
g, _ = lhs_to_graph("A->B,A->C") 
draw(g)

If we construct a rule without specifying P, then P defaults to a copy of LHS. If we don't specify RHS, then RHS defaults to a copy of P.
That is, when constructing a rule without P or RHS, we preserve everything from LHS (P contains all LHS elements) and add nothing (there is no difference between P and RHS). In other words, we get a rule that changes nothing.

In [None]:
match = _generate_match(g, "A->B,A->C", debug=True)

{'A': {'A'}, 'B': {'C'}, 'C': {'B'}}


In [None]:
_assert_rule(match, single_nodes_lhs=g)
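
The defaulting behaviour described above can be pictured without the library. The snippet below is only a minimal sketch, assuming graphs are represented as plain sets of node names and edge tuples; `resolve_defaults` is a hypothetical helper and not part of the Rule class.

```python
def resolve_defaults(lhs, p=None, rhs=None):
    """Hypothetical helper: fill in a missing P / RHS as described in the text."""
    # P defaults to a copy of LHS; RHS defaults to a copy of P.
    if p is None:
        p = {"nodes": set(lhs["nodes"]), "edges": set(lhs["edges"])}
    if rhs is None:
        rhs = {"nodes": set(p["nodes"]), "edges": set(p["edges"])}
    return p, rhs


lhs = {"nodes": {"A", "B", "C"}, "edges": {("A", "B"), ("A", "C")}}
p, rhs = resolve_defaults(lhs)

# With both defaults in place, nothing is removed and nothing is added.
removed_nodes = lhs["nodes"] - p["nodes"]
added_nodes = rhs["nodes"] - p["nodes"]
print(removed_nodes, added_nodes)  # set() set()
```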

As explained above, P defines which parts of the LHS we want to preserve in the transformation. Therefore, if something from LHS is missing in P, the rule infers that it should remove it.

For example, if P contains all nodes of LHS but none of its edges, the rule marks all LHS edges as edges to remove:

In [None]:
p = p_to_graph("A,B,C") 
_assert_rule(match, single_nodes_lhs=g, p=p, edges_remove={('A','B'), ('A','C')})

Naturally, we can remove nodes in the same way. Note that when the transformation module removes a node, it also removes all the edges that are connected to it. Since the transformation removes nodes before it removes edges, whenever the rule finds a node that should be removed, it assumes (correctly) that the connected edges will be removed automatically, and therefore does not mark them as edges to remove.

Here, the P graph is missing node $A$ (that is connected to edges $(A,B), (A,C)$) as well as all edges in LHS (which are coincidentally exactly the edges connected to $A$). Therefore, while the transformation results in the removal of these edges as well, the rule only marks the node for removal:

In [None]:
p = p_to_graph("B,C") 
_assert_rule(match, single_nodes_lhs=g, p=p, nodes_remove={'A'})

However, if P contains neither node $B$ nor any of the LHS edges, edge $(A,B)$ isn't marked for removal (it is removed automatically along with node $B$), while edge $(A,C)$ is marked for removal:

In [None]:
p = p_to_graph("A,C")
_assert_rule(match, single_nodes_lhs=g, p=p, nodes_remove={'B'}, edges_remove={('A','C')})
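
The removal logic of this subsection can be summarised in a few lines. This is a minimal sketch over plain sets, not the Rule class itself: `removed_elements` is a hypothetical name, and the sketch deliberately ignores cloning, which is covered next.

```python
def removed_elements(lhs_nodes, lhs_edges, p_nodes, p_edges):
    """Hypothetical helper: infer removals from the difference between LHS and P."""
    # Nodes that appear in LHS but not in P are removed.
    nodes_remove = set(lhs_nodes) - set(p_nodes)
    # An edge missing from P is marked only if both endpoints survive;
    # edges touching a removed node disappear together with that node.
    edges_remove = {
        (s, t)
        for (s, t) in set(lhs_edges) - set(p_edges)
        if s not in nodes_remove and t not in nodes_remove
    }
    return nodes_remove, edges_remove


lhs_nodes = {"A", "B", "C"}
lhs_edges = {("A", "B"), ("A", "C")}

# P = "A,C": node B is removed, and only edge (A, C) is marked explicitly.
print(removed_elements(lhs_nodes, lhs_edges, {"A", "C"}, set()))
# ({'B'}, {('A', 'C')})
```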

#### Clone Nodes

As explained in the beginning of this notebook, the combination of LHS and P allows **cloning** nodes. 

Assume that we want to clone an LHS node called $x$. In the P graph, we add a new node called $x*n$, where $n$ is an integer of the user's choosing that serves as a unique copy number, and thus allows multiple clones (though this notion of a copy number is not enforced by the library). 

A P node with such a name is automatically inferred to be a clone of the LHS node $x$ (if no such node exists, an exception is raised), and the transformation module will clone its edges and attributes as expected.
The Rule class identifies P nodes that are clones of LHS nodes, and creates a dictionary that maps LHS nodes to a set of its P clones (if there are any). The edges and attributes that are added by the cloning are not counted as added edges and attributes; The cloned node is not counted as an added node, but rather as a cloned one.

Assume that we want to have 2 clones of node $C$ in P, named $C*1, C*2$ respectively. In LHS, node $C$ is connected to edge $(A,C)$; Therefore, the cloning process will add $C*1, C*2, (A,C*1)$ and $(A,C*2)$:

In [None]:
p = p_to_graph("A->B,A->C,A->C*1,A->C*2") 
draw(p)
_assert_rule(match, single_nodes_lhs=g, p=p, nodes_clone={'C': {'C', 'C*1', 'C*2'}})

Note that in this example, we kept $C$ in the P graph, and it was counted as a clone of the LHS node $C$. If we do not want to keep $C$, we simply omit it (and its connected edges) from P. Note that in such a case, the rule does not count $C$ as a removed node or $(A,C)$ as a removed edge: as far as the rule is concerned, the P version of $C$ is just a clone of the LHS node $C$ which we didn't create here.

In [None]:
p = p_to_graph("A->B,A->C*1,A->C*2") 
draw(p)
_assert_rule(match, single_nodes_lhs=g, p=p, nodes_clone={'C': {'C*1', 'C*2'}})
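
The naming convention above is easy to mimic. The following is a minimal sketch of how clone names could be resolved, assuming node names are plain strings; `find_clones` is a hypothetical function, and it raises a plain `ValueError` where the library raises its own exception type.

```python
def find_clones(lhs_nodes, p_nodes):
    """Hypothetical helper: map each cloned LHS node to its copies in P."""
    clones = {}
    for p_node in p_nodes:
        if "*" in p_node:
            # "C*1" -> origin "C"; the suffix after "*" is the copy number.
            origin, _, _copy_number = p_node.partition("*")
            if origin not in lhs_nodes:
                raise ValueError(f"{p_node} clones a node that is not in LHS")
            clones.setdefault(origin, set()).add(p_node)
    # If the original node is kept in P, it counts as one of the clones.
    for origin, copies in clones.items():
        if origin in p_nodes:
            copies.add(origin)
    return clones


lhs_nodes = {"A", "B", "C"}
assert find_clones(lhs_nodes, {"A", "B", "C", "C*1", "C*2"}) == {"C": {"C", "C*1", "C*2"}}
assert find_clones(lhs_nodes, {"A", "B", "C*1", "C*2"}) == {"C": {"C*1", "C*2"}}
```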

We can clone different nodes simultaneously, and new edges will be created accordingly (including edges between two clones of different nodes). Here we create three copies of $A$ (including the P node $A$ itself) and two copies of $B$ (likewise including $B$ itself). Since LHS contains the edge $(A,B)$, we create completely new edges such as $(A*5, B*9)$. Note that, as explained, the copy number (the $5$ in $A*5$) must be a number, but the library does not check that it is a valid clone number:

In [None]:
p = p_to_graph("A->B, A->B*9, A*3->B, A*3->B*9, A*5->B, A*5->B*9, A->C, A*3->C, A*5->C") 
_assert_rule(match, single_nodes_lhs=g, p=p, nodes_clone={'A': {'A', 'A*5', 'A*3'}, 'B': {'B*9', 'B'}})

Cloning can be combined with other operations. For example, we can clone $A$ and $B$ (two copies each) and remove node $C$ (which is not a clone of anything, and is therefore counted as a removed node). Note that in the transformation module, cloning is done before node removal:

In [None]:
p = p_to_graph("A->B, A->B*9, A*3->B, A*3->B*9")
_assert_rule(match, single_nodes_lhs=g, p=p, nodes_clone={'A': {'A', 'A*3'}, 'B': {'B*9', 'B'}}, nodes_remove={'C'})

As explained above, when cloning a node, all of its edges are cloned as well. Therefore, if we construct P with a clone but do not mention all of the edges that should be created by the clone, the rule will consider the missing edges as removed edges. Here, for example, we clone $C$ but exclude cloning the $(A,C)$ edge for clone $C*1$:

In [None]:
p = p_to_graph("A->B,A->C*2,C*1") 
draw(p)
_assert_rule(match, single_nodes_lhs=g, p=p, nodes_clone={'C': {'C*1', 'C*2'}}, edges_remove={('A','C*1')})
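
One way to picture how such a missing edge is detected: every LHS edge is expected to appear once for each pair of surviving copies of its endpoints, and any expected edge that P omits is marked for removal. The sketch below is an illustration under that assumption, using plain sets; `copies_in_p` and `edges_to_remove` are hypothetical helpers, and every P node is assumed to originate from an LHS node.

```python
from itertools import product


def copies_in_p(lhs_nodes, p_nodes):
    """Hypothetical helper: map each LHS node to the P nodes that originate from it."""
    copies = {node: set() for node in lhs_nodes}
    for p_node in p_nodes:
        origin = p_node.split("*")[0]  # "C*1" -> "C", "A" -> "A"
        copies[origin].add(p_node)
    return copies


def edges_to_remove(lhs_nodes, lhs_edges, p_nodes, p_edges):
    """Hypothetical helper: expected cloned edges that P does not mention."""
    copies = copies_in_p(lhs_nodes, p_nodes)
    expected = set()
    for s, t in lhs_edges:
        # One expected edge per pair of surviving copies of the endpoints.
        expected |= set(product(copies[s], copies[t]))
    return expected - set(p_edges)


lhs_nodes = {"A", "B", "C"}
lhs_edges = {("A", "B"), ("A", "C")}

# P = "A->B,A->C*2,C*1": the cloned edge (A, C*1) is expected but missing.
print(edges_to_remove(lhs_nodes, lhs_edges,
                      {"A", "B", "C*1", "C*2"},
                      {("A", "B"), ("A", "C*2")}))
# {('A', 'C*1')}
```

Under the same assumption, a removed node has no surviving copies, so none of its edges are expected and none are marked, which agrees with the removal examples of the previous subsection.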

#### Preserve, Remove and Clone (With Attributes)

Our graphs might have attributes, so we want to apply what we've covered so far (preserving, removing and cloning) to attributes as well. We begin by initializing an LHS pattern with some attributes (note that the attributes are not plotted):

In [None]:
g = _create_graph(
    ['A','B','C'], [('A', 'B'),('A', 'C'),])
g.nodes['B']['attrB'] = 5
g.edges['A', 'C']['attrAC'] = 10
draw(g)

In [None]:
match = _generate_match(g, "A->B[attrB=5], A-[attrAC=10]->C", debug=True)

{'A': {'A'}, 'B': {'B'}, 'C': {'C'}}


Say that we want to remove an LHS node / edge which has attributes. Just as removing a node automatically removes its connected edges without marking them as removed, removing a node / edge automatically removes its attributes, and the rule does not count these attributes as removed attributes. 

In these examples, we remove node $B$ / edge $(A,C)$, both have attributes in LHS:

In [None]:
p = p_to_graph("A-[attrAC]->C")
_assert_rule(match, single_nodes_lhs=g, p=p, nodes_remove={'B'})
p = p_to_graph("A->B[attrB],C") 
_assert_rule(match, single_nodes_lhs=g, p=p, edges_remove={('A','C')})

We can also remove attributes manually, while keeping the containing node / edge. Just as we removed an LHS node by not mentioning it in P, we can remove an LHS attribute by not mentioning it in P. Here, we remove the attribute "attrB" from $B$. The rule builds a dictionary that maps nodes to the names of the attributes removed from them:

In [None]:
p = p_to_graph("A->B,A-[attrAC]->C") 
_assert_rule(match, single_nodes_lhs=g, p=p, node_attrs_remove={'B': {'attrB'}})
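
The dictionary mentioned above can be derived with a simple set difference. This is a minimal sketch, assuming LHS attributes are given as a mapping from node to attribute dictionary and P as a mapping from node to the set of attribute names it mentions; `node_attrs_to_remove` here is a standalone hypothetical function, not the method of the Rule class, and it ignores cloned nodes.

```python
def node_attrs_to_remove(lhs_attrs, p_attr_names):
    """Hypothetical helper: attributes of preserved nodes that P does not mention."""
    removed = {}
    for node, attrs in lhs_attrs.items():
        if node not in p_attr_names:
            # The whole node is removed; its attributes go with it unmarked.
            continue
        missing = set(attrs) - set(p_attr_names[node])
        if missing:
            removed[node] = missing
    return removed


lhs_attrs = {"A": {}, "B": {"attrB": 5}, "C": {}}

# P = "A->B,A-[attrAC]->C": B is kept, but attrB is not mentioned.
print(node_attrs_to_remove(lhs_attrs, {"A": set(), "B": set(), "C": set()}))
# {'B': {'attrB'}}
```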

If we clone a node with attributes, or a node that's connected to edges with attributes (as you well remember, edges connected to a cloned node are cloned as well), the attributes are cloned along with it - both attribute names and values.

In our library, cloning a node / edge with attributes causes a clone of **all** of the contained attributes, and the user **is not able** to choose which of the attributes should be cloned. In order to avoid ambiguity, when constructing a P graph that clones a node / edge with attributes, for each of the clones, **the attributes should not be mentioned in P** (they will all be cloned automatically). A violation of this constraint will cause the Rule class to raise an exception:

In [None]:
p = p_to_graph("A,B,C,B*1[attrB]") 
_assert_rule(match, single_nodes_lhs=g, p=p, error=_exception_msgs["attrs_in_cloned_node"]("B*1"))

p = p_to_graph("A->B, A->B*1, C, A-[attrAC]->C*1") 
_assert_rule(match, single_nodes_lhs=g, p=p, error=_exception_msgs["attrs_in_cloned_edge"]("A", "C*1"))

Here, we clone node $B$ as P nodes $B, B*1$ along with its attribute ```{attrB: 5}```. Note how we do not mention the attributes of either $B$ or $B*1$, and let the transformation module clone them by itself. The rule does not count these automatically-cloned attributes as added attributes:

In [None]:
p = p_to_graph("A->B, A->B*1, A-[attrAC]->C") 
_assert_rule(match, single_nodes_lhs=g, p=p, nodes_clone={'B': {'B', 'B*1'}})

#### Addition

We saw how we can use P to denote preservation, removal and cloning - with or without attributes - of LHS nodes. Up until now, RHS defaulted to a copy of P, denoting that we don't want to add anything to the graph. We will now go over the options RHS provides us. We begin by resetting LHS to our initial, simple pattern without any attributes:

In [None]:
g = _create_graph(['A','B','C'], [('A', 'B'),('A', 'C'),])   
draw(g)

In [None]:
match = _generate_match(g, "A->B, A->C", debug=True)

{'A': {'A'}, 'B': {'C'}, 'C': {'B'}}


Say we want to add a new node to our graph, named $D$. The rule finds all nodes that appear in RHS but not in P, and marks them as added nodes. Here, P is not specified (and thus defaults to LHS), and so $D$ is the only new node:

In [None]:
rhs = rhs_to_graph("A->B,A->C,D", match = None, render_funcs={}) 
_assert_rule(match, single_nodes_lhs=g, rhs=rhs, nodes_add={'D'})

The same can be done with edges. We can add edges that use nodes which existed in P, as we do here with the new edge $(B,C)$:

In [None]:
rhs = rhs_to_graph("A->B,A->C,B->C", match = None, render_funcs={}) 
draw(rhs)
_assert_rule(match, single_nodes_lhs=g, rhs=rhs, edges_add={('B','C')})

We can also add edges that use newly added nodes. Here, we combine the addition of a new node $D$, a new edge that uses it $(A,D)$ and a new edge that uses only P nodes $(B,C)$:

In [None]:
rhs = rhs_to_graph("A->B->C,A->C,A->D", match = None, render_funcs={}) 
draw(rhs)
_assert_rule(match, single_nodes_lhs=g, rhs=rhs, nodes_add={'D'}, edges_add={('B','C'), ('A','D')})
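
Addition mirrors removal, with P and RHS taking the roles of LHS and P. The sketch below is a minimal illustration over plain sets; `added_elements` is a hypothetical helper and does not account for merged nodes, which are discussed later in this notebook.

```python
def added_elements(p_nodes, p_edges, rhs_nodes, rhs_edges):
    """Hypothetical helper: infer additions from the difference between RHS and P."""
    nodes_add = set(rhs_nodes) - set(p_nodes)
    edges_add = set(rhs_edges) - set(p_edges)
    return nodes_add, edges_add


# P defaults to LHS ("A->B,A->C"); RHS is "A->B->C,A->C,A->D".
p_nodes = {"A", "B", "C"}
p_edges = {("A", "B"), ("A", "C")}
rhs_nodes = {"A", "B", "C", "D"}
rhs_edges = {("A", "B"), ("B", "C"), ("A", "C"), ("A", "D")}

nodes_add, edges_add = added_elements(p_nodes, p_edges, rhs_nodes, rhs_edges)
assert nodes_add == {"D"}
assert edges_add == {("B", "C"), ("A", "D")}
```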

The use of attributes in addition is very intuitive: Attributes that appear in RHS but not in P are considered as added attributes, including all attributes of any new node (excluding merges; we'll get to that shortly). 

For example, assume we update our LHS with an attribute added to edge $(A,C)$:

In [None]:
g = _create_graph(['A','B','C','D'], [('A', 'B'),('A', 'C', {'attrAC': "ac"}),('B','C'),('A','D')])   
lhs, _ = lhs_to_graph("A->B,A-[attrAC=\"ac\"]->C, A->D")
match = _generate_match(g, "A->B,A-[attrAC=\"ac\"]->C")

We now add a new attribute to each of the following: existing node $A$, new node $D$, existing edge $(A,B)$ and new edge $(A,D)$:

In [None]:
rhs = rhs_to_graph("A[attrA=\"a\"]-[attrAB=\"ab\"]->B->C, A-[attrAD=\"ad\"]->D[attrD=\"d\"], A-[attrAC=\"ac\"]->C", match=None, render_funcs={}) 
_assert_rule(match, single_nodes_lhs=lhs, rhs=rhs, nodes_add={'D'}, edges_add={('A','D'), ('B','C')},
             node_attrs_add={'A': {'attrA': 'a'}, 'D': {'attrD': 'd'}},
             edge_attrs_add={('A','B'): {'attrAB': 'ab'}, ('A','D'): {'attrAD': 'ad'}, ('A', 'C'): {'attrAC': 'ac'}})

#### Merge Nodes

As explained in the beginning of this notebook, the combination of P and RHS allows **merging** nodes.

Assume that P contains nodes $x_1,...,x_k$, and that we want to **replace** them all with a single, merged node that combines all of their attributes and edges. In the RHS graph, we add a new node called $x_1 \& x_2 \& ... \& x_k$. An RHS node with such a name is automatically inferred to be a merge of the P nodes $x_1,...,x_k$ (if any of them does not exist in P, an exception is raised), and the transformation module will handle the merging of the edges and attributes as expected.

The Rule class identifies RHS nodes that are merges of P nodes, and creates a dictionary that maps RHS nodes to a set of P nodes which they merge (if there are any). The edges and attributes that are added by the merging are not counted as added edges and attributes; The nodes, edges and attributes that were removed (merged into the new node) are not marked as removed; The merged node is not counted as an added node, but rather as a merged one.

For the basic, attribute-less examples, we return to our basic LHS (which will also serve as P here):

In [None]:
g = _create_graph(['A','B','C'], [('A', 'B'),('A', 'C'),])
draw(g)

In [None]:
match = _generate_match(g, "A->B, A->C", debug=True)

{'A': {'A'}, 'B': {'C'}, 'C': {'B'}}


We begin with a simple example, where we want to merge nodes $B$ and $C$ into a new node called $B\&C$. Note that in the resulting RHS graph, $B$ and $C$ no longer exist, and neither do their connected edges $(A,B),(A,C)$. They were replaced by the new merged node and a new edge $(A,B\&C)$, which merges the two original edges. Although we removed and added different edges and nodes here, none of them are marked as added or removed by the rule:

In [None]:
rhs = rhs_to_graph("A->B&C", match=None, render_funcs={})
draw(rhs)
_assert_rule(match, single_nodes_lhs=g, rhs=rhs, nodes_merge={'B&C': {'B','C'}})

If we merge nodes $A, B$, then the original edge $(A,B)$ is now replaced by $(A\&B, A\&B)$ - that is, a self loop:

In [None]:
# Merge node A and B
rhs = rhs_to_graph("A&B->A&B->C", match=None, render_funcs={})
draw(rhs)
_assert_rule(match, single_nodes_lhs=g, rhs=rhs, nodes_merge={'A&B': {'A','B'}})

We can merge more than two nodes into a new one. Here we do so with all three nodes in P, resulting in a single node with a self loop:

In [None]:
# Merge node A, B and C
rhs = rhs_to_graph("B&C&A->B&C&A", match=None, render_funcs={})
draw(rhs)
_assert_rule(match, single_nodes_lhs=g, rhs=rhs, nodes_merge={'B&C&A': {'A','B','C'}})
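
The three merges above follow the same pattern, which can be sketched with plain sets. This is only an illustration of the naming convention and of how P edges are carried over to the merged node; `find_merges` and `merged_edges` are hypothetical helpers, and a plain `ValueError` stands in for the library's own exception.

```python
def find_merges(p_nodes, rhs_nodes):
    """Hypothetical helper: map each merged RHS node to the P nodes it replaces."""
    merges = {}
    for rhs_node in rhs_nodes:
        if "&" in rhs_node:
            parts = set(rhs_node.split("&"))  # "B&C&A" -> {"A", "B", "C"}
            missing = parts - set(p_nodes)
            if missing:
                raise ValueError(f"{rhs_node} merges nodes that are not in P: {missing}")
            merges[rhs_node] = parts
    return merges


def merged_edges(p_edges, merges):
    """Hypothetical helper: redirect every P edge to the merged endpoints."""
    target = {p_node: rhs_node for rhs_node, parts in merges.items() for p_node in parts}
    return {(target.get(s, s), target.get(t, t)) for s, t in p_edges}


p_nodes = {"A", "B", "C"}
p_edges = {("A", "B"), ("A", "C")}

merges = find_merges(p_nodes, {"A&B", "C"})
assert merges == {"A&B": {"A", "B"}}
# Edge (A, B) collapses into a self loop on the merged node.
assert merged_edges(p_edges, merges) == {("A&B", "A&B"), ("A&B", "C")}
```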

Let's see how attributes behave when merging nodes and edges which contain them.

In our library, merging a node / edge with attributes causes a merge of **all** of the contained attributes, similarly to the cloning process. Therefore, when merging, the merged node **should not mention the merged attributes** (they will all be merged and added automatically).

Begin with a simple case: Given the basic LHS with some attributes:

In [None]:
g = _create_graph(['A','B','C'], [('A', 'C'),('A', 'B')])
g.nodes['A']['attr'] = 1
g.edges['A', 'B']['attrAB'] = 2
draw(g)


In [None]:
match = _generate_match(g, "A[attr=1]->C,A-[attrAB=2]->B", debug=True)

{'A': {'A'}, 'C': {'C'}, 'B': {'B'}}


We merge nodes $A,B$, where $A$ has a single attribute and $B$ has no attributes. Therefore, we expect the merged $A\&B$ node to have this single attribute with the same value. Note that, similarly to the behaviour of attributes in cloned nodes, the attributes added to the merged node by the merging process are not marked as added (and are not mentioned at all, as explained above):

In [None]:
rhs = rhs_to_graph("A&B->A&B->C", match=None, render_funcs={})
_assert_rule(match, single_nodes_lhs=g, rhs=rhs, nodes_merge={'A&B': {'A','B'}})

Unlike cloning, merging **does** allow mentioning attributes in the merged node; The rule will read these attributes as new attributes which are added after the merging.
For example, if we want to add a new attribute to the merged node $A\&B$ after the merging, we do mention it in the RHS and the rule will identify it as an added attribute:

In [None]:
rhs = rhs_to_graph("A&B[attr2=2]-[attrAB2=\"ab\"]->A&B->C", match=None, render_funcs={})
_assert_rule(match, single_nodes_lhs=g, rhs=rhs, nodes_merge={'A&B': {'A','B'}}, node_attrs_add={'A&B': {'attr2': 2}},
             edge_attrs_add={('A&B', 'A&B'): {'attrAB2': 'ab'}})

When the transformation is executed, the resulting $A\&B$ node will have both attributes.

What if we want to override one of the merged attributes of $A\&B$? In such a case, we mention the overriding attribute in RHS, and it will be marked as an added attribute.

In [None]:
rhs = rhs_to_graph("A&B[attr=5]->A&B->C", match=None, render_funcs={})
_assert_rule(match, single_nodes_lhs=g, rhs=rhs, nodes_merge={'A&B': {'A','B'}}, node_attrs_add={'A&B': {'attr': 5}})

During transformation, first the merged node will inherit the original value of the attribute; Then, when adding new attributes to RHS, the new value of the attribute will override the old one.
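
The ordering described above can be pictured with two dictionaries. This is a minimal sketch of the idea, not the transformation module's code: the variable names are illustrative, and the only assumption is that added attributes are applied after the merged ones.

```python
# Attribute inherited by A&B from P node A through the merge.
merged_attrs = {"attr": 1}
# Attribute mentioned on A&B in RHS, reported by the rule as an added attribute.
added_attrs = {"attr": 5}

# Later entries win, so the added value overrides the merged one.
final_attrs = {**merged_attrs, **added_attrs}
print(final_attrs)  # {'attr': 5}
```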

Note that this allows a case where we merge two nodes with an attribute of the same name. What should the value of the attribute be in the merged node? Here's a problematic example case:

In [None]:
lhs, _ = lhs_to_graph("A[attr=\"a\"] -[attr2=\"ab\"]-> B[attr=\"b\"], A -[attr2=\"ac\"]-> C") 

Our library provides a few built-in tie breakers, via the static **MergePolicy** functions. The standard (and default) option is **choose_last**, which picks the value of the last node in the list of P nodes merged into the new RHS node. A more advanced option, **union**, sets the value of the attribute in the merged node to a list that contains the attribute values from all the different P nodes merged into that RHS node.

Since we check here only rules and not the graphs which they transform, we cannot test the merge policy here. Instead, it will be checked in the transform module.
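
To make the two tie breakers concrete, here is a minimal sketch over plain dictionaries. The functions below are hypothetical stand-ins that follow the description above; they are not the library's `MergePolicy` implementation, whose actual signatures may differ. In particular, this `union` wraps every value in a list, even when only one node defines the attribute.

```python
def choose_last(values):
    """Stand-in policy: keep the value of the last merged node that defines the attribute."""
    return values[-1]


def union(values):
    """Stand-in policy: keep the values from all merged nodes, in order, as a list."""
    return list(values)


def merge_attrs(attr_dicts, policy):
    """Hypothetical helper: merge attribute dicts of P nodes, given in merge order."""
    collected = {}
    for attrs in attr_dicts:
        for name, value in attrs.items():
            collected.setdefault(name, []).append(value)
    return {name: policy(values) for name, values in collected.items()}


# Nodes A and B from the problematic pattern above, merged into A&B.
attrs_a = {"attr": "a"}
attrs_b = {"attr": "b"}

print(merge_attrs([attrs_a, attrs_b], choose_last))  # {'attr': 'b'}
print(merge_attrs([attrs_a, attrs_b], union))        # {'attr': ['a', 'b']}
```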

#### Combine it all

We will now run a few last tests that combine multiple abilities of the Rule class, as a user of this library would:

For the following LHS:

In [None]:
g = _create_graph(['A','B','C'], [('A', 'B'),('A','C')])
g.nodes['B']['attr'] = 1
g.nodes['B']['attr2'] = 2
draw(g)


In [None]:
match = _generate_match(g, "A->B[attr=1, attr2=2], A->C", debug=True)

{'A': {'A'}, 'B': {'B'}, 'C': {'C'}}


In [None]:
# Remove node B (with edges and attributes), add node D and connect it to C
p = p_to_graph("A->C")
rhs = rhs_to_graph("A->C->D", match=None, render_funcs={})
draw(rhs)
_assert_rule(match, single_nodes_lhs=g, p=p, rhs=rhs, nodes_remove={'B'}, nodes_add={'D'}, edges_add={('C','D')})

In [None]:
# Remove node C (with edges), remove attribute 'attr' from B, clone A two times,
# add node D, connect one of the clones to D, add attribute 'attr3' to B
p = p_to_graph("A*1->B, A*3->B[attr2]") 
rhs = rhs_to_graph("D->A*3->B[attr2, attr3=\"b3\"], A*1->B", match=None, render_funcs={}) 
draw(rhs)
_assert_rule(match, single_nodes_lhs=g, p=p, rhs=rhs, nodes_remove={'C'}, node_attrs_remove={'B': {'attr'}}, nodes_clone={'A': {'A*1', 'A*3'}},
             nodes_add={'D'}, edges_add={('D','A*3')}, node_attrs_add={'B': {'attr3': 'b3'}})

In [None]:
# Clone a node B, just to merge its clones back later
p = p_to_graph("A->B*1, A->B*2, A->C") 
rhs = rhs_to_graph("A->B*1&B*2, A->C", match=None, render_funcs={}) 
draw(rhs)
_assert_rule(match, single_nodes_lhs=g, p=p, rhs=rhs, nodes_clone={'B': {'B*1', 'B*2'}}, nodes_merge={'B*1&B*2': {'B*1', 'B*2'}})

More integration testing will be done in the Transform module.

# Export

In [None]:
#|hide
import nbdev; nbdev.nbdev_export()
     