# Match rules interface

This notebook demonstrates match rules.

Rules can be imported, or suggested from manual matches, model matches, or matches that already exist in CDF. A Jupyter interface helps with adding new rules and with assessing whether each rule is good or should be deleted. Any two match lists can be compared, whether they were imported or produced by the rules.

In [None]:
import json
from match_rule_helper import MatchRuleHelper
rule_helper = MatchRuleHelper(project="contextualization")
# Alternatively, restore a helper that was saved earlier with to_json():
#rule_helper = MatchRuleHelper.from_json(json.load(open("saved.json", "r")))

Run the cell below to select a root asset. With a root asset selected, the sources and targets can be set by running rule_helper.set_helper_resources(). Otherwise, supply them with rule_helper.set_sources() and rule_helper.set_targets().

In [None]:
rule_helper.resource_helper.select_root_asset()

In [None]:
# Use the selected root asset to select all time series and assets under that root
rule_helper.set_helper_resources()

# Alternatively, uncomment the lines below and supply the sources and targets
#rule_helper.set_sources() # Takes a list of dict entities representing sources
#rule_helper.set_targets() # Takes a list of dict entities representing targets

# Choose which fields, besides id, the rules take into account and the interface displays
rule_helper.set_source_fields(["name"])
rule_helper.set_target_fields(["name"])

#rule_helper.add_match_set("two_matches", [(2257052857986, 3785195619230089), (2415984517454, 1820151336672073)])

If the sources have asset_ids, these are used to build the list of matches that already exist in CDF.

In [None]:
rule_helper.add_cdf_matches()
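
As a rough illustration of the idea, the sketch below derives (source id, asset id) pairs from source entities that already point to an asset. The entity layout, the `assetId` key and the `existing_matches` function are assumptions made for this example; this is not the implementation of add_cdf_matches().

```python
# Hypothetical source entities; "assetId" is an assumed key name.
sources = [
    {"id": 2257052857986, "name": "21PT1019", "assetId": 3785195619230089},
    {"id": 2415984517454, "name": "21PT1020"},  # not linked to an asset
]


def existing_matches(entities):
    """Return (source id, asset id) pairs for entities that already link to an asset."""
    return [
        (entity["id"], entity["assetId"])
        for entity in entities
        if entity.get("assetId") is not None
    ]


cdf_matches = existing_matches(sources)
print(cdf_matches)
```

Sources without a link contribute no match, so the resulting list can be shorter than the list of sources.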

Run the cell below to add a list of matches to the helper from an unsupervised entity matching (EM) model. The list of matches can be used to generate rules later.

In [None]:
model = rule_helper.client.entity_matching.fit(rule_helper.reduced_sources, rule_helper.reduced_targets)
model_matches = model.predict().result
threshold = 0.6

# Flatten the EM result into one dict per (source, candidate match) pair,
# keeping only candidates that score above the threshold.
filtered_matches = [
    {"source": item["source"], **candidate}
    for item in model_matches["items"]
    for candidate in item["matches"]
    if candidate["score"] > threshold
]
rule_helper.add_match_set("model_matches", filtered_matches)
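
To see what the flattening and thresholding do without calling a model, here is a minimal sketch on a toy result. The toy data is invented for illustration, and the `target` key inside each candidate is an assumption about the result shape; only the `items`, `source`, `matches` and `score` keys are taken from the cell above.

```python
# Toy result: one item per source, each with a list of scored candidates.
toy_result = {
    "items": [
        {
            "source": {"id": 1},
            "matches": [
                {"target": {"id": 10}, "score": 0.9},
                {"target": {"id": 11}, "score": 0.4},
            ],
        },
        {
            "source": {"id": 2},
            "matches": [{"target": {"id": 20}, "score": 0.6}],
        },
    ]
}
threshold = 0.6

# One flat dict per (source, candidate) pair whose score is strictly above the threshold.
flattened = [
    {"source": item["source"], **candidate}
    for item in toy_result["items"]
    for candidate in item["matches"]
    if candidate["score"] > threshold
]
print(flattened)
```

Because the comparison is strict, a candidate that scores exactly 0.6 is dropped, so source 2 ends up with no match in this toy example.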

In [None]:
# Match lists can be added in different formats:

format_1 = [(2257052857986, 3785195619230089), (2415984517454, 1820151336672073)]
format_2 = [
    {"sourceId": 2257052857986, "targetId": 3785195619230089}, 
    {"sourceId": 2415984517454, "targetId": 1820151336672073}
]
format_3 = [
    {"source": {"id": 2257052857986}, "target": {"id": 3785195619230089}},
    {"source": {"id": 2415984517454}, "target": {"id": 1820151336672073}},
]

#rule_helper.add_match_set("two_matches", format_1)
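
The three formats carry the same information. The sketch below shows one way they could be reduced to a common list of (source id, target id) tuples. The `to_id_pairs` function is hypothetical and written only for this example; how the helper normalizes its input internally is not shown in this notebook.

```python
def to_id_pairs(matches):
    """Convert a match list in any of the three formats to (source id, target id) tuples."""
    pairs = []
    for match in matches:
        if isinstance(match, dict):
            if "sourceId" in match:
                # Flat dict format: {"sourceId": ..., "targetId": ...}
                pairs.append((match["sourceId"], match["targetId"]))
            else:
                # Nested dict format: {"source": {"id": ...}, "target": {"id": ...}}
                pairs.append((match["source"]["id"], match["target"]["id"]))
        else:
            # Tuple format: (source id, target id)
            source_id, target_id = match
            pairs.append((source_id, target_id))
    return pairs


format_1 = [(2257052857986, 3785195619230089), (2415984517454, 1820151336672073)]
format_2 = [
    {"sourceId": 2257052857986, "targetId": 3785195619230089},
    {"sourceId": 2415984517454, "targetId": 1820151336672073},
]
format_3 = [
    {"source": {"id": 2257052857986}, "target": {"id": 3785195619230089}},
    {"source": {"id": 2415984517454}, "target": {"id": 1820151336672073}},
]

print(to_id_pairs(format_1) == to_id_pairs(format_2) == to_id_pairs(format_3))
```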

Running the cell below opens a user interface for editing lists of matches. These lists can be used to generate rules that are added to the rule set, and they can be compared to the matches that the rule set produces.

There is a default empty list of matches, and possibly other lists if rule_helper.add_match_set() has been called. The active list can be changed by selecting "user match list".

Select a source and a target and click "Add match" to add the pair to the active list. Select a match and click "Remove match" to remove it. The entities and matches are represented by the selected fields. E.g. if the source field and the target field are both "name", a match is displayed as a tuple (source name, target name).

Use substring search to filter sources and targets. The source, target and match dropdowns are limited to the first 100 results.

In [None]:
rule_helper.edit_user_matches()
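
The filtering and display behaviour described above can be sketched in a few lines. The `search` and `display_match` functions and the `MAX_RESULTS` constant are hypothetical names for this example, and the case-insensitive comparison is an assumption; the interface itself may behave differently.

```python
MAX_RESULTS = 100  # the dropdowns show at most this many entries


def search(entities, query, field="name", limit=MAX_RESULTS):
    """Substring filter on one field, truncated to the first `limit` hits."""
    needle = query.lower()  # assumption: the search ignores case
    hits = [e for e in entities if needle in str(e.get(field, "")).lower()]
    return hits[:limit]


def display_match(source, target, source_field="name", target_field="name"):
    """Represent a match by the selected fields rather than by ids."""
    return (source[source_field], target[target_field])


# Invented entities for illustration only.
sources = [{"id": i, "name": f"PT-{i}"} for i in range(250)]
target = {"id": 9000, "name": "Pump 24"}

print(len(search(sources, "pt")))  # every name contains "pt", so the cap applies
print(display_match(search(sources, "PT-24")[0], target))
```

A narrower query is the way to reach entities beyond the first 100, since the cap is applied after filtering.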


Running the cell below starts an interface for generating new rules from one of the lists, inspecting rules, and deleting or confirming them. A status field shows whether the rule set is running a job or is ready to suggest more rules.

In [None]:
rule_helper.edit_rules()

Compare matches, either between the added match lists or between a match list and the matches produced by the rules.

In [None]:
rule_helper.compare()
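
Conceptually, comparing two match lists comes down to set operations on (source id, target id) pairs. The sketch below illustrates that idea with a hypothetical `compare_matches` function and invented ids; it does not describe what compare() displays.

```python
def compare_matches(first, second):
    """Split two match lists into shared matches and matches unique to each list."""
    a, b = set(first), set(second)
    return {
        "both": sorted(a & b),
        "only_first": sorted(a - b),
        "only_second": sorted(b - a),
    }


# Invented (source id, target id) pairs.
user_matches = [(1, 10), (2, 20), (3, 30)]
rule_matches = [(2, 20), (3, 31), (4, 40)]

print(compare_matches(user_matches, rule_matches))
```

Here source 3 appears in both lists but with different targets, so it shows up in both "only" groups, which is the kind of disagreement worth inspecting before confirming a rule.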


In [None]:
# Access the list of rules, e.g. for exporting and using elsewhere.
# Deleted rules remain in the list until "apply changes" has been run.
rule_helper.rule_editor.rules


In [None]:
# Import rules into the rule_helper; replace the empty list with the rules to add
rule_helper.rule_editor.add_rules([])

In [None]:
# Save the rule_helper. Reopening it later runs apply_rules again to regenerate the rule matches.
with open("saved.json", "w") as f:
    json.dump(rule_helper.to_json(), f, indent=2)
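
The save and restore round trip can be tried out with a plain dictionary standing in for the helper state. The `state` layout below is hypothetical and only serves the example; the real structure returned by to_json() is not shown in this notebook.

```python
import json
import os
import tempfile

# Hypothetical stand-in for the serialized helper state.
state = {
    "project": "contextualization",
    "rules": [],
    "match_sets": {"two_matches": [[1, 10], [2, 20]]},
}

# Write to a temporary directory so the example does not overwrite a real saved.json.
path = os.path.join(tempfile.mkdtemp(), "saved.json")
with open(path, "w") as f:
    json.dump(state, f, indent=2)

with open(path) as f:
    restored = json.load(f)

print(restored == state)
```

One thing to keep in mind is that JSON has no tuple type, so matches stored as tuples come back as lists after loading.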