In [None]:
#Python3(3.6) Notebook
# Please begin by running this cell
from classes import *

### Note on determining conflicts

Determining a minimal set of components involved in a conflict is, in general, NP-hard, so heuristics are often used. In circuits, for instance, one common heuristic is to take the components upstream of the symptom. Taking every component in the system is another (less useful) heuristic. Probabilistic Mode Estimation uses conflicts to generate the best possible guess about the health of all the components. If the conflicts aren't particularly minimal (in the worst case, if they involve all components), Probabilistic Mode Estimation simply enumerates the failure space in a-priori failure probability order. So it's not wrong to minimize conflicts poorly; it just makes Probabilistic Mode Estimation perform poorly.
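To make the "upstream components" heuristic concrete, here is a minimal sketch. The wiring mirrors the AND/XOR network used later in this notebook; the names `DRIVERS` and `upstream_components` are made up for illustration and are not part of `classes`.

```python
# Hypothetical wiring table: each signal maps to the component that drives it
# and to that component's input signals.
DRIVERS = {
    "X": ("A1", ["A", "C"]),
    "Y": ("A2", ["B", "D"]),
    "Z": ("A3", ["C", "E"]),
    "F": ("X1", ["X", "Y"]),
    "G": ("X2", ["Y", "Z"]),
}

def upstream_components(signal, drivers=DRIVERS):
    """Collect every component that feeds, directly or indirectly, into `signal`."""
    found = set()
    stack = [signal]
    while stack:
        current = stack.pop()
        if current not in drivers:  # primary input: nothing drives it
            continue
        component, inputs = drivers[current]
        if component not in found:
            found.add(component)
            stack.extend(inputs)
    return found

print(sorted(upstream_components("F")))
```

For a symptom on `F`, this yields the candidate conflict `A1`, `A2`, `X1`. It is a cheap over-approximation and makes no claim that the set is minimal.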

## Probabilistic Mode Estimation

The goal of Probabilistic Mode Estimation (also called Mode Identification), which relies on probabilistic inference, is to determine the state of the components in the system.

One option, as always, is that all the components are operating correctly. But when there are discrepancies, heuristics can yield sets of components that could be at fault in the system. Mode Estimation then gives a probabilistically ordered list of configurations that your system could be in instead.

One way to perform this operation is constraint-based A*. 

The tree visualization below shows how the conflicts trim the search tree of paths that don't contain at least one of the conflicts. Each level of the tree is a decision to assign a mode to one of the components.

**Should we trim paths whose unknowns aren't present in at least one of the conflicts, or paths whose unknowns aren't present in all of the conflicts? That depends on how the conflicts were generated. You might need to pre-process with a kernel diagnoses tool if you want to put more than one conflict through, though if the rule is _trim paths whose unknowns aren't present in at least one of the conflicts_, you're probably fine and save computation time anyway.** _That said, each mode identification should include all the components that could be failed, so you'd want every conflict to be represented by one of the components marked failed in the identification, and you'd trim once it became clear that your choice of components wasn't going to satisfy one of the constraints (the remaining assignments are not in the conflicts)._

The output of probabilistic mode estimation is an enumerable set of component assignments.
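The search described above can be sketched without the `AStarNode` helper. This is one possible self-contained version and not necessarily how `classes` implements it. It assumes kernel diagnoses are given as plain sets of component names, uses the priors from the lecture example, and the names `best_first_resolutions`, `bound`, and `can_still_satisfy` are hypothetical.

```python
import heapq
from itertools import count

def best_first_resolutions(p_works, kernels, order):
    """Yield (probability, assignment) pairs, most likely first.

    p_works: component -> prior probability that it works
    kernels: list of sets; a resolution must mark every member of one set failed
    order:   the order in which components are assigned
    """
    def bound(assignment):
        # Exact probability of the assigned part times an optimistic
        # (never underestimating) value for each unassigned component.
        value = 1.0
        for name in order:
            p = p_works[name]
            if name in assignment:
                value *= p if assignment[name] else 1.0 - p
            else:
                value *= max(p, 1.0 - p)
        return value

    def can_still_satisfy(assignment):
        # Components already marked failed, or still free to be marked failed.
        usable = {n for n in order if not assignment.get(n, False)}
        return any(kernel <= usable for kernel in kernels)

    tie = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(-bound({}), next(tie), {})]
    while heap:
        neg_value, _, assignment = heapq.heappop(heap)
        if len(assignment) == len(order):
            yield -neg_value, assignment
            continue
        name = order[len(assignment)]
        for works in (True, False):
            child = dict(assignment, **{name: works})
            if can_still_satisfy(child):  # otherwise trim this branch
                heapq.heappush(heap, (-bound(child), next(tie), child))

p_works = {"A1": .99, "A2": .9, "A3": .99, "X1": .8, "X2": .8}
kernels = [{"A2", "A3"}, {"A2", "X2"}, {"A1"}, {"X1"}]
for prob, assignment in best_first_resolutions(p_works, kernels, list(p_works)):
    failed = [n for n, ok in assignment.items() if not ok]
    print(failed, round(prob, 5))
```

Because `bound` never underestimates the probability of any completion, complete assignments leave the heap in non-increasing probability order. With an empty `kernels` list the generator yields nothing, which is a simplification of this sketch.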

Let's say you got a fancy new boolean logic AND gate. Unfortunately, when you apply two True voltages to the inputs, the output is **zero**! What a scam you say, the component is totally bunk! 

Similarly, you bought your very own resistor network as found in lecture, and one of the outputs is wrong! Any of it or the components that lead to it could be *bogus*, you pontificate!

- **Symptom**: The discrepancy between the value you expect and the value you find.
- **Conflict**: The set of components that are involved in a symptom. Generally found using heuristics.
  If you remove all constraints related to these components, the problem goes away (but so does your circuit!)
- **Diagnoses**: Colloquially used interchangeably with conflicts, but with a more optimistic sense: they're used to find "resolutions."
- **Kernel diagnoses**: If you have multiple diagnoses, you can find all minimal combinations of conflicting components that explain your data. They're helpful because they act as a (good!) heuristic for determining resolutions.
- **Resolution**: A bold claim that these are the components that have failed in your system. **May or may not be accurate!** You have to run DPLL to find out on your own system.  
  Usually you'll want to iterate through your resolutions in probability order, given a prior probability of component failure.
- **Component**: Well, you know, the parts the circuit is made of.
- **Mode**: The status of an individual component, as in "Mode Identification"
- **Model**: The truth values assigned to the propositions in a logic sentence evaluating to True.

In general, the insights you form as diagnoses from symptoms improve the process of Mode Identification (finding resolutions).
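One common way to get kernel diagnoses from a list of conflicts is to compute minimal hitting sets: the smallest sets of components that touch every conflict. The sketch below is a brute-force version that only suits small systems. The two conflicts are an assumption (they are not stated in this notebook), and `minimal_hitting_sets` is a hypothetical helper name.

```python
from itertools import combinations

def minimal_hitting_sets(conflicts):
    """Return every minimal set of components that intersects all conflicts."""
    components = sorted(set().union(*conflicts))
    found = []
    # Grow candidates by size so any superset of an earlier hit can be skipped.
    for size in range(1, len(components) + 1):
        for candidate in combinations(components, size):
            candidate = set(candidate)
            if any(smaller <= candidate for smaller in found):
                continue  # contains a smaller hitting set, so not minimal
            if all(candidate & conflict for conflict in conflicts):
                found.append(candidate)
    return found

# Assumed conflicts for the symptom ~F in the AND/XOR network.
conflicts = [{"A1", "A2", "X1"}, {"A1", "A3", "X1", "X2"}]
for kernel in minimal_hitting_sets(conflicts):
    print(sorted(kernel))
```

Under these assumed conflicts, the result is `{A1}`, `{X1}`, `{A2, A3}`, and `{A2, X2}`, the same four sets that appear in `class_kernel_diagnoses`.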

In [None]:
ands = "((A1 & A & C) ==> X) & ((A2 & B & D) ==> Y) & ((A3 & C & E) ==> Z)"
xors = "((X1 & ((X & ~Y) | (~X & Y))) ==> F) & ((X2 & ((Y & ~Z) | (~Y & Z))) ==> G)"
inputs = "A & B & C & ~D & E"
outputs = "~F & G" #~F is our identified symptom!
AND.from_string_to_cnf("&".join([ands, xors, inputs, outputs])).print()
print("You might remember this network from the lecture slides!")

Given the following components and failures, we'd like to be able to walk through the possible resolutions in probability order.

In [None]:
class_kernel_diagnoses = [[('A2', 0), ('A3', 0)], [('A2', 0), ('X2', 0)], [('A1', 0)], [('X1', 0)]]
class_possible_faults = {"A1": .99, "A2": .9, "A3": .99, "X1": .8, "X2": .8} # (Name, Probability it *works*)
class_components_order: [(str, float)] = list(class_possible_faults.items())

In [None]:
from collections.abc import Iterable  # importing Iterable from `collections` directly fails on Python 3.10+

# Inputs: possible_faults: prior probability that each component *works* (see class_possible_faults)
#         kernel_diagnoses: minimal combinations of conflicting components that explain the conflicts
#         components_order: the order in which to traverse the variables
# Procedure: Perform constraint-based A*
def get_most_likely_a_star(possible_faults: dict, kernel_diagnoses, components_order: list) -> Iterable:
    max_queue: [(float, AStarNode)] = []
    tree = AStarNode(None, None, None, parent=None)
    tree.construct_tree(components_order)
    max_queue.append((0, tree))

    # Every conflict must be represented by one of the components marked failed in this resolution.
    # The kernel diagnoses encapsulate what it means to represent each conflict.
    # So to avoid an A* trim, the node must still be able to contain all the elements of at least one kernel diagnosis.
    def satisfies_a_kernel(tree):
        # make sure that the marked-failed *union* the not-yet assigned contain at least one kernel diagnostic
        assignments = tree.assignments
        assigned_variables = set([assignment[0] for assignment in assignments])
        unassigned_variables = set(possible_faults.keys()).difference(assigned_variables)
        faulty_variables = set([assignment[0] for assignment in assignments if assignment[1] is False])
        kernel_friendly_variables = unassigned_variables.union(faulty_variables)

        for kernel in kernel_diagnoses:
            if kernel_friendly_variables.issuperset(set([component[0] for component in kernel])):
                return True
        return False

    while len(max_queue) > 0:
        max_index = max(range(len(max_queue)), key=lambda i: max_queue[i][0])
        max_tree: AStarNode = max_queue.pop(max_index)[1]

        if max_tree.is_trimmed:
            continue
        elif not satisfies_a_kernel(max_tree):
            max_tree.set_trimmed(True)
            continue
        elif len(max_tree.children) == 0: # and satisfies a kernel
            yield max_tree
            continue

        left_child = max_tree.children[0]
        right_child = max_tree.children[1]

        max_queue.append((left_child.total_cost, left_child))
        max_queue.append((right_child.total_cost, right_child))

In [None]:
for tree_node in get_most_likely_a_star(class_possible_faults, class_kernel_diagnoses, class_components_order):
    print(str(tree_node.assignments) + ". Probability: " + str(tree_node.total_cost))

The output above lists all the possible resolutions. The resolution with only **X1** failing is far more likely than the others!
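For a system this small, the ranking can be cross-checked by brute force: enumerate all 2^5 mode assignments, keep those that mark every component of at least one kernel diagnosis as failed, and sort by prior probability. This is only a sanity-check sketch. `brute_force_resolutions` is a hypothetical name, and kernel diagnoses are written as plain sets rather than the `(name, 0)` tuples used above.

```python
from itertools import product

p_works = {"A1": .99, "A2": .9, "A3": .99, "X1": .8, "X2": .8}
kernels = [{"A2", "A3"}, {"A2", "X2"}, {"A1"}, {"X1"}]

def brute_force_resolutions(p_works, kernels):
    """Return (probability, failed components) pairs, most likely first."""
    names = list(p_works)
    results = []
    for modes in product([True, False], repeat=len(names)):
        failed = {n for n, ok in zip(names, modes) if not ok}
        if not any(kernel <= failed for kernel in kernels):
            continue  # this assignment does not explain the symptom
        prob = 1.0
        for n, ok in zip(names, modes):
            prob *= p_works[n] if ok else 1.0 - p_works[n]
        results.append((prob, sorted(failed)))
    return sorted(results, reverse=True)

for prob, failed in brute_force_resolutions(p_works, kernels)[:3]:
    print(failed, round(prob, 4))
```

With these priors, the X1-only resolution has probability of about 0.141, roughly four times the runner-up (X1 and X2 both failed, about 0.035).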