# Skeleton Notebook

**Context**: After an important document from the Generic Space Empire (GSC) was stolen, Non-Copyrighted Alien spies began attacking the prize of the GSC, the space station. These attacks are more complex than before: each requires compromising a combination of components, often distributed across workstations. To defend against these attacks, the GSC has hired our team to develop a methodology for characterizing the faults found within the system after an attack.

**Motivation**: A mode is a term for a behavior of a system. Every component in a system has behaviors associated with its known modes, and may also be in modes that are unknown to the component. Mode estimation takes in the mode, state, and observable variables and tries to determine a diagnosis for what is causing the outcome state of the system. It is a useful tool for characterizing what could have occurred in a system to produce a given end state.


# Formulation

Mode estimation will help us identify possible diagnoses for the system, but we still need a model and a search mechanism on which mode estimation can be performed. We will formulate this as a search problem over a constraint satisfaction problem (CSP) model, which is described in greater detail below.

**Model**: To build our model, we plan to use a constraint satisfaction problem (CSP) model. CSP models are used to describe real-world problems. Our model will represent an input state, which is based on a set of variable assignments, and a list of conditions for a state to be a solution. These conditions will be defined by a set of constraints on the aforementioned variables.

For the sake of our problem, we will use a very simple CSP model: troubleshooting circuits. For our model:

**Input:**
For each workstation
 - number of circuits
 - nominal voltage required

For the full fleet
 - number of workstations
 - total capacity

**Output:** Determine for each workstation
 - voltage passed to system

**Purpose:** Assess whether the circuit board is working properly for the workstation

The class below will be used to develop a model based on this CSP.

In [5]:
class model_generation():

    # CONSTRAINT Decision Variables
    # will be used to define the conditions for a state to be a solution [i.e. passes a given needed voltage]

    # Variables
    # the variables used to define each of the workstations

    # Domains
    # possible assignments/values of each variable

    def __init__(self, variables, domains, constraints, parents=None):
        self.variables = variables
        self.domains = domains
        # used to track parents for mode propagation (optional, defaults to empty)
        self.parents = parents if parents is not None else {}
        self.constraints = {}
        for var in variables:
            if var in constraints:
                self.constraints[var] = constraints[var]
            else:
                self.constraints[var] = []

    def assign(self, var, val, assignment):
        # add {var: val} to assignment, overriding the old value if there was one
        assignment[var] = val

    def add_constraint(self, constraint):
        # add a constraint
        pass

    def consistent(self, variables, assignments):
        # given a list of variables and their assignments,
        # check whether they are consistent with all constraints
        pass

    def search_assignment(self, assignments):
        # Given an empty or partial assignment,
        # try every possible assignment for each variable that does not violate the constraints.
        # Stop when we find one possible solution to the question;
        #     in that case, len(assignments) == len(self.variables).
        # If we search through all combinations of assignments and no solution can be found, return None.
        pass

    def satisfied(self, assignment):
        # check whether the given assignment satisfies the question and is consistent with the constraints;
        # in other words, all variables are assigned a value that does not violate any constraint
        pass
        
        

To diagnose hidden failures in a model, it is vital to incorporate baseline reasoning about how the model will perform. Since we set up the model as a CSP, there is fundamental logic governing how the different variables interact with each other. This logic is called a constraint.

To determine hidden failures, the system must rely on two assumptions:
- the set of diagnoses should be complete
- the set of diagnoses should exploit all available information

From these assumptions, the model must generate a list of candidates.


The function below outlines the logic we plan to employ to identify all combinations of consistent unknown modes and to estimate their probabilities. A consistent mode is a mode that fulfills the logical constraints defined in the CSP.

In [None]:
def p_mode_estimate(modes):
    # Assume independence between modes and an equal chance of the observation given any mode assignment.
    # Find the probabilities of all the modes given their corresponding observations (the posterior).

    ## initializes the probability [depth = 0]
    def prior_probability(mean, variance, input_val):
        # calculate the probability of the input value based on the mean and variance of the distribution
        probs = None  # placeholder: to be implemented
        return probs

    ## propagates the probability for more complex cases
    def calc_probability(prior_probability, input_state, path):
        current_prob_estimate = None  # placeholder: to be implemented
        return current_prob_estimate

    ## calculates the probability for all modes, ordered from most likely to least likely
    probability_set = []  # placeholder: to be implemented
    return probability_set
        
    


Because of unknown modes, there can be an exponential number of mode estimates. When given a solution space of multiple possible diagnoses, mode estimation helps determine which set was the most likely to have happened. To keep the solution space to a reasonable scale, mode estimation computes the probability of each mode estimate and then enumerates the candidates starting from the most likely.
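As an illustration of ranking candidates by probability, the sketch below enumerates every mode combination for two components, keeps those consistent with an observation, and orders them from most to least likely. It assumes independent modes and an equal chance of the observation under every consistent assignment, so the posterior is just the normalized product of priors. The component names, mode names, prior values, and the `observed_failure` check are hypothetical.

```python
from itertools import product
from math import prod

# Hypothetical prior probability of each mode, per component.
mode_priors = {
    "A1": {"nominal": 0.95, "failed": 0.04, "unknown": 0.01},
    "A2": {"nominal": 0.90, "failed": 0.09, "unknown": 0.01},
}

def rank_mode_estimates(mode_priors, is_consistent):
    """Return (candidate, posterior) pairs, most likely first."""
    components = list(mode_priors)
    estimates = []
    # Brute-force enumeration: exponential in the number of components.
    for modes in product(*(mode_priors[c] for c in components)):
        candidate = dict(zip(components, modes))
        if not is_consistent(candidate):
            continue
        # Independence assumption: joint prior is the product of the priors.
        p = prod(mode_priors[c][m] for c, m in candidate.items())
        estimates.append((candidate, p))
    if not estimates:
        return []
    total = sum(p for _, p in estimates)  # normalize over consistent candidates
    estimates.sort(key=lambda e: e[1], reverse=True)
    return [(candidate, p / total) for candidate, p in estimates]

def observed_failure(candidate):
    # Assumed observation: the workstation failed, so some component is not nominal.
    return any(mode != "nominal" for mode in candidate.values())

ranked = rank_mode_estimates(mode_priors, observed_failure)
for candidate, posterior in ranked[:3]:
    print(candidate, round(posterior, 3))
```

Enumerating every combination is only workable for tiny systems, which is why a guided search over the candidates is needed.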

However, mode estimation is only one part of the methodology for finding the most likely diagnosis. The second part of being able to self-diagnose is a search algorithm that finds the states on which mode estimation can be performed.


# Simple Search

To start diagnosing the system, our team will begin with a simple check of the model that looks for logical entailment. In this function, we assume full knowledge and observability of the system. We then search through the system to see whether there is a clear cause that could result in the error. For example, if workstation 2 failed, and workstation 2 depends solely on component A1, then component A1 is the diagnosis for the failure. In this method, we look at the states of the system and compare the state output with the end-state output.

In [6]:
def logic_entail_check(symptom, model):
    # Given a set of symptoms within a model (the starting state),
    # this function finds one diagnosis (there may be several) that could explain the symptoms.
    # If no clear answer is found, the function returns an empty list.

    # In other words, the symptom now becomes our constraint, and the goal is to
    # find a possible assignment of each variable that is consistent with the constraint,
    # i.e. component A1 should be assigned 0 (to indicate malfunctioning) in the final assignment
    diagnoses = []  # placeholder: to be implemented
    return diagnoses
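The workstation example can be sketched directly. The block below is a minimal illustration, not the planned implementation: it assumes a hypothetical `depends_on` map from workstations to the components they need, and it assumes that a workstation works only if all of its components work, so the components of working workstations can be ruled out.

```python
# Hypothetical dependency model: the components each workstation relies on.
depends_on = {
    "ws1": {"A1", "B1"},
    "ws2": {"A1"},
    "ws3": {"B1", "C1"},
}

def entailed_diagnoses(failed, working, depends_on):
    """Return components that are the only possible cause of some failure."""
    # Assumption: a working workstation exonerates all of its components.
    exonerated = set().union(*(depends_on[ws] for ws in working))
    diagnoses = set()
    for ws in failed:
        suspects = depends_on[ws] - exonerated
        if len(suspects) == 1:      # a single remaining suspect is entailed
            diagnoses |= suspects
    return sorted(diagnoses)        # empty list when nothing is entailed

print(entailed_diagnoses(failed=["ws2"], working=["ws3"], depends_on=depends_on))
# prints ['A1']
```

When a failed workstation has more than one remaining suspect, this check returns nothing for it, which is the gap the later search methods are meant to fill.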


This method is a very simple search algorithm. It would be computationally expensive, and in the case where an unknown mode is responsible for the system outcome, it would produce a non-optimal solution. To improve the fidelity of our search algorithm, our system will need to compare the constraints of the model.

# Modeling Constraints

A* is a best-first search algorithm. With constraint-based A* (CBA*), A* is used to search over the decision variables. Decision variables are different from states: a state is only a partial assignment to the decision variables, and a decision marks a transition from one state to the next. The initial state has no assignments, and a goal state is a complete assignment to all the relevant decision variables. By searching only over the decision variables, we are able to skip decisions that make no logical sense under the constraints in our model. Our code for CBA* is shown below.

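In [None]:
def cba_heuristic(decision_variables, domains, goal):
    # decision_variables, domains and goal are parameters added so that this sketch can run

    queue = [{}]  # start from the empty assignment
    expanded_list = []
    while queue:
        assignment = queue.pop(0)  # get first value in queue
        expanded_list.append(assignment)

        if len(assignment) == len(decision_variables):  # full assignment: potential goal
            if consistent(assignment, goal):
                return assignment
        else:
            # pick a decision variable that is not yet assigned in assignment
            x_i = next(v for v in decision_variables if v not in assignment)
            neighbors = split_on_variable(assignment, x_i, domains)
            # add each neighbor to the queue if it has not been expanded already
            for neighbor in neighbors:
                if neighbor not in expanded_list:
                    queue.append(neighbor)

    return None  # no solution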

In [None]:
def split_on_variable(assignment, x_i, domains):
    # Purpose: choose our successor states as extensions of the current state by
    # assigning one of the variables that has not yet been assigned.
    # domains maps each variable to its possible values (parameter added for this sketch)
    return [{**assignment, x_i: d_j} for d_j in domains[x_i]]

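In [None]:
def consistent(assignment, goal):
    # goal is assumed here to be an iterable of constraints, each a callable that
    # takes the assignment and returns True when the constraint holds
    for constraint in goal:
        # searches over non-decision variables and checks constraints
        if not constraint(assignment):
            return False
    return True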
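For a version that orders the queue by cost, as A* requires, the following is a self-contained sketch using Python's `heapq`. It assumes each mode has an additive cost (for example a negative log-probability) and uses the sum of the cheapest remaining modes as an admissible heuristic. The components, costs, and the `is_consistent` observation are hypothetical.

```python
import heapq
from itertools import count

# Hypothetical decision variables: one mode per component, with a cost per
# mode (lower cost means a more likely mode).
mode_cost = {
    "A1": {"ok": 0.1, "failed": 3.0},
    "B1": {"ok": 0.2, "failed": 2.0},
}

def is_consistent(assignment):
    # Assumed observation: the workstation fed by A1 and B1 failed,
    # so the two components cannot both be ok.
    return not all(mode == "ok" for mode in assignment.values())

def best_first_search(mode_cost, is_consistent):
    variables = list(mode_cost)

    def heuristic(assignment):
        # Admissible: cheapest possible cost of the still-unassigned variables.
        return sum(min(mode_cost[v].values()) for v in variables if v not in assignment)

    tie = count()  # tie-breaker so the heap never has to compare dicts
    queue = [(heuristic({}), next(tie), 0.0, {})]
    while queue:
        _, _, cost, assignment = heapq.heappop(queue)
        if len(assignment) == len(variables):      # full assignment
            if is_consistent(assignment):
                return assignment, cost
            continue
        x_i = variables[len(assignment)]           # next unassigned variable
        for mode, step_cost in mode_cost[x_i].items():
            child = {**assignment, x_i: mode}
            g = cost + step_cost
            heapq.heappush(queue, (g + heuristic(child), next(tie), g, child))
    return None  # no consistent full assignment exists

result = best_first_search(mode_cost, is_consistent)
print(result)
```

Because consistency is only tested on full assignments, the cheapest all-ok candidate is still generated and rejected before the search moves on to the next cheapest one.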

One of the drawbacks of CBA* is that it keeps searching over problematic areas of the state space. For example, even if one of the states at a higher depth level contains a premise that is logically impossible, the children of that node will still be added to the queue and expanded. This creates additional expansions that are not needed. To avoid searching spaces that we do not need to, we can apply a pruning strategy to CBA* to improve the computational processing time.


# Conflict-Directed A* Search 

A conflict is a partial set of assignments to decision variables that cannot all be true at once. To identify a conflict, we modify the consistent function that was written above.

In [None]:
def consistent_conflicts(assignment, gamma):
    # is_consistent is a boolean, and conflict is the conflict that was found

    # for each assignment c_i in the constituent kernel, check whether it is self-consistent
    is_consistent, conflict = True, None  # placeholders: to be implemented
    return (is_consistent, conflict)
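One possible way to extract a conflict is sketched below. It assumes a hypothetical constraint format, a `(scope, predicate)` pair, and reports the assignment restricted to the scope of the first violated constraint as the conflict. The helpers `manifests` and `resolves` are illustrative names for the two tests a conflict-directed search needs.

```python
# Hypothetical constraint format: (scope, predicate). A predicate is checked
# only once every variable in its scope has been assigned.
constraints = [
    (("A1", "B1"), lambda a: not (a["A1"] == "ok" and a["B1"] == "ok")),
]

def find_conflict(assignment, constraints):
    """Return (is_consistent, conflict); conflict is None when consistent."""
    for scope, predicate in constraints:
        if all(v in assignment for v in scope) and not predicate(assignment):
            # The assignments to the violated constraint's scope cannot all hold.
            return False, {v: assignment[v] for v in scope}
    return True, None

def manifests(assignment, conflict):
    """True if the assignment contains every assignment in the conflict."""
    return all(assignment.get(v) == val for v, val in conflict.items())

def resolves(assignment, conflict):
    """True if the assignment gives some conflict variable a different value."""
    return any(v in assignment and assignment[v] != val for v, val in conflict.items())

ok, conflict = find_conflict({"A1": "ok", "B1": "ok", "C1": "ok"}, constraints)
print(ok, conflict)
```

Note that the conflict omits `C1`: any other assignment that also sets both `A1` and `B1` to ok can be ruled out without testing it.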

Combining it all together, conflict-directed A* (CDA*) is shown in the algorithm below.

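In [None]:
def cda_start(decision_variables, domains, gamma):
    # decision_variables, domains and gamma (the model constraints) are
    # parameters added so that this sketch can run
    queue = [{}]  # start from the empty assignment
    rho = []  # list of conflicts, each assumed to be a dict {variable: value}
    expanded_list = []

    while queue:
        assignment = queue.pop(0)  # takes first element from queue
        expanded_list.append(assignment)
        if len(assignment) == len(decision_variables):  # full assignment to decision variables
            is_consistent, conflict = consistent_conflicts(assignment, gamma)
            if is_consistent:
                return assignment
            else:
                rho.append(conflict)
                queue = search_queue_for_conflict(queue, conflict)
        else:
            # conflicts in rho that assignment does not yet resolve
            unresolved = [c for c in rho
                          if not any(v in assignment and assignment[v] != val
                                     for v, val in c.items())]
            if not unresolved:  # assignment resolves every conflict in rho
                xi = next(v for v in decision_variables if v not in assignment)
                neighbors = split_on_variable(assignment, xi, domains)
            else:  # extend assignment so that it resolves an unresolved conflict
                neighbors = [{**assignment, v: d}
                             for v, val in unresolved[0].items() if v not in assignment
                             for d in domains[v] if d != val]
            for each in neighbors:
                if each not in expanded_list and each not in queue:
                    queue.append(each)

    # if the code gets here, it has looked through everything but could not find a solution
    return None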

In [None]:
def search_queue_for_conflict(queue, conflict):
    # placeholder: currently returns the queue unchanged
    return queue
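One way `search_queue_for_conflict` might eventually be filled in is to drop every queued assignment that already contains the whole conflict, since none of its extensions can be consistent. This is an assumption about the intended behavior; the sketch uses the hypothetical name `prune_queue` and represents a conflict as a dict from variable to value.

```python
def prune_queue(queue, conflict):
    """Keep only queued assignments that do not contain the entire conflict."""
    return [
        assignment for assignment in queue
        if not all(assignment.get(v) == val for v, val in conflict.items())
    ]

# Hypothetical queue of partial assignments and a discovered conflict.
queue = [
    {"A1": "ok"},
    {"A1": "ok", "B1": "ok"},
    {"A1": "ok", "B1": "failed"},
    {"A1": "failed"},
]
conflict = {"A1": "ok", "B1": "ok"}

pruned = prune_queue(queue, conflict)
print(pruned)
```

The partial assignment `{"A1": "ok"}` survives because it can still be extended with a value for `B1` that avoids the conflict.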

**Result:** The outcome of the CDA* search is the best possible diagnosis for what caused the observed system state, based on what is defined as a consistent result. The CDA* search process is more computationally efficient than alternatives such as the earlier simple search or CBA*.