The first algorithm to be implemented is the system disagreement evaluator. To this end, two implementations are provided. The first implementation calculates the overall system disagreement level at each time step, while the second considers the individual disagreement level of each predictor.

The following example was presented in the dissertation proposal:

Three predictors have produced the following values: Predictor I = 2, Predictor II = 5, Predictor III = 10. Predictor I average disagreement = $\mathit{(|2-2| + |2-5| + |2-10|)/3 = 11/3 = 3.67}$. Predictor II average disagreement = $\mathit{(|5-2| + |5-5| + |5-10|)/3 = 8/3 = 2.67}$, and finally Predictor III average disagreement = $\mathit{(|10-2| + |10-5| + |10-10|)/3 = 13/3 = 4.33}$. These values yield the system's overall disagreement level = $\mathit{(3.67 + 2.67 + 4.33)/3 = 3.56}$.
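The arithmetic of this example can be reproduced in a few lines of plain Python. This is only a minimal sketch; the names `predictions`, `per_predictor` and `system_level` are chosen here for illustration and do not appear elsewhere in the notebook.

```python
# Worked example from the text: three predictors forecasting 2, 5 and 10.
predictions = [2, 5, 10]
n = len(predictions)

# Average absolute difference between each predictor and every predictor.
# The self-comparison contributes 0 but still counts in the divisor n.
per_predictor = [sum(abs(p - q) for q in predictions) / n for p in predictions]

# The system-level score is the mean of the per-predictor scores.
system_level = sum(per_predictor) / n

print([round(s, 2) for s in per_predictor])  # [3.67, 2.67, 4.33]
print(round(system_level, 2))                # 3.56
```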

At time $\mathit{t_1}$ the final decision is compared to the real value, and new weights are assigned to all predictors depending on how far their previous prediction was from the real value. This process is repeated indefinitely, and the weights are adjusted accordingly at each step.

In [21]:
import pandas as pd

In [53]:
d = {'Predictor I': [2, 4, 6, 6], 'Predictor II': [5, 5, 6, 5], 'Predictor III': [10, 5, 8, 8], 'Predictor IV': [2, 4, 6, 6]}
de = {'Predictor I': [2, 4, 6, 6], 'Predictor II': [5, 5, 6, 5], 'Predictor III': [10, 5, 8, 8]}
dg = {'Predictor I': [2, 4, 6, 6], 'Predictor II': [5, 5, 6, 5]}

four = pd.DataFrame(data=d)
three = pd.DataFrame(data=de)
two = pd.DataFrame(data=dg)



This first disagreement algorithm calculates the overall system disagreement between all predictors at each time step. This implementation does not show the details of each individual predictor.

In [23]:
def disagreement(data: pd.DataFrame) -> pd.DataFrame:
    '''Takes in a DataFrame containing forecasts of different predictors and
       calculates the disagreement score of the overall system.
    
        Parameters:
            data (df): individual predictors forecast output
        
        Returns:
            (df): containing overall system disagreement scores
    '''
    system_disagreement = []
    for k in range(data.shape[0]):
        # absolute differences between every ordered pair of predictors in row k
        # (self-comparisons contribute 0 but still count towards the average)
        individual_scores = []
        for i in range(data.shape[1]):
            for j in range(data.shape[1]):
                individual_scores.append(abs(data.iloc[k, i] - data.iloc[k, j]))
            
        system_disagreement.append(sum(individual_scores) / len(individual_scores))
    
    output = pd.DataFrame()
    output['System Disagreement'] = system_disagreement
    return output

In [24]:
test = disagreement(four)    # 4 predictors
test1 = disagreement(three)  # 3 predictors
test2 = disagreement(two)    # 2 predictors

new = disagreement(four)

In [25]:
new

Unnamed: 0,System Disagreement
0,3.375
1,0.5
2,0.75
3,1.125
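For larger inputs the triple loop can become slow. The same score can be computed with NumPy broadcasting, as in the sketch below. The function name `disagreement_vectorized` and the `example` frame are assumptions made for this illustration; the frame repeats the four-predictor data defined earlier.

```python
import numpy as np
import pandas as pd

def disagreement_vectorized(data: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical vectorized variant: mean absolute pairwise difference per row."""
    values = data.to_numpy(dtype=float)                      # (time steps, predictors)
    diffs = np.abs(values[:, :, None] - values[:, None, :])  # (time steps, n, n)
    # Average over all n * n ordered pairs, self-comparisons included.
    return pd.DataFrame({'System Disagreement': diffs.mean(axis=(1, 2))})

example = pd.DataFrame({'Predictor I': [2, 4, 6, 6], 'Predictor II': [5, 5, 6, 5],
                        'Predictor III': [10, 5, 8, 8], 'Predictor IV': [2, 4, 6, 6]})
result = disagreement_vectorized(example)
print(result['System Disagreement'].tolist())  # [3.375, 0.5, 0.75, 1.125]
```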


The second implementation of the disagreement algorithm focuses on each individual predictor's disagreement level with respect to all other predictors. This helps to understand how much each predictor contributes to the overall system disagreement.

In [26]:
def predictor_score(data: pd.DataFrame) -> pd.DataFrame:
    '''Takes in a DataFrame and calculates each individual predictor's disagreement
       scores.
    
        Parameters:
            data (df): individual predictors forecast output
        
        Returns:
            (df): containing all predictors' individual disagreement scores
    '''
    individual_score_collection = []
    for k in range(data.shape[0]):
        average_values = []
        for j in range(data.shape[1]):
            individual_scores = []
            for i in range(data.shape[1]):
                individual_scores.append(abs(data.iloc[k, j] - data.iloc[k, i]))
        
            average_values.append(sum(individual_scores) / len(individual_scores))
            individual_scores.clear()
            
        individual_score_collection.append(average_values)
    
    return pd.DataFrame(individual_score_collection)

In [27]:
test = predictor_score(four)
test2 = predictor_score(three)
test3 = predictor_score(two)

In [28]:
test

Unnamed: 0,0,1,2,3
0,2.75,2.75,5.25,2.75
1,0.5,0.5,0.5,0.5
2,0.5,0.5,1.5,0.5
3,0.75,1.25,1.75,0.75
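The two implementations are linked: averaging the per-predictor scores of a row gives the system disagreement of that row. The sketch below checks this on the four-predictor data and also keeps the predictor names as column labels. The variable names are assumptions made for this illustration.

```python
import numpy as np
import pandas as pd

forecasts = pd.DataFrame({'Predictor I': [2, 4, 6, 6], 'Predictor II': [5, 5, 6, 5],
                          'Predictor III': [10, 5, 8, 8], 'Predictor IV': [2, 4, 6, 6]})

values = forecasts.to_numpy(dtype=float)
pairwise = np.abs(values[:, :, None] - values[:, None, :])  # (time steps, n, n)

# Per-predictor score: mean absolute difference to all predictors (itself included).
per_predictor = pd.DataFrame(pairwise.mean(axis=2), columns=forecasts.columns)

# Averaging across predictors recovers the system-level disagreement.
system_level = per_predictor.mean(axis=1)

print(per_predictor.iloc[0].tolist())  # [2.75, 2.75, 5.25, 2.75]
print(system_level.tolist())           # [3.375, 0.5, 0.75, 1.125]
```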


In [29]:
# %run hello.py  (IPython magic that runs an external Python script inside the notebook)

All weights are initialized with a value of 1. After the first real value has been observed, the error of each predictor is determined by taking the absolute difference between its forecast and the real value. All errors are summed, and each predictor's individual error is divided by this total error. Each resulting fraction is then subtracted from 1, which yields the weights for the next forecast consensus. For example, Predictor I predicts 2, Predictor II predicts 5, and Predictor III predicts 10. The consensus value is the average of these values, since the initial weights are all set to 1 ($\mathit{(2 + 5 + 10)/3 = 5.67}$). Now, if the true value at $\mathit{t_1}$ is 6, the following new weights are assigned. First, calculate all error values: Predictor I = $\mathit{|6-2| = 4}$, Predictor II = $\mathit{|6 - 5| = 1}$ and Predictor III = $\mathit{|6-10| = 4}$. The total error equals 9. Hence, the new weight assigned to Predictor I is $\mathit{1 - (4/9) = 0.56}$, Predictor II is $\mathit{1 - (1/9) = 0.89}$ and Predictor III is $\mathit{1 - (4/9) = 0.56}$.
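The weight update from this example can be traced step by step. The sketch below uses plain Python lists, and its variable names are assumptions made for illustration. The final rescaling step is not part of the description above; it mirrors what the `new_weights` helper further down does, so that the weights sum to the number of predictors.

```python
predictions = [2, 5, 10]
real_value = 6

# Absolute error of each predictor and the total error.
errors = [abs(p - real_value) for p in predictions]  # [4, 1, 4]
total_error = sum(errors)                            # 9

# Weight = 1 - share of the total error (all weights stay 1 if nobody erred).
if total_error == 0:
    raw_weights = [1.0] * len(predictions)
else:
    raw_weights = [1 - e / total_error for e in errors]

# Rescale so that the weights sum to the number of predictors.
n = len(predictions)
scaled_weights = [w / sum(raw_weights) * n for w in raw_weights]

print([round(w, 2) for w in raw_weights])     # [0.56, 0.89, 0.56]
print([round(w, 2) for w in scaled_weights])  # [0.83, 1.33, 0.83]
```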

In [30]:
d = {'Predictor I': [2, 4, 6, 6], 'Predictor II': [5, 5, 6, 5], 'Predictor III': [10, 5, 8, 8]}
df = pd.DataFrame(data=d)
df

Unnamed: 0,Predictor I,Predictor II,Predictor III
0,2,5,10
1,4,5,5
2,6,6,8
3,6,5,8


In [31]:
d1 = {'Real Value': [6, 5, 6, 7]}
df1 = pd.DataFrame(data=d1)
df1

Unnamed: 0,Real Value
0,6
1,5
2,6
3,7


In [32]:
def formatting(target: list) -> list:
    '''Helper function to transform a list containing additional, unnecessary dataframe details into a pure list
       containing only target values.
       
       
        Parameters:
            target (list): list containing unnecessary additional information
        
        
        Returns:
            (list): list containing target values
    '''
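    for i in range(len(target)):
        try:
            # unwrap single-element containers (e.g. a one-value pandas Series)
            target[i] = target[i][0]
        except (TypeError, IndexError, KeyError):
            # plain scalars cannot be indexed; keep them unchanged
            pass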
    
    return target

In [33]:
def new_weights(preds: list, real_value: float) -> list:
    '''Helper function to calculate new weights, depending on t-1 forecast errors of predictors.
    
        Parameters:
            preds (list): t-1 predictions of each predictor
            real_value (float): real value at t
        
        
        Returns:
            (list): list containing the new weight values for each predictor
    '''
    if not isinstance(preds, list):
        preds = list(preds)
        
    individual_error = []
    new_weights = []
    final_weights = []
    
    for i in range(len(preds)):
        individual_error.append(abs(preds[i] - real_value))
    
    total_error = sum(individual_error)
    for j in range(len(individual_error)):
        try:
            # total_error is iterable when real_value was passed as a pandas Series
            if sum(total_error) == 0:
                new_weights.append(1)
            else:
                new_weights.append(1-(individual_error[j]/total_error))
        except TypeError:
            # total_error is a plain scalar
            if total_error == 0:
                new_weights.append(1)
            else:
                new_weights.append(1-(individual_error[j]/total_error))

        
    for k in range(len(new_weights)):
        final_weights.append((new_weights[k]/sum(new_weights)) * len(preds))
    
    return formatting(final_weights)

In [34]:
def new_weights_correcting(preds: list, real_value: float) -> list:
    '''Helper function to calculate corrective weights as the ratio between the real value at t and each predictor's t-1 prediction.
    
        Parameters:
            preds (list): t-1 predictions of each predictor
            real_value (float): real value at t
        
        
        Returns:
            (list): list containing the new weight values for each predictor
    '''
    if not isinstance(preds, list):
        preds = list(preds)
        
    final_weights = []
    
    for i in range(len(preds)):
        final_weights.append(real_value/preds[i])
    
    return formatting(final_weights)
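The ratio-based weights have a simple interpretation: multiplying each prediction by its weight reproduces the observed value. The sketch below illustrates this with plain Python numbers. The helper `ratio_weights` and its `fallback` parameter are hypothetical and only show one possible way to guard against predictions of exactly zero.

```python
import math

def ratio_weights(preds, real_value, fallback=1.0):
    """Hypothetical helper: real_value / prediction, with a fallback for zero predictions."""
    return [real_value / p if p != 0 else fallback for p in preds]

preds = [2, 5, 10]
weights = ratio_weights(preds, 6)
print(weights)  # [3.0, 1.2, 0.6]

# Each weighted prediction reproduces the observed value (up to rounding).
corrected = [p * w for p, w in zip(preds, weights)]
print(all(math.isclose(c, 6) for c in corrected))  # True

# With plain Python numbers, a zero prediction would raise ZeroDivisionError
# without the guard.
print(ratio_weights([0, 3], 6))  # [1.0, 2.0]
```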

In [35]:
def new_weights_test(preds: list, real_value: float) -> list:
    '''Helper function to calculate new weights, depending on t-1 forecast errors of predictors.
    
        Parameters:
            preds (list): t-1 predictions of each predictor
            real_value (float): real value at t
        
        
        Returns:
            (list): list containing the new weight values for each predictor
    '''
    if not isinstance(preds, list):
        preds = list(preds)
        
    individual_error = []
    new_weights = []
    final_weights = []
    
    for i in range(len(preds)):
        individual_error.append(abs(preds[i] - real_value))
    
    total_error = sum(individual_error)
    for j in range(len(individual_error)):
        
        if sum([total_error]) == 0:
            new_weights.append(1)
        else:
            new_weights.append(1-(individual_error[j]/total_error))
        

        
    for k in range(len(new_weights)):
        final_weights.append((new_weights[k]/sum(new_weights)) * len(preds))
    
    return formatting(final_weights)

In [36]:
def new_weights_focused(preds: list, real_value: float) -> list:
    '''Helper function to calculate new weights, depending on t-1 forecast errors of predictors. Weights can only be 1 or 0.
    
        Parameters:
            preds (list): t-1 predictions of each predictor
            real_value (float): real value at t
        
        
        Returns:
            (list): list containing the new weight values for each predictor
    '''
    if not isinstance(preds, list):
        preds = list(preds)
        
    individual_error = []
    new_weights = []
    final_weights = []
    
    for i in range(len(preds)):
        individual_error.append(abs(preds[i] - real_value))
    
    total_error = sum(individual_error)
    for j in range(len(individual_error)):
        if sum([total_error]) == 0: # new approach, substitutes try, except clauses. Needs testing
            new_weights.append(1)
        else:
            new_weights.append(1-(individual_error[j]/total_error))

    for k in range(len(new_weights)):
        if new_weights[k] == max(new_weights):
            final_weights.append(1)
        else:
            final_weights.append(0)
    
    return formatting(final_weights)
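Selecting the largest weight is the same as selecting the smallest absolute error, so the winner-take-all rule can be stated more directly. The sketch below assumes plain Python numbers, and the name `focused_weights` is hypothetical. It also shows that tied predictors all receive a weight of 1.

```python
def focused_weights(preds, real_value):
    """Hypothetical compact variant: weight 1 for the smallest error, 0 otherwise."""
    errors = [abs(p - real_value) for p in preds]
    best = min(errors)
    # Ties are kept: every predictor sharing the smallest error gets weight 1.
    return [1 if e == best else 0 for e in errors]

print(focused_weights([2, 5, 10], 6))  # [0, 1, 0]
print(focused_weights([4, 8], 6))      # [1, 1]
print(focused_weights([6, 6], 6))      # [1, 1]
```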

In [37]:
def consolidated_predictions_focused(data, real) -> list:
    '''Function to calculate the consolidated prediction value of all individual predictors.
       Takes the sole estimate of the individual predictor that best predicted in the past.
    
        Parameters:
            data (df): predictions values from each individual predictor
            real (df): actual value
        
        
        Returns:
            (list): list containing consolidated prediction value considering new weight assignments for each predictor
    '''
    final_predictions = []
    weight_history = []
    weights = [1] * data.shape[1]

    for j in range(data.shape[0]):
        temp = []
        for i in range(data.shape[1]):
            temp.append(data.iloc[j, i]*weights[i])
            
        final_predictions.append(sum(temp)/sum(weights))
        weight_history.append(weights)
        weights = new_weights_focused(data.iloc[j], real.iloc[j, 0])

    
    return final_predictions

In [38]:
def consolidated_predictions(data, real) -> list:
    '''Function to calculate the consolidated prediction value of all individual predictors.
    
        Parameters:
            data (df): predictions values from each individual predictor
            real (df): actual value
        
        
        Returns:
            (list): list containing consolidated prediction value considering new weight assignments for each predictor
    '''
    final_predictions = []
    weight_history = []
    weights = [1] * data.shape[1]

    for j in range(data.shape[0]):
        temp = []
        for i in range(data.shape[1]):
            temp.append(data.iloc[j, i]*weights[i])
            
        final_predictions.append(sum(temp)/data.shape[1])
        weight_history.append(weights)
        weights = new_weights_correcting(data.iloc[j], real.iloc[j, 0])

    
    return final_predictions

In [61]:
def consolidated_predictions_memory(data, real) -> list:
    '''Function to calculate the consolidated prediction value of all individual predictors. This function furthermore
       extends consolidated_predictions by keeping a memory of prior assigned weights. An average of all prior assigned
       weights is calculated and applied to calculate the final consolidation value.
    
        Parameters:
            data (df): predictions values from each individual predictor
            real (df): actual value
        
        
        Returns:
            (list): list containing consolidated prediction value considering new weight assignments for each predictor
    '''
    final_predictions = []
    
    initialize = [1] * data.shape[1]
    weight_history = [initialize]
    weights = []

    for j in range(data.shape[0]):
        # average of all weights assigned so far (the history holds j + 1 entries)
        average_weights = [sum(z) / (j + 1) for z in zip(*weight_history)]
        temp = []
        for i in range(data.shape[1]):
            temp.append(data.iloc[j, i] * average_weights[i])
        
        final_predictions.append(sum(temp)/data.shape[1])
        weights = new_weights_correcting(data.iloc[j], real.iloc[j])
        weight_history.append(weights)
        

    
    return final_predictions
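The memory mechanism can be illustrated for a single time step. The sketch below starts from the initial weights and the ratio weights obtained after the first observation (predictions 2, 5, 10 with real value 6), averages them, and applies the average to the second row of forecasts. The variable names are assumptions made for this illustration.

```python
import math
import numpy as np

# Weight history after the first observation: initial weights, then the ratio weights.
weight_history = [[1, 1, 1], [3.0, 1.2, 0.6]]

# Memory: element-wise average of all weights assigned so far.
averaged_weights = np.mean(weight_history, axis=0)  # approximately [2.0, 1.1, 0.8]

# Second row of forecasts from the three-predictor example.
forecast_row = np.array([4, 5, 5])
consolidated = float(np.sum(forecast_row * averaged_weights) / len(forecast_row))

print(round(consolidated, 3))  # 5.833
```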

In [40]:
def consolidated_predictions_anchor(data, real, anchor: int) -> list:
    '''Function to calculate the consolidated prediction value of all individual predictors. To prevent the
       algorithm from being limited to consolidation values between the min and max value predicted by
       the individual predictors, min and max anchors are added that extend beyond the largest and smallest value
       estimated.
    
        Parameters:
            data (df): predictions values from each individual predictor
            real (df): actual value
            anchor (int): factor by which the max and min predictions are extended
        
        
        Returns:
            (list): list containing consolidated prediction value considering new weight assignments for each predictor
    '''
    final_predictions = []
    weight_history = []
    
    weights = [1] * data.shape[1]
    weights.append(1)
    weights.append(1)

    for j in range(data.shape[0]):
        data['Max Anchor'] = anchor * max(data.iloc[j])
        data['Min Anchor'] = (1- (anchor - 1)) * min(data.iloc[j])
        temp = []
        for i in range(data.shape[1]):
            temp.append(data.iloc[j, i]*weights[i])
            
        final_predictions.append(sum(temp)/data.shape[1])
        weight_history.append(weights)
        weights = new_weights(data.iloc[j], real.iloc[j])
        del data['Max Anchor']
        del data['Min Anchor']

    
    return final_predictions
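The effect of the `anchor` factor is easiest to see on a single row. The sketch below repeats the two anchor formulas used in the function; the helper name `anchors` and the chosen factors are assumptions for illustration. Although the signature annotates `anchor` as an integer, an integer factor of 2 already pushes the lower anchor to zero, so a fractional factor such as 1.5 is shown as well.

```python
def anchors(row, anchor):
    """Hypothetical helper returning (max anchor, min anchor) for one row of forecasts."""
    max_anchor = anchor * max(row)
    min_anchor = (1 - (anchor - 1)) * min(row)
    return max_anchor, min_anchor

row = [2, 5, 10]
print(anchors(row, 1.5))  # (15.0, 1.0)
print(anchors(row, 2))    # (20, 0)
print(anchors(row, 1))    # (10, 2)
```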

In [41]:
def average_consolidation(data) -> list:
    '''Function to calculate simple average of all predictor forecasts.
    
        Parameters:
            data (df): prediction values from each individual predictor
        
        
        
        Returns:
            (list): list containing average values of predictor forecasts
    '''
    result = []
    for i in range(data.shape[0]):
        result.append(sum(data.iloc[i])/data.shape[1])
    
    return result

In [42]:
average_consolidation(two)

[3.5, 4.5, 6.0, 5.5]

In [43]:
consolidated = consolidated_predictions_memory(df,df1)
consolidated

[5.666666666666667, 4.694444444444445, 6.7407407407407405, 6.111111111111111]

In [44]:
two

Unnamed: 0,Predictor I,Predictor II
0,2,5
1,4,5
2,6,6
3,6,5


In [45]:
df1

Unnamed: 0,Real Value
0,6
1,5
2,6
3,7


In [46]:
consolidatedfocused = consolidated_predictions_focused(two, df1)
consolidatedfocused

[3.5, 5.0, 6.0, 5.5]

In [47]:
list_1 = [2, 5, 10]
value_2 = 6
new_weights_focused(list_1, value_2)
#new_weights(list_1, value_2)

[0, 1, 0]

In [48]:
consolidated_predictions(df,df1)

[5.666666666666667, 7.0, 7.166666666666667, 5.666666666666667]

In [50]:
list_1 = [2, 5, 10]
value_2 = 6
new_weights_correcting(list_1, value_2)

[3.0, 1.2, 0.6]

In [51]:
list_1 = [2, 6, 6]
value_2 = 6
new_weights_correcting(list_1, value_2)

[3.0, 1.0, 1.0]

In [31]:
df1.iloc[0]

Real Value    6
Name: 0, dtype: int64

In [56]:
consolidated_predictions(two, df1)

[3.5, 9.0, 6.75, 5.5]

In [63]:
consolidated_predictions_memory(three, df1)

[5.666666666666667, 5.833333333333333, 7.944444444444444, 7.108333333333333]

In [11]:
import numpy as np
from numpy import array

In [13]:
def sequence_prep(input_sequence: np.ndarray, sub_seq: int, steps_past: int, steps_future: int) -> tuple:
    '''Prepares data input into X and y sequences. The length of the X sequence is determined by steps_past while the length of y is determined by steps_future. In detail, the predictor looks at sequence X and predicts sequence y.
            Parameters:
                input_sequence (array): Sequence that contains time series in array format
                sub_seq (int): Further division of given steps a predictor will look backward.
                steps_past (int): Steps the predictor will look backward
                steps_future (int): Steps the predictor will look forward

            Returns:
                X (array): Array containing all looking back sequences
                y (array): Array containing all looking forward sequences
                modified_back (int): Modified looking back sequence length
        '''
    length = len(input_sequence)
    if length == 0:
        return (0, 0, steps_past // sub_seq)
    X = []
    y = []
    if length <= steps_past:
        raise ValueError('Input sequence is equal to or shorter than steps to look backwards')
    if steps_future <= 0:
        raise ValueError('Steps in the future need to be bigger than 0')

    for i in range(length):
        last = i + steps_past
        if last > length - steps_future:
            break
        X.append(input_sequence[i:last])
        y.append(input_sequence[last:last + steps_future])
    y = array(y)
    X = array(X)
    modified_back = X.shape[1]//sub_seq
    X = X.reshape((X.shape[0], sub_seq, modified_back, 1))
    return X, y, modified_back # special treatment to account for sub sequence division

test_price = np.array([28.12999916, 27.79999924, 27.79999924, 27.80999947, 27.48999977,27.70999908, 27.11000061])

solution_X = np.array([[[28.12999916],
        [27.79999924]],
       [[27.79999924],
        [27.79999924]],
       [[27.79999924],
        [27.80999947]],
       [[27.80999947],
        [27.48999977]],
       [[27.48999977],
        [27.70999908]]])

solution_y = np.array([[27.79999924],
       [27.80999947],
       [27.48999977],
       [27.70999908],
       [27.11000061]])

In [20]:
X, y, s = sequence_prep(test_price, 2, 2, 2)

In [21]:
X

array([[[[28.12999916]],

        [[27.79999924]]],


       [[[27.79999924]],

        [[27.79999924]]],


       [[[27.79999924]],

        [[27.80999947]]],


       [[[27.80999947]],

        [[27.48999977]]]])

In [22]:
y

array([[27.79999924, 27.80999947],
       [27.80999947, 27.48999977],
       [27.48999977, 27.70999908],
       [27.70999908, 27.11000061]])

In [23]:
s

1
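The windowing performed by `sequence_prep` is easier to follow on a small integer series. The sketch below rebuilds the same look-back and look-ahead windows with list comprehensions and then applies the sub-sequence reshape. The toy `series` and all variable names are assumptions made for this illustration.

```python
import numpy as np

series = np.array([1, 2, 3, 4, 5, 6, 7])
steps_past, steps_future, sub_seq = 2, 2, 2

# Number of complete (X, y) pairs that fit into the series.
n_samples = len(series) - steps_past - steps_future + 1

# Look-back windows and the look-ahead windows that directly follow them.
X = np.array([series[i:i + steps_past] for i in range(n_samples)])
y = np.array([series[i + steps_past:i + steps_past + steps_future]
              for i in range(n_samples)])

print(X.tolist())  # [[1, 2], [2, 3], [3, 4], [4, 5]]
print(y.tolist())  # [[3, 4], [4, 5], [5, 6], [6, 7]]

# Split each look-back window into sub_seq parts with one trailing feature axis.
X_sub = X.reshape((X.shape[0], sub_seq, steps_past // sub_seq, 1))
print(X_sub.shape)  # (4, 2, 1, 1)
```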