# Exercise 4a
## 2 Red Cards Study
### 2.1 Loading and Cleaning the data

In [2]:
#Import libraries
import numpy as np
import pandas as pd
from scipy.sparse.linalg import lsqr

In [3]:
#Load dataset
df = pd.read_csv("CrowdstormingDataJuly1st.csv", sep=",", header=0)
print(df.columns)

Index(['playerShort', 'player', 'club', 'leagueCountry', 'birthday', 'height',
       'weight', 'position', 'games', 'victories', 'ties', 'defeats', 'goals',
       'yellowCards', 'yellowReds', 'redCards', 'photoID', 'rater1', 'rater2',
       'refNum', 'refCountry', 'Alpha_3', 'meanIAT', 'nIAT', 'seIAT',
       'meanExp', 'nExp', 'seExp'],
      dtype='object')


We drop the following (irrelevant) features:
- player (playerShort uniquely identifies the player, so player is not needed)
- playerShort (the players' names are irrelevant for us, actually...)
- photoID (is only needed if we want to classify the skin color by ourselves)
- refCountry (we will assume that not the name of the referee country, but the values meanIAT and meanExp are the relevant features regarding the referee country)
- nIAT, seIAT (we will just assume the IAT values are a good estimation of the actual value, so we are not interested in the sample size used to determine the IAT)
- nExp, seExp (see above)
- yellowCards (our examination is only about red cards, not about yellow cards)
- club (this is a categorical feature with over 100 categories. The one-hot encoding would therefore create a large number of new features, which drastically increases the dimensionality of the problem. This extra effort is disproportionate to the importance of this feature for this question (maybe(!) some teams play more aggressively than others).)
- birthday (We decided against this feature because the data set does not contain information about the date of every single game. This makes it impossible to find out if players get more red cards when they are at a certain age, because the data from this dataset refer to the complete career of the players.)
- Alpha_3 (we neglect the nationality of the referee)
- games (This is a redundant feature because the total number of games is given by the sum of victories, ties and defeats)
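
The redundancy claim for `games` can be checked before the column is dropped. A minimal sketch on a hypothetical toy frame (the values are invented; on the real data the same comparison would be run on `df` before the drop):

```python
import pandas as pd

# Hypothetical toy rows with the same column names as the dataset
toy = pd.DataFrame({
    "games":     [3, 5, 1],
    "victories": [1, 2, 0],
    "ties":      [1, 1, 0],
    "defeats":   [1, 2, 1],
})

# games is redundant if it always equals victories + ties + defeats
games_from_parts = toy["victories"] + toy["ties"] + toy["defeats"]
is_redundant = bool((toy["games"] == games_from_parts).all())
print(is_redundant)
```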

In [4]:
df = df.drop(labels=["player", "playerShort", "photoID", "refCountry", "nIAT", "seIAT", "nExp", "seExp", "yellowCards", "club", "birthday", "Alpha_3", "games"], axis=1)
print(df.columns)

Index(['leagueCountry', 'height', 'weight', 'position', 'victories', 'ties',
       'defeats', 'goals', 'yellowReds', 'redCards', 'rater1', 'rater2',
       'refNum', 'meanIAT', 'meanExp'],
      dtype='object')


Next, we create new features out of existing ones or manipulate them:
- rating (we take the mean of rater1 and rater2 as our own rating)
- percentageReds (we sum up the red and yellow-red cards to get the total number of red cards and divide by the number of games. This is our response Y.)
- leagueCountry (replace the name of the country by one-hot encoding)
- refCount (counts the number of dyads for each referee. Relevant for later drops (see https://nbviewer.jupyter.org/github/mathewzilla/redcard/blob/master/Crowdstorming_visualisation.ipynb))
- We summarize some categories in position (Goalkeeper, Back, Midfielder, Forward (don't know anything about football, hopefully this makes sense.))
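
The derived features listed above can be illustrated on a small example. This is only a sketch on hypothetical toy dyads (all values are invented); the column names follow the dataset:

```python
import pandas as pd

# Hypothetical toy dyads (values invented for illustration)
toy = pd.DataFrame({
    "rater1": [0.25, 0.75, 0.0],
    "rater2": [0.25, 1.0, 0.0],
    "redCards": [0, 1, 0],
    "yellowReds": [1, 0, 0],
    "victories": [2, 1, 1],
    "ties": [1, 0, 0],
    "defeats": [1, 1, 0],
    "leagueCountry": ["England", "Spain", "England"],
})

# rating: mean of the two skin color ratings
toy["rating"] = (toy["rater1"] + toy["rater2"]) / 2

# percentageReds: all red cards (direct and yellow-red) per game
n_games = toy["victories"] + toy["ties"] + toy["defeats"]
toy["percentageReds"] = (toy["redCards"] + toy["yellowReds"]) / n_games

# one-hot encoding: one indicator column per league country
onehot = pd.get_dummies(toy["leagueCountry"], prefix="Country")
toy = pd.concat([toy.drop(columns=["leagueCountry"]), onehot], axis=1)
print(toy[["rating", "percentageReds", "Country_England", "Country_Spain"]])
```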

In [5]:
#take mean of the two skin color ratings
df["rating"] = (df["rater1"] + df["rater2"])/2
df = df.drop(labels=["rater1", "rater2"], axis=1)

#sum up red and yellow-red cards
df["percentageReds"] = (df["redCards"] + df["yellowReds"])/(df["victories"]+df["ties"]+df["defeats"])
df = df.drop(labels=["redCards", "yellowReds"], axis=1)

#onehot encoding for leagueCountry
onehot = pd.get_dummies(df.leagueCountry, prefix="Country")
df = df.drop(labels=["leagueCountry"], axis=1)
df = pd.concat([df,onehot], axis=1, sort=False)

#summarize positions and onehot encoding for positions
dic = {"Right Fullback":"Back",
       "Left Fullback":"Back",
       "Center Back":"Back",
       "Left Midfielder":"Midfielder", 
       "Right Midfielder":"Midfielder", 
       "Center Midfielder":"Midfielder", 
       "Defensive Midfielder":"Midfielder",
       "Attacking Midfielder":"Midfielder",
       "Left Winger":"Forward",
       "Right Winger":"Forward",
       "Center Forward":"Forward"}
df = df.replace({"position":dic})
onehot = pd.get_dummies(df.position, prefix="Position")
df = df.drop(labels=["position"], axis=1)
df = pd.concat([df,onehot], axis=1, sort=False)


#add a column which tracks how many dyads each ref is involved in
#taken from https://nbviewer.jupyter.org/github/mathewzilla/redcard/blob/master/Crowdstorming_visualisation.ipynb
df['refCount']=0
refs=pd.unique(df['refNum'].values.ravel()) #list all unique ref IDs
#for each ref, count their dyads
for r in refs:
    df.loc[df['refNum']==r,"refCount"]=len(df[df['refNum']==r])    

Now we go on with preparing the data set:
- remove rows that contain a NaN-value
- remove rows where refCount<22 (for explanation see https://nbviewer.jupyter.org/github/mathewzilla/redcard/blob/master/Crowdstorming_visualisation.ipynb. After this we can remove the features "refNum" and "refCount" because these were only kept for this step.)
- normalize the features ties, victories and defeats
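
The referee filter and the normalization can be sketched compactly with `groupby`. This is an alternative formulation on hypothetical toy data, not the code used in this notebook; `min_dyads` is a name introduced here and is set to 2 only because the toy frame is tiny (the text uses 22):

```python
import pandas as pd

# Hypothetical toy dyads: referee 7 appears three times, referee 9 once
toy = pd.DataFrame({
    "refNum":    [7, 7, 9, 7],
    "victories": [1, 0, 2, 1],
    "ties":      [0, 1, 1, 0],
    "defeats":   [1, 1, 1, 0],
})

# number of dyads per referee, aligned with the rows
toy["refCount"] = toy.groupby("refNum")["refNum"].transform("count")

# keep only referees with at least min_dyads dyads
min_dyads = 2
toy = toy.loc[toy["refCount"] >= min_dyads].reset_index(drop=True)
toy = toy.drop(columns=["refNum", "refCount"])

# turn win/tie/defeat counts into fractions of the games played
outcomes = ["victories", "ties", "defeats"]
total = toy[outcomes].sum(axis=1)
for col in outcomes:
    toy[col] = toy[col] / total
print(toy)
```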

In [6]:
#remove rows that contain a NaN value in any column
df = df.dropna(axis=0)

#remove rows where the "refCount"<22
df = df.loc[df["refCount"]>21].reset_index()
df = df.drop(["refNum", "refCount", "index"], axis=1)

#normalize ties, victories and defeats
defeats = df["defeats"]/(df["defeats"]+df["ties"]+df["victories"])
ties = df["ties"]/(df["defeats"]+df["ties"]+df["victories"])
victories = df["victories"]/(df["defeats"]+df["ties"]+df["victories"])
df["defeats"] = defeats
df["ties"] = ties
df["victories"] = victories


In the following tasks we want to apply the LSQR algorithm. In the lecture we always assumed centered features and responses, so our last step is to center our data. The responses are given by the values in the column "percentageReds".
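
Why centering matters can be seen on a small example: on centered data no intercept is needed, and a prediction for an uncentered input is obtained by undoing the centering. The following is a minimal sketch on synthetic data (the relation y = 2x + 1 is an assumption made purely for illustration):

```python
import numpy as np
from scipy.sparse.linalg import lsqr

# Synthetic data (assumption for illustration): y = 2*x + 1, no noise
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=(50, 1))
y = 2.0 * x[:, 0] + 1.0

# center features and responses, so that no intercept is needed
x_mean, y_mean = x.mean(axis=0), y.mean()
x_c, y_c = x - x_mean, y - y_mean

# lsqr returns a tuple; its first entry is the least-squares solution
beta = lsqr(x_c, y_c)[0]

# prediction for an uncentered input: undo the centering
x_new = np.array([4.0])
y_pred = y_mean + np.sum(beta * (x_new - x_mean))
print(beta, y_pred)
```

The slope is recovered from the centered data alone, and the intercept reappears implicitly through the two means.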

In [7]:
df_mean = df.apply(np.mean, axis=0)
df = df - df_mean

In [8]:
df

Unnamed: 0,height,weight,victories,ties,defeats,goals,meanIAT,meanExp,rating,percentageReds,Country_England,Country_France,Country_Germany,Country_Spain,Position_Back,Position_Forward,Position_Goalkeeper,Position_Midfielder
0,-0.096034,-5.314496,-0.459157,-0.232627,0.691785,-0.365041,-0.023132,0.075924,-0.157667,-0.008371,0.713251,-0.154393,-0.303968,-0.254891,0.693521,-0.2114,-0.087468,-0.32795
1,4.903966,3.685504,0.540843,-0.232627,-0.308215,-0.365041,-0.023132,0.075924,-0.157667,-0.008371,0.713251,-0.154393,-0.303968,-0.254891,0.693521,-0.2114,-0.087468,-0.32795
2,-2.096034,-8.314496,-0.459157,-0.232627,0.691785,-0.365041,-0.023132,0.075924,0.717333,-0.008371,0.713251,-0.154393,-0.303968,-0.254891,-0.306479,-0.2114,-0.087468,0.67205
3,10.903966,3.685504,-0.459157,0.767373,-0.308215,-0.365041,-0.023132,0.075924,-0.032667,-0.008371,0.713251,-0.154393,-0.303968,-0.254891,-0.306479,-0.2114,0.912532,-0.32795
4,-2.096034,-6.314496,0.540843,-0.232627,-0.308215,-0.365041,-0.023132,0.075924,-0.282667,-0.008371,-0.286749,-0.154393,0.696032,-0.254891,0.693521,-0.2114,-0.087468,-0.32795
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
113122,5.903966,3.685504,-0.459157,0.767373,-0.308215,-0.365041,0.027810,0.035812,0.217333,-0.008371,0.713251,-0.154393,-0.303968,-0.254891,-0.306479,-0.2114,-0.087468,0.67205
113123,-4.096034,-9.314496,0.540843,-0.232627,-0.308215,0.634959,0.027810,0.035812,-0.282667,-0.008371,0.713251,-0.154393,-0.303968,-0.254891,-0.306479,-0.2114,-0.087468,0.67205
113124,7.903966,10.685504,-0.459157,-0.232627,0.691785,-0.365041,0.027810,0.035812,0.092333,-0.008371,0.713251,-0.154393,-0.303968,-0.254891,0.693521,-0.2114,-0.087468,-0.32795
113125,-13.096034,-4.314496,0.540843,-0.232627,-0.308215,-0.365041,0.027810,0.035812,-0.032667,-0.008371,-0.286749,-0.154393,0.696032,-0.254891,-0.306479,-0.2114,-0.087468,0.67205


### 2.2 Model Creation

In [138]:
#solve the problem using the lsqr algorithm (linear regression)
#extract features and responses from the DataFrame
Y = df["percentageReds"].to_numpy()
X = df.drop(labels=["percentageReds"], axis=1).to_numpy()

class LinearRegression():
    
    def __init__(self):
        self.beta = None

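    def train(self, features, labels):
        #fit beta on centered features and labels using the lsqr algorithm
        self.beta = lsqr(features,labels)[0]

    def predict(self, x):
        #x is an uncentered feature vector; the centering is undone with the global df_mean
        x_mean = df_mean.drop(labels=["percentageReds"])
        y_mean = df_mean["percentageReds"]
        return y_mean + np.sum(self.beta*(x-x_mean)) 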

#Test basic functionality
regression = LinearRegression()
regression.train(X,Y)
regression.predict([180, 77, 1.4, 0.8, 1, 0.4, 0.35, 0.5, 1, 0.3, 0.15, 0.3, 0.25, 0.3, 0.21, 0.1, 0.35])

0.008733024863475151

In [139]:
#solve the problem using regression forests
# base classes
class Node:
    pass

class Tree:
    def __init__(self):
        self.root = Node()
    
    def find_leaf(self, x):
        node = self.root
        while hasattr(node, "feature"):
            j = node.feature
            if x[j] <= node.threshold:
                node = node.left
            else:
                node = node.right
        return node
    
class RegressionTree(Tree):
    def __init__(self):
        super(RegressionTree, self).__init__()
        
    def train(self, data, labels, n_min=500):
        '''
        data: the feature matrix for all instances
        labels: the corresponding ground-truth responses
        n_min: termination criterion (don't split if a node contains fewer instances)
        '''
        N, D = data.shape
        D_try = np.max([int(np.sqrt(D))-2, 0]) # how many features to consider for each split decision

        # initialize the root node
        self.root.data = data
        self.root.labels = labels
        
        stack = [self.root]
        while len(stack):
            node = stack.pop()
            n = node.data.shape[0] # number of instances in present node
            if (n >= n_min):
                #randomly choose D_try features
                feature_indices = np.random.choice(D, D_try, replace=False)
                #always consider height (0), weight (1) and rating (8) as well
                feature_indices = np.append(feature_indices, [0,1,8])
                #split the node into two
                left, right = make_regression_split_node(node, feature_indices)
                #put the two nodes on the stack
                stack.append(left)
                stack.append(right)
            else:
                make_regression_leaf_node(node)
                
    def predict(self, x):
        leaf = self.find_leaf(x)
        return leaf.response

In [140]:
def make_regression_split_node(node, feature_indices):
    '''
    node: the node to be split
    feature_indices: a numpy array of length 'D_try', containing the feature 
                     indices to be considered in the present split
    '''
    n, D = node.data.shape

    # find best feature j (among 'feature_indices') and best threshold t for the split
    #(mainly copied from "density tree")
    e_min = float("inf")
    j_min, t_min = None, None

    for j in feature_indices:
        data_unique = np.sort(np.unique(node.data[:, j]))
        tj = (data_unique[1:] + data_unique[:-1])/2.0
        
        for t in tj:
            data_left = node.data[:, j].copy()
            labels_left = node.labels[data_left<=t].copy()
            data_left = data_left[data_left<=t]
            
            data_right = node.data[:, j].copy()
            labels_right = node.labels[data_right>t].copy()
            data_right = data_right[data_right>t]
            
            #compute mean label value on the left and right
            mean_left = np.mean(labels_left)
            mean_right = np.mean(labels_right)
            
            #compute sum of squared deviation from mean label
            measure_left = np.sum((labels_left - mean_left)**2)
            measure_right = np.sum((labels_right - mean_right)**2)       
            
            #Compute decision rule
            measure = measure_left + measure_right
            
            # choose the threshold that minimizes the sum of squared deviations
            if measure < e_min:
                e_min = measure
                j_min = j
                t_min = t
    
    # create children
    left = Node()
    right = Node()
    
    
    X = node.data[:, j_min]
    
    # initialize 'left' and 'right' with the data subsets and labels
    # according to the optimal split found above
    left.data = node.data[X<=t_min]# data in left node
    left.labels = node.labels[X<=t_min] # corresponding labels
    right.data = node.data[X>t_min]
    right.labels = node.labels[X>t_min]

    # turn the current 'node' into a split node
    # (store children and split condition)
    node.left = left
    node.right = right
    node.feature = j_min
    node.threshold = t_min

    # return the children (to be placed on the stack)
    return left, right    

In [141]:
def make_regression_leaf_node(node):
    '''
    node: the node to become a leaf
    '''
    # compute and store leaf response
    node.response = np.mean(node.labels) + df_mean["percentageReds"]

In [142]:
class RegressionForest():
    def __init__(self, n_trees):
        # create ensemble
        self.trees = [RegressionTree() for i in range(n_trees)]
    
    def train(self, data, labels, n_min=1000):
        for tree in self.trees:
            # train each tree, using a bootstrap sample of the data
            bootstrap_indices = np.random.choice(len(labels), len(labels))
            bootstrap_data = np.array([data[i] for i in bootstrap_indices])
            bootstrap_labels = np.array([labels[i] for i in bootstrap_indices])
            tree.train(bootstrap_data, bootstrap_labels, n_min=n_min)

    def predict(self, x):
        predictions = np.array([])
        for tree in self.trees:
            predictions = np.append(predictions, tree.predict(x))
        return np.mean(predictions)
    
    def merge(self, forest):
        self.trees = self.trees + forest.trees

In [55]:
#test of basic functionality
Y = df["percentageReds"].to_numpy()
X = df.drop(labels=["percentageReds"], axis=1).to_numpy()

forest = RegressionForest(n_trees=5)
forest.train(X, Y, n_min=500)

In [151]:
#determine the error via cross validation

#define function that determines the mean squared error
def compute_error(model, test_features, test_labels):
    mean_squared_error = 0
    n = len(test_features)
    
    for i in range(n):
        mean_squared_error = mean_squared_error + (test_labels[i] - model.predict(test_features[i]))**2
    
    return mean_squared_error/n

Y = df["percentageReds"].to_numpy()
X = df.drop(labels=["percentageReds"], axis=1).to_numpy()

#number of folds
L = 10

#create  L folds
N = len(X)
indices = np.random.choice(N, N, replace=False)
X_folds = np.array(np.array_split(X[indices], L), dtype=object) 
Y_folds = np.array(np.array_split(Y[indices], L), dtype=object)

#1. Linear Regression
error = []
for i in range(L):
    print(i/L*100, "%")
    #create training and test data
    X_train = np.concatenate(X_folds[np.arange(L)!=i], axis=0)
    Y_train = np.concatenate(Y_folds[np.arange(L)!=i], axis=0)
    X_test = X_folds[i]
    Y_test = Y_folds[i]
    
    #compute error
    regression = LinearRegression()
    regression.train(X_train,Y_train)
    error.append(compute_error(regression, X_test, Y_test))
error = np.mean(error)

#print error
print("\nerror rate, linear regression:")
print(error)

0.0 %
10.0 %
20.0 %
30.0 %
40.0 %
50.0 %
60.0 %
70.0 %
80.0 %
90.0 %

error rate, linear regression:
0.005192709509021647


In [153]:
#2. Regression Forest
error = []
for i in range(L):
    print(i/L*100, "%")
    #create training and test data
    X_train = np.concatenate(X_folds[np.arange(L)!=i], axis=0)
    Y_train = np.concatenate(Y_folds[np.arange(L)!=i], axis=0)
    X_test = X_folds[i]
    Y_test = Y_folds[i]
    
    #compute error
    forest = RegressionForest(n_trees=5)
    forest.train(X_train,Y_train, n_min=500)
    error.append(compute_error(forest, X_test, Y_test))
error = np.mean(error)

#print error
print("\nerror rate, regression forest:")
print(error)

0.0 %
10.0 %
20.0 %
30.0 %
40.0 %
50.0 %
60.0 %
70.0 %
80.0 %
90.0 %

error rate, regression forest:
0.005265575175396048


### 2.3 Answering the Research Question

In [94]:
#define function that shuffles the data in one column
def shuffle_data(features, feature_index):
    '''
    Shuffles the data in the column denoted by feature_index. All other data remain unchanged 

    features: 2D array, each row stands for one instance, each column for one feature
    feature_index: the entries in the feature_index-th column will be shuffled randomly
    '''

    #work on a copy so that the array passed in is not modified
    shuffled = features.copy()
    shuffled[:, feature_index] = np.random.permutation(shuffled[:, feature_index])

    return shuffled

In [150]:
color_rating_index = 8 #index of the color rating column in X
L = 10 #number of folds

#load csv-file where we save the mean squared errors
err_data = pd.read_csv("errors.txt", sep=",", index_col=False) 

# load original data set
Y = df["percentageReds"].to_numpy()
X = df.drop(labels=["percentageReds"], axis=1).to_numpy()

#1. Linear Regression
#shuffle data
Y_shuffled = Y
X_shuffled = shuffle_data(X, color_rating_index)

#create  L folds
N = len(X_shuffled)
indices = np.random.choice(N, N, replace=False)
X_folds = np.array(np.array_split(X_shuffled[indices], L), dtype=object) 
Y_folds = np.array(np.array_split(Y_shuffled[indices], L), dtype=object)

error = []
for i in range(L):
    print(i/L*100, "%")
    #create training and test data
    X_train = np.concatenate(X_folds[np.arange(L)!=i], axis=0)
    Y_train = np.concatenate(Y_folds[np.arange(L)!=i], axis=0)
    X_test = X_folds[i]
    Y_test = Y_folds[i]
    
    #compute error
    regression = LinearRegression()
    regression.train(X_train,Y_train)
    error.append(compute_error(regression, X_test, Y_test))
error_lr = np.mean(error)

#print error and save the value
print("\nerror rate, linear regression:")
print(error_lr)

#2. Regression Tree
error = []
for i in range(L):
    print(i/L*100, "%")
    #create training and test data
    X_train = np.concatenate(X_folds[np.arange(L)!=i], axis=0)
    Y_train = np.concatenate(Y_folds[np.arange(L)!=i], axis=0)
    X_test = X_folds[i]
    Y_test = Y_folds[i]
    
    #compute error
    forest = RegressionForest(n_trees=5)
    forest.train(X_train,Y_train, n_min=500)
    error.append(compute_error(forest, X_test, Y_test))
error = np.mean(error)

#print error and save the value
print("\nerror rate, regression tree:")
print(error)
err_data.loc[len(err_data)] = [error_lr, error]

err_data.to_csv("errors.txt", sep=",", index=False)

0.0 %
10.0 %
20.0 %
30.0 %
40.0 %
50.0 %
60.0 %
70.0 %
80.0 %
90.0 %

error rate, linear regression:
0.005183430036888673
0.0 %
10.0 %
20.0 %
30.0 %
40.0 %
50.0 %
60.0 %
70.0 %
80.0 %
90.0 %

error rate, regression tree:
0.005268014302182951


To obtain the following results, we ran the code above several times. The first row contains the results for the unshuffled dataset, the other rows those for datasets with a shuffled color rating. Some of the shuffled runs have a lower error than the original dataset, so we cannot find a skin color bias in red card decisions at a significance level of 0.05. However, we doubt that our code is completely correct: surprisingly, the error of the linear regression is lower in every run in which the color rating is shuffled. We do not have an explanation for this.
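
One thing that may be worth checking, offered as a hypothesis and not verified against the dataset: `LinearRegression.predict` subtracts the feature means and adds the response mean, which suits uncentered inputs such as the test call in 2.2, whereas `compute_error` passes rows of the already centered `X` and compares the result with centered labels. The sketch below uses synthetic data (an assumption for illustration) to show how such a mismatch shifts every prediction by a constant and inflates the mean squared error:

```python
import numpy as np
from scipy.sparse.linalg import lsqr

# Synthetic data (assumption for illustration): y = 2*x + 1 + noise
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=(200, 1))
y = 2.0 * x[:, 0] + 1.0 + rng.normal(0.0, 0.1, size=200)

# center everything, as in the cleaning step
x_mean, y_mean = x.mean(axis=0), y.mean()
x_c, y_c = x - x_mean, y - y_mean
beta = lsqr(x_c, y_c)[0]

# consistent: centered inputs, centered predictions, centered labels
mse_consistent = np.mean((y_c - x_c @ beta) ** 2)

# mismatched: inputs are centered a second time and y_mean is added back,
# but the labels are still the centered ones
pred_mismatched = y_mean + (x_c - x_mean) @ beta
mse_mismatched = np.mean((y_c - pred_mismatched) ** 2)

print(mse_consistent, mse_mismatched)
```

If this applies here, evaluating both models on one consistent scale (either everything centered or everything uncentered) would be a natural first test before interpreting the shuffled runs.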

In [154]:
err_data = pd.read_csv("errors.txt", sep=",", index_col=False) 
err_data

| | mse linear regression | mse regression tree |
|---|---|---|
| 0 | 0.005191 | 0.00526 |
| 1 | 0.005175 | 0.005267 |
| 2 | 0.005184 | 0.005269 |
| 3 | 0.005178 | 0.005265 |
| 4 | 0.00518 | 0.005268 |
| 5 | 0.005178 | 0.005256 |
| 6 | 0.005183 | 0.005269 |
| 7 | 0.005178 | 0.005264 |
| 8 | 0.005175 | 0.005268 |
| 9 | 0.005177 | 0.005263 |
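
The comparison against the significance level can be made explicit as an empirical p-value. The sketch below uses the linear regression values from the table above; the "+1" correction, which counts the original run as one of the permutations, is a common convention and an assumption here, not something the exercise prescribes:

```python
import numpy as np

# MSE values for linear regression, copied from the table above
mse_original = 0.005191
mse_shuffled = np.array([0.005175, 0.005184, 0.005178, 0.005180, 0.005178,
                         0.005183, 0.005178, 0.005175, 0.005177])

# If the rating carried information, shuffling it should increase the error.
# Count the shuffled runs that do at least as well as the original data.
n_as_good = int(np.sum(mse_shuffled <= mse_original))

# empirical p-value with the +1 correction
p_value = (1 + n_as_good) / (1 + len(mse_shuffled))
print(n_as_good, p_value)  # prints: 9 1.0
```

With all nine shuffled runs at or below the original error, the p-value is far above 0.05, which matches the conclusion that no bias can be shown with this choice of features.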


### 2.4 How to Lie With Statistics

We already found a choice of features that does not reveal a skin color bias. So we try to find a choice of features that shows such a bias. We choose the "rating" column as the only feature. We only apply the Linear Regression model to the data, because our task is to find one example of a choice of features that shows a skin color bias in one of the used models. 

In [178]:
Y = df["percentageReds"].to_numpy()
X = df[["rating"]].to_numpy()
df_mean = df_mean[["rating", "percentageReds"]]

#number of folds
L = 20

#create  L folds
N = len(X)
indices = np.random.choice(N, N, replace=False)
X_folds = np.array(np.array_split(X[indices], L), dtype=object) 
Y_folds = np.array(np.array_split(Y[indices], L), dtype=object)

#1. Linear Regression
error = []
for i in range(L):
    print(i/L*100, "%")
    #create training and test data
    X_train = np.concatenate(X_folds[np.arange(L)!=i], axis=0)
    Y_train = np.concatenate(Y_folds[np.arange(L)!=i], axis=0)
    X_test = X_folds[i]
    Y_test = Y_folds[i]
    
    #compute error
    regression = LinearRegression()
    regression.train(X_train,Y_train)
    error.append(compute_error(regression, X_test, Y_test))
error = np.mean(error)

#print error
print("\nerror rate, linear regression:")
print(error)


0.0 %
5.0 %
10.0 %
15.0 %
20.0 %
25.0 %
30.0 %
35.0 %
40.0 %
45.0 %
50.0 %
55.00000000000001 %
60.0 %
65.0 %
70.0 %
75.0 %
80.0 %
85.0 %
90.0 %
95.0 %

error rate, linear regression:
0.0052392309104625995


In [196]:
color_rating_index = 0 #index of the color rating column in X
L = 20 #number of folds

#load csv-file where we save the mean squared errors
err_data = pd.read_csv("errorsLie.txt", sep=",", index_col=False) 

# load original data set
Y = df["percentageReds"].to_numpy()
X = df[["rating"]].to_numpy()

#1. Linear Regression
#shuffle data
Y_shuffled = Y
X_shuffled = shuffle_data(X, color_rating_index)

#create  L folds
N = len(X_shuffled)
indices = np.random.choice(N, N, replace=False)
X_folds = np.array(np.array_split(X_shuffled[indices], L), dtype=object) 
Y_folds = np.array(np.array_split(Y_shuffled[indices], L), dtype=object)

error = []
for i in range(L):
    print(i/L*100, "%")
    #create training and test data
    X_train = np.concatenate(X_folds[np.arange(L)!=i], axis=0)
    Y_train = np.concatenate(Y_folds[np.arange(L)!=i], axis=0)
    X_test = X_folds[i]
    Y_test = Y_folds[i]
    
    #compute error
    regression = LinearRegression()
    regression.train(X_train,Y_train)
    error.append(compute_error(regression, X_test, Y_test))
error = np.mean(error)

#print error and save the value
print("\nerror rate, linear regression:")
print(error)

err_data.loc[len(err_data)] = [error]

err_data.to_csv("errorsLie.txt", sep=",", index=False)

0.0 %
5.0 %
10.0 %
15.0 %
20.0 %
25.0 %
30.0 %
35.0 %
40.0 %
45.0 %
50.0 %
55.00000000000001 %
60.0 %
65.0 %
70.0 %
75.0 %
80.0 %
85.0 %
90.0 %
95.0 %

error rate, linear regression:
0.005248877638861473


After running the code above 20 times we can find a skin color bias this time: The mean squared error for the shuffled data is always higher than the error for the original dataset.

### 2.5 Alternative Hypotheses

This exercise assumes that a correlation between skin color and the probability of getting a red card exists. We did not find such a correlation with our first choice of features, so we assume that the choice of features used in 2.4 was "better". Two alternative causal hypotheses for red cards would then be:
1. Heavier players commit more fouls (because the opponent is more likely to fall). This leads to more red cards for heavier players.
2. Players in the position "Back" often have to stop an opponent at the last moment ("no matter what it costs"). This leads to more red cards for players in the position "Back".

If one of these hypotheses is true, we should find a positive correlation between weight (or the position "Back") and the color rating, so we would expect a positive covariance for these quantities. Additionally, we would expect a positive covariance between weight (or "Back") and the probability of a red card.
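
The sign check can be illustrated on toy data. The off-diagonal entry of the matrix returned by `np.cov` carries the sign of the relationship, but its magnitude depends on the units of the two quantities; `np.corrcoef` gives a scale-free value between -1 and 1. The arrays below are hypothetical and only serve to show the two calls:

```python
import numpy as np

# Hypothetical toy data: y grows with x, z shrinks with x
x = np.array([60.0, 70.0, 80.0, 90.0, 100.0])
y = np.array([0.1, 0.2, 0.2, 0.4, 0.5])
z = np.array([0.9, 0.7, 0.6, 0.4, 0.1])

# np.cov returns the 2x2 covariance matrix; the off-diagonal entry is cov(x, y)
cov_xy = np.cov(x, y)[0, 1]
cov_xz = np.cov(x, z)[0, 1]

# np.corrcoef rescales the covariance by both standard deviations
corr_xy = np.corrcoef(x, y)[0, 1]
corr_xz = np.corrcoef(x, z)[0, 1]
print(cov_xy, corr_xy, cov_xz, corr_xz)
```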

In [16]:
#compute covariance matrices
Y = df["percentageReds"].to_numpy()
X = df["weight"].to_numpy()
print(np.cov(X,Y))
Y = df["rating"].to_numpy()
print(np.cov(X,Y), "\n")

Y = df["percentageReds"].to_numpy()
X = df["Position_Back"].to_numpy()
print(np.cov(X,Y))
Y = df["rating"].to_numpy()
print(np.cov(X,Y))

[[5.12806150e+01 3.60300072e-03]
 [3.60300072e-03 5.17986190e-03]]
[[ 5.12806150e+01 -4.98876945e-02]
 [-4.98876945e-02  8.25710800e-02]] 

[[0.21255133 0.00071554]
 [0.00071554 0.00517986]]
[[ 0.21255133 -0.00133118]
 [-0.00133118  0.08257108]]


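In both cases one of our expectations is not met: the covariance with the red card rate is positive, but the covariance with the color rating is negative. Our hypotheses are therefore probably not true.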