## _A Graph Neural Network deep-dive into successful counterattacks_

### 1. Installing the required Python libraries:
  Make sure the `requirements.txt` file is in the current directory. Navigate to that directory in a terminal and run
 ```
 pip install -r requirements.txt
 ```
 or, on macOS,
 ```
 pip install -r requirements_macos.txt
 ```

In [None]:
from ipywidgets import Checkbox, Dropdown, Accordion, VBox
import sklearn.metrics as metrics
from sklearn.calibration import calibration_curve
from spektral.data import Dataset, Graph, DisjointLoader
from spektral.layers import CrystalConv, GlobalAvgPool
from tensorflow.keras.layers import Dense, Input, Dropout
from tensorflow.keras.losses import BinaryCrossentropy
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.metrics import AUC

import copy
import logging
import matplotlib.pyplot as plt
import numpy as np
import pickle
import pandas as pd
import random
import sys
import tensorflow as tf
import requests
import progressbar
from os.path import isfile
import time

# Setting up the logger
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
stdout_handler = logging.StreamHandler(sys.stdout)
logger.addHandler(stdout_handler)

### 2. Data Processing

- `get_data()` fetches the specified file (Women, Men or Combined) from a public S3 bucket. The data is stored locally as `women.pkl`, `men.pkl` or `combined.pkl` and is loaded automatically after downloading.
- We need to convert that data into a Spektral-specific `Dataset` class that includes the node features, edge features and adjacency matrix. We do this with the `CounterDataset` class, which inherits from the `Dataset` class.
- The `CounterDataset` class returns a list of `Graph` objects.

To help process the data we create a few interactive widgets in sections 2.2 to 2.6 for selecting the adjacency matrix, node features and edge features.

##### !! For more information on GNN data structures, check out [Spektral's Getting Started](https://graphneural.network/getting-started/) page !!

In [40]:
def get_data(file_name):
    '''
    Fetches the file from the location, loads it into memory and returns the data.
    '''
    if not isfile(file_name):
        url = f"https://ussf-ssac-23-soccer-gnn.s3.us-east-2.amazonaws.com/public/counterattack/{file_name}"

        logger.info(f"Downloading data from {url}...")
        
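        r = requests.get(url, stream=True)
        # Stop early on HTTP errors instead of saving an error page as a .pkl file
        r.raise_for_status()
        # Show a progress bar while the file is downloaded in chunks
        block_size = 1024
        n_chunk = 1000
        # Content-Length can be missing; default to 0 so int() never receives None
        file_size = int(r.headers.get('Content-Length', 0))
        num_bars = max(1, int(np.ceil(file_size / (n_chunk * block_size))))
        bar =  progressbar.ProgressBar(maxval=num_bars).start()
        with open(file_name, 'wb') as f:
            for i, chunk in enumerate(r.iter_content(chunk_size=n_chunk * block_size)):
                f.write(chunk)
                # Clamp so the bar never exceeds its maximum
                bar.update(min(i + 1, num_bars))
                # Add a little sleep so you can see the bar progress
                time.sleep(0.05)
        bar.finish()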

        logger.info("File downloaded successfully!")
        
    logger.info(f"Opening {file_name}...")
    with open(file_name, 'rb') as handle:
        data = pickle.load(handle)
    return data

### 2.1 Choose Dataset Type

Choose the file used to train the GNN. The available options are women's data (from the 2022 NWSL and international women's soccer between 2020 and 2022), men's data (from the 2022 MLS) and a combined file (both women's and men's data).

In [42]:
print("Choose File for training:")
file_widget = Dropdown(
    options=['Women', 'Men', 'Combined'],
    value='Women',
    disabled=False,
)
display(file_widget)

Choose File for training:


Dropdown(options=('Women', 'Men', 'Combined'), value='Women')

### 2.2 (Down)load Data
Using the file selection widget, we load the appropriate file from S3 (or from disk if it was downloaded before).

In [35]:
file_name = file_widget.value.lower() + '.pkl'
# Obtain the data
og_data = get_data(file_name)

Opening women.pkl...


### 2.2 Choose Adjacency Matrix

Available options:

- **normal** - connects attacking players amongst themselves and defending players amongst themselves; the two teams are connected to each other only through the ball.
- **delaunay** - connects a few attacking and defending players using a Delaunay triangulation. The exact layout depends on player positioning during the frame in question.
- **dense** - connects all the players and the ball to each other.
- **dense_ap** - connects all the attacking players to each other and to the defensive players.
- **dense_dp** - connects all the defending players to each other and to the attacking players.
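
To make the options above more concrete, here is a minimal sketch of how the **normal** and **dense** layouts could be built for a toy frame. The helper `build_adjacency` is hypothetical and not part of the notebook. It assumes attackers come first, then defenders, then the ball as the last node, and that there are no self-loops. The matrices stored in the `.pkl` files may be constructed differently.

```python
def build_adjacency(n_att, n_def, kind="normal"):
    """Return a symmetric 0/1 adjacency matrix as nested lists.

    Assumed node order (hypothetical): attackers, defenders, ball last.
    """
    if kind not in ("normal", "dense"):
        raise ValueError(f"unsupported kind: {kind}")
    n = n_att + n_def + 1
    ball = n - 1
    team = ["att"] * n_att + ["def"] * n_def + ["ball"]
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # assumption: no self-loops
            if kind == "dense":
                adj[i][j] = 1
            else:
                # "normal": team-mates are linked, and everyone is linked to the ball
                same_team = team[i] == team[j]
                touches_ball = ball in (i, j)
                adj[i][j] = int(same_team or touches_ball)
    return adj


normal = build_adjacency(2, 2, "normal")
dense = build_adjacency(2, 2, "dense")
for row in normal:
    print(row)
```

In the `normal` layout an attacker and a defender are never adjacent, so information can only flow between the teams via the ball node.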

In [None]:
print("Choose Adjacency Matrix:")
adj_matrix = Dropdown(
    options=['normal', 'delaunay', 'dense', 'dense_ap', 'dense_dp'],
    value='normal',
    disabled=False,
)
display(adj_matrix)

### 2.3 Choose Edge Features

Available options:

- **Player Distance** - Distance between two players connected to each other
- **Speed Difference** - Speed difference between two players connected to each other
- **Positional Sine angle** - Sine of the angle created between two players in the edge
- **Positional Cosine angle** - Cosine of the angle created between two players in the edge
- **Velocity Sine angle** - Sine of the angle created between the velocity vectors of two players in the edge
- **Velocity Cosine angle** - Cosine of the angle created between the velocity vectors of two players in the edge
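
As a rough illustration of the edge features listed above, the sketch below computes them for one pair of players. The function `edge_features` and its conventions are assumptions made for this example: the positional angle is taken as the direction from player `i` to player `j`, the velocity angle as the difference between the two heading angles, and the speed difference as `speed_j - speed_i`. The dataset may use other conventions or normalisation.

```python
import math


def edge_features(p_i, p_j):
    """Compute the six edge features for one pair of players.

    Each player is a dict with keys x, y, vx, vy (hypothetical layout).
    """
    dx, dy = p_j["x"] - p_i["x"], p_j["y"] - p_i["y"]
    speed_i = math.hypot(p_i["vx"], p_i["vy"])
    speed_j = math.hypot(p_j["vx"], p_j["vy"])
    # Direction of the line from player i to player j
    pos_angle = math.atan2(dy, dx)
    # Angle between the two velocity vectors
    vel_angle = math.atan2(p_j["vy"], p_j["vx"]) - math.atan2(p_i["vy"], p_i["vx"])
    return {
        "player_distance": math.hypot(dx, dy),
        "speed_difference": speed_j - speed_i,
        "pos_sin": math.sin(pos_angle),
        "pos_cos": math.cos(pos_angle),
        "vel_sin": math.sin(vel_angle),
        "vel_cos": math.cos(vel_angle),
    }


example_edge = edge_features(
    {"x": 0.0, "y": 0.0, "vx": 1.0, "vy": 0.0},
    {"x": 3.0, "y": 4.0, "vx": 0.0, "vy": 2.0},
)
print(example_edge)
```

Encoding angles as a sine and cosine pair avoids the discontinuity at ±π that a raw angle would have.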

In [None]:
player_dist = Checkbox(
    value=True,
    description='Player Distance',
    disabled=False
)

speed_diff_matrix = Checkbox(
    value=True,
    description='Speed Difference',
    disabled=False
)

pos_sin_angle = Checkbox(
    value=True,
    description='Positional Sine angle',
    disabled=False
)

pos_cos_angle = Checkbox(
    value=True,
    description='Positional Cosine angle',
    disabled=False
)

vel_sin_angle = Checkbox(
    value=True,
    description='Velocity Sine angle',
    disabled=False
)

vel_cos_angle = Checkbox(
    value=True,
    description='Velocity Cosine angle',
    disabled=False
)

# Add the user selection of edge features in edge_f_box list.
edge_f_box = VBox([player_dist, speed_diff_matrix, pos_sin_angle, pos_cos_angle, vel_sin_angle, vel_cos_angle])

### 2.4 Choose Node Features

##### Node Features description
- **x coordinate** - x coordinate on the 2D pitch for the player
- **y coordinate** - y coordinate on the 2D pitch for the player
- **vx** - Velocity vector's x coordinate
- **vy** - Velocity vector's y coordinate
- **Velocity** - magnitude of the velocity
- **Velocity Angle** - angle made by the velocity vector
- **Distance to Goal** - distance of the player from the goal post
- **Angle with Goal** - angle made by the player with the goal
- **Distance to Ball** - distance from the ball (always 0 for the ball)
- **Angle with Ball** - angle made with the ball (always 0 for the ball)
- **Attacking Team Flag** - 1 if the player belongs to the attacking team, 0 for defending players and for the ball
- **Potential Receiver** - 1 if player is a potential receiver, 0 otherwise
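
The node features above could be derived from raw tracking data roughly as follows. This is only a sketch: the function `node_features`, the coordinate units, the goal position and the use of `atan2` for all angles are assumptions for illustration, not the preprocessing that produced the `.pkl` files.

```python
import math


def node_features(x, y, vx, vy, ball_xy, goal_xy, is_attacking, potential_receiver):
    """Return the 12 node features in the order listed above (hypothetical helper)."""
    bx, by = ball_xy
    gx, gy = goal_xy
    return [
        x,
        y,
        vx,
        vy,
        math.hypot(vx, vy),            # Velocity (magnitude)
        math.atan2(vy, vx),            # Velocity Angle
        math.hypot(gx - x, gy - y),    # Distance to Goal
        math.atan2(gy - y, gx - x),    # Angle with Goal
        math.hypot(bx - x, by - y),    # Distance to Ball
        math.atan2(by - y, bx - x),    # Angle with Ball
        int(is_attacking),             # Attacking Team Flag
        int(potential_receiver),       # Potential Receiver
    ]


example_node = node_features(
    x=10.0, y=0.0, vx=3.0, vy=4.0,
    ball_xy=(10.0, 10.0), goal_xy=(100.0, 0.0),
    is_attacking=True, potential_receiver=False,
)
print(example_node)
```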

In [None]:
x_coord = Checkbox(
    value=True,
    description='x coordinate',
    disabled=False
)

y_coord = Checkbox(
    value=True,
    description='y coordinate',
    disabled=False
)

vx = Checkbox(
    value=True,
    description='vX',
    disabled=False
)


vy = Checkbox(
    value=True,
    description='vY',
    disabled=False
)


v = Checkbox(
    value=True,
    description='Velocity',
    disabled=False
)


velocity_angle = Checkbox(
    value=True,
    description='Velocity Angle',
    disabled=False
)

dist_goal = Checkbox(
    value=True,
    description='Distance to Goal',
    disabled=False
)

goal_angle = Checkbox(
    value=True,
    description='Angle with Goal',
    disabled=False
)


dist_ball = Checkbox(
    value=True,
    description='Distance to Ball',
    disabled=False
)

ball_angle = Checkbox(
    value=True,
    description='Angle with Ball',
    disabled=False
)


is_attacking = Checkbox(
    value=True,
    description='Attacking Team Flag',
    disabled=False
)

potential_receiver = Checkbox(
    value=True,
    description='Potential Receiver',
    disabled=False
)

# Add the user selection of node features in node_f_box list.
node_f_box = VBox([x_coord, y_coord, vx, vy, v, velocity_angle, dist_goal, goal_angle, dist_ball, ball_angle, 
                is_attacking, potential_receiver])

### 2.5 Display Checkboxes

In [None]:
accordion = Accordion(children=[edge_f_box, node_f_box])
accordion.set_title(0, 'Edge Features')
accordion.set_title(1, 'Node Features')
accordion

### 2.6 Update dataset with selected features
The loaded dataset is updated to include only the selected node features and edge features

In [None]:
edge_feature_idxs = [idx for idx, x in enumerate(edge_f_box.children) if x.value]
node_feature_idxs = [idx for idx, x in enumerate(node_f_box.children) if x.value]
node_features = [x.description for x in node_f_box.children if x.value]

# Check for empty edge features or node features.
if not edge_feature_idxs or not node_feature_idxs:
    print("\nCannot have zero edge features or zero node features.\n")
    print("\nDefaulting to the previous configuration.")    
else:
    # Apply the column selection to every graph, not only to the first one
    data_mat = og_data[adj_matrix.value]
    data_mat['e'] = [e[:, edge_feature_idxs] for e in data_mat['e']]
    data_mat['x'] = [x[:, node_feature_idxs] for x in data_mat['x']]
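
As a small illustration of what feature selection does, the sketch below keeps only the chosen columns of each per-graph feature matrix. It uses plain nested lists and a hypothetical helper `select_columns` so that it runs without NumPy. On NumPy arrays the same selection is written `x[:, keep]`.

```python
def select_columns(matrix, keep_idxs):
    """Keep only the listed column indices of a 2D nested list."""
    return [[row[j] for j in keep_idxs] for row in matrix]


# Two toy graphs, each with 2 nodes and 3 node features
toy_x = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]
keep = [0, 2]  # e.g. the first and third checkbox are ticked

# Every graph must be subset the same way so all graphs share one feature dimension
toy_x_selected = [select_columns(x, keep) for x in toy_x]
print(toy_x_selected)
```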

### 2.7 Convert to Spektral Graphs
Finally, we convert the data to Spektral's `Graph` representation. We also select the adjacency matrix type.

##### !! [More information on creating custom datasets with Spektral](https://graphneural.network/creating-dataset/) !!
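
The conversion boils down to zipping four parallel lists into one object per graph. The sketch below mimics that with a `namedtuple` called `ToyGraph`, a stand-in for Spektral's `Graph`, and a tiny hand-made dictionary. The key layout (`data[matrix_type]['x' | 'a' | 'e']` plus `data['binary']`) follows the notebook code, while the toy values and the helper `to_graphs` are made up for illustration.

```python
from collections import namedtuple

# Stand-in for spektral.data.Graph (assumption: only x, a, e, y are needed here)
ToyGraph = namedtuple("ToyGraph", ["x", "a", "e", "y"])

toy_data = {
    "normal": {
        "x": [[[0.1, 0.2], [0.3, 0.4]], [[0.5, 0.6], [0.7, 0.8]]],  # node features per graph
        "a": [[[0, 1], [1, 0]], [[0, 1], [1, 0]]],                  # adjacency matrix per graph
        "e": [[[1.0], [1.0]], [[2.0], [2.0]]],                      # edge features per graph
    },
    "binary": [[1], [0]],  # one label per graph
}


def to_graphs(data, matrix_type):
    """Zip the parallel lists into one ToyGraph per sample."""
    mat = data[matrix_type]
    return [
        ToyGraph(x=x, a=a, e=e, y=y)
        for x, a, e, y in zip(mat["x"], mat["a"], mat["e"], data["binary"])
    ]


graphs = to_graphs(toy_data, "normal")
print(len(graphs), graphs[0].y, graphs[1].y)
```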

In [41]:
class CounterDataset(Dataset):
    """
    Convert raw Graph data to a CounterDataset with Dataset class from the spektral library to include node features, 
    edge features and the adjacency matrix.
    """
    def __init__(self, **kwargs):
        '''
        Constructor to load parameters.
        '''
        self.og_data = kwargs['og_data']
        self.matrix_type = kwargs['matrix_type']
        super().__init__(**kwargs)
        
    def read(self):
        '''
        Overriding the read function - to return a list of Graph objects
        '''
        logger.info("Loading Counter Dataset.")
        
        data = self.og_data
        # Choosing the data with the required matrix type.
        data_mat = data[self.matrix_type]
        
        # Print Graph information
        logger.info(
            f"""node_features (x): {data_mat['x'][0].shape}
             \n adj_matrix (a): {data_mat['a'][0].shape}
             \n edge_features (e): {data_mat['e'][0].shape}
             \n label: {np.asarray([data['binary'][0]]).shape}"""
        )
        
        # Return a list of Graph objects
        return [
            Graph(x=x, a=a, e=e, y=y) for x, a, e, y in zip(
                data_mat['x'], data_mat['a'], data_mat['e'], data['binary']
            )
        ]

In [None]:
# Load the dataset in the CounterDataset format.
dataset = CounterDataset(og_data = og_data, matrix_type = adj_matrix.value)

### 3 Training Graph Neural Network

### 3.1 Configurations

In [None]:
learning_rate = 1e-3  # Learning rate
epochs = 150  # Number of training epochs
batch_size = 16  # Batch size
channels = 128  # Hidden units for the neural network
layers = 3  # Number of CrystalConv layers

### 3.2 Parameters

In [None]:
N = max(g.n_nodes for g in dataset) # Number of nodes
F = dataset.n_node_features  # Dimension of node features
S = dataset.n_edge_features  # Dimension of edge features
n_out = dataset.n_labels  # Dimension of the target
n = len(dataset) # Number of samples in the dataset

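# Train/test split for the dataset (70% train, 30% test).
# Both split points are equal, so the validation split is empty and the
# test indices never overlap with the training indices.
idxs = np.random.RandomState(seed=15).permutation(len(dataset))
split_va, split_te = int(0.7 * len(dataset)), int(0.7 * len(dataset))
idx_tr, idx_va, idx_te = np.split(idxs, [split_va, split_te])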
dataset_tr = dataset[idx_tr]
dataset_te = dataset[idx_te]
loader_tr = DisjointLoader(dataset_tr, batch_size=batch_size, epochs=epochs)
loader_te = DisjointLoader(dataset_te, batch_size=batch_size, epochs=1, shuffle = False)

# Display dataset information on target variable.
logger.info(f"n: {n}")
logger.info(f"Pct successful total: {round(np.asarray([graph.y[0] for graph in dataset]).sum() / n, 2)}")
logger.info(f"Pct successful train: {round(np.asarray([graph.y[0] for graph in dataset_tr]).sum() / len(dataset_tr), 2)}")
logger.info(f"Pct successful test: {round(np.asarray([graph.y[0] for graph in dataset_te]).sum() / len(dataset_te), 2)}")
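
The split logic can be checked in isolation. The sketch below reproduces the "two split points" idea of `np.split` with plain list slicing. The helper `split_indices` is hypothetical, and it shuffles with `random.Random` rather than `np.random.RandomState`, so the actual index order differs from the notebook's.

```python
import random


def split_indices(n, train_end=0.7, val_end=0.7, seed=15):
    """Shuffle range(n) and cut it at two points, like np.split(idxs, [a, b])."""
    idxs = list(range(n))
    random.Random(seed).shuffle(idxs)
    split_va = int(train_end * n)
    split_te = int(val_end * n)
    # train = [0, split_va), validation = [split_va, split_te), test = [split_te, n)
    return idxs[:split_va], idxs[split_va:split_te], idxs[split_te:]


tr, va, te = split_indices(10)
print(len(tr), len(va), len(te))

# A non-empty validation split only needs a later second cut point
tr2, va2, te2 = split_indices(10, val_end=0.9)
print(len(tr2), len(va2), len(te2))
```

If the second cut point were smaller than the first, the test slice would start inside the training slice, so test samples would leak into training.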

### 3.3 Build GNN Model
Build the GNN model using the Spektral preferred structure.

In [None]:
class GNN(Model):
    '''
    Graph Neural Network built from Spektral layers, with the Keras `Model` class
    as the parent class.
    '''
    def __init__(self, n_layers):
        '''
        Constructor code for setting up the layers needed for training the model.
        '''
        super().__init__()
        self.conv1 = CrystalConv()
        self.convs = []
        for _ in range(1, n_layers):
            self.convs.append(
                CrystalConv()
            )
        self.pool = GlobalAvgPool()
        self.dense1 = Dense(channels, activation="relu")
        self.dropout = Dropout(0.5)
        self.dense2 = Dense(channels, activation="relu")
        self.dense3 = Dense(n_out, activation="sigmoid")

    def call(self, inputs):
        '''
        Build the neural network.
        '''
        x, a, e, i = inputs
        x = self.conv1([x, a, e])
        for conv in self.convs:
            x = conv([x, a, e])
        x = self.pool([x, i])
        x = self.dense1(x)
        x = self.dropout(x)
        x = self.dense2(x)
        x = self.dropout(x)
        return self.dense3(x)


# Build model
model = GNN(layers)
# Setup the optimizer
optimizer = Adam(learning_rate)
# Set up the logloss function
loss_fn = BinaryCrossentropy()
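
In the model above, `self.pool([x, i])` turns a batch of node embeddings into one vector per graph. In Spektral's disjoint mode all graphs of a batch are stacked into one large graph, and `i` records which graph each node belongs to. The sketch below shows the averaging step in plain Python. The function name `global_avg_pool` is only illustrative, and this is not Spektral's implementation.

```python
def global_avg_pool(x, i):
    """Average node feature vectors per graph.

    x: list of node feature vectors, i: graph index of each node.
    """
    sums, counts = {}, {}
    for features, graph_id in zip(x, i):
        if graph_id not in sums:
            sums[graph_id] = [0.0] * len(features)
            counts[graph_id] = 0
        sums[graph_id] = [s + f for s, f in zip(sums[graph_id], features)]
        counts[graph_id] += 1
    # One averaged vector per graph, ordered by graph index
    return [[s / counts[g] for s in sums[g]] for g in sorted(sums)]


# Three nodes: the first two belong to graph 0, the last one to graph 1
pooled = global_avg_pool([[1, 2], [3, 4], [10, 20]], [0, 0, 1])
print(pooled)
```

Because the pooled vector has a fixed size regardless of the number of nodes, the dense layers that follow can work on graphs of different sizes.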

### 3.4 Fit Model

In [None]:
@tf.function(input_signature=loader_tr.tf_signature(), experimental_relax_shapes=True)
def train_step(inputs, target):
    with tf.GradientTape() as tape:
        predictions = model(inputs, training=True)
        loss = loss_fn(target, predictions) + sum(model.losses)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    return loss


# Print loss at each step of training.
step = loss = 0
for batch in loader_tr:
    step += 1
    loss += train_step(*batch)
    if step == loader_tr.steps_per_epoch:
        step = 0
        print("Loss: {}".format(loss / loader_tr.steps_per_epoch))
        loss = 0

### 4 Evaluate Model Performance

### 4.1 Evaluate ROC-AUC

In [None]:
logger.info("Testing Model...")

y_true = []
y_pred = []

# Add the true values and the predicted values in the list
for batch in loader_te:
    inputs, target = batch
    p = model(inputs, training=False)
    y_true.append(target)
    y_pred.append(p.numpy())

# Calculate the ROC-AUC metric
y_true = np.vstack(y_true)
y_pred = np.vstack(y_pred)
fpr, tpr, threshold = metrics.roc_curve(y_true, y_pred)
roc_auc = metrics.auc(fpr, tpr)

# Plot the ROC w/ True positive rate on the y-axis and False positive rate on the x-axis.
plt.title('Receiver Operating Characteristic')
plt.plot(fpr, tpr, 'b', label = 'AUC = %0.2f' % roc_auc)
plt.legend(loc = 'lower right')
plt.plot([0, 1], [0, 1],'r--')
plt.xlim([0, 1])
plt.ylim([0, 1])
plt.ylabel('True Positive Rate')
plt.xlabel('False Positive Rate')
plt.show()
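
As a cross-check on the metric, ROC-AUC can also be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it that way. `pairwise_auc` is a hypothetical helper for illustration and is far slower than `sklearn.metrics` on large test sets.

```python
def pairwise_auc(y_true, y_score):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


toy_auc = pairwise_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(toy_auc)
```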

### 4.2 Evaluate Calibration
- Calculate Expected Calibration Error
- Plot Calibration Curves

In [None]:
# Reload the dataset in the CounterDataset format.
dataset_c = CounterDataset(og_data = og_data, matrix_type = adj_matrix.value)

# Setup the loader.
loader = DisjointLoader(dataset_c, batch_size=1, epochs=1, shuffle = False)

# Set up an empty pandas dataframe.
ece_df = pd.DataFrame(columns = ['output_pp', 'predicted', 'target', 'result'])

# Compute the predictions and save them in the Pandas DataFrame.
for batch in loader:
    inputs, target = batch
    p = model(inputs, training=False)
    original_prediction = p.numpy()[0][0]
    
    # Threshold set to 0.5
    predicted_value = 1 if original_prediction >= 0.5 else 0 
    ece_df.loc[len(ece_df)] = [original_prediction, predicted_value, target[0][0], 1 if predicted_value == target[0][0] else 0]

##### Expected Calibration Error (ECE)

$$ ECE = \sum_{k = 1}^{K} \frac{|B_k|}{N}  \left| \left( \frac{1}{|B_k|} \sum_{i \in B_k} y_i \right) - \left( \frac{1}{|B_k|} \sum_{i \in B_k} \hat{y}_i \right) \right| $$

We sort the $N$ predictions into $K$ bins by predicted probability. Within each bin we compute the absolute difference between the observed fraction of positive outcomes (the mean of the labels $y_i$) and the average predicted probability (the mean of $\hat{y}_i$). These differences are then averaged, weighted by the share of examples $|B_k| / N$ in each bin. $B_k$ corresponds to the set of examples in the $k$-th bin.
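
The formula can be tried on a handful of predictions. The sketch below is a minimal pure-Python version with equal-width bins. The function `expected_calibration_error` is illustrative only, and it assigns a prediction `p` to bin `int(p * n_bins)`, i.e. bins are closed on the left, which may differ slightly at bin edges from other implementations.

```python
def expected_calibration_error(y_true, y_prob, n_bins=10):
    """ECE with equal-width bins: weighted mean of |mean(y) - mean(p)| per bin."""
    n = len(y_true)
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        k = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes into the last bin
        bins[k].append((y, p))
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_y = sum(y for y, _ in members) / len(members)
        mean_p = sum(p for _, p in members) / len(members)
        ece += len(members) / n * abs(mean_y - mean_p)
    return ece


# Two bins, each off by 0.1, each holding half of the examples
toy_ece = expected_calibration_error([0, 0, 1, 1], [0.1, 0.1, 0.9, 0.9])
print(toy_ece)
```

A model whose predicted probabilities match the observed frequencies in every bin has an ECE of 0.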

In [None]:
# Setting up the bins
bin_ranges = [(0, 0.1), (0.1, 0.2), (0.2, 0.3), (0.3, 0.4), (0.4, 0.5), (0.5, 0.6), (0.6, 0.7), (0.7, 0.8), (0.8, 0.9), (0.9, 1.0)]
bin_calc = pd.DataFrame(columns = ['bin', 'count', 'frac_positive', 'avg_pp', 'abs_gap', 'count_into_abs_gap'])

for i, bin_range in enumerate(bin_ranges):
    # Get the higher and lower end of the bins
    lower, higher = bin_range[0], bin_range[1]
    
    # Get the probability outputs within the range
    bin_calc_temp = ece_df.loc[(ece_df['output_pp'] > lower) & (ece_df['output_pp'] <= higher)]
    count = bin_calc_temp.shape[0]
    
    # Compute parameters needed to calculate ECE 
    if count > 0:
        # Observed fraction of positives in the bin (the mean of y_i in the formula).
        # Using the share of correct 0/1 predictions here would not match the
        # formula for bins below the 0.5 threshold.
        frac_positive = bin_calc_temp['target'].astype(float).mean()
        avg_pp = bin_calc_temp['output_pp'].astype(float).mean()
        abs_gap = abs(frac_positive - avg_pp)

        bin_calc.loc[i] = [bin_range, count, frac_positive, avg_pp, abs_gap, count*abs_gap]
        
# Print ECE value    
print("ECE is : " + str(bin_calc['count_into_abs_gap'].sum() / bin_calc['count'].sum()))

##### Calibration Curves

In [None]:
# Use the sklearn calibration_curve() function to obtain calibration values for the model.
cal_y, cal_x = calibration_curve(ece_df['target'], ece_df['output_pp'], n_bins = 10)

# Plot the calibration curve.
fig, ax = plt.subplots()
plt.plot(cal_x, cal_y, marker = '.')
plt.plot([0, 1], [0, 1], ls = '--', color = 'green', label = 'Ideal calibration')
leg = plt.legend(loc = 'upper left')
plt.xlabel('Average Predicted Probability in each bin')
plt.ylabel('Ratio of positives')
plt.title("Calibration Curve")
plt.show()

### 5. Measure Feature Importance

[Permutation Feature Importance](https://christophm.github.io/interpretable-ml-book/feature-importance.html) ([Altmann & Tolosi, 2010](https://www.researchgate.net/publication/43130914_Permutation_importance_A_corrected_feature_importance_measure)) allows us to identify the importance of each individual feature by measuring the increase in prediction error when breaking the relationship between individual features and the observed result through the application of a random permutation to the feature’s values. In other words, we use the test set to randomly shuffle the values for one feature and recalculate the model error.

To do this we create a new dataset where the features of either the attacking team or the defending team are shuffled.
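
The idea can be demonstrated on a toy problem before applying it to graphs. The sketch below shuffles one feature column across samples and records how much the error grows. Note the differences from the notebook, which are assumptions of this example: the helper `permutation_importance` is hypothetical, the "model" is a fixed rule rather than a trained GNN, the error is the misclassification rate rather than `1 - AUC`, and the shuffle happens across samples rather than among the attacking or defending players within each graph.

```python
import random


def permutation_importance(predict, rows, labels, feature_idx, n_iter=10, seed=0):
    """Return the increase in error for each of n_iter shuffles of one feature."""
    rng = random.Random(seed)

    def error(data):
        wrong = sum(predict(r) != y for r, y in zip(data, labels))
        return wrong / len(labels)

    base_error = error(rows)
    deltas = []
    for _ in range(n_iter):
        column = [r[feature_idx] for r in rows]
        rng.shuffle(column)
        # Rebuild the rows with only the chosen feature permuted
        shuffled = [
            r[:feature_idx] + [v] + r[feature_idx + 1:]
            for r, v in zip(rows, column)
        ]
        deltas.append(error(shuffled) - base_error)
    return deltas


# Toy "model": only feature 0 matters, feature 1 is ignored
toy_predict = lambda r: int(r[0] > 0.5)
toy_rows = [[0.1, 5], [0.9, 3], [0.2, 7], [0.8, 1]]
toy_labels = [0, 1, 0, 1]

deltas_relevant = permutation_importance(toy_predict, toy_rows, toy_labels, feature_idx=0)
deltas_irrelevant = permutation_importance(toy_predict, toy_rows, toy_labels, feature_idx=1)
print(deltas_relevant)
print(deltas_irrelevant)
```

Shuffling a feature the model ignores never changes the error, while shuffling a feature it relies on can only keep or increase the error in this toy setup.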

In [None]:
class ShuffledCounterDataset(Dataset):
    '''
    Convert raw Graph data to a ShuffledCounterDataset with Dataset class from the spektral library to include node features, 
    edge features and the adjacency matrix.
    '''
    def __init__(self, **kwargs):
        '''
        Constructor to load the parameters.
        '''
        self.og_data = kwargs['og_data']
        self.node_feature_shuffle = kwargs['node_feature_shuffle']
        self.player_type = kwargs['player_type']
        self.matrix_type = kwargs['matrix_type']
        self.flag_index = kwargs['flag_index']
        super().__init__(**kwargs)
        
    def read(self):
        '''
        Overriding the read function - to return a list of Graph objects.
        Permuting the features in this function.
        '''
        data = copy.deepcopy(self.og_data)
        data_mat = data[self.matrix_type]
        
        # Check the type of players to be shuffled
        if self.player_type == 'attacking':
            for i in range(len(data_mat['x'])):
                arr = data_mat['x'][i]
                  
                # Get the appropriate type of player
                atts = arr[0:-1][arr[:-1, self.flag_index] == 1].copy()
                defs = arr[0:-1][arr[:-1, self.flag_index] == 0].copy()
                ball = arr[-1].copy()

                # Shuffle the feature requested
                random.shuffle(atts[:, self.node_feature_shuffle])

                # Assign back the shuffled values to the correct place they came from
                arr[0:-1][arr[:-1, self.flag_index] == 1] = atts
                arr[0:-1][arr[:-1, self.flag_index] == 0] = defs
           
        else:
            for i in range(len(data_mat['x'])):
                arr = data_mat['x'][i]
                    
                # Get the appropriate type of player
                atts = arr[0:-1][arr[:-1, self.flag_index] == 1].copy()
                defs = arr[0:-1][arr[:-1, self.flag_index] == 0].copy()
                ball = arr[-1].copy()

                # Shuffle the feature requested
                random.shuffle(defs[:, self.node_feature_shuffle])

                # Assign back the shuffled values to the correct place they came from
                arr[0:-1][arr[:-1, self.flag_index] == 1] = atts
                arr[0:-1][arr[:-1, self.flag_index] == 0] = defs
        
        
        return [
            Graph(x=x, a=a, e=e, y=y) for x, a, e, y in zip(data_mat['x'], data_mat['a'], data_mat['e'], data['binary'])
        ]

#### Select Attack or Defense for Shuffling

In [None]:
# Choose feature importance
print("Choose Player type for testing feature importance:")
player_type = Dropdown(
    options=['Attacking', 'Defending'],
    value='Attacking',
    disabled=False,
)
display(player_type)

#### Shuffle the node features (change `iterations` to set the number of random shuffles)

In [None]:
aucs = [] # Empty list to store the AUC values
feature_dict = {} # Feature dictionary to store the feature change values
flag_found = False # Flag to check if Attacking Team feature is included in training
iterations = 10 # Number of iterations

# Check if Attacking Team Flag exists
for flag_index, feature in enumerate(node_features):
    if feature == 'Attacking Team Flag':
        flag_found = True
        break

# If Attacking Team Flag does not exist - feature importance can't be performed.
if not flag_found:
    print("Attacking team Flag not included in node features. Can't compute feature importance.")
    
else:
    for i in range(len(node_features)): 
        mini_auc = []
        for _ in range(iterations):
            # Get shuffled data
            shuffle_data = ShuffledCounterDataset(og_data = og_data,
                                          node_feature_shuffle = i, player_type = player_type.value.lower(), 
                                          matrix_type = adj_matrix.value, flag_index = flag_index)
            
            # Rebuild the test split with the same seed as the training split in
            # section 3.2 and keep the final 30%, so no training samples are scored here.
            idxs = np.random.RandomState(seed=15).permutation(len(shuffle_data))
            split_te = int(0.7 * len(shuffle_data))
            shuffled_te = shuffle_data[idxs[split_te:]]
            shuffled_loader_te = DisjointLoader(shuffled_te, batch_size=16, epochs=1)

            # Obtain the model metrics
            y_true = []
            y_pred = []

            for batch in shuffled_loader_te:
                inputs, target = batch
                p = model(inputs, training=False)
                y_true.append(target)
                y_pred.append(p.numpy())

            y_true = np.vstack(y_true)
            y_pred = np.vstack(y_pred)
            fpr, tpr, threshold = metrics.roc_curve(y_true, y_pred)
            # Use a separate name so the baseline roc_auc from section 4.1 is not overwritten
            shuffled_auc = metrics.auc(fpr, tpr)
            # Store the AUC's at all iterations
            mini_auc.append(shuffled_auc)

        # Perform the error calculation and store it in the aucs list.
        errors = [1 - auc_1 for auc_1 in mini_auc]
        # Baseline error of the unshuffled model (roc_auc computed in section 4.1)
        main_error = 1 - roc_auc
        errors_ = [100*(error - main_error) for error in errors]
        feature_dict[node_features[i]] = errors_
        aucs.append(sum(mini_auc) / len(mini_auc))

#### Box plot to inspect feature importances

In [None]:
plt.subplots(figsize=(10,25))

box_plot = plt.boxplot(feature_dict.values(),
                               positions=np.array(
    np.arange(len(feature_dict.keys())))*2.0+0.3,
                               widths=0.6, vert = False)
    
    
def define_box_properties(plot_name, color_code = 'black', label = ''):
    '''
    Define Box plot properties.
    '''
    for k, v in plot_name.items():
        plt.setp(plot_name.get(k), color=color_code)
         
    # use plot function to draw a small line to name the legend.
    plt.plot([], c=color_code, label=label)
    plt.legend()
    
    
define_box_properties(box_plot)

plt.yticks(np.arange(0, len(feature_dict.keys()) * 2, 2), feature_dict.keys())
plt.xlabel("Feature Importance")
plt.title('Feature Importance - Box Plot')