# Figure generation for all parameters
This Jupyter notebook generates bar charts showing how often each parameter is covered in the evaluated samples.

## The data
We start by loading the CSV file into a pandas DataFrame.

In [1]:
import pandas as pd

file = 'data/evaluations.csv'
conversion_dict = {'research_type': lambda x: int(x == 'E')}
evaluation_data = pd.read_csv(file, sep=',', header=0, index_col=0, converters=conversion_dict)

print('Samples per conference\n{}'.format(evaluation_data.groupby('conference').size()))

Samples per conference
conference
AAAI 14     100
AAAI 16     100
IJCAI 13    100
IJCAI 16    100
dtype: int64
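
To make the effect of the `converters` argument concrete, here is a minimal sketch on an in-memory CSV. The rows, the `id` column, and the `'T'` code are hypothetical stand-ins, since the real layout of `data/evaluations.csv` is not shown here; only the `read_csv` call mirrors the cell above.

```python
import io
import pandas as pd

# Hypothetical stand-in for data/evaluations.csv (made-up rows and codes)
toy_csv = io.StringIO(
    "id,conference,research_type,train\n"
    "1,AAAI 14,E,1\n"
    "2,AAAI 14,T,0\n"
    "3,IJCAI 16,E,0\n"
)

# Same converter as above: 'E' becomes 1, every other value becomes 0
conversion_dict = {'research_type': lambda x: int(x == 'E')}
toy = pd.read_csv(toy_csv, sep=',', header=0, index_col=0,
                  converters=conversion_dict)

print(toy['research_type'].tolist())
print(toy.groupby('conference').size())
```

The converter receives each cell as a string, so the comparison `x == 'E'` is applied to the raw text before pandas infers a dtype.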


## Generation
We will generate figures for three different categorisations: method, data, and experiment. The categories consist of the following variables: (*method*) problem, objective/goal, research method, research questions, and pseudocode; (*data*) training, validation, test, and results data; (*experiment*) hypothesis, prediction, method source code, hardware specification, software dependencies, experiment setup, and experiment source code.
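
As a quick reference, the grouping can be written down as a mapping from category to DataFrame columns. The column names are the ones used in the plotting cells of this notebook; the dictionary name `CATEGORY_COLUMNS` is our own and only illustrative.

```python
# Category -> columns, following the variable lists above.
# CATEGORY_COLUMNS is a hypothetical helper, not used by the cells below.
CATEGORY_COLUMNS = {
    'method': ['problem_description', 'goal/objective', 'research_method',
               'research_question', 'pseudocode'],
    'data': ['train', 'validation', 'test', 'results'],
    'experiment': ['hypothesis', 'prediction', 'open_source_code',
                   'hardware_specification', 'software_dependencies',
                   'experiment_setup', 'open_experiment_code'],
}

for category, columns in CATEGORY_COLUMNS.items():
    print('{}: {} variables'.format(category, len(columns)))
```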

In [2]:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib

matplotlib.style.use('ggplot')
%matplotlib notebook

colors = matplotlib.cm.get_cmap().colors
len_colors = len(colors)

def plot_bars(data, keys, elements, filename, figsize=(4,4)):
    # Bar height: mean of each column in `elements`, in the given order
    plot_scores = [data[element].mean(axis=0) for element in elements]
    
    fig = plt.figure(figsize=figsize)
    ax = plt.subplot(111)
    
    N = len(plot_scores)
    ind = np.arange(N)
    width = 0.7
    
    plot_colors = colors[0:len_colors:int(len_colors/N)]
    ax.bar(ind+0.5, plot_scores, width, align='center',
            alpha=0.5, color=plot_colors)
    for x, y in zip(ind, plot_scores):
        ax.text(x+0.5, 0.95, '{0:.0%}'.format(y),
                ha='center', va='top', size=12)
    ax.set_xlim(0,N)
    ax.set_xticks(ind+0.5)
    ax.set_xticklabels(keys, rotation=35)
    
    ax.set_ylim(0, 1.0)
    ax.set_yticks([0.25, 0.50, 0.75, 1.0])
    ax.set_yticklabels(['25%', '50%', '75%', '100%'],
            fontdict={'horizontalalignment': 'right'})
    plt.tight_layout()
    plt.savefig('figures/{}.png'.format(filename), format='png',
            bbox_inches='tight')
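
The bar heights and labels in `plot_bars` come from a column mean followed by percent formatting. The sketch below reproduces just that step on made-up data, assuming the columns hold 0/1 indicators, so that the mean is the fraction of samples with a 1.

```python
import pandas as pd

# Hypothetical 0/1 indicators for four samples (not the real evaluation data)
toy = pd.DataFrame({'train': [1, 1, 1, 0],
                    'test':  [0, 1, 0, 0]})
columns = ['train', 'test']

# Same computation as in plot_bars: one mean per column
scores = [toy[column].mean(axis=0) for column in columns]

# Same label format as the text drawn above each bar
labels = ['{0:.0%}'.format(score) for score in scores]
print(labels)
```

Here three of four samples have `train` set and one of four has `test` set, so the labels read 75% and 25%.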

In [3]:
# Keep only samples with research_type == 1, i.e. 'E' in the CSV (see the converter in In [1])
evaluation_data = evaluation_data.groupby('research_type').get_group(1)

keys = ['Results', 'Test', 'Valid-\nation', 'Train']
columns = ['results', 'test', 'validation', 'train'] 
plot_bars(evaluation_data[columns], keys, columns,
        'freq_data', figsize=(4,3))

[Figure: bar chart saved as figures/freq_data.png]

In [4]:
keys = ['Pseudo\ncode', 'Research\nquestion',
        'Research\nmethod', 'Objective/\nGoal', 'Problem']
columns = ['pseudocode', 'research_question',
        'research_method', 'goal/objective', 'problem_description']
plot_bars(evaluation_data[columns], keys, columns,
          'freq_method', figsize=(4,3))

[Figure: bar chart saved as figures/freq_method.png]

In [5]:
keys = ['Exp.\ncode', 'Exp.\nsetup', 'SW\ndep.',
        'HW\nspec.', 'Method\ncode', 'Prediction', 'Hypothesis']
columns = ['open_experiment_code', 'experiment_setup',
        'software_dependencies', 'hardware_specification',
        'open_source_code', 'prediction', 'hypothesis']
experiment_data = evaluation_data[columns]
plot_bars(experiment_data, keys, columns, 'freq_experiment', figsize=(5,3))

[Figure: bar chart saved as figures/freq_experiment.png]
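
A side note on the color selection in `plot_bars`: the stride slice `colors[0:len_colors:int(len_colors/N)]` does not always return exactly `N` entries. The sketch below checks this on a list of 256 integers standing in for colormap entries (we assume a 256-entry palette such as the default colormap). The `even_sample` function is a hypothetical alternative, not part of the notebook.

```python
def stride_sample(palette, n):
    """Sampling as in plot_bars: every int(len/n)-th entry."""
    return palette[0:len(palette):int(len(palette) / n)]


def even_sample(palette, n):
    """Hypothetical alternative: exactly n entries spread over the palette."""
    if n == 1:
        return palette[:1]
    last = len(palette) - 1
    return [palette[i * last // (n - 1)] for i in range(n)]


palette = list(range(256))  # stand-in for 256 colormap entries
for n in (4, 5, 7):
    print(n, len(stride_sample(palette, n)), len(even_sample(palette, n)))
```

For 4, 5, and 7 bars the stride slice returns 4, 6, and 8 entries, while the alternative returns exactly one entry per bar and always includes both ends of the palette.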

## Versions

The cell below prints the software versions used to run this Jupyter notebook.

In [6]:
import IPython
import platform

print('Python version: {}'.format(platform.python_version()))
print('IPython version: {}'.format(IPython.__version__))
print('matplotlib version: {}'.format(matplotlib.__version__))
print('numpy version: {}'.format(np.__version__))
print('pandas version: {}'.format(pd.__version__))

Python version: 3.5.3
IPython version: 6.1.0
matplotlib version: 2.0.2
numpy version: 1.13.1
pandas version: 0.20.3
