In [1]:
%%javascript
// Disable the scrollbar in output cells
IPython.OutputArea.prototype._should_scroll = function(lines) {
    return false;
}

<IPython.core.display.Javascript object>

# Analysis of survey evaluations

This Jupyter notebook examines the evaluations recorded in `data/evaluations.csv` and `data_nn/evaluations_nn.csv`.

The analysis has been modified from the original version to make the functions more general.

The analysis is run twice: once for the original AAAI selection, and once for a
selection from the AI Reproducibility master's thesis research from 2018. The latter selection is grouped by year
rather than by conference, because its papers were selected by year, not by conference.

## The data
We start by loading each CSV file into a [pandas DataFrame](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html) and printing some information on the size and structure of the dataset.

In [2]:
import pandas as pd
pd.options.mode.chained_assignment = None  # default='warn', hides SettingWithCopyWarning

def print_summary(file, groupby_name):
    conversion_dict = {'research_type': lambda x: int(x == 'E')}
    evaluation_data = pd.read_csv(file, sep=',', header=0, index_col=0, converters=conversion_dict)

    print('Samples per {groupby_name}\n{data}'.format(groupby_name=groupby_name,
                                                      data=evaluation_data.groupby(groupby_name).size()
                                                     ),
          end='\n')

    column_headers = evaluation_data.columns.values
    print('\nColumn headers: {}'.format(column_headers))

    return evaluation_data

print("AAAI")
evaluation_data_aaai = print_summary('data/evaluations.csv', 'conference')

print("\nAIReproduction2018:")
evaluation_data_air2018 = print_summary('data_nn/evaluations_nn.csv', 'year')
# The rows in the .csv file are listed alphabetically rather than by index.
# The index itself is rather arbitrary (it follows the order in which the papers were logged),
# but sorting by it gives a stable row order.
evaluation_data_air2018.sort_index(axis=0, inplace=True)



AAAI
Samples per conference
conference
AAAI 14     100
AAAI 16     100
IJCAI 13    100
IJCAI 16    100
dtype: int64

Column headers: ['title' 'research_type' 'result_outcome' 'affiliation'
 'problem_description' 'goal/objective' 'research_method'
 'research_question' 'hypothesis' 'prediction' 'contribution' 'pseudocode'
 'open_source_code' 'open_experiment_code' 'train' 'validation' 'test'
 'results' 'hardware_specification' 'software_dependencies'
 'third_party_citation' 'experiment_setup' 'evaluation_criteria' 'authors'
 'link' 'comments' 'conference']

AIReproduction2018:
Samples per year
year
2012    10
2014    10
2016    10
dtype: int64

Column headers: ['year' 'title' 'research_type' 'result_outcome' 'affiliation'
 'problem_description' 'goal/objective' 'research_method'
 'research_question' 'hypothesis' 'prediction' 'contribution' 'pseudocode'
 'open_source_code' 'open_experiment_code' 'train' 'validation' 'test'
 'results' 'hardware_specification' 'software_dependencies'
 'thir

The AAAI dataset has 400 samples with 27 columns; the AI Reproducibility 2018 dataset has 30 samples and an additional *year* column. Some of these columns are not necessary for further analysis: *title*, *authors*, *link*, *comments*. Dropping these leaves us with a numerical index for each paper, the conference it was published at, and survey-related data. The lambda function above converts the *research_type* data from E (experimental) and T (theoretical) to 1 and 0 respectively, making it easier to work with in pandas.
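To make the effect of the `converters` argument concrete, here is a minimal sketch on an in-memory CSV. The three rows and the name `toy` are made up for illustration and are not taken from `data/evaluations.csv`.

```python
import io
import pandas as pd

# Hypothetical three-row stand-in for the real evaluation file.
csv_text = """index,title,research_type
1,Paper A,E
2,Paper B,T
3,Paper C,E
"""

toy = pd.read_csv(
    io.StringIO(csv_text),
    index_col=0,
    # Same converter as in print_summary: 'E' -> 1, anything else -> 0.
    converters={'research_type': lambda x: int(x == 'E')},
)
print(toy['research_type'].tolist())
```

Note that the converter maps every value other than `'E'` to 0, so a misspelled or empty entry would silently be counted as theoretical.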

In [3]:
from IPython.display import display

def drop_columns(evaluation_data):
    """Drop the columns not used in the analysis (in place) and show the first two rows."""
    evaluation_data.drop(['title', 'authors', 'link', 'comments'], axis=1, inplace=True)
    display(evaluation_data.head(2))

print("AAAI: ")
drop_columns(evaluation_data_aaai)
print("\nAI Reproducibility 2018: ")
drop_columns(evaluation_data_air2018)

AAAI: 


index,research_type,result_outcome,affiliation,problem_description,goal/objective,research_method,research_question,hypothesis,prediction,contribution,...,train,validation,test,results,hardware_specification,software_dependencies,third_party_citation,experiment_setup,evaluation_criteria,conference
1,1,1,0,1,0,0,0,0,0,1,...,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,IJCAI 16
2,0,1,0,0,0,0,0,0,0,0,...,,,,,,,,,,IJCAI 16



AI Reproducibility 2018: 


index,year,research_type,result_outcome,affiliation,problem_description,goal/objective,research_method,research_question,hypothesis,prediction,...,train,validation,test,results,hardware_specification,software_dependencies,third_party_citation,experiment_setup,evaluation_criteria,conference
1,2012,1,1,0,0,0,0,0,0,0,...,0.0,0.0,1.0,0,1,0,1,1,,IEEE Transactions on Pattern Analysis and Mach...
2,2012,1,1,0,1,0,0,0,0,0,...,,,,1,1,0,1,1,,Information Sciences


The two rows of the AAAI table above exemplify an experimental (top row) and a theoretical (bottom row) paper. For theoretical papers, all columns that are specific to experimental papers are NaN. For the *affiliation* column, 0 represents academia, 1 represents collaboration and 2 represents industry authors. The remaining columns are boolean: 1 if documented and 0 if not. Note that some experimental papers have no value (NaN) for training and/or validation data if a train/validation/test split is not applicable.
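Because the row-wise reductions `.all(axis=1)` and `.mean(axis=1)` are used later in this notebook, it helps to see how pandas treats NaN entries by default. The sketch below uses a made-up three-row frame (the row labels and the name `toy` are illustrative assumptions): missing values are skipped, so they neither lower the mean nor make `.all` return False.

```python
import numpy as np
import pandas as pd

# Hypothetical rows: a full split, a test-only paper, and a partial split.
toy = pd.DataFrame(
    {'train': [1.0, np.nan, 0.0],
     'validation': [1.0, np.nan, 1.0],
     'test': [1.0, 1.0, np.nan]},
    index=['full_split', 'test_only', 'partial'],
)

# Row-wise mean over the non-missing entries only (skipna=True by default).
degree = toy.mean(axis=1)

# Row-wise conjunction; NaN entries are skipped as well.
complete = toy.all(axis=1)

print(degree)
print(complete)
```

In this sketch the `test_only` row gets a degree of 1.0 and counts as complete, since its only recorded value is 1. Whether that matches the intended reading of "not applicable" is a judgment call for the analysis.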

## Miscellaneous statistics

In [4]:
def print_misc_stats(evaluation_data, groupby_name):
    print('Samples per affiliation\n{}'.format(evaluation_data.groupby('affiliation').size()
                                              ),
          end='\n\n')
    # Pass the variable, not the string literal 'groupby_name', so the label names the grouping column
    print('Affiliation by {}\n{}'.format(groupby_name,
                                         evaluation_data.groupby([groupby_name, 'affiliation']).size()
                                        ),
          end='\n\n')

    print('Samples per research type\n{}'.format(evaluation_data.groupby('research_type').size()), end='\n\n')
    print('Research type by {}\n{}'.format(groupby_name,
                                           evaluation_data.groupby([groupby_name, 'research_type']).size()
                                          ),
          end='\n\n')

    print('Samples per research outcome\n{}'.format(evaluation_data.groupby('result_outcome').size()), end='\n\n')
    print('Research outcome by {}\n{}'.format(groupby_name,
                                              evaluation_data.groupby([groupby_name, 'result_outcome']).size()
                                             ),
          end='\n\n')

    print('Samples with contribution\n{}'.format(evaluation_data.groupby('contribution').size()), end='\n\n')
    print('Contribution by {}\n{}'.format(groupby_name,
                                          evaluation_data.groupby([groupby_name, 'contribution']).size()
                                         ),
          end='\n\n')

print("AAAI: ")
print_misc_stats(evaluation_data_aaai, 'conference')

print("\n\n\n==========\nAI Reproducibility 2018")
print_misc_stats(evaluation_data_air2018, 'year')

AAAI: 
Samples per affiliation
affiliation
0    331
1     58
2     11
dtype: int64

Affiliation by conference
conference  affiliation
AAAI 14     0              83
            1              14
            2               3
AAAI 16     0              79
            1              17
            2               4
IJCAI 13    0              89
            1              11
IJCAI 16    0              80
            1              16
            2               4
dtype: int64

Samples per research type
research_type
0     75
1    325
dtype: int64

Research type by conference
conference  research_type
AAAI 14     0                15
            1                85
AAAI 16     0                15
            1                85
IJCAI 13    0                29
            1                71
IJCAI 16    0                16
            1                84
dtype: int64

Samples per research outcome
result_outcome
0     23
1    377
dtype: int64

Research outcome by conference
conference  r
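
As an alternative view of the per-group counts printed above, `pd.crosstab` produces a two-way table with one row per group and one column per category. The sketch below runs on a made-up five-row frame (the data and the names `toy` and `table` are illustrative assumptions, not values from the survey).

```python
import pandas as pd

# Hypothetical mini-sample; column names follow the evaluation data.
toy = pd.DataFrame({
    'conference': ['AAAI 14', 'AAAI 14', 'IJCAI 13', 'IJCAI 13', 'IJCAI 13'],
    'affiliation': [0, 1, 0, 0, 2],
})

# Rows: conference, columns: affiliation code, cells: number of papers.
table = pd.crosstab(toy['conference'], toy['affiliation'])
print(table)
```

Unlike `groupby([...]).size()`, which omits combinations that never occur (IJCAI 13 has no row for affiliation 2 in the output above), the cross-tabulation shows such combinations explicitly as 0.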

## Extracting experimental papers
Reproducibility analysis is only relevant for experimental papers, so we keep the experimental papers and discard the theoretical ones.

In [5]:
def extract_experimental_papers(evaluation_data):
    experimental_data = evaluation_data[evaluation_data.research_type == 1]
    return experimental_data

def get_over_time_indices_aaai(experimental_data):
    """Return two boolean masks: early conferences (AAAI 14, IJCAI 13) and late conferences (AAAI 16, IJCAI 16).

    TODO: This is not easily generalizable, so it is done separately for AAAI and AIR2018.
    """
    early_years_index = (experimental_data.conference == 'AAAI 14') | (experimental_data.conference == 'IJCAI 13')
    late_years_index = (experimental_data.conference == 'AAAI 16') | (experimental_data.conference == 'IJCAI 16')
    indices = [early_years_index, late_years_index]
    return indices

def get_over_time_indices_air2018(experimental_data):
    """Return one boolean mask per publication year, in ascending order of year.

    TODO: This is not easily generalizable, so it is done separately for AAAI and AIR2018.
    """
    years = sorted(set(experimental_data['year']))
    indices = [(experimental_data.year == year) for year in years]
    return indices

experimental_data_aaai = extract_experimental_papers(evaluation_data_aaai)
experimental_data_air2018 = extract_experimental_papers(evaluation_data_air2018)

indices_aaai = get_over_time_indices_aaai(experimental_data_aaai)
indices_air2018 = get_over_time_indices_air2018(experimental_data_air2018)
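
The filtered frames above are slices of the original data, and later cells add new columns to them, which is why the `SettingWithCopyWarning` is silenced at the top of the notebook. A possible alternative, sketched here on made-up data (the frame `toy` is an illustrative assumption), is to take an explicit copy after filtering so that new columns can be added without touching the source frame.

```python
import pandas as pd

# Hypothetical mini-sample: two experimental papers (1) and one theoretical (0).
toy = pd.DataFrame(
    {'research_type': [1, 0, 1], 'pseudocode': [1, 0, 0]},
    index=[1, 2, 3],
)

# Boolean-mask filter, followed by an explicit copy of the selected rows.
experimental = toy[toy['research_type'] == 1].copy()

# Adding a column now only affects the copy, not the original frame.
experimental['R3'] = experimental[['pseudocode']].all(axis=1)
print(experimental)
```

This costs some memory for the copy, but it makes the ownership of the new columns explicit.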


## $R3(e) = Method(e)$

In [6]:
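# Only numeric survey variables belong here; the non-numeric 'conference' column
# does not contribute to the row means and breaks .mean(axis=1) on newer pandas versions.
method = ['problem_description','goal/objective','research_method',
        'research_question','pseudocode']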

data = ['train', 'validation', 'test', 'results']

experiment = ['hypothesis', 'prediction',
    'open_source_code', 'open_experiment_code',
    'hardware_specification', 'software_dependencies',
    'experiment_setup', 'evaluation_criteria']

r3_columns = method
r2_columns = method + data
r1_columns = method + data + experiment
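
Since each column list extends the previous one, the three criteria are nested: a paper that satisfies R1 also satisfies R2, and one that satisfies R2 also satisfies R3. The following sketch shows this on a made-up frame with shortened column lists (the subsets `method_cols`, `data_cols`, `exp_cols` and the four rows are illustrative assumptions, not the full lists defined above).

```python
import pandas as pd

# Shortened, hypothetical column groups (two variables per factor).
method_cols = ['problem_description', 'pseudocode']
data_cols = ['train', 'test']
exp_cols = ['open_source_code', 'experiment_setup']

toy = pd.DataFrame(
    [[1, 1, 1, 1, 1, 1],   # documents everything
     [1, 1, 1, 1, 0, 1],   # method and data only
     [1, 1, 0, 1, 0, 0],   # method only
     [0, 1, 0, 0, 0, 0]],  # incomplete method
    columns=method_cols + data_cols + exp_cols,
    index=['a', 'b', 'c', 'd'],
)

# Each criterion is the row-wise conjunction over its column list.
toy['R3'] = toy[method_cols].all(axis=1)
toy['R2'] = toy[method_cols + data_cols].all(axis=1)
toy['R1'] = toy[method_cols + data_cols + exp_cols].all(axis=1)

print(toy[['R3', 'R2', 'R1']])
```

The nesting also means the counts can only shrink from R3 to R2 to R1.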

In [7]:
def calculate_method(experimental_data, over_time_indices, groupby_name):
    #TODO: Make this method instead of R3 for consistency ?
    experimental_data.loc[:, 'R3'] = experimental_data[r3_columns].all(axis=1)
    print('R3(e)\nTotal = {}'.format(experimental_data['R3'].sum()))
    display(experimental_data[['R3', groupby_name]].groupby(groupby_name).sum())


    experimental_data.loc[:, 'R3D'] = experimental_data[r3_columns].mean(axis=1)
    print('\n\nR3D\nTotal: {mean:.4f}, variance = {var:.4f}\nBy {gbn}, followed by variance'
          .format(mean=experimental_data['R3D'].mean(),
                  var=experimental_data['R3D'].var(),
                  gbn=groupby_name
                 )
         )
    display(experimental_data[['R3D', groupby_name]].groupby(groupby_name).mean())
    display(experimental_data[['R3D', groupby_name]].groupby(groupby_name).var())

    print('\ngroup\tR3D\tVariance\n')
    for over_time_index in over_time_indices:
        mean = experimental_data[over_time_index].R3D.mean()
        var = experimental_data[over_time_index].R3D.var()
        
        over_time_selection = experimental_data[over_time_index]
        nametags = list(sorted(set(over_time_selection[groupby_name])))
        name = '/'.join(str(x) for x in nametags)
        print("'{group}'\t{mean:.4f}\t{var:.4f}".format(group=name,
                                                        mean=mean,
                                                        var=var)
             )
        

print('AAAI:\n')
calculate_method(experimental_data_aaai, indices_aaai, 'conference')

print('\n\n==========\nAI Reproducibility 2018:\n')
calculate_method(experimental_data_air2018, indices_air2018, 'year')


AAAI:

R3(e)
Total = 0


conference,R3
AAAI 14,False
AAAI 16,False
IJCAI 13,False
IJCAI 16,False




R3D
Total: 0.2615, variance = 0.0342
By conference, followed by variance


conference,R3D
AAAI 14,0.28
AAAI 16,0.235294
IJCAI 13,0.23662
IJCAI 16,0.290476


conference,R3D
AAAI 14,0.039238
AAAI 16,0.034454
IJCAI 13,0.02664
IJCAI 16,0.034125



group	R3D	Variance

'AAAI 14/IJCAI 13'	0.2603	0.0338
'AAAI 16/IJCAI 16'	0.2627	0.0349


AI Reproducibility 2018:

R3(e)
Total = 0


year,R3
2012,False
2014,False
2016,False




R3D
Total: 0.2000, variance = 0.0359
By year, followed by variance


year,R3D
2012,0.24
2014,0.16
2016,0.2


year,R3D
2012,0.060444
2014,0.024889
2016,0.026667



group	R3D	Variance

'2012'	0.2400	0.0604
'2014'	0.1600	0.0249
'2016'	0.2000	0.0267
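
When reading the variance figures above, note that pandas' `.var()` computes the sample variance (`ddof=1`) by default, for Series as well as for grouped data. A small sketch with made-up scores (the values and variable names are illustrative assumptions) shows the difference from the population variance.

```python
import pandas as pd

# Hypothetical degree scores for four papers.
scores = pd.Series([0.2, 0.4, 0.0, 0.2])

# Default: sample variance, dividing by n - 1.
sample_var = scores.var()

# Population variance, dividing by n.
population_var = scores.var(ddof=0)

print('sample: {:.4f}, population: {:.4f}'.format(sample_var, population_var))
```

For small groups, such as the ten papers per year in the AI Reproducibility 2018 selection, the two definitions differ noticeably.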


## $R2(e) = Method(e) \land Data(e)$

In [8]:
def calculate_data(experimental_data, over_time_indices, groupby_name):
    experimental_data.loc[:, 'Data'] = experimental_data[data].all(axis=1)
    print('Data(e)\nTotal = {:}'.format(experimental_data['Data'].sum()))
    display(experimental_data[['Data', groupby_name]].groupby(groupby_name).sum())

    experimental_data.loc[:, 'DataD'] = experimental_data[data].mean(axis=1)
    print('\n\nDataDegree(e)\nTotal: {mean:.4f}, variance = {var:.4f}\nBy {gbn}, followed by variance'
          .format(mean=experimental_data['DataD'].mean(),
                  var=experimental_data['DataD'].var(),
                  gbn=groupby_name
                 )
         )
    display(experimental_data[['DataD', groupby_name]].groupby(groupby_name).mean())
    display(experimental_data[['DataD', groupby_name]].groupby(groupby_name).var())

    print('\ngroup\tDataD\tVariance\n')
    for over_time_index in over_time_indices:
        mean = experimental_data[over_time_index].DataD.mean()
        var = experimental_data[over_time_index].DataD.var()

        over_time_selection = experimental_data[over_time_index]
        nametags = list(sorted(set(over_time_selection[groupby_name])))
        name = '/'.join(str(x) for x in nametags)
        print("'{group}'\t{mean:.4f}\t{var:.4f}".format(group=name,
                                                        mean=mean,
                                                        var=var)
             )
        
def calculate_R2(experimental_data, over_time_indices, groupby_name):
    experimental_data.loc[:, 'R2'] = experimental_data[r2_columns].all(axis=1)
    print('R2(e)\nTotal = {:}'.format(experimental_data['R2'].sum()))
    display(experimental_data[['R2', groupby_name]].groupby(groupby_name).sum())

    experimental_data.loc[:, 'R2D'] = experimental_data[r2_columns].mean(axis=1)
    print('\n\nR2D(e)\nTotal: {mean:.4f}, variance = {var:.4f}\nBy {gbn}, followed by variance'
          .format(mean=experimental_data['R2D'].mean(),
                  var=experimental_data['R2D'].var(),
                  gbn=groupby_name
                 )
         )
    display(experimental_data[['R2D', groupby_name]].groupby(groupby_name).mean())
    display(experimental_data[['R2D', groupby_name]].groupby(groupby_name).var())

    print('\ngroup\tR2D\tVariance\n')
    for over_time_index in over_time_indices:
        mean = experimental_data[over_time_index].R2D.mean()
        var = experimental_data[over_time_index].R2D.var()

        over_time_selection = experimental_data[over_time_index]
        nametags = list(sorted(set(over_time_selection[groupby_name])))
        name = '/'.join(str(x) for x in nametags)
        print("'{group}'\t{mean:.4f}\t{var:.4f}".format(group=name,
                                                        mean=mean,
                                                        var=var)
             )


print("===DATA===\n")

print('AAAI:\n')
calculate_data(experimental_data_aaai, indices_aaai, 'conference')
print('\n\n==========\nAI Reproducibility 2018:\n')
calculate_data(experimental_data_air2018, indices_air2018, 'year')

print("\n\n===R2===\n")

print('AAAI:\n')
calculate_R2(experimental_data_aaai, indices_aaai, 'conference')
print('\n\n==========\nAI Reproducibility 2018:\n')
calculate_R2(experimental_data_air2018, indices_air2018, 'year')




===DATA===

AAAI:

Data(e)
Total = 9


conference,Data
AAAI 14,2.0
AAAI 16,1.0
IJCAI 13,0.0
IJCAI 16,6.0




DataDegree(e)
Total: 0.2287, variance = 0.0763
By conference, followed by variance


conference,DataD
AAAI 14,0.202941
AAAI 16,0.261765
IJCAI 13,0.131455
IJCAI 16,0.303571


conference,DataD
AAAI 14,0.064723
AAAI 16,0.068312
IJCAI 13,0.048346
IJCAI 16,0.107035



group	DataD	Variance

'AAAI 14/IJCAI 13'	0.1704	0.0582
'AAAI 16/IJCAI 16'	0.2825	0.0875


AI Reproducibility 2018:

Data(e)
Total = 1


year,Data
2012,True
2014,False
2016,False




DataDegree(e)
Total: 0.2917, variance = 0.1167
By year, followed by variance


year,DataD
2012,0.25
2014,0.475
2016,0.15


year,DataD
2012,0.097222
2014,0.117361
2016,0.1



group	DataD	Variance

'2012'	0.2500	0.0972
'2014'	0.4750	0.1174
'2016'	0.1500	0.1000


===R2===

AAAI:

R2(e)
Total = 0


conference,R2
AAAI 14,False
AAAI 16,False
IJCAI 13,False
IJCAI 16,False




R2D(e)
Total: 0.2525, variance = 0.0251
By conference, followed by variance


conference,R2D
AAAI 14,0.254972
AAAI 16,0.247246
IJCAI 13,0.204924
IJCAI 16,0.295517


conference,R2D
AAAI 14,0.023186
AAAI 16,0.02305
IJCAI 13,0.017109
IJCAI 16,0.033059



group	R2D	Variance

'AAAI 14/IJCAI 13'	0.2322	0.0209
'AAAI 16/IJCAI 16'	0.2712	0.0284


AI Reproducibility 2018:

R2(e)
Total = 0


year,R2
2012,False
2014,False
2016,False




R2D(e)
Total: 0.2426, variance = 0.0310
By year, followed by variance


year,R2D
2012,0.25
2014,0.3
2016,0.177778


year,R2D
2012,0.035837
2014,0.019342
2016,0.036214



group	R2D	Variance

'2012'	0.2500	0.0358
'2014'	0.3000	0.0193
'2016'	0.1778	0.0362


## $R1(e) = Method(e) \land Data(e) \land Exp(e)$

In [9]:
def calculate_exp(experimental_data, over_time_indices, groupby_name):
    experimental_data.loc[:, 'Exp'] = experimental_data[experiment].all(axis=1)
    print('Exp(e)\nTotal = {:}'.format(experimental_data['Exp'].sum()))
    display(experimental_data[['Exp', groupby_name]].groupby(groupby_name).sum())

    experimental_data.loc[:, 'ExpD'] = experimental_data[experiment].mean(axis=1)
    print('\n\nExpDegree(e)\nTotal: {mean:.4f}, variance = {var:.4f}\nBy {gbn}, followed by variance'
          .format(mean=experimental_data['ExpD'].mean(),
                  var=experimental_data['ExpD'].var(),
                  gbn=groupby_name
                 )
         )
    display(experimental_data[['ExpD', groupby_name]].groupby(groupby_name).mean())
    display(experimental_data[['ExpD', groupby_name]].groupby(groupby_name).var())

    print('\ngroup\tExpD\tVariance\n')
    for over_time_index in over_time_indices:
        mean = experimental_data[over_time_index].ExpD.mean()
        var = experimental_data[over_time_index].ExpD.var()

        over_time_selection = experimental_data[over_time_index]
        nametags = list(sorted(set(over_time_selection[groupby_name])))
        name = '/'.join(str(x) for x in nametags)
        print("'{group}'\t{mean:.4f}\t{var:.4f}".format(group=name,
                                                        mean=mean,
                                                        var=var)
             )

def calculate_R1(experimental_data, over_time_indices, groupby_name):
    experimental_data.loc[:, 'R1'] = experimental_data[r1_columns].all(axis=1)
    print('R1(e)\nTotal = {:}'.format(experimental_data['R1'].sum()))
    display(experimental_data[['R1', groupby_name]].groupby(groupby_name).sum())

    experimental_data.loc[:, 'R1D'] = experimental_data[r1_columns].mean(axis=1)
    print('\n\nR1D(e)\nTotal: {mean:.4f}, variance = {var:.4f}\nBy {gbn}, followed by variance'
          .format(mean=experimental_data['R1D'].mean(),
                  var=experimental_data['R1D'].var(),
                  gbn=groupby_name
                 )
         )
    display(experimental_data[['R1D', groupby_name]].groupby(groupby_name).mean())
    display(experimental_data[['R1D', groupby_name]].groupby(groupby_name).var())

    print('\ngroup\tR1D\tVariance\n')
    for over_time_index in over_time_indices:
        mean = experimental_data[over_time_index].R1D.mean()
        var = experimental_data[over_time_index].R1D.var()

        over_time_selection = experimental_data[over_time_index]
        nametags = list(sorted(set(over_time_selection[groupby_name])))
        name = '/'.join(str(x) for x in nametags)
        print("'{group}'\t{mean:.4f}\t{var:.4f}".format(group=name,
                                                        mean=mean,
                                                        var=var)
             )


        
print("===EXPERIMENT===\n")

print('AAAI:\n')
calculate_exp(experimental_data_aaai, indices_aaai, 'conference')
print('\n\n==========\nAI Reproducibility 2018:\n')
calculate_exp(experimental_data_air2018, indices_air2018, 'year')

print("\n\n===R1===\n")

print('AAAI:\n')
calculate_R1(experimental_data_aaai, indices_aaai, 'conference')
print('\n\n==========\nAI Reproducibility 2018:\n')
calculate_R1(experimental_data_air2018, indices_air2018, 'year')


===EXPERIMENT===

AAAI:

Exp(e)
Total = 0


conference,Exp
AAAI 14,False
AAAI 16,False
IJCAI 13,False
IJCAI 16,False




ExpDegree(e)
Total: 0.2235, variance = 0.0219
By conference, followed by variance


conference,ExpD
AAAI 14,0.172059
AAAI 16,0.214706
IJCAI 13,0.197183
IJCAI 16,0.306548


conference,ExpD
AAAI 14,0.018592
AAAI 16,0.018085
IJCAI 13,0.019046
IJCAI 16,0.02199



group	ExpD	Variance

'AAAI 14/IJCAI 13'	0.1835	0.0188
'AAAI 16/IJCAI 16'	0.2604	0.0220


AI Reproducibility 2018:

Exp(e)
Total = 0


year,Exp
2012,False
2014,False
2016,False




ExpDegree(e)
Total: 0.2000, variance = 0.0234
By year, followed by variance


year,ExpD
2012,0.271429
2014,0.171429
2016,0.157143


year,ExpD
2012,0.015646
2014,0.012698
2016,0.038322



group	ExpD	Variance

'2012'	0.2714	0.0156
'2014'	0.1714	0.0127
'2016'	0.1571	0.0383


===R1===

AAAI:

R1(e)
Total = 0


conference,R1
AAAI 14,False
AAAI 16,False
IJCAI 13,False
IJCAI 16,False




R1D(e)
Total: 0.2383, variance = 0.0140
By conference, followed by variance


conference,R1D
AAAI 14,0.213408
AAAI 16,0.231972
IJCAI 13,0.200977
IJCAI 16,0.301436


conference,R1D
AAAI 14,0.010715
AAAI 16,0.0115
IJCAI 13,0.008913
IJCAI 16,0.018873



group	R1D	Variance

'AAAI 14/IJCAI 13'	0.2078	0.0099
'AAAI 16/IJCAI 16'	0.2665	0.0163


AI Reproducibility 2018:

R1(e)
Total = 0


year,R1
2012,False
2014,False
2016,False




R1D(e)
Total: 0.2224, variance = 0.0177
By year, followed by variance


year,R1D
2012,0.254808
2014,0.24375
2016,0.16875


year,R1D
2012,0.012815
2014,0.006467
2016,0.03303



group	R1D	Variance

'2012'	0.2548	0.0128
'2014'	0.2437	0.0065
'2016'	0.1688	0.0330
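
The `calculate_*` functions above all follow the same pattern: a row-wise conjunction, a row-wise mean, and a per-group mean and variance. As a rough sketch of how that pattern could be factored out, the hypothetical helper `summarize_metric` below (not part of the original notebook) takes the column list and metric name as parameters and returns the results instead of printing them. The toy data is made up for illustration.

```python
import pandas as pd

def summarize_metric(frame, columns, name, groupby_name):
    """Hypothetical helper: add `name` (all documented) and `name + 'D'`
    (degree) to a copy of `frame`, and return per-group mean and variance."""
    result = frame.copy()
    result[name] = result[columns].all(axis=1)
    result[name + 'D'] = result[columns].mean(axis=1)
    per_group = result.groupby(groupby_name)[name + 'D'].agg(['mean', 'var'])
    return result, per_group

# Made-up mini-sample with two data variables.
toy = pd.DataFrame({
    'year': [2012, 2012, 2014, 2014],
    'train': [1, 0, 1, 1],
    'test': [1, 0, 1, 0],
})

scored, per_year = summarize_metric(toy, ['train', 'test'], 'Data', 'year')
print(scored[['Data', 'DataD']])
print(per_year)
```

Returning data frames rather than printing keeps the computation separate from the presentation, so the same helper could serve the Method, Data, Exp, R1, R2 and R3 variants by changing only its arguments.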


## Versions
Here's a generated output to keep track of software versions used to run this Jupyter notebook.

In [10]:
import IPython
import platform

print('Python version: {}'.format(platform.python_version()))
print('IPython version: {}'.format(IPython.__version__))
print('pandas version: {}'.format(pd.__version__))

Python version: 3.6.5
IPython version: 5.4.1
pandas version: 0.23.0
