## Fetch experiments data from Neptune using [Query API](https://docs.neptune.ai/python-api/query-api.html)

This notebook uses Neptune's Query API, a set of Python methods for fetching experiment data from Neptune.

We fetch all experiment runs in a project and then compare them.

## Common methods of the Query API
This notebook covers the most common methods:

1. [get_experiments()](https://docs.neptune.ai/neptune-client/docs/project.html#neptune.projects.Project.get_experiments) - get a list of [Experiment objects](https://docs.neptune.ai/neptune-client/docs/experiment.html). We need them to fetch data from selected experiments.
1. [get_leaderboard()](https://docs.neptune.ai/neptune-client/docs/project.html#neptune.projects.Project.get_leaderboard) - get the experiments table as a pandas DataFrame. An example experiments table is [here](https://ui.neptune.ai/o/USERNAME/org/example-project/experiments?viewId=6013ecbc-416d-4e5c-973e-871e5e9010e9).
1. [get_hardware_utilization()](https://docs.neptune.ai/neptune-client/docs/experiment.html#neptune.experiments.Experiment.get_hardware_utilization) - for a given Experiment, get hardware utilization metrics as a pandas DataFrame ([example metrics](https://ui.neptune.ai/o/USERNAME/org/example-project/e/HELLO-177/monitoring)).
1. [get_logs()](https://docs.neptune.ai/neptune-client/docs/experiment.html#neptune.experiments.Experiment.get_logs) - get a dict whose keys are log names and whose values are Channel objects.
1. [get_numeric_channels_values()](https://docs.neptune.ai/neptune-client/docs/experiment.html#neptune.experiments.Experiment.get_numeric_channels_values) - get the values of numeric logs as a pandas DataFrame ([example logs](https://ui.neptune.ai/o/USERNAME/org/example-project/e/HELLO-177/charts)).

In [3]:
%matplotlib inline
%load_ext autoreload
%autoreload 2

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [1]:
from utils.tokens import NEPTUNE_API_TOKEN
import neptune
from scipy.stats import hmean
import pandas as pd
import plotly.express as px
import tqdm.notebook  # loads the submodule so that tqdm.notebook.tqdm resolves

# Set project to work with (as usual)

In [2]:
project = neptune.init('createrandom/mus-coral',
                       api_token=NEPTUNE_API_TOKEN)



# Visualize metrics

Fetch every experiment in the project with `get_experiments()`.

In [3]:
attribute = 'Class'
#problem_type = problem_kind[attribute]
#print(problem_type)
# Get all Experiment objects in the project (no filters are passed here).
experiments = project.get_experiments()

In [4]:
len(experiments)

82

In [5]:
problem_type = 'binary'
machines = ['ESAOTE_6100/val/', 'Philips_iU22/val/']
metric_mapping = {'regression': ['mae'],
              'binary': ['auc']}

metric_list = metric_mapping[problem_type]

logs_names = []
for machine in machines:
    for metric in metric_list:
        final_metric = machine + metric + '_Class'
        logs_names.append(final_metric)
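The nested loop above builds one channel name per machine and metric. The same list can be sketched with `itertools.product`; `build_log_names` is a hypothetical helper introduced only for illustration and is not part of the notebook or of the Neptune client.

```python
from itertools import product


def build_log_names(machines, metrics, attribute):
    """Return '<machine><metric>_<attribute>' for every machine/metric pair."""
    return [f"{machine}{metric}_{attribute}"
            for machine, metric in product(machines, metrics)]


machines = ['ESAOTE_6100/val/', 'Philips_iU22/val/']
logs_names = build_log_names(machines, ['auc'], 'Class')
print(logs_names)
```

The ordering matches the loop: machines vary slowest, metrics fastest.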


In [6]:
logs_names.append('training/mean_coral_Class')
logs_names

In [None]:
experiments


[Experiment(CORAL-14),
 Experiment(CORAL-15),
 Experiment(CORAL-16),
 Experiment(CORAL-17),
 Experiment(CORAL-18),
 Experiment(CORAL-19),
 Experiment(CORAL-47),
 Experiment(CORAL-48),
 Experiment(CORAL-49),
 Experiment(CORAL-50),
 Experiment(CORAL-51),
 Experiment(CORAL-52),
 Experiment(CORAL-53),
 Experiment(CORAL-54),
 Experiment(CORAL-55),
 Experiment(CORAL-56),
 Experiment(CORAL-57),
 Experiment(CORAL-58),
 Experiment(CORAL-59),
 Experiment(CORAL-60),
 Experiment(CORAL-61),
 Experiment(CORAL-62),
 Experiment(CORAL-63),
 Experiment(CORAL-64),
 Experiment(CORAL-65),
 Experiment(CORAL-66),
 Experiment(CORAL-67),
 Experiment(CORAL-68),
 Experiment(CORAL-69),
 Experiment(CORAL-70),
 Experiment(CORAL-71),
 Experiment(CORAL-72),
 Experiment(CORAL-73),
 Experiment(CORAL-74),
 Experiment(CORAL-75),
 Experiment(CORAL-76),
 Experiment(CORAL-77),
 Experiment(CORAL-78),
 Experiment(CORAL-79),
 Experiment(CORAL-80),
 Experiment(CORAL-81),
 Experiment(CORAL-82),
 Experiment(CORAL-83),
 Experiment

In [66]:
philips_without_coral = experiments[-3].get_numeric_channels_values(*logs_names)
philips_with_coral = experiments[-1].get_numeric_channels_values(*logs_names)

In [69]:
philips_merged = philips_with_coral.merge(philips_without_coral, on='x', suffixes=('_w', '_w/o'))
pm = philips_merged.rename(columns={'ESAOTE_6100/val/auc_Class_w': 'CORAL',
                                    'ESAOTE_6100/val/auc_Class_w/o': 'MIL'})
px.line(pm, x='x', y=['CORAL', 'MIL'], range_y=[0, 1],
        labels={'value': 'Validation AUC', 'variable': 'Condition', 'x': 'Epoch'})
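Because both runs log the same channel names, merging on the step column `x` needs suffixes to keep the two sets of columns apart. A minimal sketch on made-up numbers (the frames and the `val/auc` column name are illustrative, not data from the project):

```python
import pandas as pd

# Made-up values for two runs that logged the same channel at the same steps.
with_coral = pd.DataFrame({'x': [1.0, 2.0], 'val/auc': [0.65, 0.68]})
without_coral = pd.DataFrame({'x': [1.0, 2.0], 'val/auc': [0.58, 0.59]})

# Columns that exist in both frames (other than the join key) get a suffix:
# '_w' for the left frame and '_w/o' for the right frame.
merged = with_coral.merge(without_coral, on='x', suffixes=('_w', '_w/o'))
print(list(merged.columns))
```

The default join is an inner join, so steps logged by only one of the two runs are dropped.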

In [70]:
philips_merged

Unnamed: 0,x,ESAOTE_6100/val/auc_Class_w,Philips_iU22/val/auc_Class_w,training/mean_coral_Class_w,ESAOTE_6100/val/auc_Class_w/o,Philips_iU22/val/auc_Class_w/o,training/mean_coral_Class_w/o
0,1.0,0.65089,0.679983,0.001535,0.584455,0.769865,0.003903
1,2.0,0.684532,0.762174,0.001268,0.591721,0.79,0.004115
2,3.0,0.713043,0.731771,0.001327,0.668696,0.751302,0.006941
3,4.0,0.560784,0.765089,0.001063,0.445752,0.699522,0.009052
4,5.0,0.488261,0.785931,0.001057,0.576087,0.749457,0.011405
5,6.0,0.48671,0.75597,0.001231,0.5878,0.693878,0.019008
6,7.0,0.579085,0.768997,0.001203,0.64488,0.703865,0.033346
7,8.0,0.564783,0.645833,0.001128,0.604783,0.734375,0.028849
8,9.0,0.688811,0.70343,0.001989,0.72465,0.716891,0.041753
9,10.0,0.492174,0.785156,0.001267,0.724783,0.724826,0.06268


In [71]:
pm = philips_merged.rename(columns={'training/mean_coral_Class_w': 'CORAL',
                                    'training/mean_coral_Class_w/o': 'MIL'})
px.line(pm, x='x', y=['CORAL', 'MIL'],
        labels={'value': 'Mean CORAL loss', 'variable': 'Condition', 'x': 'Epoch'})


In [55]:
frames = []
for experiment in tqdm.notebook.tqdm(experiments):
    df = experiment.get_numeric_channels_values(*logs_names)  # one row per logged step ('x')
    # df['tags'] = experiment.get_tags()
    # params = experiment.get_parameters()
    df.insert(loc=0, column='id', value=experiment.id)  # tag each row with its experiment id
    frames.append(df)

# Concatenate once at the end; DataFrame.append was removed in pandas 2.0.
metrics_df = pd.concat(frames, sort=True)






In [51]:
# This cell expects precision ('<machine>/val/p') and recall ('<machine>/val/r')
# channels in metrics_df; they are not among the logs_names defined above.
def compute_f1_esaote(entry):
    # F1 is the harmonic mean of precision and recall
    return hmean([entry['ESAOTE_6100/val/p'], entry['ESAOTE_6100/val/r']])

def compute_f1_philips(entry):
    return hmean([entry['Philips_iU22/val/p'], entry['Philips_iU22/val/r']])

if problem_type == 'binary':
    metrics_df['ESAOTE_6100/val/f1'] = metrics_df.apply(compute_f1_esaote, axis=1)
    metrics_df['Philips_iU22/val/f1'] = metrics_df.apply(compute_f1_philips, axis=1)
    # metrics_df['val_f1_gap'] = metrics_df['ESAOTE_6100/val/f1'] - metrics_df['Philips_iU22/val/f1']
else:
    metrics_df['val_mae_gap'] = metrics_df['ESAOTE_6100/val/mae'] - metrics_df['Philips_iU22/val/mae']

Unnamed: 0,ESAOTE_6100/val/accuracy,ESAOTE_6100/val/p,ESAOTE_6100/val/r,Philips_iU22/val/accuracy,Philips_iU22/val/p,Philips_iU22/val/r,id,epoch,ESAOTE_6100/val/f1,Philips_iU22/val/f1
0,0.77451,0.785047,0.785047,0.50463,0.528205,0.872881,MUS1-464,1.0,0.785047,0.658147
1,0.769608,0.8,0.747664,0.638889,0.80303,0.449153,MUS1-464,2.0,0.772947,0.576087
2,0.764706,0.790476,0.761468,0.611111,0.646552,0.635593,MUS1-464,3.0,0.775701,0.641026
3,0.794118,0.761905,0.888889,0.560185,0.567251,0.822034,MUS1-464,4.0,0.820513,0.67128
4,0.789216,0.762295,0.869159,0.583333,0.584337,0.822034,MUS1-464,5.0,0.812227,0.683099
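The F1 columns in the table above are the harmonic mean of the precision (`p`) and recall (`r`) columns. A small sketch using the standard library's `statistics.harmonic_mean` in place of `scipy.stats.hmean`; `f1_score` is a hypothetical helper name.

```python
from statistics import harmonic_mean


def f1_score(precision, recall):
    """F1 as the harmonic mean of precision and recall: 2pr / (p + r)."""
    return harmonic_mean([precision, recall])


# ESAOTE_6100 precision and recall for MUS1-464 at epoch 2, taken from the table above.
f1 = f1_score(0.8, 0.747664)
print(round(f1, 6))
```

The result can be checked against the `ESAOTE_6100/val/f1` value in the same row of the table.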


In [56]:
metrics_df.rename(columns={'x': 'epoch'},inplace=True)
metrics_df.head(n=5)

Unnamed: 0,ESAOTE_6100/val/auc_Class,Philips_iU22/val/auc_Class,id,training/mean_coral_Class,epoch
0,0.720799,0.368215,CORAL-14,0.000621,1.0
1,0.772113,0.479565,CORAL-14,0.000672,2.0
2,0.791739,0.549479,CORAL-14,0.000547,3.0
3,0.705882,0.755102,CORAL-14,0.000582,4.0
4,0.810435,0.683891,CORAL-14,0.000513,5.0


In [28]:
# grab the best scoring epoch for each experiment
if problem_type == 'binary':
    best_scores = metrics_df.sort_values(['ESAOTE_6100/val/auc_Class'], ascending=[False]).groupby('id').first()
else:
    best_scores = metrics_df.sort_values(['ESAOTE_6100/val/mae'], ascending=[True]).groupby('id').first()
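The sort-then-`first()` pattern keeps one row per experiment: the one that sorts to the top. A sketch on made-up scores, with `idxmax` as an alternative (the experiment ids `A` and `B` and the `auc` column are illustrative only):

```python
import pandas as pd

# Made-up per-epoch scores for two hypothetical experiments.
scores = pd.DataFrame({
    'id': ['A', 'A', 'B', 'B'],
    'epoch': [1.0, 2.0, 1.0, 2.0],
    'auc': [0.70, 0.80, 0.90, 0.60],
})

# Sort so the best row of every group comes first, then keep that row.
best_sorted = scores.sort_values('auc', ascending=False).groupby('id').first()

# Alternative: look up the row label of the maximum within each group.
best_idxmax = scores.loc[scores.groupby('id')['auc'].idxmax()].set_index('id')

print(best_sorted)
```

One caveat: `GroupBy.first()` returns the first non-null value of each column separately, so if some channels contain NaN, the resulting row may combine values from different epochs. The `idxmax` variant always returns a single original row.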

In [29]:
all_data = project.get_leaderboard(tag=attribute).set_index('id').convert_dtypes()
metrics_df['id'] = metrics_df['id'].astype(str)
plot_frame = best_scores.join(all_data)
plot_frame.head(n=10)

id,ESAOTE_6100/val/auc_Class,Philips_iU22/val/auc_Class,training/mean_coral_Class,epoch,name,created,finished,owner,notes,running_time,...,parameter_n_epochs,parameter_n_params_backend,parameter_n_params_classifier,parameter_n_params_pooling,parameter_prediction_target,parameter_problem_type,parameter_source_train,parameter_target_train,parameter_use_pseudopatients,parameter_val
CORAL-71,0.818301,0.690404,0.000245,7.0,Untitled,2020-09-08 20:12:57.019000+00:00,2020-09-08 20:44:31.989000+00:00,createrandom,,1894,...,15.0,11176512.0,165121.0,65793.0,Class,bag,ESAOTE_6100_train,Philips_iU22_train,True,"['ESAOTE_6100_val', 'Philips_iU22_val']"


In [65]:
test_frame = metrics_df.join(all_data,on='id')
# filter on machine
is_layer_of_interest = test_frame['parameter_layers_to_compute_da_on'] == '[2]'
is_machine = test_frame['parameter_source_train'] == 'ESAOTE_6100_train'
filtered = test_frame[is_machine & is_layer_of_interest]
filtered.sort_values(by=['Philips_iU22/val/auc_Class','training/mean_coral_Class'],ascending=[False, True])
#filtered.sort_values(by=['training/mean_coral_Class'],ascending=[True])

Unnamed: 0,ESAOTE_6100/val/auc_Class,Philips_iU22/val/auc_Class,id,training/mean_coral_Class,epoch,name,created,finished,owner,notes,...,parameter_n_epochs,parameter_n_params_backend,parameter_n_params_classifier,parameter_n_params_pooling,parameter_prediction_target,parameter_problem_type,parameter_source_train,parameter_target_train,parameter_use_pseudopatients,parameter_val
2,0.776522,0.812066,CORAL-96,0.001860,3.0,Untitled,2020-09-09 00:16:08.041000+00:00,2020-09-09 00:45:31.174000+00:00,createrandom,,...,15.0,1.1176512E7,165121.0,65793.0,Class,bag,ESAOTE_6100_train,Philips_iU22_train,True,"['ESAOTE_6100_val', 'Philips_iU22_val']"
8,0.826486,0.809813,CORAL-54,0.090980,9.0,Untitled,2020-08-29 13:56:52.617000+00:00,2020-08-29 14:26:15.749000+00:00,createrandom,,...,15.0,1.1176512E7,165121.0,65793.0,Class,bag,ESAOTE_6100_train,Philips_iU22_train,True,"['ESAOTE_6100_val', 'Philips_iU22_val']"
4,0.750000,0.808511,CORAL-100,0.001288,5.0,Untitled,2020-09-09 00:46:43.623000+00:00,2020-09-09 01:15:54.803000+00:00,createrandom,,...,15.0,1.1176512E7,165121.0,65793.0,Class,bag,ESAOTE_6100_train,Philips_iU22_train,True,"['ESAOTE_6100_val', 'Philips_iU22_val']"
6,0.827015,0.808076,CORAL-54,0.044692,7.0,Untitled,2020-08-29 13:56:52.617000+00:00,2020-08-29 14:26:15.749000+00:00,createrandom,,...,15.0,1.1176512E7,165121.0,65793.0,Class,bag,ESAOTE_6100_train,Philips_iU22_train,True,"['ESAOTE_6100_val', 'Philips_iU22_val']"
1,0.742048,0.803913,CORAL-104,0.003109,2.0,Untitled,2020-09-09 01:46:17.852000+00:00,2020-09-09 02:15:54.135000+00:00,createrandom,,...,15.0,1.1176512E7,165121.0,65793.0,Class,bag,ESAOTE_6100_train,Philips_iU22_train,True,"['ESAOTE_6100_val', 'Philips_iU22_val']"
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
0,0.745983,0.336518,CORAL-55,0.013752,1.0,Untitled,2020-08-29 13:58:20.191000+00:00,2020-08-29 14:28:31.433000+00:00,createrandom,,...,15.0,1.1176512E7,165121.0,65793.0,Class,bag,ESAOTE_6100_train,Philips_iU22_train,True,"['ESAOTE_6100_val', 'Philips_iU22_val']"
0,0.709075,0.328702,CORAL-98,0.001081,1.0,Untitled,2020-09-09 00:46:12.699000+00:00,2020-09-09 01:15:39.785000+00:00,createrandom,,...,15.0,1.1176512E7,165121.0,65793.0,Class,bag,ESAOTE_6100_train,Philips_iU22_train,True,"['ESAOTE_6100_val', 'Philips_iU22_val']"
8,0.812937,0.244464,CORAL-96,0.002683,9.0,Untitled,2020-09-09 00:16:08.041000+00:00,2020-09-09 00:45:31.174000+00:00,createrandom,,...,15.0,1.1176512E7,165121.0,65793.0,Class,bag,ESAOTE_6100_train,Philips_iU22_train,True,"['ESAOTE_6100_val', 'Philips_iU22_val']"
0,0.556665,0.191924,CORAL-103,0.000880,1.0,Untitled,2020-09-09 01:16:34.172000+00:00,2020-09-09 01:45:55.525000+00:00,createrandom,,...,15.0,1.1176512E7,165121.0,65793.0,Class,bag,ESAOTE_6100_train,Philips_iU22_train,True,"['ESAOTE_6100_val', 'Philips_iU22_val']"


In [54]:
params_of_interest = ['epoch', 'parameter_problem_type', 'parameter_mil_pooling', 'parameter_lr']
to_include =  params_of_interest + logs_names
# TODO add the gaps back in
if problem_type == 'binary':
    to_include.append('ESAOTE_6100/val/f1')
    to_include.append('Philips_iU22/val/f1')
    comp_frame = plot_frame.sort_values('ESAOTE_6100/val/f1', ascending=False)[to_include]
else:
    comp_frame = plot_frame.sort_values('ESAOTE_6100/val/mae')[to_include]
    

comp_frame['parameter_mil_pooling'] = comp_frame['parameter_mil_pooling'].fillna('NA')
comp_frame

id,epoch,parameter_problem_type,parameter_mil_pooling,parameter_lr,ESAOTE_6100/val/accuracy,ESAOTE_6100/val/p,ESAOTE_6100/val/r,Philips_iU22/val/accuracy,Philips_iU22/val/p,Philips_iU22/val/r,ESAOTE_6100/val/f1,Philips_iU22/val/f1
MUS1-465,5.0,image,,0.0788480377114249,0.897059,0.88,0.907216,0.606481,0.569892,0.540816,0.893401,0.554974
MUS1-475,10.0,image,,0.0347279111820452,0.897059,0.87,0.915789,0.606481,0.651163,0.285714,0.892308,0.397163
MUS1-466,5.0,image,,0.0929042484447034,0.877451,0.846154,0.907216,0.564815,0.7,0.071429,0.875622,0.12963
MUS1-469,5.0,image,,0.0351261200968086,0.872549,0.90099,0.850467,0.648148,0.659091,0.737288,0.875,0.696
MUS1-478,8.0,bag,mean,0.013850773307231,0.86,0.861111,0.877358,0.569444,0.569832,0.864407,0.869159,0.686869
MUS1-467,10.0,image,,0.054686828018487,0.867647,0.927083,0.816514,0.62037,0.669811,0.601695,0.868293,0.633929
MUS1-473,9.0,image,,0.0877621876261126,0.862745,0.893204,0.844037,0.601852,0.591954,0.872881,0.867925,0.705479
MUS1-485,10.0,bag,mean,0.0183940305879178,0.84,0.781955,0.971963,0.652778,0.686957,0.669492,0.866667,0.678112
MUS1-472,9.0,image,,0.0794512734435762,0.857843,0.884615,0.844037,0.606481,0.620438,0.720339,0.86385,0.666667
MUS1-464,9.0,image,,0.096753966942173,0.848039,0.848214,0.87156,0.569444,0.567568,0.889831,0.859729,0.693069


In [40]:
if problem_type == 'binary':
    print(comp_frame.groupby(['parameter_problem_type','parameter_mil_pooling']).min()['val_f1_gap'])
else:
    print(comp_frame.groupby(['parameter_problem_type','parameter_mil_pooling']).max()['val_mae_gap'])

KeyError: 'val_mae_gap'
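pandas raises a `KeyError` like the one above when a selected column is missing from the frame. A minimal sketch that computes the gap column if it is missing before grouping; the numbers are made up and only the column names mirror the notebook's.

```python
import pandas as pd

# Made-up best-epoch scores; the column names mirror the notebook's naming.
frame = pd.DataFrame({
    'parameter_problem_type': ['image', 'image', 'bag'],
    'ESAOTE_6100/val/f1': [0.89, 0.87, 0.86],
    'Philips_iU22/val/f1': [0.55, 0.70, 0.66],
})

gap_column = 'val_f1_gap'
if gap_column not in frame.columns:
    # Gap between the ESAOTE_6100 and Philips_iU22 validation F1 scores.
    frame[gap_column] = frame['ESAOTE_6100/val/f1'] - frame['Philips_iU22/val/f1']

# Select the column before aggregating so only the gap is reduced.
smallest_gap = frame.groupby('parameter_problem_type')[gap_column].min()
print(smallest_gap)
```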

In [78]:
plot_frame['parameter_lr'] = plot_frame['parameter_lr'].astype(float)
#plot_frame['parameter_backend_lr'] = plot_frame['parameter_backend_lr'].astype(float)

#plot_frame.drop(columns=['tags'], inplace=True)
fig = px.parallel_coordinates(plot_frame, dimensions=['parameter_lr', 'ESAOTE_6100/val/f1'])
fig.show()



ValueError: Value of 'dimensions_1' is not the name of a column in 'data_frame'. Expected one of ['ESAOTE_6100/val/mae', 'Philips_iU22/val/mae', 'epoch', 'val_mae_gap', 'name', 'created', 'finished', 'owner', 'notes', 'running_time', 'size', 'tags', 'channel_ESAOTE_6100/val/loss', 'channel_ESAOTE_6100/val/mae', 'channel_ESAOTE_6100/val/max_att', 'channel_ESAOTE_6100/val/mean', 'channel_ESAOTE_6100/val/mean_att', 'channel_ESAOTE_6100/val/min_att', 'channel_ESAOTE_6100/val/var', 'channel_ESAOTE_6100/val/var_att', 'channel_ESAOTE_6100/val_image/loss', 'channel_ESAOTE_6100/val_image/mae', 'channel_ESAOTE_6100/val_image/mean', 'channel_ESAOTE_6100/val_image/var', 'channel_Philips_iU22/val/loss', 'channel_Philips_iU22/val/mae', 'channel_Philips_iU22/val/max_att', 'channel_Philips_iU22/val/mean', 'channel_Philips_iU22/val/mean_att', 'channel_Philips_iU22/val/min_att', 'channel_Philips_iU22/val/var', 'channel_Philips_iU22/val/var_att', 'channel_Philips_iU22/val_image/loss', 'channel_Philips_iU22/val_image/mae', 'channel_Philips_iU22/val_image/mean', 'channel_Philips_iU22/val_image/var', 'channel_stderr', 'channel_stdout', 'channel_training/loss', 'channel_training/mae', 'channel_training/max_att', 'channel_training/mean', 'channel_training/mean_att', 'channel_training/min_att', 'channel_training/var', 'channel_training/var_att', 'parameter_attention_mode', 'parameter_backend', 'parameter_backend_cutoff', 'parameter_backend_lr', 'parameter_backend_mode', 'parameter_batch_size', 'parameter_fc_hidden_layers', 'parameter_fc_use_bn', 'parameter_lr', 'parameter_mil_mode', 'parameter_mil_pooling', 'parameter_n_epochs', 'parameter_n_params_backend', 'parameter_n_params_classifier', 'parameter_n_params_pooling', 'parameter_prediction_target', 'parameter_problem_type', 'parameter_source_train', 'parameter_use_pseudopatients', 'parameter_val'] but received: ESAOTE_6100/val/f1