Estimated standard deviations calculation #4

Closed · talrub opened this issue Jun 25, 2023 · 5 comments

talrub commented Jun 25, 2023

Hi,
In Table 2 of the paper you say: "We also show the estimated standard deviations of the averages computed over 175 random data split and training seeds."
But the parameters in 'mnist_guess.yaml' are target_model_count=200 and target_model_count_subrun=10, which means that for each combination of (cur_num_samples, cur_loss_bin) there are 20 records in the database (each record contains a test_acc averaged over the accuracies of 10 models) that are used for the standard deviation calculation. So, as I understand it, the standard deviation is calculated over 10 random data splits and training seeds.
I would appreciate it if you could clarify this, so I can make sure I understand how you calculated the standard deviations.

Thanks,
Tal
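
For reference, a minimal sketch of the record arithmetic described in the comment above. The parameter names come from 'mnist_guess.yaml'; the grouping logic is illustrative, not the repository's actual code:

```python
# Illustrative only: how many database records the quoted configuration would
# yield per (cur_num_samples, cur_loss_bin) combination.
target_model_count = 200        # total models trained per combination
target_model_count_subrun = 10  # models averaged into each stored test_acc

records_per_combination = target_model_count // target_model_count_subrun
print(records_per_combination)  # -> 20 records, each an average over 10 model accuracies
```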

Ping-C (Owner) commented Jun 25, 2023

Hey Tal, I dug into the logs of my previous experiments and found that I was using 1 model for each subrun. To replicate my calculation of the standard deviation, target_model_count_subrun should be set to 1.

Those arguments were added while I was cleaning up the code after the experiments had finished, and I apologize for the inconsistency.

To calculate the standard deviation of the estimated mean, I used the following formula:

$s = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n}}, \quad s_{\mathrm{mean}} = \frac{s}{\sqrt{n}}$, where $n = 175$.
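
In code, that formula amounts to something like the following minimal sketch, assuming `test_accs` holds the 175 per-run test accuracies for one (num_samples, loss_bin) combination (the data and variable names here are illustrative, not the repository's actual code):

```python
import numpy as np

# Illustrative data standing in for the 175 per-run test accuracies.
test_accs = np.random.default_rng(0).uniform(0.85, 0.95, size=175)

n = len(test_accs)                                            # n = 175
s = np.sqrt(np.sum((test_accs - test_accs.mean()) ** 2) / n)  # population std (divide by n)
s_mean = s / np.sqrt(n)                                       # standard deviation of the estimated mean

# Equivalently: s == np.std(test_accs), since NumPy's default ddof=0 also divides by n.
print(s, s_mean)
```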

talrub (Author) commented Jul 1, 2023

Hi,
Thanks for your quick response.
So if n = 175, that means you chose target_model_count=175 and not target_model_count=200 as in the current 'mnist_guess.yaml'.
To sum up, I understand that the experiment reported in Table 2 of the paper was run with target_model_count=175 and target_model_count_subrun=1. Is that correct?

Ping-C (Owner) commented Jul 1, 2023

Yes, that is correct.

Ping-C (Owner) commented Jul 18, 2023

I am closing this issue since it has been resolved.

Ping-C closed this as completed Jul 18, 2023
talrub (Author) commented Aug 27, 2023

Hi,
I am sorry for reopening this issue, but something in your calculation of the standard deviation of the estimated mean looks odd to me.
As I understand it, for each combination of (num_train_samples, loss_bin) you calculate s using the 175 test accuracies found during the run.
Why do you then calculate s_mean and treat it as the standard deviation of the estimated mean?
Isn't s the result we are looking for?

Thanks!
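
For context on the distinction raised here: $s$ describes the spread of individual run accuracies and stays roughly constant as more runs are added, while $s/\sqrt{n}$ describes the uncertainty of the average and shrinks with $n$. A small synthetic illustration (the accuracy values are made up, not from the paper):

```python
import numpy as np

# Synthetic per-run accuracies: s stays near the true spread (0.02), while
# s_mean = s / sqrt(n) shrinks as the number of runs grows.
rng = np.random.default_rng(0)
for n in (25, 175, 1000):
    accs = rng.normal(loc=0.90, scale=0.02, size=n)
    s = accs.std()               # spread of individual runs
    s_mean = s / np.sqrt(n)      # uncertainty of the estimated average
    print(f"n={n:4d}  s={s:.4f}  s_mean={s_mean:.4f}")
```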
