Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed resetting performance counters related to idle-rate, etc. #1887

Merged
merged 4 commits into from Dec 6, 2015

Conversation

hkaiser
Copy link
Member

@hkaiser hkaiser commented Nov 30, 2015

  • also fly-by changes to preprocessor constant usage

@pagrubel Please verify whether this fixes your issue reported by #1859

- also fly-by changes to preprocessor constant usage
@pagrubel
Copy link
Member

Okay I'll check it out

@pagrubel
Copy link
Member

pagrubel commented Dec 3, 2015

got some negative counts so I don't think this is working. I will investigate further but in parallel I am going to make another branch for my work around

@hkaiser
Copy link
Member Author

hkaiser commented Dec 3, 2015

got some negative counts so I don't think this is working. I will investigate further but in parallel I am going to make another branch for my work around

In debug this should trigger assertions. In any case, what counters show those negative numbers?

@hkaiser
Copy link
Member Author

hkaiser commented Dec 4, 2015

[20:47] patg: hkaiser: to answer your question from earlier the counter that gave me negative values was /threads/time/average-overhead

@pagrubel
Copy link
Member

pagrubel commented Dec 5, 2015

@hkaiser
Here is an example using branch evaluate_counters_for_iteration of inncabs (which evaluates counters for each iteration of the benchmark) I ran the following script on carson
example_fft.sh.txt

(had to add .txt to upload it here)

the output is:

idle_rate,average_time,average_overhead,average_phase,average_phase_overhead,cumulative_time,cumulative_overhead,cumulative,phases
6079,4273,6625,2416,3746,8.74686e+09,1.35961e+10,2.04667e+06,3.6192e+06
5678,4160,23595,2353,3095,8.5157e+09,1.11875e+10,2.04667e+06,3.6176e+06
5673,4158,-10176,2351,3084,8.51044e+09,1.1179e+10,2.04663e+06,3.61851e+06
1964,8.80656
1981 1964 1961 
8956,459768,-16,306512,2.09637e+06,9.17738e+06,2.82843e+07,22,26

for comparison
using inncabs from master branch where counters are only evaluated at shutdown gives:

1957,8.52447
1974 1955 1957 
idle_rate,average_time,average_overhead,average_phase,average_phase_overhead,cumulative_time,cumulative_overhead,cumulative,phases
5829,4176,5837,2363,3303,2.56417e+10,3.584e+10,6.13946e+06,1.08495e+07

example_fib.sh.txt
when evaluating counters gives:

idle_rate,average_time,average_overhead,average_phase,average_phase_overhead,cumulative_time,cumulative_overhead,cumulative,phases
8687,3583,23726,2261,14974,7.88027e+07,5.22145e+08,21986,34840
3903,3817,5912,2379,1525,8.36429e+07,5.32256e+07,21909,35142
2827,3968,-8202,2507,987,8.69955e+07,3.43467e+07,21905,34676
13,16.7398
48 13 12 
6737,31712,-25,22651,69307,410412,2.15151e+06,18,22

and no evaluation:

19,15.4344
47 19 11 
idle_rate,average_time,average_overhead,average_phase,average_phase_overhead,cumulative_time,cumulative_overhead,cumulative,phases
7342,3755,10375,2366,6544,2.4726e+08,6.8529e+08,65773,104356

@pagrubel
Copy link
Member

pagrubel commented Dec 6, 2015

I fixed the above calculation but while running some tests saw other problems. I need to investigate more tomorrow .....actually that would be later today

@pagrubel
Copy link
Member

pagrubel commented Dec 6, 2015

Found and fixed the problems which were for both get_thread_duration and get_thread_phase_duration calculations for individual worker threads.

@hkaiser
Copy link
Member Author

hkaiser commented Dec 6, 2015

@pagrubel Are the problems you had fix now?

@pagrubel
Copy link
Member

pagrubel commented Dec 6, 2015

Those problems seem fixed. However, if someone queries total counts and worker-thread counts at the same time there will be problems. Should we just document this or prevent it. Also, I would like to check this out some more.

@hkaiser
Copy link
Member Author

hkaiser commented Dec 6, 2015

However, if someone queries total counts and worker-thread counts at the same time there will be problems. Should we just document this or prevent it. Also, I would like to check this out some more.

I think we should document that resetting counters may have unwanted side-effects. The problems you're listing are not unique. Other scenarios could create problems as well.

Is it ok to merge this now?

@pagrubel
Copy link
Member

pagrubel commented Dec 6, 2015

Yes, all is working now.

hkaiser added a commit that referenced this pull request Dec 6, 2015
Fixed resetting performance counters related to idle-rate, etc.
@hkaiser hkaiser merged commit 8417f14 into master Dec 6, 2015
@hkaiser hkaiser deleted the fixing_1859 branch December 6, 2015 21:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants