Different op types / weighted operations count #152
Comments
Hi Tom, thanks for your constructive suggestions. I do agree that different operators should have different complexity. However, it is non-trivial to set such weights: MACs do not always reflect latency, and neither does the THOP package. I am afraid that even if we introduce the configuration, it still cannot precisely estimate latency. Maybe a good way is to report detailed operator counts (e.g., how many adds, muls, divs, subs, log- and exp-operations are used in the model) and let users choose how to use them.
Yes, that was my thinking exactly. The first step would be to sub-divide the operations and report them as a table. The weighting can then be applied as desired with a simple mapping function. The list of weights above is just something which is used by ITU-T in the standardization of protocols for mobile communication, so I think it is a fairly credible compromise in that particular area of application, but it is not directly generalizable to GPUs. It really depends on the target application - CPUs and GPUs vary a lot depending on the area of application. That's why any mapping has to be a separate configuration file which the user can choose.
Got your point, Tom. It may take some time to support all operators. I will add this to my Todoist and gradually support them.
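The subdivision-plus-mapping approach discussed above could be sketched roughly as follows. This is only an illustration: the names (`op_counts`, `WEIGHTS`, `weighted_ops`) and the count values are assumptions, not part of THOP's actual API, and the weight values are placeholders for whatever table the user's configuration supplies.

```python
# Hypothetical sketch, not THOP's API: collapse a detailed per-operator
# breakdown into one weighted total using a user-supplied weight table.

# Detailed operator counts as the profiler might report them (illustrative).
op_counts = {"mul": 1_000_000, "add": 1_000_000, "exp": 40_000, "log": 10_000}

# Per-operator cost relative to one multiply-accumulate; exp/log weighted
# 25x a MAC, in line with the CPU figures mentioned in this thread.
WEIGHTS = {"mul": 1.0, "add": 1.0, "exp": 25.0, "log": 25.0}

def weighted_ops(counts, weights):
    """Apply the weight mapping; unknown operators default to weight 1."""
    return sum(weights.get(op, 1.0) * n for op, n in counts.items())

print(weighted_ops(op_counts, WEIGHTS))  # → 3250000.0
```

Keeping the raw `op_counts` table as the profiler's output and the `WEIGHTS` table in a separate, swappable configuration is exactly what lets the same counts serve both CPU- and GPU-oriented cost models.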
Hey, an exciting project you have here — looking forward to using it!
A question: I'd be interested in a more detailed estimation of complexity, in particular such that different non-linearities would have different weights. Typically, for example, a log- or exp-operation performed on a CPU is 25 times more expensive than a regular multiply-accumulate (MAC). So, basically, I'm thinking that there could be a configuration file for the profiler, with a table stating the proportional complexity of different non-linearities. Such weighting would make the profiler output better reflect the true cost of execution. Alternatively, the profiler could give as an optional output a subdivision of the operation types which have been used.
So, what do you think?
For a short example, see Table 1 in my wiki at https://wiki.aalto.fi/display/ITSP/Other+performance+measures
The more detailed information which I use can be found on page 259 of https://www.itu.int/rec/T-REC-G.191-200911-S/en
A newer version of that is available on page 277 of the file STLmanual.pdf from https://www.itu.int/rec/T-REC-G.191-201901-I/en
I'm aware that for best accuracy we should also count for-loops and if-statements, but I would assume that they have less impact in big models. I'm primarily interested in the nonlinear operations, since in my experience they make a large contribution to the overall complexity of the models I use.
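The configuration-file idea proposed here could look something like the following sketch. The file format (JSON), the operator names, and every weight value are assumptions made up for illustration — only the exp/log figure of 25x a MAC comes from this thread; the real weights would come from the ITU-T tables linked above or from the user's own measurements.

```python
# Hypothetical sketch of a profiler weight-configuration file: per-operator
# costs stored as JSON so users can swap tables for different CPU/GPU targets.
import json

# In practice this string would be read from a user-chosen config file.
config = json.loads("""
{
  "mac": 1.0,
  "exp": 25.0,
  "log": 25.0
}
""")

# Subdivided counts as the profiler might report them (illustrative numbers).
breakdown = {"mac": 500_000, "exp": 2_000}

# Operators missing from the config fall back to unit cost.
total = sum(config.get(op, 1.0) * n for op, n in breakdown.items())
print(total)  # → 550000.0
```

A separate config file keeps the profiler itself target-agnostic: the same operator breakdown can be re-weighted for a DSP, a desktop CPU, or a GPU without re-running the model.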
If this has wider interest, then I can contribute something to it, but I don't have time to do it myself completely.
cheers,
Tom
https://research.aalto.fi/en/persons/tom-b%C3%A4ckstr%C3%B6m