Thank you for your great work, and I have a question about how to calculate MAdds in your paper.
The dynamic network uses a different width, and therefore different MAdds, for each instance, but you report a single MAdds value for each of your networks.
Are they the average MAdds for the whole dataset?
Thank you for your rapid answer. 😊
I think it is hard to know how many MAdds a network will require before validation, but you achieved MAdds similar to those of the compared methods.
Is there any rule for reaching a specific MAdds target?
Thanks for the valuable question.
The gate is actually very sensitive to hyperparameters. To avoid troublesome tuning of the loss-balancing factors and other hyperparameters, we use a different routing space for each network; e.g., we use only the slimmest few sub-networks to form DS-Net-S.
This can be found in Appendix A of the paper.
In practice, we first measure the MAdds of each sub-network, then manually choose the sub-networks whose MAdds are close to the target MAdds as the dynamic routing space.
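The selection step described above could be sketched roughly as follows. This is only an illustration, not code from the DS-Net repository: the function names, the `tolerance` parameter, and the MAdds numbers are all hypothetical, and the averaging helper assumes that a network's MAdds is summarized as the mean over instances, which this thread does not confirm.

```python
def select_routing_space(subnet_madds, target_madds, tolerance=0.3):
    """Return indices of sub-networks whose MAdds fall within
    +/- tolerance (as a fraction) of the target MAdds.

    Hypothetical helper: the authors describe choosing these manually.
    """
    low = target_madds * (1 - tolerance)
    high = target_madds * (1 + tolerance)
    return [i for i, m in enumerate(subnet_madds) if low <= m <= high]


def mean_routed_madds(subnet_madds, routed_indices):
    """Mean MAdds over instances, given the sub-network index the gate
    picked for each instance (assumes a per-dataset average is wanted)."""
    return sum(subnet_madds[i] for i in routed_indices) / len(routed_indices)


# Made-up MAdds (in millions) for six sub-networks, slimmest first.
subnet_madds = [100, 150, 200, 300, 400, 550]

# Keep only the sub-networks near a 200M target as the routing space.
routing_space = select_routing_space(subnet_madds, target_madds=200)

# Made-up gate decisions for four validation instances.
average = mean_routed_madds(subnet_madds, [1, 2, 2, 1])
```

Restricting the routing space this way bounds the MAdds the gate can produce, so the result lands near the target without tuning the loss-balancing factors.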