
The centroid and star-shaped structure of Distral #8

Closed
c4cld opened this issue Jul 20, 2021 · 5 comments

Comments

@c4cld

c4cld commented Jul 20, 2021

Description

In Section 2.3, Policy Gradient and a Better Parameterization, of 'Distral: Robust Multitask Reinforcement Learning', the authors argue that the centroid and star-shaped structure of Distral helps learn a better distilled policy. However, the explanation is brief. Could you explain the advantages of the centroid and star-shaped structure in more detail? I have tried to contact the authors of the paper but have not received a reply, so I am asking here.

@shagunsodhani
Contributor

Hi! Thank you for the question. The paper mentions that Distral learns a distilled policy in the space of policies, which is better than learning in the space of parameters. The basic idea is that it should be easier to interpolate in the space of functions/policies than in the parameter space. Two networks can make very similar predictions on any given input while having very different weights. Averaging their predictions gives meaningful predictions, while averaging their weights could produce a model worse than either of the two originals. This is also related to how we perform ensembling: we average the predictions of multiple models, not their weights.
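
To make this concrete, here is a toy sketch in plain NumPy (my own illustration, not from the paper): two networks that compute the exact same function with different weights, where averaging predictions is harmless but averaging weights changes the function.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, W1, b1, W2, b2):
    # A one-hidden-layer MLP with tanh activation.
    return np.tanh(x @ W1 + b1) @ W2 + b2

# Network A: a random 2-layer MLP.
W1 = rng.normal(size=(3, 8)); b1 = rng.normal(size=8)
W2 = rng.normal(size=(8, 1)); b2 = rng.normal(size=1)

# Network B: the same function with its hidden units permuted.
# It produces identical outputs to A, yet its weight matrices differ.
perm = rng.permutation(8)
W1p, b1p, W2p = W1[:, perm], b1[perm], W2[perm, :]

x = rng.normal(size=(5, 3))
out_a = mlp(x, W1, b1, W2, b2)
out_b = mlp(x, W1p, b1p, W2p, b2)
assert np.allclose(out_a, out_b)  # same predictions, different weights

# Function-space average: identical to the original predictions.
pred_avg = 0.5 * (out_a + out_b)

# Parameter-space average: a different (generally worse) function.
param_avg = mlp(x, 0.5 * (W1 + W1p), 0.5 * (b1 + b1p),
                0.5 * (W2 + W2p), b2)

print(np.abs(pred_avg - out_a).max())   # ~0.0
print(np.abs(param_avg - out_a).max())  # typically far from 0
```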

The paper does not comment on the star/centroid structure being better. If you think otherwise, could you please point me to the relevant line in the paper?

@c4cld
Author

c4cld commented Jul 26, 2021

@shagunsodhani Thank you for your help. The paper mentions the star/centroid structure in Section 2.3, Policy Gradient and a Better Parameterization. Could you explain the advantages of the centroid and star-shaped structure in more detail?

[Screenshot of Section 2.3, Policy Gradient and a Better Parameterization, from the Distral paper]

@shagunsodhani
Contributor

One advantage could be reduced computation: every task model distills against a single central model, so the number of distillation operations is linear in the number of models. With, say, a fully-connected topology, the number of operations would be quadratic. The obvious limitation is that all information exchange is bottlenecked on the central model.
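
To illustrate the counting argument, here is a rough NumPy sketch (the function names and the softmax policy parameterization are my own assumptions, not from the Distral code): with n task policies, the star topology needs n KL terms against the central policy, whereas an all-pairs topology needs n*(n-1).

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log-softmax over the action dimension.
    z = logits - logits.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def kl(log_p, log_q):
    # KL(p || q) averaged over a batch of states.
    return (np.exp(log_p) * (log_p - log_q)).sum(axis=-1).mean()

def star_distill_terms(task_logits, central_logits):
    # Star topology: one KL(pi_i || pi_0) term per task -> n terms (linear).
    log_p0 = log_softmax(central_logits)
    return sum(kl(log_softmax(l), log_p0) for l in task_logits)

def pairwise_distill_terms(task_logits):
    # Fully-connected topology: a KL term for every ordered pair
    # -> n * (n - 1) terms (quadratic).
    logs = [log_softmax(l) for l in task_logits]
    return sum(kl(li, lj)
               for i, li in enumerate(logs)
               for j, lj in enumerate(logs) if i != j)

# Hypothetical usage: 4 tasks, batch of 32 states, 5 discrete actions.
rng = np.random.default_rng(0)
task_logits = [rng.normal(size=(32, 5)) for _ in range(4)]
central_logits = rng.normal(size=(32, 5))
print(star_distill_terms(task_logits, central_logits))  # 4 KL terms
print(pairwise_distill_terms(task_logits))              # 12 KL terms
```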

@c4cld
Author

c4cld commented Jul 27, 2021

@shagunsodhani Thank you very much!

@shagunsodhani
Contributor

Cool - closing the issue - feel free to reopen if needed :)
