Including GAE #26

CesMak · 2020-03-28T12:25:49Z

Hey there,

You used a Monte Carlo estimate. Would it not also be nice to have GAE (Generalized Advantage Estimation)?
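For reference, this is the usual way GAE is written. It is a sketch in standard notation, not something taken from this repo: `V` is the critic's value estimate, `\gamma` the discount factor, and `\lambda` the GAE parameter (the `0.95` hard-coded in the function further down).

```latex
\delta_t = r_t + \gamma V(s_{t+1}) - V(s_t)

\hat{A}_t^{\mathrm{GAE}(\gamma,\lambda)} = \sum_{l=0}^{\infty} (\gamma\lambda)^{l}\, \delta_{t+l}
```

With `\lambda = 1` the sum telescopes to the discounted Monte Carlo return minus `V(s_t)`. With `\lambda = 0` it reduces to the one-step TD error `\delta_t`.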

I am not sure exactly how to integrate GAE into the code. The function should be something like:

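    import numpy as np
    import torch

    def get_advantages(self, values, masks, rewards, gamma, lam=0.95):
        # Assumption: values has len(rewards) + 1 entries, the last one being
        # the bootstrap value of the state after the final step.
        # masks[i] is 0 if step i ended the episode, otherwise 1.
        values = values.detach().numpy()
        returns = []
        gae = 0.0
        for i in reversed(range(len(rewards))):
            # One-step TD error; indexing with i and i + 1 avoids the
            # wrap-around that rewards[i-1] causes at i = 0.
            delta = rewards[i] + gamma * values[i + 1] * masks[i] - values[i]
            gae = delta + gamma * lam * masks[i] * gae
            returns.insert(0, gae + values[i])

        adv = np.array(returns) - values[:-1]
        adv = torch.tensor(adv.astype(np.float32))
        # Normalizing advantages
        return returns, (adv - adv.mean()) / (adv.std() + 1e-5)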

        # Monte Carlo estimate of rewards:
        rewards = []
        discounted_reward = 0
        for reward, is_terminal in zip(reversed(memory.rewards), reversed(memory.is_terminals)):
            if is_terminal:
                discounted_reward = 0
            discounted_reward = reward + (self.gamma * discounted_reward)
            rewards.insert(0, discounted_reward)
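To make the relationship between the two snippets concrete, here is a minimal, dependency-free sketch that puts a Monte Carlo estimate and a GAE estimate side by side. The function names, the `lam` and `last_value` parameters, and the plain-list inputs are assumptions made for illustration; they are not part of this repo. With `lam=1` and `last_value=0`, the GAE return targets should match the Monte Carlo returns regardless of the value estimates.

```python
def discounted_returns(rewards, is_terminals, gamma):
    """Monte Carlo returns, reset at every episode boundary."""
    returns, running = [], 0.0
    for reward, done in zip(reversed(rewards), reversed(is_terminals)):
        if done:
            running = 0.0
        running = reward + gamma * running
        returns.insert(0, running)
    return returns


def gae_returns(rewards, values, is_terminals, gamma, lam, last_value=0.0):
    """GAE(lambda) return targets, i.e. advantage + value, per step.

    values[i] is the critic estimate for the state at step i.
    last_value (hypothetical parameter) is the bootstrap estimate for the
    state after the final step; it is ignored if that step is terminal.
    """
    next_values = list(values[1:]) + [last_value]
    returns, gae = [], 0.0
    for i in reversed(range(len(rewards))):
        mask = 0.0 if is_terminals[i] else 1.0
        # One-step TD error for step i.
        delta = rewards[i] + gamma * next_values[i] * mask - values[i]
        # Exponentially weighted sum of TD errors, cut at episode ends.
        gae = delta + gamma * lam * mask * gae
        returns.insert(0, gae + values[i])
    return returns


# Two short episodes stored back to back, as in a rollout buffer.
rewards = [1.0, 0.0, 2.0, 1.0]
is_terminals = [False, True, False, True]
values = [0.5, -0.2, 0.3, 0.1]

mc = discounted_returns(rewards, is_terminals, gamma=0.9)
gae_lam1 = gae_returns(rewards, values, is_terminals, gamma=0.9, lam=1.0)
gae_lam0 = gae_returns(rewards, values, is_terminals, gamma=0.9, lam=0.0)
print(mc, gae_lam1, gae_lam0)
```

Intermediate values of `lam` trade the variance of the Monte Carlo estimate against the bias of the one-step TD target, which is why 0.95 is a common choice.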


nikhilbarhate99 · 2020-06-03T03:09:15Z

I do not think the added complexity is worth it in this repo, since the goal is to provide a simple, beginner-friendly implementation.

Also, bootstrapping values requires the experience to be collected from parallel workers in order to work well in practice.
Source: skip to 54:19 of https://www.youtube.com/watch?v=EKqxumCuAAY&list=PLkFD6_40KJIwhWJpGazJ9VSj9CFMkb79A&index=6

I do not know how GAE will perform on a single worker, and I do not have the time to re-test it on different environments.

If your implementation works correctly, then you can add it in your fork of this repo.

nikhilbarhate99 closed this as completed Jun 3, 2020
