Question about _computeReward in Multi Agent environment #20

mfarrelm · 2021-02-09T12:57:02Z

First of all, thank you for your work on this simulator. I have a question: why does _computeReward call _getDroneStateVector with index 0 instead of i? Is it a typo, or is there another meaning behind it?

  rewards = {}
  states = np.array([self._getDroneStateVector(0) for i in range(self.NUM_DRONES)])
  rewards[0] = -1 * np.linalg.norm(np.array([1, 1, 1]) - states[0, 0:3])**2
  for i in range(1, self.NUM_DRONES):
      rewards[i] = -1 * np.linalg.norm(states[i-1, 0:3] - states[i, 0:3])**2
  return rewards

This is from LeaderFollowerAviary.py.


JacopoPan · 2021-02-09T16:34:42Z

Thanks! It absolutely looks like a typo, and it had trickled into a couple of other places too. It should be fixed in the latest push.
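
For anyone reading later, here is a minimal standalone sketch of why the index matters. It is not the actual commit: `compute_rewards`, `get_state`, and the `positions` array are hypothetical stand-ins (with `get_state(i)` playing the role of `self._getDroneStateVector(i)`), and it assumes the first three entries of each state are the x, y, z position, as the slicing in the snippet above suggests.

```python
import numpy as np

def compute_rewards(states, target=(1.0, 1.0, 1.0)):
    """Leader-follower rewards from a (num_drones, state_dim) array.

    Hypothetical helper mirroring the reward logic quoted in this issue.
    """
    target = np.asarray(target, dtype=float)
    rewards = {}
    # Leader (drone 0): negative squared distance to the target position.
    rewards[0] = -1 * np.linalg.norm(target - states[0, 0:3])**2
    # Follower i: negative squared distance to the drone ahead of it (i-1).
    for i in range(1, states.shape[0]):
        rewards[i] = -1 * np.linalg.norm(states[i-1, 0:3] - states[i, 0:3])**2
    return rewards

# Made-up positions for three drones, for illustration only.
positions = np.array([[0.0, 0.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [1.0, 2.0, 0.0]])

def get_state(i):
    # Hypothetical stand-in for self._getDroneStateVector(i).
    return positions[i]

num_drones = len(positions)

# Pattern with the typo: every row is a copy of drone 0's state.
buggy_states = np.array([get_state(0) for i in range(num_drones)])
# Pattern with the index fixed: row i holds drone i's state.
fixed_states = np.array([get_state(i) for i in range(num_drones)])

buggy_rewards = compute_rewards(buggy_states)
fixed_rewards = compute_rewards(fixed_states)
```

With index 0, all rows of `states` are identical, so every follower's distance to the drone ahead of it is zero and its reward carries no signal, whatever the followers actually do. With index `i`, each follower's reward depends on its own position relative to its predecessor.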

JacopoPan added the bug label Feb 9, 2021

JacopoPan closed this as completed Feb 9, 2021
