Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refining task api, etc #63

Merged
merged 22 commits into from
Jun 7, 2023
Merged

Refining task api, etc #63

merged 22 commits into from
Jun 7, 2023

Conversation

kywch
Copy link
Collaborator

@kywch kywch commented Jun 3, 2023

  • separated Predicate (predicate_api.py) and Task (task_api.py)

    • Predicate evaluates progress toward goal and returns float [0,1]
    • Task computes rewards for agents and tracks goal completion
  • refactored task api, allow two methods to provide tasks to the env

    • env.reset(new_tasks = a list of task instances)
    • env.reset(teams: Dict[int,List[int]], task_spec = a list of tasks to instantiate inside the env)
  • all the tests are using the new task api

  • merged the team helpers into lib/team_helper.py

  • added event number and gold constraints to predicate

  • fixed IMMORTAL bug (mainly for performance testing)

  • fixed test_render_save.py, using the new replay helper

nmmo/core/env.py Outdated
map_id: Map index to load. Selects a random map by default
seed: random seed to use
new_tasks: A list of instantiated tasks
task_spec: A list of task spec to instantiate inside reset()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about just make_new_tasks_fn

that way env doesn't need to know anything about making tasks, or teams

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please tell me more about it? Let me think about it again after sleep.

However, I also realized that the current make_team_tasks(teams, task_spec) doesn't need any info about the env. If one can pass in teams and task_spec, one can also use make_team_tasks to make task instances and pass them in. ... Then yes, the env doesn't need to know anything about making tasks or teams.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I got it, and implemented as follows. But after writing it, I just thought... if one knows it all, why bother passing in these and not the instantiated tasks? Please let me know what you think.

    if new_tasks is not None:
      # providing an empty new_tasks [] is also possible
      self.tasks = new_tasks
    elif make_task_fn is not None:
      self.tasks = make_task_fn(**make_task_fn_kwargs)
    else:
      for task in self.tasks:
        task.reset()

@@ -88,12 +78,6 @@ def box(rows, cols):
if self.config.PROVIDE_ACTION_TARGETS:
obs_space['ActionTargets'] = self.action_space(None)

if self._task_encoding:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't we need this somewhere?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need it, but not necessary in this format. Let me check back after seeing the syllabus integration.

nmmo/core/env.py Outdated
elif task_spec is not None and teams is not None:
self.tasks = make_team_tasks(teams, task_spec)
else:
self.tasks = nmmo_default_task(self.possible_agents)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is going to replace existing tasks with the default_task if we don't pass in new_tasks, which is not expected. how about we set self.tasks to default_task in the constructor, and then don't do that here

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nmmo_default_task has moved to the init, and this line was replaced with

      for task in self.tasks:
        task.reset()

not changing the existing tasks.

@@ -308,11 +283,7 @@ def step(self, actions: Dict[int, Dict[str, Dict[str, Any]]]):

# Store the observations, since actions reference them
self.obs = self._compute_observations()
gym_obs = {}
for a, o in self.obs.items():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't we still need this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, this was gym_obs = {a: o.to_gym() for a,o in self.obs.items()}, but it was changed to that to add in

      if self._task_encoding:
        gym_obs[a]['Task'] = self._encode_goal().get(a,np.zeros(self._task_embedding_size))

Currently, I'm not sure about the exact form of the task encoding. Hoping to get some input/specs as we start task conditioned learning very soon

depletion = config.RESOURCE_DEPLETION_RATE
water = self.entity.resources.water
water.decrement(depletion)

if self.config.IMMORTAL:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't love IMMORTAL, maybe we can just get rid of it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMMORTAL is pretty much for performance testing. Perhaps change to PERFORMANCE_TEST?

@@ -43,6 +43,11 @@ def StayAlive(gs: GameState,
"""True if all subjects are alive.
"""
return count(subject.health > 0) == len(subject)
# The below is for speed testing (bypass GroupView)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's not include it in this PR, perf testing can go in a different PR if we want to commit it at all

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed!

@@ -87,7 +92,7 @@ def CanSeeGroup(gs: GameState,
target: Group = constraint.TEAM_GROUPS):
""" Returns True if subject can see any of target
"""
return OR(*(CanSeeAgent(subject, agent) for agent in target.agents))
return POR(*(CanSeeAgent(subject, agent) for agent in target.agents))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why POR not OR?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OR for Predicate. But I hear you, and am removing all P prefix.


return task_cls(eval_fn=self, assignee=assignee, reward_multiplier=reward_multiplier)

def __and__(self, other):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a fan of the P prefix

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

return POR(self, other)
def __invert__(self):
return PNOT(self)
def __rshift__(self, other):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMPLY seems really weird for task predicates, do we really need it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

nmmo/core/env.py Outdated
embedding_size=self._task_embedding_size,
task_encoding=self._task_encoding,
reset=False)
self.tasks = nmmo_default_task(self.possible_agents)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you do task_api.make_nmmo_default_tasks()

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explicitly stated where the default task comes from.

nmmo/core/env.py Outdated
def reset(self, map_id=None, seed=None, options=None):
def reset(self, map_id=None, seed=None, options=None,
new_tasks: List[Task]=None,
make_task_fn: Callable=None,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we only need make_task_fn, the caller can define a lambda without having to pass kwargs. and they can always return [] if the want to clear the tasks. there's no need to accept new_tasks

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

David, you suggest that we get rid of new_tasks, a list of task instances, and go only with make_task_fn, which can be

  • make_task_fn = lambda: return make_team_tasks(teams, task_spec) in case of the manual curriculum example
  • or make_task_fn = 'lambda: return []'

and inside reset(), self.tasks = make_task_fn() is called. Am I right?

@jsuarez5341 how would the puffer like to handle this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got the part that we don't need to pass in args for make_task_fn, so removed it.

And, I agree that new_tasks and make_task_fn are redundant. Well, ok ... it's pretty easy to put new_tasks in, so I'll go ahead only with make_task_fn.

nmmo/core/env.py Outdated

# Remove rewards for dead agents
for agent_id in dones:
rewards[agent_id] = -1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems wrong? dones only contains agents who died this turn. seems fine for them to get a reward. but agents that have already been dead should not be rewarded after death, right?

or... maybe they should be, if their task is accomplished after they died

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you are tinkering with this part too. I don't have any strong idea, so I'll just take it out.

@@ -104,6 +101,10 @@ def reset(self, map_id: int = None):
Item.INSTANCE_ID = 0
self.items = {}

if self._replay_helper is not None:
self._replay_helper.reset()
self._replay_helper.update() # capture the initial packet
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we just call update() in reset?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@jsuarez5341 jsuarez5341 merged commit 7386079 into 2.0 Jun 7, 2023
6 checks passed
@kywch kywch deleted the task-rev branch June 9, 2023 06:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants