-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refining task api, etc #63
Conversation
nmmo/core/env.py
Outdated
map_id: Map index to load. Selects a random map by default | ||
seed: random seed to use | ||
new_tasks: A list of instantiated tasks | ||
task_spec: A list of task spec to instantiate inside reset() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about just make_new_tasks_fn
that way env doesn't need to know anything about making tasks, or teams
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please tell me more about it? Let me think about it again after sleep.
However, I also realized that the current make_team_tasks(teams, task_spec)
doesn't need any info about the env. If one can pass in teams and task_spec, one can also use make_team_tasks to make task instances and pass them in. ... Then yes, the env doesn't need to know anything about making tasks or teams.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I got it, and implemented as follows. But after writing it, I just thought... if one knows it all, why bother passing in these and not the instantiated tasks? Please let me know what you think.
if new_tasks is not None:
# providing an empty new_tasks [] is also possible
self.tasks = new_tasks
elif make_task_fn is not None:
self.tasks = make_task_fn(**make_task_fn_kwargs)
else:
for task in self.tasks:
task.reset()
@@ -88,12 +78,6 @@ def box(rows, cols): | |||
if self.config.PROVIDE_ACTION_TARGETS: | |||
obs_space['ActionTargets'] = self.action_space(None) | |||
|
|||
if self._task_encoding: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't we need this somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need it, but not necessary in this format. Let me check back after seeing the syllabus integration.
nmmo/core/env.py
Outdated
elif task_spec is not None and teams is not None: | ||
self.tasks = make_team_tasks(teams, task_spec) | ||
else: | ||
self.tasks = nmmo_default_task(self.possible_agents) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is going to replace existing tasks with the default_task if we don't pass in new_tasks, which is not expected. how about we set self.tasks to default_task in the constructor, and then don't do that here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nmmo_default_task has moved to the init, and this line was replaced with
for task in self.tasks:
task.reset()
not changing the existing tasks.
@@ -308,11 +283,7 @@ def step(self, actions: Dict[int, Dict[str, Dict[str, Any]]]): | |||
|
|||
# Store the observations, since actions reference them | |||
self.obs = self._compute_observations() | |||
gym_obs = {} | |||
for a, o in self.obs.items(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't we still need this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously, this was gym_obs = {a: o.to_gym() for a,o in self.obs.items()}
, but it was changed to that to add in
if self._task_encoding:
gym_obs[a]['Task'] = self._encode_goal().get(a,np.zeros(self._task_embedding_size))
Currently, I'm not sure about the exact form of the task encoding. Hoping to get some input/specs as we start task conditioned learning very soon
depletion = config.RESOURCE_DEPLETION_RATE | ||
water = self.entity.resources.water | ||
water.decrement(depletion) | ||
|
||
if self.config.IMMORTAL: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't love IMMORTAL, maybe we can just get rid of it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMMORTAL is pretty much for performance testing. Perhaps change to PERFORMANCE_TEST?
nmmo/task/base_predicates.py
Outdated
@@ -43,6 +43,11 @@ def StayAlive(gs: GameState, | |||
"""True if all subjects are alive. | |||
""" | |||
return count(subject.health > 0) == len(subject) | |||
# The below is for speed testing (bypass GroupView) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's not include it in this PR, perf testing can go in a different PR if we want to commit it at all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed!
nmmo/task/base_predicates.py
Outdated
@@ -87,7 +92,7 @@ def CanSeeGroup(gs: GameState, | |||
target: Group = constraint.TEAM_GROUPS): | |||
""" Returns True if subject can see any of target | |||
""" | |||
return OR(*(CanSeeAgent(subject, agent) for agent in target.agents)) | |||
return POR(*(CanSeeAgent(subject, agent) for agent in target.agents)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why POR not OR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OR for Predicate. But I hear you, and am removing all P prefix.
|
||
return task_cls(eval_fn=self, assignee=assignee, reward_multiplier=reward_multiplier) | ||
|
||
def __and__(self, other): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not a fan of the P prefix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
nmmo/task/predicate_api.py
Outdated
return POR(self, other) | ||
def __invert__(self): | ||
return PNOT(self) | ||
def __rshift__(self, other): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMPLY seems really weird for task predicates, do we really need it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
Simple Optimizations for Speed V2
minor changes to predicate api and caching
nmmo/core/env.py
Outdated
embedding_size=self._task_embedding_size, | ||
task_encoding=self._task_encoding, | ||
reset=False) | ||
self.tasks = nmmo_default_task(self.possible_agents) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you do task_api.make_nmmo_default_tasks()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
explicitly stated where the default task comes from.
nmmo/core/env.py
Outdated
def reset(self, map_id=None, seed=None, options=None): | ||
def reset(self, map_id=None, seed=None, options=None, | ||
new_tasks: List[Task]=None, | ||
make_task_fn: Callable=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we only need make_task_fn, the caller can define a lambda without having to pass kwargs. and they can always return [] if the want to clear the tasks. there's no need to accept new_tasks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
David, you suggest that we get rid of new_tasks, a list of task instances, and go only with make_task_fn, which can be
- make_task_fn =
lambda: return make_team_tasks(teams, task_spec)
in case of the manual curriculum example - or make_task_fn = 'lambda: return []'
and inside reset(), self.tasks = make_task_fn()
is called. Am I right?
@jsuarez5341 how would the puffer like to handle this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got the part that we don't need to pass in args for make_task_fn, so removed it.
And, I agree that new_tasks and make_task_fn are redundant. Well, ok ... it's pretty easy to put new_tasks in, so I'll go ahead only with make_task_fn.
nmmo/core/env.py
Outdated
|
||
# Remove rewards for dead agents | ||
for agent_id in dones: | ||
rewards[agent_id] = -1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems wrong? dones only contains agents who died this turn. seems fine for them to get a reward. but agents that have already been dead should not be rewarded after death, right?
or... maybe they should be, if their task is accomplished after they died
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see you are tinkering with this part too. I don't have any strong idea, so I'll just take it out.
nmmo/core/realm.py
Outdated
@@ -104,6 +101,10 @@ def reset(self, map_id: int = None): | |||
Item.INSTANCE_ID = 0 | |||
self.items = {} | |||
|
|||
if self._replay_helper is not None: | |||
self._replay_helper.reset() | |||
self._replay_helper.update() # capture the initial packet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we just call update() in reset?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
separated Predicate (
predicate_api.py
) and Task (task_api.py
)refactored task api, allow two methods to provide tasks to the env
env.reset(new_tasks
= a list of task instances)env.reset(teams: Dict[int,List[int]], task_spec
= a list of tasks to instantiate inside the env)all the tests are using the new task api
merged the team helpers into
lib/team_helper.py
added event number and gold constraints to predicate
fixed
IMMORTAL
bug (mainly for performance testing)fixed
test_render_save.py
, using the new replay helper