Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

callback trainer enhancements #3269

Closed
mikerossgithub opened this issue Sep 20, 2019 · 6 comments
Closed

callback trainer enhancements #3269

mikerossgithub opened this issue Sep 20, 2019 · 6 comments

Comments

@mikerossgithub
Copy link
Contributor

mikerossgithub commented Sep 20, 2019

As discussed in #3230
One pain point when implementing custom callbacks is choosing the right priority level when registering functions. You have to review the priorities of all the other callbacks and figure out an appropriate integer, which is then hardcoded into your function definition.

@handle_event(Events.BATCH_END, priority=-50)
def my_custom_batchend_func(self, trainer: 'CallbackTrainer'):
	...

Two proposals to make this easier to use:

Proposal 1: Register requirement lists instead of priority levels

@handle_event(Events.BATCH_END, 
              run_after=["log_to_tensorboard.batch_end_logging", "callback_X.func_X", "callback_Y"],
              run_before=["callback_Z.func_Z"])
def my_custom_batchend_func(self, trainer: 'CallbackTrainer'):
	...

The idea here is that the CallbackTrainer sorts the list of callbacks for each event, respecting the run_after and run_before requirements. Requirements can be listed with either a specific function ("log_to_tensorboard.batch_end_logging") or the name of a callback for all relevent events in that callback ("callback_Y" or perhaps "callback_Y.*")

Proposal 2: Allow overriding the requirement lists in the init.

A second pain point is that the callback ordering is hardcoded and cant be overridden in the config. This can be fixed by adding overrides parameters to
the parent Callback class init:

class Callback:

  def __init__(..., run_after: Dict[str, List[str]=None], run_before: Dict[str, List[str]=None]):
  """
  run_after is a dictionary from function names to requirement list overrides.
  """
      if run_after is not None:
         CallbackRegistry.reregister_run_afters(self, run_after)
      if run_before is not None:
         CallbackRegistry.reregister_run_befores(self, run_before)

One could then override in the config like this:

callbacks: [
   "checkpoint",
   {"type": "log_to_tensorboard",  # override default requirements
    "run_after": {"batch_end_logging": ["some_callback.some_func"]}  
   }
]
@joelgrus
Copy link
Contributor

joelgrus commented Oct 3, 2019

so I'm not a huge fan of the run_after / run_before, for me it makes things a lot more complicated in that now instead of having a rough mental model of priorities I have to keep track of a DAG in my head.

I do like the idea of allowing overrides in the constructor, I would probably just do

priority_overrides: Dict[str, int] = None and then override them that way.

thoughts?

@mikerossgithub
Copy link
Contributor Author

mikerossgithub commented Oct 4, 2019

If we remove "run_before", I think I can make a better case.

I'm seeing the Callback framework as essentially a whiteboard design pattern using the trainer attributes as the whiteboard. The tricky thing about these patterns is making sure that the variables you need are up-to-date with the current iteration (batch or epoch). It's easy to accidentally use a value from the previous iteration which can muck up your calculations.

The advantage of "run_after" is that it's just a requirements list for the callbacks that produce the variables which you need. So it feels to me like a standard coding paradigm (like package requirements, or import statements).

Also, I think you can do this without changing the underlying code for priority values.
run_after = [a,b,c]
just translates to:
self.priority = max([x.priority for x in [a,b,c]]) + 1
But the former seems a lot cleaner to me, even if it still uses priority integers under the hood.

If that is not convincing to you, I would say that just allowing priority overrides in the config would be helpful

@joelgrus
Copy link
Contributor

joelgrus commented Oct 4, 2019

the thing I still don't like about that is that it requires each callback to "know" which other callbacks are going to be run, and also it means that if you create your own callbacks (which is expected) then you potentially have to override every other run_after to get the sequencing correct.

I agree about the dangers of accidentally using a stale value, but I'm not sure that switching from priority to run_after actually solves that issue. instead you'd almost want a cleanup callback that wipes out the relevant values at the end of the batch / epoch so that you can't misuse them.

@mikerossgithub
Copy link
Contributor Author

mikerossgithub commented Oct 4, 2019

@mikerossgithub
Copy link
Contributor Author

mikerossgithub commented Nov 10, 2019

See also #3420 which could also be framed as a need to specify requirements.

@github-actions
Copy link

github-actions bot commented Aug 19, 2020

This issue is being closed due to lack of activity. If you think it still needs to be addressed, please comment on this thread 👇

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants