too much magic - what would it take to have an object oriented interface? #193
Comments
I think your request would be clearer if you elaborated on what you see as being magical. Aside from captured functions, I don't see much magic in Sacred (though it's still quite neat). When you talk about an object-oriented interface and inheritance, I think you may be referring to the fact that Sacred takes the design approach that an experiment is an object rather than a class. An approach that may be more familiar to some OO programmers is to define a new experiment by deriving a new class from Experiment and overriding a run method, or something like that. I'm not sure if that would be a better approach, though. There would be more boilerplate, for one. Also, there is some theoretical argument for experiments being objects: if you make experiments classes, you'd only ever instantiate each one once, so you're basically just defining a bunch of singleton classes. It may be possible to support both, though. I guess we'll see what @Qwlouse has to say. |
Hi @rueberger, hi @nimrand.
Ok. To sum up: I don't know a good solution to these issues yet. But I'd be very interested if these are the same "too much magic" issues that you had in mind, and to hear your thoughts on my suggestions. |
I'd like to add some comments here. I've "sacredized" my entire PyTorch experiment workflow (in Jupyter notebooks), but then hit upon 2 problems:
|
Hi @pchalasani. Thanks for the feedback! Your second point is important and already on my list of things to fix (see #171). You are right, there is no reason for not allowing this. It is just an oversight. Considering your first point: I understand, and that is indeed a drawback. So far I wasn't able to come up with a way of getting the benefits of config injection without this hurdle. I really want to avoid having a global config variable, and passing everything around manually also seems like a bad deal. |
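To make the "config injection" being discussed concrete, here is a minimal sketch of how a capture-style decorator can fill in missing arguments from a config dictionary. It is a simplified illustration written for this thread, not Sacred's actual implementation; the `capture` helper, the `config` dictionary, and the `train` function are all hypothetical names.

```python
import functools
import inspect


def capture(config):
    """Hypothetical decorator: fill missing arguments from `config`."""
    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Find out which parameters the caller supplied explicitly.
            bound = signature.bind_partial(*args, **kwargs)
            for name in signature.parameters:
                # Inject a config value only when the caller gave nothing.
                if name not in bound.arguments and name in config:
                    kwargs[name] = config[name]
            return func(*args, **kwargs)
        return wrapper
    return decorator


config = {"learning_rate": 0.01, "epochs": 3}


@capture(config)
def train(model_name, learning_rate, epochs=1):
    return "{}: lr={}, epochs={}".format(model_name, learning_rate, epochs)


injected = train("net")              # both values come from config
overridden = train("net", epochs=10)  # an explicit argument wins
```

The convenience and the "magic" come from the same place: the call site `train("net")` no longer shows where `learning_rate` comes from.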
Hey everyone! Great work so far. I can see everyone's sides to this issue, but my style leans more towards that of @rueberger. That being said, I do think that it's likely possible to wrap some light OO around what already exists. A simple example would be:
I think the way to make this more powerful while minimizing (and potentially reducing or improving) boilerplate is via metaclasses like Django, Luigi, etc. Here's how Django uses metaclasses to populate a DB with instance values that are later created, similar to Sacred:
The parallel is that the Question class attributes are the config of the experiment. In my opinion, this is much more natural than writing a decorated config function, and also more explicit. Yes, you have to create a class, but all it does is inherit from an interface that makes its role explicit. In terms of lines of code, I think it'll end up being equal or less, but it does swap one form of magic (DI) for another (metaclasses). I could see an experiment looking something like:
You could likely get the @staticmethod and @ex.automain decorators into ExperimentModel.run(). Sorry I haven't had time to do all that work, but if you think it's promising and haven't tried the same path only to hit a blocker, let me know and I'll run with it. I really want the same OO aspects as @rueberger for a lot of reasons, so I'll likely give this a go on a fork regardless. Thanks again for the great work! |
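The metaclass idea above can be sketched in plain Python. This is only an illustration of the technique under stated assumptions: `ConfigMeta`, the `_config` attribute, and this `ExperimentModel` are hypothetical and do not exist in Sacred. Public, non-callable class attributes are collected as the default configuration, and keyword arguments at instantiation act as config updates.

```python
class ConfigMeta(type):
    """Hypothetical metaclass: collect public class attributes as config."""

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        config = {}
        # Walk the MRO from base to subclass so subclasses override defaults.
        for base in reversed(cls.__mro__):
            for key, value in vars(base).items():
                if key.startswith("_") or callable(value):
                    continue
                if isinstance(value, (staticmethod, classmethod, property)):
                    continue
                config[key] = value
        cls._config = config
        return cls


class ExperimentModel(metaclass=ConfigMeta):
    """Hypothetical base class; not part of Sacred."""

    def __init__(self, **config_updates):
        unknown = set(config_updates) - set(self._config)
        if unknown:
            raise KeyError("unknown config entries: {}".format(sorted(unknown)))
        # Per-instance config: class defaults plus the requested updates.
        self.config = {**self._config, **config_updates}

    def run(self):
        raise NotImplementedError


class MyExperiment(ExperimentModel):
    learning_rate = 0.01
    epochs = 3

    def run(self):
        return "lr={} epochs={}".format(
            self.config["learning_rate"], self.config["epochs"])


default_result = MyExperiment().run()
updated_result = MyExperiment(epochs=10).run()
```

Rejecting unknown keys is a design choice made for this sketch; it catches typos in config updates early.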
I agree with you (@trickmeyer and @rueberger) that an OO interface would be nice. Let's brainstorm on what that could look like.

Like you suggested, I think the basis should be an experiment base class that each individual experiment inherits from. Let's stick with ExperimentModel. The user-provided subclass would take the role of the current experiment object instance, by collecting functions, configurations, and some other settings. The main method (or commands in general) would be methods, and configuration entries could be accessible through self.

But departing from your suggestion, I would have instantiations of that class take the role of the current run objects, by holding the (modified) configuration and settings, being executable, capturing stdout, and firing events for the observers. That means config updates should be evaluated during instantiation of the class. That would suggest putting the configuration code into the __init__ method.

So to a first approximation, it might look like this:

from sacred.oo import ExperimentModel, main

class MyExperiment(ExperimentModel):
    def __init__(self, command=None, config_updates=None, named_configs=None, options=None):
        super().__init__(command, config_updates, named_configs, options)
        self.single = 1
        self.double = self.single * 2

    @main  # serves as a marker for the entry point
    def mymain(self):
        print("double of {} is {}".format(self.single, self.double))

if __name__ == '__main__':
    run = MyExperiment(config_updates={'single': 5})
    run()  # calls run.mymain() with stdout capturing and event firing

One potential problem with this is that the namespace of this object is used for many different things: the methods of the experiment, the methods of the run, the configuration, and the custom methods. But other than that I quite like it. Integrating this would take quite some effort, but is certainly doable. What do you guys think? Is this what you are looking for? |
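Since `sacred.oo` is only a proposal in this thread, the draft above cannot be run as written. The following self-contained sketch shows one way such a base class could behave. Everything in it is an assumption made for illustration: the `main` marker, this `ExperimentModel`, and the trick of intercepting attribute assignment so that a config update takes precedence over the default set in `__init__`.

```python
import contextlib
import io


def main(func):
    """Hypothetical marker for the entry-point method."""
    func._is_main = True
    return func


class ExperimentModel:
    """Hypothetical base class; `sacred.oo` does not exist."""

    def __init__(self, config_updates=None):
        # Store bookkeeping state without going through our __setattr__.
        object.__setattr__(self, "_updates", dict(config_updates or {}))
        object.__setattr__(self, "captured_out", "")

    def __setattr__(self, name, value):
        # A config update wins over the default assigned in the subclass.
        value = self._updates.get(name, value)
        object.__setattr__(self, name, value)

    def __call__(self):
        # Look up the method marked with @main and run it with stdout captured.
        for name in dir(type(self)):
            member = getattr(type(self), name, None)
            if getattr(member, "_is_main", False):
                buffer = io.StringIO()
                with contextlib.redirect_stdout(buffer):
                    result = member(self)
                object.__setattr__(self, "captured_out", buffer.getvalue())
                return result
        raise RuntimeError("no method marked with @main")


class MyExperiment(ExperimentModel):
    def __init__(self, config_updates=None):
        super().__init__(config_updates)
        self.single = 1
        self.double = self.single * 2  # sees the updated value of single

    @main
    def mymain(self):
        print("double of {} is {}".format(self.single, self.double))
        return self.double


run = MyExperiment(config_updates={"single": 5})
result = run()
```

Because the update is applied at assignment time, dependent entries such as `double` are computed from the updated value, which is similar in spirit to how config updates propagate in the draft.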
Thanks all for the great discussion!

@nimrand @Qwlouse in retrospect, yes, I think that the config scopes are to me the most magical part of sacred. I was thrown off in the beginning by experiments being objects, but I've grown to appreciate that design choice. I think it makes sense for the current approach of defining an experiment in a single module. However, my intuition for how to use the experiment object breaks down for larger projects where one would want to keep different components in separate modules. It's this use case that I think an OO interface would be useful for.

In the context of @Qwlouse being unhappy about the design of ingredients, I wonder if it would make sense to use the current interface as a 'light' interface to sacred without support for ingredients, and use the OO interface as the full interface, including ingredients. My thought is that ingredients seem to map naturally onto an OO structure.

Aside: recently I've been looking into options for building reproducible data pipelines. Sacred has enabled me to keep the model half of my projects on lockdown, but the data half is still the wild west. pachyderm is the closest thing I've found to what I'm looking for, but seems rather more complicated than is warranted. All I want is a linear pipeline that uses a hash list of configs to enforce reproducibility and lazily builds data. Here's a draft of the base class skeleton. I mention this because it's right up sacred's alley, and if ingredients are getting redesigned anyway... This should probably be its own issue though.

@Qwlouse, that looks good to me. If namespace pollution is an issue, why not store it under Pragmatically, @Qwlouse, would you mind giving a rough overview of what the required changes would be? |
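The linear pipeline described in the aside above (a hash list of configs plus lazy building) could look roughly like the sketch below. It is not the skeleton linked in the comment, which is not reproduced here; `Stage`, `Pipeline`, and the in-memory `cache` are hypothetical names chosen for illustration.

```python
import hashlib
import json


class Stage:
    """One pipeline step: a name, a function, and its config."""

    def __init__(self, name, func, config):
        self.name = name
        self.func = func
        self.config = dict(config)


class Pipeline:
    """Hypothetical linear pipeline with config-hash based caching."""

    def __init__(self, stages):
        self.stages = list(stages)
        self.cache = {}       # hash -> built data
        self.build_count = 0  # number of stage executions so far

    def hash_list(self):
        # Each hash covers the stage's own config and its parent's hash,
        # so changing an early stage invalidates everything downstream.
        hashes, previous = [], ""
        for stage in self.stages:
            payload = json.dumps(
                {"name": stage.name, "config": stage.config, "parent": previous},
                sort_keys=True)
            previous = hashlib.sha256(payload.encode("utf-8")).hexdigest()
            hashes.append(previous)
        return hashes

    def build(self):
        data = None
        for stage, key in zip(self.stages, self.hash_list()):
            if key not in self.cache:  # build lazily, only when missing
                self.cache[key] = stage.func(data, **stage.config)
                self.build_count += 1
            data = self.cache[key]
        return data


def load(_previous, n):
    return list(range(n))


def scale(previous, factor):
    return [x * factor for x in previous]


pipeline = Pipeline([Stage("load", load, {"n": 4}),
                     Stage("scale", scale, {"factor": 2})])
first = pipeline.build()
second = pipeline.build()                 # served entirely from the cache
pipeline.stages[1].config["factor"] = 3   # only the last stage changes
third = pipeline.build()
```

A real version would presumably persist the cache to disk and record the hash list alongside each experiment run; that part is left out here.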
@Qwlouse I think your draft looks good. Let us know if we can contribute to help. |
I very much like this idea! Some comments:
For my perspective on what constitutes "magic", I found two aspects confusing: (1) the magic setting of variables, and (2) the way an I think making @Qwlouse in your example I'd really like if the calls could be done in a way that is totally obvious even to a novice who hasn't seen the docs. I think the invisible remapping of the call from

from sacred.oo import ExperimentModel, main

class MyExperiment(ExperimentModel):
    def __init__(self, command=None, config_updates=None, named_configs=None, options=None):
        super().__init__(command, config_updates, named_configs, options)
        self.single = 1
        self.double = self.single * 2

    @main  # serves as a marker for the entry point
    def mymain(self):
        print("double of {} is {}".format(self.single, self.double))

if __name__ == '__main__':
    # behavior that might be easier to understand and use:
    run = MyExperiment(config_updates={'single': 5})
    run.mymain()  # calls run.mymain() with stdout capturing and event firing

    # previous example code
    # run = MyExperiment(config_updates={'single': 5})
    # run()  # calls run.mymain() with stdout capturing and event firing

@rueberger my concern with the preprocessor skeleton is that it might not work well for certain use cases, like if you're loading datasets in TensorFlow via input tensors. |
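One way to get the explicit `run.mymain()` call suggested above, while still capturing stdout and recording events, is to let the `main` decorator wrap the method itself. The sketch below is an assumption-laden illustration: the `main` decorator, the `events` list standing in for observers, and `captured_out` are hypothetical and not part of Sacred.

```python
import contextlib
import functools
import io


def main(method):
    """Hypothetical decorator: capture stdout and record run events."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        self.events.append("started")
        buffer = io.StringIO()
        try:
            with contextlib.redirect_stdout(buffer):
                result = method(self, *args, **kwargs)
        except Exception:
            self.events.append("failed")
            raise
        finally:
            # Keep whatever was printed, even if the run failed.
            self.captured_out = buffer.getvalue()
        self.events.append("completed")
        return result
    return wrapper


class MyExperiment:
    def __init__(self, single=1):
        self.single = single
        self.double = single * 2
        self.events = []        # stand-in for observer notifications
        self.captured_out = ""

    @main
    def mymain(self):
        print("double of {} is {}".format(self.single, self.double))
        return self.double


run = MyExperiment(single=5)
value = run.mymain()  # an ordinary method call, nothing is remapped
```

The call site stays plain Python, and the extra behavior is visible at the definition through the decorator.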
I would like to bring the ipython traitlets project to your attention. They handle the "configuration part" using an object oriented approach based on traits. It could be a good project to base sacred on, or at least may be inspiration for the OO interface design. |
@leezu: I like traitlets. They don't quite fit my requirements for the config system, but I am thinking about using them (or something similar) for the --option=value part of sacred. @rueberger, @trickmeyer, @ahundt: Sorry for being unresponsive recently. Unfortunately my time for sacred development has been rather limited recently, so I've egoistically concentrated on fixes and features that I needed. But I have thought a lot about refactoring/redesigning sacred, and I hope to share these ideas with you guys soon. If you are still interested, should I maybe create a google-group or some other mailing-list for discussions? |
I'm happy to stay involved via any medium. Be egoistic! Gotta do it - no shame.
|
No problem! That's part of life. I actually prefer discussions within github issues, especially since github emails you like a mailing list anyway and makes it pretty easy to enable/disable subscription on a per-thread basis right on the website. I'd love to hear your thoughts! |
Hi folks, MWE: The
Notice that decorating the class with As it is the above code works but if I change the call to super from
Here's the main entry point of the experiment
Also I'd like to point out that it seems quite difficult to capture params in the |
Decorating the class does make its
Oh, dang. This is a gotcha I wasn't aware of. It seems that wrapping the class precludes it from being used within As for the parameter capture in |
@Qwlouse |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
unstale |
Lately I've really been straining against the limits of In particular, I've been setting up hyperparameter optimization for my experiment with the excellent You have to do something like this (note that the experiment is not imported until train_func is executed):

def train_func_factory(named_config):
    """Create a train func with the experiment bound in its scope.

    Workaround for calling main directly; experiments are not pickle-able.

    Args:
        named_config: name of the named config to apply (str)

    Returns:
        train_func
    """
    def train_func(config_updates, reporter):
        """Wrap the experiment into the format expected by ray.

        Args:
            config_updates: config updates passed on to the experiment
            reporter: reporter method passed by ray
        """
        # Imported lazily so the experiment is only created when train_func runs.
        from foo.experiment import foo_experiment
        info = {
            'reporter': reporter,
        }
        foo_experiment.run(
            config_updates=config_updates,
            named_configs=[named_config],
            info=info  # NOTE: see https://github.com/IDSIA/sacred/issues/480
        )
    return train_func

This is just a small annoyance, but would not be necessary with an 'OOP' interface to The real problem is that it's currently impossible to implement tune's class-based API with sacred - and you need the class-based API to take full advantage of tune (can't do checkpointing without it).

One reason that it is currently impossible to implement tune's class-based API is that you need to be able to stop/resume experiments. Another even thornier reason is that we need to be able to support config mutating over time to be able to use PBT. And branching for experiments too? PBT is a nightmare for the abstraction, but so powerful we should seriously think about how it could conceivably be supported.

I've been using So when I am faced with such a drastic departure from the current abstraction, as is required for PBT, the temptation to rip out the complexity I don't use in config to prepare Looking forward to hearing your thoughts. |
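The three requirements named above (stop/resume, config that mutates over time, and branching) can be illustrated with a small plain-Python sketch. This is neither Ray Tune's class-based API nor anything Sacred provides; `ResumableRun` and its `step`, `save`, `restore`, and `mutate` methods are hypothetical names used only to show the shape of the problem.

```python
import copy


class ResumableRun:
    """Hypothetical run object that can be checkpointed and branched."""

    def __init__(self, config):
        self.config = dict(config)
        self.iteration = 0
        self.history = []  # (iteration, config snapshot) pairs

    def step(self):
        # Stand-in for one training iteration; records the config in effect.
        self.iteration += 1
        self.history.append((self.iteration, dict(self.config)))
        return {"iteration": self.iteration, "lr": self.config["lr"]}

    def mutate(self, **updates):
        # Config changes mid-run, as PBT-style schedulers require.
        self.config.update(updates)

    def save(self):
        # Deep copy so later changes to the run do not alter the checkpoint.
        return copy.deepcopy({"config": self.config,
                              "iteration": self.iteration,
                              "history": self.history})

    @classmethod
    def restore(cls, checkpoint):
        state = copy.deepcopy(checkpoint)
        run = cls(state["config"])
        run.iteration = state["iteration"]
        run.history = state["history"]
        return run


parent = ResumableRun({"lr": 0.1})
parent.step()
parent.step()
checkpoint = parent.save()

# Branch: resume from the parent's checkpoint with a mutated config.
child = ResumableRun.restore(checkpoint)
child.mutate(lr=0.05)
child.step()
```

Keeping a per-iteration config history is one possible answer to the question of how a run with a changing config stays reproducible; an observer would need to store that history rather than a single config.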
@rueberger : I hear you and fully agree with your assessment. Going forward, Sacred should be modularized much better, such that all the individual features such as config management, commandline interface, observers, etc. can be easily used in isolation and swapped on demand. I love the idea of PBT, and figuring out a good way to support resuming and branching would be extremely valuable. That being said, I currently have to focus on writing up my thesis and unfortunately do not have the bandwidth to tackle these issues. I'd be happy to share my thoughts and support any efforts in that direction, but designing and implementing these are large endeavors that require more time and effort than I am currently willing to invest. |
We're looking for feedback: here is the proposal to have a Config object #623 Comments are welcome! |
I'm also a big fan of ray tune, so @rueberger's suggestions will be very helpful indeed. |
Magic Johnson |
Hi @rueberger, I've just released https://github.com/machinable-org/machinable that has a similar feature list as sacred but uses a fairly different object-oriented API. The configuration is specified in a project-wide configuration file while functionality is implemented in subclasses that can overwrite events like 'on_create' etc. The documentation provides a short overview here. Perhaps that suits your needs? |
@flukeskywalker machinable also integrates with Ray and Ray Tune, see here and here. |
Hello sacred authors! Great work - I'm just starting to embrace sacred and am feeling very excited about the potential.
So far I've been finding it just a bit too magical for my tastes though. I think that the magic has its place in reducing the boilerplate to almost zero for a quick experiment - but I've been finding it to be somewhat of an activation barrier in getting started for someone not familiar with the library.
The issue is that sacred breaks completely out of the standard python programming paradigm, which makes it difficult to reason about code behavior.
Again, I do think that this has its place for quick scripting - but for production workflows that prioritize extensibility and maintainability over line count, I think an object-oriented interface would be more appropriate.
My reasoning is that:
Any estimates how challenging this would be? Recommendations on where to start?