WIP: Error messages #348

Closed
wants to merge 6 commits into from

Conversation

thequilo
Collaborator

This is a WIP pull request that addresses the error messages described in #239. I finally found time to prepare it.

The code is currently very ugly and is better seen as a proof of concept that these things can work.

Below are lists of what is already done and what still needs to be done.
Suggestions for further improving the error messages (or the code, since parts of it are poorly structured and untested) are welcome.

Done

Use iterate_ingredients for gathering commands and named configs

This causes gather_commands and gather_named_configs to raise a CircularDependencyError instead of a RecursionError, which makes the cause of the error much clearer.
In addition, any gather_something function that may be implemented in the future only needs to override one method: the error handling is done in iterate_ingredients, and the path filtering for experiments is done there as well.

Track Ingredients that cause circular dependencies

The CircularDependencyError is caught in iterate_ingredients, and the current ingredient is appended to the list CircularDependencyError.__ingredients__ to keep track of which ingredients caused the circular dependency.

An example error:

Traceback (most recent call last):
  File "error_messages.py", line 24, in <module>
    @ex.automain
  File ".../sacred/experiment.py", line 141, in automain
    self.run_commandline()
  File ".../sacred/experiment.py", line 248, in run_commandline
    short_usage, usage, internal_usage = self.get_usage()
  File ".../sacred/experiment.py", line 173, in get_usage
    commands = OrderedDict(self.gather_commands())
  File ".../sacred/experiment.py", line 394, in _gather
    for ingredient, _ in self.traverse_ingredients():
  File ".../sacred/ingredient.py", line 370, in traverse_ingredients
    raise e
  File ".../sacred/ingredient.py", line 363, in traverse_ingredients
    for ingred, depth in ingredient.traverse_ingredients():
  File ".../sacred/ingredient.py", line 370, in traverse_ingredients
    raise e
  File ".../sacred/ingredient.py", line 363, in traverse_ingredients
    for ingred, depth in ingredient.traverse_ingredients():
  File ".../sacred/ingredient.py", line 370, in traverse_ingredients
    raise e
  File ".../sacred/ingredient.py", line 363, in traverse_ingredients
    for ingred, depth in ingredient.traverse_ingredients():
  File ".../sacred/ingredient.py", line 357, in traverse_ingredients
    raise CircularDependencyError(ingredients=[self])
sacred.exception.CircularDependencyError: ing->ing2->ing
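
The tracking described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the simplified Ingredient class, its _is_traversing flag, and the message formatting are assumptions made for the example. Only the names CircularDependencyError, traverse_ingredients and __ingredients__ are taken from the description and the traceback.

```python
class CircularDependencyError(Exception):
    """Carries the ingredients that were being traversed when the cycle hit."""

    def __init__(self, ingredients=None):
        super().__init__()
        self.__ingredients__ = list(ingredients or [])

    def __str__(self):
        # Ingredients are appended innermost-first, so reverse for display.
        return "->".join(ing.path for ing in reversed(self.__ingredients__))


class Ingredient:
    """Simplified stand-in: just a path and a list of sub-ingredients."""

    def __init__(self, path, ingredients=()):
        self.path = path
        self.ingredients = list(ingredients)
        self._is_traversing = False  # assumed re-entrancy marker

    def traverse_ingredients(self):
        if self._is_traversing:
            # We reached ourselves again while still being traversed.
            raise CircularDependencyError(ingredients=[self])
        self._is_traversing = True
        try:
            yield self, 0
            for ingredient in self.ingredients:
                for ingred, depth in ingredient.traverse_ingredients():
                    yield ingred, depth + 1
        except CircularDependencyError as e:
            # Record this ingredient on the way out, then re-raise.
            e.__ingredients__.append(self)
            raise
        finally:
            self._is_traversing = False


ing = Ingredient("ing")
ing2 = Ingredient("ing2", [ing])
ing.ingredients.append(ing2)  # close the cycle: ing -> ing2 -> ing

try:
    list(ing.traverse_ingredients())
except CircularDependencyError as error:
    message = str(error)
    print(message)  # ing->ing2->ing
```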

Track sources of configuration entries

This code is still very ugly, but it makes it possible to track the sources of configuration values.
The tracking works at different resolutions:

  • for a ConfigScope, we can find the wrapped function and get the place of definition of this function (file + line of the signature line)
  • for a configuration file we can find the file that defines the configuration values. It would be very difficult to get the line of the config value inside of the file.
  • for a dict config, we can use inspect.stack to find the line in which the dict configuration value was added.
  • for configuration defined in the command line, we can say that it was defined in the command line options

See the InvalidConfigError for examples.
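
As a rough sketch of two of the resolutions listed above (function definition site and call site of a dict config), the standard inspect module is enough. The helper names function_source and caller_source and the ConfigRegistry class are hypothetical and exist only for this example; they are not part of sacred.

```python
import inspect


def function_source(func):
    """Return "file:line" as recorded on the function's code object."""
    code = func.__code__
    return "{}:{}".format(code.co_filename, code.co_firstlineno)


def caller_source(depth=1):
    """Return "file:line" of a calling frame.

    depth=1 is the function that called caller_source, depth=2 is that
    function's caller, and so on.
    """
    frame_info = inspect.stack()[depth]
    return "{}:{}".format(frame_info.filename, frame_info.lineno)


class ConfigRegistry:
    """Hypothetical stand-in for an object that collects dict configs."""

    def __init__(self):
        self.values = {}
        self.sources = {}

    def add_config(self, **entries):
        # Stack index 0 is caller_source, 1 is add_config,
        # 2 is whoever called add_config.
        origin = caller_source(depth=2)
        for key, value in entries.items():
            self.values[key] = value
            self.sources[key] = origin


def command_line_source(update):
    """Describe an update such as "config1=abcde" given on the command line."""
    return 'command line config "{}"'.format(update)
```

Walking the stack with inspect.stack is comparatively slow, which is one reason to make source tracking optional, as the TODO list further down suggests.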

Add a base class SacredError for future exceptions that is pretty-printed in experiment.run_commandline

The init definition looks like this:

def __init__(self, *args, print_traceback=True,
             filter_traceback=None, print_usage=False):
    # ...

It provides the following additional arguments (that are handled in experiment.run_commandline):

  • print_traceback: if True, traceback is printed according to filter_traceback. If False, no traceback is printed (except for the Exception itself)
  • filter_traceback: If True, the traceback is filtered (WITHOUT sacred internals), if False, it is not filtered and if None, it falls back to the previous behaviour (filter if not raised within sacred)
  • print_usage: The short usage is printed when this is set to True.
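
To make the intent of these arguments concrete, here is a minimal sketch of how a caller such as run_commandline might act on them. The format_error helper is hypothetical, and the sketch only stores filter_traceback without implementing the filtering of sacred internals.

```python
import traceback


class SacredError(Exception):
    def __init__(self, *args, print_traceback=True,
                 filter_traceback=None, print_usage=False):
        super().__init__(*args)
        self.print_traceback = print_traceback
        self.filter_traceback = filter_traceback  # stored, not applied here
        self.print_usage = print_usage


def format_error(error, short_usage):
    """Build the text to print for an exception (hypothetical helper)."""
    parts = []
    if getattr(error, "print_usage", False):
        parts.append(short_usage + "\n")
    if getattr(error, "print_traceback", True):
        # Full traceback, followed by the exception line.
        parts.extend(traceback.format_exception(
            type(error), error, error.__traceback__))
    else:
        # Only the final "ExceptionType: message" line.
        parts.extend(traceback.format_exception_only(type(error), error))
    return "".join(parts)
```

Using getattr with defaults lets the same helper handle exceptions that do not derive from SacredError, which then fall back to a full traceback and no usage text.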

Add an InvalidConfigError that can be raised in user code

Added an InvalidConfigError that prints the conflicting configuration values.

Example:

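from sacred import Experiment
# Module path as shown in the traceback below (this PR's branch).
from sacred.exception import InvalidConfigError

ex = Experiment()

@ex.config
def config():
    config1 = 123
    config2 = dict(a=234)

@ex.automain
def main(config1, config2):
    # The two entries are expected to share a type.
    if type(config1) != type(config2['a']):
        raise InvalidConfigError('Must have same type', conflicting_configs=('config1', 'config2.a'))

$ python error_messages.py with config1=abcde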

WARNING - root - Changed type of config entry "config1" from int to str
WARNING - error_messages - No observers have been added to this run
INFO - error_messages - Running command 'main'
INFO - error_messages - Started
ERROR - error_messages - Failed after 0:00:00!
Traceback (most recent calls WITHOUT Sacred internals):
  File ".../wrapt/wrappers.py", line 523, in __call__
    args, kwargs)
  File "error_messages.py", line 27, in main
    raise InvalidConfigError('Must have same type', conflicting_configs=('config1', 'config2.a'))
sacred.exception.InvalidConfigError: Must have same type
Conflicting configuration values:
  config1=abcde
    defined in command line config "config1=abcde"
  config2.a=234
    defined in "error_messages.py:20"

MissingConfigError

Prints missing configuration values. Prints the filtered stack trace by default, so that the function call that is missing values can be found.
It also prints the name of the ingredient that captured the function and the file in which the captured function is defined.

Example error:

Traceback (most recent calls WITHOUT Sacred internals):
  File ".../wrapt/wrappers.py", line 523, in __call__
    args, kwargs)
sacred.exception.MissingConfigError: main is missing value(s) for ['config3']
Function that caused the exception: <function main at 0x0F7A0780> captured by the experiment "error_messages" at "error_messages.py:24"

NamedConfigNotFoundError

Raise a NamedConfigNotFoundError instead of a KeyError, and don't print the traceback.

TODO

  • print list of available named configs
  • give suggestions based on Levenshtein distance

ConfigAddedError

Raise a ConfigAddedError when a config value is added that is not used anywhere. This is a subclass of ConfigError and prints the source where the new configuration value is defined:

Traceback (most recent call last):
sacred.utils.ConfigAddedError: Added new config entry "unused" that is not used anywhere
Conflicting configuration values:
  unused=3
    defined in command line config "unused=3"
Did you mean "config1" instead of "unused"

TODO

  • print suggestions based on Levenshtein distance

TODO

  • print suggestions for ConfigAddedError
  • (colored exception output?)
  • make source tracking optional in SETTINGS
  • improve resolution of source tracking (line of config file, line in a config scope maybe using inspect.stack)
  • CommandNotFoundError (?)
  • Error when parameter is not present for config scope
  • tests

@thequilo
Collaborator Author

Oh, I just noticed that I used some 3.6 syntax. That must be removed.

@Qwlouse
Collaborator

Qwlouse commented Aug 30, 2018

Hey! This looks amazing. Thanks a lot for all the effort!
I didn't have the time yet to properly go through the code, but I'll try to do so over the weekend.

For the "did you mean" suggestions: difflib is part of the python standard library and provides a function to get close matches.
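
A minimal sketch of that idea, using difflib.get_close_matches from the standard library. The helper names suggest and did_you_mean are made up for this example. Note that difflib ranks candidates by its SequenceMatcher ratio rather than by Levenshtein distance, so the ordering can differ slightly from a Levenshtein-based approach.

```python
from difflib import get_close_matches


def suggest(name, candidates, max_suggestions=3):
    """Return up to max_suggestions candidates that resemble name."""
    return get_close_matches(name, candidates, n=max_suggestions, cutoff=0.6)


def did_you_mean(name, candidates):
    """Return a hint line, or an empty string if nothing is close enough."""
    matches = suggest(name, candidates)
    if not matches:
        return ""
    return 'Did you mean "{}" instead of "{}"?'.format(matches[0], name)


print(did_you_mean("confg1", ["config1", "config2", "seed"]))
```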

    for ingred in self.ingredients:
        for cmd_name, cmd in ingred.gather_commands():
            yield cmd_name, cmd
def _gather(self, func):
Collaborator

I like the idea of a generic gather function a lot. This removes some of the code duplication and as you said, it raises the more descriptive CircularDependencyError.

Some minor aesthetics: the usage of this function with the inline lambda expressions is a bit unwieldy. How about using it as a decorator? Something like this:

@self._gather_from_ingredients
@staticmethod
def gather_commands(ingredient):
    for command_name, command in ingredient.commands.items():
        yield join_paths(ingredient.path, command_name), command

Not entirely sure if that works, because of the interaction with @staticmethod, but it feels cleaner and more readable. Alternatively having a separate get_commands method that is used inside the gather_commands method instead of the lambda expression could be good too.

Collaborator Author

@thequilo thequilo Sep 18, 2018

I like the idea of the decorator. self is unfortunately not defined outside of the function. We can define a decorating function outside of the class body:

def gather_from_ingredients(f):
    def wrapped(self):
        for ingredient, _ in self.traverse_ingredients():
            for item in f(ingredient):
                yield item
    return wrapped


class Ingredient(object):
    # ...
    @gather_from_ingredients
    def gather_commands(ingredient):
        for command_name, command in ingredient.commands.items():
            yield join_paths(ingredient.path, command_name), command

for ingredient, _ in self.traverse_ingredients():
    for name, item in func(ingredient):
        if ingredient == self:
            name = name[len(self.path) + 1:]
Collaborator

Not sure I understand: what is the purpose of this if clause?

Collaborator Author

The command names of the experiment itself should not be prefixed with the experiment name. traverse_ingredients returns all ingredients, and the gathering function returns the full names. I kept the previous behavior by stripping the experiment's own path prefix from the names of its commands.

    for ingredient in self.ingredients:
        for ingred, depth in ingredient.traverse_ingredients():
            yield ingred, depth + 1
except CircularDependencyError as e:
Collaborator

You use this pattern a lot. Basically:

try:
   #  stuff
except SomeCustomError as e:
  # special handling (i.e. adding information) of that error

I think it might be nicer to provide a context manager to provide that special handling without cluttering the code too much. It could even be part of the exception itself. Something like:

with CircularDependencyError.track(self):  # not sure about the name
    # stuff

Shouldn't be hard to implement and would improve readability IMHO.
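
One possible shape for such a context manager, as a sketch only: it combines classmethod with contextlib.contextmanager so that it can live on the exception class itself. The ingredients attribute and the track name follow the suggestion above and are otherwise assumptions.

```python
from contextlib import contextmanager


class CircularDependencyError(Exception):
    def __init__(self, *args, ingredients=None):
        super().__init__(*args)
        self.ingredients = list(ingredients or [])

    @classmethod
    @contextmanager
    def track(cls, ingredient):
        """Append ingredient to any error of this type passing through."""
        try:
            yield
        except cls as e:
            e.ingredients.append(ingredient)
            raise


# Usage: nested blocks record themselves innermost-first.
try:
    with CircularDependencyError.track("outer"):
        with CircularDependencyError.track("inner"):
            raise CircularDependencyError(ingredients=["start"])
except CircularDependencyError as error:
    recorded = error.ingredients
    print(recorded)  # ['start', 'inner', 'outer']
```

Other exception types pass through the with block untouched, so the special handling stays confined to the one error class.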

Collaborator Author

I already thought about doing this. This should be easy to implement and improve readability a lot.

@@ -0,0 +1,223 @@
import inspect
Collaborator

Having a separate exception.py file is a good idea. But I think you forgot to remove the Exceptions from utils.py ;-)

Collaborator Author

Oh, I didn't mean to push exception.py. At the moment, the exceptions depend on utils.py, and moving the exceptions without an import in utils.py would be incompatible with the current version.


from sacred.config.config_sources import ConfigSource

if colored_exception_output:
Collaborator

@Qwlouse Qwlouse Sep 16, 2018

This variable is not defined anywhere. You import it from utils later, but also there it is not defined.
Also, you probably should import BLUE, GREEN, etc. in either case. ;-)

Should this maybe be in settings? Maybe alongside the colors? Or did you plan on auto-detecting if the console supports color? In that case it might be worth it to have a separate colored_output file that takes care of that logic and defines the colors. Aren't there some good libraries to handle this?

    raise
except Exception as e:
    if not self.current_run or not self.current_run.debug \
            or not self.current_run.pdb:
Collaborator

Confusing if-statement. Are you sure this is what you want? The only case in which this will not execute is if there is a current_run and debug is true and pdb is true.
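
A quick way to check that reading is to enumerate all combinations. In this sketch the three booleans stand in for current_run, current_run.debug and current_run.pdb, and the function name enters_branch is made up for the example.

```python
from itertools import product


def enters_branch(has_run, debug, pdb):
    # Same shape as the condition under review:
    # not current_run or not current_run.debug or not current_run.pdb
    return not has_run or not debug or not pdb


# By De Morgan's law this equals: not (has_run and debug and pdb)
skipped = [
    combo for combo in product([False, True], repeat=3)
    if not enters_branch(*combo)
]
print(skipped)  # [(True, True, True)]
```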

set_by_dotted_path(config_updates,
                   join_paths(scaff.path, ncfg_key),
                   value)

distribute_config_updates(prefixes, scaffolding, config_updates)

distribute_config_sources(prefixes, scaffolding, config_sources)

Collaborator

Wow, this piece of code got even more complicated. If this works as intended, I am impressed. Even before your additions I was having difficulties mentally keeping track of what happens. This is not good (but not your fault). I think a serious refactoring might be needed here.
If I understand correctly, you are capturing where a particular config entry originated from and passing this information around. Maybe there is a way to somehow encapsulate this into the config entries inside the configuration process. I'll have to think about it. (This is just me thinking out loud, not an instruction to you).

Collaborator Author

This piece of code is very ugly and I'm not sure if it works in all cases. It worked for all scenarios I tested, but it may fail for others. I think I should rethink and rewrite this part, including tests.

Collaborator Author

One (again very rough) idea to make it more structured and clear (I may be missing some important points and maybe it is not possible like this):

We could add these configuration origins to the ConfigSummary class and use the ConfigSummary throughout the whole configuration process, and not just at the end to track the changes. This would e.g. simplify the interface of chain_evaluate_config_scopes to just return a ConfigSummary and we could combine this with the config updates using a simple ConfigSummary.update_from (or similar). The config sources for config scopes could then be added inside the config scopes, and the config scope's __call__ would also return a ConfigSummary. We could store everything hierarchically (so, every sub-dict would be a ConfigSummary, same for lists) and create an empty structure based on the ingredient hierarchy before starting the configuration process to get rid of these calls to distribute_config_updates, because we could directly write to the sub-ConfigSummary of the scaffolding using recursive_update or set_by_dotted_path.

@Qwlouse
Collaborator

Qwlouse commented Sep 16, 2018

Hi,
I just had some time to look over your PR. As you said, it is a bit rough, but I like where this is going. Many very nice improvements for the error reporting. This has the potential to significantly improve the user-experience. So again: thanks a lot for doing this!

I left a few thoughts in the code, but let me also comment on a high level:

  1. iterate_ingredients: very nice!
  2. circular dependency errors: very nice!
  3. sources of config entries: Very useful feature! But the code is indeed ugly. Doing this nicely might require some more thorough refactoring, so I am not sure what the best way to proceed is. I won't have time to do any refactoring on that scale in the next 2 months. We can use ugly code, but then it definitely needs testing.
  4. SacredError baseclass: Having a baseclass for all Sacred errors is a good idea. The properties make sense to me, but AFAICT you are not actually using them yet.
  5. InvalidConfigError: Having a go-to exception for user-code is a good idea. This might also provide the basis for some config-validation convenience functions or features.
  6. MissingConfigError: very nice!
  7. NamedConfigError: very nice!
  8. ConfigAddedError: nice! Not sure about calling them "conflicting" though. Maybe "unexpected"?

In general: this is a very large PR, which will make it hard to test and review. That is not a show-stopper, but if you can split it into smaller chunks, I think it will get into master sooner.

@thequilo
Collaborator Author

Thanks for your feedback!

I agree that this is a very large PR and splitting it into smaller ones seems appropriate. This is a first rough suggestion of how to split it:

  1. generic gathering function (without CircularDependencyError)
  2. base class for Exceptions and handling of additional args
  3. Error classes without source tracking, maybe split into multiple PRs?
  4. Source tracking, maybe including some refactoring/rewriting of the initialize.py code
  5. "did you mean" suggestions

@thequilo
Collaborator Author

The basic exceptions are now merged with #367. Missing points are:

  • suggestions (I'm working on that), and
  • track where the configuration values were set

For the latter part, some refactoring of initialize.py would be very helpful / required. @Qwlouse are you going to work on this in the near future?

@Qwlouse
Collaborator

Qwlouse commented Oct 16, 2018

Great! 🎉

Oxt weekend I might be able to spend a day on refactoring initialize.py. Not sure that is going to be enough time, but hopefully I'll be able to lay the foundation for tracking the origin of config values. I like your idea of using ConfigSummaries, so I guess that is what I'll work towards.

Does that sound good to you? Any further thoughts?

@thequilo
Collaborator Author

Yes, that sounds great! I'm glad that you like the idea of using ConfigSummaries. My suggested "hierarchical structure" that can be updated by recursive dict updates might be a lot more complicated than I thought, especially when the same Ingredient appears in different places. But I think you have a much better overview of sacred and the configuration process and may come up with better ideas.

I first had to check what "oxt weekend" means, but I find that a great idea. I am always confused by the phrase "next weekend"...

I'll wait until you have made some progress (hopefully oxt weekend) and then start to implement the "tracking of origin of config values" feature based on your changes.

What do you think about colored output for the error messages, e.g. highlight all config keys with the same color?

@Qwlouse
Collaborator

Qwlouse commented Oct 27, 2018

Hey @thequilo.
So the good news is that I spent quite a while today thinking and coding. But while trying to refactor the ugly initialize code I went down a rabbit hole of changes that I'd like to implement, and unfortunately I couldn't get to a usable state yet. So the bad news is that I am not sure when I'll be able to finish this. The next three weeks are rather intense for me.

If you are interested in the current state you can find the code in the config_refactor branch. But sadly this is more of a sketch of my intentions than an actual refactoring.

Some highlights of what I am trying to do:

  • Divide the configuration process into several stages, where config entries are frozen after each stage.
    • Stage 0: initialization sets the initial seed
    • Stage 1: config updates incorporates the commandline updates
    • Stage 2: runs through named configurations
    • Stage 3: is the regular configurations
  • Get rid of the Scaffold objects and possibly migrate a lot of the logic into the Run object. (moving towards something like the suggestion of @johny-c here)
  • introduce an explicit Path object that unifies handling of paths like "foo.bar.a" including non-string keys
  • Have containers that save meta information alongside the entries, and that support attribute access.
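
To illustrate only the last point, here is a very small sketch of a container that keeps meta information next to each entry and supports attribute access. The TrackedConfig class and its methods are hypothetical and are not taken from the config_refactor branch. Staging, freezing and nested paths are left out.

```python
class TrackedConfig:
    """Hypothetical container: each entry stores a value and its source."""

    def __init__(self):
        self._entries = {}

    def set(self, key, value, source):
        # Record the value together with where it was defined.
        self._entries[key] = (value, source)

    def source_of(self, key):
        return self._entries[key][1]

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails.
        if key.startswith("_"):
            raise AttributeError(key)
        try:
            return self._entries[key][0]
        except KeyError:
            raise AttributeError(key) from None


cfg = TrackedConfig()
cfg.set("seed", 42, source='command line config "seed=42"')
print(cfg.seed, "was", cfg.source_of("seed"))
```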

Regarding colored output messages: I think that is a good idea and there are probably several opportunities for helpful highlighting. Since this concerns error messages, we might need to be careful about not breaking the output in terminals that don't support color (Though, I am not sure if that is even a real problem).

@thequilo
Collaborator Author

thequilo commented Nov 8, 2018

It looks like you have already made some promising progress. I like the changes that you are trying to make. I'll hold off on my modifications until you have finished implementing them.

I found that the 'got unexpected kwarg(s)' exception is currently not handled by the SacredError. I'll prepare a PR for that.

@thequilo
Collaborator Author

thequilo commented Jan 7, 2019

First, happy new year!

How's the progress on the config_refactor branch?

@Qwlouse
Collaborator

Qwlouse commented Feb 21, 2019

Hi @thequilo,
a belated happy new year to you too 🎆, and sorry for the lengthy delay. I plan to get back to the config refactor sometime next week once I have caught up with the other issues and PRs. I'll keep you updated.

@stale

stale bot commented May 4, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label May 4, 2019
@JarnoRFB
Collaborator

JarnoRFB commented May 8, 2019

unstale

@stale stale bot removed the stale label May 8, 2019
@stale

stale bot commented Aug 7, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Aug 7, 2019
@Qwlouse Qwlouse removed the stale label Aug 7, 2019
@stale

stale bot commented Nov 5, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Nov 5, 2019
@thequilo
Collaborator Author

thequilo commented Nov 5, 2019

Unstale: This work is not finished yet; it is waiting for the config refactoring to be completed.

@stale stale bot removed the stale label Nov 5, 2019
@stale

stale bot commented Feb 3, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Feb 3, 2020
@Qwlouse Qwlouse removed the stale label Feb 8, 2020
@stale

stale bot commented May 3, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label May 3, 2020
@thequilo
Collaborator Author

thequilo commented May 3, 2020

Still waiting for config v2

@stale stale bot removed the stale label May 3, 2020
@stale

stale bot commented Aug 1, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Aug 1, 2020
@thequilo
Collaborator Author

thequilo commented Aug 1, 2020

Still waiting

@stale stale bot removed the stale label Aug 1, 2020
@stale

stale bot commented Oct 31, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Oct 31, 2020
@thequilo thequilo removed the stale label Nov 1, 2020
@stale

stale bot commented Jun 18, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jun 18, 2021
@stale stale bot closed this Jun 28, 2021