Allow manual error handling in walk* methods #56

sametmax · 2014-01-30T16:37:09Z

When iterating over a file list, errors can occur. For now, path.py handles them this way:

    def walkfiles(self, pattern=None, errors='strict'):
        [...]
        for child in childList:
            try:
                isfile = child.isfile()
                isdir = not isfile and child.isdir()
            except:
                if errors == 'ignore':
                    continue
                elif errors == 'warn':
                    warnings.warn(
                        "Unable to access '%s': %s"
                        % (self, sys.exc_info()[1]),
                        TreeWalkWarning)
                    continue
                else:
                    raise

This does give some options, but it limits the choice to "shut up", "scream", or "crash". You may want other behaviors: logging, changing permissions, switching user, deleting the file, accumulating the errors in a list, and so on.

We could allow any behavior by just doing:

    def walkfiles(self, pattern=None, errors='strict'):
        [...]
        for child in childList:
            try:
                isfile = child.isfile()
                isdir = not isfile and child.isdir()
            except Exception as e:

                # If errors is a callable, call it with the failing path
                # and the exception so it can do something about it.
                if callable(errors):
                    res = errors(child, e)
                    # If the callable returns something, yield it. This lets
                    # the handler fix the problem and yield the path anyway,
                    # considering the error solved, or skip the file by
                    # returning None.
                    if res is not None:
                        yield res
                    # isfile/isdir are not set for this child, so move on.
                    continue

                elif errors == 'ignore':
                    continue
                elif errors == 'warn':
                    warnings.warn(
                        "Unable to access '%s': %s"
                        % (self, sys.exc_info()[1]),
                        TreeWalkWarning)
                    continue
                else:
                    raise

Then you would use it this way:

    errors_to_deal_with_later = []
    on_errors = lambda f, e: errors_to_deal_with_later.append((f, e))
    for p in path('/etc').walkfiles(errors=on_errors):
        do_stuff(p)

    for f, e in errors_to_deal_with_later:
        do_stuff_with_all_errors(f, e)

Of course, the lambda could do anything: send an email, append the faulty path to a file, make a remote call, etc. This gives you the power to do something for each faulty file, not just the first one that causes trouble and prevents the whole iteration from carrying on.
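To make the idea concrete without depending on path.py, here is a minimal standalone sketch of the same dispatch. The names `handle_error`, `checked` and `flaky_isfile` are made up for this illustration and are not part of path.py; `flaky_isfile` just simulates a check that fails on some entries.

```python
import warnings


def handle_error(errors, item, exc):
    """Apply the `errors` policy to one failure (hypothetical helper).

    Returns a value to yield in place of the item, or None to skip it.
    """
    if callable(errors):
        return errors(item, exc)
    if errors == "ignore":
        return None
    if errors == "warn":
        warnings.warn("Unable to access %r: %s" % (item, exc))
        return None
    raise exc  # 'strict' (or anything else): propagate the error


def checked(items, check, errors="strict"):
    """Yield the items for which check(item) is true, routing failures
    through the `errors` policy instead of always crashing."""
    for item in items:
        try:
            ok = check(item)
        except Exception as exc:
            res = handle_error(errors, item, exc)
            if res is not None:
                yield res
            continue
        if ok:
            yield item


def flaky_isfile(name):
    """Stand-in for child.isfile(): fails on names starting with 'locked'."""
    if name.startswith("locked"):
        raise PermissionError("permission denied: " + name)
    return not name.endswith("/")


names = ["a.txt", "locked.txt", "dir/", "b.txt"]

# Accumulate the failures and keep iterating.
collected = []
result = list(checked(names, flaky_isfile,
                      errors=lambda f, e: collected.append((f, e))))

# A handler that returns a value gets that value yielded in place.
recovered = list(checked(names, flaky_isfile,
                         errors=lambda f, e: f + ".recovered"))
```

Since `list.append` returns None, the accumulating handler skips the faulty entry while recording it; a handler that returns a value has that value yielded instead.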

We can apply this logic in several places in the path.py code. It doesn't break the existing API; it just adds flexibility. The overhead is not really important: we are already in an except clause (the whole stack trace has been collected), so it's already a very slow code path anyway.

BTW, unrelated, but what's the use of isdir = not isfile and child.isdir()?


jaraco · 2014-09-23T02:29:39Z

This issue is fixed in the changes for #73.

jaraco added a commit that referenced this issue Sep 23, 2014

Update changelog to reflect 5.3 addresses #56.

0b5c6ba

jaraco closed this as completed Sep 23, 2014
