Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better Error Reporting #259

Closed
rhunwicks opened this issue May 6, 2015 · 6 comments
Closed

Better Error Reporting #259

rhunwicks opened this issue May 6, 2015 · 6 comments

Comments

@rhunwicks
Copy link
Contributor

At the moment, when an error occurs with raise_errors = False we store the traceback and the error message and show it to the user.

However, we don't show the actual data in the row that caused the error, which makes it hard to troubleshoot what went wrong. This is particularly true if the dataset has been manipulated in before_import; for example, I have a lot of imports where I have to un-crosstab an Excel spreadsheet before importing it, so the rows in the dataset don't match the rows in the spreadsheet.

One simple solution is to allow the results.Error class to store the row:

class Error(object):

    def __init__(self, error, traceback=None, row=None):
        self.error = error
        self.traceback = traceback
        self.row = row

And then for import_data to save it:

            except Exception as e:
                tb_info = traceback.format_exc(2)
                row_result.errors.append(Error(e, tb_info, row))

And then finally for the template to display it:

      {% for line, errors in result.row_errors %}
        {% for error in errors %}
          <li>
            {% trans "Line number" %}: {{ line }} - {{ error.error }}
            <div>{{ error.row }}</div>
            <div class="traceback">{{ error.traceback|linebreaks }}</div>

This works for me without any special effort to coerce the row to a string. I get output like:

Line number: 91 - 'RelatedManager' object has no attribute 'geographicunitsetversion'
OrderedDict([(u'geographicunitsetversion', <GeographicUnitSetVersion: Standard Admin Units: null - null>), (u'geographicunitset', <GeographicUnitSet: Standard Admin Units>), (u'end', None), (u'start', None), (u'data_source_organization', None), (u'data_source_document', None), (u'parent_code', u'21106'), (u'unit_type', u'admin2'), (u'name', u'St Georges County'), (u'code', u'2110605'), (u'set', u'Standard Admin Units'), (u'admin0', u'USA'), (u'fnid0', u'US'), (u'full_name', None), (u'country', u'US'), (u'fnid1', u'21106'), (u'fnid2', u'2110605'), (u'fnid3', u'0'), (u'fnid4', u'0'), (u'admin1', u'MD'), (u'admin2', u'St Georges County'), (u'admin3', u''), (u'admin4', u'')])
Traceback (most recent call last):
File "/home/roger/.virtualenvs/fdw/local/lib/python2.7/site-packages/import_export/resources.py", line 361, in import_data
if self.skip_row(instance, original):
File "/home/roger/.virtualenvs/fdw/local/lib/python2.7/site-packages/import_export/resources.py", line 266, in skip_row
if field.get_value(instance) != field.get_value(original):
AttributeError: 'RelatedManager' object has no attribute 'geographicunitsetversion'

That output is much more useful to me than the default.

@bmihelac
Copy link
Member

bmihelac commented May 6, 2015

Looks good to me, care to make PR?

@rhunwicks
Copy link
Contributor Author

Sure, can I suggest another improvement too.

I currently have a problem where I get an exception in a before_import which is added to result.base_errors but not raised because raise_errors = False, and then all subsequent updates fail because of a TransactionManagementError - i.e. Postgresql won't allow any more updates because of an error in the transaction, which was the original error in the before_import. The second error is raised (it's in a different Resource with raise_errors = True) but doesn't mean anything, but I no longer have access to the result object.

I'd like to add a Null logger to the resources.py and then use logger.exception() everywhere we have an exception handler. That should have no impact on the existing code at all, but would allow users who have added a custom logger to see all the exceptions that occurred.

@rhunwicks
Copy link
Contributor Author

Also, I noticed that in the import.html template we show the traceback for row errors but not for base errors. I'd like to correct that, unless there was a specific reason for not showing them.

@rhunwicks
Copy link
Contributor Author

fwiw, with the exception logging place, the test output now looks like:

~/git/django-import-export$ ./runtests.sh 
Creating test database for alias 'default'...
......................................................ERROR:root:Column 'id': invalid literal for int() with base 10: 'foo'
Traceback (most recent call last):
  File "/home/roger/git/django-import-export/import_export/resources.py", line 355, in import_data
    instance, new = self.get_or_init_instance(instance_loader, row)
  File "/home/roger/git/django-import-export/import_export/resources.py", line 186, in get_or_init_instance
    instance = self.get_instance(instance_loader, row)
  File "/home/roger/git/django-import-export/import_export/resources.py", line 183, in get_instance
    return instance_loader.get_instance(row)
  File "/home/roger/git/django-import-export/import_export/instance_loaders.py", line 32, in get_instance
    params[field.attribute] = field.clean(row)
  File "/home/roger/git/django-import-export/import_export/fields.py", line 59, in clean
    raise ValueError("Column '%s': %s" % (self.column_name, e))
ValueError: Column 'id': invalid literal for int() with base 10: 'foo'
..ERROR:root:Column 'birthday': Enter a valid date/time.
Traceback (most recent call last):
  File "/home/roger/git/django-import-export/import_export/resources.py", line 373, in import_data
    self.import_obj(instance, row, real_dry_run)
  File "/home/roger/git/django-import-export/import_export/resources.py", line 238, in import_obj
    self.import_field(field, obj, data)
  File "/home/roger/git/django-import-export/import_export/resources.py", line 230, in import_field
    field.save(obj, data)
  File "/home/roger/git/django-import-export/import_export/fields.py", line 94, in save
    setattr(obj, self.attribute, self.clean(data))
  File "/home/roger/git/django-import-export/import_export/fields.py", line 59, in clean
    raise ValueError("Column '%s': %s" % (self.column_name, e))
ValueError: Column 'birthday': Enter a valid date/time.
....................
----------------------------------------------------------------------
Ran 76 tests in 0.767s

OK
Destroying test database for alias 'default'...

I.e. the tests all run successfully but the errors get logged to the console. Is that OK?

@rhunwicks
Copy link
Contributor Author

I can suppress the exception logging when running tests by adding to settings.py

LOGGING = {
    'version': 1,
    'disable_existing_loggers': True,
    'handlers': {
        'console': {
            'class': 'logging.NullHandler'
        }
    },
    'root': {
        'handlers': ['console'],
    }}

@int-ua
Copy link
Contributor

int-ua commented Jul 9, 2015

Please add some comment here when this code is released, it should help me a lot. Maybe add a "released" label?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants