Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UnicodeDecodeError on audit log view #74

Closed
hvelarde opened this issue Dec 1, 2017 · 6 comments
Closed

UnicodeDecodeError on audit log view #74

hvelarde opened this issue Dec 1, 2017 · 6 comments
Labels

Comments

@hvelarde
Copy link
Member

hvelarde commented Dec 1, 2017

When accessing the audit log view of a site I get the following error:

2017-12-01T00:49:59 ERROR Zope.SiteErrorLog 1512096598.990.132773563534 http://www.example.org/@@fingerpointing-audit-log
Traceback (innermost last):
  Module ZPublisher.Publish, line 138, in publish
  Module ZPublisher.mapply, line 77, in mapply
  Module ZPublisher.Publish, line 48, in call_object
  Module Products.Five.browser.metaconfigure, line 479, in __call__
  Module Products.Five.browser.pagetemplatefile, line 125, in __call__
  Module Products.Five.browser.pagetemplatefile, line 59, in __call__
  Module zope.pagetemplate.pagetemplate, line 132, in pt_render
  Module zope.pagetemplate.pagetemplate, line 240, in __call__
  Module zope.tal.talinterpreter, line 271, in __call__
  Module zope.tal.talinterpreter, line 343, in interpret
  Module zope.tal.talinterpreter, line 888, in do_useMacro
  Module zope.tal.talinterpreter, line 343, in interpret
  Module zope.tal.talinterpreter, line 533, in do_optTag_tal
  Module zope.tal.talinterpreter, line 518, in do_optTag
  Module zope.tal.talinterpreter, line 513, in no_tag
  Module zope.tal.talinterpreter, line 343, in interpret
  Module zope.tal.talinterpreter, line 954, in do_defineSlot
  Module zope.tal.talinterpreter, line 343, in interpret
  Module zope.tal.talinterpreter, line 533, in do_optTag_tal
  Module zope.tal.talinterpreter, line 518, in do_optTag
  Module zope.tal.talinterpreter, line 513, in no_tag
  Module zope.tal.talinterpreter, line 343, in interpret
  Module zope.tal.talinterpreter, line 858, in do_defineMacro
  Module zope.tal.talinterpreter, line 343, in interpret
  Module zope.tal.talinterpreter, line 954, in do_defineSlot
  Module zope.tal.talinterpreter, line 343, in interpret
  Module zope.tal.talinterpreter, line 533, in do_optTag_tal
  Module zope.tal.talinterpreter, line 518, in do_optTag
  Module zope.tal.talinterpreter, line 513, in no_tag
  Module zope.tal.talinterpreter, line 343, in interpret
  Module zope.tal.talinterpreter, line 954, in do_defineSlot
  Module zope.tal.talinterpreter, line 343, in interpret
  Module zope.tal.talinterpreter, line 533, in do_optTag_tal
  Module zope.tal.talinterpreter, line 518, in do_optTag
  Module zope.tal.talinterpreter, line 513, in no_tag
  Module zope.tal.talinterpreter, line 343, in interpret
  Module zope.tal.talinterpreter, line 946, in do_defineSlot
  Module zope.tal.talinterpreter, line 343, in interpret
  Module zope.tal.talinterpreter, line 533, in do_optTag_tal
  Module zope.tal.talinterpreter, line 518, in do_optTag
  Module zope.tal.talinterpreter, line 513, in no_tag
  Module zope.tal.talinterpreter, line 343, in interpret
  Module zope.tal.talinterpreter, line 583, in do_setLocal_tal
  Module zope.tales.tales, line 696, in evaluate
   - URL: /home/example/buildout/eggs/collective.fingerpointing-1.5rc1-py2.7.egg/collective/fingerpointing/browser/logview.pt
   - Line 24, Column 6
   - Expression: <PythonExpr (view.build_audit_log())>
   - Names:
      {'args': (),
       'container': <PloneSite at /Plone>,
       'context': <PloneSite at /Plone>,
       'default': <object object at 0x7fe0ac3d8530>,
       'here': <PloneSite at /oab>,
       'loop': {},
       'nothing': None,
       'options': {},
       'repeat': <Products.PageTemplates.Expressions.SafeMapping object at 0xf1d8e10>,
       'request': <HTTPRequest, URL=http://www.example.org/@@fingerpointing-audit-log>,
       'root': <Application at >,
       'template': <Products.Five.browser.pagetemplatefile.ViewPageTemplateFile object at 0x769f210>,
       'traverse_subpath': [],
       'user': <PropertiedUser 'admin'>,
       'view': <Products.Five.metaclass.SimpleViewClass from /home/example/buildout/eggs/collective.fingerpointing-1.5rc1-py2.7.egg/collective/fingerpointing/browser/logview.pt object at 0xf25fb10>,
       'views': <Products.Five.browser.pagetemplatefile.ViewMapper object at 0xf25fa50>}
  Module zope.tales.pythonexpr, line 59, in __call__
   - __traceback_info__: (view.build_audit_log())
  Module <string>, line 1, in <module>
  Module collective.fingerpointing.browser.logview, line 57, in build_audit_log
  Module collective.fingerpointing.browser.logview, line 47, in _batched_audit_log_lines
  Module collective.fingerpointing.browser.logview, line 38, in _filtered_audit_log_lines
  Module collective.fingerpointing.utils, line 62, in all_audit_log_lines
  Module collective.fingerpointing.utils, line 51, in audit_log_lines_of
  Module file_read_backwards.file_read_backwards, line 116, in next
  Module encodings.utf_8, line 16, in decode
UnicodeDecodeError: 'utf8' codec can't decode byte 0x85 in position 222: invalid start byte
@hvelarde hvelarde added the bug label Dec 1, 2017
@hvelarde
Copy link
Member Author

hvelarde commented Dec 1, 2017

seems the problem is the logger of the AT image as it includes the whole tag and, in this case it contains extended characters (Diretoria Ribeirão Preto):

2016-05-17 14:25:07,328 - INFO - user=jsmith ip=123.123.1.1 action=create object=<img src="http://www.example.org/fotos/cerimonia/diretoria-ribeirao-preto/image" alt="Diretoria Ribeir<C3><A3>o Preto" title="Diretoria Ribeir<C3><A3>o Preto" height="0" width="0" />
2016-05-17 14:25:10,426 - INFO - user=jsmith ip=123.123.1.1 action=modify object=<img src="http://www.example.org/fotos/cerimonia/diretoria-ribeirao-preto/image" alt="Diretoria Ribeir<C3><A3>o Preto" title="Diretoria Ribeir<C3><A3>o Preto" height="1181" width="1772" />

@jensens
Copy link
Member

jensens commented Dec 1, 2017

I think logging the repr is overall not really perfect. but at least a safe unicode would help here.

@cleberjsantos
Copy link
Member

cleberjsantos commented Mar 22, 2018

@hvelarde @jensens We have another problem what produces this same error, seems the problem also is the method audit_log_files() in utils.py.

If exists log files compressed (logrotate) on the same path, then this instructions have a bug.

> /home/cleber/cleber.portal/eggs/collective.fingerpointing-1.5rc1-py2.7.egg/collective/fingerpointing/utils.py(38)audit_log_files()
-> logfiles = sorted(
(Pdb) l
 33         """Return a list with the current audit log and all of its
 34         backups. The list is sorted with newer files first.
 35         """
 36         # retrieve the list of audit log backup files
 37  ->     logfiles = sorted(
 38             [f for f in glob(log_info.logfile + '.*') if '.lock' not in f],
 39             reverse=True
 40         )
 41         # include active audit log as first element
 42         if os.path.exists(log_info.logfile):
(Pdb) log_info.logfile
'/home/cleber/cleber.portal/var/log/audit.log'
(Pdb) glob(log_info.logfile + '.*')
['/home/cleber/cleber.portal/var/log/audit.log.4.bz2', '/home/cleber/cleber.portal/var/log/audit.log.2.bz2', '/home/cleber/cleber.portal/var/log/audit.log.3.bz2', '/home/cleber/cleber.portal/var/log/audit.log.1.bz2']
(Pdb)

...

Then, the method all_audit_log_lines()

> /home/cleber/cleber.portal/eggs/collective.fingerpointing-1.5rc1-py2.7.egg/collective/fingerpointing/utils.py(63)all_audit_log_lines()
-> for line in audit_log_lines_of(logfile):
(Pdb) l
 58         - Newest file first
 59         - File is read reversed (backwards)
 60         """
 61         for logfile in audit_log_files():
 62  ->         for line in audit_log_lines_of(logfile):
 63                 yield line
[EOF]
(Pdb) audit_log_files()
['/home/cleber/cleber.portal/var/log/audit.log.4.bz2', '/home/cleber/cleber.portal/var/log/audit.log.2.bz2', '/home/cleber/cleber.portal/var/log/audit.log.3.bz2', '/home/cleber/cleber.portal/var/log/audit.log.1.bz2']
(Pdb) [line for line in audit_log_lines_of(audit_log_files()[1])]
*** UnicodeDecodeError: 'utf8' codec can't decode byte 0x90 in position 9: invalid start byte
(Pdb)

Solution.

--- utils.py    2018-03-22 07:24:10.778193355 -0300
+++ utils_new.py    2018-03-22 07:31:18.086183529 -0300
@@ -7,6 +7,23 @@

 import os

+""" TODO.
+    This page ( https://www.garykessler.net/library/file_sigs.html )
+    has a list of "magic" file signatures. Grab the ones you need
+    and put them in a dict like below.
+    Then we need a function that matches the dict keys with the
+    start of the file.
+
+    I've written a suggestion, though it can be optimized by
+     preprocessing the magic_dict into e.g. one giant compiled regexp
+"""
+magic_dict = {
+    "\x1f\x8b\x08": "gz",
+    "\x42\x5a\x68": "bz2",
+    "\x50\x4b\x03\x04": "zip"
+    }
+max_len = max(len(x) for x in magic_dict)
+

 def get_request_information():
     """Return logged in user name and remote IP address."""
@@ -29,13 +46,27 @@
     return user_id, ip


+def isCompressedFile(filename):
+    """ This solution should be cross-plattform and is of course
+        not dependent on file name extension, but it may give
+        false positives for files with random content that just
+        happen to start with some specific magic bytes.
+    """
+    with open(filename) as f:
+        file_start = f.read(max_len)
+    for magic, filetype in magic_dict.items():
+        if file_start.startswith(magic):
+            return (True, filetype)
+    return (False, "no match")
+
+
 def audit_log_files():
     """Return a list with the current audit log and all of its
     backups. The list is sorted with newer files first.
     """
     # retrieve the list of audit log backup files
     logfiles = sorted(
-        [f for f in glob(log_info.logfile + '.*') if '.lock' not in f],
+        [f for f in glob(log_info.logfile + '.*') if '.lock' not in f and not isCompressedFile(f)[0]],
         reverse=True
     )
     # include active audit log as first element

My TraceBack.

  Module zope.tal.talinterpreter, line 583, in do_setLocal_tal
  Module zope.tales.tales, line 696, in evaluate
   - URL: /home/cleber/cleber.portal/eggs/collective.fingerpointing-1.5rc1-py2.7.egg/collective/fingerpointing/browser/logview.pt
   - Line 24, Column 6
   - Expression: <PythonExpr (view.build_audit_log())>
   - Names:
      {'args': (),
       'container': <PloneSite at /Plone>,
       'context': <PloneSite at /Plone>,
       'default': <object object at 0x7f21b9f7d540>,
       'here': <PloneSite at /Plone>,
       'loop': {},
       'nothing': None,
       'options': {},
       'repeat': <Products.PageTemplates.Expressions.SafeMapping object at 0x7f218fa352b8>,
       'request': <HTTPRequest, URL=http://10.1.255.51:8980/Plone/@@fingerpointing-audit-log>,
       'root': <Application at >,
       'template': <Products.Five.browser.pagetemplatefile.ViewPageTemplateFile object at 0x7f2199c78f90>,
       'traverse_subpath': [],
       'user': <PropertiedUser 'admin'>,
       'view': <Products.Five.metaclass.SimpleViewClass from /home/cleber/cleber.portal/eggs/collective.fingerpointing-1.5rc1-py2.7.egg/collective/
erpointing/browser/logview.pt object at 0x7f21a0aa0f10>,
       'views': <Products.Five.browser.pagetemplatefile.ViewMapper object at 0x7f2199c6f910>}
  Module zope.tales.pythonexpr, line 59, in __call__
   - __traceback_info__: (view.build_audit_log())
  Module <string>, line 1, in <module>
  Module collective.fingerpointing.browser.logview, line 57, in build_audit_log
  Module collective.fingerpointing.browser.logview, line 47, in _batched_audit_log_lines
  Module collective.fingerpointing.browser.logview, line 38, in _filtered_audit_log_lines
  Module collective.fingerpointing.utils, line 62, in all_audit_log_lines
  Module collective.fingerpointing.utils, line 51, in audit_log_lines_of
  Module file_read_backwards.file_read_backwards, line 116, in next
  Module encodings.utf_8, line 16, in decode
UnicodeDecodeError: 'utf8' codec can't decode byte 0x90 in position 9: invalid start byte
> /home/cleber/env/lib/python2.7/encodings/utf_8.py(16)decode()
-> return codecs.utf_8_decode(input, errors, True)

@hvelarde
Copy link
Member Author

I released https://pypi.python.org/pypi/collective.fingerpointing/1.6rc1

this should be fixed now.

@cleberjsantos
Copy link
Member

@hvelarde Sorry, but I did not see the fix for this error :/ If exists log files compressed (logrotate) on the same path, then this instructions have a bug.

@hvelarde
Copy link
Member Author

your error is different from mine; you have to open a different issue :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants