Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

0.10, fix several decoding issues #2171

Merged
merged 15 commits into from Jul 6, 2018

Conversation

sebres
Copy link
Contributor

@sebres sebres commented Jul 4, 2018

  • fail2ban running in the preferred encoding now (as default encoding also within python 2.x), mostly UTF-8 in opposite to ascii previously, so minimizes influence of implicit conversions errors (between unicode, bytes and str);
  • provided new universal helper uni_string, which uses safe explicit conversion to string (also if default encoding is ascii);
  • actions: avoid conversion errors on wrong-chars by replace tags;
  • database: improve adapter/converter handlers working on invalid characters in sense of json and/or sqlite-database;
    additionally both are exception-safe now, so avoid possible errors in log-handlers (concat, str. conversion, etc);
    closes After receiving error, does not work and does not block IP #2137;
  • database test-cases extended to cover any possible variants (invalid chars in unicode, bytes, str + unterminated char-sequence) with both cases (with replace of chars, with and without errors inside adapter-handlers).
  • fixes issue with wrong-chars in string items of CallingMap by its representation, additionally don't calculate values implicitly (may be unexpected for some constellations resp. too slow in DEBUG);
  • logging in fail2ban is process-wide exception-safe now

…resentation, additionally don't calculate values implicitly (may be unexpected for some constellations resp. too slow in DEBUG)
…ytes and str), provide new universal helper `uni_string`, which uses safe explicit conversion to string (also if default encoding is ascii); avoid conversion errors on wrong-chars by replace tags.
…ense of json and/or sqlite-database;

both should be additionally exception-safe, so avoid possible errors in log-handlers (concat, str. conversion, etc);
test cases extended to cover any possible variants (invalid chars in unicode, bytes, str + unterminated char-sequence) with both cases (with replace of chars, with and without errors inside adapter-handlers).
@coveralls
Copy link

coveralls commented Jul 4, 2018

Coverage Status

Coverage increased (+0.08%) to 97.404% when pulling 73e89df on sebres:0.10-fix-decoding-issues into 2f5059e on fail2ban:0.10.

@sebres sebres added the grave label Jul 4, 2018
@codecov-io
Copy link

codecov-io commented Jul 4, 2018

Codecov Report

Merging #2171 into 0.10 will increase coverage by 0.11%.
The diff coverage is 98.34%.

Impacted file tree graph

@@            Coverage Diff             @@
##             0.10    #2171      +/-   ##
==========================================
+ Coverage   95.46%   95.57%   +0.11%     
==========================================
  Files          76       76              
  Lines       13087    13190     +103     
  Branches     2079     2081       +2     
==========================================
+ Hits        12493    12607     +114     
+ Misses        324      315       -9     
+ Partials      270      268       -2
Impacted Files Coverage Δ
fail2ban/tests/fail2banclienttestcase.py 96.35% <ø> (ø) ⬆️
fail2ban/tests/misctestcase.py 96.12% <100%> (+0.69%) ⬆️
fail2ban/server/action.py 95.58% <100%> (+0.16%) ⬆️
fail2ban/server/database.py 97.98% <100%> (+0.38%) ⬆️
fail2ban/tests/actiontestcase.py 98.58% <100%> (+0.02%) ⬆️
fail2ban/tests/action_d/test_badips.py 100% <100%> (ø) ⬆️
fail2ban/server/actions.py 90.71% <100%> (+0.38%) ⬆️
config/action.d/badips.py 100% <100%> (+14.28%) ⬆️
fail2ban/server/ticket.py 95.79% <100%> (ø) ⬆️
fail2ban/tests/utils.py 97.32% <100%> (+1.5%) ⬆️
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2f5059e...73e89df. Read the comment docs.

@sebres sebres mentioned this pull request Jul 4, 2018
Copy link
Member

@yarikoptic yarikoptic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just minor comments... unicode handling across Pythons indeed "a messy business" . Whenever even core modules (such as pdb) crash while trying to troubleshoot such cases, it means that it remains not fully cooked ;) For fun of it you could try

wget http://www.onerussian.com/tmp/test-uni.py
LC_ALL=C.POSIX python3 test-uni.py

and in pdb prompt issue l command. Enjoy the crash. But it does work in python2 just fine... My point is -- I do dislike overload of the default encoding in python2 via library reload etc, and do not remember why myself decided not to mess with it, but I do not know a better way per se.
generic whining remains of excessive "pragma: no cover" which just artificially blows up code coverage %, e.g. if some code is to be rarely reached, it doesn't mean that it shouldn't be tested. And it most likely could be tested with a dedicated test+ possible mocking where needed.
Otherwise my review is just superficial since codebase is no longer very familiar to me ;-)

- if [[ "$F2B_PY_2" ]]; then travis_retry pip install dnspython; fi
- if [[ "$F2B_PY_3" ]]; then travis_retry pip install dnspython3; fi
- if [[ "$F2B_PY" = 2 ]]; then travis_retry pip install dnspython; fi
- if [[ "$F2B_PY" = 3 ]]; then travis_retry pip install dnspython3; fi
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: I don't think you need then [[ ]] and regular [ ] should be enough

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm always careful with travis (no idea what for shell used or will be used later)...

# if default encoding is 'ascii');
if sys.version_info < (3,): # pragma: 3.x no cover
# correct default (global system) encoding (mostly UTF-8):
def __resetDefaultEncoding(encoding):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: mixed indentation - comment indented with space(s) and then regular tab goes

reload(sys)
_sys = sys
if hasattr(_sys, "setdefaultencoding"):
_sys.setdefaultencoding(encoding)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, in datalad I was considering to do the same hack but IIRC decided not to mess with it, but now do not remember why exactly (besides it being ugly etc).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this hack is only to minimize conversions errors like described above str(u"\uFFFD") or similar, which implicitly used the default encoding (and makes BOOM if ascii). Normally we've no interest for correct conversion of surrogate pairs resp. chars does not covered from some encoding (just ignored or replaced with ?)...
So by explicit conversion like str(...) or map(str, ...) I can use uni_string now, but not for implicit-conversions resp. modules that we've not under our control.
So for the case if I missed something somewhere, better to have it - as described "minimizes errors influence of conversion errors on wrong char's"...

return dict(self)
except:
return dict(self.data, **self.storage)
__repr__ = _asrepr
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you need a _asrepr then, just for consistency with _asdict?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just because of second argument calculated=True, that repr does not accept in 2.x...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, missed that, thanks

@@ -156,7 +173,7 @@ def __iter__(self):
def __len__(self):
return len(self.data)

def copy(self): # pargma: no cover
def copy(self): # pragma: no cover
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why all this functionality is Ok to be not tested, is beyond me

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is unused currently in unpstream (only in my internal f2b)... I'll take a look later, if get merged.


def _json_default(x):
"""Avoid errors on types unknow in json-adapters."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor typo "unknown"

def _json_default(x):
"""Avoid errors on types unknow in json-adapters."""
if isinstance(x, set):
x = list(x)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it should be sorted for consistent string representation. Or since you are converting to string anyways later on -- why not to just go for repr(x) for sets?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, set it inpossible as json... and I'll get an iterable object after unserialization (json.loads) hereafter, to do things like v in ticket.something_returns_set_or_list...
one can write a new json handler for set, but ATM it's used only this way (by user-names).

try:
logSys.error('json dumps failed: %r', e, exc_info=logSys.getEffectiveLevel() <= 4)
except: # pragma: no cover
pass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since I guess it could happen only if repr(e) fails, may be at least try to log without it and explicit msg: json dumps failed, and repr of exception failed. Otherwise it is just completely ignored - might be difficult to debug

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this handler is currently only for wrapper set->list, but I want to see it, if tomorrow someone extends ticket.data with something else.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, now I understand what you want... agree.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in bcf5579 (a bit more safe now and it's fail2ban-wide, so relocated now from database to the inject in Logger in helpers + fully test-covered for breakdown in logger, if some python-version will change this Logger protected method unexpected way).

try:
logSys.error('json loads failed: %r', e, exc_info=logSys.getEffectiveLevel() <= 4)
except: # pragma: no cover
pass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here - imho better to log some error than ignore completely

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the error will be logged always, only the full-stack will be logged, if I do for example:
fail2ban-testcases -l 4 someTest... to gather where it's occurred exactly

…logger handling, using injection on _log-method of Logger class;

additionally provides more info if handler/conversion failed (with double protection inside catch-case);
tests/utils.py: log handler "_MemHandler" of LogCaptureTestCase fixed now to be safe also (test-cases only);
tests/misctestcase.py: the safe logging of all possible constellations is covered in testSafeLogging now.
@sebres
Copy link
Contributor Author

sebres commented Jul 6, 2018

I've got confirmed that this runs now without the hangs (after wrong encoded char warnings).
Thus merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants