New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
0.10, fix several decoding issues #2171
Conversation
…resentation, additionally don't calculate values implicitly (may be unexpected for some constellations resp. too slow in DEBUG)
…ytes and str), provide new universal helper `uni_string`, which uses safe explicit conversion to string (also if default encoding is ascii); avoid conversion errors on wrong-chars by replace tags.
…ense of json and/or sqlite-database; both should be additionally exception-safe, so avoid possible errors in log-handlers (concat, str. conversion, etc); test cases extended to cover any possible variants (invalid chars in unicode, bytes, str + unterminated char-sequence) with both cases (with replace of chars, with and without errors inside adapter-handlers).
Codecov Report
@@ Coverage Diff @@
## 0.10 #2171 +/- ##
==========================================
+ Coverage 95.46% 95.57% +0.11%
==========================================
Files 76 76
Lines 13087 13190 +103
Branches 2079 2081 +2
==========================================
+ Hits 12493 12607 +114
+ Misses 324 315 -9
+ Partials 270 268 -2
Continue to review full report at Codecov.
|
…) after all the errors during the simulation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just minor comments... unicode handling across Pythons indeed "a messy business" . Whenever even core modules (such as pdb) crash while trying to troubleshoot such cases, it means that it remains not fully cooked ;) For fun of it you could try
wget http://www.onerussian.com/tmp/test-uni.py
LC_ALL=C.POSIX python3 test-uni.py
and in pdb prompt issue l
command. Enjoy the crash. But it does work in python2 just fine... My point is -- I do dislike overload of the default encoding in python2 via library reload etc, and do not remember why myself decided not to mess with it, but I do not know a better way per se.
generic whining remains of excessive "pragma: no cover" which just artificially blows up code coverage %, e.g. if some code is to be rarely reached, it doesn't mean that it shouldn't be tested. And it most likely could be tested with a dedicated test+ possible mocking where needed.
Otherwise my review is just superficial since codebase is no longer very familiar to me ;-)
- if [[ "$F2B_PY_2" ]]; then travis_retry pip install dnspython; fi | ||
- if [[ "$F2B_PY_3" ]]; then travis_retry pip install dnspython3; fi | ||
- if [[ "$F2B_PY" = 2 ]]; then travis_retry pip install dnspython; fi | ||
- if [[ "$F2B_PY" = 3 ]]; then travis_retry pip install dnspython3; fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: I don't think you need then [[ ]]
and regular [ ]
should be enough
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm always careful with travis (no idea what for shell used or will be used later)...
fail2ban/helpers.py
Outdated
# if default encoding is 'ascii'); | ||
if sys.version_info < (3,): # pragma: 3.x no cover | ||
# correct default (global system) encoding (mostly UTF-8): | ||
def __resetDefaultEncoding(encoding): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: mixed indentation - comment indented with space(s) and then regular tab goes
reload(sys) | ||
_sys = sys | ||
if hasattr(_sys, "setdefaultencoding"): | ||
_sys.setdefaultencoding(encoding) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, in datalad I was considering to do the same hack but IIRC decided not to mess with it, but now do not remember why exactly (besides it being ugly etc).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this hack is only to minimize conversions errors like described above str(u"\uFFFD")
or similar, which implicitly used the default encoding (and makes BOOM if ascii). Normally we've no interest for correct conversion of surrogate pairs resp. chars does not covered from some encoding (just ignored or replaced with ?
)...
So by explicit conversion like str(...) or map(str, ...)
I can use uni_string
now, but not for implicit-conversions resp. modules that we've not under our control.
So for the case if I missed something somewhere, better to have it - as described "minimizes errors influence of conversion errors on wrong char's"...
return dict(self) | ||
except: | ||
return dict(self.data, **self.storage) | ||
__repr__ = _asrepr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you need a _asrepr
then, just for consistency with _asdict
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just because of second argument calculated=True
, that repr does not accept in 2.x...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, missed that, thanks
@@ -156,7 +173,7 @@ def __iter__(self): | |||
def __len__(self): | |||
return len(self.data) | |||
|
|||
def copy(self): # pargma: no cover | |||
def copy(self): # pragma: no cover |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why all this functionality is Ok to be not tested, is beyond me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is unused currently in unpstream (only in my internal f2b)... I'll take a look later, if get merged.
fail2ban/server/database.py
Outdated
|
||
def _json_default(x): | ||
"""Avoid errors on types unknow in json-adapters.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor typo "unknown"
fail2ban/server/database.py
Outdated
def _json_default(x): | ||
"""Avoid errors on types unknow in json-adapters.""" | ||
if isinstance(x, set): | ||
x = list(x) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it should be sorted
for consistent string representation. Or since you are converting to string anyways later on -- why not to just go for repr(x)
for sets?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, set it inpossible as json... and I'll get an iterable object after unserialization (json.loads) hereafter, to do things like v in ticket.something_returns_set_or_list
...
one can write a new json handler for set, but ATM it's used only this way (by user-names).
fail2ban/server/database.py
Outdated
try: | ||
logSys.error('json dumps failed: %r', e, exc_info=logSys.getEffectiveLevel() <= 4) | ||
except: # pragma: no cover | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since I guess it could happen only if repr(e)
fails, may be at least try to log without it and explicit msg: json dumps failed, and repr of exception failed
. Otherwise it is just completely ignored - might be difficult to debug
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this handler is currently only for wrapper set->list
, but I want to see it, if tomorrow someone extends ticket.data with something else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, now I understand what you want... agree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in bcf5579 (a bit more safe now and it's fail2ban-wide, so relocated now from database to the inject in Logger in helpers + fully test-covered for breakdown in logger, if some python-version will change this Logger protected method unexpected way).
fail2ban/server/database.py
Outdated
try: | ||
logSys.error('json loads failed: %r', e, exc_info=logSys.getEffectiveLevel() <= 4) | ||
except: # pragma: no cover | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here - imho better to log some error than ignore completely
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the error will be logged always, only the full-stack will be logged, if I do for example:
fail2ban-testcases -l 4 someTest
... to gather where it's occurred exactly
…logger handling, using injection on _log-method of Logger class; additionally provides more info if handler/conversion failed (with double protection inside catch-case); tests/utils.py: log handler "_MemHandler" of LogCaptureTestCase fixed now to be safe also (test-cases only); tests/misctestcase.py: the safe logging of all possible constellations is covered in testSafeLogging now.
…e (logged now, as logging is safe)
I've got confirmed that this runs now without the hangs (after wrong encoded char warnings). |
UTF-8
in opposite toascii
previously, so minimizes influence of implicit conversions errors (between unicode, bytes and str);uni_string
, which uses safe explicit conversion to string (also if default encoding isascii
);additionally both are exception-safe now, so avoid possible errors in log-handlers (concat, str. conversion, etc);
closes After receiving error, does not work and does not block IP #2137;