-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Process: add possiblity to hide specific output to call function #92
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR works but could be interesting to use SensitiveDataFilter(logging.Filter)
instead
The thing is that the captured output (std threads) does not go through our logger. It's a huge change that I'd rather avoid for now. What do you think? |
|
I confirm that we are already using a logger for the actual command (but not for the dry-run logging) so it's possible if use the same logger for the dry run as well Here is an example of Filter class:
You can then apply it to your logger:
Be careful of nested args though, and convert list to str (otherwise the substitution will be broken): |
You're right this is going through the logger. And looking at the python logging code it's not too hard to subclass. |
nimp/sys/process.py
Outdated
@@ -109,7 +116,9 @@ def _output_worker(index, decoding_format): | |||
return | |||
force_ascii = locale.getpreferredencoding().lower() != 'utf-8' | |||
while process is not None: | |||
logger = logging.getLogger('child_processes') | |||
logger_child_processes = logging.getLogger('child_processes') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a reason why you need 2 different loggers? they are in the same fct and you apply the same filter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tried using one single logger for the whole function instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if is relevant to have all the logs in the same logger as the call_logger actually. I think it wouls be insteresting to keep the distinction between the "child_process" and the call_logger. For instance, the difference between the keep alive and the call to unreal:
[2023.11.13-11.53.58:518][ 0]LogGatherTextFromAssetsCommandlet: Display: [ 9.62%] Loading package: '/Game/Dialogue/P01/R02/Settler/M07/STG/DLG_R02_Settler_M07_A_MagalieAlive_STG_SCRIPT2'...
2023-11-13 11:53:58,635 [INFO] Keepalive for ..\..\.\UE\Engine\Binaries\Win64\UnrealEditor.exe
nimp/sys/process.py
Outdated
def call(command, cwd='.', heartbeat=0, stdin=None, encoding='utf-8', | ||
capture_output=False, capture_debug=False, hide_output=False, | ||
capture_output=False, capture_debug=False, hide_output=False, hide_output_specific=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not too sure about the hide_output_specific
naming.
It's not really obvious what it does.
Also, it's not really something that's specific to the process.call function, maybe it should be some helper that can be used in a more global way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a small test and it's a bit annoying though:
import logging
import re
logging.basicConfig(level=logging.INFO)
class SensitiveDataFilter(logging.Filter):
def __init__(self, *args):
super().__init__()
self.pattern = re.compile(rf"({'|'.join(args)})")
def filter(self, record):
record.msg = self.pattern.sub("*****", record.msg)
record.args = tuple(self.pattern.sub("****", a) for a in record.args)
return super().filter(record)
filter_ = SensitiveDataFilter('toto', 'tata')
logger = logging.getLogger()
logger.addFilter(filter_)
logger.info('logger: toto %s, %s, %s', 'tata', 'tatato', 'tata:toto')
logging.info('logging: toto %s, %s, %s', 'tata', 'tatato', 'tata:toto')
test_logger = logging.getLogger('test')
test_logger.info('test_logger: toto %s, %s, %s', 'tata', 'tatato', 'tata:toto')
INFO:root:logger: ***** ****, ****to, ****:****
INFO:root:logging: ***** ****, ****to, ****:****
INFO:test:test_logger: toto tata, tatato, tata:toto
new logger do not inherit filter from root logger
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tdesveaux @ftith I added convenience context manager and filters to more flexibly filter nimp logs; could you please take a look at it?
Example use:
from nimp.sys.logging import FilteredLogging
from nimp.sys.logging import SensitiveDataFilter
<...>
with FilteredLogging(SensitiveDataFilter('string_to_hide_1', 'string_to_hide_2')) as filter_logger:
if nimp.sys.process.call(command, dry_run=env.dry_run) != 0: # call ouput will hide sensitive info
filter_logger.info('This will hide string_to_hide_1') # this will also hide stuff
Handy when we need to hide stuff in CI for example.
... for all call function.
13ed4ed
to
9f0ac0e
Compare
Also add a conveninec filter for hiding stuff. Typical use: with FilteredLogging(SensitiveDataFilter('string_to_hide_1', 'string_to_hide_2')) as filter_logger: if nimp.sys.process.call(bpt_command, dry_run=env.dry_run) != 0: # call ouput will hide sensitive info filter_logger.info('This will hide string_to_hide_1')
@tdesveaux @tdesveaux |
nimp/sys/logging.py
Outdated
""" Custom filter to hide specific information from logging stream """ | ||
def __init__(self, *args): | ||
super().__init__() | ||
self.pattern = re.compile(rf"({'|'.join(args)})") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should escape args just in case re.escape(a) for a in args
if isinstance(arg, list): | ||
record_args.append([self.pattern.sub(hide_string, str(a)) for a in arg]) | ||
elif isinstance(arg, str): | ||
record_args.append(self.pattern.sub(hide_string, str(arg))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why the str(arg)
since you already check that it's an instance of str?
I'm starting to losing tracks and missing the way to test correctly the PR. Maybe we can talk about it together tomorrow.
Source: https://docs.python.org/3/howto/logging.html#configuring-logging-for-a-library and also that summary handler is using |
@ftith good catch I'll look into that |
Handy when we need to hide stuff in CI for example.