Process: add possiblity to hide specific output to call function #92

jasugun · 2023-11-09T17:58:22Z

Handy when we need to hide stuff in CI for example.

nimp/sys/process.py

ftith

The PR works but could be interesting to use SensitiveDataFilter(logging.Filter) instead

jasugun · 2023-11-10T13:48:27Z

> The PR works but could be interesting to use SensitiveDataFilter(logging.Filter) instead

The thing is that the captured output (std threads) does not go through our logger. It's a huge change that I'd rather avoid for now. What do you think?

tdesveaux · 2023-11-10T14:10:22Z

> > The PR works but could be interesting to use SensitiveDataFilter(logging.Filter) instead
>
> The thing is that the captured output (std threads) does not go through our logger. It's a huge change that I'd rather avoid for now. What do you think?

Does it not? https://github.com/dontnod/nimp/pull/92/files#diff-5a44de719039a4548dedb99a3a5d387b5447e1114b1e2bc3b7e87550ee9bcdfdR143

ftith · 2023-11-10T17:40:18Z

I confirm that we are already using a logger for the actual command (but not for the dry-run logging), so it's possible if we use the same logger for the dry run as well.

Here is an example of Filter class:

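import logging
import re

class SensitiveDataFilter(logging.Filter):
    def __init__(self, sensitive_data):
        super().__init__()
        # Escape each string so it is matched literally, not as a regex
        self._pattern = re.compile(f'({"|".join(re.escape(el) for el in sensitive_data)})')
        self._hidden_str = '*****'
    def filter(self, record):
        # Only str args can be substituted; other types are passed through untouched
        record.args = tuple(self._pattern.sub(self._hidden_str, child_arg) if isinstance(child_arg, str) else child_arg for child_arg in record.args)
        return True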

You can then apply it to your logger:

    logger = logging.getLogger('child_processes')
    logger.addFilter(SensitiveDataFilter(hide_output_specific))

Be careful of nested args though, and convert list to str (otherwise the substitution will be broken):
logger.info('%s "%s" in "%s"', '[DRY-RUN]' if dry_run else 'Running', str(command), os.path.abspath(cwd))
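
To illustrate the nested-args caveat, here is a minimal sketch (not the PR's code). It assumes a filter that only substitutes inside string arguments; the `MaskFilter` and `render` names and the `s3cr3t` value are made up for the example.

```python
import logging
import re


class MaskFilter(logging.Filter):
    """Hypothetical filter: masks secrets in string args only."""

    def __init__(self, secrets):
        super().__init__()
        self._pattern = re.compile('|'.join(re.escape(s) for s in secrets))

    def filter(self, record):
        record.args = tuple(
            self._pattern.sub('*****', a) if isinstance(a, str) else a
            for a in record.args
        )
        return True


def render(arg):
    """Build a record by hand, run it through the filter, return the final message."""
    record = logging.LogRecord(
        name='child_processes', level=logging.INFO, pathname='example.py',
        lineno=0, msg='Running "%s"', args=(arg,), exc_info=None,
    )
    MaskFilter(['s3cr3t']).filter(record)
    return record.getMessage()


command = ['tool', '--password', 's3cr3t']
as_list = render(command)        # list arg: the filter skips it, the secret leaks
as_str = render(str(command))    # str arg: the secret is masked
```

With the list passed as-is, the secret is only turned into text when the message is formatted, after the filter has run, which is why converting to `str` at the call site matters.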

jasugun · 2023-11-10T17:50:47Z

You're right, this is going through the logger. And looking at the Python logging code, it's not too hard to subclass.
Françoise, I only just saw your comment; from what I see, I may have ended up doing something similar to yours.

ftith · 2023-11-10T18:06:30Z

nimp/sys/process.py

@@ -109,7 +116,9 @@ def _output_worker(index, decoding_format):
            return
        force_ascii = locale.getpreferredencoding().lower() != 'utf-8'
        while process is not None:
-            logger = logging.getLogger('child_processes')
+            logger_child_processes = logging.getLogger('child_processes')


Is there a reason why you need 2 different loggers? They are in the same function and you apply the same filter.

I've tried using one single logger for the whole function instead.

I don't know if it is relevant to have all the logs in the same logger as the call_logger, actually. I think it would be interesting to keep the distinction between the "child_process" logger and the call_logger. For instance, the difference between the keepalive and the call to Unreal:

[2023.11.13-11.53.58:518][ 0]LogGatherTextFromAssetsCommandlet: Display: [ 9.62%] Loading package: '/Game/Dialogue/P01/R02/Settler/M07/STG/DLG_R02_Settler_M07_A_MagalieAlive_STG_SCRIPT2'...
2023-11-13 11:53:58,635 [INFO] Keepalive for ..\..\.\UE\Engine\Binaries\Win64\UnrealEditor.exe

tdesveaux · 2023-11-14T11:38:27Z

nimp/sys/process.py

 def call(command, cwd='.', heartbeat=0, stdin=None, encoding='utf-8',
-         capture_output=False, capture_debug=False, hide_output=False,
+         capture_output=False, capture_debug=False, hide_output=False, hide_output_specific=None,


Not too sure about the hide_output_specific naming.
It's not really obvious what it does.

Also, it's not really something that's specific to the process.call function, maybe it should be some helper that can be used in a more global way?

I did a small test and it's a bit annoying though:

    import logging
    import re

    logging.basicConfig(level=logging.INFO)

    class SensitiveDataFilter(logging.Filter):
        def __init__(self, *args):
            super().__init__()
            self.pattern = re.compile(rf"({'|'.join(args)})")

        def filter(self, record):
            record.msg = self.pattern.sub("*****", record.msg)
            record.args = tuple(self.pattern.sub("****", a) for a in record.args)
            return super().filter(record)

    filter_ = SensitiveDataFilter('toto', 'tata')

    logger = logging.getLogger()
    logger.addFilter(filter_)

    logger.info('logger: toto %s, %s, %s', 'tata', 'tatato', 'tata:toto')
    logging.info('logging: toto %s, %s, %s', 'tata', 'tatato', 'tata:toto')

    test_logger = logging.getLogger('test')
    test_logger.info('test_logger: toto %s, %s, %s', 'tata', 'tatato', 'tata:toto')

    INFO:root:logger: ***** ****, ****to, ****:****
    INFO:root:logging: ***** ****, ****to, ****:****
    INFO:test:test_logger: toto tata, tatato, tata:toto

New loggers do not inherit filters from the root logger.
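
In the standard `logging` module, a filter attached to a logger only applies to records logged directly through that logger, whereas a filter attached to a handler applies to every record the handler processes, including records propagated from child loggers. One possible workaround, sketched below under that assumption, is to attach the filter to the handler instead. The `MaskFilter` name is hypothetical and this is not what the PR ended up doing.

```python
import io
import logging
import re


class MaskFilter(logging.Filter):
    """Hypothetical filter: masks secrets in the fully formatted message."""

    def __init__(self, *secrets):
        super().__init__()
        self.pattern = re.compile('|'.join(re.escape(s) for s in secrets))

    def filter(self, record):
        # Resolve msg % args first, then mask the final text
        record.msg = self.pattern.sub('*****', record.getMessage())
        record.args = ()
        return True


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(levelname)s:%(name)s:%(message)s'))
handler.addFilter(MaskFilter('toto', 'tata'))  # filter on the handler, not the logger

root = logging.getLogger()
root.addHandler(handler)
root.setLevel(logging.INFO)

# A child logger with no filter of its own: its record still reaches the handler
logging.getLogger('test').info('test_logger: toto %s', 'tata:toto')
output = stream.getvalue()
# output == 'INFO:test:test_logger: ***** *****:*****\n'
```

The trade-off is that every handler that may emit the sensitive text needs the filter, so this only helps when the set of handlers is known.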

@tdesveaux @ftith I added a convenience context manager and filters to filter nimp logs more flexibly; could you please take a look at it?

Example use:

    from nimp.sys.logging import FilteredLogging
    from nimp.sys.logging import SensitiveDataFilter
    <...>
    with FilteredLogging(SensitiveDataFilter('string_to_hide_1', 'string_to_hide_2')) as filter_logger:
        if nimp.sys.process.call(command, dry_run=env.dry_run) != 0:  # call output will hide sensitive info
            filter_logger.info('This will hide string_to_hide_1')  # this will also hide stuff
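
For readers unfamiliar with the pattern, here is a rough sketch of how a context manager of this kind could work. It is not nimp's actual `FilteredLogging` implementation: the `filtered_logging` and `MaskFilter` names are hypothetical, and the sketch assumes the filter is attached to a single named logger on entry and removed on exit.

```python
import contextlib
import io
import logging
import re


class MaskFilter(logging.Filter):
    """Hypothetical filter: masks secrets in the fully formatted message."""

    def __init__(self, *secrets):
        super().__init__()
        self.pattern = re.compile('|'.join(re.escape(s) for s in secrets))

    def filter(self, record):
        record.msg = self.pattern.sub('*****', record.getMessage())
        record.args = ()
        return True


@contextlib.contextmanager
def filtered_logging(log_filter, logger_name='child_processes'):
    """Hypothetical helper: attach log_filter only for the duration of the block."""
    logger = logging.getLogger(logger_name)
    logger.addFilter(log_filter)
    try:
        yield logger
    finally:
        logger.removeFilter(log_filter)


# Demo wiring: capture the logger's output in memory
stream = io.StringIO()
demo_logger = logging.getLogger('child_processes')
demo_logger.addHandler(logging.StreamHandler(stream))
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False  # keep the demo output out of any root handlers

with filtered_logging(MaskFilter('string_to_hide_1')) as filter_logger:
    filter_logger.info('token=%s', 'string_to_hide_1')  # masked
demo_logger.info('token=%s', 'string_to_hide_1')        # filter removed: not masked

lines = stream.getvalue().splitlines()
```

Removing the filter in a `finally` clause keeps the masking scoped to the block even if the wrapped call raises.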


Also add a convenience filter for hiding stuff. Typical use:

    with FilteredLogging(SensitiveDataFilter('string_to_hide_1', 'string_to_hide_2')) as filter_logger:
        if nimp.sys.process.call(bpt_command, dry_run=env.dry_run) != 0:  # call output will hide sensitive info
            filter_logger.info('This will hide string_to_hide_1')

jasugun · 2023-11-27T09:02:06Z

@tdesveaux @tdesveaux
Could you please take a look at this whole new way to filter our logging?

tdesveaux · 2023-11-27T16:18:46Z

nimp/sys/logging.py

+    """ Custom filter to hide specific information from logging stream """
+    def __init__(self, *args):
+        super().__init__()
+        self.pattern = re.compile(rf"({'|'.join(args)})")


I think you should escape the args, just in case: re.escape(a) for a in args
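
A small sketch of why escaping matters, using made-up secrets that contain regex metacharacters. Without `re.escape`, a secret can both fail to match itself and match unrelated text; a secret with an unbalanced `(` would even make `re.compile` raise `re.error`.

```python
import re

# Made-up secrets containing regex metacharacters ('.' and '+')
secrets = ['p@ss.word', 'a+b']

unescaped = re.compile(rf"({'|'.join(secrets)})")
escaped = re.compile('|'.join(re.escape(s) for s in secrets))

line = 'login p@ss.word a+b / lookalike p@ssXword'

unescaped_result = unescaped.sub('*****', line)
# 'login ***** a+b / lookalike *****'
#   'a+b' leaks (the pattern means "one or more a, then b")
#   'p@ssXword' is masked by accident ('.' matches any character)

escaped_result = escaped.sub('*****', line)
# 'login ***** ***** / lookalike p@ssXword'
```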

tdesveaux · 2023-11-27T16:23:30Z

nimp/sys/logging.py

+            if isinstance(arg, list):
+                record_args.append([self.pattern.sub(hide_string, str(a)) for a in arg])
+            elif isinstance(arg, str):
+                record_args.append(self.pattern.sub(hide_string, str(arg)))


Why the str(arg), since you already check that it's an instance of str?

ftith · 2023-11-27T17:56:40Z

I'm starting to lose track, and I'm not sure how to test the PR properly. Maybe we can talk about it together tomorrow.
A reminder just in case:

Note: It is strongly advised that you do not log to the root logger in your library. Instead, use a logger with a unique and easily identifiable name, such as the name for your library’s top-level package or module. Logging to the root logger will make it difficult or impossible for the application developer to configure the logging verbosity or handlers of your library as they wish.

Source: https://docs.python.org/3/howto/logging.html#configuring-logging-for-a-library

Also, note that the summary handler is using logging.getLogger('child_processes').

jasugun · 2023-11-28T09:23:14Z

@ftith good catch, I'll look into that.

jasugun requested review from tdesveaux and ftith November 9, 2023 17:58

tdesveaux reviewed Nov 10, 2023

ftith reviewed Nov 10, 2023

ftith approved these changes Nov 10, 2023

ftith approved these changes Nov 13, 2023

tdesveaux reviewed Nov 14, 2023

jasugun added 5 commits November 22, 2023 17:28

Process: add possiblity to hide specific output to call function

0dd01ef

Handy when we need to hide stuff in CI for example.

Process: move hide_specific logged command inside not hide_output block

746d3ac

Process: Call: use regex.sub() to hide strings

e6e66d9

Process: use SensitiveDataLogger() to hide sensitive data

34877c2

Process: use one single logger (that can use SensitiveDataFilters)

9f0ac0e

... for all call function.

jasugun force-pushed the lca/process/call_hide_specific branch from 13ed4ed to 9f0ac0e Compare November 22, 2023 16:28

jasugun added 3 commits November 23, 2023 17:11

Call: hide record args as well as record msg

774661c

Process: Call: Let logger handle formatting

6310b66

jasugun requested review from ftith and tdesveaux November 24, 2023 16:18

tdesveaux approved these changes Nov 27, 2023

Logging: escape sensitive data

c8a41cf

jasugun merged commit 935509a into dev Nov 27, 2023
2 checks passed
