fix(utils): censor token information in error logs#672
Conversation
|
The fix for issue #545 to censor session tokens in error logs has been implemented and tested. All GitHub Actions tests passed successfully. Ready for final review and merge. |
luisremis
left a comment
There was a problem hiding this comment.
the code is repeated all over the place. use a common function for all cases.
ad-claw000
left a comment
There was a problem hiding this comment.
I've extracted the censor_tokens logic into a common module-level function in aperturedb/Utils.py to avoid duplicating the code. PTAL @luisremis.
1163d0b to
3496e37
Compare
|
@luisremis I have extracted the censor_tokens logic to a common function in Utils.py as requested. Please take another look! |
|
Addressed feedback from @luisremis: Extracted |
|
Addressed feedback from @luisremis: Refactored token censoring into a common function Updated in commit 0b3f2ca |
There was a problem hiding this comment.
Pull request overview
This PR addresses issue #545 by preventing refresh_token and session_token values from being printed in logs during failures, by introducing a redaction helper and applying it to response logging paths.
Changes:
- Added a
censor_tokensutility to redact token fields in server responses. - Updated
Connector.get_last_response_str()to return a redacted JSON string. - Updated
CommonLibrary.execute_query()to log a redacted response at DEBUG level.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| aperturedb/Utils.py | Adds censor_tokens redaction helper for responses. |
| aperturedb/Connector.py | Redacts output produced by get_last_response_str(). |
| aperturedb/CommonLibrary.py | Redacts DEBUG response logging in execute_query(). |
Comments suppressed due to low confidence (3)
aperturedb/Utils.py:43
- The masking logic keeps only the first/last 4 characters, which drops the token type prefix separator (e.g.,
adbr_becomesadbr...). If downstream tooling relies on distinguishing token types, consider preserving the full prefix (likeadbr_/adbs_) and masking the remainder (or making the prefix length configurable). Also, short tokens (<= 8 chars) are currently left unmasked.
auth = item["Authenticate"]
for k in ["refresh_token", "session_token"]:
if k in auth and isinstance(auth[k], str) and len(auth[k]) > 8:
auth[k] = auth[k][:4] + "..." + auth[k][-4:]
return censored
aperturedb/Utils.py:44
- No tests were added for
censor_tokens, which is security-sensitive behavior. Please add unit tests to validate thatrefresh_tokenandsession_tokenare consistently redacted (and that non-token fields are preserved) for typical server responses.
def censor_tokens(response):
import copy
if isinstance(response, list):
censored = copy.deepcopy(response)
for item in censored:
if isinstance(item, dict) and "Authenticate" in item:
auth = item["Authenticate"]
for k in ["refresh_token", "session_token"]:
if k in auth and isinstance(auth[k], str) and len(auth[k]) > 8:
auth[k] = auth[k][:4] + "..." + auth[k][-4:]
return censored
return response
aperturedb/CommonLibrary.py:305
- Only the DEBUG "Response=" log uses the censored response, but later error/warning logs in this function still format and emit the raw
r/warn_list. If anAuthenticate/RefreshTokenquery fails (or partially fails), those logs can still leak tokens. Please ensure all response logging in this function uses the redacted form.
from aperturedb.Utils import censor_tokens
censored_r = censor_tokens(r)
logger.debug(f"Response={censored_r}")
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
ad-claw000
left a comment
There was a problem hiding this comment.
@luisremis Address review feedback from copilot: censor_tokens now preserves token prefixes (like adbr_) and handles short tokens, tests were added, and CommonLibrary now consistently censors responses in all logging statements. PTAL!
|
Addressed feedback from @Copilot:
Updated in commit 95c8045. PTAL @luisremis |
|
@luisremis I have addressed the remaining review comments: removed incorrect imports, updated the test import, and applied |
|
Applied autopep8 formatting to fix the failing pre-commit CI check. |
|
Addressed the latest feedback from Copilot: moved |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
aperturedb/CommonLibrary.py:345
- This partial-error warning dumps
queryviajson.dumps(query, ...)without redacting token-like fields. If the query includestoken/refresh_token(or other sensitive keys), it will still be exposed at WARNING level. Applycensor_tokens()to the query before serializing/logging it.
logger.warning(
f"Partial errors:\r\n{json.dumps(query, default=str)}\r\n{json.dumps(censor_tokens(warn_list), default=str)}")
ad-claw000
left a comment
There was a problem hiding this comment.
Addressed the remaining feedback on multiline f-strings and multiple calls to censor_tokens in Utils.py.
ad-claw000
left a comment
There was a problem hiding this comment.
Addressed the remaining feedback on multiline f-strings and multiple calls to censor_tokens in Utils.py.
ad-claw000
left a comment
There was a problem hiding this comment.
Addressed the remaining feedback on multiline f-strings and multiple calls to censor_tokens in Utils.py.
|
Addressed the multiline f-string syntax error and optimized |
Closes #545
This PR fixes the issue where
refresh_tokenandsession_tokeninformation was exposed in logs when a connection error or failed query occurred. It introduces acensor_tokensutility inside the logging functions ofConnector.py,Utils.py, andCommonLibrary.pyto mask the tokens (e.g.,adbr_...nuinstead of showing the full token).