Skip to content

Conversation

@sh-rp
Copy link
Collaborator

@sh-rp sh-rp commented Nov 5, 2024

Description

This change will require the DictValidationException to have the full incoming validatable object. We extract all strings from this object and make sure that the error message does not have any of these in clear text.

We could probably allow the user to set an env var to disable this obfuscator for development.

Originally the plan was to also do stuff to prevent leakage from the configs, but the __str__ method of configs currently is not implemented and just returns the default python object identifier. So I don't think there is anything to be done there.

@netlify
Copy link

netlify bot commented Nov 5, 2024

Deploy Preview for dlt-hub-docs canceled.

Name Link
🔨 Latest commit 840832c
🔍 Latest deploy log https://app.netlify.com/sites/dlt-hub-docs/deploys/672a52b15f37540008342ae1

@sh-rp sh-rp force-pushed the fix/validate_dict_value_obfuscation branch from db97219 to 840832c Compare November 5, 2024 17:15
@sh-rp sh-rp requested review from burnash and rudolfix November 5, 2024 17:50
@sh-rp sh-rp added the bug Something isn't working label Nov 5, 2024

def obfuscate_string(s: str) -> str:
"""Obfuscates string by replacing some or all characters with asterisks"""
if len(s) < 6:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd consider replacing this magic number with a constant and maybe use a single constant for both this place as well as here:

def _mask_secret(secret: Optional[str]) -> str:

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually I think we can get rid of this mask_secrets code entirely, no?

Copy link
Collaborator

@willi-mueller willi-mueller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation is nice! I think it would allow us to delete TODO comments, such as here:

# Here, we assume that OAuth2 and other custom classes that don't implement __get__()

Just in case you have discussed this already, you could maybe update the discussion here with your results:
#1797

Anton and I argued for the method this PR proposes, Marcin was against it.

@sh-rp
Copy link
Collaborator Author

sh-rp commented Nov 6, 2024

@willi-mueller what where the arguments against implementing it this way?

@sh-rp sh-rp self-assigned this Nov 19, 2024
Copy link
Collaborator

@burnash burnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, two important from my point of view to consider:

  1. Allow the user to disable the obfuscation via a setting in runtime configuration
  2. Instead of partial masking of the secrets where we leave first and last character, do the total masking as @rudolfix initially suggested.

@zilto
Copy link
Collaborator

zilto commented Mar 11, 2025

Status on this PR? The obfuscation feature could be valuable when passing the dlt config to LLMs via the MCP

@burnash burnash moved this from In Progress to Planned in dlt core library Mar 17, 2025
@sh-rp sh-rp closed this May 19, 2025
@sh-rp sh-rp deleted the fix/validate_dict_value_obfuscation branch May 19, 2025 13:51
@github-project-automation github-project-automation bot moved this from Planned to Done in dlt core library May 19, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

5 participants