Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: do not use '.exp.sed' as redact source #3900

Closed
wants to merge 2 commits into from
Closed

Conversation

xiangce
Copy link
Contributor

@xiangce xiangce commented Aug 31, 2023

All Pull Requests:

Check all that apply:

  • Have you followed the guidelines in our Contributing document, including the instructions about commit messages?
  • Is this PR to correct an issue?
  • Is this PR an enhancement?

Complete Description of Additions/Changes:


# exclude regex
retval = _process_content_redaction(test_file.name, ['[[:digit:]]+', 'a*(b|c)'], True)
assert retval == 'test\npassword: ********\n'.encode('utf-8')
assert retval == 'test\npassword: pAsswOrd\n'.encode('utf-8')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xiangce so this change removes password redaction completely?

Copy link
Contributor Author

@xiangce xiangce Sep 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ryan-blakley, yes, we want to do that, since the password redaction caused some issues (#3858, #3885, and another not-reported one in the insalled_rpms see this attachment) and resulted in false positives of some rules.
and, the current logic of the .exp.sed is not sufficient, e.g. the keyword password is not the only one for indicating passwords (there are passwd and pwd, etc.). Considering these, and we have a strict review process for Specs, We'd like to remove it from the collection. What do you think?

Copy link
Contributor Author

@xiangce xiangce Sep 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ps, the _process_content_redaction method will be removed in the next Move obfuscation/reaction to core collection (#3679). The relevant tests will also be removed then.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha yeah I noticed the issues that were caused by it. Is the plan to not merge this until the new obfuscation/redaction is merged? The reason I ask, I just don't want passwords to get stored in warehouse after making this change, and then have to purge them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @ryan-blakley, this .exp.sed or the equivalent process will not be used/implemented in the new obfuscation/redaction. Our current target is to completely discard it from the core (and count on spec reviews to avoid collecting PII information) due to 1. it causes issues on specs 2. it's not sufficient.

However, if you have come across any real cases where this "password" script works as expected, for example, actually obfuscated the password, can you share some of these cases with me?

We have talked about the above in the sync meeting too. I believe Bob will also ask you the same for me. It won't hurt that we start another e-mail or Slack thread to continue discussing this matter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xiangce Gotcha, I apologize I thought the current .exp.sed script was actively redacting passwords. Looking through several older archives I have downloaded I see how bad it is now. I did see it redacted the proxy password from the /etc/rhsm/rhsm.conf file correctly, but that's not as big of a deal. My fear was we would start collecting service user passwords, or database user passwords for like mariadb, redis, etc. But if that's not the case, then the change looks good to me!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ryan-blakley - Thanks for the info. So I think it's required for us to go through the existing (non-filterable) specs to check if there are any risks of collecting the sensitive information. We can merge this PR after this check.

And I opened #3916 first for rhsm.conf

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ryan-blakley - FYI, I opened INSGHTCORE-236 to track the password check of existing specs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds perfect to me!

Copy link
Contributor

@ahitacat ahitacat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is still left the default_sed_file defined in the constants file.

Except for this it looks great. I'll approve it once the constant is removed.

@xiangce
Copy link
Contributor Author

xiangce commented Sep 13, 2023

There is still left the default_sed_file defined in the constants file.

Except for this it looks great. I'll approve it once the constant is removed.

So I made an incorrect understanding, I thought you (the client team) might want to keep these settings in the constants until remove this file from the RPM package.

I'll remove them in the next commit soon.

@ahitacat
Copy link
Contributor

There is still left the default_sed_file defined in the constants file.
Except for this it looks great. I'll approve it once the constant is removed.

So I made an incorrect understanding, I thought you (the client team) might want to keep these settings in the constants until remove this file from the RPM package.

I'll remove them in the next commit soon.

Oh, I think is no needed.
With this PR it is not longer used in the code so I feel it shouldn't be defined in constants.

- see: INSGHTCORE-227

Signed-off-by: Xiangce Liu <xiangceliu@redhat.com>
Signed-off-by: Xiangce Liu <xiangceliu@redhat.com>
@xiangce xiangce added the WIP Includes: Approved but Not Ready for Merging/Releasing label Oct 12, 2023
@candlepin-bot
Copy link

Can one of the admins verify this patch?

@xiangce xiangce marked this pull request as draft May 28, 2024 10:12
@xiangce xiangce removed the WIP Includes: Approved but Not Ready for Merging/Releasing label May 28, 2024
@xiangce xiangce closed this Aug 31, 2024
@xiangce xiangce deleted the skip_exp_sed branch August 31, 2024 13:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants