Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggested checksum is incorrect when there are multiple files with the same name #344

Closed
wf-anniezhou opened this issue Oct 27, 2021 · 3 comments

Comments

@wf-anniezhou
Copy link

Describe the bug
When Talisman finds an error in a file at the root level, and there is another file with the exact same name in a different folder, the suggest checksum to paste into .talismanrc will not cause the errors to be ignored when running Talisman again.

To Reproduce
Steps to reproduce the behavior:

  1. Create a repo that contains a child folder, create an empty file under that folder (e.g. README.md)
  2. Make a commit
  3. Create a file with the same name (e.g. README.md) at the root level. Include a sensitive pattern in this file.
  4. Run the Talisman pre-commit hook, it will report an error related to the sensitive pattern and output the following:
If you are absolutely sure that you want to ignore the above files from talisman detectors, consider pasting the following format in .talismanrc file in the project root

fileignoreconfig:
- filename: README.md
  checksum: <someChecksumString>

  1. Paste the suggested fileignoreconfig into .talismanrc
  2. Run the Talisman pre-commit hook again, it will report the same error and suggest the same fileignoreconfig even though you have already added this in .talismanrc

Expected behavior
Pasting the suggested fileignoreconfig into .talismanrc should cause the errors to be ignored.

Likely cause
The checksum suggested by Talisman is for the individual file containing the sensitive pattern. When checking for whether a file should be ignored, the file name in .talismanrc is treated as a pattern and matched to both README.md and child/README.md.

If you run talisman --checksum README.md, it will output a different checksum string to the one generated above. Putting this different checksum into .talismanrc will cause the error to be ignored, which is the correct behavior. This indicates there's a mismatch relating to the checksum being generated.

@svishwanath-tw
Copy link
Collaborator

Hi @wf-anniezhou , thanks for taking the time to submit a detailed issue.

I believe this could be duplicate or side-effect of #342.

When talisman chooses to read a file, it uses the full the path relative to the root of the repository. This makes it highly unlikely that talisman is mixing 2 separate files in the same repository.

@wf-anniezhou
Copy link
Author

@svishwanath-tw that's an interesting theory. I'm not sure if it is related, in this case there's no dirty file that's not in the commit, and I do not see faulty checksums where there are no duplicate file names.

Another thing I noticed while looking through the source code is there's logic for calculating the collective checksum for files based on patterns (i.e. wildcards), hence the theory that the matching for file name may be pattern based rather than path specific.

@svishwanath-tw
Copy link
Collaborator

@wf-anniezhou : Please check if the latest version of talisman fixes this issue. Closing it for now. If you think it still exists. Please re-open the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants