Classify base64 encoded tokens as belonging to particular tools/services #158

Closed
1 of 4 tasks
justineyster opened this issue Apr 9, 2019 · 3 comments
Labels
pending The issue still needs to be reviewed by one of the maintainers.

Comments

@justineyster
justineyster commented Apr 9, 2019

Raising this as a parallel issue to one I opened today in the IBM fork.

Context

@jribm raised a point while I was working on #156: under our current approach, base64-encoded strings won't be classified as belonging to a particular tool. They may be caught by the base64 entropy scanner, but without an association with a particular tool they will not be verifiable.

Examples:
X-JFrog-Art-Api: <some-base64-encoded-string>
artifactory:_password: <some-base64-encoded-string>

In both of these examples, the surrounding structure (the header name or the config key) indicates that the key belongs to a particular service. However, the value itself won't match as an Artifactory key, because the encoded string doesn't follow the expected token format.

Given this issue, we could design a two-step approach: base64-decode suspicious strings, then check whether the decoded value matches a particular tool's token format. I can imagine at least two ways of doing this:

  1. Search for indicators of a particular tool's token (like the authentication header X-JFrog-Art-Api in the examples above), decode the suspicious string near that indicator, and test it against the regex for that service.
  2. Base64-decode strings that are caught by the base64 entropy scanner and test each decoded string against all of the other secret detectors.
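
To make the two options concrete, here is a minimal sketch of both. It is not the detect-secrets plugin API: the function names, the `DETECTORS` registry, and the `AKC...` token pattern are all hypothetical, chosen only so the example is self-contained. Only the two indicators (`X-JFrog-Art-Api` and `artifactory:_password`) come from the examples above.

```python
import base64
import re

# Hypothetical token pattern, for illustration only: an "AKC" prefix followed
# by at least 10 alphanumeric characters. Not the project's actual detector.
ARTIFACTORY_TOKEN = re.compile(r'^AKC[a-zA-Z0-9]{10,}$')

# Indicators taken from the examples in this issue; the captured group is
# whatever non-whitespace value follows the indicator on the same line.
INDICATOR = re.compile(r'(?:X-JFrog-Art-Api|artifactory:_password):\s*(\S+)')

# Hypothetical registry mapping a service name to its token pattern.
DETECTORS = {'Artifactory': ARTIFACTORY_TOKEN}


def try_b64decode(candidate):
    """Return the decoded text, or None if candidate is not base64-encoded text."""
    try:
        # validate=True rejects characters outside the base64 alphabet.
        return base64.b64decode(candidate, validate=True).decode('utf-8').strip()
    except ValueError:
        # binascii.Error and UnicodeDecodeError are both ValueError subclasses.
        return None


def classify_near_indicator(line):
    """Approach 1: decode the value next to a known indicator, test one regex."""
    match = INDICATOR.search(line)
    if not match:
        return None
    decoded = try_b64decode(match.group(1))
    if decoded and ARTIFACTORY_TOKEN.match(decoded):
        return 'Artifactory'
    return None


def classify_high_entropy(candidate):
    """Approach 2: decode an entropy-flagged string, test every detector."""
    decoded = try_b64decode(candidate)
    if decoded is None:
        return []
    return [name for name, pattern in DETECTORS.items() if pattern.match(decoded)]
```

With a made-up token such as `AKCabcdefghij1234`, encoded via `base64.b64encode`, `classify_near_indicator('X-JFrog-Art-Api: ' + encoded)` returns `'Artifactory'`, while the same encoded value behind an unknown header returns `None`. Approach 1 runs one regex per indicator hit; approach 2 needs no indicator but runs every detector against every decodable high-entropy string.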

Subtasks & step(s)

  • Raise a parallel issue in https://github.com/Yelp/detect-secrets to gather feedback from upstream community.
  • Decide on general approach for decoding and testing suspicious strings.
  • Implement solution and merge in our codebase and upstream.

Success criteria

  • base64 encoded tokens will be classified as belonging to a particular service.

@KevinHock
Collaborator

I'm kind of ambivalent. Which approach do you prefer?

I think 1. can be accomplished with something similar to the keyword detector.

For 2. what would your example base64 strings decode to?

@KevinHock
Collaborator

Noting for posterity, and because we have verifiability now, GitHub API tokens are 40 chars, and can easily be verified via the oauth/scopes endpoint, though I am having a hard time finding the exact link to that API. I can say I hit it yesterday.

lorenzodb1 added the pending label and removed the enhancement label on Jun 13, 2022
@lorenzodb1
Contributor

We're going to close this issue as it hasn't received any updates in a very long time. Feel free to re-open it if you think it's still relevant.


3 participants