Improve DiscordBotTokenDetector and its tests. #628

Merged (1 commit) on Mar 27, 2023

nuztalgia

Fixes #627.

See the linked issue for more context. As mentioned there, this is the "bare minimum" fix for the implementation. Many test cases have also been added or clarified.

lirantal

I asked ChatGPT to explain the regex, in case it helps get this PR merged more quickly:

This is a regular expression used to match a specific pattern in a string. It is being passed to the re.compile() function, which compiles the regular expression into a pattern object that can be used for matching.

Here is a breakdown of the regular expression:

  • r': This indicates that the string is a raw string, which means that backslashes are treated literally and do not need to be escaped.

  • [MNO]: This matches any single character that is either M, N, or O.

  • [a-zA-Z\d_-]: This matches any single character that is either a lowercase letter (a-z), an uppercase letter (A-Z), a digit (\d), an underscore (_), or a hyphen (-).

  • {23,25}: This indicates that the preceding character or character group (in this case, [a-zA-Z\d_-]) must be repeated 23 to 25 times.

  • \.: This matches a literal period (.) character.

  • [a-zA-Z\d_-]{6}\.: This matches any sequence of 6 characters that are either a letter, a digit, a hyphen, or an underscore, followed by a period.

  • [a-zA-Z\d_-]{27}: This matches any sequence of 27 characters that are either a letter, a digit, a hyphen, or an underscore.

Overall, this regular expression is used to match strings that have the following form:

  • The first character is either M, N, or O.
  • The next 23 to 25 characters are either letters, digits, hyphens, or underscores.
  • The next character is a period.
  • The next 6 characters are either letters, digits, hyphens, or underscores.
  • The next character is a period.
  • The final 27 characters are either letters, digits, hyphens, or underscores.

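For example, the string "Mabcdefghijklmnopqrstuvwxy.123456.abcdefghijklmnopqrstuvwxyz1" would match this regular expression in full: M plus 25 characters, a period, 6 characters, a period, and exactly 27 characters. Because the pattern is not anchored, a longer string that contains such a sequence would also produce a match when searched.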
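A minimal sketch of the breakdown above, for anyone who wants to try it locally. The pattern is reassembled from the pieces listed in this comment. The names `DISCORD_TOKEN_RE` and `make_candidate` are made up for illustration and are not taken from the detector's source, and the candidates are synthetic strings, not real tokens.

```python
import re

# Pattern reassembled from the breakdown above (constant name is illustrative).
DISCORD_TOKEN_RE = re.compile(
    r'[MNO][a-zA-Z\d_-]{23,25}\.[a-zA-Z\d_-]{6}\.[a-zA-Z\d_-]{27}'
)


def make_candidate(first_char, first_len, last_len=27):
    """Build a synthetic, non-secret string with the given segment lengths."""
    return f"{first_char}{'a' * first_len}.{'b' * 6}.{'c' * last_len}"


# fullmatch() requires the whole string to fit the pattern.
for first_char, first_len in [("M", 23), ("N", 24), ("O", 25),
                              ("P", 23), ("M", 22), ("M", 26)]:
    candidate = make_candidate(first_char, first_len)
    matched = DISCORD_TOKEN_RE.fullmatch(candidate) is not None
    print(f"first char {first_char!r}, segment length {first_len}: {matched}")

# search() finds the pattern anywhere in the string, so a final segment
# longer than 27 characters still yields a match on its first 27 characters.
longer = make_candidate("M", 25, last_len=29)
found = DISCORD_TOKEN_RE.search(longer)
print(found is not None, DISCORD_TOKEN_RE.fullmatch(longer) is not None)
```

The difference between `search()` and `fullmatch()` matters when reading the test cases: a string can contain a match without being one from end to end.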


The regular expression itself has changed in only two ways: it now also accepts O as the first character, and it allows the character run before the first period to be longer, up to 25 characters.
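To make the effect of the change concrete, here is a rough comparison. Note the assumption: `OLD_PATTERN` is my guess at the previous regex, inferred from the description in this comment (first character M or N, exactly 23 characters before the first period); it is not copied from the repository. `NEW_PATTERN` follows the breakdown above. The helper `synthetic` and the inputs are illustrative only.

```python
import re

# ASSUMPTION: inferred from this comment, not copied from the repository.
OLD_PATTERN = re.compile(
    r'[MN][a-zA-Z\d_-]{23}\.[a-zA-Z\d_-]{6}\.[a-zA-Z\d_-]{27}'
)
# Pattern as described in the breakdown above.
NEW_PATTERN = re.compile(
    r'[MNO][a-zA-Z\d_-]{23,25}\.[a-zA-Z\d_-]{6}\.[a-zA-Z\d_-]{27}'
)


def synthetic(first_char, first_len):
    """Build a synthetic, non-secret string shaped like the described format."""
    return first_char + "x" * first_len + "." + "y" * 6 + "." + "z" * 27


cases = {
    "M, 23 chars": synthetic("M", 23),
    "O, 23 chars": synthetic("O", 23),
    "M, 25 chars": synthetic("M", 25),
    "O, 25 chars": synthetic("O", 25),
}

# Record (old result, new result) for each case.
results = {
    label: (OLD_PATTERN.fullmatch(text) is not None,
            NEW_PATTERN.fullmatch(text) is not None)
    for label, text in cases.items()
}

for label, (old_ok, new_ok) in results.items():
    print(f"{label}: old={old_ok} new={new_ok}")
```

Under that assumption, only the first case fits both patterns; the other three are accepted by the new pattern alone.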


This PR also includes more tests and is passing CI.
@syn-4ck it's a relatively small change to the regex that extends it to support other token formats. Think we can land this?

jpdakran merged commit 7d569e0 into Yelp:master on Mar 27, 2023
nuztalgia deleted the fix-discord-bot-token-detector branch on Mar 27, 2023, 18:01

Successfully merging this pull request may close these issues.

DiscordBotTokenDetector failing to detect some Discord bot tokens