Skip to content

enhancement(filters): use a stricter bot token regex#2006

Merged
mbaruh merged 11 commits into
python-discord:mainfrom
onerandomusername:enhance-token-regex
Oct 14, 2022
Merged

enhancement(filters): use a stricter bot token regex#2006
mbaruh merged 11 commits into
python-discord:mainfrom
onerandomusername:enhance-token-regex

Conversation

@onerandomusername
Copy link
Copy Markdown
Contributor

@onerandomusername onerandomusername commented Dec 12, 2021

Approval: https://canary.discord.com/channels/267624335836053506/635950537262759947/919203386660884560

Enhances the regex of the token remover to use the same regex that discord itself uses, with a slight modification. The mfa section was removed, but depending on an updated #1421, may be implemented. Additionally, the sections were grouped to keep working with the current code.

I kept the existing validation to keep false positives at a minimum. The current code checks the user resolves, the timestamp is valid, and the last section has at least 3 different characters.

As per @jb3, to implement:

Rejected:

  • also check the upper bound of a token timestamp, to ensure it was in the past.

@onerandomusername onerandomusername force-pushed the enhance-token-regex branch 2 times, most recently from c676f1a to b5f1305 Compare December 12, 2021 06:02
@onerandomusername
Copy link
Copy Markdown
Contributor Author

I'll be fixing this commit history tonight

@Xithrius Xithrius requested review from Bluenix2 and MarkKoz December 15, 2021 14:03
@Xithrius Xithrius added a: backend Related to internal functionality and utilities (error_handler, logging, security, utils and core) a: filters Related to message filters: (antimalware, antispam, filtering, token_remover) p: 2 - normal Normal Priority s: needs review Author is waiting for someone to review and approve t: enhancement Changes or improvements to existing features labels Dec 15, 2021
Copy link
Copy Markdown
Contributor

@Bluenix2 Bluenix2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works as usual - I can't get access to an MFA token to test though. Thanks for this, I just have one comment that won't affect my review.

Comment thread bot/exts/filters/token_remover.py
Copy link
Copy Markdown
Contributor

@Akarys42 Akarys42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just a small style change

Comment thread bot/exts/filters/token_remover.py
Copy link
Copy Markdown
Contributor

@wookie184 wookie184 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple things, could you also make the message match the other token message a bit closer, particularly:

  • having the user as a clickable mention and the ID in a codeblock afterwards
  • having the channel as a clickable mention
  • putting the censored token in a codeblock
  • adding the pfp of the user that send the message as a thumbnail

Comment thread bot/exts/filters/token_remover.py Outdated
Comment thread bot/exts/filters/token_remover.py Outdated
Co-authored-by: wookie184 <wookie1840@gmail.com>
@ToxicKidz ToxicKidz added s: waiting for author Waiting for author to address a review or respond to a comment and removed s: needs review Author is waiting for someone to review and approve labels Feb 19, 2022
@wookie184
Copy link
Copy Markdown
Contributor

This has been inactive for a while so I'll put it up for grabs.

@wookie184 wookie184 added up for grabs Available for anyone to work on and removed s: waiting for author Waiting for author to address a review or respond to a comment labels Sep 21, 2022
@gustavwilliam
Copy link
Copy Markdown
Contributor

This has been inactive for a while so I'll put it up for grabs.

What's left to do? Addressing your comment and fixing the merge conflict?

@mbaruh
Copy link
Copy Markdown
Member

mbaruh commented Oct 8, 2022

This has been inactive for a while so I'll put it up for grabs.

What's left to do? Addressing your comment and fixing the merge conflict?

Yeah, merge main, match the output embed to the existing one, and fix the bug

@onerandomusername
Copy link
Copy Markdown
Contributor Author

As far as I can tell the mfa code is no longer necessary as discord doesn't have mfa tokens like that now.

@onerandomusername
Copy link
Copy Markdown
Contributor Author

onerandomusername commented Oct 9, 2022

There's also a bit of the matter that this regex is now out of date. I don't know what the current token length is.

@onerandomusername onerandomusername marked this pull request as draft October 9, 2022 05:26
@wookie184
Copy link
Copy Markdown
Contributor

There's also a bit of the matter that this regex is now out of date. I don't know what the current token length is.

I can't recall any problems with false positives, even with the current regex, so we can afford to be very general with what we match. Could just check that the first and last parts are >= 10 and middle is >= 5 characters in length or something.

@Akarys42
Copy link
Copy Markdown
Contributor

Akarys42 commented Oct 9, 2022

I agree with wookie, I think having extensive matching for the sake of even future-proofing is okay, considering additional checks are being performed

@onerandomusername
Copy link
Copy Markdown
Contributor Author

The only inaccurate part is the last section is too short as it is, but because of how the parsing works, that isn't an issue, i suppose

@onerandomusername onerandomusername marked this pull request as ready for review October 10, 2022 01:31
@wookie184
Copy link
Copy Markdown
Contributor

The only inaccurate part is the last section is too short as it is, but because of how the parsing works, that isn't an issue, i suppose

If that's the case I think we should just have no upper limit on the length of the last section. Otherwise the code is somewhat misleading in how it works.

@mbaruh mbaruh removed the up for grabs Available for anyone to work on label Oct 10, 2022
@mbaruh
Copy link
Copy Markdown
Member

mbaruh commented Oct 14, 2022

Hey @onerandomusername, thanks for your work so far. We needed to get this merged so went ahead and implemented the comments. Sorry the MFA part didn't end up being used 😅

@mbaruh mbaruh dismissed wookie184’s stale review October 14, 2022 12:32

Requested changes were related to the removed MFA part.

@mbaruh mbaruh merged commit cdb9183 into python-discord:main Oct 14, 2022
@onerandomusername onerandomusername deleted the enhance-token-regex branch October 14, 2022 16:22
Comment thread bot/exts/filters/token_remover.py Outdated
# Each part only matches base64 URL-safe characters.
# These regexes were taken from discord-developers, which are used by the client itself.
TOKEN_RE = re.compile(r"([a-z0-9_-]{23,28})\.([a-z0-9_-]{6,7})\.([a-z0-9_-]{27})", re.IGNORECASE)
TOKEN_RE = re.compile(r"([\w_-]{10,})\.([\w_-]{5,})\.([\w_-]{10,})", re.IGNORECASE)
Copy link
Copy Markdown
Contributor Author

@onerandomusername onerandomusername Oct 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

way too lenient, this makes clearly impossible tokens match, which was the entire point of this pr.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Way too lenient is not defined by what it can match, but what it will match in practice, which so far has not been an issue. This PR made the regex a little more flexible in terms of false-positives, without increasing the possibility of false-negatives or creating something that has to be modified in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

a: backend Related to internal functionality and utilities (error_handler, logging, security, utils and core) a: filters Related to message filters: (antimalware, antispam, filtering, token_remover) p: 2 - normal Normal Priority t: enhancement Changes or improvements to existing features

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants