-
Notifications
You must be signed in to change notification settings - Fork 919
Add whitelist option to URL lock #27
base: master
Are you sure you want to change the base?
Conversation
@garciaW Oo pretty good one |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so far so good, but a bunch of issues i can see arising.
@@ -80,6 +83,59 @@ def locktypes(bot: Bot, update: Update): | |||
update.effective_message.reply_text("\n - ".join(["Locks: "] + list(LOCK_TYPES) + list(RESTRICTION_TYPES))) | |||
|
|||
|
|||
@user_admin | |||
def add_whitelist(bot: Bot, update: Update): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be @run_async
to ensure as few blocks as possible
Also needs to be @loggable
to ensure log channels get info -> which means returning a log at the end as well
tg_bot/modules/locks.py
Outdated
if sql.add_whitelist(chat.id, url): | ||
added.append(url) | ||
if added: | ||
message.reply_text("Added {} to whitelist.".format(', '.join(w for w in added))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary list comprehension.
Also, consider using newlines starting with -
to delimit the different urls; putting everything on one line will be messy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, fixed.
tg_bot/modules/locks.py
Outdated
def add_whitelist(bot: Bot, update: Update): | ||
chat = update.effective_chat # type: Optional[Chat] | ||
message = update.effective_message # type: Optional[Message] | ||
entities = message.parse_entities(MessageEntity.URL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parse_entities takes a list
https://python-telegram-bot.readthedocs.io/en/stable/telegram.message.html?highlight=parse_entities
message.reply_text("No URLs were added to the whitelist") | ||
|
||
|
||
@user_admin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as before; @run_async
and @loggable
. applies for all funcs
tg_bot/modules/locks.py
Outdated
removed.append(url) | ||
if removed: | ||
message.reply_text("Removed `{}` from whitelist.".format('`, `'.join(escape_markdown(w) for w in removed)), | ||
parse_mode=ParseMode.MARKDOWN) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if possible, use HTML formatting to avoid message parsing issues. escape_markdown is more fragile than escape_html.
You can also get away with removing the list comp and calling escape_{whichever}() on the joined list instead of each item.
tg_bot/modules/locks.py
Outdated
@@ -279,11 +345,14 @@ def __chat_settings__(chat_id, user_id): | |||
|
|||
__help__ = """ | |||
- /locktypes: a list of possible locktypes | |||
- /whitelisted: lists urls in this chat's whitelist |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: indent this properly (one more space!)
tg_bot/modules/sql/locks_sql.py
Outdated
def add_whitelist(chat_id, url): | ||
global CHAT_WHITELIST | ||
with WHITELIST_LOCK: | ||
url = re.search(r'(^http:\/\/|^https:\/\/|^ftp:\/\/|^)(www\.)?(\S*)', url, flags=re.I).group(3).lower() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
given this regex pattern doesnt change, compile it and use it as a global.
Also, what happens if group(3) is None? lower() will die
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also not convinced this should be here, given it isnt sql logic (and youre wasting CPU cycles given this bit doesnt need the lock yet)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only called for strings Telegram classified as a URL entity, and group(3) matches with \S* which is "any non-whitespace character zero or more times", so it really never should be None.
Will compile and move the pattern out to a global variable. I decided to have all regexp patterns in a single file so it's easy to change them all, if needed.
tg_bot/modules/sql/locks_sql.py
Outdated
whitelisted = URLWhitelist(str(chat_id), url) | ||
SESSION.add(whitelisted) | ||
SESSION.commit() | ||
chat_whitelist = CHAT_WHITELIST.setdefault(str(chat_id), {}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldnt this be indented too? since if not prev
, this is already loaded?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is more a case of "double bagging" :-) If for whatever reason the URL was in the DB but not in the dictionary, now it sure is.
At least the return True
statement I would leave unindented, so the bot's confirmation message will include the URL even if it was already previously added.
tg_bot/modules/sql/locks_sql.py
Outdated
row.chat_id = str(new_chat_id) | ||
SESSION.commit() | ||
|
||
__load_chat_whitelist() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a newline at EOF for pep8
tg_bot/modules/sql/locks_sql.py
Outdated
CHAT_WHITELIST[str(row.chat_id)].update( | ||
{row.url: re.compile(r'(^http:\/\/|^https:\/\/|^ftp:\/\/|^)(www\.)?'+re.escape(row.url)+'($|\W)', | ||
flags=re.I | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: does this bracket really need its own line?
chat = update.effective_chat # type: Optional[Chat] | ||
user = update.effective_user # type: Optional[User] | ||
message = update.effective_message # type: Optional[Message] | ||
entities = message.parse_entities([MessageEntity.URL]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inline this variable; it isn't used anywhere other than the for
loop.
entities = message.parse_entities([MessageEntity.URL]) | ||
added = [] | ||
for url in entities.values(): | ||
if sql.add_whitelist(chat.id, url): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be turned into a list comprehension.
added = [url for url in message.parse_entities([MessageEntity.URL]).values() if sql.add_whitelist(chat.id, url)]
if sql.add_whitelist(chat.id, url): | ||
added.append(url) | ||
if added: | ||
message.reply_text("Added to whitelist:\n- "+'\n- '.join(added)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
keep string characters consistent; use ""
for the join
"\n<b>Admin:</b> {}" \ | ||
"\nWhitelisted:\n<pre>- {}</pre>".format(html.escape(chat.title), | ||
mention_html(user.id, user.first_name), | ||
html.escape('\n- '.join(added))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since youre doing this join twice, make it a variable
return "<b>{}:</b>" \ | ||
"\n#WHITELIST" \ | ||
"\n<b>Admin:</b> {}" \ | ||
"\nWhitelisted:\n<pre>- {}</pre>".format(html.escape(chat.title), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dont add the -
to the code block; the code is there so you can copy paste it just by tapping.
# url. So I must add all entities that have a 'url' field separately | ||
entities = entities | set(entity.url for entity in message.entities if entity.url) | ||
#if all URLs are any of the whitelisted ones, accept the message | ||
if all( any(regexp.search(text) for regexp in sql.get_whitelist(chat.id).values()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you handle invalid regex expressions? if any are broken, this will blow up and entirely break whitelisting for that chat. I would personally not allow regex for this, as most users have very limited knowledge of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Users don't get to interact with any of this as regex. All they do is send URLs. Only text that Telegram considers a URL is added (via entities). Also, the URLs themselves are escaped with re.escape
before the regex is compiled.
|
||
|
||
PERM_LOCK = threading.RLock() | ||
RESTR_LOCK = threading.RLock() | ||
WHITELIST_LOCK = threading.RLock() | ||
CHAT_WHITELIST = {} | ||
URL_REGEXP = re.compile(r'(^http:\/\/|^https:\/\/|^ftp:\/\/|^)(www\.)?(\S*)', flags=re.I) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this logic should be a part of the module, not the sql
chat_whitelist.update( | ||
{url: re.compile(r'(^http:\/\/|^https:\/\/|^ftp:\/\/|^)(www\.)?'+re.escape(url)+'($|\W)', | ||
flags=re.I)}) | ||
return True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
session.close() before returning
|
||
|
||
def add_whitelist(chat_id, url): | ||
url = URL_REGEXP.search(url).group(3).lower() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Multiple chaining will blow up in case of None
return from the search, or the group.
|
||
|
||
def __load_chat_whitelist(): | ||
#whitelist for each group is a dict(url: compiled_regexp for url in group) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: pep8 comments have a space after the #
Also adds missing filters to lock text links.