-
Notifications
You must be signed in to change notification settings - Fork 532
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add management command for sync suppressed emails #21784
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
from django.core.management.base import BaseCommand | ||
|
||
from olympia.users.tasks import sync_blocked_emails | ||
|
||
|
||
class Command(BaseCommand): | ||
"""Sync Socket labs suppression list to database.""" | ||
|
||
def handle(self, *args, **options): | ||
sync_blocked_emails.apply() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think using apply might be the right thing here. Then the task executes immediately, and logs are printed to the console where the management command is executed. Happy to hear feedback on that but it felt right when executing locally. |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -105,6 +105,8 @@ def assert_socket_labs_settings_defined(): | |
def sync_blocked_emails(batch_size=BATCH_SIZE, **kw): | ||
assert_socket_labs_settings_defined() | ||
|
||
task_log.info('fetching suppression list from socket labs...') | ||
|
||
path = f'servers/{settings.SOCKET_LABS_SERVER_ID}/suppressions/download' | ||
params = {'sortField': 'suppressionLastUpdate', 'sortDirection': 'dsc'} | ||
url = ( | ||
|
@@ -120,6 +122,9 @@ def sync_blocked_emails(batch_size=BATCH_SIZE, **kw): | |
# Raise exception if not 200 like response | ||
response.raise_for_status() | ||
|
||
size_in_mb = len(response.content) / (1024 * 1024) | ||
task_log.info(f'Downloaded suppression list of {size_in_mb:.2f} MB.') | ||
|
||
with tempfile.NamedTemporaryFile( | ||
dir=settings.TMP_PATH, delete=not settings.DEBUG, mode='w+b' | ||
) as csv_file: | ||
|
@@ -131,6 +136,8 @@ def sync_blocked_emails(batch_size=BATCH_SIZE, **kw): | |
|
||
next(csv_suppression_list) | ||
|
||
count = 0 | ||
|
||
while True: | ||
batch = list(itertools.islice(csv_suppression_list, batch_size)) | ||
|
||
|
@@ -139,6 +146,9 @@ def sync_blocked_emails(batch_size=BATCH_SIZE, **kw): | |
|
||
email_blocks = [SuppressedEmail(email=record[3]) for record in batch] | ||
SuppressedEmail.objects.bulk_create(email_blocks, ignore_conflicts=True) | ||
count += len(batch) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. maybe we want some logging every batch to show progress? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think it's necessary, at least not yet. Only added the logging here because the download part can take some time to run and without any logs it can be confusing and made me wonder if something was wrong with the code. I added the count at the end to just verify that we have synced "something". The syncing is really fast due to the batching so it wouldn't add much value IMO. |
||
|
||
task_log.info(f'synced {count:,} suppressed emails.') | ||
|
||
|
||
@task(autoretry_for=(HTTPError, Timeout), max_retries=5, retry_backoff=True) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue might explain further, but if this is something that is expected to be run as a cron, with no args or options, it would probably be better implemented as a simple function in cron.py instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is a simple convenience for testing and if we ever want to manually resync the suppression list. We should run this task on a periodic schedule. The issue will likely not explain much unfortunately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can do that with a simple function in cron.py too 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can I trigger a function in cron in both a management command and periodically? Just curious? Also would there be retry logic if implemented that way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in
crontab.tpl
we execute both management commands directly, and simple functions via thecron
management command.There is no retry logic in our cron setup (other that it'd be executed again the time it was scheduled). The cron.py function vs. a separate management command .py file is just about the initial execution - in both cases you'd have to either write some custom retry logic, or wrap in a task and set a retry on the task.