New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Task to remove orphan symlinks #3543

Merged
merged 7 commits into from Mar 23, 2018

Conversation

Projects
None yet
3 participants
@humitos
Member

humitos commented Jan 23, 2018

This is an initial implementation of this task to remove orpahn symlinks, which should be ran periodically every 1h aprox.

In the ticket, I also suggested that the Edit action doesn't make sense to me and should be reconsidered.

Finally, if we consider to remove the Edit action, this task should be run just once and should be enough.

Closes #3493

@humitos humitos requested a review from ericholscher Jan 23, 2018

for domain_path in [symlink.PROJECT_CNAME_ROOT, symlink.CNAME_ROOT]:
for domain in os.listdir(domain_path):
try:
Domain.objects.get(domain=domain)

This comment has been minimized.

@humitos

humitos Jan 23, 2018

Member

A different approach on the implementation, making just one query to database could be like this:

            valid_cnames = set(Domain.objects.all().values_list('domain', flat=True))
            orphan_cnames = set(os.listdir(domain_path)) - valid_cnames
            for cname in orphan_cnames:
                os.unlink(orphan_domain_path)

This comment has been minimized.

@agjohnson

agjohnson Mar 9, 2018

Contributor

I like this approach, it saves a lot of queries.

@ericholscher

This comment has been minimized.

Member

ericholscher commented Jan 25, 2018

The issue link here is wrong. Would be good to know more about the background here. I imagine there are places we should fixing this logic in the code too.

@humitos

This comment has been minimized.

Member

humitos commented Jan 26, 2018

@ericholscher I fixed the issue link in the description. There are some context in the issue.

I imagine there are places we should fixing this logic in the code too.

I'm not very familiarized about how symlinks work, but this PR is half of the solution that it's described in the issue.

@agjohnson

Tests look good! I do prefer the second query you posted, so I'd say 👍 on moving to that query.

for domain_path in [symlink.PROJECT_CNAME_ROOT, symlink.CNAME_ROOT]:
for domain in os.listdir(domain_path):
try:
Domain.objects.get(domain=domain)

This comment has been minimized.

@agjohnson

agjohnson Mar 9, 2018

Contributor

I like this approach, it saves a lot of queries.

@humitos

This comment has been minimized.

Member

humitos commented Mar 9, 2018

@agjohnson I changed the query and did a quick test locally: all the cnames that comes with the test data (djangokong.com for example) were removed but temporal.humitos.ar (the one that I have been using for .com) was kept there.

[09/Mar/2018 08:15:27] readthedocs.projects.tasks:795[6934]: INFO Unlinking orphan CNAME: /home/humitos/rtfd/code/readthedocs.org/private_cname_project/djangokong.com

Please, take a new a look at this and after merged I will create a new PR under -ops to configure the periodic task.

@agjohnson

This comment has been minimized.

Contributor

agjohnson commented Mar 9, 2018

Any reason to not set the task up as a scheduled task in the application? I think we can keep this out of ops.

@humitos

This comment has been minimized.

Member

humitos commented Mar 9, 2018

No strong reason, really. I though we were setup them only in the ops repo as I think we don't have any other scheduled task in this repository. If we have, can you point me where they are? Otherwise, can you give an example (from the web) where/how I should configure this?

@agjohnson

This comment has been minimized.

Contributor

agjohnson commented Mar 9, 2018

We have scheduled tasks set up in our corporate repo. The ops tasks are just for tasks that only make sense in production. This is a task that would be useful for a development instance.

@humitos

This comment has been minimized.

Member

humitos commented Mar 13, 2018

I define the task here under CELERYBEAT setting and opened a PR in the ops repo to extend the basic definition from here.

This should be ready to be merged.

@agjohnson

Whoops! One thing i didn't catch here was that this isn't calling broadcast to all webs. Celery beat only runs one instance per cluster, not on each instance. The task will need to call broadcast instead.

The changes look okay at this point, but the call logic and tests will change with this.

@humitos

This comment has been minimized.

Member

humitos commented Mar 22, 2018

I just pushed the changes requested.

I didn't find any example for celery beat and broadcast. So, I created a new task just to broadcast the one that I already had.

The broadcast function could be a decorator (maybe) in the future if we need to do this multiple times.

@agjohnson

This comment has been minimized.

Contributor

agjohnson commented Mar 23, 2018

I resolved conflicts here, going to wait on tests to merge though

@agjohnson

This comment has been minimized.

Contributor

agjohnson commented Mar 23, 2018

Same issue with linting. Merging!

@agjohnson agjohnson merged commit da56a7d into master Mar 23, 2018

1 check failed

continuous-integration/travis-ci/pr The Travis CI build failed
Details

@agjohnson agjohnson deleted the humitos/symlink/remove-orphans branch Mar 23, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment