Indexing: Automate site expiry #3

Closed
m-i-l opened this issue Nov 29, 2020 · 8 comments
Labels
bug Something isn't working

Comments


m-i-l commented Nov 29, 2020

All indexed sites, whether submitted via Quick Add or Verified Add, have an expire_date field. This is initially set to 1 year after the validation_date (if validation_method is IndieAuth or DCV) or 1 year after the site is approved (if validation_method is QuickAdd). The idea is that sites are only indexed for a year unless further action is taken, to stop too many stale and unmaintained sites building up in the system, which would waste indexing resources and pollute results.
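
The rule for the initial expire_date could be sketched as follows. This is only an illustration, not the project's code: the function names `add_one_year` and `initial_expire_date` and their parameters are hypothetical, and "1 year" is assumed to mean the same calendar date in the following year.

```python
from datetime import datetime
from typing import Optional

# Validation methods that count as Verified Add (taken from the description above).
VERIFIED_METHODS = {"IndieAuth", "DCV"}


def add_one_year(d: datetime) -> datetime:
    """Return the same calendar date one year later (29 Feb maps to 28 Feb)."""
    try:
        return d.replace(year=d.year + 1)
    except ValueError:
        # Only reached for 29 Feb when the following year is not a leap year.
        return d.replace(year=d.year + 1, day=28)


def initial_expire_date(validation_method: str,
                        validation_date: Optional[datetime],
                        approval_date: Optional[datetime]) -> datetime:
    """Hypothetical helper: pick the start date by validation method, then add a year."""
    if validation_method in VERIFIED_METHODS:
        start = validation_date
    elif validation_method == "QuickAdd":
        start = approval_date
    else:
        raise ValueError(f"unknown validation_method: {validation_method!r}")
    if start is None:
        raise ValueError("no start date available for this validation method")
    return add_one_year(start)
```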

The issue is that there's no automated code that does anything with the expire_date at the moment.

For sites submitted via Quick Add, the site should go back to the Review phase (i.e. move from tblIndexedDomains to tblPendingDomains), where the moderator(s) either reapprove for another year, or reject to move to the excluded sites.

Need to confirm the exact process for sites submitted via Verified Add. They could simply be moved back to tblPendingDomains as per Quick Add, preserving fields such as owner_submitted and submission_method, with the owner having to enter the home page in the Verified Add and click the final Verify button again. However, there may need to be some additional activity, e.g. an email reminder a week before.

As for where this code could be implemented, there is already some maintenance-related code in src/indexer/search_my_site_scheduler.py, which is run every 2 mins, although it might make more sense to pull all the maintenance-related code into a new script which doesn't need to be run so frequently.

Something needs to be implemented before July 2021 when the first expire_dates arrive.

@m-i-l m-i-l changed the title Automate site expiry Indexing: Automate site expiry Dec 4, 2020
@m-i-l m-i-l added the bug Something isn't working label Dec 4, 2020
m-i-l added a commit that referenced this issue Oct 16, 2021
…blPendingDomains and tblExcludeDomains into tblDomains (again), to simplify changes such as #3 Automate site expiry. Also updated schema for #20 and #11

m-i-l commented Nov 21, 2021

I should have implemented this before July. There are now 605 expired quick add sites and 66 expired verified add sites. I'll manually expire for now to catch up, and then implement automated expire once caught up. If I do 20 a day I should catch up within a month. This is one of the flaws in the searchmysite.net approach - there are a lot of sites to manually review on an annual basis. I guess I was hoping there would be more of a community around it by now and so others who could help with moderation.

Anyway, following the (re-)simplification of the database schema in release v1.0.6, expiry is achieved in the following way:

  • Quick Add sites: set moderator_approved = NULL, and delete from search index. This will put it back in the review queue. If review is accepted it will be reindexed and have the expire_date set to a year ahead, and if rejected it will be marked moderator_approved = FALSE.
  • Verified Add sites: set owner_verified = FALSE, remove the verified benefits, i.e. set api_enabled = FALSE, indexing_frequency = '28 days', indexing_page_limit = 50, set the expire_date to be now + 1 year, and email the owner. If they re-verify it will reset these values, and if they don't it will go through the Quick Add expiry workflow in a year.
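
The two expiry paths above can be sketched as plain functions over a site record. This is a rough sketch under assumptions, not the released code: the function names and the use of a dict are hypothetical, "now + 1 year" is approximated as 365 days, and the side effects (deleting from the search index, emailing the owner) are deliberately left out.

```python
from datetime import datetime, timedelta


def expire_quick_add(site: dict) -> dict:
    """Return a copy of the record put back in the review queue.

    Removing the site from the search index is a separate step not shown here.
    """
    updated = dict(site)
    updated["moderator_approved"] = None  # NULL means awaiting moderator review
    return updated


def expire_verified_add(site: dict, now: datetime) -> dict:
    """Return a copy of the record with the verified benefits removed.

    Emailing the owner is a separate step not shown here.
    """
    updated = dict(site)
    updated.update(
        owner_verified=False,
        api_enabled=False,
        indexing_frequency="28 days",
        indexing_page_limit=50,
        expire_date=now + timedelta(days=365),  # assumption: 1 year taken as 365 days
    )
    return updated
```

Returning a new dict rather than mutating in place keeps each transition easy to check before anything is written back to the database.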

m-i-l added a commit that referenced this issue Dec 11, 2021

m-i-l commented Dec 11, 2021

Okay, I've spent the past 3 weeks manually expiring and reviewing all 605 expired unverified (quick add) sites. Not sure it was the best use of so much time, and this may be a flaw in the curated approach, although perhaps by this time next year (when all these sites will need reviewing again) there will be more than one moderator.

I've also added code to auto-expire unverified sites.

I haven't added code to auto-expire verified sites yet, because I'm cautious about code which sends emails to users, and want to manually send the first batch out. There were 62 expired verified sites, and I've begun manually expiring and sending the following email:

To: <contact_email>
Subject: searchmysite.net verified add expiry
Dear owner,

Over a year ago you verified ownership of your site with https://searchmysite.net/ to gain access to its search as a service features. Thank you for doing so, and I hope you have found it useful in that time.

However, your annual verification has now expired. Your site is still listed, but it now no longer has the search as a service features such as increased indexing page limit and access to the API. If you would like to renew for another year, please go to https://searchmysite.net/admin/add/ and select "Verified Add (IndieAuth)" or "Verified Add (DCV)". If you select the verification method you used originally, you will not need to repeat the initial Verified Add steps.

If you have any questions or comments, please don't hesitate to contact me personally.

Regards,

Michael Lewis


m-i-l commented Jan 8, 2022

I've now manually expired all the expired Verified Add sites and manually emailed all those site owners. I could now fully automate this moving forward, but I have a bit of a fear of automated email systems going wrong and spamming users (long story), so (given there aren't that many verified sites left) I'm just going to continue periodically manually doing this. SQL for finding and expiring sites is as per below, and email template is as previous comment. Marking as closed.

-- Find expired verified sites:
SELECT domain, contact_email, home_page, expire_date
FROM tblDomains
WHERE expire_date < now()
AND validation_method IN ('IndieAuth', 'DCV')
AND owner_verified = TRUE
AND indexing_enabled = TRUE
AND api_enabled = TRUE
AND indexing_type = 'spider/default'
ORDER BY expire_date ASC;
-- And to expire a domain:
UPDATE tblDomains
SET owner_verified = FALSE,
    api_enabled = FALSE,
    indexing_frequency = '28 days',
    indexing_page_limit = 50,
    expire_date = now() + INTERVAL '1 year',
    indexing_current_status = 'PENDING'
WHERE domain = (%s);
-- Plus send email as template above
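
For anyone wanting to script the manual steps above, the two statements could be wrapped roughly as follows. This is a sketch under assumptions rather than anything from the repository: the function name `expire_verified_sites` is hypothetical, and the cursor is assumed to be a Python DB-API cursor that accepts `%s` placeholders (as psycopg2 does). Sending the email is left as a manual step, as described above.

```python
FIND_EXPIRED_SQL = """
SELECT domain, contact_email, home_page, expire_date
FROM tblDomains
WHERE expire_date < now()
AND validation_method IN ('IndieAuth', 'DCV')
AND owner_verified = TRUE
AND indexing_enabled = TRUE
AND api_enabled = TRUE
AND indexing_type = 'spider/default'
ORDER BY expire_date ASC;
"""

EXPIRE_SQL = """
UPDATE tblDomains
SET owner_verified = FALSE,
    api_enabled = FALSE,
    indexing_frequency = '28 days',
    indexing_page_limit = 50,
    expire_date = now() + INTERVAL '1 year',
    indexing_current_status = 'PENDING'
WHERE domain = %s;
"""


def expire_verified_sites(cursor):
    """Expire every overdue verified site; return (domain, contact_email) pairs to email.

    `cursor` is assumed to be a DB-API cursor; committing is left to the caller.
    """
    cursor.execute(FIND_EXPIRED_SQL)
    rows = cursor.fetchall()  # read all rows first, since the cursor is reused below
    to_email = []
    for domain, contact_email, _home_page, _expire_date in rows:
        cursor.execute(EXPIRE_SQL, (domain,))  # parameterised, never string-formatted
        to_email.append((domain, contact_email))
    return to_email
```

The returned list can then be worked through by hand with the email template, which keeps the email step under manual control.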

@m-i-l m-i-l closed this as completed Jan 8, 2022

m-i-l commented Aug 21, 2022

Reopening this.

I'm looking at a new database schema to reduce the chance of unknown states being reached, e.g. #67 "A site submitted via Quick Add but awaiting approval, then submitted again via Verified Add, won't be indexed until moderator approval", and also to support #65 "Search as a service: Free trial mode", so that would be a good opportunity to fully automate expiry of paid listings.

BTW, there are now 5 paid listings which should have been expired which haven't been yet.


m-i-l commented Sep 10, 2022

The 5 paid listings which should have expired have now been extended by a year, so they don't cause an issue when I roll out the new schema, which will better support an automated reminder and expiry system along with an easy way to renew. Still haven't decided whether to let the site owners know.


m-i-l commented Sep 18, 2022

New schema and changes rolled out. For now I'm sending the email to the site admin address and will forward accordingly, but once I'm confident it is not going to accidentally spam people I'll update it to send directly to the users. The text for the email is:

Dear {email},

Thank you for subscribing {domain} to searchmysite.net. I hope you have found it useful.

Unfortunately, your subscription has now expired, and your Full listing has reverted to a Free Trial listing.
This means that you can still log on, and still use the API, for the time being.
If you would like to continue using the search as a service, you will need to resubscribe.
While the Free Trial listing is active you can do this by simply going to https://searchmysite.net/admin/manage/subscriptions/ and selecting Purchase.
Once the Free Trial expires, you will need to renew via Add Site (although you will not need to verify ownership of your site again).

If you have any questions or comments, please don't hesitate to reply.

Regards,

searchmysite.net
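
Filling in the `{email}` and `{domain}` placeholders of the text above could look something like this. It is a minimal sketch: the function name `render_expiry_email` is hypothetical, and using Python's `str.format` is an assumption about how the placeholders are substituted.

```python
EXPIRY_EMAIL_TEMPLATE = """Dear {email},

Thank you for subscribing {domain} to searchmysite.net. I hope you have found it useful.

Unfortunately, your subscription has now expired, and your Full listing has reverted to a Free Trial listing.
This means that you can still log on, and still use the API, for the time being.
If you would like to continue using the search as a service, you will need to resubscribe.
While the Free Trial listing is active you can do this by simply going to https://searchmysite.net/admin/manage/subscriptions/ and selecting Purchase.
Once the Free Trial expires, you will need to renew via Add Site (although you will not need to verify ownership of your site again).

If you have any questions or comments, please don't hesitate to reply.

Regards,

searchmysite.net
"""


def render_expiry_email(email: str, domain: str) -> str:
    """Hypothetical helper: substitute the two placeholders in the template."""
    return EXPIRY_EMAIL_TEMPLATE.format(email=email, domain=domain)
```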

@m-i-l m-i-l closed this as completed Sep 18, 2022

m-i-l commented Nov 13, 2022

The first of these was sent to admin today, so I forwarded to the user. Looked promising.

m-i-l added a commit that referenced this issue May 7, 2024

m-i-l commented May 7, 2024

Updated so site expiry emails now get sent directly to users. Note that admin will still receive copies for visibility.
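
One way to address a message to the user while keeping admin copied in is sketched below with the standard library's `email.message.EmailMessage`. How the actual commit does this is not shown in this thread, so the use of Bcc, the function name `build_expiry_message`, the subject line, and all addresses are assumptions for illustration only.

```python
from email.message import EmailMessage


def build_expiry_message(user_email: str, admin_email: str,
                         sender: str, body: str) -> EmailMessage:
    """Hypothetical helper: build (but do not send) an expiry email."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = user_email     # the site owner now receives the email directly
    msg["Bcc"] = admin_email   # assumption: admin is copied via Bcc for visibility
    msg["Subject"] = "searchmysite.net subscription expiry"  # placeholder subject
    msg.set_content(body)
    return msg
```

Building the message separately from sending it means the recipients can be inspected or logged before anything goes out, which fits the caution about automated email expressed earlier in this thread.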
