Add "re-check all sites in list" button for admin use #377

Closed
alexhaydock opened this issue Mar 8, 2019 · 7 comments
@alexhaydock
Member

On the Blocked frontend, a user can currently check whether information about a block in a list is up-to-date with the following process:

  • Open URL
  • Select "request unblock"
  • Select "re-check site"

It would be particularly useful for the purposes of this report if admins were able to re-check all sites within a list using a single button.

I understand that flooding the system with re-checks wouldn't be good, but the largest list we have I think is the Weddings list, with roughly 4500 sites on it.

Each list would only really need the re-check all button to be used once before the report's publication just to make sure the data is up-to-date and accurate.

@dantheta

dantheta commented Mar 8, 2019 via email

@alexhaydock
Member Author

Oh that's useful info, thanks!

If those processes are working, that might be sufficient. It was the possible bug in #376 that got me thinking about a "re-check all" button being useful.

It may still be useful if it's not a large amount of work, so I can be certain that the total sites blocked/unblocked in lists are accurate before the report gets published. But if it would be a lot of work, then this weekly checking should suffice.

@alexhaydock
Member Author

To follow up on this, if we're already re-checking sites on frontpage lists weekly, does it make sense to automatically remove sites from the list if we get NXDOMAIN back from multiple ISPs?

This relates to openrightsgroup/cmp-issues#225

@edjw

edjw commented Mar 8, 2019

@dantheta

  1. When a reported URL is detected as unblocked, are we recording the time that a site's entry is updated in the database to reflect it as having been detected as unblocked?

  2. Where you say "(unless there are too many of them now)", what does that mean? Say the limit is 100 sites. What happens when there are more than 100 sites to re-check? Do they just stop getting checked? Or do all sites stop getting checked when there are more than 100 sites awaiting replies to unblock requests?

Also, for how long has the system done these things above?

@dantheta

Ed -

  1. Yes - it goes into the last_updated column on the isp_reports table.

  2. The system does 100 of the daily checks every 15 minutes, so they will start to get processed at >1 day intervals when we have (4 × 24 × 100 =) 9600 reports in the system. We're currently at about 4500, so no problem there.
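The capacity arithmetic in point 2 can be sketched as follows. This is only an illustration of the numbers quoted above (100 checks per run, one run every 15 minutes); the function and constant names are hypothetical and are not taken from the Blocked codebase.

```python
from datetime import timedelta

# Figures quoted in the comment above; the names are illustrative assumptions.
BATCH_SIZE = 100                      # daily checks processed per run
RUN_INTERVAL = timedelta(minutes=15)  # one run every 15 minutes


def daily_capacity(batch_size=BATCH_SIZE, run_interval=RUN_INTERVAL):
    """Number of reports that can be re-checked in one day."""
    runs_per_day = timedelta(days=1) // run_interval  # 96 runs for 15 minutes
    return runs_per_day * batch_size


def recheck_interval_days(report_count, capacity=None):
    """Approximate days between re-checks of the same report.

    Stays at one day until the report count exceeds the daily capacity,
    then grows in proportion to the backlog.
    """
    if capacity is None:
        capacity = daily_capacity()
    return max(1.0, report_count / capacity)


print(daily_capacity())             # 4 runs/hour * 24 hours * 100 checks
print(recheck_interval_days(4500))  # current load, still within one day
```

With roughly 4500 reports the interval stays at one day; it would only stretch beyond that once the count passes 9600.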

There was a problem with the daily checks following the disk space crunch last year - this will affect the resolution of the unblocked timestamps on isp_reports, as the checks were unable to run for about 50 days - this has been fixed now. The unblock time stats on the public site may be inflated by this, but the stats on the blocking report will be unaffected as they are driven by the ISP email replies.

The daily checks and process reports were written into the system during late 2017 (August or October).

Alex -

That's a good point about the nxdomain. The ISP report unblock status was the first system function that fed results back into the system (as opposed to just storing and displaying them). That's something we can do more of. The system already knows which lines are unfiltered and trustworthy, so an nxdomain result on one/all of those could be used to make other changes.
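As a rough sketch of what the "one/all" decision could look like, assuming results arrive as a mapping from line name to a status string: the data shape, the `"nxdomain"` status value, the line names and the function name are all hypothetical, and this is not how the system currently behaves.

```python
def is_probably_dead(results, trusted_lines, require_all=True):
    """Decide whether a domain looks dead based on trusted lines only.

    results       -- dict mapping line name -> status string (assumed shape)
    trusted_lines -- names of the unfiltered, trustworthy lines
    require_all   -- True: every trusted line must report nxdomain;
                     False: a single trusted nxdomain is enough
    """
    trusted = [results[line] for line in trusted_lines if line in results]
    if not trusted:
        # No trusted evidence at all, so make no change.
        return False
    hits = [status == "nxdomain" for status in trusted]
    return all(hits) if require_all else any(hits)


# Hypothetical example: two trusted lines, one filtered ISP line.
results = {"trusted-a": "nxdomain", "trusted-b": "ok", "isp-x": "blocked"}
print(is_probably_dead(results, ["trusted-a", "trusted-b"]))
print(is_probably_dead(results, ["trusted-a", "trusted-b"], require_all=False))
```

Requiring agreement from all trusted lines is the more conservative choice, since a single resolver failure would not then be enough to change a list.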

dantheta added a commit to dantheta/blocked-frontend-py that referenced this issue Mar 10, 2019
@dantheta

There's a basic version of this on the live site now. Submitting 40 sites takes about 10 seconds. It's possible to do a much more polished version in the future, perhaps a version that integrates with the test scheduler.

The recheck site button is on the public site lists page, and is only visible to admins.
If the page times out, it's possible to press the button again and the next set of eligible items on the page will be submitted. List items can only be checked once in a 24-hour period.
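The press-again behaviour can be illustrated with a minimal sketch. It assumes an in-memory record of when each URL was last submitted and a per-press batch limit; the names, the limit and the storage are hypothetical, and the real code in blocked-frontend-py may work quite differently.

```python
from datetime import datetime, timedelta

COOLDOWN = timedelta(hours=24)  # items can only be checked once per 24 hours


def submit_batch(last_checked, urls, now, limit=40):
    """Submit up to `limit` eligible URLs and record the submission time.

    last_checked -- dict mapping url -> datetime of its last submission
    urls         -- list items on the page, in display order
    limit        -- assumed batch size per button press (illustrative only)
    """
    batch = [
        url for url in urls
        if url not in last_checked or now - last_checked[url] >= COOLDOWN
    ][:limit]
    for url in batch:
        last_checked[url] = now
    return batch


# Pressing the button repeatedly works through the list in order.
urls = [f"site{i}.example" for i in range(5)]
seen = {}
now = datetime(2019, 3, 10, 12, 0)
print(submit_batch(seen, urls, now, limit=2))  # first two items
print(submit_batch(seen, urls, now, limit=2))  # next two items
```

Because each submitted item is stamped immediately, a second press after a timeout skips what was already sent rather than resubmitting it.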

The nxdomain behaviour should be spun off to a separate issue.

@alexhaydock
Member Author

Thanks for this, Dan. I suggest we use openrightsgroup/cmp-issues#225 for discussing the NXDOMAIN issue. I mainly raised it because, among some of the lists I've been testing, as many as a fifth of the sites are returning NXDOMAIN responses according to Chrome. I think a lot of small sites or businesses simply get forgotten about and the registrations lapse. I was just thinking it might not be great for the user experience of those new to Blocked for us to have lists filled with domains that no longer exist.
