This repository has been archived by the owner on Apr 16, 2023. It is now read-only.

Have the "audit stats" command use all audits the user has ever done #13

Closed
gunr2171 opened this issue Jan 16, 2015 · 5 comments

Comments

@gunr2171
Member

I store a user's audits as they report them, which works as long as the bot is running. Eventually, I would like the bot to give the right information regardless of whether it was running at the time.

One of the big things I can do is to scrape a user's profile to get a list of all close vote review items and determine if they are audits or not.

Sam was nice enough to write me a library to get that data, but I will need to make some modifications to it. Fetching the default 10 pages makes SO block my IP for a few seconds because of the number of requests.

So this is my grand plan so far:

  1. Store all scraped CV review items in my own database. This will make lookup and querying much faster. Also, review items don't change, so I don't have to worry about updating data once it has been inserted.

  2. Make a "refresh audits" command. This will initiate the search for new review items. The chat bot will say:

    Searching for new close vote review items. This may take a while. I will reply when I have finished.

    I have added 43 new close vote review items of yours into the database. Use the command audit stats to see a breakdown of your audits.

  3. For the actual parsing process: start at the beginning and go until you hit the first review item that is already stored in the database. Because the order of items should not change on the website, it is safe to assume that every item past the first one already stored has also been stored. This will greatly shorten subsequent data refreshes.

  4. Page grabs will need to be time-delayed. I don't want to get 503'ed by SO because I'm requesting pages too often. I'm thinking of a 1 second delay between page grabs. Yes, this is going to be painfully slow on the first grab, but refreshes afterwards should take under 10 seconds or so.
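
The four steps above can be sketched roughly as follows. This is only an illustration in Python, not the bot's actual code: the table layout, the `refresh_audits` name, and the `fetch_page` callable (standing in for the scraping library) are all assumptions, and pages are assumed to list the newest review items first.

```python
import sqlite3
import time
from typing import Callable, List, Tuple

# Hypothetical shape of a scraped item: (review_id, is_audit).
ReviewItem = Tuple[int, bool]


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the local database and create the (assumed) table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS review_items ("
        "user_id INTEGER NOT NULL, "
        "review_id INTEGER NOT NULL, "
        "is_audit INTEGER NOT NULL, "
        "PRIMARY KEY (user_id, review_id))"
    )
    return conn


def refresh_audits(
    conn: sqlite3.Connection,
    user_id: int,
    fetch_page: Callable[[int, int], List[ReviewItem]],
    delay_seconds: float = 1.0,
    max_pages: int = 1000,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Store review items newer than anything already saved; return how many."""
    added = 0
    for page in range(1, max_pages + 1):
        if page > 1:
            sleep(delay_seconds)  # step 4: pause between page grabs
        items = fetch_page(user_id, page)
        if not items:
            break  # ran out of pages
        for review_id, is_audit in items:
            known = conn.execute(
                "SELECT 1 FROM review_items WHERE user_id = ? AND review_id = ?",
                (user_id, review_id),
            ).fetchone()
            if known:
                # Step 3: everything past this item is assumed to be stored.
                conn.commit()
                return added
            conn.execute(
                "INSERT INTO review_items VALUES (?, ?, ?)",
                (user_id, review_id, int(is_audit)),
            )
            added += 1
    # Commit once at the end so an interrupted first scrape leaves no partial
    # data that would make a later refresh stop too early.
    conn.commit()
    return added


def audit_count(conn: sqlite3.Connection, user_id: int) -> int:
    """Number of stored review items for the user that were audits."""
    row = conn.execute(
        "SELECT COUNT(*) FROM review_items WHERE user_id = ? AND is_audit = 1",
        (user_id,),
    ).fetchone()
    return row[0]
```

The return value of `refresh_audits` is the number the bot would report in its "I have added N new close vote review items" reply. Passing `sleep` as a parameter is only there so the delay can be skipped in tests.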

@gunr2171 gunr2171 self-assigned this Jan 16, 2015
@gunr2171 gunr2171 added this to the v1.0 milestone Jan 16, 2015
@ArcticEcho
Contributor

Just a small note,

  1. [...] Also, review items don't change, so I don't have to worry updating data once it has been inserted.

My own emphasis.

This is partially incorrect, as the review item itself may still be in the process of being reviewed (i.e., it hasn't been completed yet). So, if you ever need info about the other reviewers of a review item, you may need to wait for the review to be completed before scraping it.

@gunr2171
Member Author

True, but for now I only care if it's an audit or not.

I agree that if I need more info about other reviewers, I will have to wait until the item says "Review Complete".

@ArcticEcho
Contributor

Sure, I wasn't aware what data you needed, so I just wanted to bring that to your attention.

@gunr2171 gunr2171 modified the milestones: v1.1, v1.0 Jan 23, 2015
@Tyler-H

Tyler-H commented Nov 5, 2015

We should make sure that we use String.ToLower or something similar on all the tags so that we don't end up with duplicates from users who capitalize tags when manually reporting audits.
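
A minimal sketch of that normalization, written in Python for illustration rather than with the C# `String.ToLower` mentioned above. The function names and the idea of counting audits per tag are assumptions, not the bot's actual code.

```python
from collections import Counter
from typing import Iterable


def normalize_tag(tag: str) -> str:
    """Trim whitespace and lowercase a tag so differently cased reports match."""
    return tag.strip().lower()


def count_audits_by_tag(reported_tags: Iterable[str]) -> Counter:
    """Count reported audits per tag, treating 'Java' and 'java' as the same tag."""
    return Counter(normalize_tag(tag) for tag in reported_tags)
```

Normalizing at the point where a report is stored, rather than only when stats are displayed, would keep the stored data free of case-only duplicates.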

ArcticEcho added a commit that referenced this issue Nov 5, 2015
@gunr2171
Member Author

We can't view close vote reviews by other users, so there goes that idea.
