- create a database of text from tilde.club (and other .club) home pages
- maybe manually prune them down to 'tweetable' or otherwise high-quality snippets
- set up a twitter bot to tweet quotes along with a link to the corresponding tilde.club page
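The plan above implies a small sqlite3 schema: a user table and a tweet table of sentences with a "used" flag. The table and column names here are assumptions for illustration; the actual names used by getusers.pl and crawl.py may differ.

```python
import sqlite3

# Hypothetical schema for the pipeline sketched above. The notes only
# confirm a user table, a tweet table of per-sentence rows, and a flag
# marking tweets as used; everything else here is an assumption.
SCHEMA = """
CREATE TABLE IF NOT EXISTS user (
    id       INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE       -- e.g. 'ford' for tilde.club/~ford
);
CREATE TABLE IF NOT EXISTS tweet (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES user(id),
    text    TEXT NOT NULL,              -- one sentence scraped from the page
    used    INTEGER NOT NULL DEFAULT 0  -- set to 1 once tweeted
);
"""

def init_db(path=":memory:"):
    """Open (or create) the database and ensure the tables exist."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db
```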
- getusers.pl generates SQL to insert user records into a sqlite3 db
- generate_usermap.sh selects from the db to create a map of userid to username
- local_crawl.pl takes a usermap file and processes index.html for each user, producing an INSERT statement for each sentence on the page. Sentences go into the tweet table. (DEPRECATED)
- crawl.py fetches and parses user pages and inserts sentences into the tweet table
- tweet.py does not actually tweet: it chooses a random row from the tweet table, marks it as used, and prints a constructed tweet that includes a link to the originating page
- tweet_tilde_quote is a shell script that takes the output of tweet.py and tweets it out; it is called from cron
- report.py generates a list of the top N most prolific writers
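The tweet.py step can be sketched roughly as below. The schema (tweet/user tables, a "used" column) and the `tilde.club/~user/` URL pattern are assumptions based on the notes above, not necessarily what tweet.py actually does.

```python
import sqlite3

def build_tweet(db):
    """Pick a random unused sentence, mark it used, return the tweet text.

    Assumes a hypothetical schema: tweet(id, user_id, text, used) and
    user(id, username). Returns None when every sentence has been used.
    """
    row = db.execute(
        """SELECT tweet.id, tweet.text, user.username
             FROM tweet JOIN user ON user.id = tweet.user_id
            WHERE tweet.used = 0
            ORDER BY RANDOM() LIMIT 1"""
    ).fetchone()
    if row is None:
        return None                       # nothing left to tweet
    tweet_id, text, username = row
    # Mark the row so the cron job never tweets the same sentence twice.
    db.execute("UPDATE tweet SET used = 1 WHERE id = ?", (tweet_id,))
    db.commit()
    link = "http://tilde.club/~%s/" % username
    return "%s %s" % (text, link)
```

A wrapper like tweet_tilde_quote would then pass this string to whatever posts to twitter.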
- ~~add other ~sites~~
- make crawl.py smart enough to only process recently updated pages (fetch the .json file)
- add crawl.py to cron once it is smart enough. Added it anyway.
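The "only process recently updated pages" idea could look like the sketch below: fetch the site's JSON user feed and keep only users whose pages changed since the last crawl. The feed URL and the field names (`users`, `username`, `page_updated_at`) are assumptions; check the actual .json file tilde.club serves before relying on them.

```python
import json
import urllib.request

FEED_URL = "http://tilde.club/tilde.json"  # assumed feed location

def recently_updated(feed, since_epoch):
    """Return usernames whose pages changed after since_epoch.

    `feed` is the parsed JSON, assumed to look like:
    {"users": [{"username": "ford", "page_updated_at": 1412345678}, ...]}
    """
    return [u["username"]
            for u in feed.get("users", [])
            if u.get("page_updated_at", 0) > since_epoch]

def fetch_feed(url=FEED_URL):
    """Download and parse the user feed."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)
```

crawl.py could store the timestamp of its last run and pass it as `since_epoch` on the next invocation, skipping everyone else.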