Save deleted tweets from Twitter accounts
CSS JavaScript HTML
Permalink
Failed to load latest commit information.
data Remove old files and update empty to run from anywhere Dec 16, 2016
models Fix corrupt snapshot from overwriting current snapshot Jan 16, 2017
public Improve arrangement of text Dec 18, 2016
routes Fix update time for stream and added analytics to tweet view pages Jan 3, 2017
src/js Add refetcher and unchecker to fetchers and clean up initial fetcher/… Jan 15, 2017
views Fix typo Jan 15, 2017
.gitignore Add .viewsMin to gitignore Dec 12, 2016
README.md Update README Dec 19, 2016
app.js Add refetcher and unchecker to fetchers and clean up initial fetcher/… Jan 15, 2017
benchmark.js Combined redundant tweet checker into one function Dec 20, 2016
benchmark.json Add benchmark and implemented queue system for checker Nov 21, 2016
check.js Fix percent complete calculation Jan 19, 2017
deadbird.sql Converted fetcher/getTemplate to use queues for failed attempts Jan 11, 2017
events.js Fetch replies as well Jan 7, 2017
fetch.js Fix percent complete calculation Jan 19, 2017
getTemplate.js Add fails to fetch and getTemplate Jan 15, 2017
gulpfile.js Clean up fs organization Dec 3, 2016
helpers.js Halved progress dots for cachers Jan 15, 2017
package.json Add refetcher and unchecker to fetchers and clean up initial fetcher/… Jan 15, 2017
refetcher.js Convert refetcher to queue Jan 15, 2017
server.js Implemented events and start restructuring code base Dec 20, 2016
settings.json Add new settings Jan 15, 2017
socket.js Clean up codebase in progress Dec 22, 2016
uncheck.js Fix rare false positive for uncheck Jan 16, 2017
utils.js Organized routes Dec 23, 2016

README.md

Deadbird

Save deleted tweets of Twitter users.

Setup

  1. Create a MySQL user with read/write permissions and update settings.json with your db login information (Assuming your user, pass, and db is deadbird from here on)
  2. Create a new database called deadbird.
  3. Import deadbird.sql into your new db by running this from the command line mysql -u deadbird -pdeadbird -D deadbird < deadbird.sql.
  4. Run npm install to fetch Deadbird's dependencies.
  5. Run gulp to minify ejs templates and transpile js.
  6. (optional) If your fetchers keeps timing out even with low values, you might want to adjust your ulimit nofile settings. You can read more about how to set this up (for Ubuntu at least) here.

Running

Start up the Deadbird server by running npm start. This'll start the web server on port 3000 and the socket.io server on port 8080. All of the fetchers will be running in the background at the intervals specified in settings.json.

Available fetchers and helper scripts

Fetcher
Fetchers the last ~20 tweets from a user's timeline and inserts new tweets into into the database.
  • Runs with a delay of 0 by default (fetcherRestInterval).
  • Delay should be 0 or very low to ensure that as many tweets can be fetched as possible.
  • Depending on how much traffic your internet connection can handle, you should limit the number of users you are tracking. If you track too many users, it'll take longer for the fetcher to check everyone and it might be possible that someone could have tweeted something and instantly deleted it before being detected by the fetcher.
Checker
Checks tweets from within the last 7 days. Sees if we get a 200 status code to determine if the tweet still exists.
  • Runs with a delay of 900 by default (checkerRestInterval).
  • It isn't necessary for the checker to run continuously since a tweet can be checked for deletion at any time, however, the deletion date will be inaccurate.
  • Checks tweets that were made within the last 7 days. If you want to check tweets older than 7 days, you can run the checker manually by providing an argument to the checker with the number of days back you want to check. node checker 1000 - check tweets that are up to a 1,000 days old.
GetTemplate
Updates the user profile pictures and profile pages containing tweet/follower count, bio, header image, etc.
  • Runs daily by default (templateRestInterval).
  • Not necessary to run as often unless you want to keep up to date profile pics and twitter page bio/stats.
Unchecker
Check if deleted tweets are actually still deleted.
  • Very rarely, tweets may be deleted by the checker on accident.
  • The unchecker will compress tweets that are actually live still and mark them as undeleted as well as update deleted tweet counts for affected users.
Benchmark
Determine optimal settings depending on your internet connection
  • Benchmark will increase the number of open concurrent connections as long as the rate increases and errors occur less than 5% of the time.
  • Once the 5% error rate is surpassed or the avg tweets retrieved per second doesn't increase significantly, it'll back off 2 steps to be safe.
  • Next, the timeout time will be slowly lowered until the error rate surpasses 5% again and it'll back off 2 steps as well.
  • By the end of the benchmark, the settings.json will be updated with the best values that the benchmark program was able to determine.
  • Benchmark is a slow program and may take up to half an hour to complete. You can also experiment with values yourself to determine what works for you. You can also start the programming with some initial values if you want to speed up the process and are confident with your supplied values. node benchmark <speed> <timeout> <size> (default values: node benchmark 10 3000 2000)

Settings

general
    port:                 Webserver port number
    socket:               Socket.io port number
    basehref:             Sets the base tag for absolute urls. Some links will attempt to go to twitter.com
                          unless this is updated to match your path settings.
    rate:                 Number of concurrent connections while checking/unchecking tweets
    timeout:              How long until a connection attempt times out while looking up tweets
    maxSockets:           Playing around with this number can improve performance significantly. If you are
                          getting a lot of timeout errors increase timeout and consider adjusting this value.
    retrieversEnabled:    Enable/disable fetchers
    fetcherRestInterval:  Delay until fetcher loop runs again
    checkerRestInterval:  Delay until checker loop runs again
    templateRestInterval: Delay until template loop runs again
    acceptingNewUsers:    Enable/disable new user page
    maxUsers:             Maximum number of users that can be tracked. New user page will be disabled when this
                          value is exceeded.

db
    host: MySQL host
    socketPath:        If you are using UNIX sockets, this value might be required
    username:          DB username
    password:          DB password
    database:          DB name
    keepAliveInterval: Keep DB connection open by performing basic query every x milliseconds

session
    key:    Name of session
    secret: Session secret. This value should be changed.