Implement checks to prevent retweeting massive amounts of other bots.
Currently the bot retweets thousands of tweets within a few hours. Roughly 95% or more of these tweets are not legitimate contests; they come from other Twitter bots that lack checks to avoid retweeting useless content. The checks I have implemented should remove almost all of the fake contests from what gets retweeted.
 - New minimum retweets needed setting (`MIN_RETWEETS_NEEDED`). This prevents the system from retweeting anything with fewer retweets than specified (default 10). Bots retweeting other bots almost never have more than 10 retweets, so this filters out fake material well. Legitimate contests tend to have tens, or often hundreds, of retweets.
 - Max tweets on a user setting (`MAX_USER_TWEETS`). This is slightly overkill, but if someone has 20k+ tweets, they're USUALLY a bot. This feature can be disabled if you do not want it. By default, however, the system ignores accounts with 20k+ tweets because they are almost always another bot going out of control retweeting other bots.
 - Auto block users with too many tweets (`MAX_USER_TWEETS_BLOCK`, off by default). If the setting mentioned above is enabled, the system can also block those users. Sadly, Twitter does not actually make a blocked user's tweets disappear from search (if it did, that would be AMAZING), so blocking only adds a further check against accidentally retweeting them.
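The checks listed above can be sketched as one flat predicate. This is a minimal illustration rather than the committed code: `classifyTweet`, the `settings` object and the `state` object are hypothetical names, and the tweet fields used (`retweeted_status`, `retweet_count`, `user.statuses_count`) are the ones the diff itself reads. One assumption differs from the committed nested `if`: there, `MAX_USER_TWEETS = 0` makes the `MAX_USER_TWEETS && ...` test false, so no tweet is saved at all; the sketch treats 0 as "no limit", which appears to be the intent of the "0 (disables)" comment.

```javascript
// Hypothetical settings object mirroring the constants added in index.js.
var settings = {
    MIN_RETWEETS_NEEDED: 10,
    MAX_USER_TWEETS: 20000,      // assumption: 0 means "no limit"
    MAX_USER_TWEETS_BLOCK: false
};

// Returns "accept", "skip" or "block" for a single search result.
// `state` is a hypothetical holder for the blockedUsers and badTweetIds arrays.
function classifyTweet(tweet, state, settings) {
    // Already a retweet of something else
    if (tweet.retweeted_status) return "skip";

    // Previously failed to retweet, so it was blacklisted
    if (state.badTweetIds.indexOf(tweet.id) !== -1) return "skip";

    // Too few retweets to look like a legitimate contest
    if (tweet.retweet_count < settings.MIN_RETWEETS_NEEDED) return "skip";

    // Author is already on the blocked list
    if (state.blockedUsers.indexOf(tweet.user.id) !== -1) return "skip";

    // Author has so many tweets that it is probably a bot
    var limitEnabled = settings.MAX_USER_TWEETS > 0;
    if (limitEnabled && tweet.user.statuses_count >= settings.MAX_USER_TWEETS) {
        return settings.MAX_USER_TWEETS_BLOCK ? "block" : "skip";
    }

    return "accept";
}

// Usage: classify one (made-up) search result.
var exampleState = { blockedUsers: [], badTweetIds: [] };
var exampleTweet = { id: 1, retweet_count: 42, user: { id: 5, statuses_count: 300 } };
console.log(classifyTweet(exampleTweet, exampleState, settings));
```

Returning a verdict instead of acting directly keeps the side effects (pushing to the results array, calling `API.blockUser`) in the caller, which makes each check easy to test in isolation.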

Bad tweet blacklisting. Oddly, the system sometimes attempts to retweet something you've already retweeted. Currently this is handled poorly: the system waits 10 minutes and re-adds that tweet to the front of the array, so it tries (and fails) to retweet the same tweet forever, every 10 minutes. Instead, we now blacklist the tweet so that future searches do not add it back into the search results array, and we continue checking other tweets.
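The blacklisting behaviour described above can be sketched in isolation as follows. The helper names (`handleRetweetFailure`, `nextCandidate`) and the `state` shape are hypothetical and do not appear in the commit. The sketch also keys the blacklist on `id_str` rather than `id`, on the assumption that tweet IDs can exceed the range JavaScript numbers represent exactly; the commit itself uses `id`.

```javascript
// Hypothetical state: a queue of pending tweets and a blacklist of tweet IDs.
function createState(tweets) {
    return { queue: tweets.slice(), badTweetIds: [] };
}

// On a failed retweet, remember the tweet instead of re-queueing it.
function handleRetweetFailure(tweet, state) {
    if (state.badTweetIds.indexOf(tweet.id_str) === -1) {
        state.badTweetIds.push(tweet.id_str);
    }
}

// Take the next queued tweet that is not blacklisted, or null if none is left.
function nextCandidate(state) {
    while (state.queue.length > 0) {
        var tweet = state.queue.shift();
        if (state.badTweetIds.indexOf(tweet.id_str) === -1) {
            return tweet;
        }
    }
    return null;
}

// Usage: the first tweet fails, so the worker moves on to the second one.
var demo = createState([{ id_str: "100" }, { id_str: "200" }]);
var first = nextCandidate(demo);
handleRetweetFailure(first, demo);   // pretend the retweet call failed
var second = nextCandidate(demo);
console.log(first.id_str, second.id_str, demo.badTweetIds);
```

Because a failed tweet is never pushed back onto the queue, the worker can resume after the short retweet timeout instead of the 10-minute rate-limit timeout.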
henhouse committed Feb 25, 2016
1 parent 68b6e05 commit 5755ca1
Showing 2 changed files with 77 additions and 22 deletions.
24 changes: 17 additions & 7 deletions api-functions.js
@@ -14,7 +14,7 @@ var callbacks = {
});

console.log("So far we have a total of:", allItems.length);

// If we have the next_results, search again for the rest (sort of a pagination)
if (result.search_metadata.next_results) {
API.searchByStringParam(result.search_metadata.next_results, callbacks.default);
@@ -29,20 +29,20 @@ var callbacks = {
result.users.forEach(function (user) {
blockedList.push(user.id);
});
-console.log("Your list of blocked users:", blockedList);
+console.log("Your list of blocked users:\n", blockedList);

return blockedList;
}
};

var API = {
search: function (options) {
-params =
+params =
"?q=" + encodeURIComponent(options.text),
"&count=" + options.count ? options.count : 100,
"&result_type=" + options.result_type ? options.result_type : 'popular',
"&since_id=" + options.since_id ? options.since_id : 0;

if (options.max_id) {
params += "&max_id=" + options.max_id;
}
@@ -78,23 +78,33 @@ var API = {
.then(callbacks.default)
.catch(function (err) {
console.error(err.message);
-});
+});
},

follow: function (userId) {
request.post({url: 'https://api.twitter.com/1.1/friendships/create.json?user_id=' + userId, oauth: oauth})
.then(callbacks.default)
.catch(function (err) {
console.error(err.message);
-});
+});
},

followByUsername: function (userName) {
request.post({url: 'https://api.twitter.com/1.1/friendships/create.json?screen_name=' + userName, oauth: oauth})
.then(callbacks.default)
.catch(function (err) {
console.error(err.message);
-});
+});
},

+blockUser: function(userId)
+{
+request.post({url: 'https://api.twitter.com/1.1/blocks/create.json?user_id=' + userId, oauth: oauth})
+.then(callbacks.default)
+.catch(function(err)
+{
+console.error(err.message);
+});
+},
+
getBlockedUsers: function (callback) {
75 changes: 60 additions & 15 deletions index.js
@@ -1,29 +1,74 @@
var API = require('./api-functions'),
RATE_LIMIT_EXCEEDED_TIMEOUT = 1000 * 60 * 10, // 10 minutes
RETWEET_TIMEOUT = 1000 * 15, // 15 seconds
-RATE_SEARCH_TIMEOUT = 1000 * 30; // 30 seconds
+RATE_SEARCH_TIMEOUT = 1000 * 30, // 30 seconds
+
+// Minimum amount of retweets a tweet needs before we retweet it.
+// - Significantly reduces the amount of fake contests retweeted and stops
+// retweeting other bots that retweet retweets of other bots.
+// Default: 10
+MIN_RETWEETS_NEEDED = 10,
+
+// Maxiumum amount of tweets a user can have before we do not retweet them.
+// - Accounts with an extremely large amount of tweets are often bots,
+// therefore we should ignore them and not retweet their tweets.
+// Default: 20000
+// 0 (disables)
+MAX_USER_TWEETS = 20000,
+
+// If option above is enabled, allow us to block them.
+// - Blocking users do not prevent their tweets from appearing in search,
+// but this will ensure you do not accidentally retweet them still.
+// Default: false
+// true (will block user)
+MAX_USER_TWEETS_BLOCK = false;


// Main self-initializing function
(function() {
var last_tweet_id = 0,
searchResultsArr = [],
blockedUsers = [],
+badTweetIds = [],
limitLockout = false;

/** The Callback function for the Search API */
-var searchCallback = function (response) {
+var searchCallback = function(response)
+{
var payload = JSON.parse(response);

// Iterating through tweets returned by the Search
-payload.statuses.forEach(function (searchItem) {
-
-// Further filtering out the retweets and tweets from blocked users
-if (!searchItem.retweeted_status && blockedUsers.indexOf(searchItem.user.id) === -1) {
-
-// Save the search item in the Search Results array
-searchResultsArr.push(searchItem);
-}
+payload.statuses.forEach(function (searchItem)
+{
+// Lots of checks to filter out bad tweets, other bots and contests that are likely not legitimate
+
+// is not already a retweet
+if (!searchItem.retweeted_status)
+{
+// is not an ignored tweet
+if (badTweetIds.indexOf(searchItem.id) === -1)
+{
+// has enough retweets on the tweet for us to retweet it too (helps prove legitimacy)
+if (searchItem.retweet_count >= MIN_RETWEETS_NEEDED)
+{
+// user is not on our blocked list
+if (blockedUsers.indexOf(searchItem.user.id) === -1)
+{
+if (MAX_USER_TWEETS && searchItem.user.statuses_count < MAX_USER_TWEETS) // should we ignore users with high amounts of tweets (likely bots)
+{
+// Save the search item in the Search Results array
+searchResultsArr.push(searchItem);
+}
+else if (MAX_USER_TWEETS_BLOCK) // may be a spam bot, do we want to block them?
+{
+blockedUsers.push(searchItem.user.id);
+API.blockUser(searchItem.user.id);
+console.log("Blocking possible bot user " + searchItem.user.id);
+}
+}
+}
+}
+}
});

// If we have the next_results, search again for the rest (sort of a pagination)
@@ -110,15 +155,15 @@ var API = require('./api-functions'),

function error()
{
-console.error("RT Failed for", searchItem.id, ". Re-trying after a timeout.");
+console.error("RT Failed for", searchItem.id, ". Likely has already been retweeted. Adding to blacklist.");

-// If the RT fails, add the item back at the beginning of the array
-searchResultsArr.unshift(searchItem);
+// If the RT fails, blacklist it
+badTweetIds.push(searchItem.id);

-// Re-start after a timeout
+// Then, re-start the RT Worker
setTimeout(function () {
retweetWorker();
-}, RATE_LIMIT_EXCEEDED_TIMEOUT);
+}, RETWEET_TIMEOUT);
}
);
}
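
A side note on the unchanged `search` lines in api-functions.js shown in this diff: the pieces of `params` are separated by commas rather than `+`, so only the `"?q="` part appears to be assigned, and `"&count=" + options.count ? ... : ...` groups as `("&count=" + options.count) ? ... : ...` because `+` binds tighter than the conditional operator. A minimal sketch of what the defaults seem intended to produce follows; `buildSearchQuery` is a hypothetical name and this is not part of the commit.

```javascript
// Hypothetical helper: build the search query string with explicit defaults.
function buildSearchQuery(options) {
    var params =
        "?q=" + encodeURIComponent(options.text) +
        "&count=" + (options.count ? options.count : 100) +
        "&result_type=" + (options.result_type ? options.result_type : "popular") +
        "&since_id=" + (options.since_id ? options.since_id : 0);

    // max_id is only appended when the caller supplies it
    if (options.max_id) {
        params += "&max_id=" + options.max_id;
    }
    return params;
}

// Usage: only the search text is supplied, so the defaults fill in the rest.
console.log(buildSearchQuery({ text: "RT to win" }));
```

The parentheses around each conditional are what keep the string concatenation from swallowing the condition.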