<a href="https://colab.research.google.com/github/svpino/twitter-giveaway/blob/master/giveaway.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Selecting The Winner of a Twitter Giveaway

This is an absolutely over-engineered solution to select the winner of a Twitter Giveaway I ran on August 18, 2020.

I didn't just want to select somebody randomly as any sane person would do. Instead, I tried to use the opportunity and throw some Deep Learning into the mix. 

But of course, there's no way to use Deep Learning in any useful way when the only thing you need is a freaking random number, so I decided to start doing silly things.

This notebook runs through the entire process to select the winner. 



In [1]:
import numpy as np
import pandas as pd
import tensorflow as tf
import matplotlib.pyplot as plt

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Conv2D, Dense, MaxPooling2D
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.datasets import mnist

# Configuration

Here are a bunch of constants that —if you want to use this code for anything— you'll need to provide:

* **Twitter's API authentication keys and tokens**. To get this information you need to register an account on https://developers.twitter.com.

* **USER**: This is your Twitter handle. 

* **STATUS**: This is the identifier of the specific tweet of the giveaway.

In [2]:
API_KEY = "<API_KEY>"
API_KEY_SECRET = "<API_KEY_SECRET>"
ACCESS_TOKEN = "<ACCESS_TOKEN>"
ACCESS_TOKEN_SECRET = "<ACCESS_TOKEN_SECRET>"

USER = "<USER SCREEN NAME>"
STATUS = 0 #<ID OF THE GIVEAWAY TWEET>

# Deep Learning

The functions here all relate to the image classification problem: taking images of handwritten digits and returning the digits they represent.

In [3]:
def create_model():
    """
    Creates the TensorFlow model that will be used to predict digits given
    an image.
    """

    model = Sequential([
        Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
        MaxPooling2D((2, 2)),
        Conv2D(64, (3, 3), activation='relu'),
        Conv2D(64, (3, 3), activation='relu'),
        MaxPooling2D((2, 2)),
        Flatten(),
        Dense(100, activation='relu'),
        Dense(10, activation='softmax')
    ])
    
    optimizer = SGD(learning_rate=0.01, momentum=0.9)
    model.compile(
        optimizer=optimizer, 
        loss='categorical_crossentropy', 
        metrics=['accuracy']
    )

    return model

def display_image(image):
    """
    Displays the given image on the screen.
    """

    plt.imshow(image, cmap=plt.get_cmap('gray'))

def transform(image):
    """
    Transforms an image into the format understood by the model.
    """

    image = image.reshape((1, 28, 28, 1))
    image = image.astype('float32') / 255.0
    return image

def train(model, x_train, y_train):
    """
    Fits the given model using the training dataset.
    """

    x_train = x_train.reshape((x_train.shape[0], 28, 28, 1))
    x_train = x_train.astype('float32') / 255.0
    y_train = to_categorical(y_train)

    model.fit(
        x_train, 
        y_train, 
        epochs=10, 
        batch_size=32, 
        verbose=1
    )

def predict(model, image):
    """
    Returns the digit represented by the supplied image using the 
    given model.
    """

    return np.argmax(model.predict(transform(image))[0], axis=-1)
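
The last step of `predict()` can be illustrated without a trained model. The network outputs ten probabilities, one per digit, and the predicted digit is the index of the largest one. The sketch below is plain Python with a made-up probability vector; `scale_pixels`, `argmax`, and `fake_probabilities` are illustrative names, not part of the notebook.

```python
def scale_pixels(pixels):
    """Scale 8-bit pixel values (0-255) to floats in [0, 1], like transform() does."""
    return [p / 255.0 for p in pixels]

def argmax(values):
    """Return the index of the largest value (what np.argmax does for a 1-D array)."""
    return max(range(len(values)), key=lambda i: values[i])

# Hypothetical softmax output: one probability per digit, 0 through 9.
fake_probabilities = [0.01, 0.02, 0.01, 0.05, 0.01, 0.02, 0.01, 0.80, 0.04, 0.03]

predicted_digit = argmax(fake_probabilities)
print(predicted_digit)          # 7
print(scale_pixels([0, 255]))   # [0.0, 1.0]
```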

## Loading the dataset

We are going to use the MNIST dataset of handwritten digits. We load it and display the first few images to get an idea of what's in it.

In [None]:
(x_train, y_train), (x_test, y_test) = mnist.load_data()

for i in range(9):
    plt.subplot(330 + 1 + i)
    display_image(x_train[i])

plt.show()

print(f"Training images: {x_train.shape[0]}. Test images: {x_test.shape[0]}.")

## Creating the model

We can now create the model and display a summary of its architecture.

In [None]:
model = create_model()
model.summary()

## Training

Here we are going to train the model using the training dataset.

In [None]:
train(model, x_train, y_train)

# Twitter

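Let's now define the functions that we are going to use to access the Twitter API through Tweepy. The method names used below (`followers_ids`, `favorites`) are the Tweepy 3.x names; Tweepy 4 renamed them to `get_follower_ids` and `get_favorites`.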

In [7]:
import tweepy
import time
import random

def authenticate():
    """ 
    This function authenticates with the Twitter API using our keys and
    returns the object that we need to use to interact with the API.
    """

    auth = tweepy.OAuthHandler(API_KEY, API_KEY_SECRET)
    auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
    return tweepy.API(auth, wait_on_rate_limit=True)

def get_followers(api):
    """
    This function returns the list of follower ids of the account whose
    screen name is stored in the USER constant.
    """

    followers = []
    for page in tweepy.Cursor(api.followers_ids, screen_name=USER).pages():
        followers.extend(page)

        # We are able to return 5,000 users on every call, so we need to make
        # a few calls to get the entire list. Let's make sure we sleep for a
        # little bit to avoid getting throttled by Twitter's API.
        time.sleep(60)

    return followers

def did_user_like_tweet(user_id, status_id):
    """
    This function returns True if the supplied user liked the supplied 
    status (tweet). Returns False otherwise.
    """

    page = 1

    # We are going to be looking through the last 300 tweets that this user
    # liked. Yes, there's a chance that this user likes a lot of tweets and
    # we get a false negative, but that's unlikely in the context of this
    # giveaway.
    while page <= 15:
        # On every request we can get 20 different likes for this user.
        # We need to make sure we keep going through this list until we
        # find our status, or until we have looked for long enough.
        statuses = api.favorites(id=user_id, page=page)
        for status in statuses:
            if status.id == status_id:
                return True

        page += 1

    return False
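
The paging logic in `did_user_like_tweet()` can be tried out offline. The sketch below replaces the Twitter API with a hypothetical `fake_fetch_page` function that serves 20 likes per page, so the 15-page limit covers exactly the 300 most recent likes. Stopping early on an empty page is a small addition that the function above does not have.

```python
def liked_within_pages(fetch_page, status_id, max_pages=15):
    """Scan up to max_pages pages of likes; return True if status_id shows up."""
    for page in range(1, max_pages + 1):
        likes = fetch_page(page)
        if not likes:
            break  # the user has no more likes to look through
        if status_id in likes:
            return True
    return False

# Hypothetical stand-in for the API: 400 liked tweet ids, 20 per page.
all_likes = list(range(1000, 1400))

def fake_fetch_page(page, page_size=20):
    start = (page - 1) * page_size
    return all_likes[start:start + page_size]

print(liked_within_pages(fake_fetch_page, 1299))  # True: the 300th like is on page 15
print(liked_within_pages(fake_fetch_page, 1300))  # False: the 301st like is on page 16
```

The second call shows the false negative that the comments in the function warn about: a like that exists but falls outside the scanned window.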

def get_user(id):
    """
    This function returns the screen name and full name of a user given
    its identifier.
    """

    user = api.get_user(id)
    return user.screen_name, user.name

## Followers

We can now authenticate with the API and get the list of followers. This process may take a little bit of time because we have to hit the API several times to retrieve the entire list.

In [None]:
api = authenticate()
followers = get_followers(api)

print(f"Number of followers: {len(followers)}")

## Retweets

Let's now load the CSV containing every single user that retweeted the giveaway. This CSV file is created by a separate application.

With this list, we can compute the intersection of the followers and those who retweeted. That intersection represents everyone who is eligible to receive the prize (before checking for likes).

In [9]:
retweeters = pd.read_csv("retweeters.csv")['0'].tolist()

elegible = list(set(followers) & set(retweeters)) 
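
As a small illustration of the intersection step, here is the same set operation on made-up user ids. The names `followers_sample`, `retweeters_sample`, and `eligible_sample` exist only in this sketch, and `sorted()` is used just to get a stable order for printing.

```python
# Made-up ids: only 103 and 104 both follow the account and retweeted.
followers_sample = [101, 102, 103, 104]
retweeters_sample = [103, 104, 105]

eligible_sample = sorted(set(followers_sample) & set(retweeters_sample))
print(eligible_sample)  # [103, 104]
```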

# The Magic Number

Before anything else, let's define a couple of functions related to the selection process:

* The first function is `generate_magic_number()`, an algorithm that creates a random number from the list of eligible users.

* The second function is `display_magic_number()`, which displays the magic number on the screen.

In [10]:
def generate_magic_number(model, followers):
    """
    Given the trained model and the list of eligible followers, this
    function generates the "magic number", which represents the position
    in the list of the winner of the giveaway.
    """

    # First of all, let's generate randomly how many digits we need
    # for our magic number. This should go from 1 digit to the total
    # number of digits of the total number of followers.
    digits = random.randint(1, len(str(len(followers))))

    magic_number = ""
    digit_images = []
    for i in range(digits):
        # Let's pick a random image from a dataset of 10,000 samples
        # and infer the digit that's displayed on the image.
        image = random.choice(x_test)
        digit = predict(model, image)

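        # Let's now append the new digit to our magic number and ensure
        # the result is still a valid index into the list (valid indices
        # run from 0 to len(followers) - 1).
        temp = magic_number + str(digit)
        if int(temp) >= len(followers):
            break
        
        magic_number = temp
        digit_images.append(image)

    # If the very first digit was already out of range (only possible with
    # fewer than 10 users), fall back to the first position in the list.
    return int(magic_number or 0), digit_images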

def display_magic_number(magic_number, digit_images):
    """
    Displays the magic number and the array of images on the screen.
    """

    plt.figure(figsize=(10, 10))

    for index, image in enumerate(digit_images):
        # Use at least 5 columns, and more if the number has more digits.
        plt.subplot(1, max(5, len(digit_images)), index + 1)
        display_image(image)
    
    plt.show()

    print(f"Magic number: {magic_number}")
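
The digit-by-digit construction can be explored without the model. In the sketch below, `rng.randint(0, 9)` stands in for the digit that the network would read from a random test image, and the result is treated as a zero-based position, so it has to stay below the list size. `sketch_magic_number` and its parameters are hypothetical names for this illustration.

```python
import random

def sketch_magic_number(list_size, rng):
    """Build a number digit by digit, keeping it a valid index for list_size items."""
    # Pick how many digits to draw: from 1 up to the digit count of list_size.
    digits = rng.randint(1, len(str(list_size)))

    magic = ""
    for _ in range(digits):
        digit = rng.randint(0, 9)  # stand-in for the model's prediction
        candidate = magic + str(digit)
        if int(candidate) >= list_size:
            break  # adding this digit would point past the end of the list
        magic = candidate

    # No digit accepted (only possible for tiny lists): use position 0.
    return int(magic) if magic else 0

rng = random.Random(2020)
samples = [sketch_magic_number(250, rng) for _ in range(1000)]
print(all(0 <= s < 250 for s in samples))  # True
```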


## Selecting the winner

This process generates a magic number, looks up the eligible user at that position, and checks whether they liked the giveaway tweet. Since we know that the user is a follower and also retweeted, there's a very high likelihood that they also liked the tweet.

If the selected user didn't like the tweet, we repeat the process until we find one who did.
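
The retry loop can be sketched with stand-ins for both the magic number and the Twitter API. Here `rng.randrange` replaces the magic number, `has_liked` is a hypothetical callback replacing the API check, and `max_attempts` is an added safeguard, not something the notebook uses, so the loop cannot run forever when nobody in the list liked the tweet.

```python
import random

def select_winner(eligible_ids, has_liked, rng, max_attempts=100):
    """Draw random positions until the drawn user passes the has_liked check."""
    pool = list(eligible_ids)
    rng.shuffle(pool)

    for attempt in range(1, max_attempts + 1):
        position = rng.randrange(len(pool))  # stand-in for the magic number
        user_id = pool[position]
        if has_liked(user_id):
            return user_id, attempt

    return None, max_attempts  # gave up without finding a winner

# Made-up data: users 11, 13 and 14 liked the tweet.
likers = {11, 13, 14}
winner, attempts = select_winner(
    [10, 11, 12, 13, 14], lambda user_id: user_id in likers, random.Random(18)
)
print(f"Winner: {winner} after {attempts} attempt(s)")
```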

In [None]:
# First of all, let's make sure our list of eligible users is properly shuffled.
random.shuffle(elegible)

# We need to repeat until we find a winner.
while True:

    # Let's generate a magic number (and also return the images so we
    # can show them and it looks cool!)
    magic_number, digit_images = generate_magic_number(model, elegible)
    display_magic_number(magic_number, digit_images)

    # The magic number will give us the potential winner from the list of
    # eligible users.
    user_id = elegible[magic_number]

    # We need to make sure this user liked the giveaway tweet. If they did,
    # we found the winner. If they didn't, we need to try again.
    liked = did_user_like_tweet(user_id, STATUS)
    if liked:
        screen_name, name = get_user(user_id)
        print(f"User @{screen_name} ({name}) is the winner!")
        break

    print("User did not like the tweet. Retrying with a new user", end="\n\n")
