# :iphone: Retrieving Twitter Wordle scores

This script retrieves Wordle scores from Twitter, checks whether they are valid, and uploads them to a SQL Server database hosted on my computer.

In [9]:
import tweepy
import pandas as pd
import numpy as np
import pyodbc
import secret

## :newspaper: Connecting to the Twitter API and retrieving search results

To retrieve a list of Twitter search results, I first have to connect to the Twitter API. In this case, I've used a Bearer Token specific to my Twitter account.

After you complete a Wordle, the game lets you share a text summary of your results, which proud (or angry) players post to Twitter quite frequently. To find these scores, I searched for the word "Wordle" and the number "6". As shown below, each game report starts with "Wordle", followed by the day number and the score out of six. Since the day and score vary by date and by player, I limited my search terms to the parts of the report that stay constant.

**Wordle** 345 3/**6**

🟩⬛🟩⬛🟨<br>
⬛⬛⬛⬛⬛<br>
🟩🟩🟩🟩🟩<br>

*An example board after completing the Wordle and the terms searched for*
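The share format described above can also be matched with a regular expression, which avoids relying on character offsets. This is a minimal sketch, not the parsing used later in this notebook. The helper name `parse_wordle_header` and the pattern are assumptions about the share text: a day number, a score of 1 to 6 or X, and an optional `*` for hard mode.

```python
import re

# Assumed pattern for the share header, e.g. "Wordle 345 3/6" or "Wordle 345 X/6*".
# Hypothetical helper; not part of tweepy or of the cells in this notebook.
HEADER = re.compile(r"Wordle (\d+) ([1-6X])/6(\*?)")

def parse_wordle_header(text):
    """Return (day, score, hard_mode), or None if no header is found.

    A failed game ("X/6") is reported with a score of 0.
    """
    match = HEADER.search(text)
    if match is None:
        return None
    day = int(match.group(1))
    score = 0 if match.group(2) == "X" else int(match.group(2))
    hard_mode = match.group(3) == "*"
    return day, score, hard_mode

print(parse_wordle_header("Wordle 345 3/6\n\n🟩⬛🟩⬛🟨"))   # (345, 3, False)
print(parse_wordle_header("Wordle 345 X/6*"))              # (345, 0, True)
print(parse_wordle_header("I love Wordle, 6 tries today")) # None
```

The last call shows why a search for "Wordle" and "6" alone is not enough: a tweet can match the query without containing a score report.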

In [10]:
#accessing the Twitter API
client = tweepy.Client(bearer_token = secret.BEARER)

In [11]:
#specifying query for searching Twitter
query = 'Wordle 6 -is:retweet'

#searching tweets per the query
tweets = client.search_recent_tweets(query=query, max_results=100, tweet_fields=['author_id', 'created_at', 'geo'])
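A search can come back with no matches. As far as I know, tweepy then leaves the response's `data` attribute as `None`, so iterating over it directly would fail. The sketch below guards against that case. It uses stand-in objects instead of a live API call; `tweet_rows` and the `SimpleNamespace` stand-ins are illustrative assumptions, not part of tweepy.

```python
from types import SimpleNamespace

def tweet_rows(response):
    """Return (id, author_id) pairs, or an empty list when there are no results."""
    # Assumption: response.data is None when the search matched nothing.
    tweets = response.data or []
    return [(tweet.id, tweet.author_id) for tweet in tweets]

# Stand-ins that mimic only the attributes used above (not real tweepy objects).
empty = SimpleNamespace(data=None)
found = SimpleNamespace(data=[SimpleNamespace(id=1, author_id=42)])

print(tweet_rows(empty))  # []
print(tweet_rows(found))  # [(1, 42)]
```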

## Loading Tweet information into a data frame

In [12]:
data = pd.DataFrame(data = [tweet.id for tweet in tweets.data], columns =['ID'])
data['Author'] = np.array([tweet.author_id for tweet in tweets.data])

data.head(10)

| | ID | Author |
|---|---|---|
| 0 | 1521647662100934656 | 1052704807956299776 |
| 1 | 1521647658942750720 | 733031774763122688 |
| 2 | 1521647650239827968 | 242715081 |
| 3 | 1521647646565609473 | 3568613178 |
| 4 | 1521647646032764928 | 1354555050593497091 |
| 5 | 1521647641071058944 | 17919898 |
| 6 | 1521647639581933568 | 1002295020702130176 |
| 7 | 1521647637560127488 | 843296061791776769 |
| 8 | 1521647631118012417 | 1121766262827147264 |
| 9 | 1521647624318771200 | 382599550 |


In [13]:
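score_arr = []
day_arr = []
isHard_arr = []

for tweet in tweets.data:
    text = tweet.text

    #the score is the character just before the first "/" (the 3 in "3/6")
    slash = text.find("/")
    score = slash - 1

    #failed games ("X/6") and tweets without a usable "/" are recorded as 0
    if slash > 0 and text[score].isnumeric():
        score_arr.append(text[score])
    else:
        score_arr.append(0)

    #the day is the three-digit number that follows "Wordle "
    temp_day = text.partition("Wordle ")
    day = temp_day[2][0:4]

    if(len(day) == 4 and day[0:3].isnumeric() and day[3] == " "):
        day_arr.append(day[0:3])
    else:
        day_arr.append('NaN')

    #hard mode is marked by an asterisk right after the score, as in "3/6*"
    #slicing avoids an IndexError when the tweet ends right after the score
    if(slash > 0 and text[score+3:score+4] == '*'):
        isHard_arr.append(1)
    else:
        isHard_arr.append(0)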
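The offsets used in the loop above are easier to follow on a single sample string. This is a small worked example; the sample text is made up for illustration and is not taken from the retrieved tweets.

```python
# Made-up share text in the Wordle format, with the hard-mode asterisk.
sample = "Wordle 319 4/6*\n\n⬛🟨⬛⬛⬛"

# Position of the character just before the first "/", i.e. the score.
score = sample.find("/") - 1
score_char = sample[score]

# Text after "Wordle ", of which the first three characters are the day.
day = sample.partition("Wordle ")[2][0:3]

# Three characters past the score: "4", "/", "6", then the optional "*".
hard_marker = sample[score + 3]

print(score_char)   # 4
print(day)          # 319
print(hard_marker)  # *
```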

In [14]:
data['WordleDay'] = day_arr
data['Score'] = score_arr
data['HardMode'] = isHard_arr

In [15]:
#dropping rows where no valid Wordle day was found
clean_data = data[data['WordleDay'] != 'NaN']

clean_data.head(10)

| | ID | Author | WordleDay | Score | HardMode |
|---|---|---|---|---|---|
| 0 | 1521647662100934656 | 1052704807956299776 | 319 | 3 | 0 |
| 1 | 1521647658942750720 | 733031774763122688 | 318 | 5 | 0 |
| 2 | 1521647650239827968 | 242715081 | 319 | 0 | 0 |
| 3 | 1521647646565609473 | 3568613178 | 319 | 3 | 0 |
| 4 | 1521647646032764928 | 1354555050593497091 | 319 | 4 | 0 |
| 5 | 1521647641071058944 | 17919898 | 319 | 3 | 0 |
| 6 | 1521647639581933568 | 1002295020702130176 | 319 | 3 | 0 |
| 7 | 1521647637560127488 | 843296061791776769 | 319 | 3 | 0 |
| 8 | 1521647631118012417 | 1121766262827147264 | 319 | 3 | 0 |
| 9 | 1521647624318771200 | 382599550 | 319 | 3 | 0 |


In [16]:
#uploading dataframe to the database
conn = pyodbc.connect('Driver={SQL Server};'
                      'Server=AUSTIN-PC\\SQLEXPRESS;'
                      'Database=WordleAnalysis;'
                      'Trusted_Connection=yes;')

cursor = conn.cursor()

#inserting each row with a parameterized query
for index, row in clean_data.iterrows():
    cursor.execute("INSERT INTO TwitterWordle (ID,Author,WordleDay,Score,HardMode) values(?,?,?,?,?)", row.ID, row.Author, row.WordleDay, row.Score, row.HardMode)
conn.commit()
cursor.close()
conn.close()
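The row-by-row insert above can also be written as a single batched call. The sketch below shows that pattern with Python's built-in `sqlite3` module and an in-memory database, used here only as a stand-in so the example runs without a SQL Server instance. The column types and sample rows are assumptions made for illustration.

```python
import sqlite3

# Rows shaped like clean_data: (ID, Author, WordleDay, Score, HardMode).
# The values are made up for illustration.
rows = [
    (1, 100, "319", "3", 0),
    (2, 101, "319", "0", 1),
]

# In-memory SQLite database as a stand-in for the SQL Server instance.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute(
    "CREATE TABLE TwitterWordle "
    "(ID INTEGER PRIMARY KEY, Author INTEGER, WordleDay TEXT, Score TEXT, HardMode INTEGER)"
)

# One parameterized statement, executed once per row in the list.
cursor.executemany(
    "INSERT INTO TwitterWordle (ID, Author, WordleDay, Score, HardMode) VALUES (?,?,?,?,?)",
    rows,
)
conn.commit()

inserted = cursor.execute("SELECT COUNT(*) FROM TwitterWordle").fetchone()[0]
print(inserted)  # 2

cursor.close()
conn.close()
```

pyodbc cursors also provide `executemany`, so the same pattern should carry over, though I have not run it against the SQL Server table here.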