# Maryland State Parks Twitter Purge

According to [this article](https://www.baltimoresun.com/news/maryland/investigations/bs-md-state-park-social-media-accounts-merging-20190109-story.html) in the Baltimore Sun, the Maryland Park Service has decided to consolidate all of the individual state park social media accounts, including those on Twitter. Closing the individual accounts would effectively remove the historical record of the feeds that people have followed. Let's use [twarc](https://github.com/docnow/twarc) to determine which accounts are affected, and how many followers and tweets they have.

In [1]:
import twarc

The text of the tweet that each park had to tweet out looked like this:

> Happy New Year! As part of our resolution to streamline communications from Maryland State Parks, we are merging this account with @MDStateParks. Please be sure to follow that account today to keep up-to-date with events and news! This account will be closed on January 31.

We can use some of that text to identify the Park accounts:

In [2]:
t = twarc.Twarc()
tweets = t.search('Happy New Year! As part of our resolution to streamline communications from Maryland State Parks, we are merging this account with @MDStateParks')

Now let's go through each one and print out the user account, along with the number of followers and tweets it has:

In [3]:
for tweet in tweets:
    print(tweet['user']['screen_name'], tweet['user']['followers_count'], tweet['user']['statuses_count'])

JanesIslandSP 1285 459
DeepCreekLakeSP 2634 591
PointLookoutSP 1573 468
TuckahoeSP 1639 314
SenecaCreekSP 1420 632
robinsnewswire 25733 1409165
HerringtonMnrSP 1691 543
PocomokeRiverSP 1671 911
RocksStatePark 1321 179
SusquehannaSP 1538 232
TubmanSP 1670 2613
ReneeHawk1956 882 11759
GreenbrierSP 1974 998
CunninghamFalls 2027 467
NewGermanySP 2747 1736
SmallwoodSP 1051 319
GunpowderSP 2298 1549
FortFrederickSP 1231 665
RockyGapSP 2746 3834
fairhillsp 984 293
PatapscoSP 2953 3095
AssateagueSP 4028 1510


It looks like some users have retweeted that message, like [@robinsnewswire](https://twitter.com/robinsnewswire), so let's ignore the retweets.

In [4]:
for tweet in t.search('Happy New Year! As part of our resolution to streamline communications from Maryland State Parks, we are merging this account with @MDStateParks'):
    if 'retweeted_status' in tweet:
        continue
    print(tweet['user']['screen_name'], tweet['user']['followers_count'], tweet['user']['statuses_count'])

JanesIslandSP 1285 459
DeepCreekLakeSP 2634 591
PointLookoutSP 1573 468
TuckahoeSP 1639 314
SenecaCreekSP 1420 632
HerringtonMnrSP 1691 543
PocomokeRiverSP 1671 911
RocksStatePark 1321 179
SusquehannaSP 1538 232
TubmanSP 1670 2613
GreenbrierSP 1974 998
CunninghamFalls 2027 467
NewGermanySP 2747 1736
SmallwoodSP 1051 319
GunpowderSP 2298 1549
FortFrederickSP 1231 665
RockyGapSP 2746 3834
fairhillsp 984 293
PatapscoSP 2953 3095
AssateagueSP 4028 1510
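
The `retweeted_status` check used above can be pulled into a small helper so the same filter is easy to reuse. This is only a sketch: it assumes tweets are plain dicts in the same JSON shape the cells above index into, the names `is_retweet` and `original_tweets` are made up for illustration, and the sample dicts are trimmed stand-ins rather than real API responses.

```python
def is_retweet(tweet):
    """Return True if the tweet dict carries a 'retweeted_status' key."""
    return 'retweeted_status' in tweet


def original_tweets(tweets):
    """Yield only the tweets that are not retweets."""
    for tweet in tweets:
        if not is_retweet(tweet):
            yield tweet


# Hypothetical, heavily trimmed tweet dicts used only to exercise the helper.
sample = [
    {'user': {'screen_name': 'JanesIslandSP'}},
    {'user': {'screen_name': 'robinsnewswire'}, 'retweeted_status': {}},
]
names = [t['user']['screen_name'] for t in original_tweets(sample)]
print(names)
```

In the notebook itself the helper could wrap the search directly, for example `original_tweets(t.search(...))`.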


Let's run the search again, but this time store each account's user object in a list that we can reuse without going back to the API.

In [5]:
users = []
for tweet in t.search('Happy New Year! As part of our resolution to streamline communications from Maryland State Parks, we are merging this account with @MDStateParks'):
    if 'retweeted_status' in tweet:
        continue
    users.append(tweet['user'])
print(len(users))

20


Now we can print out the total number of tweets generated by these accounts:

In [6]:
print(sum([u['statuses_count'] for u in users]))

21408


Or the combined follower count across the accounts (someone who follows several parks is counted once for each):

In [7]:
print(sum([u['followers_count'] for u in users]))

38481
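
Beyond the totals, it can be handy to see which accounts have the largest audiences. The sketch below ranks user dicts by any numeric field; `rank_by` is a hypothetical helper name, and the sample rows are a handful of the figures copied from the output above rather than the full list.

```python
def rank_by(users, key, top=2):
    """Return the `top` user dicts with the largest value for `key`."""
    return sorted(users, key=lambda u: u[key], reverse=True)[:top]


# A few (screen_name, followers_count, statuses_count) rows from the output above.
rows = [
    ('JanesIslandSP', 1285, 459),
    ('DeepCreekLakeSP', 2634, 591),
    ('RockyGapSP', 2746, 3834),
    ('PatapscoSP', 2953, 3095),
    ('AssateagueSP', 4028, 1510),
]
sample_users = [
    {'screen_name': n, 'followers_count': f, 'statuses_count': s}
    for n, f, s in rows
]

top_followed = [u['screen_name'] for u in rank_by(sample_users, 'followers_count')]
print(top_followed)
```

With the real data, `rank_by(users, 'followers_count')` would do the same over all 20 accounts.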


The [Twitter API](https://developer.twitter.com/en/docs/tweets/timelines/api-reference/get-statuses-user_timeline.html) will only allow you to get the last 3,200 tweets for a given user. Fortunately almost all of the park accounts are under that limit: only RockyGapSP, with 3,834 tweets, exceeds it. So we can use twarc to fetch nearly all of the 21,408 tweets. Let's use [tqdm](https://github.com/tqdm/tqdm) to create a little progress bar, since this could take some time because of Twitter's [rate limiting](https://developer.twitter.com/en/docs/basics/rate-limits.html) of their API.

In [None]:
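from tqdm import tqdm

# The bar counts tweets, so its total is the combined tweet count, not the follower count.
progress = tqdm(total=sum(u['statuses_count'] for u in users))

tweets = []
for user in users:
    # timeline() takes user_id as its first positional argument, so pass the screen name by keyword.
    for tweet in t.timeline(screen_name=user['screen_name']):
        tweets.append(tweet)
        progress.update(1)
progress.close()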
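
A quick way to check the accounts against the 3,200-tweet timeline limit is to filter the user dicts by `statuses_count`. This is a minimal sketch: `TIMELINE_LIMIT`, `over_limit` and `max_fetchable` are hypothetical names, and the sample rows reuse three of the counts printed earlier.

```python
TIMELINE_LIMIT = 3200  # most recent tweets the user timeline endpoint returns per account


def over_limit(users, limit=TIMELINE_LIMIT):
    """Return (screen_name, statuses_count) pairs for accounts above the limit."""
    return [
        (u['screen_name'], u['statuses_count'])
        for u in users
        if u['statuses_count'] > limit
    ]


def max_fetchable(users, limit=TIMELINE_LIMIT):
    """Upper bound on retrievable tweets, capping each account at the limit."""
    return sum(min(u['statuses_count'], limit) for u in users)


# Three of the accounts from the output above.
sample_users = [
    {'screen_name': 'TubmanSP', 'statuses_count': 2613},
    {'screen_name': 'RockyGapSP', 'statuses_count': 3834},
    {'screen_name': 'PatapscoSP', 'statuses_count': 3095},
]
print(over_limit(sample_users))
print(max_fetchable(sample_users))
```

Any account that shows up in `over_limit(users)` will have only its most recent tweets returned by the timeline call, so its oldest tweets would need another source.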





In [17]:
print(len(tweets))

0
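
Since the point of the exercise is to keep a record of these feeds, the collected tweets could be written to disk as line-oriented JSON, one tweet per line. The following is a sketch under assumptions: `save_jsonl` and `load_jsonl` are hypothetical helper names, the file name is arbitrary, and the demo tweets are stand-ins rather than real API responses.

```python
import json
import os
import tempfile


def save_jsonl(tweets, path):
    """Write each tweet dict as one JSON document per line."""
    with open(path, 'w', encoding='utf-8') as fh:
        for tweet in tweets:
            fh.write(json.dumps(tweet) + '\n')


def load_jsonl(path):
    """Read the tweets back into a list of dicts, skipping blank lines."""
    with open(path, encoding='utf-8') as fh:
        return [json.loads(line) for line in fh if line.strip()]


# Hypothetical stand-in tweets; in the notebook this would be the `tweets` list.
demo = [
    {'id_str': '1', 'full_text': 'hello'},
    {'id_str': '2', 'full_text': 'park news'},
]
demo_path = os.path.join(tempfile.mkdtemp(), 'md-state-parks.jsonl')
save_jsonl(demo, demo_path)
restored = load_jsonl(demo_path)
print(len(restored))
```

Writing one tweet per line keeps the file easy to append to and to process a line at a time later.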
