# tweezers
### Python library for Twitter data analysis.
Tweezers leverages the power of *requests* and *pandas* to provide a simple interface for scraping Twitter data and shaping it into easy-to-analyse lists, dataframes, and metadata.

In [1]:
import tweezers

#### First, create a connection to the API with your Twitter credentials

In [2]:
# Twitter auth credentials loaded from a locally stored JSON file.
import json
# Use a context manager so the file handle is closed after reading.
with open("../../../credentials/twitter_credentials.json") as f:
    credentials = json.load(f)

consumer_key = credentials['consumer_key']
consumer_secret = credentials['consumer_secret']
access_token = credentials['access_token']
access_token_secret = credentials['access_token_secret']

auth = tweezers.tweezer_auth(consumer_key, 
                             consumer_secret,
                             access_token,
                             access_token_secret)
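
If a key is missing from the credentials file, the cell above fails with a bare `KeyError` partway through. A minimal sketch of a loader that checks all four fields up front is shown below. The name `load_credentials` and the constant `REQUIRED_KEYS` are hypothetical helpers for illustration and are not part of tweezers.

```python
import json

# The four fields read from the credentials file in the cell above.
REQUIRED_KEYS = (
    "consumer_key",
    "consumer_secret",
    "access_token",
    "access_token_secret",
)


def load_credentials(path):
    """Load credentials from a JSON file and check the expected keys exist.

    Hypothetical helper, not part of tweezers.
    """
    with open(path) as f:
        credentials = json.load(f)
    missing = [key for key in REQUIRED_KEYS if key not in credentials]
    if missing:
        raise KeyError("Missing credential fields: " + ", ".join(missing))
    return credentials
```

The returned dictionary can then be unpacked into the four variables passed to `tweezers.tweezer_auth` exactly as above.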

Verify that the credentials have successfully connected with Twitter:

In [3]:
auth.verify()

'Congratulations! Your credentials are valid.'

#### Now perform a search. Simply specify the number of tweets!

In [4]:
my_search = tweezers.search(tweezer_auth = auth,
                            total = 1000,
                            search_term = "bitcoin")

1000 tweets requested, 1000 tweets returned


The search object created provides access to a number of methods for retrieving lists, dataframes, and search metadata.

<img src="./images/search.methods.png" align="left">

Get a list of the tweet features returned by the search:

In [5]:
my_search.tweet_features

['created_at',
 'id',
 'id_str',
 'text',
 'truncated',
 'entities',
 'metadata',
 'source',
 'in_reply_to_status_id',
 'in_reply_to_status_id_str',
 'in_reply_to_user_id',
 'in_reply_to_user_id_str',
 'in_reply_to_screen_name',
 'user',
 'geo',
 'coordinates',
 'place',
 'contributors',
 'is_quote_status',
 'retweet_count',
 'favorite_count',
 'favorited',
 'retweeted',
 'possibly_sensitive',
 'lang']

Choose a feature and get a list of that feature for every tweet returned:

In [6]:
my_search.list_tweet_feature(feature = "created_at")[:10]

['Wed Aug 16 14:10:12 +0000 2017',
 'Wed Aug 16 14:10:11 +0000 2017',
 'Wed Aug 16 14:10:11 +0000 2017',
 'Wed Aug 16 14:10:07 +0000 2017',
 'Wed Aug 16 14:10:07 +0000 2017',
 'Wed Aug 16 14:10:07 +0000 2017',
 'Wed Aug 16 14:10:07 +0000 2017',
 'Wed Aug 16 14:10:07 +0000 2017',
 'Wed Aug 16 14:10:07 +0000 2017',
 'Wed Aug 16 14:10:04 +0000 2017']
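
The `created_at` values come back as strings. As a rough sketch, they can be turned into timezone-aware `datetime` objects with the standard library. The format string below is inferred from the output above, and `parse_created_at` is a hypothetical helper name, not a tweezers function. Note that `%a` and `%b` depend on an English locale.

```python
from datetime import datetime

# Format inferred from the strings shown above, e.g. 'Wed Aug 16 14:10:12 +0000 2017'.
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_created_at(value):
    """Parse a created_at string into a timezone-aware datetime."""
    return datetime.strptime(value, TWITTER_TIME_FORMAT)


# First and last of the ten timestamps listed above.
newest = parse_created_at("Wed Aug 16 14:10:12 +0000 2017")
oldest = parse_created_at("Wed Aug 16 14:10:04 +0000 2017")

# Time spanned by those ten tweets, in seconds.
print((newest - oldest).total_seconds())  # 8.0
```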

Get a dataframe summarising how often each user, mention, hashtag, or URL appears in the tweets returned by the search:

In [8]:
my_search.users_df().head()

| | users | count |
|---|---|---|
| 0 | rr_bitcoin | 30 |
| 1 | StakepoolCom | 13 |
| 2 | btcbreakingnews | 10 |
| 3 | AliceAndHerCat | 8 |
| 4 | Mrs_Sadler_ | 8 |


In [9]:
my_search.mentioned_df().head()

| | mentions | count |
|---|---|---|
| 0 | @YouTube | 10 |
| 1 | @CoinDesk | 8 |
| 2 | @Chain | 7 |
| 3 | @pierre | 6 |
| 4 | @Cointelegraph | 5 |


In [10]:
my_search.hashtags_df().head()

| | hashtags | count |
|---|---|---|
| 0 | bitcoin | 151 |
| 1 | Bitcoin | 82 |
| 2 | blockchain | 19 |
| 3 | cryptocurrency | 13 |
| 4 | news | 11 |
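
The hashtag counts above are case-sensitive, so `bitcoin` and `Bitcoin` appear as separate rows. If case variants should be treated as one tag, they can be merged after the fact. The sketch below uses the five counts shown above. `merge_case_variants` is a hypothetical helper, not a tweezers method.

```python
from collections import Counter


def merge_case_variants(counts):
    """Sum counts for tags that differ only by letter case."""
    merged = Counter()
    for tag, count in counts.items():
        merged[tag.lower()] += count
    return merged


# Counts copied from the hashtags_df() output above.
hashtag_counts = {
    "bitcoin": 151,
    "Bitcoin": 82,
    "blockchain": 19,
    "cryptocurrency": 13,
    "news": 11,
}

merged = merge_case_variants(hashtag_counts)
print(merged.most_common(2))  # [('bitcoin', 233), ('blockchain', 19)]
```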


In [11]:
my_search.urls_df().head()

| | url | count |
|---|---|---|
| 0 | https://t.co/q2DCN5fzra | 6 |
| 1 | https://t.co/0jRc0P4LS7 | 3 |
| 2 | https://t.co/UlHpwsafD8 | 3 |
| 3 | https://t.co/GxjCOsIHHp | 3 |
| 4 | https://t.co/1HTm4ls1Tc | 3 |


Get an estimate of the average time between tweets for the search term:

In [12]:
my_search.time_per_tweet

datetime.timedelta(0, 2, 598000)

Get an estimate of the number of tweets per week about the search term:

In [13]:
my_search.tweets_per_week

302400
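
The two estimates are related by simple arithmetic: one week divided by the average time per tweet. The sketch below reproduces that calculation from the `timedelta` shown above. The figure 302400 equals one week divided by exactly 2 seconds, which suggests the library works with whole seconds. That is an inference from the output alone, not something confirmed from the tweezers source. The function name `estimate_tweets_per_week` is hypothetical.

```python
from datetime import timedelta


def estimate_tweets_per_week(time_per_tweet):
    """Divide one week by the average interval between tweets."""
    return timedelta(weeks=1) / time_per_tweet


# Value returned by my_search.time_per_tweet above (2.598 seconds).
time_per_tweet = timedelta(0, 2, 598000)

# Using the full interval, microseconds included.
full_precision = estimate_tweets_per_week(time_per_tweet)
print(round(full_precision))  # 232794

# Using only the whole-second part of the interval (assumption: this is
# what produces the 302400 shown above).
whole_seconds = timedelta(weeks=1) // timedelta(seconds=time_per_tweet.seconds)
print(whole_seconds)  # 302400
```

If the fractional seconds matter for your analysis, computing the rate from `time_per_tweet` directly, as in the first calculation, gives the more precise figure.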

Get a list of the text of all the tweets:

In [14]:
my_search.tweets[:10]

["Why Bitcoin, Cryptos Aren't Gold https://t.co/ya3BJtyBXL",
 'Bitcoin for Dummies by Prypto (English) Paperback\xa0Book https://t.co/LQJOJWuZ5f',
 'The Age of Cryptocurrency: How Bitcoin and the Blockchain Are Challenging… https://t.co/oGKQ5YKZ2c',
 'Coral Gables (Miami, FL) Homeowner Will Accept Bitcoin for His $6.4 Million Mansion https://t.co/rRC1vNASzI',
 'Companies are buying bitcoin to pay off hackers, says top cybersecurity CEO https://t.co/GHQZjS2oQi https://t.co/wwTY1ISCy8',
 'Altcoins Could Lose 20% When #Bitcoin Climbs to $5,000 #cryptocurrency #fintech https://t.co/c86iD5G3Lr https://t.co/iguWkXpdLZ',
 'Would anyone recommend the https://t.co/onEuE7IA7G BCC wallet for Bitcoin Cash? https://t.co/ycQ4tpAikw',
 'The Intrinsic Value of Bitcoin, Bubble Speculation Historically Wrong https://t.co/w2c18WNEiM',
 'The Age of Cryptocurrency: How Bitcoin and the Blockchain Are Challenging… https://t.co/JoPuhEKcHY',
 'I like how this discussion raises awareness for Bitcoin proof-of-wo

Doing some Natural Language Processing? Get the same list, but with URLs, hashtag symbols, and @ symbols removed:

In [15]:
my_search.stripped_tweets[:10]

["Why Bitcoin, Cryptos Aren't Gold",
 'Bitcoin for Dummies by Prypto (English) Paperback Book',
 'The Age of Cryptocurrency: How Bitcoin and the Blockchain Are Challenging…',
 'Coral Gables (Miami, FL) Homeowner Will Accept Bitcoin for His $6.4 Million Mansion',
 'Companies are buying bitcoin to pay off hackers, says top cybersecurity CEO',
 'Altcoins Could Lose 20% When Bitcoin Climbs to $5,000 cryptocurrency fintech',
 'Would anyone recommend the BCC wallet for Bitcoin Cash?',
 'The Intrinsic Value of Bitcoin, Bubble Speculation Historically Wrong',
 'The Age of Cryptocurrency: How Bitcoin and the Blockchain Are Challenging…',
 'I like how this discussion raises awareness for Bitcoin proof-of-work game theory, which further exemplifies where…']
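
For reference, this kind of cleaning can be approximated with a couple of regular expressions. The sketch below is not the tweezers implementation. It is one plausible way to remove URLs, `#` symbols, and `@` symbols, and to collapse the leftover whitespace. `strip_tweet` is a hypothetical function name.

```python
import re

# Matches http:// or https:// followed by any run of non-whitespace characters.
URL_PATTERN = re.compile(r"https?://\S+")
# Matches the hashtag and mention symbols themselves, keeping the word after them.
SYMBOL_PATTERN = re.compile(r"[#@]")


def strip_tweet(text):
    """Remove URLs and #/@ symbols, then collapse runs of whitespace."""
    text = URL_PATTERN.sub("", text)
    text = SYMBOL_PATTERN.sub("", text)
    # split() with no arguments also handles non-breaking spaces such as '\xa0'.
    return " ".join(text.split())


raw = (
    "Altcoins Could Lose 20% When #Bitcoin Climbs to $5,000 "
    "#cryptocurrency #fintech https://t.co/c86iD5G3Lr https://t.co/iguWkXpdLZ"
)
print(strip_tweet(raw))
# Altcoins Could Lose 20% When Bitcoin Climbs to $5,000 cryptocurrency fintech
```

This simple pattern removes every `#` and `@` character, including ones that are not part of a hashtag or mention, so it may need tightening for text that contains email addresses or similar.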

#### And for a perfectly formatted pandas dataframe ready for analysis, just run:

In [17]:
my_search.pandas_df().head()

| | ats | created_at | favorite_count | hashtags | id | retweet_count | stripped_tweet | tweet | urls | user |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | [] | Wed Aug 16 14:10:12 +0000 2017 | 0 | [] | 897822804808265728 | 0 | Why Bitcoin, Cryptos Aren't Gold | Why Bitcoin, Cryptos Aren't Gold https://t.co/... | [https://t.co/ya3BJtyBXL] | all_btc |
| 1 | [] | Wed Aug 16 14:10:11 +0000 2017 | 0 | [] | 897822801364733953 | 0 | Bitcoin for Dummies by Prypto (English) Paperb... | Bitcoin for Dummies by Prypto (English) Paperb... | [https://t.co/LQJOJWuZ5f] | topnewskoeln |
| 2 | [] | Wed Aug 16 14:10:11 +0000 2017 | 0 | [] | 897822800492326912 | 0 | The Age of Cryptocurrency: How Bitcoin and the... | The Age of Cryptocurrency: How Bitcoin and the... | [https://t.co/oGKQ5YKZ2c] | tourismusvideo |
| 3 | [] | Wed Aug 16 14:10:07 +0000 2017 | 0 | [] | 897822786076516353 | 0 | Coral Gables (Miami, FL) Homeowner Will Accept... | Coral Gables (Miami, FL) Homeowner Will Accept... | [https://t.co/rRC1vNASzI] | rr_bitcoin |
| 4 | [] | Wed Aug 16 14:10:07 +0000 2017 | 0 | [] | 897822785766121472 | 0 | Companies are buying bitcoin to pay off hacker... | Companies are buying bitcoin to pay off hacker... | [https://t.co/GHQZjS2oQi, https://t.co/wwTY1IS...] | Foresite_MSP |
