# Network Science


**GOAL OF THE SESSION**: Fetch data from Twitter APIs

**DATA SOURCE**: Twitter 

**DEVELOPMENT**: How to create a Python script to query Twitter APIs

**REQUIREMENTS**: 

    Twitter Developer Account
    tweepy 
    Python Pretty Print

### Pretty Print

With **pprint** we can print nested objects in a readable, indented layout. Here is an example:

    import pprint
    pp = pprint.PrettyPrinter(indent=2)
    pp.pprint(OBJECT-TO-PRINT)


In [1]:
import pprint
#print help(pprint.PrettyPrinter)
pp = pprint.PrettyPrinter(indent=2)  # indent each nested level by 2 spaces (when the output wraps)
names = {"name":"Alex", "Surname":"Comu", "list":[1,2,3],"address":{"street":"Via Maria Vittoria", "number":1}}
print "NON PRETTY\n", names, "\n"
print "PRETTY:"
pp.pprint(names)

NON PRETTY
{'list': [1, 2, 3], 'Surname': 'Comu', 'name': 'Alex', 'address': {'street': 'Via Maria Vittoria', 'number': 1}} 

PRETTY:
{ 'Surname': 'Comu',
  'address': { 'number': 1, 'street': 'Via Maria Vittoria'},
  'list': [1, 2, 3],
  'name': 'Alex'}
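
When the formatted text is needed as a string (for example to write it to a log file), `pprint.pformat` returns it instead of printing it. A minimal sketch reusing the dictionary above; the `width=40` value is an arbitrary choice made here only to force line wrapping:

```python
import pprint

names = {"name": "Alex", "Surname": "Comu", "list": [1, 2, 3],
         "address": {"street": "Via Maria Vittoria", "number": 1}}

# pformat returns the pretty-printed representation as a string
formatted = pprint.pformat(names, indent=2, width=40)
print(formatted)
```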


# Demo with Special Effects

Inside the folder **demo** you'll find a very cool demo that combines:

* Twitter API
* Python Web Server
* D3Js visualization

Read the file **Readme.md** for more information about the example.

The goal of the demo is to create a connector between my PC and Twitter. Once the connector exists, I want to retrieve all the tweets that contain a specific **hashtag**.

At the end I'll represent the tweets in a dynamic data visualization with D3Js.

# Twitter Developer Account

Sign in at the [https://dev.twitter.com/](https://dev.twitter.com/) website, creating an account first if you don't have one.

After creating the account, we need to create a new Twitter app to access the APIs: go to [https://apps.twitter.com/](https://apps.twitter.com/) and create one.

To allow our APP to use the Twitter APIs we need to create an Access Token, so click on **Keys and Access Tokens** and create a new one.

And now we're ready to play with Twitter:)

## Twitter Documentation

[HERE](https://dev.twitter.com/overview/api) we can find a complete overview of the Twitter API.

# Tweepy Installation

We need to install the package **tweepy**:

    pip install tweepy
    
The documentation of the library is available at:

    http://tweepy.readthedocs.io/
    
## OAuth

First of all we need to save our credentials in variables. After that we can log in to Twitter and start using the APIs.


In [5]:
import tweepy

In [7]:
import tweepy
import pprint
pp = pprint.PrettyPrinter(indent=2)

CONSUMER_KEY = "hw3uYRyokN0xZyoHOD4DDUuN8"
SECRET_KEY = "pbZQoD0km4shy7bQVBMP27SFJUZl9rzEaQXGVWiMHhUZM5NbRh"
ACCESS_TOKEN = "799250231548407808-Cwnhd7ZuKG5p9r28GL4imYt7Sao7yAF"
SECRET_ACCESS_TOKEN = "WNOmwN4gdzuQp7oFl94if9tBShr8hCdbBBnUzkpETQY1S"
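
Keys written directly in a notebook are easy to leak when the notebook is shared. As a hedged alternative, the sketch below reads the same four values from environment variables. The `TWITTER_*` variable names and the `load_credentials` helper are assumptions introduced here for illustration, not part of tweepy.

```python
import os

# Hypothetical environment variable names; pick whatever fits your setup.
ENV_NAMES = {
    "CONSUMER_KEY": "TWITTER_CONSUMER_KEY",
    "SECRET_KEY": "TWITTER_SECRET_KEY",
    "ACCESS_TOKEN": "TWITTER_ACCESS_TOKEN",
    "SECRET_ACCESS_TOKEN": "TWITTER_SECRET_ACCESS_TOKEN",
}


def load_credentials(environ=None):
    """Return a dict with the four credentials read from the environment."""
    environ = os.environ if environ is None else environ
    missing = [env for env in ENV_NAMES.values() if env not in environ]
    if missing:
        # Fail early with a readable message instead of a late auth error
        raise KeyError("Missing environment variables: " + ", ".join(sorted(missing)))
    return dict((name, environ[env]) for name, env in ENV_NAMES.items())
```

With the variables exported in the shell, `creds = load_credentials()` followed by `CONSUMER_KEY = creds["CONSUMER_KEY"]` (and so on) would replace the literal strings above.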

In [8]:
# Twitter Authentication
auth = tweepy.OAuthHandler(CONSUMER_KEY, SECRET_KEY)

In [11]:
auth.set_access_token(ACCESS_TOKEN, SECRET_ACCESS_TOKEN)

In [12]:
# Create the connection to the api
api = tweepy.API(auth)
print api


<tweepy.api.API object at 0x7f991b4d9290>


In [None]:
help(api)

In [15]:
api.rate_limit_status()

{u'rate_limit_context': {u'access_token': u'799250231548407808-Cwnhd7ZuKG5p9r28GL4imYt7Sao7yAF'},
 u'resources': {u'account': {u'/account/login_verification_enrollment': {u'limit': 15,
    u'remaining': 15,
    u'reset': 1479479895},
   u'/account/settings': {u'limit': 15,
    u'remaining': 15,
    u'reset': 1479479895},
   u'/account/update_profile': {u'limit': 15,
    u'remaining': 15,
    u'reset': 1479479895},
   u'/account/verify_credentials': {u'limit': 75,
    u'remaining': 75,
    u'reset': 1479479895}},
  u'application': {u'/application/rate_limit_status': {u'limit': 180,
    u'remaining': 178,
    u'reset': 1479479892}},
  u'auth': {u'/auth/csrf_token': {u'limit': 15,
    u'remaining': 15,
    u'reset': 1479479895}},
  u'blocks': {u'/blocks/ids': {u'limit': 15,
    u'remaining': 15,
    u'reset': 1479479895},
   u'/blocks/list': {u'limit': 15, u'remaining': 15, u'reset': 1479479895}},
  u'business_experience': {u'/business_experience/dashboard_features': {u'limit': 450,
    u
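
The full dictionary is long, so it helps to list only the endpoints that have already consumed part of their quota. A minimal sketch, assuming the nested `resources` / endpoint / `limit`, `remaining`, `reset` layout shown in the output above; the `used_endpoints` helper is a name introduced here, and the sample is a two-entry excerpt of that output:

```python
from datetime import datetime, timedelta

# A small excerpt shaped like the output of api.rate_limit_status()
status = {
    "resources": {
        "account": {
            "/account/settings": {"limit": 15, "remaining": 15, "reset": 1479479895},
        },
        "application": {
            "/application/rate_limit_status": {"limit": 180, "remaining": 178, "reset": 1479479892},
        },
    }
}


def used_endpoints(status):
    """Return (endpoint, remaining, limit, reset time in UTC) for endpoints already used."""
    epoch = datetime(1970, 1, 1)
    rows = []
    for family in status["resources"].values():
        for endpoint, info in family.items():
            if info["remaining"] < info["limit"]:
                # 'reset' is a Unix timestamp (seconds since 1970-01-01 UTC)
                reset_at = epoch + timedelta(seconds=info["reset"])
                rows.append((endpoint, info["remaining"], info["limit"], reset_at))
    return sorted(rows)


for row in used_endpoints(status):
    print(row)
```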

## Tweet Stream

In [20]:
# download your home timeline tweets
my_tweets = api.home_timeline()

In [19]:
len(my_tweets)  # 0 because the account was just created and follows nobody

0

In [25]:
my_tweets[0]._json   # fails because the timeline is empty

IndexError: list index out of range

In [21]:
print "Tweets LEN: ", len(my_tweets), "\n"

Tweets LEN:  0 



In [26]:
my_tweets[0].user

IndexError: list index out of range

In [27]:
my_tweets[0].author

IndexError: list index out of range

In [None]:
my_tweets[0].user.screen_name

In [28]:
my_followers = api.followers()
my_followers_ids = api.followers_ids()

In [29]:
print "Followers: \t", len(my_followers)
print "Followers ids: \t", len(my_followers_ids)  # follower IDs

Followers: 	0
Followers ids: 	0


In [31]:
my_followers[0]   # a single call returns at most 20 followers

IndexError: list index out of range

In [30]:
my_followers_ids[0]

IndexError: list index out of range
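
On a brand-new account every `[0]` lookup above raises `IndexError`, because the returned lists are empty. A small guard avoids the exception; `first_or_none` is a hypothetical helper introduced here, and the empty list stands in for the result of `api.home_timeline()`:

```python
def first_or_none(items):
    """Return the first element of a sequence, or None when it is empty."""
    return items[0] if items else None


my_tweets = []   # stand-in for an empty api.home_timeline() result
first = first_or_none(my_tweets)
if first is None:
    print("No tweets yet")
else:
    print(first.text)
```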

In [None]:
# Dir Command on Tweet
print "TWEET DIR: ", dir(my_tweets[0]), "\n"
print help(my_tweets[0])

In [None]:
# USER of first Tweet
print my_tweets[0].user

In [None]:
# First 3 tweets
for index, tw in enumerate(my_tweets):
    if index < 3:
        print tw.text, "\n"

## My Followers

In [None]:
## fetch the followers list
my_followers = api.followers()
print "My_Followers LEN: ", len(my_followers)

In [None]:
my_followers[0]

In [None]:
print dir(my_followers[0])

In [None]:
print help(my_followers[0])

In [None]:
pp.pprint(my_followers[0]._json)

# Get External User

In [32]:
intesa = api.get_user("intesasanpaolo")
intesa

User(follow_request_sent=False, has_extended_profile=False, profile_use_background_image=False, _json={u'follow_request_sent': False, u'has_extended_profile': False, u'profile_use_background_image': False, u'profile_text_color': u'333333', u'default_profile_image': False, u'id': 393894382, u'profile_background_image_url_https': u'https://abs.twimg.com/images/themes/theme1/bg.png', u'verified': False, u'translator_type': u'none', u'profile_location': None, u'profile_image_url_https': u'https://pbs.twimg.com/profile_images/597680334474485760/Z6OMNC0B_normal.jpg', u'profile_sidebar_fill_color': u'DDEEF6', u'entities': {u'url': {u'urls': [{u'url': u'http://t.co/lmrSnJtN6Y', u'indices': [0, 22], u'expanded_url': u'http://www.intesasanpaolo.com', u'display_url': u'intesasanpaolo.com'}]}, u'description': {u'urls': []}}, u'followers_count': 4240, u'profile_sidebar_border_color': u'FFFFFF', u'id_str': u'393894382', u'profile_background_color': u'DBDBDB', u'listed_count': 134, u'status': {u'cont

In [171]:
help(intesa)

Help on User in module tweepy.models object:

class User(Model)
 |  Method resolution order:
 |      User
 |      Model
 |      __builtin__.object
 |  
 |  Methods defined here:
 |  
 |  follow(self)
 |  
 |  followers(self, **kargs)
 |  
 |  followers_ids(self, *args, **kargs)
 |  
 |  friends(self, **kargs)
 |  
 |  lists(self, *args, **kargs)
 |  
 |  lists_memberships(self, *args, **kargs)
 |  
 |  lists_subscriptions(self, *args, **kargs)
 |  
 |  timeline(self, **kargs)
 |  
 |  unfollow(self)
 |  
 |  ----------------------------------------------------------------------
 |  Class methods defined here:
 |  
 |  parse(cls, api, json) from __builtin__.type
 |  
 |  parse_list(cls, api, json_list) from __builtin__.type
 |  
 |  ----------------------------------------------------------------------
 |  Methods inherited from Model:
 |  
 |  __getstate__(self)
 |  
 |  __init__(self, api=None)
 |  
 |  __repr__(self)
 |  
 |  ----------------------------------------------------------

In [33]:
print intesa.followers_count

4240


In [34]:
intesa.friends_count

179

In [22]:
friends = api.friends_ids('intesasanpaolo')
print len(friends)

177


In [24]:
likes = api.favorites('intesasanpaolo')
print len(likes)

20


In [26]:
likes[0]

Status(contributors=None, truncated=False, text=u'Nasce oggi PowerU Digital!  Grazie a @intesasanpaolo e @DeloitteItalia per credere con noi nel progetto @HumanAgeInsIT @ManpowerGroupIT', is_quote_status=False, in_reply_to_status_id=None, id=798107981736906753, favorite_count=4, _api=<tweepy.api.API object at 0x102cdf190>, author=User(follow_request_sent=False, has_extended_profile=False, profile_use_background_image=False, _json={u'follow_request_sent': False, u'has_extended_profile': False, u'profile_use_background_image': False, u'default_profile_image': False, u'id': 1166565846, u'profile_background_image_url_https': u'https://abs.twimg.com/images/themes/theme1/bg.png', u'verified': False, u'translator_type': u'none', u'profile_text_color': u'333333', u'profile_image_url_https': u'https://pbs.twimg.com/profile_images/761270969730301952/X8-H-Adj_normal.jpg', u'profile_sidebar_fill_color': u'DDEEF6', u'entities': {u'url': {u'urls': [{u'url': u'https://t.co/HIf5VJBnyO', u'indices': [0

In [None]:
intesa_followers_count =  intesa.followers_ids()
print len(intesa_followers_count)

In [None]:
print intesa_followers_count[0]

In [None]:
api.get_user(intesa_followers_count[0])

# Cursor

In [None]:
print len(intesa_followers_count)

In [None]:
intesa_followers = intesa.followers()
print len(intesa_followers)

In [None]:
help(tweepy.Cursor)

In [36]:
intesa_cursor = tweepy.Cursor(api.followers, screen_name='intesasanpaolo')

In [None]:
print dir(intesa_cursor)

In [38]:
intesa_cursor

<tweepy.cursor.Cursor at 0x7f991abe8510>

In [39]:
print intesa_cursor.items()

<tweepy.cursor.ItemIterator object at 0x7f991b4abb50>


In [40]:
print intesa_cursor.items().next()   # items() returns an iterator; next() fetches one item

User(follow_request_sent=False, has_extended_profile=False, profile_use_background_image=True, profile_sidebar_fill_color=u'DDEEF6', live_following=False, time_zone=None, id=799610634899980291, description=u'', _api=<tweepy.api.API object at 0x7f991b4d9290>, verified=False, blocked_by=False, profile_text_color=u'333333', muting=False, profile_image_url_https=u'https://abs.twimg.com/sticky/default_profile_images/default_profile_3_normal.png', _json={u'follow_request_sent': False, u'has_extended_profile': False, u'profile_use_background_image': True, u'live_following': False, u'default_profile_image': True, u'id': 799610634899980291, u'profile_background_image_url_https': None, u'translator_type': u'none', u'verified': False, u'blocked_by': False, u'profile_text_color': u'333333', u'muting': False, u'profile_image_url_https': u'https://abs.twimg.com/sticky/default_profile_images/default_profile_3_normal.png', u'profile_sidebar_fill_color': u'DDEEF6', u'entities': {u'description': {u'urls

In [41]:
intesa_cursor.pages().next()

[User(follow_request_sent=False, has_extended_profile=False, profile_use_background_image=True, profile_sidebar_fill_color=u'E3E2DE', live_following=False, time_zone=u'Eastern Time (US & Canada)', id=140578968, description=u'The official channel for Cisco Financial Services Industry news, updates and events.', _api=<tweepy.api.API object at 0x7f991b4d9290>, verified=False, blocked_by=False, profile_text_color=u'634047', muting=False, profile_image_url_https=u'https://pbs.twimg.com/profile_images/727954347145891840/XdBPoMtc_normal.jpg', _json={u'follow_request_sent': False, u'has_extended_profile': False, u'profile_use_background_image': True, u'live_following': False, u'default_profile_image': False, u'id': 140578968, u'profile_background_image_url_https': u'https://pbs.twimg.com/profile_background_images/743499327/3559a428155964b2850f71bef72c0bcd.jpeg', u'translator_type': u'none', u'verified': False, u'blocked_by': False, u'profile_text_color': u'634047', u'muting': False, u'profile_

In [44]:
intesa_followers = []
for page in intesa_cursor.pages():
    print "OK"
    intesa_followers.extend(page)

OK
OK
OK
OK
OK
OK
OK
OK
OK
OK
OK
OK


RateLimitError: [{u'message': u'Rate limit exceeded', u'code': 88}]

In [46]:
len(intesa_followers)

240
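
The loop collected 12 pages of 20 users before the rate limit was hit, presumably because earlier cells had already used part of the request window. To get a feel for how long a full download takes, the sketch below does the arithmetic, assuming 20 users per page and 15 requests per 15-minute window (the figures quoted later in this notebook) and the 4240 followers reported by `followers_count`. `estimate_wait_minutes` is a name introduced here for illustration.

```python
import math


def estimate_wait_minutes(total_items, per_page=20, calls_per_window=15, window_minutes=15):
    """Return (number of pages, minutes spent waiting for rate-limit resets)."""
    pages = int(math.ceil(total_items / float(per_page)))
    windows = int(math.ceil(pages / float(calls_per_window)))
    # The first window needs no waiting; each further window costs one pause
    return pages, max(windows - 1, 0) * window_minutes


pages, minutes = estimate_wait_minutes(4240)
print("pages: %d, waiting time: %d minutes" % (pages, minutes))
```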

In [48]:
intesa_followers[0]._json

{u'blocked_by': False,
 u'blocking': False,
 u'contributors_enabled': False,
 u'created_at': u'Wed Mar 02 20:31:46 +0000 2016',
 u'default_profile': True,
 u'default_profile_image': False,
 u'description': u'@riberto81  RILASSATI! LO STRESS NUOCE GRAVEMENTE A TE E A CHI TI STA INTORNO',
 u'entities': {u'description': {u'urls': []}},
 u'favourites_count': 892,
 u'follow_request_sent': False,
 u'followers_count': 48,
 u'following': False,
 u'friends_count': 74,
 u'geo_enabled': False,
 u'has_extended_profile': True,
 u'id': 705128484323127296,
 u'id_str': u'705128484323127296',
 u'is_translation_enabled': False,
 u'is_translator': False,
 u'lang': u'it',
 u'listed_count': 3,
 u'live_following': False,
 u'location': u'',
 u'muting': False,
 u'name': u'Riberto',
 u'notifications': False,
 u'profile_background_color': u'F5F8FA',
 u'profile_background_image_url': None,
 u'profile_background_image_url_https': None,
 u'profile_background_tile': False,
 u'profile_banner_url': u'https://pbs.twim

In [None]:
for follower in intesa_cursor.items():
    print follower

In [None]:
mylist = []
for page in intesa_cursor.pages():     # each page holds up to 20 users
    mylist.extend(page)
    break

In [None]:
len(mylist)

In [None]:
mylist[0]._json

In [None]:
for i, f in enumerate(mylist):
    print i, f.statuses_count

In [50]:
import time   # time module from the Python standard library
def limit_handler(cursor):
    while True:
        try:
            yield cursor.next()
        except tweepy.RateLimitError:
            print "Sleeping for 15 seconds...."
            time.sleep(15)
            print "Trying again!"


In [52]:
intesa_followers = []
for page in limit_handler(intesa_cursor.pages()):
    intesa_followers.extend(page)

Sleeping for 15 seconds....
Trying again!
Sleeping for 15 seconds....
Trying again!
Sleeping for 15 seconds....
Trying again!
Sleeping for 15 seconds....
Trying again!
Sleeping for 15 seconds....


KeyboardInterrupt: 

# Get Hashtags


In [None]:
tweets = []
for tweet in tweepy.Cursor(api.search, q='#trump').items(5):
    print tweet.text
    tweets.append(tweet)
print "\n-----\n"
print tweets[0]
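
Once tweets are collected, a natural next step is to count which hashtags appear in them. A minimal sketch, assuming each tweet's `_json` carries an `entities` / `hashtags` list whose items have a `text` field (the layout visible in the timeline output further down). `hashtags_of` is a helper introduced here and the sample dictionaries are made up for illustration:

```python
from collections import Counter


def hashtags_of(tweet_json):
    """Return the lower-cased hashtag texts listed in a tweet's entities."""
    entities = tweet_json.get("entities", {})
    return [tag["text"].lower() for tag in entities.get("hashtags", [])]


# Stand-in data shaped like tweet._json; the values are invented
sample = [
    {"entities": {"hashtags": [{"text": "Trump", "indices": [0, 6]}]}},
    {"entities": {"hashtags": [{"text": "trump", "indices": [3, 9]},
                               {"text": "Obama", "indices": [10, 16]}]}},
    {"entities": {"hashtags": []}},
]

counts = Counter(tag for tweet in sample for tag in hashtags_of(tweet))
print(counts.most_common())
```

With real data the list would be built as `[t._json for t in tweets]`.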

# Avoid Rate Limit Exception

In [None]:
import time
def limit_handler(cursor):
    while True:
        try:
            yield cursor.next()
        except tweepy.RateLimitError:
            print "Timeout Reached, I'm going to sleep for 15 Minutes"
            time.sleep(15*60)
            print "I'm going to try again!"
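
The handler above relies on the Python 2 `cursor.next()` method and on `StopIteration` escaping the generator to end the loop. As a hedged variant that also runs on Python 3, the sketch below uses the built-in `next()`, stops explicitly when the iterator is exhausted, and takes the exception class and the sleep function as parameters so it can be tried without network access. The parameter names are assumptions made here; with tweepy one would pass `tweepy.RateLimitError` as `retry_on`.

```python
import time


def limit_handler(iterator, retry_on, wait_seconds=15 * 60, sleep=time.sleep):
    """Yield items from iterator, pausing and retrying whenever retry_on is raised."""
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            return                 # nothing left to fetch
        except retry_on:
            sleep(wait_seconds)    # wait for the rate-limit window to reset
            continue
        yield item
```

A call such as `limit_handler(intesa_cursor.pages(), tweepy.RateLimitError)` would then be used in place of the version above.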

In [None]:
alexcomu_cursor = tweepy.Cursor(api.followers, screen_name='comualex')

alexcomu_followers = []
for followers in limit_handler(alexcomu_cursor.pages()):
    alexcomu_followers.extend(followers)
    

In [None]:
len(alexcomu_followers)

# Live Streaming

Check the complete example in the folder **esercitazione**.

In [53]:
class BDStreamingListener(tweepy.StreamListener):
    def __init__(self, count):
        super(BDStreamingListener, self).__init__()
        # Number of tweets we want to retrieve
        self.count = count

    def on_status(self, status):
        # called automatically when a new tweet is received
        # print dir(status)
        print dict(user=status.user.screen_name, text=status.text)

        self.count -= 1
        if self.count <= 0:
            return False

    def on_error(self, status_code):
        # called automatically when an error occurs
        print "Error with status code: ", status_code
        return False

In [54]:
# Create an instance and set the number of tweets we want to retrieve
listener = BDStreamingListener(5)   # stop at the fifth tweet received, so only the first 5 are printed

# Create the stream fetching object with auth and listener
stream = tweepy.streaming.Stream(auth, listener)

# Run the stream using filter
stream.filter(track=['#Trump'])


{'text': u'#Obama se re\xfane con l\xedderes europeos entre dudas sobre #Trump https://t.co/USlYYLURrE', 'user': u'Rhakco'}
{'text': u'RT @joejusticeza: @Brianrrs37 @MakeWayForTay #Trump supporters?  Read? LOL!!!', 'user': u'MakeWayForTay'}
{'text': u'Trump: Grabbed It Mug https://t.co/E84amuJo8C\n#trump#protrump#trumpshirt#presidenttrump#grabbedit', 'user': u'soundthetrumpy'}
{'text': u'RT @DonaldJTrumpJr: Tune in tonight to @60Minutes for the first interview with our family since Election Day. #trump https://t.co/B2CfeMtuSE', 'user': u'jtblogs'}
{'text': u"As #Trump says his appointees aren't anti-Semitic or racist\nFacts prove opposite\nNever forget what he's doing, norm\u2026 https://t.co/tnc9cOrZxL", 'user': u'bcomininvisible'}
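
The countdown logic of the listener can be tried without a network connection. The sketch below is a plain class that mirrors the `on_status` behaviour of `BDStreamingListener` but stores the tweets instead of printing them. `FakeUser`, `FakeStatus` and `feed` are stand-ins introduced here, and the assumption is that the stream stops delivering tweets once `on_status` returns `False`, as the session above suggests.

```python
from collections import namedtuple

# Stand-ins for the objects passed to on_status (hypothetical, for offline use)
FakeUser = namedtuple("FakeUser", "screen_name")
FakeStatus = namedtuple("FakeStatus", "user text")


class CollectingListener(object):
    """Plain class following the same on_status logic as BDStreamingListener."""

    def __init__(self, count):
        self.count = count       # number of tweets still wanted
        self.collected = []

    def on_status(self, status):
        self.collected.append({"user": status.user.screen_name, "text": status.text})
        self.count -= 1
        if self.count <= 0:
            return False         # signal that no more tweets are wanted


def feed(listener, statuses):
    """Mimic a stream: deliver statuses until on_status returns False."""
    for status in statuses:
        if listener.on_status(status) is False:
            break


listener = CollectingListener(2)
feed(listener, [FakeStatus(FakeUser("u%d" % i), "tweet %d" % i) for i in range(5)])
print(listener.collected)
```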


# Get INTESA Followers -- Version 1

In [55]:
import time
from datetime import datetime as dt


# Ask for followers using Cursor (20 followers per page, limited to 15 requests every 15 minutes) ~ 3 hours
class IntesaFollowers(object):
    
    def __init__(self, auth):
        self.auth = auth
        self.api = tweepy.API(self.auth)
        self.intesa_cursor = tweepy.Cursor(self.api.followers, screen_name='intesasanpaolo')

    def get_followers(self):
        while True:
            try:
                yield self.intesa_cursor.pages().next()
            except StopIteration:
                # No more pages: without this branch the generic handler below
                # would catch StopIteration and the loop would never end
                break
            except tweepy.RateLimitError:
                print "[LOG %s] Timeout reached.. I'm going to sleep for 15 minutes.." % dt.now()
                time.sleep(15*60)   # sleep 15 * 60 seconds = 15 minutes
                print "[LOG %s] Try Again!" % dt.now()
            except Exception as e:
                # Generic Exception
                print "[LOG %s] Generic error " % dt.now(), e
                print "[LOG %s] Wait 60 seconds..." % dt.now()
                time.sleep(60)

In [59]:
intesa = IntesaFollowers(auth)
intesa_followers = []
counter = 0
for page in intesa.get_followers():
    print "Working ..", counter
    counter += 1
    intesa_followers.extend(page)   # each yielded value is a page (list) of users

[LOG 2016-11-18 16:21:16.878553] Timeout reached.. I'm going to sleep for 15 minutes..


KeyboardInterrupt: 

# Get INTESA Followers -- Version 2 (Faster)

In [None]:
# Ask for Followers_ids and ask data for each user -> Much Much Faster!  ~ 1.5 Hours
class IntesaFollowers(object):

    def __init__(self, auth):
        self.auth = auth
        self.api = tweepy.API(self.auth)
        self.intesa = self.api.get_user('intesasanpaolo')

    def get_followers(self):
        for follower_id in self.intesa.followers_ids():
            try:
                yield self.api.get_user(follower_id)
            except tweepy.RateLimitError:
                print "[LOG %s] Timeout reached.. I'm going to sleep for 15 minutes.." % dt.now()
                time.sleep(15*60)
                print "[LOG %s] Try Again!" % dt.now()
            except Exception as e:
                # Generic Exception
                print "[LOG %s] Generic error " % dt.now(), e
                print "[LOG %s] Wait 60 seconds..." % dt.now()
                time.sleep(60)

In [None]:
intesa = IntesaFollowers(auth)
intesa_followers = []
for follower in intesa.get_followers():
    intesa_followers.append(follower)
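
Requesting one user per call still spends one request per follower. Twitter's bulk user lookup accepts up to 100 IDs per request, so the ID list can be split into batches first. The sketch below shows only the batching step; `chunks` is a helper introduced here, the list of integers stands in for the real follower IDs, and passing each batch to `api.lookup_users(user_ids=batch)` assumes the installed tweepy version exposes that method.

```python
def chunks(ids, size=100):
    """Yield consecutive slices of ids with at most `size` elements each."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


follower_ids = list(range(4240))   # stand-in for the real list of follower IDs
batches = list(chunks(follower_ids))
print("batches: %d, size of last batch: %d" % (len(batches), len(batches[-1])))
```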

# Get INTESA tweets

In [7]:
class IntesaTweets(object):
    
    def __init__(self, auth):
        self.auth = auth
        self.api = tweepy.API(self.auth)
        self.intesa_cursor = tweepy.Cursor(self.api.user_timeline, screen_name='intesasanpaolo')

    def get_tweets(self):
        while True:
            try:
                yield self.intesa_cursor.pages().next()
            except StopIteration:
                # No more pages: without this branch the generic handler below
                # would catch StopIteration and the loop would never end
                break
            except tweepy.RateLimitError:
                print "[LOG %s] Timeout reached.. I'm going to sleep for 15 minutes.." % dt.now()
                time.sleep(15*60)
                print "[LOG %s] Try Again!" % dt.now()
            except Exception as e:
                # Generic Exception
                print "[LOG %s] Generic error " % dt.now(), e
                print "[LOG %s] Wait 60 seconds..." % dt.now()
                time.sleep(60)


In [8]:
intesa_timeline = IntesaTweets(auth)
intesa_tweets = []
for tweet in intesa_timeline.get_tweets():
    pp.pprint(tweet[0]._json)
    break

{ u'contributors': None,
  u'coordinates': None,
  u'created_at': u'Mon Nov 14 09:25:09 +0000 2016',
  u'entities': { u'hashtags': [ { u'indices': [0, 13],
                                  u'text': u'FlashMercati'}],
                 u'symbols': [],
                 u'urls': [ { u'display_url': u'twitter.com/i/web/status/7\u2026',
                              u'expanded_url': u'https://twitter.com/i/web/status/798094406406590464',
                              u'indices': [116, 139],
                              u'url': u'https://t.co/AOHmJBzUDk'}],
                 u'user_mentions': []},
  u'favorite_count': 0,
  u'favorited': False,
  u'geo': None,
  u'id': 798094406406590464,
  u'id_str': u'798094406406590464',
  u'in_reply_to_screen_name': None,
  u'in_reply_to_status_id': None,
  u'in_reply_to_status_id_str': None,
  u'in_reply_to_user_id': None,
  u'in_reply_to_user_id_str': None,
  u'is_quote_status': False,
  u'lang': u'it',
  u'place': None,
  u'possibly_sensitive': False,


# GET Intesa Favorites

# GET Intesa Friends

# GET Intesa Data