# Network Science


**GOAL OF THE SESSION**: Fetch data from Twitter APIs

**DATA SOURCE**: Twitter 

**DEVELOPMENT**: How to create a Python script to query Twitter APIs

**REQUIREMENTS**: 

    Twitter Developer Account
    tweepy 
    Python Pretty Print

### Pretty Print

Using **pprint** we can format in a nice way the print output. Here an example:

    import pprint
    pp = pprint.PrettyPrinter(indent=2)
    pp.pprint(OBJECT-TO-PRINT)


In [1]:
import pprint
#print help(pprint.PrettyPrinter)
pp = pprint.PrettyPrinter(indent=2)
names = {"name":"Alex", "Surname":"Comu", "list":[1,2,3],"address":{"street":"Via Maria Vittoria", "number":1}}
print "NON PRETTY\n", names, "\n"
print "PRETTY:"
pp.pprint(names)

NON PRETTY
{'list': [1, 2, 3], 'Surname': 'Comu', 'name': 'Alex', 'address': {'street': 'Via Maria Vittoria', 'number': 1}} 

PRETTY:
{ 'Surname': 'Comu',
  'address': { 'number': 1, 'street': 'Via Maria Vittoria'},
  'list': [1, 2, 3],
  'name': 'Alex'}


# Demo with Special Effects

Inside the folder **demo** you'll find a very cool demo, a super interaction between:

* Twitter API
* Python Web Server
* D3Js visualization

Read the file **Readme.md** to have more information about the example.

The goal of the demo is to create a Connector between my PC and Twitter. After the creation of this connector I want ro retrieve all the tweets that contains a specific **hashtags**.

At the end I'll represent the tweets in a dynamic data visualization with D3Js.

# Twitter Developer Account

Sign in @ [https://dev.twitter.com/](https://dev.twitter.com/) website and create an account if you need.

After the creation of the account we need to create a new Twitter APP to fetch the APIs, so go to [https://apps.twitter.com/](https://apps.twitter.com/) and create a new one.

To allow our APP to use the Twitter APIs we need to create an Access Token, so click on **Keys and Access Tokens** and create a new one.

And now we're ready to play with Twitter:)

## Twitter Documentation

[HERE](https://dev.twitter.com/overview/api) we can find a complete overview on Twitter API.

# Tweepy Installation

We need to install the package **tweepy**:

    pip install tweepy
    
We can find the documentation of the Library:

    http://tweepy.readthedocs.io/
    
## OAuth

First of all we need to save our credentials in variables. After that we can login on twitter and start use the APIs.


In [2]:
import tweepy
import pprint
pp = pprint.PrettyPrinter(indent=2)

CONSUMER_KEY = "Z4Fo2V9q7NnO7mS0vXblpaaX1"
SECRET_KEY = "i4pRrMoDrPKyUlFY3IVjvQs8xHkI6Z0NGxAqMiQv5qyxV5fUSJ"
ACCESS_TOKEN = "799612892467367936-sSc54p6RLDz0cob3QdgHBgtN8v9bagd"
SECRET_ACCESS_TOKEN = "6PbuUEtV3hY5fyuJjy5by0953J9ScS3vhnWFvE1dXZMPE"

In [3]:
# Twitter Authentication
auth = tweepy.OAuthHandler(CONSUMER_KEY, SECRET_KEY)
auth.set_access_token(ACCESS_TOKEN, SECRET_ACCESS_TOKEN)

In [4]:
# Create the connection to the api
api = tweepy.API(auth)
print api


<tweepy.api.API object at 0x7fb0ad176950>


In [5]:
help(api)

Help on API in module tweepy.api object:

class API(__builtin__.object)
 |  Twitter API
 |  
 |  Methods defined here:
 |  
 |  __init__(self, auth_handler=None, host='api.twitter.com', search_host='search.twitter.com', upload_host='upload.twitter.com', cache=None, api_root='/1.1', search_root='', upload_root='/1.1', retry_count=0, retry_delay=0, retry_errors=None, timeout=60, parser=None, compression=False, wait_on_rate_limit=False, wait_on_rate_limit_notify=False, proxy='')
 |      Api instance Constructor
 |      
 |      :param auth_handler:
 |      :param host:  url of the server of the rest api, default:'api.twitter.com'
 |      :param search_host: url of the search server, default:'search.twitter.com'
 |      :param upload_host: url of the upload server, default:'upload.twitter.com'
 |      :param cache: Cache to query if a GET method is used, default:None
 |      :param api_root: suffix of the api version, default:'/1.1'
 |      :param search_root: suffix of the search version,

In [6]:
api.rate_limit_status()

{u'rate_limit_context': {u'access_token': u'799612892467367936-sSc54p6RLDz0cob3QdgHBgtN8v9bagd'},
 u'resources': {u'account': {u'/account/login_verification_enrollment': {u'limit': 15,
    u'remaining': 15,
    u'reset': 1479481501},
   u'/account/settings': {u'limit': 15,
    u'remaining': 15,
    u'reset': 1479481501},
   u'/account/update_profile': {u'limit': 15,
    u'remaining': 15,
    u'reset': 1479481501},
   u'/account/verify_credentials': {u'limit': 75,
    u'remaining': 75,
    u'reset': 1479481501}},
  u'application': {u'/application/rate_limit_status': {u'limit': 180,
    u'remaining': 179,
    u'reset': 1479481501}},
  u'auth': {u'/auth/csrf_token': {u'limit': 15,
    u'remaining': 15,
    u'reset': 1479481501}},
  u'blocks': {u'/blocks/ids': {u'limit': 15,
    u'remaining': 15,
    u'reset': 1479481501},
   u'/blocks/list': {u'limit': 15, u'remaining': 15, u'reset': 1479481501}},
  u'business_experience': {u'/business_experience/dashboard_features': {u'limit': 450,
    u

## Tweet Stream

In [7]:
# download your home timeline tweets
my_tweets = api.home_timeline()

In [8]:
print "Tweets LEN: ", len(my_tweets), "\n"

Tweets LEN:  20 



In [9]:
# Dir Command on Tweet
print "TWEET DIR: ", dir(my_tweets[0]), "\n"
print help(my_tweets[0])

TWEET DIR:  ['__class__', '__delattr__', '__dict__', '__doc__', '__eq__', '__format__', '__getattribute__', '__getstate__', '__hash__', '__init__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_api', '_json', 'author', 'contributors', 'coordinates', 'created_at', 'destroy', 'entities', 'favorite', 'favorite_count', 'favorited', 'geo', 'id', 'id_str', 'in_reply_to_screen_name', 'in_reply_to_status_id', 'in_reply_to_status_id_str', 'in_reply_to_user_id', 'in_reply_to_user_id_str', 'is_quote_status', 'lang', 'parse', 'parse_list', 'place', 'retweet', 'retweet_count', 'retweeted', 'retweets', 'source', 'source_url', 'text', 'truncated', 'user'] 

Help on Status in module tweepy.models object:

class Status(Model)
 |  Method resolution order:
 |      Status
 |      Model
 |      __builtin__.object
 |  
 |  Methods defined here:
 |  
 |  __eq__(self, other)
 |  
 |  __ne__(self, other

In [10]:
# USER of first Tweet
print my_tweets[0].user

User(follow_request_sent=False, has_extended_profile=False, profile_use_background_image=True, _json={u'follow_request_sent': False, u'has_extended_profile': False, u'profile_use_background_image': True, u'default_profile_image': False, u'id': 2999307453, u'profile_background_image_url_https': u'https://abs.twimg.com/images/themes/theme1/bg.png', u'verified': False, u'translator_type': u'none', u'profile_text_color': u'333333', u'profile_image_url_https': u'https://pbs.twimg.com/profile_images/656344191035920384/HlpdLa4T_normal.jpg', u'profile_sidebar_fill_color': u'DDEEF6', u'entities': {u'description': {u'urls': []}}, u'followers_count': 4127, u'profile_sidebar_border_color': u'C0DEED', u'id_str': u'2999307453', u'profile_background_color': u'C0DEED', u'listed_count': 24, u'is_translation_enabled': False, u'utc_offset': -28800, u'statuses_count': 761, u'description': u'Raccogliamo le peggiori freddure e/o cazzate che girano nel Web e/o Whatsapp e ve le proponiamo a gratis.', u'friend

In [11]:
# First 3 tweets
for index, tw in enumerate(my_tweets):
    if index < 3:
        print tw.text, "\n"

La cosa più bella del mondo 

https://t.co/iMYRk3aGnx 

I treni merci in Francia sono molto gentili 



## My Followers

In [12]:
## fetch follewer lists
my_followers = api.followers()
print "My_Followers LEN: ", len(my_followers)

My_Followers LEN:  0


In [13]:
my_followers[0]

IndexError: list index out of range

In [None]:
print dir(my_followers[0])

In [None]:
print help(my_followers[0])

In [None]:
pp.pprint(my_followers[0]._json)

# Get External User

In [14]:
intesa = api.get_user("intesasanpaolo")
intesa

User(follow_request_sent=False, has_extended_profile=False, profile_use_background_image=False, _json={u'follow_request_sent': False, u'has_extended_profile': False, u'profile_use_background_image': False, u'profile_text_color': u'333333', u'default_profile_image': False, u'id': 393894382, u'profile_background_image_url_https': u'https://abs.twimg.com/images/themes/theme1/bg.png', u'verified': False, u'translator_type': u'none', u'profile_location': None, u'profile_image_url_https': u'https://pbs.twimg.com/profile_images/597680334474485760/Z6OMNC0B_normal.jpg', u'profile_sidebar_fill_color': u'DDEEF6', u'entities': {u'url': {u'urls': [{u'url': u'http://t.co/lmrSnJtN6Y', u'indices': [0, 22], u'expanded_url': u'http://www.intesasanpaolo.com', u'display_url': u'intesasanpaolo.com'}]}, u'description': {u'urls': []}}, u'followers_count': 4240, u'profile_sidebar_border_color': u'FFFFFF', u'id_str': u'393894382', u'profile_background_color': u'DBDBDB', u'listed_count': 134, u'status': {u'cont

In [171]:
help(intesa)

Help on User in module tweepy.models object:

class User(Model)
 |  Method resolution order:
 |      User
 |      Model
 |      __builtin__.object
 |  
 |  Methods defined here:
 |  
 |  follow(self)
 |  
 |  followers(self, **kargs)
 |  
 |  followers_ids(self, *args, **kargs)
 |  
 |  friends(self, **kargs)
 |  
 |  lists(self, *args, **kargs)
 |  
 |  lists_memberships(self, *args, **kargs)
 |  
 |  lists_subscriptions(self, *args, **kargs)
 |  
 |  timeline(self, **kargs)
 |  
 |  unfollow(self)
 |  
 |  ----------------------------------------------------------------------
 |  Class methods defined here:
 |  
 |  parse(cls, api, json) from __builtin__.type
 |  
 |  parse_list(cls, api, json_list) from __builtin__.type
 |  
 |  ----------------------------------------------------------------------
 |  Methods inherited from Model:
 |  
 |  __getstate__(self)
 |  
 |  __init__(self, api=None)
 |  
 |  __repr__(self)
 |  
 |  ----------------------------------------------------------

In [15]:
print intesa.followers_count

4240


In [16]:
friends = api.friends_ids('intesasanpaolo')
print len(friends)

179


In [17]:
likes = api.favorites('intesasanpaolo')
print len(likes)

20


In [26]:
likes[0]

Status(contributors=None, truncated=False, text=u'Nasce oggi PowerU Digital!  Grazie a @intesasanpaolo e @DeloitteItalia per credere con noi nel progetto @HumanAgeInsIT @ManpowerGroupIT', is_quote_status=False, in_reply_to_status_id=None, id=798107981736906753, favorite_count=4, _api=<tweepy.api.API object at 0x102cdf190>, author=User(follow_request_sent=False, has_extended_profile=False, profile_use_background_image=False, _json={u'follow_request_sent': False, u'has_extended_profile': False, u'profile_use_background_image': False, u'default_profile_image': False, u'id': 1166565846, u'profile_background_image_url_https': u'https://abs.twimg.com/images/themes/theme1/bg.png', u'verified': False, u'translator_type': u'none', u'profile_text_color': u'333333', u'profile_image_url_https': u'https://pbs.twimg.com/profile_images/761270969730301952/X8-H-Adj_normal.jpg', u'profile_sidebar_fill_color': u'DDEEF6', u'entities': {u'url': {u'urls': [{u'url': u'https://t.co/HIf5VJBnyO', u'indices': [0

In [18]:
intesa_followers_count =  intesa.followers_ids()
print len(intesa_followers_count)

4240


In [19]:
print intesa_followers_count[0]

799610634899980291


In [20]:
api.get_user(intesa_followers_count[0])

User(follow_request_sent=False, has_extended_profile=False, profile_use_background_image=True, _json={u'follow_request_sent': False, u'has_extended_profile': False, u'profile_use_background_image': True, u'profile_text_color': u'333333', u'default_profile_image': True, u'id': 799610634899980291, u'profile_background_image_url_https': None, u'verified': False, u'translator_type': u'none', u'profile_location': None, u'profile_image_url_https': u'https://abs.twimg.com/sticky/default_profile_images/default_profile_3_normal.png', u'profile_sidebar_fill_color': u'DDEEF6', u'entities': {u'description': {u'urls': []}}, u'followers_count': 0, u'profile_sidebar_border_color': u'C0DEED', u'id_str': u'799610634899980291', u'profile_background_color': u'F5F8FA', u'listed_count': 0, u'is_translation_enabled': False, u'utc_offset': None, u'statuses_count': 0, u'description': u'', u'friends_count': 0, u'location': u'', u'profile_link_color': u'1DA1F2', u'profile_image_url': u'http://abs.twimg.com/stic

# Cursor

In [None]:
print len(intesa_followers_count)

In [None]:
intesa_followers = intesa.followers()
print len(intesa_followers)

In [None]:
help(tweepy.Cursor)

In [21]:
intesa_cursor = tweepy.Cursor(api.followers, screen_name='intesasanpaolo')

In [None]:
print dir(intesa_cursor)

In [22]:
print intesa_cursor.items()

<tweepy.cursor.ItemIterator object at 0x7fb0ac484690>


In [25]:
mylist =intesa_cursor.items().next()
print mylist

User(follow_request_sent=False, has_extended_profile=False, profile_use_background_image=True, profile_sidebar_fill_color=u'E3E2DE', live_following=False, time_zone=u'Eastern Time (US & Canada)', id=140578968, description=u'The official channel for Cisco Financial Services Industry news, updates and events.', _api=<tweepy.api.API object at 0x7fb0ad176950>, verified=False, blocked_by=False, profile_text_color=u'634047', muting=False, profile_image_url_https=u'https://pbs.twimg.com/profile_images/727954347145891840/XdBPoMtc_normal.jpg', _json={u'follow_request_sent': False, u'has_extended_profile': False, u'profile_use_background_image': True, u'live_following': False, u'default_profile_image': False, u'id': 140578968, u'profile_background_image_url_https': u'https://pbs.twimg.com/profile_background_images/743499327/3559a428155964b2850f71bef72c0bcd.jpeg', u'translator_type': u'none', u'verified': False, u'blocked_by': False, u'profile_text_color': u'634047', u'muting': False, u'profile_i

I 2 codici seguenti chiamano i tweet di ISP fino a quando si raggiunge il limite

In [None]:
#for follower in intesa_cursor.items():
#    print follower

In [None]:
#mylist = []
#for follower in intesa_cursor.pages():
#    print mylist.extend(follower)
#    break

In [27]:
len(mylist)

TypeError: object of type 'User' has no len()

In [24]:
mylist[0]._json

NameError: name 'mylist' is not defined

In [None]:
for i, f in enumerate(mylist):
    print i, f.statuses_count

# Get Hashtags


In [None]:
tweets = []
for tweet in tweepy.Cursor(api.search, q='#trump').items(5):
    print tweet.text
    tweets.append(tweet)
print "\n-----\n"
print tweets[0]

# Avoid Rate Limit Exception

In [None]:
import time
def limit_handler(cursor):
    while True:
        try:
            yield cursor.next()
        except tweepy.RateLimitError:
            print "Timeout Reached, I'm going to sleep for 15 Minutes"
            time.sleep(15*60)
            print "I'm going to try again!"

# manca la gestione della fine totale della richiesta

In [None]:
alexcomu_cursor = tweepy.Cursor(api.followers, screen_name='comualex')

alexcomu_followers = []
for followers in limit_handler(alexcomu_cursor.pages()):
    alexcomu_followers.extend(followers)
    

In [None]:
len(alexcomu_followers)

# Live Streaming

Check the complete example on the folder **esercitazione**.

In [165]:
class BDStreamingListener(tweepy.StreamListener):
    def __init__(self, count):
        super(BDStreamingListener, self).__init__()
        # Number of tweets we want to retrieve
        self.count = count

    def on_status(self, status):
        # automatic called when a new tweet is received
        # print dir(status)
        print dict(user=status.user.screen_name, text=status.text)

        self.count -= 1
        if self.count <= 0:
            return False

    def on_error(self, status_code):
        # automatic called when an error occures
        print "Error with status code: ", status_code
        return False

In [168]:
# Create an instand set the number of tweets we want ro retrieve
listener = BDStreamingListener(5)

# Create the stream fetching object with auth and listener
stream = tweepy.streaming.Stream(auth, listener)

# Tun the stream using filter
stream.filter(track=['#Trump'])


{'text': u'RT @SWagenknecht: #Trump,#Brexit:SPD/CDU/GR\xdcNE immer nur geschockt&amp;machen weiter wie bisher. Soziale Wende \xfcberf\xe4llig! Mein TAZ-Interv http\u2026', 'user': u'ColdWarrior2000'}
{'text': u"RT @yevhenfedchenko: .@RT_America adding more fuel to American political turmoil with KKK rally celebrating #Trump. More 'civil war' nar\u2026 ", 'user': u'ofionnain'}
{'text': u'RT @adjunctprofessr: It\u2019s official \u2014 President-elect Donald Trump has won Michigan, bringing his electoral college count to 295.\n#Trump\nhtt\u2026', 'user': u'G6throughF5'}
{'text': u'Top Trends Switzerland-Nov11 15:40 CET\n#TedXZurich\n#BrazilGP\n#LeonardCohen\n#Trump\n#COP22\n\nhttps://t.co/3NQj9S4bHK', 'user': u'GeoHashTrend'}
{'text': u'RT @LEDOUAISIEN: #USA : le compte Twitter de @EastwoodUSA a \xe9t\xe9 suspendu pour son tweet de f\xe9licitations \xe0 #Trump\nLamentable..!!\u2026 ', 'user': u'Matrix1O1'}


# Get INTESA Followers -- Version 1

In [169]:
# Ask for Followers using Cursor (20 followers per page, with a limit of 15 requests each 15 minutes) ~ 3 Hours
class IntesaFollowers(object):
    
    def __init__(self, auth):
        self.auth = auth
        self.api = tweepy.API(self.auth)
        self.intesa_cursor = tweepy.Cursor(self.api.followers, screen_name='intesasanpaolo')

    def get_followers(self):
        while True:
            try:
                yield self.intesa_cursor.pages().next()
            except tweepy.RateLimitError:
                print "[LOG %s] Timeout reached.. I'm going to sleep for 15 minutes.." % dt.now()
                time.sleep(15*60)
                print "[LOG %s] Try Again!" % dt.now()
            except Exception as e:
                # Generic Exception
                print "[LOG %s] Generic error " % dt.now(), e
                print "[LOG %s] Wait 60 seconds..." % dt.now()
                time.sleep(60)

In [None]:
intesa = IntesaFollowers(auth)
intesa_followers = []
for followers in intesa.get_followers():
    intesa_followers.extend(followers)

# Get INTESA Followers -- Version 2 (Faster)

In [None]:
# Ask for Followers_ids and ask data for each user -> Much Much Faster!  ~ 1.5 Hours
class IntesaFollowers(object):

    def __init__(self, auth):
        self.auth = auth
        self.api = tweepy.API(self.auth)
        self.intesa = self.api.get_user('intesasanpaolo')

    def get_followers(self):
        for follower_id in self.intesa.followers_ids():
            try:
                yield self.api.get_user(follower_id)
            except tweepy.RateLimitError:
                print "[LOG %s] Timeout reached.. I'm going to sleep for 15 minutes.." % dt.now()
                time.sleep(15*60)
                print "[LOG %s] Try Again!" % dt.now()
            except Exception as e:
                # Generic Exception
                print "[LOG %s] Generic error " % dt.now(), e
                print "[LOG %s] Wait 60 seconds..." % dt.now()
                time.sleep(60)

In [None]:
intesa = IntesaFollowers(auth)
intesa_followers = []
for follower in intesa.get_followers():
    intesa_followers.append(follower)

# Get INTESA tweets

In [7]:
class IntesaTweets(object):
    
    def __init__(self, auth):
        self.auth = auth
        self.api = tweepy.API(self.auth)
        self.intesa_cursor = tweepy.Cursor(self.api.user_timeline, screen_name='intesasanpaolo')

    def get_tweets(self):
        while True:
            try:
                yield self.intesa_cursor.pages().next()
            except tweepy.RateLimitError:
                print "[LOG %s] Timeout reached.. I'm going to sleep for 15 minutes.." % dt.now()
                time.sleep(15*60)
                print "[LOG %s] Try Again!" % dt.now()
            except Exception as e:
                # Generic Exception
                print "[LOG %s] Generic error " % dt.now(), e
                print "[LOG %s] Wait 60 seconds..." % dt.now()
                time.sleep(60)


In [8]:
intesa_timeline = IntesaTweets(auth)
intesa_tweets = []
for tweet in intesa_timeline.get_tweets():
    pp.pprint(tweet[0]._json)
    break

{ u'contributors': None,
  u'coordinates': None,
  u'created_at': u'Mon Nov 14 09:25:09 +0000 2016',
  u'entities': { u'hashtags': [ { u'indices': [0, 13],
                                  u'text': u'FlashMercati'}],
                 u'symbols': [],
                 u'urls': [ { u'display_url': u'twitter.com/i/web/status/7\u2026',
                              u'expanded_url': u'https://twitter.com/i/web/status/798094406406590464',
                              u'indices': [116, 139],
                              u'url': u'https://t.co/AOHmJBzUDk'}],
                 u'user_mentions': []},
  u'favorite_count': 0,
  u'favorited': False,
  u'geo': None,
  u'id': 798094406406590464,
  u'id_str': u'798094406406590464',
  u'in_reply_to_screen_name': None,
  u'in_reply_to_status_id': None,
  u'in_reply_to_status_id_str': None,
  u'in_reply_to_user_id': None,
  u'in_reply_to_user_id_str': None,
  u'is_quote_status': False,
  u'lang': u'it',
  u'place': None,
  u'possibly_sensitive': False,


# GET Intesa Favorites

# GET Intesa Friends

# GET Intesa Data