<h3>Scraping Twitter: University Marketplace</h3>

Introduction

This notebook uses the Twitter API to extract tweets about buying and selling used items. After retrieving the tweets that match our search term, we save the records in three MySQL tables: tweet, tweet_mentions, and tweet_tags.

<h3>Importing Essential Libraries</h3>

In [19]:
import tweepy
import configparser
import pandas as pd
import pymysql
pymysql.install_as_MySQLdb()
import mysql.connector
from mysql.connector import Error

<h3>Authentication keys</h3>

Here we define the keys used to authenticate with the Twitter API so that we can call its functions and extract tweets for our analysis.

You need to register for a Twitter developer account at https://developer.twitter.com

The Twitter data model is described at https://developer.twitter.com/en/docs/twitter-api/v1/data-dictionary/object-model/tweet

Step 1: Apply for a Twitter Developer Account

Go to the Twitter developer site to apply for a developer account.

Step 2: Create an Application

Twitter grants authentication credentials to apps, not accounts. An app can be any tool or bot that uses the Twitter API, so you need to register an app before you can make API calls.

To register your app, go to your Twitter apps page and select the Create an app option.

You need to provide the following information about your app and its purpose:

App name: a name to identify your application (such as examplebot)

Application description: the purpose of your application (such as An example bot for a Real Python article)

Your or your application’s website URL: required, but can be your personal site’s URL since bots don’t need a URL to work

Use of the app: how users will use your app (such as This app is a bot that will automatically respond to users)

Step 3: Create the Authentication Credentials

To create the authentication credentials, go to your Twitter apps page. There you’ll find the Details button of your app. Clicking this button takes you to the next page, where you can generate the credentials.

On the Keys and tokens tab, you can generate and copy the key, token, and secrets. After generating the credentials, save them so you can use them later in your code.

In [21]:
# read config from config.ini file
config = configparser.ConfigParser()
config.read('config.ini')

api_key = config['twitter']['api_key']
api_key_secret = config['twitter']['api_key_secret']
access_token = config['twitter']['access_token']
access_token_secret = config['twitter']['access_token_secret']
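The cell above expects a config.ini file with a [twitter] section, but the file itself is not shown. The following is a minimal sketch of what that file might contain and how missing keys could be caught early. The placeholder values, the `SAMPLE_CONFIG` string, and the `load_twitter_credentials` helper are assumptions for illustration, not part of the original notebook.

```python
import configparser

# Hypothetical config.ini contents; the values are placeholders, not real keys
SAMPLE_CONFIG = """
[twitter]
api_key = YOUR_API_KEY
api_key_secret = YOUR_API_KEY_SECRET
access_token = YOUR_ACCESS_TOKEN
access_token_secret = YOUR_ACCESS_TOKEN_SECRET
"""

REQUIRED_KEYS = ("api_key", "api_key_secret", "access_token", "access_token_secret")


def load_twitter_credentials(parser):
    """Return the four credentials as a dict, or raise KeyError if any is missing."""
    if not parser.has_section("twitter"):
        raise KeyError("config is missing the [twitter] section")
    missing = [k for k in REQUIRED_KEYS
               if not parser.get("twitter", k, fallback="").strip()]
    if missing:
        raise KeyError("missing Twitter credentials: " + ", ".join(missing))
    return {k: parser["twitter"][k] for k in REQUIRED_KEYS}


# read_string is used here so the sketch runs without a file on disk;
# in the notebook, config.read('config.ini') plays the same role
sample_parser = configparser.ConfigParser()
sample_parser.read_string(SAMPLE_CONFIG)
credentials = load_twitter_credentials(sample_parser)
print(sorted(credentials))
```

Keeping the keys in a separate file, excluded from version control, avoids publishing them along with the notebook.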

<h3>Authentication</h3>

As noted above, the Twitter API requires all requests to be authenticated with OAuth, so you need the credentials created in the previous section. These credentials are four text strings: the consumer key, consumer secret, access token, and access secret. In the code they are named api_key, api_key_secret, access_token, and access_token_secret.

In [22]:
# Use Tweepy to authenticate with the API key and access token
auth = tweepy.OAuthHandler(api_key, api_key_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth, wait_on_rate_limit=True)

# Check that the API credentials are valid
try:
    api.verify_credentials()
    print("Authentication OK")
except tweepy.TweepyException as e:
    # Catch only Tweepy errors so unrelated bugs are not hidden
    print("Error during authentication:", e)

Authentication OK


<h3>Extracting Tweets</h3>

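We use the search_tweets function to get tweets containing the keywords 'selling a table'. The standard search endpoint returns at most 100 tweets per request and only covers roughly the last seven days, so a single call does not return every matching tweet.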
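Collecting more than one page of results means issuing several requests, each asking for tweets older than the ones already seen. The sketch below illustrates that idea with the `max_id` parameter, which search_tweets accepts. The `collect_tweets` function and the `fetch_page` callable are hypothetical names, and an in-memory list stands in for the API so the example runs without credentials.

```python
def collect_tweets(fetch_page, limit, page_size=100):
    """Page backwards through search results until `limit` tweets are collected.

    fetch_page(max_id, count) is a hypothetical callable. With Tweepy it could
    wrap api.search_tweets(q=..., count=count, max_id=max_id); real results are
    Status objects, so t["id"] below would become t.id.
    """
    collected = []
    max_id = None
    while len(collected) < limit:
        count = min(page_size, limit - len(collected))
        page = fetch_page(max_id, count)
        if not page:
            break  # no older results are available
        collected.extend(page)
        # Next request: only tweets strictly older than the oldest one seen
        max_id = min(t["id"] for t in page) - 1
    return collected


# Stand-in data source (newest first) so the sketch runs without API access
_fake_tweets = [{"id": i, "text": "tweet %d" % i} for i in range(250, 0, -1)]


def fake_fetch_page(max_id, count):
    older = [t for t in _fake_tweets if max_id is None or t["id"] <= max_id]
    return older[:count]


tweets_sample = collect_tweets(fake_fetch_page, limit=230)
print(len(tweets_sample))
```

Tweepy also provides `tweepy.Cursor`, which handles this kind of paging itself, for example `tweepy.Cursor(api.search_tweets, q='selling a table', count=100).items(500)`.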

In [23]:
# Search recent tweets matching the text; the standard search API returns at most 100 tweets per request
tweets = api.search_tweets(q='selling a table', count=100)

# Checking the database connection 
try:
    connection = mysql.connector.connect(host='localhost',
                                         database='twitter_schema',
                                         user='root',
                                         password='root')
    if connection.is_connected():
        db_Info = connection.get_server_info()
        print("Connected to MySQL Server version ", db_Info)
        cursor = connection.cursor()
        cursor.execute("select database();")
        record = cursor.fetchone()
        print("You're connected to database: ", record)
except Error as e:
    print("Error while connecting to MySQL", e)



Connected to MySQL Server version  8.0.31
You're connected to database:  ('twitter_schema',)


<h3>Loading the tweets into the MySQL database</h3>

In [24]:
# Loop over the tweets, extract the required fields, and insert them into three tables: tweet, tweet_mentions, tweet_tags
for tweet in tweets:
    tweet_id = tweet.id
    created_at = tweet.created_at
    twitter_text = tweet.text
    username = tweet.user.screen_name
    name = tweet.user.name
    userId = tweet.user.id
    follower_count = tweet.user.followers_count
    following_count = tweet.user.friends_count
    twitter_handle = tweet.user.screen_name
    profile_image_url = tweet.user.profile_image_url_https
    description = tweet.user.description
    userCreated_at = tweet.user.created_at
    # The search result already carries the retweet count, so no extra api.get_status() call is needed
    retweet_count = tweet.retweet_count

    cursor.execute('''insert into tweet (twitter_handle, tweet_text, profile_image_url, tweet_date, user_created_at, retweets) values (%s, %s, %s, %s, %s, %s);''', (twitter_handle, twitter_text, profile_image_url, created_at, userCreated_at, retweet_count))
    # Save the id of the new tweet row now; later inserts change the cursor's last insert id
    tweet_row_id = cursor.lastrowid

    # Reset per tweet so a tweet without mentions does not reuse a stale or undefined value
    target_user = None
    for mention in tweet.entities['user_mentions']:
        target_user = mention['screen_name']
        cursor.execute('''insert into tweet_mentions (tweet_id, source_user, target_user) values (%s, %s, %s)''', (tweet_row_id, twitter_handle, target_user))

    for tag in tweet.entities['hashtags']:
        tag_text = tag['text']
        cursor.execute('''insert into tweet_tags (tweet_id, tag, target_user) values (%s, %s, %s)''', (tweet_row_id, tag_text, target_user))

    # Commit once per tweet so the tweet and its mentions and tags are saved together
    connection.commit()
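Rows in tweet_mentions and tweet_tags need the id of their parent row in tweet, and that id is only reliable if it is read immediately after the parent insert. The following sketch shows the pattern in isolation. It uses SQLite from the standard library as a stand-in for MySQL so it runs anywhere; the simplified schema and the sample tweet are assumptions, and SQLite uses `?` placeholders where MySQL Connector uses `%s`.

```python
import sqlite3

# SQLite stands in for MySQL here; the schema below is an assumption
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
create table tweet (tweet_id integer primary key autoincrement,
                    twitter_handle text, tweet_text text);
create table tweet_mentions (id integer primary key autoincrement,
                             tweet_id integer, source_user text, target_user text);
create table tweet_tags (id integer primary key autoincrement,
                         tweet_id integer, tag text);
""")

# Hypothetical tweet, reduced to the fields this sketch needs
sample = {"handle": "alice",
          "text": "Selling a table @bob @carol #forsale",
          "mentions": ["bob", "carol"],
          "hashtags": ["forsale"]}

cur.execute("insert into tweet (twitter_handle, tweet_text) values (?, ?)",
            (sample["handle"], sample["text"]))
# Capture the parent id before any child insert changes cur.lastrowid
tweet_row_id = cur.lastrowid

cur.executemany(
    "insert into tweet_mentions (tweet_id, source_user, target_user) values (?, ?, ?)",
    [(tweet_row_id, sample["handle"], m) for m in sample["mentions"]])
cur.executemany(
    "insert into tweet_tags (tweet_id, tag) values (?, ?)",
    [(tweet_row_id, t) for t in sample["hashtags"]])
conn.commit()

mention_rows = cur.execute(
    "select tweet_id, target_user from tweet_mentions order by id").fetchall()
tag_rows = cur.execute("select tweet_id, tag from tweet_tags").fetchall()
print(mention_rows, tag_rows)
```

Every child row carries the same parent id, even though the cursor's last insert id changes with each insert into the child tables.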