tweepy

Official documentation: https://docs.tweepy.org/en/stable/

Ref: https://dev.to/twitterdev/a-comprehensive-guide-for-using-the-twitter-api-v2-using-tweepy-in-python-15d9

In [1]:
import tweepy
import datetime
import pandas as pd
import json
import requests

In [2]:
# Paste in your own bearer token
BEARER_TOKEN = ''

client = tweepy.Client(bearer_token=BEARER_TOKEN, return_type=requests.Response)
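
Hard-coding the bearer token in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of an alternative, assuming the token is stored in an environment variable; the variable name `TWITTER_BEARER_TOKEN` and the helper `load_bearer_token` are hypothetical names chosen for illustration, not part of Tweepy:

```python
import os

def load_bearer_token(env_var="TWITTER_BEARER_TOKEN"):
    """Return the bearer token stored in the given environment variable.

    The variable name is an assumption for this sketch; use whatever
    name you export in your own shell or notebook environment.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return token

# Possible usage with the client created above:
# client = tweepy.Client(bearer_token=load_bearer_token(),
#                        return_type=requests.Response)
```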

In [6]:
'''
Searching for Tweets from the last 7 days

We can use the search_recent_tweets function available in Tweepy.
You will have to pass it a search query to specify the data that you are looking for.

Below, we search for Tweets from the last 7 days posted by the
Twitter handle KimKardashian, excluding retweets with -is:retweet.

By default, a request returns 10 Tweets.
If you want more than 10 Tweets per request, you can specify that using the max_results parameter.
The maximum Tweets per request is 100.
'''
# Replace with your own search query
handle = 'KimKardashian'
query = 'from:' + handle + ' -is:retweet'

tweets1 = client.search_recent_tweets(
            query=query,
            tweet_fields=['context_annotations', 'created_at'],
            max_results=100)
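
Because a single request returns at most 100 Tweets, collecting more means following the `next_token` value that the API places in the `meta` object of each page. The sketch below shows one way to do that. `collect_all_pages` and its `fetch_page` argument are hypothetical names; `fetch_page` is assumed to be any callable that takes a pagination token (or `None` for the first page) and returns the decoded JSON dictionary.

```python
def collect_all_pages(fetch_page, max_pages=5):
    """Gather Tweets across several result pages.

    fetch_page(next_token) is assumed to return a decoded JSON dict
    shaped like the API response: {'data': [...], 'meta': {...}}.
    max_pages caps the number of requests so the loop always ends.
    """
    tweets = []
    next_token = None
    for _ in range(max_pages):
        payload = fetch_page(next_token)
        # 'data' may be absent when a page has no matching Tweets
        tweets.extend(payload.get('data', []))
        next_token = payload.get('meta', {}).get('next_token')
        if next_token is None:
            break  # no further pages
    return tweets

# Possible usage with the client and query defined above:
# fetch = lambda token: client.search_recent_tweets(
#     query=query, max_results=100, next_token=token).json()
# all_tweets = collect_all_pages(fetch)
```

Passing the request in as a callable keeps the loop independent of the network, so it can be tried out with canned dictionaries before spending any API quota.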

In [7]:
'''
Convert to pandas DataFrame
'''

# Decode the JSON body of the response into a dictionary
tweets_dict1 = tweets1.json()

# Extract the list of Tweets stored under the "data" key
tweets_data1 = tweets_dict1['data']

# Flatten the list of Tweet dictionaries into a pandas DataFrame
df1 = pd.json_normalize(tweets_data1)

df1

,id,created_at,text,context_annotations
0,1485698860496285700,2022-01-24T19:39:45.000Z,"My condolences go to Manfred’s family, friends...","[{'domain': {'id': '10', 'name': 'Person', 'de..."
1,1485698841265336323,2022-01-24T19:39:41.000Z,You always said beauty will save the world - a...,"[{'domain': {'id': '10', 'name': 'Person', 'de..."
2,1485698812630863873,2022-01-24T19:39:34.000Z,There was so much more for you to show the wor...,"[{'domain': {'id': '10', 'name': 'Person', 'de..."
3,1485698790677835776,2022-01-24T19:39:29.000Z,Manfred Thierry Mugler 💔 My heart breaks. Ther...,"[{'domain': {'id': '10', 'name': 'Person', 'de..."
4,1484593931157336065,2022-01-21T18:29:10.000Z,Just added new items to my #KardashianKloset h...,"[{'domain': {'id': '10', 'name': 'Person', 'de..."
5,1484568994463895552,2022-01-21T16:50:04.000Z,Always ✌🏼💗 https://t.co/w3KBqX7RbW,"[{'domain': {'id': '10', 'name': 'Person', 'de..."
6,1484287511341899776,2022-01-20T22:11:34.000Z,Just dropped NEW @skims Outdoor! Shop now at h...,"[{'domain': {'id': '10', 'name': 'Person', 'de..."
7,1483892525014282243,2022-01-19T20:02:02.000Z,You can expect a wide range of scent notes fro...,"[{'domain': {'id': '10', 'name': 'Person', 'de..."
8,1483892260605349888,2022-01-19T20:00:58.000Z,Coming Soon: Jeff Leatham II by @KKWFragrance....,"[{'domain': {'id': '10', 'name': 'Person', 'de..."
9,1483798357633945602,2022-01-19T13:47:50.000Z,Beach 🅿️arty https://t.co/0rlc0wo7iE,"[{'domain': {'id': '10', 'name': 'Person', 'de..."
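
The conversion above indexes `tweets_dict1['data']` directly. To my understanding, a search that matches nothing returns a body with only a `meta` object, so that lookup would raise a `KeyError`. The sketch below is one defensive way to pull rows out of the decoded response. `extract_rows` is a hypothetical helper, and the default field list simply mirrors the columns shown in the output above.

```python
def extract_rows(payload, fields=('id', 'created_at', 'text')):
    """Return one dict per Tweet, keeping only the requested fields.

    payload is the decoded JSON dict of a search response. A missing
    'data' key (assumed to mean "no results") yields an empty list
    instead of an exception.
    """
    rows = []
    for tweet in payload.get('data', []):
        rows.append({field: tweet.get(field) for field in fields})
    return rows

# Possible usage with the response from the cell above:
# df1 = pd.DataFrame(extract_rows(tweets1.json()))
```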


In [5]:
'''
If you want to get Tweets for a specific time-period, you can specify the time-period
using the start_time and end_time parameters, as shown in the example below:
'''

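# Replace with the start of your time period (UTC, ISO 8601).
# search_recent_tweets only covers the last 7 days, so both
# timestamps must fall inside that window.
start_time = '2022-01-20T00:00:00Z'

# Replace with the end of your time period (UTC, ISO 8601)
end_time = '2022-01-25T00:00:00Z'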


# Replace with your own search query
handle = 'elonmusk'
query = 'from:' + handle + ' -is:retweet'

tweets2 = client.search_recent_tweets(
            query=query,
            tweet_fields=['context_annotations', 'created_at'],
            start_time=start_time,
            end_time=end_time,
            max_results=100)

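'''
Convert to pandas DataFrame
'''
# tweets2 is a requests.Response, so decode the JSON body and
# take the list under "data" before normalizing
df2 = pd.json_normalize(tweets2.json()['data'])

tweets2.json()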

{'data': [{'text': '@business Great wisdom',
   'id': '1484458464806162432',
   'created_at': '2022-01-21T09:30:52.000Z',
   'context_annotations': [{'domain': {'id': '46',
      'name': 'Brand Category',
      'description': 'Categories within Brand Verticals that narrow down the scope of Brands'},
     'entity': {'id': '781974596157181956', 'name': 'Online Site'}},
    {'domain': {'id': '47',
      'name': 'Brand',
      'description': 'Brands and Companies'},
     'entity': {'id': '1139108298584518656', 'name': 'Bloomberg'}},
    {'domain': {'id': '10',
      'name': 'Person',
      'description': 'Named people in the world like Nelson Mandela'},
     'entity': {'id': '808713037230157824',
      'name': 'Elon Musk',
      'description': 'Elon Musk'}},
    {'domain': {'id': '65',
      'name': 'Interests and Hobbies Vertical',
      'description': 'Top level interests and hobbies groupings, like Food or Travel'},
     'entity': {'id': '781974596148793345', 'name': 'Business & finance
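
The hard-coded timestamps in the cell above stop working once they are more than 7 days old. A minimal sketch of computing the window relative to the current time instead is shown below. `recent_window` is a hypothetical helper. The 30-second margin on the end time is an assumption based on my understanding that the API rejects an `end_time` too close to the moment of the request.

```python
from datetime import datetime, timedelta, timezone

def recent_window(days_back, now=None):
    """Return (start_time, end_time) strings in UTC, e.g. '2022-01-20T00:00:00Z'.

    days_back must be positive and below 7 because the recent search
    endpoint only covers the last 7 days. `now` can be injected for
    testing and is expected to be timezone-aware in UTC.
    """
    if not 0 < days_back < 7:
        raise ValueError('days_back must be between 0 and 7 (exclusive)')
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(days=days_back)
    end = now - timedelta(seconds=30)  # assumed safety margin
    fmt = '%Y-%m-%dT%H:%M:%SZ'
    return start.strftime(fmt), end.strftime(fmt)

# Possible usage with the search call above:
# start_time, end_time = recent_window(5)
```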

In [18]:
'''
Writing Tweets to a text file

This example writes the search results held in df2 to a text file
in CSV format: one row per Tweet, one column per field.

Make sure to replace file_name with a name of your choosing.

If you wish to write only some fields (for example, just the Tweet IDs),
select those columns from the DataFrame before writing.
'''

file_name = 'tweets.txt'

df2.to_csv(file_name, index=False, encoding='utf-8')

# To use a different delimiter, such as a tab ('\t'), pass sep:
#    df2.to_csv(file_name, index=False, encoding='utf-8', sep='\t')
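
If only the Tweet IDs are needed, the file can also be written without pandas, one ID per line. The sketch below assumes `tweets` is the list found under the `data` key of the decoded response; `write_tweet_ids` is a hypothetical helper name.

```python
def write_tweet_ids(tweets, path):
    """Write the 'id' of each Tweet dict to path, one ID per line.

    Returns the number of IDs written.
    """
    count = 0
    with open(path, 'w', encoding='utf-8') as handle:
        for tweet in tweets:
            handle.write(f"{tweet['id']}\n")
            count += 1
    return count

# Possible usage with the response from the earlier cell:
# write_tweet_ids(tweets2.json().get('data', []), 'tweet_ids.txt')
```

Keeping the IDs as strings, as the API returns them, avoids any loss of precision that can occur when very large integers pass through floating-point columns.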