# Article Notebook for Scraping Twitter Using snscrape's Python Wrapper
<br>Package Github: https://github.com/JustAnotherArchivist/snscrape
<br>This notebook will be using the development version of snscrape

Article Read-Along: https://medium.com/better-programming/how-to-scrape-tweets-with-snscrape-90124ed006af

Twitter advanced search operators reference: https://github.com/igorbrigadir/twitter-advanced-search

### Notebook Author: Martin Beck
<b>Information current as of November 28th, 2020</b><br>

This notebook contains materials for scraping tweets from Twitter using snscrape's Python wrapper.

<b>Dependencies: </b> 
- Your <b>Python</b> version must be <b>3.8</b> or higher. The development version of snscrape will not work with Python 3.7 or lower. You can download the latest Python version [here](https://www.python.org/downloads/).
- <b>Development version of snscrape</b>. If you don't already have it, uncomment the pip install line in the cell below to install it from within the notebook.
- <b>Pandas</b>. Dataframes allow easy manipulation and indexing of data. This is a preference rather than a requirement, but it is what this notebook uses.
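
Before installing, it can help to confirm that the interpreter meets the 3.8 requirement listed above. A minimal sketch using only the standard library; the helper name `meets_minimum` is illustrative and not part of snscrape:

```python
import sys

MIN_VERSION = (3, 8)  # minimum stated in the dependency list above


def meets_minimum(version_info=None, minimum=MIN_VERSION):
    """Return True if the (major, minor, ...) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


if not meets_minimum():
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is too old "
          "for the development version of snscrape")
```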

In [1]:
# Run the pip install command below if you don't already have the library
# !pip install git+https://github.com/JustAnotherArchivist/snscrape.git

# Run the below command if you don't already have Pandas
# !pip install pandas

# Imports
import snscrape.modules.twitter as sntwitter
import pandas as pd

# Query by Username
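The code below scrapes tweets from each username in the `media` list and exports them to CSV files with Pandas. The `maxTweets` cap of 100 is commented out in the loop, so all available tweets are collected.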

In [None]:
# Setting variables to be used below
maxTweets = 100

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

health = ['WesternHSCTrust', 'NHSCTrust', 'setrust', 'BelfastTrust', 'HSC_NI', 'publichealthni', 'healthdpt']
ni = ['nidirect', 'niexecutive', 'InvestNI', 'niassembly', 'NIOgov', 'belfastcc', 'NISRA', 'FactCheckNI', 
      'CommunitiesNI', 'dptfinance', 'HeadNICS', 'daera_ni', 'Economy_NI', 'ADRC_NI']
# 'Education_NI', 'Ed_Authority'
pro = ['_NIMDTA', 'rcgp_ni', 'BMA_NI', 'NHSC_NI', 'HSCQI', 'PatientClient', 'RQIANews', 'healthcarelib', 'HSCInnovations', 'NIPEC_online']
emergency = ['PoliceServiceNI', 'NIFRSOFFICIAL', 'ParamedicsNi', 'NIAS999']
charity = ['IncludeYouth', 'CommunityNI', 'NICVA', 'AdviceNI', 'GroundworkNI', 'CharityCommNI', 'Age_NI', 'disabilityni', 'YouthActionNI', 
           'Mencap_NI', 'cedarfoundation', 'amhNI', 'AwareNI', 'InspireWBGroup', 'MindWisenv', 'ChildreninNI', 'MacmillanNI', 'cypsp', 
           'CancerFocusNI', 'NIHospice', 'nichildrenshosp', 'CancerFundChild', 'MarieCurieNI', 'actioncancer', 'nichildcom', 'TPHA_UK']
media = ['BBCNewsNI', 'BBCSpotlightNI', 'BBCOneNI', 'BBCnireland', 'bbcnipress', 'bbcradioulster', 'bbcnewsline', 
         'BelTel', 'coolfm', 'QUBelfast', 'UlsterUni', 'utv']

for user in media:
    # Using TwitterSearchScraper to scrape data and append tweets to list
    for i, tweet in enumerate(sntwitter.TwitterSearchScraper(f'from:{user}').get_items()):
        # include:nativeretweets
        # quoted_tweet_id:1546434503316570113 lang:en
        # conversation_id:1546434503316570113 lang:en
        #if i>maxTweets:
        #    break
        tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
        tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])
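
The `maxTweets` check in the loop above is commented out, so every available tweet is collected. If a cap is wanted, one option is `itertools.islice`, which stops after exactly the requested number of results (the commented-out `i>maxTweets` test would let `maxTweets + 1` tweets through, because `enumerate` starts at 0). A minimal sketch; `take` and the stand-in generator `fake_items` are illustrative names, and with snscrape the iterable would be the scraper's `get_items()` result:

```python
from itertools import islice


def take(items, max_items):
    """Return at most `max_items` elements from any iterable."""
    return list(islice(items, max_items))


def fake_items():
    """Stand-in for a scraper's get_items() generator (made-up data)."""
    n = 0
    while True:
        yield f"tweet-{n}"
        n += 1


first_three = take(fake_items(), 3)
print(first_three)
```

Because `islice` is lazy, the underlying generator is not asked for more items once the cap is reached.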
                         

In [4]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])
# Display first 5 entries; a notebook cell shows only its last expression, so only tweets_df2 appears
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/BBCNewsNI/status/154651613...,2022-07-11 15:25:53+00:00,1546516130944237570,"The former Irish soldier was described as a ""v...",BBCNewsNI,Belfast,,2,0,1,0,1546516130944237570,en,,,,,
1,https://twitter.com/BBCNewsNI/status/154649868...,2022-07-11 14:16:33+00:00,1546498685365882882,Mid and East Antrim Borough Council has launch...,BBCNewsNI,Belfast,,5,0,5,1,1546498685365882882,en,,,,,
2,https://twitter.com/BBCNewsNI/status/154643369...,2022-07-11 09:58:19+00:00,1546433696584138752,Two common cranes have successfully hatched tw...,BBCNewsNI,Belfast,[commoncrane],2,6,46,1,1546433696584138752,en,,,[https://twitter.com/BordnaMona],,
3,https://twitter.com/BBCNewsNI/status/154641497...,2022-07-11 08:43:56+00:00,1546414978080690177,A woman has died following a crash in County S...,BBCNewsNI,Belfast,,0,2,5,0,1546414978080690177,en,,,,,
4,https://twitter.com/BBCNewsNI/status/154638502...,2022-07-11 06:44:56+00:00,1546385029491826688,A 26-year-old man has been charged with grievo...,BBCNewsNI,Belfast,,0,2,1,0,1546385029491826688,en,,,,,


In [5]:
# Export dataframe into a CSV
tweets_df1.to_csv('media-tweets.csv', sep=',', index=False)
tweets_df2.to_csv('media-tweets-detailed.csv', sep=',', index=False)

# Query by Text Search
The code below runs a text search for tweets posted between July 11th, 2017 and July 11th, 2022, then exports the results to CSV files with Pandas. The `maxTweets` cap of 500 is commented out, so all matching tweets are collected.
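
The `since:` and `until:` operators in the search string take dates in `YYYY-MM-DD` form. A small sketch of building that string from `datetime.date` values rather than typing it by hand; `with_date_range` is a hypothetical helper, not part of snscrape:

```python
from datetime import date


def with_date_range(query, since, until):
    """Append since:/until: operators (YYYY-MM-DD) to a search query string."""
    if since >= until:
        raise ValueError("since must be earlier than until")
    return f"{query} since:{since.isoformat()} until:{until.isoformat()}"


terms = "(bowel cancer) OR (colon cancer) OR (colorectal cancer)"
query = with_date_range(terms, date(2017, 7, 11), date(2022, 7, 11))
print(query)
```

The resulting string can be passed to `sntwitter.TwitterSearchScraper` in the same way as the literal query used in this notebook.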

In [6]:
# Setting variables to be used below
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(bowel cancer) OR (colon cancer) OR (colorectal cancer) since:2017-07-11 until:2022-07-11').get_items()):
    #if i>maxTweets:
    #    break
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])
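
The same long `append` call is repeated in every scraping cell. One possible way to cut the repetition is to list the attribute names once and read them with `getattr`. This is a sketch only: `tweet_to_row` and `DETAIL_FIELDS` are hypothetical names, nested fields such as `tweet.user.username` are left out for brevity, and the `SimpleNamespace` object is a stand-in with made-up values rather than a real snscrape tweet:

```python
from types import SimpleNamespace

# Top-level attribute names as used in the cells above
DETAIL_FIELDS = ("url", "date", "id", "content", "hashtags", "replyCount",
                 "retweetCount", "likeCount", "quoteCount", "conversationId", "lang")


def tweet_to_row(tweet, fields=DETAIL_FIELDS):
    """Build one row, using None for any attribute the object lacks."""
    return [getattr(tweet, name, None) for name in fields]


# Stand-in object; a real run would pass the tweet objects yielded by get_items()
sample = SimpleNamespace(url="https://example.com/status/1", id=1,
                         content="hello", lang="en")
row = tweet_to_row(sample, fields=("url", "id", "content", "lang", "likeCount"))
print(row)
```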

In [7]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display first 5 entries from dataframe
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/Magnum_Hermosa/status/1546...,2022-07-10 23:52:51+00:00,1546281326290026502,This may be the worst diet for colorectal canc...,Magnum_Hermosa,Alabama,,0,0,0,0,1546281326290026502,en,,,,,
1,https://twitter.com/oncdoc7/status/15462808247...,2022-07-10 23:50:51+00:00,1546280824777179138,Exploring Treatment Options for High-Risk Stag...,oncdoc7,,,0,0,0,0,1546280824777179138,en,,,,,
2,https://twitter.com/workingshitout/status/1546...,2022-07-10 23:48:42+00:00,1546280282550140928,@SabinehazanMD @MhamadMhanna5 @10ecgramma @P_M...,workingshitout,,,1,0,1,0,1542301281703714816,en,,,"[https://twitter.com/SabinehazanMD, https://tw...",1.546276e+18,https://twitter.com/SabinehazanMD
3,https://twitter.com/abderazzakali/status/15462...,2022-07-10 23:48:07+00:00,1546280134335930373,This May Be the #1 Worst Diet for Colorectal C...,abderazzakali,,,0,0,0,0,1546280134335930373,en,,,,,
4,https://twitter.com/abderazzakali/status/15462...,2022-07-10 23:48:03+00:00,1546280117344866304,This May Be the #1 Worst Diet for Colorectal C...,abderazzakali,,,0,0,0,0,1546280117344866304,en,,,,,
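
In the output above, `Replied Tweet` appears as `1.546276e+18`. This is most likely because the column mixes integer IDs with missing values, so pandas stores it as floating point, and a 64-bit float cannot represent a 19-digit tweet ID exactly. One way to avoid this is to convert IDs to strings before building the dataframe. A minimal sketch in plain Python; `id_to_str` is a hypothetical helper:

```python
def id_to_str(value):
    """Return an ID as a string, or an empty string when it is missing."""
    return "" if value is None else str(value)


tweet_id = 1546516130944237570  # an ID from the username-query output earlier in this notebook

# Round-tripping through float changes the ID, which is why float columns are unsafe for IDs
survives_float = int(float(tweet_id)) == tweet_id
print(survives_float)

print(id_to_str(tweet_id), repr(id_to_str(None)))
```

In the scraping loop this would mean appending `id_to_str(tweet.inReplyToTweetId)` instead of the raw attribute.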


In [8]:
# Export dataframe into a CSV
tweets_df1.to_csv('text-query-tweets.csv', sep=',', index=False)
tweets_df2.to_csv('text-query-tweets-detailed.csv', sep=',', index=False)

# Query by Hashtag
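The code below searches by hashtag for English-language tweets posted between July 11th, 2017 and July 11th, 2022, then exports the results to CSV files with Pandas. The `maxTweets` cap of 500 is commented out, so all matching tweets are collected.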
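
The hashtag query used in this section follows a regular pattern: for each term it matches either two separate hashtags (`#bowel #cancer`) or the joined form (`#bowelcancer`). A sketch of generating that pattern from a list of terms; `hashtag_query` is a hypothetical helper, not part of snscrape:

```python
def hashtag_query(terms, suffix="cancer"):
    """For each term, match '#term #suffix' together or the joined '#termsuffix'."""
    parts = []
    for term in terms:
        parts.append(f"(#{term} #{suffix})")
        parts.append(f"#{term}{suffix}")
    return " OR ".join(parts)


query = hashtag_query(["bowel", "colon", "colorectal"])
print(query)
```

Date and language operators such as `since:`, `until:`, and `lang:en` can then be appended to the returned string.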

In [3]:
# Setting variables to be used below
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
#for i, tweet in enumerate(sntwitter.TwitterHashtagScraper('bowelcancer since:2017-07-11 until:2022-07-11').get_items()):
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(#bowel #cancer) OR #bowelcancer OR (#colon #cancer) OR #coloncancer OR (#colorectal #cancer) OR #colorectalcancer since:2017-07-11 until:2022-07-11 lang:en').get_items()):
    #if i>maxTweets:
    #    break
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])

In [4]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display first 5 entries from dataframe
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/stacy_hurt/status/15462761...,2022-07-10 23:32:20+00:00,1546276165635252224,"First of all, @jerrykelly13pga is the greatest...",stacy_hurt,"Pittsburgh, PA",[colorectalcancer],2,0,14,1,1546276165635252224,en,,https://twitter.com/ChampionsTour/status/15462...,"[https://twitter.com/jerrykelly13pga, https://...",,
1,https://twitter.com/BiachiTiago/status/1546274...,2022-07-10 23:24:02+00:00,1546274073960931328,"CAPOX 3 months, CAPOX 3 months, CAPOX 3 months...",BiachiTiago,"New York, NY",[colorectalcancer],1,19,78,0,1546274073960931328,en,,,,,
2,https://twitter.com/collett31uk/status/1546260...,2022-07-10 22:28:50+00:00,1546260182338002945,Coming to the end of the longest week of my li...,collett31uk,"Avonmouth, Bristol","[bowelcancer, cancerawareness, teamgrandon]",0,0,0,0,1546260182338002945,en,,,,,
3,https://twitter.com/the_poop_stick/status/1546...,2022-07-10 21:11:03+00:00,1546240608888623105,My Dad almost died of Colon Cancer.\nThis is o...,the_poop_stick,United States,"[guthealth, coloncancer, poop, easeyourpoop, h...",0,0,0,0,1546240608888623105,en,,,,,
4,https://twitter.com/MayoClinic/status/15462277...,2022-07-10 20:20:00+00:00,1546227761429336066,Mayo Clinic gastroenterologist Dr. Lisa Boardm...,MayoClinic,"Minnesota, Florida, Arizona","[ColorectalCancer, crcsm]",5,12,20,1,1546227761429336066,en,,,[https://twitter.com/MayoCancerCare],,


In [None]:
# Export dataframe into a CSV
tweets_df1.to_csv('hashtag-tweets.csv', sep=',', index=False)
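# Remove literal pipe characters; with regex=True a bare '|' is alternation and matches nothing useful, so escape it
tweets_df2 = tweets_df2.replace(r'\|', '', regex=True)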
tweets_df2.to_csv('hashtag-tweets-detailed.csv', sep=',', index=False)

In [9]:
#tweets_df3 = tweets_df2.replace(',','', regex=True)
tweets_df2.to_csv('hashtag-tweets-detailed.csv', sep=',', escapechar='\\', index=False)
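
Tweet text often contains commas, quotes, and newlines, which is what the quoting and `escapechar` options of a CSV writer are for. The sketch below uses the standard-library `csv` module (not pandas) with made-up rows to show that default quoting already round-trips such text without stripping any characters; `to_csv` exposes similar `quoting` and `escapechar` parameters:

```python
import csv
import io

rows = [
    ["Tweet Id", "Text"],
    ["1", 'He said "hi", then left'],
    ["2", "line one\nline two | with a pipe"],
]

buffer = io.StringIO()
csv.writer(buffer).writerows(rows)  # default QUOTE_MINIMAL quotes only fields that need it

buffer.seek(0)
restored = list(csv.reader(buffer))
print(restored == rows)
```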

# Location based
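
The cells in this section narrow a search geographically with the `near:` and `within:` search operators. A sketch of appending them to an existing query string; `near_query` is a hypothetical helper, and quoting place names that contain spaces or commas mirrors how the queries in this section are written:

```python
def near_query(query, place, radius_miles=None):
    """Append near: (and optionally within:) operators to a search query string."""
    # Quote place names containing spaces or commas so they stay a single operand
    needs_quotes = any(ch in place for ch in " ,")
    place_part = f'"{place}"' if needs_quotes else place
    result = f"{query} near:{place_part}"
    if radius_miles is not None:
        result += f" within:{radius_miles}mi"
    return result


terms = "(bowel cancer) OR (colon cancer) OR (colorectal cancer)"
belfast = near_query(terms, "Belfast", radius_miles=120)
uk = near_query(terms, "United Kingdom")
print(belfast)
print(uk)
```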

In [4]:
# bowel/colon/colorectal cancer near:Belfast within:120mi

# Setting variables to be used below
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(bowel cancer) OR (colon cancer) OR (colorectal cancer) near:Belfast within:120mi since:2017-07-11 until:2022-07-11 lang:en').get_items()):
    #if i>maxTweets:
    #    break
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])

In [5]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display first 5 entries from dataframe
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/PaulTurner140/status/15448...,2022-07-06 22:30:21+00:00,1544811014428016640,@bowelcanceruk Two years to the day since I un...,PaulTurner140,"Belfast, Northern Ireland",,2,0,0,0,1544811014428016640,en,,,"[https://twitter.com/bowelcanceruk, https://tw...",,
1,https://twitter.com/georgeinbelfast/status/154...,2022-06-29 19:41:10+00:00,1542231720774311939,BBC News - Bowel cancer: How to check your poo...,georgeinbelfast,Belfast Northern Ireland,,0,0,0,0,1542231720774311939,en,,,,,
2,https://twitter.com/Elizabe96273629/status/154...,2022-06-29 16:40:35+00:00,1542186275964440579,Deborah James - Podcaster | Bowel Cancer | The...,Elizabe96273629,"Northern Ireland, United Kingdom",,0,0,0,0,1542186275964440579,en,,,[https://twitter.com/YouTube],,
3,https://twitter.com/SteinJock/status/141608737...,2021-07-16 17:28:34+00:00,1416087376900861954,This time 3 years ago I was recovering from bo...,SteinJock,"Newcastle, Ireland",,45,5,442,0,1416087376900861954,en,,,,,
4,https://twitter.com/NewryTimes/status/13885097...,2021-05-01 15:05:05+00:00,1388509788800266242,A vital programme aimed at raising awareness o...,NewryTimes,Newry,[Newry],0,0,0,0,1388509788800266242,en,,,,,


In [6]:
# Export dataframe into a CSV
tweets_df1.to_csv('belfast-query-tweets.csv', index=False)
tweets_df2.to_csv('belfast-query-tweets-detailed.csv', index=False)

In [7]:
# Setting variables to be used below
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
#for i, tweet in enumerate(sntwitter.TwitterHashtagScraper('bowelcancer since:2017-07-11 until:2022-07-11').get_items()):
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(#bowel #cancer) OR #bowelcancer OR (#colon #cancer) OR #coloncancer OR (#colorectal #cancer) OR #colorectalcancer near:Belfast within:120mi since:2017-07-11 until:2022-07-11 lang:en').get_items()):
    #if i>maxTweets:
    #    break
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])

In [8]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display first 5 entries from dataframe
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/apallan/status/12077857206...,2019-12-19 22:12:10+00:00,1207785720661905408,ANN ALLAN : A REVIEW OF 2019 https://t.co/qHEg...,apallan,"Belfast, Northern Ireland","[bowelcancer, suicide, nichildrenshospice]",0,5,35,0,1207785720661905408,en,,,"[https://twitter.com/apallan, https://twitter....",,
1,https://twitter.com/LouiseBrogan_/status/11392...,2019-06-13 18:32:52+00:00,1139239229324701697,Ready to row tomorrow... #bowelcancer https://...,LouiseBrogan_,United Kingdom,[bowelcancer],1,0,1,0,1139239229324701697,en,,,,,


In [None]:
# Export dataframe into a CSV
tweets_df1.to_csv('belfast-hashtag-tweets.csv', index=False)
tweets_df2.to_csv('belfast-hashtag-tweets-detailed.csv', index=False)

In [31]:
# bowel/colon/colorectal cancer near:"United Kingdom"

# Setting variables to be used below
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
#for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(bowel cancer) OR (colon cancer) OR (colorectal cancer) near:"United Kingdom" since:2017-07-11 until:2022-07-11 lang:en').get_items()):
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(bowel cancer) OR (colon cancer) OR (colorectal cancer) near:"United Kingdom" lang:en').get_items()):
    #if i>maxTweets:
    #    break
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])

In [32]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display first 5 entries from dataframe
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/oratoba/status/15610849583...,2022-08-20 20:17:12+00:00,1561084958340415489,"@HughTeddy_ Black Turd...\n\n""Sorry, you have ...",oratoba,"Winchester, UK",,1,0,1,0,1561084594157494275,en,,,[https://twitter.com/HughTeddy_],1.561085e+18,https://twitter.com/HughTeddy_
1,https://twitter.com/WanjiruNjoya/status/156108...,2022-08-20 20:04:43+00:00,1561081816735440896,"@wil_da_beast630 Free healthcare: ""The NHS is ...",WanjiruNjoya,East Devon,,1,1,11,0,1561077558246088705,en,,,[https://twitter.com/wil_da_beast630],1.561082e+18,https://twitter.com/WanjiruNjoya
2,https://twitter.com/LucyCunliffexx/status/1561...,2022-08-20 19:41:18+00:00,1561075923427917825,Im so proud of my husband &amp; his incredible...,LucyCunliffexx,"Newton-le-Willows, England",,0,0,1,0,1561075923427917825,en,,,"[https://twitter.com/wbhospice, https://twitte...",,
3,https://twitter.com/dalehay/status/15610666766...,2022-08-20 19:04:33+00:00,1561066676698898438,"I've shared it before, and I'll share it again...",dalehay,"Swindon, United Kingdom","[Cancer, KeepFartsFunny]",0,0,1,0,1561066676698898438,en,,,[https://twitter.com/MeredithMCCF],,
4,https://twitter.com/HamidGhanaati/status/15610...,2022-08-20 17:06:25+00:00,1561036946188689411,@GRNesh17 Father God I come before you to pray...,HamidGhanaati,"Sedella, España",,0,0,0,0,1560638200057540610,en,,,[https://twitter.com/GRNesh17],1.560638e+18,https://twitter.com/GRNesh17


In [33]:
# Export dataframe into a CSV
tweets_df1.to_csv('unitedkingdom-query-tweets.csv', index=False)
tweets_df2.to_csv('unitedkingdom-query-tweets-detailed.csv', index=False)

In [34]:
# Setting variables to be used below
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
#for i, tweet in enumerate(sntwitter.TwitterHashtagScraper('bowelcancer since:2017-07-11 until:2022-07-11').get_items()):
#for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(#bowel #cancer) OR #bowelcancer OR (#colon #cancer) OR #coloncancer OR (#colorectal #cancer) OR #colorectalcancer near:"United Kingdom" since:2017-07-11 until:2022-07-11 lang:en').get_items()):
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(#bowel #cancer) OR #bowelcancer OR (#colon #cancer) OR #coloncancer OR (#colorectal #cancer) OR #colorectalcancer near:"United Kingdom" lang:en').get_items()):
    #if i>maxTweets:
    #    break
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])

In [35]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display first 5 entries from dataframe
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/Smudge92084582/status/1561...,2022-08-20 15:01:59+00:00,1561005631234080770,"Hi #oatomates , when you go swimming, do you h...",Smudge92084582,"England, United Kingdom","[oatomates, ileostomy, bowelcancer, stomaaware...",7,0,3,0,1561005631234080770,en,,,,,
1,https://twitter.com/LBofHounslow/status/156097...,2022-08-20 13:00:48+00:00,1560975133828866049,Did you know people who complete #bowelcancer ...,LBofHounslow,"Hounslow, west London","[bowelcancer, HealthyHounslow, NHS]",0,1,0,0,1560975133828866049,en,,,,,
2,https://twitter.com/BowelResearch/status/15609...,2022-08-20 10:00:51+00:00,1560929848096276480,"""In the space of four days, I went from feelin...",BowelResearch,"London, UK","[auguts, HaveYouGotTheGuts, bowelcancer, bowel...",0,0,0,0,1560929848096276480,en,,,,,
3,https://twitter.com/LSC_CA_ALLIANCE/status/156...,2022-08-20 07:45:00+00:00,1560895661033172996,The NHS is expanding #bowelcancer screening to...,LSC_CA_ALLIANCE,"Preston, England",[bowelcancer],0,1,1,0,1560895661033172996,en,,,,,
4,https://twitter.com/MrBakerKS2/status/15608933...,2022-08-20 07:35:46+00:00,1560893337686663168,"@OJBorg Morning OJ, from the middle of a very ...",MrBakerKS2,"Geneva, Switzerland","[EPICjourney, bowelcancer]",0,0,0,0,1560893337686663168,en,,,"[https://twitter.com/OJBorg, https://twitter.c...",,


In [36]:
# Export dataframe into a CSV
tweets_df1.to_csv('unitedkingdom-hashtag-tweets.csv', index=False)
tweets_df2.to_csv('unitedkingdom-hashtag-tweets-detailed.csv', index=False)

In [71]:
# bowel/colon/colorectal cancer near:"Dublin, Ireland" within:200mi

# Setting variables to be used below
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
#for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(bowel cancer) OR (colon cancer) OR (colorectal cancer) near:"Ireland" since:2017-07-11 until:2022-07-11 lang:en').get_items()):
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(bowel cancer) OR (colon cancer) OR (colorectal cancer) near:"Dublin, Ireland" within:200mi lang:en').get_items()):
    #if i>maxTweets:
    #    break
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])

In [72]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display first 5 entries from dataframe
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/johnekinsella/status/15606...,2022-08-19 16:58:49+00:00,1560672647440379904,@lisamorgan39 Really sorry to hear this Lisa. ...,johnekinsella,"Dublin, IRELAND",,0,0,0,0,1560208851772870657,en,,,[https://twitter.com/lisamorgan39],1.560209e+18,https://twitter.com/lisamorgan39
1,https://twitter.com/CathainSeo/status/15604809...,2022-08-19 04:17:15+00:00,1560480991214460928,@JacquelineCoug7 @worzel13 @Amojak2 @Tim_from_...,CathainSeo,"Longford, Ireland",,0,0,0,0,1559994596503031810,en,,,"[https://twitter.com/JacquelineCoug7, https://...",1.560427e+18,https://twitter.com/JacquelineCoug7
2,https://twitter.com/FionnualaMinto/status/1560...,2022-08-18 21:33:03+00:00,1560379269888581632,@gibigill All I can do is send love. I went ba...,FionnualaMinto,"Limerick, Ireland",,0,0,4,0,1559994596503031810,en,,,[https://twitter.com/gibigill],1.559995e+18,https://twitter.com/gibigill
3,https://twitter.com/CCPembs/status/15602398639...,2022-08-18 12:19:06+00:00,1560239863919779846,"If you’re living with or beyond bowel cancer, ...",CCPembs,Haverfordwest,,0,0,0,0,1560239863919779846,en,,,[https://twitter.com/bowelcanceruk],,
4,https://twitter.com/JanisLeary/status/15601738...,2022-08-18 07:56:37+00:00,1560173808060866560,Parents who have Dementia &amp; others simply ...,JanisLeary,Inishowen Co Donegal,,0,0,0,0,1560172618514415617,en,,,,1.560173e+18,https://twitter.com/JanisLeary


In [73]:
# Export dataframe into a CSV
tweets_df1.to_csv('ireland-query-tweets.csv', index=False)
tweets_df2.to_csv('ireland-query-tweets-detailed.csv', index=False)

In [83]:
# bowel/colon/colorectal cancer near:"Republic of Ireland" within:200mi

# Setting variables to be used below
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
#for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(bowel cancer) OR (colon cancer) OR (colorectal cancer) near:"Ireland" since:2017-07-11 until:2022-07-11 lang:en').get_items()):
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(bowel cancer) OR (colon cancer) OR (colorectal cancer) near:"Republic of Ireland" within:200mi lang:en').get_items()):
    #if i>maxTweets:
    #    break
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])

In [84]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display first 5 entries from dataframe
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/johnekinsella/status/15606...,2022-08-19 16:58:49+00:00,1560672647440379904,@lisamorgan39 Really sorry to hear this Lisa. ...,johnekinsella,"Dublin, IRELAND",,0,0,0,0,1560208851772870657,en,,,[https://twitter.com/lisamorgan39],1.560209e+18,https://twitter.com/lisamorgan39
1,https://twitter.com/CathainSeo/status/15604809...,2022-08-19 04:17:15+00:00,1560480991214460928,@JacquelineCoug7 @worzel13 @Amojak2 @Tim_from_...,CathainSeo,"Longford, Ireland",,0,0,0,0,1559994596503031810,en,,,"[https://twitter.com/JacquelineCoug7, https://...",1.560427e+18,https://twitter.com/JacquelineCoug7
2,https://twitter.com/FionnualaMinto/status/1560...,2022-08-18 21:33:03+00:00,1560379269888581632,@gibigill All I can do is send love. I went ba...,FionnualaMinto,"Limerick, Ireland",,0,0,4,0,1559994596503031810,en,,,[https://twitter.com/gibigill],1.559995e+18,https://twitter.com/gibigill
3,https://twitter.com/JanisLeary/status/15601738...,2022-08-18 07:56:37+00:00,1560173808060866560,Parents who have Dementia &amp; others simply ...,JanisLeary,Inishowen Co Donegal,,0,0,0,0,1560172618514415617,en,,,,1.560173e+18,https://twitter.com/JanisLeary
4,https://twitter.com/JohnMcCullough6/status/156...,2022-08-17 23:28:16+00:00,1560045877590368262,"@gibigill You have such quiet dignity, Gill. B...",JohnMcCullough6,"Belfast, Northern Ireland",,0,0,0,0,1559994596503031810,en,,,[https://twitter.com/gibigill],1.559995e+18,https://twitter.com/gibigill


In [85]:
# Export dataframe into a CSV
tweets_df1.to_csv('roi-query-tweets.csv', index=False)
tweets_df2.to_csv('roi-query-tweets-detailed.csv', index=False)
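
In the preview above, the `Replied Tweet` column is displayed as values such as `1.560209e+18`. When a column mixes integers with missing values, pandas stores it as floating point by default, and a 64-bit float cannot represent every tweet ID exactly. One possible workaround, sketched here with a hypothetical helper `id_to_str`, is to convert IDs to strings before appending them to the list.

```python
def id_to_str(tweet_id):
    """Convert an optional integer ID to a string, keeping missing values as None."""
    return None if tweet_id is None else str(tweet_id)

conversation_id = 1560208851772870657  # 'Conv. Id' value from the first preview row above

# A float round trip changes the value, which is why float-typed ID columns are unsafe
round_tripped = int(float(conversation_id))
print(round_tripped == conversation_id)  # False

# The string form survives unchanged
print(int(id_to_str(conversation_id)) == conversation_id)  # True
print(id_to_str(None))  # None
```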

In [74]:
# Setting variables to be used below
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
#for i, tweet in enumerate(sntwitter.TwitterHashtagScraper('bowelcancer since:2017-07-11 until:2022-07-11').get_items()):
#for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(#bowel #cancer) OR #bowelcancer OR (#colon #cancer) OR #coloncancer OR (#colorectal #cancer) OR #colorectalcancer near:"Ireland" since:2017-07-11 until:2022-07-11 lang:en').get_items()):
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(#bowel #cancer) OR #bowelcancer OR (#colon #cancer) OR #coloncancer OR (#colorectal #cancer) OR #colorectalcancer near:"Dublin, Ireland" within:200mi lang:en').get_items()):
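    # Restore the two lines below to stop after maxTweets tweets; while they are commented out, no cap is applied
    #if i >= maxTweets:
    #    break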
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])
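
The query strings in these cells chain several `OR` alternatives and then append `near:`, `within:` and `lang:`. If Twitter search gives the implicit AND (a space) higher precedence than `OR`, as Twitter's API query documentation describes, the location and language filters would apply only to the last alternative. Whether the web search backend used here behaves the same way is an assumption that has not been checked against these results. The sketch below shows one way to wrap all alternatives in an outer pair of parentheses; `build_query` and its parameters are hypothetical names, not part of snscrape.

```python
def build_query(alternatives, place, radius, lang="en"):
    """Join search alternatives with OR inside one group, then append the filters."""
    grouped = " OR ".join(alternatives)
    return f'({grouped}) near:"{place}" within:{radius} lang:{lang}'

hashtag_terms = [
    "(#bowel #cancer)", "#bowelcancer",
    "(#colon #cancer)", "#coloncancer",
    "(#colorectal #cancer)", "#colorectalcancer",
]
query = build_query(hashtag_terms, "Dublin, Ireland", "200mi")
print(query)
```

The resulting string could be passed to `sntwitter.TwitterSearchScraper(query)` in place of the literal query.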

In [75]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display first 5 entries from dataframe
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/fidelma4Europe/status/1560...,2022-08-18 19:40:04+00:00,1560350837100220422,Bless you girl and hold on #bowelcancer https:...,fidelma4Europe,"Galway, Ireland",[bowelcancer],1,0,1,0,1560350837100220422,en,,https://twitter.com/gibigill/status/1559994596...,,,
1,https://twitter.com/FionnualaMinto/status/1559...,2022-08-17 16:53:44+00:00,1559946593322901506,First chemo session today. So far feeling good...,FionnualaMinto,"Limerick, Ireland","[coloncancer, Cancer, positivity]",2,0,3,0,1559946593322901506,en,,,,,
2,https://twitter.com/martinebran/status/1558584...,2022-08-13 22:42:45+00:00,1558584874512023552,Watching #SkyNews and so admiring late Deborah...,martinebran,Dublin,"[SkyNews, BowelCancer, stoma]",2,0,2,0,1558584874512023552,en,,,,,
3,https://twitter.com/suzi1dore/status/154575383...,2022-07-09 12:56:47+00:00,1545753835242754050,Who needs to fly anywhere when you have beauti...,suzi1dore,"Rayne, Essex","[portmeirion, portmerion, wales, cancersurvivo...",4,1,71,0,1545753835242754050,en,,,,,
4,https://twitter.com/deanjones50/status/1482372...,2022-01-15 15:22:01+00:00,1482372505486733318,Had a #bowelcancer screening. All clear 🙂,deanjones50,ammanford,[bowelcancer],0,0,1,0,1482372505486733318,en,,,,,


In [76]:
# Export dataframe into a CSV
tweets_df1.to_csv('ireland-hashtag-tweets.csv', index=False)
tweets_df2.to_csv('ireland-hashtag-tweets-detailed.csv', index=False)

In [86]:
# Setting variables to be used below
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
#for i, tweet in enumerate(sntwitter.TwitterHashtagScraper('bowelcancer since:2017-07-11 until:2022-07-11').get_items()):
#for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(#bowel #cancer) OR #bowelcancer OR (#colon #cancer) OR #coloncancer OR (#colorectal #cancer) OR #colorectalcancer near:"Ireland" since:2017-07-11 until:2022-07-11 lang:en').get_items()):
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(#bowel #cancer) OR #bowelcancer OR (#colon #cancer) OR #coloncancer OR (#colorectal #cancer) OR #colorectalcancer near:"Republic of Ireland" within:200mi lang:en').get_items()):
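    # Restore the two lines below to stop after maxTweets tweets; while they are commented out, no cap is applied
    #if i >= maxTweets:
    #    break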
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])

In [87]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display first 5 entries from dataframe
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/fidelma4Europe/status/1560...,2022-08-18 19:40:04+00:00,1560350837100220422,Bless you girl and hold on #bowelcancer https:...,fidelma4Europe,"Galway, Ireland",[bowelcancer],1,0,1,0,1560350837100220422,en,,https://twitter.com/gibigill/status/1559994596...,,,
1,https://twitter.com/FionnualaMinto/status/1559...,2022-08-17 16:53:44+00:00,1559946593322901506,First chemo session today. So far feeling good...,FionnualaMinto,"Limerick, Ireland","[coloncancer, Cancer, positivity]",2,0,3,0,1559946593322901506,en,,,,,
2,https://twitter.com/martinebran/status/1558584...,2022-08-13 22:42:45+00:00,1558584874512023552,Watching #SkyNews and so admiring late Deborah...,martinebran,Dublin,"[SkyNews, BowelCancer, stoma]",2,0,2,0,1558584874512023552,en,,,,,
3,https://twitter.com/JillMacauley/status/153425...,2022-06-07 19:07:37+00:00,1534250747667300352,To all those struggling right now - There’s al...,JillMacauley,Rathfriland,"[RebelliousHope, YouMeBigC, BowelBabe, DameDeb...",4,11,92,0,1534250747667300352,en,,,"[https://twitter.com/bowelbabe, https://twitte...",,


In [88]:
# Export dataframe into a CSV
tweets_df1.to_csv('roi-hashtag-tweets.csv', index=False)
tweets_df2.to_csv('roi-hashtag-tweets-detailed.csv', index=False)

In [65]:
# bowel/colon/colorectal cancer near:"Belfast, Northern Ireland" within:200mi

# Setting variables to be used below
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
#for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(bowel cancer) OR (colon cancer) OR (colorectal cancer) near:"Northern Ireland" since:2017-07-11 until:2022-07-11 lang:en').get_items()):
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(bowel cancer) OR (colon cancer) OR (colorectal cancer) near:"Belfast, Northern Ireland" within:200mi lang:en').get_items()):
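    # Restore the two lines below to stop after maxTweets tweets; while they are commented out, no cap is applied
    #if i >= maxTweets:
    #    break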
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])

In [66]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display first 5 entries from dataframe
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/louise30223835/status/1560...,2022-08-19 20:50:44+00:00,1560731008881766401,@Liesl_1789 @MissPenguin1755 @caroldecker That...,louise30223835,"Stirling, Scotland",,1,0,0,0,1559941740634820608,en,,,"[https://twitter.com/Liesl_1789, https://twitt...",1.560729e+18,https://twitter.com/Liesl_1789
1,https://twitter.com/Onamattopinion/status/1560...,2022-08-19 18:53:42+00:00,1560701559364886529,@LoveMyHawks21 @kisuvior @MichaelBensonn @usyk...,Onamattopinion,"Scotland, United Kingdom",,2,0,1,0,1560658452086542336,en,,,"[https://twitter.com/LoveMyHawks21, https://tw...",1.560685e+18,https://twitter.com/LoveMyHawks21
2,https://twitter.com/johnekinsella/status/15606...,2022-08-19 16:58:49+00:00,1560672647440379904,@lisamorgan39 Really sorry to hear this Lisa. ...,johnekinsella,"Dublin, IRELAND",,0,0,0,0,1560208851772870657,en,,,[https://twitter.com/lisamorgan39],1.560209e+18,https://twitter.com/lisamorgan39
3,https://twitter.com/CathainSeo/status/15604809...,2022-08-19 04:17:15+00:00,1560480991214460928,@JacquelineCoug7 @worzel13 @Amojak2 @Tim_from_...,CathainSeo,"Longford, Ireland",,0,0,0,0,1559994596503031810,en,,,"[https://twitter.com/JacquelineCoug7, https://...",1.560427e+18,https://twitter.com/JacquelineCoug7
4,https://twitter.com/LisaChastie/status/1560330...,2022-08-18 18:20:41+00:00,1560330862327218176,Amazon delivery today @bowelbabe . An inspirat...,LisaChastie,"Scotland, United Kingdom","[bowelcancer, LifeLessons]",0,0,4,0,1560330862327218176,en,,,[https://twitter.com/bowelbabe],,


In [67]:
# Export dataframe into a CSV
tweets_df1.to_csv('northernireland-query-tweets.csv', index=False)
tweets_df2.to_csv('northernireland-query-tweets-detailed.csv', index=False)

In [68]:
# Setting variables to be used below
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
#for i, tweet in enumerate(sntwitter.TwitterHashtagScraper('bowelcancer since:2017-07-11 until:2022-07-11').get_items()):
#for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(#bowel #cancer) OR #bowelcancer OR (#colon #cancer) OR #coloncancer OR (#colorectal #cancer) OR #colorectalcancer near:"Northern Ireland" since:2017-07-11 until:2022-07-11 lang:en').get_items()):
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(#bowel #cancer) OR #bowelcancer OR (#colon #cancer) OR #coloncancer OR (#colorectal #cancer) OR #colorectalcancer near:"Belfast, Northern Ireland" within:200mi lang:en').get_items()):
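    # Restore the two lines below to stop after maxTweets tweets; while they are commented out, no cap is applied
    #if i >= maxTweets:
    #    break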
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])

In [69]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display first 5 entries from dataframe
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/LisaChastie/status/1560330...,2022-08-18 18:20:41+00:00,1560330862327218176,Amazon delivery today @bowelbabe . An inspirat...,LisaChastie,"Scotland, United Kingdom","[bowelcancer, LifeLessons]",0,0,4,0,1560330862327218176,en,,,[https://twitter.com/bowelbabe],,
1,https://twitter.com/Daily_Record/status/155960...,2022-08-16 18:10:00+00:00,1559603395761246208,Sophie has battled Bowel Cancer twice at just ...,Daily_Record,Glasgow,[BowelCancer],0,2,7,0,1559603395761246208,en,,,,,
2,https://twitter.com/Daily_Record/status/155952...,2022-08-16 13:09:00+00:00,1559527646635106304,So inspiring 👏 Keep it up Sophie! #BowelCancer...,Daily_Record,Glasgow,[BowelCancer],0,0,0,0,1559527646635106304,en,,,,,
3,https://twitter.com/Daily_Record/status/155943...,2022-08-16 07:10:00+00:00,1559437301293498368,Sophie is so inspiring! #BowelCancer https://t...,Daily_Record,Glasgow,[BowelCancer],0,1,3,0,1559437301293498368,en,,,,,
4,https://twitter.com/martinebran/status/1558584...,2022-08-13 22:42:45+00:00,1558584874512023552,Watching #SkyNews and so admiring late Deborah...,martinebran,Dublin,"[SkyNews, BowelCancer, stoma]",2,0,2,0,1558584874512023552,en,,,,,


In [70]:
# Export dataframe into a CSV
tweets_df1.to_csv('northernireland-hashtag-tweets.csv', index=False)
tweets_df2.to_csv('northernireland-hashtag-tweets-detailed.csv', index=False)
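
Because the search radii overlap, the same tweet can show up in more than one regional result set. Tweet 1558584874512023552, for example, appears in the Ireland, Republic of Ireland and Northern Ireland hashtag previews above. If the CSV files are combined later, a de-duplication step keyed on the tweet ID may be useful. A minimal sketch, with rows reduced to `[id, username]` pairs for illustration and `dedupe_by_id` as a hypothetical helper name:

```python
def dedupe_by_id(*row_lists, id_index=0):
    """Merge row lists, keeping the first row seen for each tweet ID."""
    seen = set()
    merged = []
    for rows in row_lists:
        for row in rows:
            tweet_id = row[id_index]
            if tweet_id not in seen:
                seen.add(tweet_id)
                merged.append(row)
    return merged

# IDs and usernames taken from the hashtag previews above
roi_rows = [[1560350837100220422, "fidelma4Europe"], [1558584874512023552, "martinebran"]]
ni_rows = [[1560330862327218176, "LisaChastie"], [1558584874512023552, "martinebran"]]

combined = dedupe_by_id(roi_rows, ni_rows)
print(len(combined))  # 3
```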

In [59]:
# bowel/colon/colorectal cancer near:"Edinburgh, Scotland" within:200mi

# Setting variables to be used below
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
#for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(bowel cancer) OR (colon cancer) OR (colorectal cancer) near:"Scotland" since:2017-07-11 until:2022-07-11 lang:en').get_items()):
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(bowel cancer) OR (colon cancer) OR (colorectal cancer) near:"Edinburgh, Scotland" within:200mi lang:en').get_items()):
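    # Restore the two lines below to stop after maxTweets tweets; while they are commented out, no cap is applied
    #if i >= maxTweets:
    #    break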
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])

In [60]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display first 5 entries from dataframe
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/LucyCunliffexx/status/1561...,2022-08-20 19:41:18+00:00,1561075923427917825,Im so proud of my husband &amp; his incredible...,LucyCunliffexx,"Newton-le-Willows, England",,0,0,1,0,1561075923427917825,en,,,"[https://twitter.com/wbhospice, https://twitte...",,
1,https://twitter.com/leemufc74/status/156101193...,2022-08-20 15:27:01+00:00,1561011932177600514,I'm fundraising for Bowel Cancer UK. Check out...,leemufc74,"Holywell, Wales",[JustGiving],0,0,1,0,1561011932177600514,en,,,[https://twitter.com/JustGiving],,
2,https://twitter.com/bowelcanceruk/status/15610...,2022-08-20 15:00:11+00:00,1561005181185163265,"""My hospital stay of 11 days was very pleasant...",bowelcanceruk,UK,,0,3,4,0,1561005181185163265,en,,,,,
3,https://twitter.com/pash22/status/156100265077...,2022-08-20 14:50:08+00:00,1561002650770018310,"As a cancer patient, I felt dismissed by docto...",pash22,United Kingdom,,0,0,1,0,1561002650770018310,en,,,[https://twitter.com/benbravery],,
4,https://twitter.com/Assistdotclaims/status/156...,2022-08-20 11:11:37+00:00,1560947658596007942,@TheGumFairy I’m two weeks into a bowel cancer...,Assistdotclaims,United Kingdom,,0,0,3,0,1560745229422297088,en,,,[https://twitter.com/TheGumFairy],1.560745e+18,https://twitter.com/TheGumFairy


In [61]:
# Export dataframe into a CSV
tweets_df1.to_csv('scotland-query-tweets.csv', index=False)
tweets_df2.to_csv('scotland-query-tweets-detailed.csv', index=False)

In [62]:
# Setting variables to be used below
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
#for i, tweet in enumerate(sntwitter.TwitterHashtagScraper('bowelcancer since:2017-07-11 until:2022-07-11').get_items()):
#for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(#bowel #cancer) OR #bowelcancer OR (#colon #cancer) OR #coloncancer OR (#colorectal #cancer) OR #colorectalcancer near:"Scotland" since:2017-07-11 until:2022-07-11 lang:en').get_items()):
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(#bowel #cancer) OR #bowelcancer OR (#colon #cancer) OR #coloncancer OR (#colorectal #cancer) OR #colorectalcancer near:"Edinburgh, Scotland" within:200mi lang:en').get_items()):
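    # Restore the two lines below to stop after maxTweets tweets; while they are commented out, no cap is applied
    #if i >= maxTweets:
    #    break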
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])

In [63]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display first 5 entries from dataframe
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/LSC_CA_ALLIANCE/status/156...,2022-08-20 07:45:00+00:00,1560895661033172996,The NHS is expanding #bowelcancer screening to...,LSC_CA_ALLIANCE,"Preston, England",[bowelcancer],0,1,1,0,1560895661033172996,en,,,,,
1,https://twitter.com/PennineBowelCSP/status/156...,2022-08-19 11:00:09+00:00,1560582386244431874,Apple &amp; linseed porridge\n#guthealth #bowe...,PennineBowelCSP,"Bury, England","[guthealth, bowelcancer]",0,1,1,0,1560582386244431874,en,,,,,
2,https://twitter.com/Assistdotclaims/status/156...,2022-08-19 08:57:19+00:00,1560551473385177088,@militaryhistori Thank you very much! I hope ...,Assistdotclaims,United Kingdom,[bowelcancer],0,0,0,0,1560550969682710528,en,,,[https://twitter.com/militaryhistori],1.560551e+18,https://twitter.com/militaryhistori
3,https://twitter.com/Assistdotclaims/status/156...,2022-08-19 08:31:10+00:00,1560544891402428416,@JoanneMCrompton I’ve got a vital MRI and CT s...,Assistdotclaims,United Kingdom,[bowelcancer],0,0,1,0,1560510097985601536,en,,,[https://twitter.com/JoanneMCrompton],1.56051e+18,https://twitter.com/JoanneMCrompton
4,https://twitter.com/LisaChastie/status/1560330...,2022-08-18 18:20:41+00:00,1560330862327218176,Amazon delivery today @bowelbabe . An inspirat...,LisaChastie,"Scotland, United Kingdom","[bowelcancer, LifeLessons]",0,0,4,0,1560330862327218176,en,,,[https://twitter.com/bowelbabe],,


In [64]:
# Export dataframe into a CSV
tweets_df1.to_csv('scotland-hashtag-tweets.csv', index=False)
tweets_df2.to_csv('scotland-hashtag-tweets-detailed.csv', index=False)
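
The cells so far repeat the same scrape, dataframe and export steps and change only the place, radius and output file name. One way to cut down the copy-and-paste is to keep those values in a small table and loop over it. The sketch below only builds the query strings and file names and does not call snscrape. `REGIONS` and `plan_runs` are hypothetical names, and the values are copied from the keyword-query cells above.

```python
TERMS = "(bowel cancer) OR (colon cancer) OR (colorectal cancer)"

# slug -> (place, radius), copied from the keyword-query cells above
REGIONS = {
    "roi": ("Republic of Ireland", "200mi"),
    "northernireland": ("Belfast, Northern Ireland", "200mi"),
    "scotland": ("Edinburgh, Scotland", "200mi"),
}

def plan_runs(regions, terms=TERMS):
    """Yield (query, summary_csv, detailed_csv) for each region."""
    for slug, (place, radius) in regions.items():
        query = f'{terms} near:"{place}" within:{radius} lang:en'
        yield query, f"{slug}-query-tweets.csv", f"{slug}-query-tweets-detailed.csv"

for query, summary_csv, detailed_csv in plan_runs(REGIONS):
    print(summary_csv, "<-", query)
```

Keeping the place and radius next to the file name also makes it harder for a header comment or file name to drift out of step with the query it describes.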

In [77]:
# bowel/colon/colorectal cancer near:"Cardiff, Wales" within:200mi

# Setting variables to be used below
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
#for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(bowel cancer) OR (colon cancer) OR (colorectal cancer) near:"Wales" since:2017-07-11 until:2022-07-11 lang:en').get_items()):
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(bowel cancer) OR (colon cancer) OR (colorectal cancer) near:"Cardiff, Wales" within:200mi lang:en').get_items()):
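    # Restore the two lines below to stop after maxTweets tweets; while they are commented out, no cap is applied
    #if i >= maxTweets:
    #    break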
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])

In [78]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display first 5 entries from dataframe
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/InspireAlpine/status/15611...,2022-08-20 23:06:26+00:00,1561127548897185792,"Thousands raised in memory of ""inspirational"" ...",InspireAlpine,"Wrexham, Wales",,0,0,0,0,1561127548897185792,en,,,,,
1,https://twitter.com/oratoba/status/15610849583...,2022-08-20 20:17:12+00:00,1561084958340415489,"@HughTeddy_ Black Turd...\n\n""Sorry, you have ...",oratoba,"Winchester, UK",,1,0,1,0,1561084594157494275,en,,,[https://twitter.com/HughTeddy_],1.561085e+18,https://twitter.com/HughTeddy_
2,https://twitter.com/WanjiruNjoya/status/156108...,2022-08-20 20:04:43+00:00,1561081816735440896,"@wil_da_beast630 Free healthcare: ""The NHS is ...",WanjiruNjoya,East Devon,,1,1,12,0,1561077558246088705,en,,,[https://twitter.com/wil_da_beast630],1.561082e+18,https://twitter.com/WanjiruNjoya
3,https://twitter.com/LucyCunliffexx/status/1561...,2022-08-20 19:41:18+00:00,1561075923427917825,Im so proud of my husband &amp; his incredible...,LucyCunliffexx,"Newton-le-Willows, England",,0,0,1,0,1561075923427917825,en,,,"[https://twitter.com/wbhospice, https://twitte...",,
4,https://twitter.com/dalehay/status/15610666766...,2022-08-20 19:04:33+00:00,1561066676698898438,"I've shared it before, and I'll share it again...",dalehay,"Swindon, United Kingdom","[Cancer, KeepFartsFunny]",0,0,1,0,1561066676698898438,en,,,[https://twitter.com/MeredithMCCF],,


In [79]:
# Export dataframe into a CSV
tweets_df1.to_csv('wales-query-tweets.csv', index=False)
tweets_df2.to_csv('wales-query-tweets-detailed.csv', index=False)

In [80]:
# Setting variables to be used below
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
#for i, tweet in enumerate(sntwitter.TwitterHashtagScraper('bowelcancer since:2017-07-11 until:2022-07-11').get_items()):
#for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(#bowel #cancer) OR #bowelcancer OR (#colon #cancer) OR #coloncancer OR (#colorectal #cancer) OR #colorectalcancer near:"Wales" since:2017-07-11 until:2022-07-11 lang:en').get_items()):
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(#bowel #cancer) OR #bowelcancer OR (#colon #cancer) OR #coloncancer OR (#colorectal #cancer) OR #colorectalcancer near:"Cardiff, Wales" within:200mi lang:en').get_items()):
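    # Restore the two lines below to stop after maxTweets tweets; while they are commented out, no cap is applied
    #if i >= maxTweets:
    #    break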
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])

In [81]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display first 5 entries from dataframe
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/Smudge92084582/status/1561...,2022-08-20 15:01:59+00:00,1561005631234080770,"Hi #oatomates , when you go swimming, do you h...",Smudge92084582,"England, United Kingdom","[oatomates, ileostomy, bowelcancer, stomaaware...",7,0,3,0,1561005631234080770,en,,,,,
1,https://twitter.com/LSC_CA_ALLIANCE/status/156...,2022-08-20 07:45:00+00:00,1560895661033172996,The NHS is expanding #bowelcancer screening to...,LSC_CA_ALLIANCE,"Preston, England",[bowelcancer],0,1,1,0,1560895661033172996,en,,,,,
2,https://twitter.com/MrBakerKS2/status/15608933...,2022-08-20 07:35:46+00:00,1560893337686663168,"@OJBorg Morning OJ, from the middle of a very ...",MrBakerKS2,"Geneva, Switzerland","[EPICjourney, bowelcancer]",0,0,0,0,1560893337686663168,en,,,"[https://twitter.com/OJBorg, https://twitter.c...",,
3,https://twitter.com/MrBakerKS2/status/15608892...,2022-08-20 07:19:41+00:00,1560889291651293185,@BBCSouthNews Good morning from Woking... We a...,MrBakerKS2,"Geneva, Switzerland",[bowelcancer],0,0,0,0,1560889291651293185,en,,,"[https://twitter.com/BBCSouthNews, https://twi...",,
4,https://twitter.com/ali6217/status/15607305149...,2022-08-19 20:48:46+00:00,1560730514918629376,"Today I had a colonoscopy with no sedation, ju...",ali6217,"Ferndale, Wales","[bowelbabe, polyps, bowelcancer]",0,0,1,0,1560730514918629376,en,,,,,


In [82]:
# Export dataframe into a CSV
tweets_df1.to_csv('wales-hashtag-tweets.csv', index=False)
tweets_df2.to_csv('wales-hashtag-tweets-detailed.csv', index=False)

In [89]:
# bowel/colon/colorectal cancer near:"London, England" within:300mi

# Setting variables to be used below
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
#for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(bowel cancer) OR (colon cancer) OR (colorectal cancer) near:"England" since:2017-07-11 until:2022-07-11 lang:en').get_items()):
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(bowel cancer) OR (colon cancer) OR (colorectal cancer) near:"London, England" within:300mi lang:en').get_items()):
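    # Restore the two lines below to stop after maxTweets tweets; while they are commented out, no cap is applied
    #if i >= maxTweets:
    #    break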
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])

In [90]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display the first 5 rows; a notebook cell renders only its last expression,
# so only tweets_df2.head() appears in the output below
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/mojonojo3/status/156113234...,2022-08-20 23:25:30+00:00,1561132347357073408,@scalzi Although I do have sleepless nights ab...,mojonojo3,Uk,,0,0,0,0,1561119194443354112,en,,,[https://twitter.com/scalzi],1.561132e+18,https://twitter.com/mojonojo3
1,https://twitter.com/InspireAlpine/status/15611...,2022-08-20 23:06:26+00:00,1561127548897185792,"Thousands raised in memory of ""inspirational"" ...",InspireAlpine,"Wrexham, Wales",,0,0,0,0,1561127548897185792,en,,,,,
2,https://twitter.com/oratoba/status/15610849583...,2022-08-20 20:17:12+00:00,1561084958340415489,"@HughTeddy_ Black Turd...\n\n""Sorry, you have ...",oratoba,"Winchester, UK",,1,0,1,0,1561084594157494275,en,,,[https://twitter.com/HughTeddy_],1.561085e+18,https://twitter.com/HughTeddy_
3,https://twitter.com/WanjiruNjoya/status/156108...,2022-08-20 20:04:43+00:00,1561081816735440896,"@wil_da_beast630 Free healthcare: ""The NHS is ...",WanjiruNjoya,East Devon,,1,1,12,0,1561077558246088705,en,,,[https://twitter.com/wil_da_beast630],1.561082e+18,https://twitter.com/WanjiruNjoya
4,https://twitter.com/LucyCunliffexx/status/1561...,2022-08-20 19:41:18+00:00,1561075923427917825,Im so proud of my husband &amp; his incredible...,LucyCunliffexx,"Newton-le-Willows, England",,0,0,1,0,1561075923427917825,en,,,"[https://twitter.com/wbhospice, https://twitte...",,
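
The `Replied Tweet` column prints as `1.561132e+18` because pandas stores a column that mixes integers and missing values as `float64`, and a 64-bit float cannot represent every 19-digit ID exactly. A minimal sketch of the effect, and of the nullable `Int64` dtype as one possible remedy, follows. The ID used is a `Conv. Id` copied from the output above as a stand-in for a reply ID; it is not the actual value behind the rounded cell.

```python
import pandas as pd

# Stand-in values: one ID and one missing entry, as in the 'Replied Tweet' column
reply_ids = [1561084594157494275, None]

as_float = pd.Series(reply_ids)               # default inference gives float64
as_int = pd.Series(reply_ids, dtype="Int64")  # nullable integer dtype keeps every digit

print(as_float.dtype, as_int.dtype)
print(int(as_float[0]) == reply_ids[0])  # False: digits were rounded away
print(as_int[0] == reply_ids[0])         # True: the ID is intact
```

The dtype has to be applied while the values are still Python integers; converting a column that is already `float64` cannot restore digits that were rounded away.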


In [91]:
# Export dataframe into a CSV
tweets_df1.to_csv('londonengland-query-tweets.csv', index=False)
tweets_df2.to_csv('londonengland-query-tweets-detailed.csv', index=False)

In [92]:
# Cap on tweets to collect (not enforced while the break in the loop below is commented out)
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
#for i, tweet in enumerate(sntwitter.TwitterHashtagScraper('bowelcancer since:2017-07-11 until:2022-07-11').get_items()):
#for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(#bowel #cancer) OR #bowelcancer OR (#colon #cancer) OR #coloncancer OR (#colorectal #cancer) OR #colorectalcancer near:"Northern Ireland" since:2017-07-11 until:2022-07-11 lang:en').get_items()):
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(#bowel #cancer) OR #bowelcancer OR (#colon #cancer) OR #coloncancer OR (#colorectal #cancer) OR #colorectalcancer near:"London, England" within:300mi lang:en').get_items()):
    #if i>maxTweets:
    #    break
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])

In [93]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display the first 5 rows; a notebook cell renders only its last expression,
# so only tweets_df2.head() appears in the output below
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/Smudge92084582/status/1561...,2022-08-20 15:01:59+00:00,1561005631234080770,"Hi #oatomates , when you go swimming, do you h...",Smudge92084582,"England, United Kingdom","[oatomates, ileostomy, bowelcancer, stomaaware...",7,0,3,0,1561005631234080770,en,,,,,
1,https://twitter.com/LBofHounslow/status/156097...,2022-08-20 13:00:48+00:00,1560975133828866049,Did you know people who complete #bowelcancer ...,LBofHounslow,"Hounslow, west London","[bowelcancer, HealthyHounslow, NHS]",0,1,0,0,1560975133828866049,en,,,,,
2,https://twitter.com/BowelResearch/status/15609...,2022-08-20 10:00:51+00:00,1560929848096276480,"""In the space of four days, I went from feelin...",BowelResearch,"London, UK","[auguts, HaveYouGotTheGuts, bowelcancer, bowel...",0,0,0,0,1560929848096276480,en,,,,,
3,https://twitter.com/MrBakerKS2/status/15608933...,2022-08-20 07:35:46+00:00,1560893337686663168,"@OJBorg Morning OJ, from the middle of a very ...",MrBakerKS2,"Geneva, Switzerland","[EPICjourney, bowelcancer]",0,0,0,0,1560893337686663168,en,,,"[https://twitter.com/OJBorg, https://twitter.c...",,
4,https://twitter.com/MrBakerKS2/status/15608892...,2022-08-20 07:19:41+00:00,1560889291651293185,@BBCSouthNews Good morning from Woking... We a...,MrBakerKS2,"Geneva, Switzerland",[bowelcancer],0,0,0,0,1560889291651293185,en,,,"[https://twitter.com/BBCSouthNews, https://twi...",,


In [94]:
# Export dataframe into a CSV
tweets_df1.to_csv('londonengland-hashtag-tweets.csv', index=False)
tweets_df2.to_csv('londonengland-hashtag-tweets-detailed.csv', index=False)

In [95]:
# bowel/colon/colorectal cancer near:"England, United Kingdom" within:200mi

# Cap on tweets to collect (not enforced while the break in the loop below is commented out)
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
#for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(bowel cancer) OR (colon cancer) OR (colorectal cancer) near:"Scotland" since:2017-07-11 until:2022-07-11 lang:en').get_items()):
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(bowel cancer) OR (colon cancer) OR (colorectal cancer) near:"England, United Kingdom" within:200mi lang:en').get_items()):
    #if i>maxTweets:
    #    break
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])

In [96]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display the first 5 rows; a notebook cell renders only its last expression,
# so only tweets_df2.head() appears in the output below
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/InspireAlpine/status/15611...,2022-08-20 23:06:26+00:00,1561127548897185792,"Thousands raised in memory of ""inspirational"" ...",InspireAlpine,"Wrexham, Wales",,0,0,0,0,1561127548897185792,en,,,,,
1,https://twitter.com/oratoba/status/15610849583...,2022-08-20 20:17:12+00:00,1561084958340415489,"@HughTeddy_ Black Turd...\n\n""Sorry, you have ...",oratoba,"Winchester, UK",,1,0,1,0,1561084594157494275,en,,,[https://twitter.com/HughTeddy_],1.561085e+18,https://twitter.com/HughTeddy_
2,https://twitter.com/WanjiruNjoya/status/156108...,2022-08-20 20:04:43+00:00,1561081816735440896,"@wil_da_beast630 Free healthcare: ""The NHS is ...",WanjiruNjoya,East Devon,,1,1,12,0,1561077558246088705,en,,,[https://twitter.com/wil_da_beast630],1.561082e+18,https://twitter.com/WanjiruNjoya
3,https://twitter.com/LucyCunliffexx/status/1561...,2022-08-20 19:41:18+00:00,1561075923427917825,Im so proud of my husband &amp; his incredible...,LucyCunliffexx,"Newton-le-Willows, England",,0,0,1,0,1561075923427917825,en,,,"[https://twitter.com/wbhospice, https://twitte...",,
4,https://twitter.com/dalehay/status/15610666766...,2022-08-20 19:04:33+00:00,1561066676698898438,"I've shared it before, and I'll share it again...",dalehay,"Swindon, United Kingdom","[Cancer, KeepFartsFunny]",0,0,1,0,1561066676698898438,en,,,[https://twitter.com/MeredithMCCF],,


In [97]:
# Export dataframe into a CSV
tweets_df1.to_csv('englanduk-query-tweets.csv', index=False)
tweets_df2.to_csv('englanduk-query-tweets-detailed.csv', index=False)
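
The regional searches overlap: the London (300mi) and England (200mi) keyword outputs above share several tweets, for example those by InspireAlpine and oratoba. If the exported files are later combined, the same tweet would be counted more than once. The sketch below shows one way to merge and de-duplicate on `Tweet Id`. It uses two tiny in-memory frames with IDs copied from the outputs above rather than the full CSV files, and the `Region` column is a hypothetical addition.

```python
import pandas as pd

# Two tiny frames standing in for the London and England query results;
# the IDs and usernames are copied from the outputs above
london = pd.DataFrame({
    "Tweet Id": [1561132347357073408, 1561127548897185792, 1561084958340415489],
    "Username": ["mojonojo3", "InspireAlpine", "oratoba"],
})
england = pd.DataFrame({
    "Tweet Id": [1561127548897185792, 1561084958340415489, 1561066676698898438],
    "Username": ["InspireAlpine", "oratoba", "dalehay"],
})

# "Region" is a hypothetical column recording which query found the tweet
combined = pd.concat(
    [london.assign(Region="London"), england.assign(Region="England")],
    ignore_index=True,
)

# Keep the first occurrence of each tweet, identified by its ID
unique_tweets = combined.drop_duplicates(subset="Tweet Id", keep="first").reset_index(drop=True)
print(unique_tweets)
```

Keeping the first occurrence means a tweet found by both searches is attributed to whichever region was concatenated first, which is a design choice rather than a property of the data.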

In [98]:
# Cap on tweets to collect (not enforced while the break in the loop below is commented out)
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
#for i, tweet in enumerate(sntwitter.TwitterHashtagScraper('bowelcancer since:2017-07-11 until:2022-07-11').get_items()):
#for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(#bowel #cancer) OR #bowelcancer OR (#colon #cancer) OR #coloncancer OR (#colorectal #cancer) OR #colorectalcancer near:"Northern Ireland" since:2017-07-11 until:2022-07-11 lang:en').get_items()):
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(#bowel #cancer) OR #bowelcancer OR (#colon #cancer) OR #coloncancer OR (#colorectal #cancer) OR #colorectalcancer near:"England, United Kingdom" within:200mi lang:en').get_items()):
    #if i>maxTweets:
    #    break
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])

In [99]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display the first 5 rows; a notebook cell renders only its last expression,
# so only tweets_df2.head() appears in the output below
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/Smudge92084582/status/1561...,2022-08-20 15:01:59+00:00,1561005631234080770,"Hi #oatomates , when you go swimming, do you h...",Smudge92084582,"England, United Kingdom","[oatomates, ileostomy, bowelcancer, stomaaware...",7,0,3,0,1561005631234080770,en,,,,,
1,https://twitter.com/LBofHounslow/status/156097...,2022-08-20 13:00:48+00:00,1560975133828866049,Did you know people who complete #bowelcancer ...,LBofHounslow,"Hounslow, west London","[bowelcancer, HealthyHounslow, NHS]",0,1,0,0,1560975133828866049,en,,,,,
2,https://twitter.com/BowelResearch/status/15609...,2022-08-20 10:00:51+00:00,1560929848096276480,"""In the space of four days, I went from feelin...",BowelResearch,"London, UK","[auguts, HaveYouGotTheGuts, bowelcancer, bowel...",0,0,0,0,1560929848096276480,en,,,,,
3,https://twitter.com/LSC_CA_ALLIANCE/status/156...,2022-08-20 07:45:00+00:00,1560895661033172996,The NHS is expanding #bowelcancer screening to...,LSC_CA_ALLIANCE,"Preston, England",[bowelcancer],0,1,1,0,1560895661033172996,en,,,,,
4,https://twitter.com/MrBakerKS2/status/15608933...,2022-08-20 07:35:46+00:00,1560893337686663168,"@OJBorg Morning OJ, from the middle of a very ...",MrBakerKS2,"Geneva, Switzerland","[EPICjourney, bowelcancer]",0,0,0,0,1560893337686663168,en,,,"[https://twitter.com/OJBorg, https://twitter.c...",,


In [100]:
# Export dataframe into a CSV
tweets_df1.to_csv('englanduk-hashtag-tweets.csv', index=False)
tweets_df2.to_csv('englanduk-hashtag-tweets-detailed.csv', index=False)

In [101]:
# bowel/colon/colorectal cancer near:"Wales, United Kingdom" within:200mi

# Cap on tweets to collect (not enforced while the break in the loop below is commented out)
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
#for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(bowel cancer) OR (colon cancer) OR (colorectal cancer) near:"Scotland" since:2017-07-11 until:2022-07-11 lang:en').get_items()):
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(bowel cancer) OR (colon cancer) OR (colorectal cancer) near:"Wales, United Kingdom" within:200mi lang:en').get_items()):
    #if i>maxTweets:
    #    break
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])

In [102]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display the first 5 rows; a notebook cell renders only its last expression,
# so only tweets_df2.head() appears in the output below
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/mojonojo3/status/156113234...,2022-08-20 23:25:30+00:00,1561132347357073408,@scalzi Although I do have sleepless nights ab...,mojonojo3,Uk,,0,0,0,0,1561119194443354112,en,,,[https://twitter.com/scalzi],1.561132e+18,https://twitter.com/mojonojo3
1,https://twitter.com/InspireAlpine/status/15611...,2022-08-20 23:06:26+00:00,1561127548897185792,"Thousands raised in memory of ""inspirational"" ...",InspireAlpine,"Wrexham, Wales",,0,0,0,0,1561127548897185792,en,,,,,
2,https://twitter.com/oratoba/status/15610849583...,2022-08-20 20:17:12+00:00,1561084958340415489,"@HughTeddy_ Black Turd...\n\n""Sorry, you have ...",oratoba,"Winchester, UK",,1,0,1,0,1561084594157494275,en,,,[https://twitter.com/HughTeddy_],1.561085e+18,https://twitter.com/HughTeddy_
3,https://twitter.com/LucyCunliffexx/status/1561...,2022-08-20 19:41:18+00:00,1561075923427917825,Im so proud of my husband &amp; his incredible...,LucyCunliffexx,"Newton-le-Willows, England",,0,0,1,0,1561075923427917825,en,,,"[https://twitter.com/wbhospice, https://twitte...",,
4,https://twitter.com/dalehay/status/15610666766...,2022-08-20 19:04:33+00:00,1561066676698898438,"I've shared it before, and I'll share it again...",dalehay,"Swindon, United Kingdom","[Cancer, KeepFartsFunny]",0,0,1,0,1561066676698898438,en,,,[https://twitter.com/MeredithMCCF],,


In [103]:
# Export dataframe into a CSV
tweets_df1.to_csv('walesuk-query-tweets.csv', index=False)
tweets_df2.to_csv('walesuk-query-tweets-detailed.csv', index=False)
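
The keyword search cells in this notebook differ only in the `near:` place and the `within:` radius, so the query string can be assembled from those two values instead of being edited by hand in each cell. The following is a minimal sketch of that idea; `build_query` is a hypothetical helper, not part of snscrape, and it only produces the string that would be passed to `sntwitter.TwitterSearchScraper`.

```python
def build_query(terms, place, radius_mi, lang="en"):
    """Assemble a search string in the same shape as the queries in this notebook.

    build_query is a hypothetical helper, not part of snscrape.
    """
    term_part = " OR ".join(terms)
    return f'{term_part} near:"{place}" within:{radius_mi}mi lang:{lang}'

keyword_terms = ["(bowel cancer)", "(colon cancer)", "(colorectal cancer)"]

# (place, radius in miles) pairs as used in the search cells of this notebook
regions = [
    ("London, England", 300),
    ("England, United Kingdom", 200),
    ("Wales, United Kingdom", 200),
    ("Scotland, United Kingdom", 200),
    ("Northern Ireland", 200),
]

queries = {place: build_query(keyword_terms, place, radius) for place, radius in regions}
print(queries["Wales, United Kingdom"])
```

Swapping `keyword_terms` for the hashtag terms would produce the hashtag variants of the same searches.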

In [104]:
# Cap on tweets to collect (not enforced while the break in the loop below is commented out)
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
#for i, tweet in enumerate(sntwitter.TwitterHashtagScraper('bowelcancer since:2017-07-11 until:2022-07-11').get_items()):
#for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(#bowel #cancer) OR #bowelcancer OR (#colon #cancer) OR #coloncancer OR (#colorectal #cancer) OR #colorectalcancer near:"Northern Ireland" since:2017-07-11 until:2022-07-11 lang:en').get_items()):
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(#bowel #cancer) OR #bowelcancer OR (#colon #cancer) OR #coloncancer OR (#colorectal #cancer) OR #colorectalcancer near:"Wales, United Kingdom" within:200mi lang:en').get_items()):
    #if i>maxTweets:
    #    break
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])

In [105]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display the first 5 rows; a notebook cell renders only its last expression,
# so only tweets_df2.head() appears in the output below
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/Smudge92084582/status/1561...,2022-08-20 15:01:59+00:00,1561005631234080770,"Hi #oatomates , when you go swimming, do you h...",Smudge92084582,"England, United Kingdom","[oatomates, ileostomy, bowelcancer, stomaaware...",7,0,3,0,1561005631234080770,en,,,,,
1,https://twitter.com/LBofHounslow/status/156097...,2022-08-20 13:00:48+00:00,1560975133828866049,Did you know people who complete #bowelcancer ...,LBofHounslow,"Hounslow, west London","[bowelcancer, HealthyHounslow, NHS]",0,1,0,0,1560975133828866049,en,,,,,
2,https://twitter.com/BowelResearch/status/15609...,2022-08-20 10:00:51+00:00,1560929848096276480,"""In the space of four days, I went from feelin...",BowelResearch,"London, UK","[auguts, HaveYouGotTheGuts, bowelcancer, bowel...",0,0,0,0,1560929848096276480,en,,,,,
3,https://twitter.com/LSC_CA_ALLIANCE/status/156...,2022-08-20 07:45:00+00:00,1560895661033172996,The NHS is expanding #bowelcancer screening to...,LSC_CA_ALLIANCE,"Preston, England",[bowelcancer],0,1,1,0,1560895661033172996,en,,,,,
4,https://twitter.com/MrBakerKS2/status/15608933...,2022-08-20 07:35:46+00:00,1560893337686663168,"@OJBorg Morning OJ, from the middle of a very ...",MrBakerKS2,"Geneva, Switzerland","[EPICjourney, bowelcancer]",0,0,0,0,1560893337686663168,en,,,"[https://twitter.com/OJBorg, https://twitter.c...",,


In [106]:
# Export dataframe into a CSV
tweets_df1.to_csv('walesuk-hashtag-tweets.csv', index=False)
tweets_df2.to_csv('walesuk-hashtag-tweets-detailed.csv', index=False)

In [107]:
# bowel/colon/colorectal cancer near:"Scotland, United Kingdom" within:200mi

# Cap on tweets to collect (not enforced while the break in the loop below is commented out)
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
#for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(bowel cancer) OR (colon cancer) OR (colorectal cancer) near:"Scotland" since:2017-07-11 until:2022-07-11 lang:en').get_items()):
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(bowel cancer) OR (colon cancer) OR (colorectal cancer) near:"Scotland, United Kingdom" within:200mi lang:en').get_items()):
    #if i>maxTweets:
    #    break
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])

In [108]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display the first 5 rows; a notebook cell renders only its last expression,
# so only tweets_df2.head() appears in the output below
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/mojonojo3/status/156113234...,2022-08-20 23:25:30+00:00,1561132347357073408,@scalzi Although I do have sleepless nights ab...,mojonojo3,Uk,,0,0,0,0,1561119194443354112,en,,,[https://twitter.com/scalzi],1.561132e+18,https://twitter.com/mojonojo3
1,https://twitter.com/LucyCunliffexx/status/1561...,2022-08-20 19:41:18+00:00,1561075923427917825,Im so proud of my husband &amp; his incredible...,LucyCunliffexx,"Newton-le-Willows, England",,0,0,1,0,1561075923427917825,en,,,"[https://twitter.com/wbhospice, https://twitte...",,
2,https://twitter.com/leemufc74/status/156101193...,2022-08-20 15:27:01+00:00,1561011932177600514,I'm fundraising for Bowel Cancer UK. Check out...,leemufc74,"Holywell, Wales",[JustGiving],0,0,1,0,1561011932177600514,en,,,[https://twitter.com/JustGiving],,
3,https://twitter.com/bowelcanceruk/status/15610...,2022-08-20 15:00:11+00:00,1561005181185163265,"""My hospital stay of 11 days was very pleasant...",bowelcanceruk,UK,,0,3,4,0,1561005181185163265,en,,,,,
4,https://twitter.com/pash22/status/156100265077...,2022-08-20 14:50:08+00:00,1561002650770018310,"As a cancer patient, I felt dismissed by docto...",pash22,United Kingdom,,0,0,1,0,1561002650770018310,en,,,[https://twitter.com/benbravery],,


In [109]:
# Export dataframe into a CSV
tweets_df1.to_csv('scotlanduk-query-tweets.csv', index=False)
tweets_df2.to_csv('scotlanduk-query-tweets-detailed.csv', index=False)

In [110]:
# Cap on tweets to collect (not enforced while the break in the loop below is commented out)
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
#for i, tweet in enumerate(sntwitter.TwitterHashtagScraper('bowelcancer since:2017-07-11 until:2022-07-11').get_items()):
#for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(#bowel #cancer) OR #bowelcancer OR (#colon #cancer) OR #coloncancer OR (#colorectal #cancer) OR #colorectalcancer near:"Northern Ireland" since:2017-07-11 until:2022-07-11 lang:en').get_items()):
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(#bowel #cancer) OR #bowelcancer OR (#colon #cancer) OR #coloncancer OR (#colorectal #cancer) OR #colorectalcancer near:"Scotland, United Kingdom" within:200mi lang:en').get_items()):
    #if i>maxTweets:
    #    break
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])

In [111]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display the first 5 rows; a notebook cell renders only its last expression,
# so only tweets_df2.head() appears in the output below
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/LSC_CA_ALLIANCE/status/156...,2022-08-20 07:45:00+00:00,1560895661033172996,The NHS is expanding #bowelcancer screening to...,LSC_CA_ALLIANCE,"Preston, England",[bowelcancer],0,1,1,0,1560895661033172996,en,,,,,
1,https://twitter.com/Assistdotclaims/status/156...,2022-08-19 08:57:19+00:00,1560551473385177088,@militaryhistori Thank you very much! I hope ...,Assistdotclaims,United Kingdom,[bowelcancer],0,0,0,0,1560550969682710528,en,,,[https://twitter.com/militaryhistori],1.560551e+18,https://twitter.com/militaryhistori
2,https://twitter.com/Assistdotclaims/status/156...,2022-08-19 08:31:10+00:00,1560544891402428416,@JoanneMCrompton I’ve got a vital MRI and CT s...,Assistdotclaims,United Kingdom,[bowelcancer],0,0,1,0,1560510097985601536,en,,,[https://twitter.com/JoanneMCrompton],1.56051e+18,https://twitter.com/JoanneMCrompton
3,https://twitter.com/LisaChastie/status/1560330...,2022-08-18 18:20:41+00:00,1560330862327218176,Amazon delivery today @bowelbabe . An inspirat...,LisaChastie,"Scotland, United Kingdom","[bowelcancer, LifeLessons]",0,0,4,0,1560330862327218176,en,,,[https://twitter.com/bowelbabe],,
4,https://twitter.com/UcJournals/status/15603227...,2022-08-18 17:48:38+00:00,1560322797003350016,Global Journal of Gastroenterology &amp; Hepat...,UcJournals,United Kingdom,"[coloncancer, gastrointestinal, digestivehealt...",0,0,0,0,1560322797003350016,en,,,,,


In [112]:
# Export dataframe into a CSV
tweets_df1.to_csv('scotlanduk-hashtag-tweets.csv', index=False)
tweets_df2.to_csv('scotlanduk-hashtag-tweets-detailed.csv', index=False)

In [114]:
# bowel/colon/colorectal cancer near:"Northern Ireland" within:200mi

# Cap on tweets to collect (not enforced while the break in the loop below is commented out)
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
#for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(bowel cancer) OR (colon cancer) OR (colorectal cancer) near:"Scotland" since:2017-07-11 until:2022-07-11 lang:en').get_items()):
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(bowel cancer) OR (colon cancer) OR (colorectal cancer) near:"Northern Ireland" within:200mi lang:en').get_items()):
    #if i>maxTweets:
    #    break
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])

In [115]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display the first 5 rows; a notebook cell renders only its last expression,
# so only tweets_df2.head() appears in the output below
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/c7ped/status/1561164318787...,2022-08-21 01:32:33+00:00,1561164318787469322,"@TheCynicalHun Oh wow, as a man who is (and ha...",c7ped,East Kilbride,[CancerSucks],1,0,5,0,1560805675647004672,en,,,[https://twitter.com/TheCynicalHun],1.560806e+18,https://twitter.com/TheCynicalHun
1,https://twitter.com/Onamattopinion/status/1560...,2022-08-19 18:53:42+00:00,1560701559364886529,@LoveMyHawks21 @kisuvior @MichaelBensonn @usyk...,Onamattopinion,"Scotland, United Kingdom",,2,0,1,0,1560658452086542336,en,,,"[https://twitter.com/LoveMyHawks21, https://tw...",1.560685e+18,https://twitter.com/LoveMyHawks21
2,https://twitter.com/johnekinsella/status/15606...,2022-08-19 16:58:49+00:00,1560672647440379904,@lisamorgan39 Really sorry to hear this Lisa. ...,johnekinsella,"Dublin, IRELAND",,0,0,0,0,1560208851772870657,en,,,[https://twitter.com/lisamorgan39],1.560209e+18,https://twitter.com/lisamorgan39
3,https://twitter.com/CathainSeo/status/15604809...,2022-08-19 04:17:15+00:00,1560480991214460928,@JacquelineCoug7 @worzel13 @Amojak2 @Tim_from_...,CathainSeo,"Longford, Ireland",,0,0,0,0,1559994596503031810,en,,,"[https://twitter.com/JacquelineCoug7, https://...",1.560427e+18,https://twitter.com/JacquelineCoug7
4,https://twitter.com/FionnualaMinto/status/1560...,2022-08-18 21:33:03+00:00,1560379269888581632,@gibigill All I can do is send love. I went ba...,FionnualaMinto,"Limerick, Ireland",,0,0,4,0,1559994596503031810,en,,,[https://twitter.com/gibigill],1.559995e+18,https://twitter.com/gibigill


In [116]:
# Export dataframe into a CSV
tweets_df1.to_csv('northernirelanduk-query-tweets.csv', index=False)
tweets_df2.to_csv('northernirelanduk-query-tweets-detailed.csv', index=False)

In [117]:
# Setting variables to be used below
# Note: the maxTweets check in the loop below is commented out, so this cap is not applied
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
#for i, tweet in enumerate(sntwitter.TwitterHashtagScraper('bowelcancer since:2017-07-11 until:2022-07-11').get_items()):
#for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(#bowel #cancer) OR #bowelcancer OR (#colon #cancer) OR #coloncancer OR (#colorectal #cancer) OR #colorectalcancer near:"Northern Ireland" since:2017-07-11 until:2022-07-11 lang:en').get_items()):
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(#bowel #cancer) OR #bowelcancer OR (#colon #cancer) OR #coloncancer OR (#colorectal #cancer) OR #colorectalcancer near:"Northern Ireland" within:200mi lang:en').get_items()):
    #if i>maxTweets:
    #    break
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])
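
The two `append` calls above are repeated in every scraping cell of this notebook. A minimal sketch of pulling them into one function follows. `tweet_to_rows` is a hypothetical helper, the attribute names are copied from the cell above, and `fake_tweet` is made-up data standing in for an object yielded by `get_items()`.

```python
from types import SimpleNamespace


def tweet_to_rows(tweet):
    """Return (basic_row, detailed_row) in the column order used by the dataframes."""
    basic = [tweet.date, tweet.id, tweet.content,
             tweet.user.username, tweet.user.location]
    detailed = [tweet.url, *basic, tweet.hashtags,
                tweet.replyCount, tweet.retweetCount, tweet.likeCount,
                tweet.quoteCount, tweet.conversationId, tweet.lang,
                tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers,
                tweet.inReplyToTweetId, tweet.inReplyToUser]
    return basic, detailed


# Made-up tweet object for illustration only
fake_tweet = SimpleNamespace(
    url="https://twitter.com/example/status/1", date="2022-01-01", id=1,
    content="hello", user=SimpleNamespace(username="example", location="Belfast"),
    hashtags=None, replyCount=0, retweetCount=0, likeCount=0, quoteCount=0,
    conversationId=1, lang="en", retweetedTweet=None, quotedTweet=None,
    mentionedUsers=None, inReplyToTweetId=None, inReplyToUser=None,
)

basic_row, detailed_row = tweet_to_rows(fake_tweet)
print(len(basic_row), len(detailed_row))  # 5 18
```

Inside a scraping loop, the two rows would then be appended to `tweets_list1` and `tweets_list2` respectively.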

In [118]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display first 5 entries (a notebook cell renders only its last expression, so only tweets_df2 is shown)
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/fidelma4Europe/status/1560...,2022-08-18 19:40:04+00:00,1560350837100220422,Bless you girl and hold on #bowelcancer https:...,fidelma4Europe,"Galway, Ireland",[bowelcancer],1,0,1,0,1560350837100220422,en,,https://twitter.com/gibigill/status/1559994596...,,,
1,https://twitter.com/LisaChastie/status/1560330...,2022-08-18 18:20:41+00:00,1560330862327218176,Amazon delivery today @bowelbabe . An inspirat...,LisaChastie,"Scotland, United Kingdom","[bowelcancer, LifeLessons]",0,0,4,0,1560330862327218176,en,,,[https://twitter.com/bowelbabe],,
2,https://twitter.com/FionnualaMinto/status/1559...,2022-08-17 16:53:44+00:00,1559946593322901506,First chemo session today. So far feeling good...,FionnualaMinto,"Limerick, Ireland","[coloncancer, Cancer, positivity]",2,0,3,0,1559946593322901506,en,,,,,
3,https://twitter.com/Daily_Record/status/155960...,2022-08-16 18:10:00+00:00,1559603395761246208,Sophie has battled Bowel Cancer twice at just ...,Daily_Record,Glasgow,[BowelCancer],0,2,7,0,1559603395761246208,en,,,,,
4,https://twitter.com/Daily_Record/status/155952...,2022-08-16 13:09:00+00:00,1559527646635106304,So inspiring 👏 Keep it up Sophie! #BowelCancer...,Daily_Record,Glasgow,[BowelCancer],0,0,0,0,1559527646635106304,en,,,,,


In [None]:
# Export dataframe into a CSV
tweets_df1.to_csv('northernirelanduk-hashtag-tweets.csv', index=False)
tweets_df2.to_csv('northernirelanduk-hashtag-tweets-detailed.csv', index=False)

# Place ID
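
The cells in this section filter by a place ID with the `place:` search operator instead of `near:` and `within:`. A minimal sketch of assembling such a query string is shown here. `build_place_query` is a hypothetical helper, and the example reproduces the query string used in the next cell.

```python
def build_place_query(terms, place_id, since=None, until=None, lang=None):
    """Join OR-ed term groups with place:, since:, until: and lang: filters."""
    parts = [" OR ".join(f"({term})" for term in terms), f"place:{place_id}"]
    if since:
        parts.append(f"since:{since}")
    if until:
        parts.append(f"until:{until}")
    if lang:
        parts.append(f"lang:{lang}")
    return " ".join(parts)


query = build_place_query(
    ["bowel cancer", "colon cancer", "colorectal cancer"],
    place_id="6416b8512febefc9",
    since="2017-07-24",
    until="2022-07-24",
    lang="en",
)
print(query)
```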

In [30]:
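# bowel/colon/colorectal cancer, filtered by place ID (place:6416b8512febefc9)

# Setting variables to be used below
# Index of the last tweet to keep: the loop below stops after tweet number maxTweets (zero-based)
maxTweets = 24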

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(bowel cancer) OR (colon cancer) OR (colorectal cancer) place:6416b8512febefc9 since:2017-07-24 until:2022-07-24 lang:en').get_items()):
    print(i)
    print(tweet)
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])
    if i >= maxTweets:
        break

0
https://twitter.com/RadarLanark/status/1550905313712283648
1
https://twitter.com/CarolineMariaT1/status/1550793287245742080
2
https://twitter.com/nofiltersuk/status/1550751775640125440
3
https://twitter.com/PublicHealthJM/status/1550434124384673792
4
https://twitter.com/KarlASD34/status/1550413273681518592
5
https://twitter.com/suzi1dore/status/1550381183892131840
6
https://twitter.com/seb_math_bio/status/1550058042837929984
7
https://twitter.com/CarolineMariaT1/status/1550031759068241920
8
https://twitter.com/Listert72/status/1550024183454859265
9
https://twitter.com/doubleshiny/status/1549782412745494528
10
https://twitter.com/KimEdwards48/status/1549316683764011008
11
https://twitter.com/jhaywardgant/status/1549296181855096832
12
https://twitter.com/jruddy99/status/1549080562324967424
13
https://twitter.com/suzi1dore/status/1549045309073854465
14
https://twitter.com/MutliRaceMan/status/1549014400794763264
15
https://twitter.com/MaheshGajperia1/status/1484260513630543874
16
https:/
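
The loop above checks `i >= maxTweets` after appending, so `maxTweets = 24` keeps 25 tweets (indices 0 to 24). One way to cap the count exactly is `itertools.islice`, which stops pulling from a lazy generator after a fixed number of items. The sketch below uses a stand-in generator, `fake_items`, in place of `get_items()`.

```python
from itertools import islice


def fake_items():
    """Stand-in for scraper.get_items(): yields items lazily, without end."""
    n = 0
    while True:
        yield f"tweet-{n}"
        n += 1


max_tweets = 24

# islice takes at most max_tweets items and never requests more from the generator
collected = list(islice(fake_items(), max_tweets))
print(len(collected))  # 24
```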

In [31]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display first 5 entries (a notebook cell renders only its last expression, so only tweets_df2 is shown)
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/RadarLanark/status/1550905...,2022-07-23 18:06:55+00:00,1550905313712283648,"Once again I passed an old friend tonight, The...",RadarLanark,"Lanark, Scotland",,1,0,47,1,1550905313712283648,en,,,[https://twitter.com/Beatson_Charity],,
1,https://twitter.com/CarolineMariaT1/status/155...,2022-07-23 10:41:46+00:00,1550793287245742080,"@Evaline1954 Sorry, to read this. I've just h...",CarolineMariaT1,,,1,0,1,0,1550790026975404033,en,,,[https://twitter.com/Evaline1954],1.55079e+18,https://twitter.com/Evaline1954
2,https://twitter.com/nofiltersuk/status/1550751...,2022-07-23 07:56:49+00:00,1550751775640125440,Bowel cancer sadly kills around 16k per year a...,nofiltersuk,United Kingdom,"[bowelcancerawareness, KidneyDisease, kidneysc...",0,0,0,0,1550751775640125440,en,,https://twitter.com/carolinenokes/status/15504...,,,
3,https://twitter.com/PublicHealthJM/status/1550...,2022-07-22 10:54:35+00:00,1550434124384673792,Great article from Doug Speake and colleagues ...,PublicHealthJM,"Cardiff, Wales",,0,3,9,1,1550434124384673792,en,,,[https://twitter.com/CMRSurgical],,
4,https://twitter.com/KarlASD34/status/155041327...,2022-07-22 09:31:44+00:00,1550413273681518592,I didn't want to continue pushing myself const...,KarlASD34,"UK, Earth, Milky Way, Universe",,1,0,6,0,1550413269109805056,en,,,,1.550413e+18,https://twitter.com/KarlASD34


In [32]:
# Export dataframe into a CSV
tweets_df1.to_csv('uk-query-tweets.csv', index=False)
tweets_df2.to_csv('uk-query-tweets-detailed.csv', index=False)

In [34]:
# Setting variables to be used below
# Index of the last tweet to keep: 0 stops the loop after the first tweet
maxTweets = 0

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
#for i, tweet in enumerate(sntwitter.TwitterHashtagScraper('bowelcancer since:2017-07-11 until:2022-07-11').get_items()):
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(#bowel #cancer) OR #bowelcancer OR (#colon #cancer) OR #coloncancer OR (#colorectal #cancer) OR #colorectalcancer place:6416b8512febefc9 since:2012-07-11 until:2022-07-11 lang:en').get_items()):
    print(i)
    print(tweet)

    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])
    if i >= maxTweets:
        break

0
https://twitter.com/Roger_White/status/1115644037858516992


In [35]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display first 5 entries (a notebook cell renders only its last expression, so only tweets_df2 is shown)
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User
0,https://twitter.com/Roger_White/status/1115644...,2019-04-09 15:54:00+00:00,1115644037858516992,#thisisbowelcancer or is it? This was my 2nd E...,Roger_White,"Swansea, Wales","[thisisbowelcancer, repost2017, bowelcancerawa...",0,0,2,0,1115644037858516992,en,,,,,


In [36]:
# Export dataframe into a CSV
tweets_df1.to_csv('uk-hashtag-tweets.csv', index=False)
tweets_df2.to_csv('uk-hashtag-tweets-detailed.csv', index=False)

# Place Profile

In [None]:
# Import the Twarc2 client and the expansions module from the twarc library, plus the datetime and json libraries
from twarc import Twarc2, expansions
import datetime
import json

# This is where you initialize the client with your own bearer token (replace the XXXXX with your own bearer token)
# Never commit a real bearer token to a shared notebook
client = Twarc2(bearer_token="XXXXX")

# Specify the start time in UTC for the time period you want Tweets from
start_time = datetime.datetime(2022, 7, 15, 0, 0, 0, 0, datetime.timezone.utc)

# Specify the end time in UTC for the time period you want Tweets from
end_time = datetime.datetime(2022, 7, 20, 0, 0, 0, 0, datetime.timezone.utc)

# This is where we specify our search query
query = "lakers"

# The counts_recent method calls the recent Tweet counts endpoint to get Tweet counts based on the query, start and end times
count_results = client.counts_recent(query=query, start_time=start_time, end_time=end_time)

# Recent Tweet counts returns all the Tweet volume for the last 7 days in one page so we break after that
for page in count_results:
    print(json.dumps(page['data']))
    break
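
Each page printed above is expected to hold a list of time buckets under `data`. Assuming each bucket is a dict with a `tweet_count` field, as in the Twitter API v2 counts response (check this against your own output), the total volume can be summed as sketched below. `total_tweet_count` is a hypothetical helper and `sample_page` is made-up data for illustration.

```python
def total_tweet_count(page):
    """Sum tweet_count over all time buckets in one counts page."""
    return sum(bucket.get("tweet_count", 0) for bucket in page.get("data", []))


# Made-up page for illustration only
sample_page = {
    "data": [
        {"start": "2022-07-15T00:00:00.000Z", "end": "2022-07-15T01:00:00.000Z", "tweet_count": 3},
        {"start": "2022-07-15T01:00:00.000Z", "end": "2022-07-15T02:00:00.000Z", "tweet_count": 5},
    ]
}

print(total_tweet_count(sample_page))  # 8
```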

In [None]:
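# Group the OR terms so that profile_country applies to all of them
# (in Twitter API v2 queries, AND is evaluated before OR)
query = "((bowel cancer) OR (colon cancer) OR (colorectal cancer)) profile_country:GB"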

# The search_recent method calls the recent search endpoint to get Tweets based on the query, start and end times
#search_results = client.search_recent(query=query, start_time=start_time, end_time=end_time, max_results=100)
search_results = client.search_recent(query=query, start_time=start_time, end_time=end_time)

# Twarc returns all Tweets for the criteria set above, so we page through the results
for page in search_results:
    # The Twitter API v2 returns the Tweet information and the user, media, etc. separately
    # so we use expansions.flatten to get all the information in a single JSON
    result = expansions.flatten(page)
    for tweet in result:
        # Here we are printing the recent Tweet object JSON to the console
        print(json.dumps(tweet))
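
Printing every Tweet as JSON makes the results hard to reuse later. A minimal sketch that writes each flattened Tweet to a JSON Lines file instead is shown below. `write_jsonl` and the file name are hypothetical, and `sample_tweets` is made-up data standing in for the output of `expansions.flatten(page)`.

```python
import json
import os
import tempfile


def write_jsonl(tweets, path):
    """Append one JSON object per line and return how many were written."""
    count = 0
    with open(path, "a", encoding="utf-8") as fh:
        for tweet in tweets:
            fh.write(json.dumps(tweet, ensure_ascii=False) + "\n")
            count += 1
    return count


# Made-up flattened tweets for illustration only
sample_tweets = [{"id": "1", "text": "first"}, {"id": "2", "text": "second"}]

out_path = os.path.join(tempfile.mkdtemp(), "gb-recent-tweets.jsonl")
print(write_jsonl(sample_tweets, out_path))  # 2
```

Because the file is opened in append mode, the function can be called once per page inside the paging loop.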

In [2]:
# bowel/colon/colorectal cancer, filtered with profile_country:GB
# include:nativeretweets

# Setting variables to be used below
# Note: the maxTweets check in the loop below is commented out, so this cap is not applied
maxTweets = 500

# Creating list to append tweet data to
tweets_list1 = []
tweets_list2 = []

# Using TwitterSearchScraper to scrape data and append tweets to list
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('(bowel cancer) OR (colon cancer) OR (colorectal cancer) profile_country:GB since:2017-07-11 until:2022-07-11 lang:en').get_items()):
    print(i)
    print(tweet)

    #if i>maxTweets:
    #    break
    tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location])
    tweets_list2.append([tweet.url, tweet.date, tweet.id, tweet.content, tweet.user.username, tweet.user.location, tweet.hashtags, 
                         tweet.replyCount, tweet.retweetCount, tweet.likeCount, tweet.quoteCount, tweet.conversationId,
                         tweet.lang, tweet.retweetedTweet, tweet.quotedTweet, tweet.mentionedUsers, tweet.inReplyToTweetId, tweet.inReplyToUser])

In [3]:
# Creating a dataframe from the tweets list above
tweets_df1 = pd.DataFrame(tweets_list1, columns=['Datetime', 'Tweet Id', 'Text', 'Username', 'Location'])
tweets_df2 = pd.DataFrame(tweets_list2, columns=['Url', 'Datetime', 'Tweet Id', 'Text', 'Username', 'Location', 'Hashtags', 
                                                 'Reply Count', 'Retweet Count', 'Like Count', 'Quote Count', 'Conv. Id',
                                                 'Language', 'Retweeted Tweet', 'Quoted Tweet', 'Mentioned Users', 'Replied Tweet', 
                                                 'Replied User'])

# Display first 5 entries (a notebook cell renders only its last expression, so only tweets_df2 is shown)
tweets_df1.head()
tweets_df2.head()

Unnamed: 0,Url,Datetime,Tweet Id,Text,Username,Location,Hashtags,Reply Count,Retweet Count,Like Count,Quote Count,Conv. Id,Language,Retweeted Tweet,Quoted Tweet,Mentioned Users,Replied Tweet,Replied User


In [None]:
# Export dataframe into a CSV
tweets_df1.to_csv('gb-query-tweets.csv', index=False)
tweets_df2.to_csv('gb-query-tweets-detailed.csv', index=False)