# Welcome to the Analyst Tool notebook! 
You can use this notebook to: <br/>
1) Search for tweets based on a keyword, and cross-check your results with Graphika's map data <br/>
2) Run a Map Activity Report <br/>
3) Run a Feature Activity Report <br/>
4) Run a Bot Activity Report <br/>
5) Run a Full Activity Report (using tags on Graphika's live maps!) <br/>
6) Get group densities for a given map ID <br/>
7) Check trending hits across Graphika data <br/>
8) Run Botometer for a given map ID <br/>


### Before getting started, run this cell to connect to the Graphika API
You may need to run this cell again to refresh your connection!

In [78]:
%run connect_to_api.ipynb

### Next, run this cell to get all the necessary functions in place

In [98]:
%run misc_functions.ipynb
%run tweepy_tweets_getter.ipynb
%run activity_reports.ipynb
%run bot_activity.ipynb
%run group_density_getter.ipynb
%run graphika_trending.ipynb
%run live_monitoring.ipynb
%run botometer.ipynb
%run hit_getter.ipynb

## Map Summary
### Run this cell to view a summary of a given map ID

In [87]:
get_map_summary()

## Tweet Search
### Run this cell to conduct your initial search
This search works the same way as Twitter's – you can refine your search results by using any combination of the fields below:

Tweets containing all words in any position (“Twitter” and “search”)  <br>
Tweets containing exact phrases (“Twitter search”)<br>
Tweets containing any of the words (“Twitter” or “search”)<br>
Tweets excluding specific words (“Twitter” but not “search”)<br>
Tweets with a specific hashtag (#twitter)<br>
Tweets in a specific language (written in English)<br>
<br>
Tweets from a specific account (Tweeted by “@TwitterComms”)<br>
Tweets sent as replies to a specific account (in reply to “@TwitterComms”)<br>
Tweets that mention a specific account (Tweet includes “@TwitterComms”)<br>
<br>
Tweets that are retweets or replies (filter:replies or filter:retweets)
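
If you would rather assemble the search term in code than type it by hand, here is a minimal sketch. `build_query` and its parameter names are hypothetical and are not part of this notebook; the operators it emits (`OR`, `-word`, `lang:`, `from:`, `to:`, `filter:`) are the standard Twitter search operators that the fields above correspond to.

```python
def build_query(all_words=(), exact_phrase=None, any_words=(), exclude_words=(),
                hashtags=(), lang=None, from_user=None, to_user=None,
                mentions=(), only=None):
    """Assemble a Twitter-style search string (hypothetical helper)."""
    parts = list(all_words)                      # words that must all appear
    if exact_phrase:
        parts.append(f'"{exact_phrase}"')        # exact phrase goes in quotes
    if any_words:
        parts.append("(" + " OR ".join(any_words) + ")")
    parts.extend(f"-{word}" for word in exclude_words)
    parts.extend(f"#{tag.lstrip('#')}" for tag in hashtags)
    if lang:
        parts.append(f"lang:{lang}")             # e.g. "en" for English
    if from_user:
        parts.append(f"from:{from_user.lstrip('@')}")
    if to_user:
        parts.append(f"to:{to_user.lstrip('@')}")
    parts.extend(f"@{user.lstrip('@')}" for user in mentions)
    if only in ("replies", "retweets"):
        parts.append(f"filter:{only}")
    return " ".join(parts)


# Reproduces the search term used in the example run in this notebook.
query = build_query(all_words=["trump"], only="replies")
```

The resulting string can be pasted at the `Enter your search term:` prompt.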

In [45]:
search_hit = input('Enter your search term: ')
search_limit = int(input('Enter how many 100s of tweets you want to retrieve: '))

search_result = search_tweets(search_hit, limit=search_limit)

Enter your search term: trump filter:replies
Enter how many 100s of tweets you want to retrieve: 1
..Fetching 100 tweets
...Done!


### Now that you have your search results, you can run this cell to see the table

In [55]:
search_result

### Or run this cell to print the results to a CSV!

In [56]:
print_csv(search_result)

### If you would like to cross-check your results with Graphika data, run this cell!
This cell reports how many unique nodes the search returned and how those nodes are distributed across Graphika's maps.

In [57]:
graphika_results = check_graphika_data(search_result)
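
To make the idea of the cross-check concrete, here is a rough sketch in plain Python. It is not the implementation of `check_graphika_data`; the function name `spread_across_maps` and the `map_memberships` structure (map ID mapped to the set of node IDs in that map) are assumptions made for illustration.

```python
def spread_across_maps(search_node_ids, map_memberships):
    """Count unique nodes in a search and how many of them fall in each map.

    search_node_ids: account IDs returned by the search (duplicates allowed).
    map_memberships: hypothetical dict of map_id -> set of node IDs in that map.
    """
    unique_nodes = set(search_node_ids)
    per_map = {}
    for map_id, members in map_memberships.items():
        overlap = unique_nodes & set(members)
        if overlap:                      # skip maps with no matching nodes
            per_map[map_id] = len(overlap)
    return len(unique_nodes), per_map


# Toy data: three distinct accounts, two maps containing some of them.
unique_count, per_map = spread_across_maps(
    [1, 1, 2, 3],
    {10: {1, 2}, 20: {3, 4}, 30: {9}},
)
```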

## Map Reports
Note that these reports may take some time to generate! However, if you have just run a report on a map ID, re-running it will take less time, as this tool caches the data.
### Run this cell to get a Map Activity Report
This report returns the number of tweets collected for a given map ID, aggregated by segment (tag, group, cluster, or account).

In [59]:
run_map_activity_report(debug = True)

>> Enter map id: 2434
...Map data found!
>> Please enter which to aggregate the data by – tag, group, cluster, or account: group


group_name,group_tweet_count
US Trump Support,3683
INT MSM Media | SMM,2043
US Left-Wing,600
INT Alt-Media | Other,579
HK | China,84
India,63


>> Do you want to save this result to a CSV? (y/n) 
y
Enter folder in your home directory in which to store the file: Downloads
Enter filename (no extension required): temp
temp.csv stored in /Users/avneeshchandra/Downloads!
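
The aggregation behind a table like the one above can be sketched as follows. This is an illustration only: `count_tweets_by` is a hypothetical helper, and it assumes each tweet record is a dict that carries its `tag`, `group`, `cluster`, and `account` labels.

```python
from collections import Counter

LEVELS = ("tag", "group", "cluster", "account")


def count_tweets_by(tweets, level="group"):
    """Return (segment, tweet_count) pairs, largest segment first."""
    if level not in LEVELS:
        raise ValueError(f"level must be one of {LEVELS}")
    counts = Counter(tweet[level] for tweet in tweets)
    return counts.most_common()


# Toy records; real data would come from the map's collected tweets.
tweets = [
    {"group": "US Left-Wing", "account": "a"},
    {"group": "India", "account": "b"},
    {"group": "US Left-Wing", "account": "c"},
]
by_group = count_tweets_by(tweets, level="group")
```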


### Run this cell to get a Feature Activity Report
This report returns: <br>
1) the hits from a given map that match a given search, and <br>
2) for each user, the number of their tweets that used the hit

In [89]:
run_feature_activity_report(debug = True)
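
When matching hits, the difference between a "contains" search and an "exact" search matters: a contains search for `maga` also matches `EthicMagazine`. The sketch below shows both modes and a per-user count; `matches` and `users_by_hit_usage` are hypothetical names, and the `(username, hit_value)` row layout is an assumption, not the report's actual data model.

```python
from collections import Counter


def matches(hit, term, exact=False, case_sensitive=False):
    """True if `hit` equals `term` (exact) or contains it (default)."""
    if not case_sensitive:
        hit, term = hit.lower(), term.lower()
    return hit == term if exact else term in hit


def users_by_hit_usage(rows, term, exact=False, case_sensitive=False):
    """Count matching tweets per user from (username, hit_value) pairs."""
    return Counter(
        user for user, hit in rows
        if matches(hit, term, exact, case_sensitive)
    )


rows = [("alice", "MAGA"), ("alice", "maga"), ("bob", "EthicMagazine")]
loose = users_by_hit_usage(rows, "maga")               # contains search
strict = users_by_hit_usage(rows, "maga", exact=True)  # exact search
```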

### Run this cell to get a Bot Activity Report
This report summarizes activity for a given set of node IDs (the seedlist), broken down by the hits you choose to examine.

In [77]:
run_bot_activity_report(debug = False)

>> Enter map ID: 2434
>> What type of hit would you like to explore? (hashtags,retweets,urls)
hashtags
>> Enter comma-separated seedlist: 1116667990060277760,763393864229093376
...Fetching map data
...Fetching hashtags data
...Getting map nodes
...Getting hits
...Querying database
...Morphing dataframe
...Merging nodes with hits
...Merging nodes with tags
...Done!
>> Is this search case sensitive? (y/n) 
y
>> Enter a comma-separated list of hits you would like to examine: Gates,help


hit,bot_tweets,total_tweets,bot_share_percent,US Trump Support_count,US Trump Support_share,INT Alt-Media | Other_count,INT Alt-Media | Other_share,INT MSM Media | SMM_count,INT MSM Media | SMM_share,India_count,India_share,HK | China_count,HK | China_share,US Left-Wing_count,US Left-Wing_share
Gates,0.0,7677911.0,0.0,2138.0,0.027846,452.0,0.005887,306.0,0.003985,202.0,0.002631,56.0,0.000729,18.0,0.000234
help,0.0,7677911.0,0.0,127.0,0.001654,48.0,0.000625,168.0,0.002188,246.0,0.003204,391.0,0.005093,28.0,0.000365


>> Do you want to save the bot activity results to a CSV? (y/n) 
y
Enter folder in your home directory in which to store the file: Downloads
Enter filename (no extension required): temp
temp.csv stored in /Users/avneeshchandra/Downloads!


### Run this cell to get a Full Activity Report
This experimental report returns engagement metrics for a given feature type on a given map.

In [100]:
run_full_activity_report(debug = True)

## Group Densities
### Run this cell to get group densities for a given map ID

In [69]:
get_group_density()

Please enter a map ID: 2232


group,nodes,arcs,density
Other Left-Leaning Media|MSM,3601,10521,0.081135
Anti-Trump,3007,222273,2.458215
Entertainment,2131,2834,0.062407
Pro-Trump,1883,66227,1.867818
Other,1322,3834,0.219376


>> Would you like to save these results to a CSV? (y/n) 
n
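
The notebook does not state how `density` is computed. Judging only from the sample output above, the values are consistent with `100 * arcs / nodes**2`, that is, arcs as a percentage of all possible ordered node pairs including self-pairs. Treat the sketch below as an inference from that output, not as the tool's documented formula; `group_density` is a hypothetical name.

```python
def group_density(nodes, arcs):
    """Arcs as a percentage of nodes**2 (formula inferred from sample output)."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return 100 * arcs / nodes ** 2


# Values taken from the Anti-Trump and Pro-Trump rows in the table above.
anti_trump = group_density(3007, 222273)
pro_trump = group_density(1883, 66227)
```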


## Graphika Trending
The following cells scan Graphika's live maps for trending hashtags, URLs, and media on Twitter

### Run this cell to search Graphika's live maps for trending Twitter hashtags, URLs, or media, and run the next one to see the results
The results contain the raw data returned from the hitcache.

In [None]:
trend_result = graphika_trending()

>> Enter how many days of trending hits you would like to pull: 1
>> Enter the type of hit you would like to explore (hashtags,urls,media): hashtags
...Looking at live maps:

...Computing map counts


hit_value,node_id,message_id,hit_time,map_id,cluster_id,hit_type
HomeisHere,129679132,1273293281774825473,2020-06-17 16:35:49,1077,35,hashtags
txlege,52754791,1273370073365954563,2020-06-17 21:40:58,1077,45,hashtags
txlege,52754791,1273382248994177025,2020-06-17 22:29:21,1077,45,hashtags
DACA,129679132,1273293281774825473,2020-06-17 16:35:49,1077,35,hashtags
ImmigrantHeritageMonth,129679132,1273294463612248065,2020-06-17 16:40:31,1077,35,hashtags
ImmigrantHeritageMonth,129679132,1273295965122789376,2020-06-17 16:46:29,1077,35,hashtags
NationalTakeOutNight,129679132,1273295965122789376,2020-06-17 16:46:29,1077,35,hashtags
DACA,129679132,1273401136179843072,2020-06-17 23:44:24,1077,35,hashtags
SCOTUS,129679132,1273401136179843072,2020-06-17 23:44:24,1077,35,hashtags
MPP,129679132,1273486015315931145,2020-06-18 05:21:40,1077,35,hashtags


...Done!
...Found the following hits:


In [39]:
trend_result

### Run this cell to get a summary of those results, and the next one to see the summary

In [193]:
trend_summary = get_top_x_trends(trend_result,20)

...Getting a summary of top 20 hits
...Done!


In [194]:
trend_summary

hit_value,hit_count,hit_share,tweet_count,tweet_share,map_count,node_count,node_share
coronavirus,10298,4.49,6422,6.38,24,1871,8.63
COVID19,7889,3.44,6438,6.39,24,3109,14.35
stop,6524,2.84,3135,3.11,5,2,0.01
WrestleMania,2590,1.13,2238,2.22,17,523,2.41
Coronavirus,1506,0.66,1259,1.25,24,803,3.71
Covid_19,1255,0.55,1142,1.13,24,851,3.93
Nowplaying,1007,0.44,536,0.53,4,3,0.01
dremtgi,1005,0.44,533,0.53,2,1,0.0
dremstuff,1005,0.44,533,0.53,2,1,0.0
CelebIOU,955,0.42,954,0.95,4,3,0.01
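
One plausible way to derive a summary like this from the raw trending rows is sketched below. The column definitions are assumptions, not taken from `get_top_x_trends`: `hit_count` is the number of raw rows for a hit, `tweet_count`, `map_count`, and `node_count` are distinct `message_id`, `map_id`, and `node_id` values, and each share is a percentage of the corresponding total. `summarize_trends` is a hypothetical name.

```python
from collections import defaultdict


def summarize_trends(rows, top_n=20):
    """Summarize raw hit rows (dicts with hit_value, node_id, message_id, map_id)."""
    rows = list(rows)
    total_hits = len(rows)
    total_tweets = len({row["message_id"] for row in rows})
    total_nodes = len({row["node_id"] for row in rows})

    per_hit = defaultdict(
        lambda: {"hits": 0, "tweets": set(), "maps": set(), "nodes": set()}
    )
    for row in rows:
        entry = per_hit[row["hit_value"]]
        entry["hits"] += 1
        entry["tweets"].add(row["message_id"])
        entry["maps"].add(row["map_id"])
        entry["nodes"].add(row["node_id"])

    def share(part, whole):
        # Percentage rounded to two decimals, as in the table above.
        return round(100 * part / whole, 2)

    summary = [
        {
            "hit_value": hit,
            "hit_count": entry["hits"],
            "hit_share": share(entry["hits"], total_hits),
            "tweet_count": len(entry["tweets"]),
            "tweet_share": share(len(entry["tweets"]), total_tweets),
            "map_count": len(entry["maps"]),
            "node_count": len(entry["nodes"]),
            "node_share": share(len(entry["nodes"]), total_nodes),
        }
        for hit, entry in per_hit.items()
    ]
    summary.sort(key=lambda item: item["hit_count"], reverse=True)
    return summary[:top_n]


# Toy rows shaped like the raw trending output.
sample = [
    {"hit_value": "DACA", "node_id": 1, "message_id": 100, "map_id": 1077},
    {"hit_value": "DACA", "node_id": 1, "message_id": 101, "map_id": 1077},
    {"hit_value": "SCOTUS", "node_id": 1, "message_id": 101, "map_id": 1077},
    {"hit_value": "txlege", "node_id": 2, "message_id": 102, "map_id": 1077},
]
top_two = summarize_trends(sample, top_n=2)
```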


### Run this cell to search all the returned hits for yourself

In [202]:
trend_search = search_trends(trend_result)

### Run this cell to get the summary of a specific hit

In [221]:
summarize_hit(trend_summary,trend_result)

>> Enter a hit to get its summary: BillGatesIsEvil
--------------------
BillGatesIsEvil SUMMARY
**This is not a top hit**
...Getting the metadata of accounts that appeared across multiple maps for this hit:
These accounts break down as follows:




tag,count
US Right,5
"[US Right, US Right]",1
SMM,1
Mainstream Media,1
Alt Media/Conspiracy,1


name,map_type,cluster_no,group_no,node_id,username,website,account_url,description,profile_image_url,account_created_at,cluster_id,coords_2d,coords_3d,radius,global_in_degree,global_out_degree,map_in_degree,map_out_degree,map_union_degree,num_tweets,verified,listed_count,favourites_count,protected,influencer,tag,map_id
Cathy,twitter,44,1,187782838,cathyspartanj,,twitter.com/cathyspartanj,"""All Posts Are Made By My Own Views Through S...",http://pbs.twimg.com/profile_images/6390195141...,2010-09-07 03:46:47,8099,"[0.6821035746585753, 0.147960778024705]","[-0.7380395104, -0.07234907230000001, -0.09513...",5.147158,26201,28748,1757,2045,2388,449875,False,137,220750,False,True,US Right,1518
STOP THE INDOCTRINATION IN SCHOOLS AND MEDIA,twitter,2,1,1849839132,6549lmartin,,twitter.com/6549lmartin,I am not a hyphenated American I AM AN AMERICA...,http://pbs.twimg.com/profile_images/1170664323...,2013-09-10 00:33:21,8108,"[0.7632190398116323, 0.142834348148285]","[-0.7493593208, -0.1526148852, -0.2259964893]",4.739009,34719,33844,1505,1538,2001,505426,False,173,67461,False,True,"[US Right, US Right]",1518
"C0RRUPTI0N, USA 🧨🔥",twitter,18,1,1122600787,C0RRUPTI0N_USA,http://corruption-usa.com,https://t.co/BEoH9PNgd4,"Library of Truth. Patriots, read the books! Fi...",,2013-01-26 17:08:16,45969,"[0.8124981147999999, -0.3726371756]","[-0.9101396996, 0.1898058237, -0.2487574404]",2.547064,80084,75129,353,140,355,33991,False,205,45803,False,True,US Right,2353
"Dr. Thomas Paul, Therapist #MindBody",twitter,56,6,1558861508,DrThomasPaul,https://www.PastLifeRegression.com/,https://t.co/oWIkOJ1sQa,"Past Life Regression Center® Founder, Therapis...",http://pbs.twimg.com/profile_images/1161002245...,2013-06-30 20:22:18,8081,"[0.09105220191017584, 0.08340582601371939]","[-0.1086599183, -0.2692692362, -0.1258668419]",4.018134,63689,55648,1296,574,1401,63664,False,680,29493,False,True,SMM,1518
Debsjj,twitter,8,9,2633885004,deborahj77,,twitter.com/deborahj77,Truth seeker. Hate the lies of governments and...,,2014-07-13 00:57:03,7951,"[0.1281801214056812, -0.1157184667511046]","[-0.2122072955, 0.132109283, 0.3157250742]",2.112634,1159,3093,114,72,115,6455,False,3,19310,False,False,Alt Media/Conspiracy,1527
Cathy,twitter,51,1,187782838,cathyspartanj,,twitter.com/cathyspartanj,"""All Posts Are Made By My Own Views Through S...",http://pbs.twimg.com/profile_images/6390195141...,2010-09-07 03:46:47,7952,"[0.8736339727129068, -0.04618723569962979]","[-0.8536242203, 0.0820569737, 0.1155109584]",3.882516,26201,28748,643,461,686,449875,False,137,220750,False,True,US Right,1527
🇺🇸LadyMacBeth 4 Trump🇺🇸,twitter,16,1,1747038559,ladymacbeth1212,,twitter.com/ladymacbeth1212,#USMC wife of👉🏻@SwampFox357❤️#Patriot #DrainTh...,,2013-09-07 13:01:39,8126,"[0.8554660254891051, -0.04353111337793847]","[-0.7117426453, 0.2525037322, -0.3412078462]",4.609105,18262,19542,1237,1719,1885,42706,False,89,24621,False,True,US Right,1518
"C0RRUPTI0N, USA 🧨🔥",twitter,21,1,1122600787,C0RRUPTI0N_USA,http://corruption-usa.com,https://t.co/BEoH9PNgd4,"Library of Truth. Patriots, read the books! Fi...",,2013-01-26 17:08:16,8094,"[0.5447981552425043, -0.06167604494263457]","[-0.5589948549, -0.3316457853, 0.0608927610000...",4.610242,80084,75129,1539,1506,1886,33991,False,205,45803,False,True,US Right,1518
"C0RRUPTI0N, USA 🧨🔥",twitter,9,17,1122600787,C0RRUPTI0N_USA,http://corruption-usa.com,https://t.co/BEoH9PNgd4,"Library of Truth. Patriots, read the books! Fi...",,2013-01-26 17:08:16,43801,"[0.3686093314000001, -0.1124847796]","[-0.1939072204, -0.5602304179, 0.354271068]",2.340836,80084,75129,354,2,354,33991,False,205,45803,False,True,Mainstream Media,2249


## Botometer
Run the following cell to run Botometer on a given map ID

In [126]:
run_botometer()

    Eventually, can we create a single file (or set of files) containing all passwords and API credentials?

    Tweet search: can we add logical operators like “AND” and “OR”?

    Print CSV: can we save to a Downloads folder rather than the Git repository? Otherwise we are likely to run into trouble eventually with people pushing CSVs.

    Print CSV: the tweet ID is automatically formatted in scientific notation. (I’ve gotten around this by formatting it as a string and writing to XLSX.)

    Saving search results to CSV needs screen names, cluster names, and map names. Talk to Rodrigo…. do you have this access?

    Feature Activity Report: we need a “contains” or “exact” search option; for example, maga currently matches EthicMagazine. Can we add cluster and group names to screen_names for the CSV export?
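
Two of the CSV notes above (saving outside the repository and keeping tweet IDs intact) could be addressed along these lines. This is only a sketch using the standard library: `save_rows_to_csv`, its parameters, and the `message_id` field name are assumptions, not the notebook's `print_csv`. Converting IDs to text while they are still exact integers avoids the precision loss that happens once an ID becomes a float; a spreadsheet application may still reformat the column when it opens the file.

```python
import csv
from pathlib import Path


def save_rows_to_csv(rows, filename, folder="Downloads",
                     id_fields=("message_id",), home=None):
    """Write dict rows to <home>/<folder>/<filename>.csv with ID fields as text."""
    rows = list(rows)
    if not rows:
        raise ValueError("nothing to write")
    target_dir = (Path(home) if home else Path.home()) / folder
    target_dir.mkdir(parents=True, exist_ok=True)
    path = target_dir / f"{filename}.csv"
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        for row in rows:
            cleaned = dict(row)
            for field in id_fields:
                if field in cleaned:
                    # Stringify while the value is still an exact int or str.
                    cleaned[field] = str(cleaned[field])
            writer.writerow(cleaned)
    return path


# Why floats are unsafe for tweet IDs: the round trip changes the value.
tweet_id = 1273293281774825473
survives_float = int(float(tweet_id)) == tweet_id
```

Calling `save_rows_to_csv(rows, "temp")` would write `temp.csv` to the Downloads folder in the user's home directory, mirroring the prompts shown in the report cells.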