# Welcome to the Analyst Tool notebook! 
You can use this notebook to: <br/>
1) Search for tweets based on a keyword, and cross-check your results with Graphika's map data <br/>
2) Run a Map Activity Report <br/>
3) Run a Feature Activity Report <br/>
4) Run a Bot Activity Report <br/>
5) Get group densities for a given map ID <br/>
6) Check trending hits across Graphika data


### Before getting started, run this cell to connect to the Graphika API
You may need to run this cell again to refresh your connection!

In [101]:
%run connect_to_api.ipynb

### Next, run this cell to load all the necessary functions

In [113]:
%run misc_functions.ipynb
%run twarc_tweets_getter.ipynb
%run activity_counter.ipynb
%run bot_activity.ipynb
%run group_density_getter.ipynb
%run graphika_trending.ipynb
%run hit_getter.ipynb

## Tweet Search
### Run this cell to conduct your initial search

In [13]:
search_hit = input('Enter your search term: ')
search_limit = int(input('Enter how many 100s of tweets you want to retrieve: '))

search_result = search_tweets(search_hit, limit=search_limit)

Enter your search term: hantavirus
Enter how many 100s of tweets you want to retrieve: 250
..Fetching 25000 tweets
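
The prompt asks for batches of 100, so entering 250 requests 25,000 tweets. Below is a minimal sketch of that conversion with basic input validation. `parse_search_limit` is a hypothetical helper for illustration; it is not part of `search_tweets`, whose internals are not shown in this notebook.

```python
def parse_search_limit(raw: str, batch_size: int = 100) -> int:
    """Convert the typed number of 100-tweet batches into a tweet count.

    Hypothetical helper: the batch size of 100 is taken from the prompt text.
    """
    batches = int(raw.strip())  # raises ValueError for non-numeric input
    if batches <= 0:
        raise ValueError("Enter a positive number of batches")
    return batches * batch_size


tweets_requested = parse_search_limit("250")
print(f"..Fetching {tweets_requested} tweets")
```

Validating the input before calling the search avoids starting a long fetch with a mistyped limit.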


### Now that you have your search results, you can run this cell to see the table

In [15]:
search_result

Unnamed: 0,retweet_screen_name,retweet_tweet_id,user_id,retweet,text,author,screen_name,tweet_id,time,mentions,hashtags,urls
0,sldatalay,1242473973209841664,3331527207,True,Korona bitti de sanki bi de #Hantavirus çıktı ...,Pervin Gürer Akarsu,GurerPervin,1242485365207896064,2020-03-24 16:16:09,[sldatalay],[Hantavirus],[]
1,DRealIlorinBoy,1242359936891437056,2457325615,True,Y'all tell these Chinese people to stop eating...,Fatal opiate,k_obryant,1242485364960514056,2020-03-24 16:16:09,[DRealIlorinBoy],[],[]
2,amnotasnitch,1242472521057751040,1130031192575795207,True,"Le corona virus se termine, les gens recommenc...",Ngatchoo,ngatchoo,1242485364045926405,2020-03-24 16:16:09,[amnotasnitch],[Hantavirus],[]
3,,,1011408872630833152,False,b'@noramohmed12 \xd8\xa8\xd8\xb5\xd9\x8a \xd9\...,Nor Adrenalin,AzzaHas65456610,1242485363550982145,2020-03-24 16:16:09,[noramohmed12],[],[]
4,,,1323033954,False,b'@LizzyKumcu I hope so hantavirus-20 kaldiram...,Nilgun.,Nilgunkanidagli,1242485363446181893,2020-03-24 16:16:09,[LizzyKumcu],[],[]
5,etv926,1242440912850112514,1157653348524470272,True,*un nouveau virus est apparu en Chine 🇨🇳 : le ...,Doun’s🦋,dounia_nouna,1242485363416821767,2020-03-24 16:16:09,[etv926],"[Hantavirus, COVIDー19]",[]
6,ShayanKiJawani,1242483825181106178,1241705581519527943,True,“Coronavirus” left the World Chating Group but...,wasi💔,Wasii_Yar,1242485363286749184,2020-03-24 16:16:09,[ShayanKiJawani],[],[]
7,prasetya_Lh,1242470330280497157,1098294704276303872,True,Hati2 kalian kalo pada liat orang balapan #han...,wakiki,MuhammadRiskiH6,1242485363148328961,2020-03-24 16:16:09,[prasetya_Lh],[hantavirus],[]
8,yencomgh,1242444304032968704,1050047500604850178,True,.@Efiaodo1 has revealed that the coronavirus m...,Kvng Morgan😎,kvng_morgan,1242485363051905025,2020-03-24 16:16:09,"[yencomgh, efiaodo1]",[],[]
9,,,1158807874190995456,False,b'@Gangle And there\xe2\x80\x99s the new hanta...,Thunderwave'54,Thunderwave5,1242485362590539781,2020-03-24 16:16:09,[Gangle],[],[]
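
Once the table is loaded, a few pandas one-liners give a quick overview. The sketch below uses a small hypothetical stand-in for `search_result` with the same column names as the table above, and assumes the `hashtags` column holds Python lists (if it holds strings, it would need parsing first).

```python
import pandas as pd

# Hypothetical stand-in for search_result; column names match the table above.
sample = pd.DataFrame({
    "screen_name": ["user_a", "user_b", "user_c", "user_d"],
    "retweet": [True, True, False, True],
    "hashtags": [["Hantavirus"], [], [], ["hantavirus", "COVID19"]],
})

# Share of rows that are retweets, as a percentage.
retweet_share = sample["retweet"].mean() * 100

# One row per hashtag, lower-cased so "Hantavirus" and "hantavirus" count together.
hashtag_counts = (
    sample["hashtags"]
    .explode()      # empty lists become NaN
    .dropna()
    .str.lower()
    .value_counts()
)

print(f"{retweet_share:.1f}% retweets")
print(hashtag_counts)
```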


### Or run this cell to print the results to a CSV!

In [16]:
print_csv(search_result)

Enter filename: hantavirus_25100
hantavirus_25100.csv stored to the same directory as this file!


### If you would like to cross-check your results with Graphika data, run this cell!
This cell reports how many unique nodes were returned and how those nodes are distributed across Graphika's maps.

In [17]:
check_graphika_data(search_result)

From this search, 20605 nodes were returned
**Permission denied for map 1992**
**Permission denied for map 2393**
**Permission denied for map 2147**
The map that have the most of these nodes are 
Unknown 
Unknown 
and Unknown


Unnamed: 0,count,percent
1992,775,7.660374
2393,444,4.388653
2147,440,4.349115
2305,401,3.963626
2348,353,3.489177
2222,344,3.400217
1991,300,2.965306
2118,290,2.866462
1221,244,2.411782
2347,240,2.372245
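
The `percent` values above are consistent with each `count` divided by a common total of about 10,117 (for example, 775 / 10,117 ≈ 7.66%); what that total represents is not stated in this notebook. The sketch below shows one way such a count-and-percent table could be built from node-to-map memberships. The data and the choice of denominator (total memberships) are assumptions for illustration, not the implementation of `check_graphika_data`.

```python
from collections import Counter

# Hypothetical (node_id, map_id) memberships; real data would come from the Graphika API.
memberships = [
    (3081895959, 1221),
    (55050265, 2324),
    (213254229, 2347),
    (213254229, 1221),
    (9892892, 2347),
    (9892892, 2347),  # duplicate row, counted once
]

# De-duplicate so each node is counted at most once per map.
unique_pairs = sorted(set(memberships))
unique_nodes = {node_id for node_id, _ in unique_pairs}

nodes_per_map = Counter(map_id for _, map_id in unique_pairs)
total_memberships = sum(nodes_per_map.values())  # assumed denominator

print(f"{len(unique_nodes)} unique nodes")
for map_id, count in nodes_per_map.most_common():
    print(map_id, count, round(100 * count / total_memberships, 6))
```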


>> Would you like to save a table of which maps these nodes appear in? (y/n) 
y
Enter filename: hantavirusnodes
hantavirusnodes.csv stored to the same directory as this file!


map_id,node_id,message_id,hit_time,cluster_id,date,hit_type
1221,3081895959,1102290233985585153,2019-03-03 19:30:46,7,03-Mar-2019 1900,nodes
1221,3081895959,1102543791871922176,2019-03-04 12:18:18,7,04-Mar-2019 1200,nodes
1221,3081895959,1103225350765625344,2019-03-06 09:26:35,7,06-Mar-2019 0900,nodes
1221,3081895959,1103226050388082689,2019-03-06 09:29:22,7,06-Mar-2019 0900,nodes
2324,55050265,1100883422116040706,2019-02-27 22:20:35,26,27-Feb-2019 2200,nodes
2347,213254229,1101008480733941760,2019-02-28 06:37:32,0,28-Feb-2019 0600,nodes
2347,213254229,1101054667566604289,2019-02-28 09:41:04,0,28-Feb-2019 0900,nodes
2347,213254229,1101067275652743168,2019-02-28 10:31:10,0,28-Feb-2019 1000,nodes
2347,213254229,1101073229450481665,2019-02-28 10:54:49,0,28-Feb-2019 1000,nodes
2347,213254229,1101077342372483072,2019-02-28 11:11:10,0,28-Feb-2019 1100,nodes


## Map Reports
Note that these reports may take some time to generate!
### Run this cell to get a Map Activity Report

In [70]:
run_map_activity_report(debug=True)

>> Enter map id: 2232
...Fetching map data
...Getting map nodes
...Getting hits
...Querying database
...Morphing dataframe
...Merging nodes with hits
...Done!
count by group, cluster, or account: group


group_name,group_tweet_count
Anti-Trump,611
Other Left-Leaning Media|MSM,279
Entertainment,110


>> Do you want to save this result to a CSV? (y/n) 
n
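
The log lines above mention merging nodes with hits before counting. A minimal pandas sketch of that kind of merge-and-count is shown below. The two DataFrames and their columns are hypothetical; only `group_name` and `group_tweet_count` are taken from the report output, and the actual logic of `run_map_activity_report` may differ.

```python
import pandas as pd

# Hypothetical map nodes with their group labels.
nodes = pd.DataFrame({
    "node_id": [1, 2, 3],
    "group_name": ["Anti-Trump", "Entertainment", "Anti-Trump"],
})

# Hypothetical hits: one row per tweet. Node 4 is not on the map.
hits = pd.DataFrame({
    "node_id": [1, 1, 2, 3, 3, 3, 4],
    "message_id": [101, 102, 103, 104, 105, 106, 107],
})

# Keep only hits from nodes that belong to the map, then count tweets per group.
merged = hits.merge(nodes, on="node_id", how="inner")
group_counts = (
    merged.groupby("group_name")
    .size()
    .sort_values(ascending=False)
    .rename("group_tweet_count")
)

print(group_counts)
```

Grouping by a different column (for example a cluster or account column) would give the other two views offered by the prompt.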


### Run this cell to get a Feature Activity Report

In [27]:
run_feature_activity_report(debug=True)

>> Enter map id: 2232
>> Search for hashtags, urls, or retweets: hashtags
>> Is this search case sensitive? (y/n) 
n
>> Comma separate search parameters, or hit enter for all: 
...Searching hashtags for <>
>> Do you want to save the search results to a CSV? (y/n)n


screen_name,number_of_tweets
Rodstyme,100


>> Do you want to save the above table to a CSV? (y/n)n
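
The prompts above combine three options: a hit type, a case-sensitivity flag, and an optional comma-separated list of terms. The sketch below illustrates how those options could interact for hashtags. `count_hashtag_tweets` and the sample tweets are hypothetical, and treating an empty term list as "match every tweet that has a hashtag" is an assumption about what "hit enter for all" means.

```python
from collections import Counter


def count_hashtag_tweets(tweets, terms=(), case_sensitive=False):
    """Count tweets per screen_name whose hashtags match any search term."""
    def norm(text):
        return text if case_sensitive else text.lower()

    # Strip whitespace left over from splitting "a, b" on commas.
    wanted = {norm(term.strip()) for term in terms if term.strip()}
    counts = Counter()
    for tweet in tweets:
        tags = {norm(tag) for tag in tweet["hashtags"]}
        # No terms given: count any tweet that carries at least one hashtag.
        if tags and (not wanted or tags & wanted):
            counts[tweet["screen_name"]] += 1
    return counts


# Hypothetical tweets for illustration.
tweets = [
    {"screen_name": "Rodstyme", "hashtags": ["BoycottHallmark"]},
    {"screen_name": "Rodstyme", "hashtags": ["hallmark", "News"]},
    {"screen_name": "example_user", "hashtags": []},
    {"screen_name": "example_user", "hashtags": ["Sports"]},
]

all_counts = count_hashtag_tweets(tweets)
hallmark_counts = count_hashtag_tweets(tweets, "boycotthallmark, hallmark".split(","))
strict_counts = count_hashtag_tweets(tweets, ["boycotthallmark"], case_sensitive=True)
print(all_counts, hallmark_counts, strict_counts)
```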


### Run this cell to get a Bot Activity Report

In [3]:
run_bot_activity_report(debug=True)

>> Enter map ID: 2232
>> What type of hit would you like to explore? (hashtags,retweets,urls)
hashtags
>> Is this search case sensitive? (y/n) 
n
>> Enter a comma-separated list of hits you would like to examine: boycotthallmark, hallmark


Unnamed: 0,bot_tweets,total_tweets,bot_share_percent,Other Left-Leaning Media|MSM_count,Other Left-Leaning Media|MSM_share,Anti-Trump_count,Anti-Trump_share
boycotthallmark,11.0,1000.0,1.1,6.0,0.6,5.0,0.5
hallmark,11.0,1000.0,1.1,6.0,0.6,5.0,0.5


>> Do you want to save the bot activity results to a CSV? (y/n) 
n
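
The share columns in the table above follow directly from the counts: 11 bot tweets out of 1,000 total is 1.1%, and the per-group shares (0.6 and 0.5) are each group's bot tweets over the same total. A small sketch of that arithmetic, where `share_percent` is a hypothetical helper name:

```python
def share_percent(part: float, total: float) -> float:
    """Return part as a percentage of total, or 0.0 when total is zero."""
    return 0.0 if total == 0 else 100 * part / total


total_tweets = 1000
bot_tweets = 11
group_bot_counts = {"Other Left-Leaning Media|MSM": 6, "Anti-Trump": 5}

bot_share = share_percent(bot_tweets, total_tweets)
group_shares = {
    group: share_percent(count, total_tweets)
    for group, count in group_bot_counts.items()
}

print(bot_share, group_shares)
```

Note that the group counts sum to the overall bot count (6 + 5 = 11), so the group shares sum to the overall bot share.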


## Group Densities
### Run this cell to get group densities for a given map ID

In [69]:
get_group_density()

Please enter a map ID: 2232


Unnamed: 0,nodes,arcs,density
Other Left-Leaning Media|MSM,3601,10521,0.081135
Anti-Trump,3007,222273,2.458215
Entertainment,2131,2834,0.062407
Pro-Trump,1883,66227,1.867818
Other,1322,3834,0.219376


>> Would you like to save these results to a CSV? (y/n) 
n
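
The `density` values in the table above are consistent with 100 × arcs / nodes², that is, the percentage of all possible directed node pairs within a group that are connected. This formula is inferred from the numbers shown, not confirmed by the source of `get_group_density`. A short sketch that reproduces the table under that assumption:

```python
def group_density(nodes: int, arcs: int) -> float:
    """Density as a percentage of nodes**2 (assumed formula, inferred from the table)."""
    if nodes == 0:
        return 0.0
    return 100 * arcs / nodes ** 2


# (nodes, arcs) pairs copied from the table above.
groups = {
    "Other Left-Leaning Media|MSM": (3601, 10521),
    "Anti-Trump": (3007, 222273),
    "Entertainment": (2131, 2834),
    "Pro-Trump": (1883, 66227),
    "Other": (1322, 3834),
}

for name, (nodes, arcs) in groups.items():
    print(name, round(group_density(nodes, arcs), 6))
```

Under this reading, Anti-Trump is by far the most tightly connected group: it has fewer nodes than Other Left-Leaning Media|MSM but roughly twenty times as many arcs.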


## Graphika Trending
The following cells scan Graphika's live maps for trending hashtags, URLs, and media on Twitter

### Run this cell to search Graphika's live maps for trending Twitter hashtags, URLs, or media, and run the next one to see the results

In [102]:
trend_result = graphika_trending()

>> Enter how many days of trending hits you would like to pull: 1
>> Enter the type of hit you would like to explore (hashtags,urls,media): hashtags
...Looking at live maps:
NFL_Live_Landscape_v2
India General Political Landscape TW 2019
Hot97_Landscape
NFL_Fan_Landscape_V3
Iran_Politics_Culture
Philippines_ABS-CBN_Media
JPA_Immunology
Climate Change Combined Landscape
NCCIH 2020
Taiwan_Landscape_2019
M1948 ZP_2019RussianProtests
Oncology_2019_1554228412
Emotional_Rescue_Mentions_2
...Computing map counts
...Done!
>> Do you want to save these results to a CSV? (y/n) 
n


In [103]:
trend_result

hit_value,node_id,message_id,hit_time,map_id,cluster_id,hit_type,map_id_count
NBATwitter,1039224027343204352,1242831110758903809,2020-03-25 15:10:02,1644,51,hashtags,1
StayHomeWorkSafe,78102678,1242810712423985152,2020-03-25 13:48:58,1644,19,hashtags,1
BeKind,78102678,1242810712423985152,2020-03-25 13:48:58,1644,19,hashtags,1
thefitlabway,387416780,1242786301000237064,2020-03-25 12:11:58,1644,52,hashtags,1
Redskins,387416780,1242788837354545154,2020-03-25 12:22:03,1644,52,hashtags,1
Raiders,24620756,1242912945169821696,2020-03-25 20:35:12,1644,39,hashtags,1
RaiderNation,24620756,1242927813893574656,2020-03-25 21:34:17,1644,39,hashtags,1
WentzDay,43034742,1242874516398882818,2020-03-25 18:02:30,1644,3,hashtags,1
KnowYourWorth,278367652,1242901340961091584,2020-03-25 19:49:06,1644,46,hashtags,1
TwitterMomentsOfTheDecade,459752491,1242800314425790464,2020-03-25 13:07:39,1644,1,hashtags,2
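
Because the result is indexed by `hit_value`, simple pandas aggregations can rank hits directly. The sketch below counts rows and distinct nodes per hit on a small hypothetical table with the same index and column names. It is an independent illustration and not the logic of `get_top_x_trends`, which this notebook does not show.

```python
import pandas as pd

# Hypothetical stand-in for trend_result, indexed by hit_value.
trend_sample = pd.DataFrame(
    {
        "node_id": [1, 2, 2, 3, 3],
        "map_id": [1644, 1644, 1644, 1644, 1527],
    },
    index=pd.Index(
        ["Raiders", "Raiders", "BeKind", "Raiders", "BeKind"], name="hit_value"
    ),
)

# Number of rows (hits) per hashtag.
hit_counts = trend_sample.index.value_counts()

# Number of distinct accounts using each hashtag.
nodes_per_hit = (
    trend_sample.groupby(level="hit_value")["node_id"]
    .nunique()
    .sort_values(ascending=False)
)

print(hit_counts)
print(nodes_per_hit)
```

Counting distinct nodes rather than rows keeps a single prolific account from dominating the ranking.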


### Run this cell to get a summary of those results, and the next one to see the summary

In [None]:
trend_summary = get_top_x_trends(trend_result)

In [None]:
trend_summary

### Run this cell to get the summary of a specific hit

In [124]:
temp1 = summarize_hit(trend_summary, trend_result)

# Count how many of the selected hit's rows fall in each map
temp1.map_id.value_counts()

1527    64
2353    20
2225     7
1644     6
2249     3
1948     1
Name: map_id, dtype: int64

### Run this cell to search all the returned hits for yourself

In [114]:
temp = search_trends(trend_result)

>> Enter term to search: ibelievetara
...Searching results
...Done!


hit_value,node_id,message_id,hit_time,map_id,cluster_id,hit_type,map_id_count
IBelieveTara,387261715,1242985754474745857,2020-03-26 01:24:32,1644,40,hashtags,1
IBelieveTara,9892892,1242930367092240384,2020-03-25 21:44:26,1644,35,hashtags,1
IBelieveTara,33427412,1242919151322234881,2020-03-25 20:59:52,1644,48,hashtags,2
IBelieveTara,24718643,1242941557679980547,2020-03-25 22:28:54,1644,59,hashtags,2
IBelieveTara,24718643,1242941746390224896,2020-03-25 22:29:39,1644,59,hashtags,2
IBelieveTara,251811933,1242942122753265665,2020-03-25 22:31:09,1644,53,hashtags,1
IBelieveTara,292164590,1242910820025225217,2020-03-25 20:26:46,1644,12,hashtags,2
IBelieveTara,292164590,1242948083543158784,2020-03-25 22:54:50,1644,12,hashtags,2
IBelieveTara,292164590,1242949109134299136,2020-03-25 22:58:55,1644,12,hashtags,2
IBelieveTara,292164590,1243024678869966848,2020-03-26 03:59:12,1644,12,hashtags,2


>> Do you want to save these results to a CSV? (y/n) 
n
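
In the run above, the lowercase term `ibelievetara` returned rows indexed `IBelieveTara`, so the search appears to ignore case. A minimal pandas sketch of a case-insensitive filter on the `hit_value` index follows. `filter_hits` and the sample data are hypothetical, and whether `search_trends` matches substrings or whole values is not shown in this notebook; the sketch uses substring matching.

```python
import pandas as pd


def filter_hits(trends: pd.DataFrame, term: str) -> pd.DataFrame:
    """Return rows whose hit_value index contains term, ignoring case."""
    # regex=False treats the term as literal text rather than a pattern.
    mask = trends.index.str.contains(term, case=False, regex=False)
    return trends[mask]


# Hypothetical stand-in for trend_result.
trend_sample = pd.DataFrame(
    {"node_id": [1, 2, 3], "map_id": [1644, 1644, 1527]},
    index=pd.Index(["IBelieveTara", "ibelievetara", "Raiders"], name="hit_value"),
)

matches = filter_hits(trend_sample, "ibelievetara")
print(matches)
```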


In [118]:
temp.node_id.value_counts()

588229029              28
90803598               15
19509858               14
292164590              10
1193635879018999810    10
189806241              10
216123608               4
24718643                4
31310187                4
3685247237              4
48219360                3
3680608109              3
761587775225274368      3
220985647               2
363240926               2
1169438480              2
612197548               2
4147851292              2
25393191                2
33427412                2
823998756655861760      2
177584156               2
14962178                2
15326206                2
3586157234              2
59613930                2
913101799686529024      2
4786069640              2
987340978586402817      2
245084253               2
                       ..
2473143948              1
9892892                 1
350913291               1
387261715               1
314591392               1
2312532377              1
805061239               1
1187164748  

In [130]:
tar = temp1.map_id.unique()

In [131]:
for t in tar:
    print(get_map_name(t))

Emotional_Rescue_Mentions_2
Climate Change Combined Landscape
NFL_Fan_Landscape_V3
NFL_Live_Landscape_v2
Hot97_Landscape
M1948 ZP_2019RussianProtests
