# Twitter Data Analysis

In this lesson, we're going to learn how to analyze and explore Twitter data with Pandas and the Python/command line tool [twarc](https://github.com/DocNow/twarc). This tool was developed by a project called [Documenting the Now](https://www.docnow.io/). The DocNow team develops tools and ethical frameworks for social media research.

This lesson presumes that you've already installed and configured twarc and collected some Twitter data (covered in the previous lesson).

## Twarc

**Search**

Search for tweets from verified accounts that mentioned "Infinite Jest" in the last week or so

`twarc search "Infinite Jest filter:verified" > twitter-data/infinite_jest_verified_tweets.jsonl`

**JSON to CSV**

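Convert the .jsonl file to a .csv file. The `--extra-field` option adds an `rt_text` column that holds the full text of the retweeted tweet.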

In [1]:
!python twarc/utils/json2csv.py --extra-field rt_text retweeted_status.full_text twitter-data/infinite_jest_verified_tweets.jsonl > twitter-data/infinite_jest_verified_tweets.csv
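To make the conversion less of a black box, here is a minimal sketch of the general idea behind flattening JSON Lines into CSV, using only the Python standard library. It is not twarc's `json2csv.py`; the two sample tweets and the three columns chosen are made up for illustration.

```python
import csv
import io
import json

# Two made-up tweets in JSON Lines form: one JSON object per line.
sample_jsonl = "\n".join([
    json.dumps({"id": 1, "full_text": "first tweet", "user": {"screen_name": "alice"}}),
    json.dumps({"id": 2, "full_text": "second tweet", "user": {"screen_name": "bob"}}),
])

# Write one CSV row per tweet, pulling nested values up into flat columns.
output = io.StringIO()
writer = csv.writer(output)
writer.writerow(["id", "user_screen_name", "text"])
for line in sample_jsonl.splitlines():
    tweet = json.loads(line)
    writer.writerow([tweet["id"], tweet["user"]["screen_name"], tweet["full_text"]])

csv_text = output.getvalue()
print(csv_text)
```

The nested `user.screen_name` value becomes its own flat column, which is the same kind of reshaping that produces columns such as `user_screen_name` in the lesson's CSV file.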

## Read in Tweet CSV files with Pandas

**Import Pandas**

In [3]:
import pandas as pd

**Set Pandas Display Options**

Set Pandas display options so columns are wider and more columns are visible

In [4]:
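# Use the full option names; short names like 'max_columns' can be ambiguous in newer Pandas versions
pd.set_option('display.max_colwidth', 5000)
pd.set_option('display.max_columns', 40)
pd.set_option('display.max_rows', 100)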

**Read in CSV file**

In [5]:
infinite_jest_df = pd.read_csv('twitter-data/infinite_jest_verified_tweets.csv')
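One thing to watch for when reading tweet data: in the preview further down, `in_reply_to_status_id` appears as `1.298277e+18`. Columns with missing values are read as floating-point numbers by default, and floats cannot hold every digit of a 19-digit tweet ID. A minimal sketch of one way around this, assuming you want the IDs as text, is to pass `dtype` to `pd.read_csv`. The two-row CSV here is made up for illustration.

```python
import io
import pandas as pd

# Made-up two-row CSV: the second tweet is not a reply, so its reply ID is empty.
sample_csv = io.StringIO(
    "id,in_reply_to_status_id,text\n"
    "1298279565932900354,1298277000000000123,a reply\n"
    "1298267213544058885,,not a reply\n"
)

# Reading ID columns as strings keeps every digit; missing values stay NaN.
id_df = pd.read_csv(sample_csv, dtype={"id": str, "in_reply_to_status_id": str})
print(id_df["in_reply_to_status_id"].tolist())
```

The same `dtype` argument could be added to the `pd.read_csv` call above if you plan to look tweets up by ID later.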

**Check Columns**

Check what Twitter metadata exists in this CSV file

In [7]:
infinite_jest_df.columns

Index(['id', 'tweet_url', 'created_at', 'parsed_created_at',
       'user_screen_name', 'text', 'tweet_type', 'coordinates', 'hashtags',
       'media', 'urls', 'favorite_count', 'in_reply_to_screen_name',
       'in_reply_to_status_id', 'in_reply_to_user_id', 'lang', 'place',
       'possibly_sensitive', 'retweet_count', 'retweet_or_quote_id',
       'retweet_or_quote_screen_name', 'retweet_or_quote_user_id', 'source',
       'user_id', 'user_created_at', 'user_default_profile_image',
       'user_description', 'user_favourites_count', 'user_followers_count',
       'user_friends_count', 'user_listed_count', 'user_location', 'user_name',
       'user_statuses_count', 'user_time_zone', 'user_urls', 'user_verified',
       'rt_text'],
      dtype='object')

As you can see above, there is a *lot* of metadata that comes with every tweet!

**Check Shape**

Check the size of the DataFrame (number of rows = number of tweets)

In [8]:
infinite_jest_df.shape

(197, 38)
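Beyond the overall row count, a natural next question is how the tweets break down by category. The sketch below counts values in a `tweet_type` column with `value_counts()`. It uses a small made-up DataFrame as a stand-in for `infinite_jest_df`.

```python
import pandas as pd

# Made-up stand-in for the lesson's DataFrame.
demo_df = pd.DataFrame(
    {"tweet_type": ["retweet", "reply", "retweet", "original", "retweet"]}
)

# Count how many tweets fall into each category, most common first.
type_counts = demo_df["tweet_type"].value_counts()
print(type_counts)
```

Running `infinite_jest_df['tweet_type'].value_counts()` would give the same kind of summary for the real data.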

**Preview DataFrame**

In [9]:
infinite_jest_df.head()

Unnamed: 0,id,tweet_url,created_at,parsed_created_at,user_screen_name,text,tweet_type,coordinates,hashtags,media,urls,favorite_count,in_reply_to_screen_name,in_reply_to_status_id,in_reply_to_user_id,lang,place,possibly_sensitive,retweet_count,retweet_or_quote_id,retweet_or_quote_screen_name,retweet_or_quote_user_id,source,user_id,user_created_at,user_default_profile_image,user_description,user_favourites_count,user_followers_count,user_friends_count,user_listed_count,user_location,user_name,user_statuses_count,user_time_zone,user_urls,user_verified,rt_text
0,1298279565932900354,https://twitter.com/thememorypalace/status/1298279565932900354,Tue Aug 25 15:22:24 +0000 2020,2020-08-25 15:22:24+00:00,thememorypalace,"@JustinCChang Avoid the man who wants to put up a bookshelf solely for his copy of Infinite Jest but can't because, his Wallace? Not big enough.",reply,,,,,4,JustinCChang,1.298277e+18,96683173.0,en,,,0,,,,"<a href=""https://mobile.twitter.com"" rel=""nofollow"">Twitter Web App</a>",60200577,Sun Jul 26 01:38:52 +0000 2009,False,"Nate DiMeo is the creator of The Memory Palace, the world's only podcast. https://t.co/S0LyGTdXFg",11856,18821,580,417,Los Angeles,Nate DiMeo,8371,,http://www.thememorypalace.us,True,
1,1298267213544058885,https://twitter.com/JeremyDuns/status/1298267213544058885,Tue Aug 25 14:33:19 +0000 2020,2020-08-25 14:33:19+00:00,JeremyDuns,"Top 7 Warning Signs in a Man's Bookshelf:\n\n1. List of secret military launch codes\n2. Nest of angry hornets\n3. Vial of anthrax\n4. Commemorative ""1 million subscribers"" YouTube plaque\n5. Too much taxidermy\n6. Any issue of the Yale Daily News\n7. Infinite Jest https://t.co/Gct2JGoJ4I",retweet,,,,,105,,,,en,,,9,1.298016e+18,virgil_30,1897940000.0,"<a href=""https://mobile.twitter.com"" rel=""nofollow"">Twitter Web App</a>",37669898,Mon May 04 14:25:33 +0000 2009,False,"Author of Free Agent, Song of Treason, The Moscow Option, Spy Out The Land and non-fiction Dead Drop (Codename: HERO in the US).",29977,16473,2193,297,Åland,Jeremy Duns,102595,,https://www.jeremy-duns.com,True,"Top 7 Warning Signs in a Man's Bookshelf:\n\n1. List of secret military launch codes\n2. Nest of angry hornets\n3. Vial of anthrax\n4. Commemorative ""1 million subscribers"" YouTube plaque\n5. Too much taxidermy\n6. Any issue of the Yale Daily News\n7. Infinite Jest https://t.co/Gct2JGoJ4I"
2,1298263845320798212,https://twitter.com/juliareinstein/status/1298263845320798212,Tue Aug 25 14:19:55 +0000 2020,2020-08-25 14:19:55+00:00,juliareinstein,"""all boys love infinite jest"" is just a misconception. all boys have a copy of infinite jest due to its large size, which makes it the ideal book to hollow out and use to hide a copy of Louis Sachar's 1998 masterpiece Holes",retweet,,,,,9228,,,,en,,,711,1.298107e+18,bugposting,630523200.0,"<a href=""http://twitter.com/download/iphone"" rel=""nofollow"">Twitter for iPhone</a>",219718241,Thu Nov 25 17:35:57 +0000 2010,False,reporter at @BuzzFeedNews. pro-union and pro-Union Pool. this is a Fleetwood Mac stan account\n\ntell me things: julia.reinstein@buzzfeed.com or DM (she/her),25804,29592,1491,349,"Brooklyn, NY",julia reinstein 🚡,48073,,https://www.buzzfeednews.com/author/juliareinstein,True,"""all boys love infinite jest"" is just a misconception. all boys have a copy of infinite jest due to its large size, which makes it the ideal book to hollow out and use to hide a copy of Louis Sachar's 1998 masterpiece Holes"
3,1298258323364405251,https://twitter.com/AditiJuneja3/status/1298258323364405251,Tue Aug 25 13:57:59 +0000 2020,2020-08-25 13:57:59+00:00,AditiJuneja3,"Top 7 Warning Signs In a Man's Bookshelf:\n1. A Dog-eared copy of Infinite Jest\n2. Too Much Hemingway\n3. Any amount of Bukowski\n4. AYN. RAND.\n5. Goethe\n6. ""Lolita is my favorite book.""\n7. ""'Fathers and Sons' Is my favorite book.""",retweet,,,,,16764,,,,en,,,1929,1.297909e+18,MchughJess,2575336000.0,"<a href=""http://twitter.com/download/iphone"" rel=""nofollow"">Twitter for iPhone</a>",545630206,Thu Apr 05 02:41:49 +0000 2012,False,Lawyer. Writer. Organizer. Personal opinions. Work @protctdemocracy. Host/creator @selfcaresundays. Board @disabrightsfund. @ForbesUnder30 @GlobalShapers.,97595,18071,1130,253,"New York, NY",Aditi Juneja,101991,,http://www.aditijuneja.me,True,"Top 7 Warning Signs In a Man's Bookshelf:\n1. A Dog-eared copy of Infinite Jest\n2. Too Much Hemingway\n3. Any amount of Bukowski\n4. AYN. RAND.\n5. Goethe\n6. ""Lolita is my favorite book.""\n7. ""'Fathers and Sons' Is my favorite book."""
4,1298256212849393664,https://twitter.com/ZJemptv/status/1298256212849393664,Tue Aug 25 13:49:36 +0000 2020,2020-08-25 13:49:36+00:00,ZJemptv,"don’t avoid reading infinite jest because it’s a signifier of a certain type of dude’s aesthetic taste and academic literature circle-jerkery, avoid reading it because david foster wallace assaulted mary karr",retweet,,,,,662,,,,en,,,69,1.298008e+18,aedison,1471021.0,"<a href=""http://twitter.com/download/android"" rel=""nofollow"">Twitter for Android</a>",5780032,Sat May 05 02:24:35 +0000 2007,False,Orlando trans activist and science writer. Gender Analysis. NSFW. Wife of @ADubiousPronoun. https://t.co/RgoLBGZiGL zjemptv@gmail.com,58545,22495,6628,500,"Orlando, FL","Zinnia, adult demon female",149874,,https://genderanalysis.net/,True,"don’t avoid reading infinite jest because it’s a signifier of a certain type of dude’s aesthetic taste and academic literature circle-jerkery, avoid reading it because david foster wallace assaulted mary karr"


## Filter Twitter Data

Select a subset of columns and preview the first 100 rows

In [11]:
infinite_jest_df[['created_at', 'tweet_type', 'media', 'text', 'rt_text', 'retweet_count', 'urls', 'user_name', 'user_location', 'hashtags']].head(100)

Unnamed: 0,created_at,tweet_type,media,text,rt_text,retweet_count,urls,user_name,user_location,hashtags
0,Tue Aug 25 15:22:24 +0000 2020,reply,,"@JustinCChang Avoid the man who wants to put up a bookshelf solely for his copy of Infinite Jest but can't because, his Wallace? Not big enough.",,0,,Nate DiMeo,Los Angeles,
1,Tue Aug 25 14:33:19 +0000 2020,retweet,,"Top 7 Warning Signs in a Man's Bookshelf:\n\n1. List of secret military launch codes\n2. Nest of angry hornets\n3. Vial of anthrax\n4. Commemorative ""1 million subscribers"" YouTube plaque\n5. Too much taxidermy\n6. Any issue of the Yale Daily News\n7. Infinite Jest https://t.co/Gct2JGoJ4I","Top 7 Warning Signs in a Man's Bookshelf:\n\n1. List of secret military launch codes\n2. Nest of angry hornets\n3. Vial of anthrax\n4. Commemorative ""1 million subscribers"" YouTube plaque\n5. Too much taxidermy\n6. Any issue of the Yale Daily News\n7. Infinite Jest https://t.co/Gct2JGoJ4I",9,,Jeremy Duns,Åland,
2,Tue Aug 25 14:19:55 +0000 2020,retweet,,"""all boys love infinite jest"" is just a misconception. all boys have a copy of infinite jest due to its large size, which makes it the ideal book to hollow out and use to hide a copy of Louis Sachar's 1998 masterpiece Holes","""all boys love infinite jest"" is just a misconception. all boys have a copy of infinite jest due to its large size, which makes it the ideal book to hollow out and use to hide a copy of Louis Sachar's 1998 masterpiece Holes",711,,julia reinstein 🚡,"Brooklyn, NY",
3,Tue Aug 25 13:57:59 +0000 2020,retweet,,"Top 7 Warning Signs In a Man's Bookshelf:\n1. A Dog-eared copy of Infinite Jest\n2. Too Much Hemingway\n3. Any amount of Bukowski\n4. AYN. RAND.\n5. Goethe\n6. ""Lolita is my favorite book.""\n7. ""'Fathers and Sons' Is my favorite book.""","Top 7 Warning Signs In a Man's Bookshelf:\n1. A Dog-eared copy of Infinite Jest\n2. Too Much Hemingway\n3. Any amount of Bukowski\n4. AYN. RAND.\n5. Goethe\n6. ""Lolita is my favorite book.""\n7. ""'Fathers and Sons' Is my favorite book.""",1929,,Aditi Juneja,"New York, NY",
4,Tue Aug 25 13:49:36 +0000 2020,retweet,,"don’t avoid reading infinite jest because it’s a signifier of a certain type of dude’s aesthetic taste and academic literature circle-jerkery, avoid reading it because david foster wallace assaulted mary karr","don’t avoid reading infinite jest because it’s a signifier of a certain type of dude’s aesthetic taste and academic literature circle-jerkery, avoid reading it because david foster wallace assaulted mary karr",69,,"Zinnia, adult demon female","Orlando, FL",
5,Tue Aug 25 13:47:47 +0000 2020,original,,infinite jest is the radiohead of books.,,11,,brandonstosuy,,
6,Tue Aug 25 13:39:35 +0000 2020,original,,"The Books subreddit is sort of like Book Twitter, but with much less arguing about Infinite Jest. https://t.co/cWzKusvNct",,7,https://lithub.com/everything-is-terrible-except-the-books-subreddit/,Literary Hub,,
7,Tue Aug 25 13:33:16 +0000 2020,reply,,@TheStalwart Has she read infinite jest yet?,,0,,Justin Paterno,New York,
8,Tue Aug 25 13:17:09 +0000 2020,reply,,"@TheStalwart infinite jest by david foster wallace, not sure if you've heard of him",,2,,Alexandra Scaggs,"Brooklyn, NY",
9,Tue Aug 25 13:16:39 +0000 2020,retweet,,"“Infinite Jest” changed my life. It’s true. @jamesbmeigs saw me carrying it into the Premiere office...and a little later threw a thick batch of paper on my desk and told me, “Okay, I need you to mollify David Foster Wallace.” The adventure begins...","“Infinite Jest” changed my life. It’s true. @jamesbmeigs saw me carrying it into the Premiere office...and a little later threw a thick batch of paper on my desk and told me, “Okay, I need you to mollify David Foster Wallace.” The adventure begins...",5,,Sonny Bunch,"Dallas, TX",


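Save the filtered columns, now including `tweet_url`, back to `infinite_jest_df`. Note that this keeps only the first 100 tweets.

In [12]:
# .copy() makes an independent DataFrame so later column edits don't trigger a SettingWithCopyWarning
infinite_jest_df = infinite_jest_df[['created_at', 'tweet_type', 'media', 'tweet_url', 'text', 'rt_text', 'retweet_count', 'urls', 'user_name', 'user_location', 'hashtags']].head(100).copy()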
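The cells above narrow the data by column. Rows can be narrowed in a similar way with a Boolean mask. Here is a minimal sketch that keeps only original tweets, using a made-up DataFrame with the same column names as the lesson's data.

```python
import pandas as pd

# Made-up stand-in for the lesson's DataFrame.
demo_df = pd.DataFrame({
    "tweet_type": ["retweet", "original", "reply", "original"],
    "text": ["RT one", "two", "@someone three", "four"],
    "retweet_count": [9, 11, 0, 7],
})

# Keep only rows whose tweet_type is "original".
originals_df = demo_df[demo_df["tweet_type"] == "original"].copy()
print(originals_df)
```

The expression inside the brackets produces one True or False per row, and only the True rows are kept.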


## Display Links and Images in Twitter Data

To display links and images in our Twitter dataframe, run the cells below. We're converting the image URL into an HTML image tag and then displaying our dataframe as an HTML object.

In [13]:
from IPython.display import HTML

In [14]:
def get_image_html(link):
    # Wrap an image URL in a clickable <img> tag; leave the "No Image" placeholder as-is
    if link != "No Image":
        image_html = f"<a href='{link}'><img src='{link}' width='500px'></a>"
    else:
        image_html = "No Image"
    return image_html

In [15]:
infinite_jest_df['media'] = infinite_jest_df['media'].fillna("No Image")
infinite_jest_df['media'] = infinite_jest_df['media'].apply(get_image_html)

In [20]:
HTML(infinite_jest_df.sort_values(by='media').to_html(render_links=True, escape=False))

Unnamed: 0,created_at,tweet_type,media,tweet_url,text,rt_text,retweet_count,urls,user_name,user_location,hashtags
37,Tue Aug 25 01:42:48 +0000 2020,retweet,',https://twitter.com/getcerebral/status/1298073309528170496,remember when i smoked weed out of infinite jest https://t.co/B9tUeer3wV,remember when i smoked weed out of infinite jest https://t.co/B9tUeer3wV,472,,james cassar 🅴,philadelphia,
84,Mon Aug 24 21:00:33 +0000 2020,original,',https://twitter.com/brokenbottleboy/status/1298002277731635200,"Goethe?! So no Germans then, I guess. \n\nI don’t own Infinite Jest. \n\nI own two Hemingway books.\n\nI don’t own Lolita but I have several other Nabokov books. \n\nWhy the Turgenev hate? \n\nI like Bukowski’s poems while not thinking he was a cool guy. \n\nAyn Rand is shit. https://t.co/SOkfHFvcLM",,3,,Mic Wright 🏳️‍🌈🌋🏴‍☠️,London,
63,Mon Aug 24 22:23:48 +0000 2020,original,',https://twitter.com/ryanlawler/status/1298023227164368896,The world if dudes would just stop reading Infinite Jest and watching The Sopranos https://t.co/RGRi3GjZym,,1,,ryan lawler,philly yo,
53,Mon Aug 24 22:56:12 +0000 2020,original,',https://twitter.com/HKesvani/status/1298031383814832134,"Anyway since we're talking about Infinite Jest, here's my favourite pic of the late David Foster Wallace https://t.co/zvZxEEctqa",,13,,hk,SE London,
0,Tue Aug 25 15:22:24 +0000 2020,reply,No Image,https://twitter.com/thememorypalace/status/1298279565932900354,"@JustinCChang Avoid the man who wants to put up a bookshelf solely for his copy of Infinite Jest but can't because, his Wallace? Not big enough.",,0,,Nate DiMeo,Los Angeles,
71,Mon Aug 24 21:35:50 +0000 2020,reply,No Image,https://twitter.com/obrien/status/1298011159501602816,@MattZeitlin Theory: No one has ever read all of Infinite Jest.,,0,,Chris O'Brien,"Toulouse, France",
70,Mon Aug 24 21:40:44 +0000 2020,reply,No Image,https://twitter.com/velocciraptor/status/1298012392580734976,@Fohnicus You open up the copy of Infinite jest. Inside are five more copies of Infinite Jest,,0,,Carli Velocci ➡ 🛌,"Los Angeles, CA",
69,Mon Aug 24 21:50:40 +0000 2020,original,No Image,https://twitter.com/tylerlauletta/status/1298014890687438849,"First they came for Infinite Jest, and I did not speak out — because I had not read Infinite Jest.",,0,,tryler,,
68,Mon Aug 24 21:58:43 +0000 2020,original,No Image,https://twitter.com/AshleyDean/status/1298016918167838721,"After careful consideration and reflection, I've decided once again to stay out of the Infinite Jest discourse. Thank you.",,0,,Ashley Dean,"New Orleans, LA",
67,Mon Aug 24 22:17:32 +0000 2020,original,No Image,https://twitter.com/joelgolby/status/1298021649699295233,what am i going to do with all my infinite jest funko pops,,0,,Joel Golby,London,
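The `escape=False` argument is what lets the image tags through. By default, `to_html()` escapes characters such as `<` and `>`, so a tag would appear as literal text rather than as an image. A minimal sketch of the difference, using a made-up image URL:

```python
import pandas as pd

# Made-up one-cell DataFrame holding an HTML image tag.
demo_df = pd.DataFrame({"media": ["<img src='https://example.com/cat.png'>"]})

# Default: special characters are escaped, so the tag shows up as literal text.
escaped_html = demo_df.to_html()

# escape=False: the tag is passed through unchanged, so a browser renders the image.
raw_html = demo_df.to_html(escape=False)

print(escaped_html)
print(raw_html)
```

Because `escape=False` passes every cell through unescaped, any HTML inside the tweet text is rendered too, which is worth keeping in mind when displaying data you did not write yourself.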


Filter to just text, images, and retweet count

In [19]:
HTML(infinite_jest_df[['media', 'text', 'retweet_count']].sort_values(by='media').to_html(render_links=True, escape=False))

Unnamed: 0,media,text,retweet_count
37,',remember when i smoked weed out of infinite jest https://t.co/B9tUeer3wV,472
84,',"Goethe?! So no Germans then, I guess. \n\nI don’t own Infinite Jest. \n\nI own two Hemingway books.\n\nI don’t own Lolita but I have several other Nabokov books. \n\nWhy the Turgenev hate? \n\nI like Bukowski’s poems while not thinking he was a cool guy. \n\nAyn Rand is shit. https://t.co/SOkfHFvcLM",3
63,',The world if dudes would just stop reading Infinite Jest and watching The Sopranos https://t.co/RGRi3GjZym,1
53,',"Anyway since we're talking about Infinite Jest, here's my favourite pic of the late David Foster Wallace https://t.co/zvZxEEctqa",13
0,No Image,"@JustinCChang Avoid the man who wants to put up a bookshelf solely for his copy of Infinite Jest but can't because, his Wallace? Not big enough.",0
71,No Image,@MattZeitlin Theory: No one has ever read all of Infinite Jest.,0
70,No Image,@Fohnicus You open up the copy of Infinite jest. Inside are five more copies of Infinite Jest,0
69,No Image,"First they came for Infinite Jest, and I did not speak out — because I had not read Infinite Jest.",0
68,No Image,"After careful consideration and reflection, I've decided once again to stay out of the Infinite Jest discourse. Thank you.",0
67,No Image,what am i going to do with all my infinite jest funko pops,0


## Sort By Top Retweets

In [22]:
infinite_jest_df.sort_values(by='retweet_count', ascending=False)

Unnamed: 0,created_at,tweet_type,media,tweet_url,text,rt_text,retweet_count,urls,user_name,user_location,hashtags
47,Mon Aug 24 23:30:59 +0000 2020,retweet,No Image,https://twitter.com/stuarthazeldine/status/1298040136450613253,"Top 7 Warning Signs In a Man's Bookshelf:\n1. A Dog-eared copy of Infinite Jest\n2. Too Much Hemingway\n3. Any amount of Bukowski\n4. AYN. RAND.\n5. Goethe\n6. ""Lolita is my favorite book.""\n7. ""'Fathers and Sons' Is my favorite book.""","Top 7 Warning Signs In a Man's Bookshelf:\n1. A Dog-eared copy of Infinite Jest\n2. Too Much Hemingway\n3. Any amount of Bukowski\n4. AYN. RAND.\n5. Goethe\n6. ""Lolita is my favorite book.""\n7. ""'Fathers and Sons' Is my favorite book.""",1929,,Stuart Hazeldine,London,
3,Tue Aug 25 13:57:59 +0000 2020,retweet,No Image,https://twitter.com/AditiJuneja3/status/1298258323364405251,"Top 7 Warning Signs In a Man's Bookshelf:\n1. A Dog-eared copy of Infinite Jest\n2. Too Much Hemingway\n3. Any amount of Bukowski\n4. AYN. RAND.\n5. Goethe\n6. ""Lolita is my favorite book.""\n7. ""'Fathers and Sons' Is my favorite book.""","Top 7 Warning Signs In a Man's Bookshelf:\n1. A Dog-eared copy of Infinite Jest\n2. Too Much Hemingway\n3. Any amount of Bukowski\n4. AYN. RAND.\n5. Goethe\n6. ""Lolita is my favorite book.""\n7. ""'Fathers and Sons' Is my favorite book.""",1929,,Aditi Juneja,"New York, NY",
35,Tue Aug 25 01:47:56 +0000 2020,retweet,No Image,https://twitter.com/danielle_binks/status/1298074601226776577,"Top 7 Warning Signs In a Man's Bookshelf:\n1. A Dog-eared copy of Infinite Jest\n2. Too Much Hemingway\n3. Any amount of Bukowski\n4. AYN. RAND.\n5. Goethe\n6. ""Lolita is my favorite book.""\n7. ""'Fathers and Sons' Is my favorite book.""","Top 7 Warning Signs In a Man's Bookshelf:\n1. A Dog-eared copy of Infinite Jest\n2. Too Much Hemingway\n3. Any amount of Bukowski\n4. AYN. RAND.\n5. Goethe\n6. ""Lolita is my favorite book.""\n7. ""'Fathers and Sons' Is my favorite book.""",1929,,Danielle Binks,"Melbourne, Australia",
41,Tue Aug 25 00:54:43 +0000 2020,retweet,No Image,https://twitter.com/Be_Herrero/status/1298061207216230402,"Top 7 Warning Signs In a Man's Bookshelf:\n1. A Dog-eared copy of Infinite Jest\n2. Too Much Hemingway\n3. Any amount of Bukowski\n4. AYN. RAND.\n5. Goethe\n6. ""Lolita is my favorite book.""\n7. ""'Fathers and Sons' Is my favorite book.""","Top 7 Warning Signs In a Man's Bookshelf:\n1. A Dog-eared copy of Infinite Jest\n2. Too Much Hemingway\n3. Any amount of Bukowski\n4. AYN. RAND.\n5. Goethe\n6. ""Lolita is my favorite book.""\n7. ""'Fathers and Sons' Is my favorite book.""",1929,,Berta Herrero,🇪🇺,
90,Mon Aug 24 20:49:19 +0000 2020,retweet,No Image,https://twitter.com/slarkpope/status/1297999452515508236,"Top 7 Warning Signs In a Man's Bookshelf:\n1. A Dog-eared copy of Infinite Jest\n2. Too Much Hemingway\n3. Any amount of Bukowski\n4. AYN. RAND.\n5. Goethe\n6. ""Lolita is my favorite book.""\n7. ""'Fathers and Sons' Is my favorite book.""","Top 7 Warning Signs In a Man's Bookshelf:\n1. A Dog-eared copy of Infinite Jest\n2. Too Much Hemingway\n3. Any amount of Bukowski\n4. AYN. RAND.\n5. Goethe\n6. ""Lolita is my favorite book.""\n7. ""'Fathers and Sons' Is my favorite book.""",1929,,brian braiker,http://instagram.com/braiker,
28,Tue Aug 25 04:26:23 +0000 2020,retweet,No Image,https://twitter.com/bkerogers/status/1298114474562527238,"""all boys love infinite jest"" is just a misconception. all boys have a copy of infinite jest due to its large size, which makes it the ideal book to hollow out and use to hide a copy of Louis Sachar's 1998 masterpiece Holes","""all boys love infinite jest"" is just a misconception. all boys have a copy of infinite jest due to its large size, which makes it the ideal book to hollow out and use to hide a copy of Louis Sachar's 1998 masterpiece Holes",711,,Brooke Rogers 🌻,"Queens, NY",
25,Tue Aug 25 06:02:51 +0000 2020,retweet,No Image,https://twitter.com/SafyHallanFarah/status/1298138751345074176,"""all boys love infinite jest"" is just a misconception. all boys have a copy of infinite jest due to its large size, which makes it the ideal book to hollow out and use to hide a copy of Louis Sachar's 1998 masterpiece Holes","""all boys love infinite jest"" is just a misconception. all boys have a copy of infinite jest due to its large size, which makes it the ideal book to hollow out and use to hide a copy of Louis Sachar's 1998 masterpiece Holes",711,,safy,mpls,
31,Tue Aug 25 03:59:08 +0000 2020,retweet,No Image,https://twitter.com/sasimons/status/1298107618615951361,"""all boys love infinite jest"" is just a misconception. all boys have a copy of infinite jest due to its large size, which makes it the ideal book to hollow out and use to hide a copy of Louis Sachar's 1998 masterpiece Holes","""all boys love infinite jest"" is just a misconception. all boys have a copy of infinite jest due to its large size, which makes it the ideal book to hollow out and use to hide a copy of Louis Sachar's 1998 masterpiece Holes",711,,Seth Simons,i’m in boise,
32,Tue Aug 25 03:58:27 +0000 2020,retweet,No Image,https://twitter.com/getcerebral/status/1298107444548308993,"""all boys love infinite jest"" is just a misconception. all boys have a copy of infinite jest due to its large size, which makes it the ideal book to hollow out and use to hide a copy of Louis Sachar's 1998 masterpiece Holes","""all boys love infinite jest"" is just a misconception. all boys have a copy of infinite jest due to its large size, which makes it the ideal book to hollow out and use to hide a copy of Louis Sachar's 1998 masterpiece Holes",711,,james cassar 🅴,philadelphia,
29,Tue Aug 25 04:02:11 +0000 2020,retweet,No Image,https://twitter.com/ivadixit/status/1298108385850806272,"""all boys love infinite jest"" is just a misconception. all boys have a copy of infinite jest due to its large size, which makes it the ideal book to hollow out and use to hide a copy of Louis Sachar's 1998 masterpiece Holes","""all boys love infinite jest"" is just a misconception. all boys have a copy of infinite jest due to its large size, which makes it the ideal book to hollow out and use to hide a copy of Louis Sachar's 1998 masterpiece Holes",711,,Iva Dixit,iva.dixit@nytimes.com,
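The sorted output above repeats the same viral tweet once for every account that retweeted it. If you want each tweet to appear only once, one option is to drop duplicates before sorting. This is a sketch on made-up data; the helper column name `content` is our own invention and not part of the twarc output.

```python
import pandas as pd

# Made-up stand-in: two accounts retweeted the same tweet, two posted originals.
demo_df = pd.DataFrame({
    "user_name": ["A", "B", "C", "D"],
    "text": ["RT viral", "RT viral", "my own tweet", "another tweet"],
    "rt_text": ["viral", "viral", None, None],
    "retweet_count": [1929, 1929, 11, 7],
})

# Use the retweeted text where there is one, otherwise the tweet's own text.
demo_df["content"] = demo_df["rt_text"].fillna(demo_df["text"])

# Keep the first row for each distinct piece of content, then rank by retweets.
unique_df = (
    demo_df.drop_duplicates(subset="content")
    .sort_values(by="retweet_count", ascending=False)
)
print(unique_df[["content", "retweet_count"]])
```

Deduplicating on `rt_text` alone would not work well, because every original tweet has a missing `rt_text` and Pandas would treat those missing values as duplicates of one another.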


## Identify Top Hashtags

`twarc/utils/tags.py tweets.jsonl`

 > <img src=https://upload.wikimedia.org/wikipedia/commons/thumb/3/34/Windows_logo_-_2012_derivative.svg/1024px-Windows_logo_-_2012_derivative.svg.png width=20 align='left' > Heads up Windows users! Remember that twarc utilities may not work on your computer by default. If you get a UnicodeEncodeError, it's because Windows computers do not use Unicode (UTF-8) by default. However, you can make UTF-8 your default by following [these instructions](https://scholarslab.github.io/learn-twarc/08-win-region-settings) and restarting your computer. Then twarc utilities should work.

In [23]:
!python twarc/utils/tags.py twitter-data/infinite_jest_verified_tweets.jsonl

    1 mementomorimonday
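Hashtags can also be counted from the CSV data with Pandas. The sketch below is not what `tags.py` does internally; it assumes that the `hashtags` column stores multiple hashtags as a space-separated string, which you should confirm against your own CSV file. The sample values are made up, apart from the one hashtag shown above.

```python
import pandas as pd

# Made-up hashtags column; None marks tweets without hashtags.
# Assumption: multiple hashtags in one tweet are separated by spaces.
demo_df = pd.DataFrame(
    {"hashtags": ["books reading", None, "Books", "mementomorimonday"]}
)

hashtag_counts = (
    demo_df["hashtags"]
    .dropna()        # skip tweets with no hashtags
    .str.lower()     # treat "Books" and "books" as the same tag
    .str.split()     # turn each string into a list of tags
    .explode()       # one row per tag
    .value_counts()  # tally, most common first
)
print(hashtag_counts)
```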


## Create a Word Cloud

`twarc/utils/wordcloud.py tweets.jsonl`

In [24]:
!python twarc/utils/wordcloud.py twitter-data/infinite_jest_verified_tweets.jsonl > twitter-data/infinite_jest_verified_tweets.html

[infinite_jest_verified_tweets.html](twitter-data/infinite_jest_verified_tweets.html)

In [25]:
%%html
<iframe src="twitter-data/infinite_jest_verified_tweets.html" width="800" height="800"></iframe>
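A word cloud is a visual way of showing word frequencies, so it can help to look at the raw counts as well. This is a minimal standard-library sketch, not the logic of `wordcloud.py`. The two sample texts are adapted from tweets shown earlier, and the stopword list is a tiny made-up one.

```python
import re
from collections import Counter

texts = [
    "infinite jest is the radiohead of books.",
    "Has she read infinite jest yet?",
]

# A tiny, made-up stopword list; real analyses usually use a longer one.
stopwords = {"is", "the", "of", "has", "she", "yet"}

# Lowercase each text, pull out runs of letters, and skip stopwords.
word_counts = Counter(
    word
    for text in texts
    for word in re.findall(r"[a-z']+", text.lower())
    if word not in stopwords
)
print(word_counts.most_common(3))
```

With the real data, `texts` could be replaced by the `text` column of the DataFrame.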

## Count Emojis

`python twarc/utils/emojis.py tweets.jsonl --number 10`

**Install emoji package**

In [None]:
!pip install emoji

In [28]:
!python twarc/utils/emojis.py twitter-data/infinite_jest_verified_tweets.jsonl --number 10

📚     6
💔     1
😤     1
🌏     1
☄     1
🤷     1
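For a sense of what emoji counting involves, here is a rough standard-library sketch. It does not use the `emoji` package and is not how `emojis.py` works. Instead it relies on a simple heuristic: characters in the Unicode category "So" (Symbol, other) are treated as emoji. The sample texts are made up.

```python
import unicodedata
from collections import Counter

texts = ["Reading 📚📚 all weekend", "so long 💔", "no emoji here"]

# Rough heuristic: treat characters in Unicode category "So" as emoji.
# This also catches some non-emoji symbols, and it splits multi-character
# emoji sequences into their individual parts.
emoji_counts = Counter(
    char
    for text in texts
    for char in text
    if unicodedata.category(char) == "So"
)
print(emoji_counts.most_common(10))
```

For serious analysis, a dedicated emoji library is the safer choice, since emoji can be made up of several code points.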


## Your Turn!

Now choose your own Twitter search term or query.

**Collect Tweets From Last 7 Days**

In [None]:
!twarc search "your search query" > your_search.jsonl 

**Count How Many Tweets You Collected**

Mac/Linux/Chrome OS

In [None]:
!wc -l your_search.jsonl

<img src=https://upload.wikimedia.org/wikipedia/commons/thumb/3/34/Windows_logo_-_2012_derivative.svg/1024px-Windows_logo_-_2012_derivative.svg.png width=20 align='left' > Windows 

In [None]:
!find /v /c "" your_search.jsonl
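If you would like a single approach that works on any operating system, the lines can also be counted in Python. This sketch assumes the file holds one tweet per line, which is how JSON Lines files are laid out. The function name `count_tweets` and the throwaway demo file are our own.

```python
import os
import tempfile


def count_tweets(path):
    """Count non-empty lines in a JSON Lines file (one tweet per line)."""
    with open(path, encoding="utf-8") as jsonl_file:
        return sum(1 for line in jsonl_file if line.strip())


# Demonstration on a throwaway file holding three made-up records.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.jsonl")
    with open(demo_path, "w", encoding="utf-8") as demo_file:
        demo_file.write('{"id": 1}\n{"id": 2}\n{"id": 3}\n')
    demo_count = count_tweets(demo_path)

print(demo_count)
```

For your own data, you would call `count_tweets("your_search.jsonl")`.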

**Convert Your JSON data to CSV data**

In [963]:
!python twarc/utils/json2csv.py --extra-field rt_text retweeted_status.full_text  your_search.jsonl > your_search.csv

**Read in CSV file**

In [None]:
import pandas as pd

In [None]:
your_df = pd.read_csv('your_search.csv')

**Add Metadata**

Filter your dataframe and add at least one new metadata column that we haven't explored yet.

In [None]:
your_df.columns

When you run the cell below, right-click the output and select "Enable Scrolling for Outputs", then scroll through to see what the new metadata category looks like. Discuss this category with your group and how you might use it for a Twitter analysis.

In [None]:
your_df[['created_at', 'tweet_type', '#YOUR NEW METADATA HERE','media', 'tweet_url', 'text', 'rt_text','retweet_count',  'urls', 'user_name', 'user_location', 'hashtags', ]].head(100)

Now save your filtered dataframe as `filtered_df`

In [None]:
#Your Code Here

**Explore Data with Links and Images**

In [35]:
from IPython.display import HTML

In [972]:
def get_image_html(link):
    # Wrap an image URL in a clickable <img> tag; leave the "No Image" placeholder as-is
    if link != "No Image":
        image_html = f"<a href='{link}'><img src='{link}' width='500px'></a>"
    else:
        image_html = "No Image"
    return image_html

In [973]:
filtered_df['media'] = filtered_df['media'].fillna("No Image")
filtered_df['media']= filtered_df['media'].apply(get_image_html)

In [None]:
HTML(filtered_df.to_html(render_links=True, escape=False))

**Sort Your Twitter Data by Top Retweets**

In [None]:
#Your code here

What is the most retweeted tweet in your dataset?

**#**Your Answer Here

**Count Most Frequent Emojis**

In [None]:
!python twarc/utils/emojis.py your_search.jsonl --number 10