# Article Notebook for Scraping Twitter Using snscrape's CLI Commands With Python
<br>Package GitHub: https://github.com/JustAnotherArchivist/snscrape
<br>This notebook uses the development version of snscrape.

Article Read-Along: ...

### Notebook Author: Martin Beck
<b>Information current as of November 26th, 2020</b><br>

This notebook contains materials for scraping tweets from Twitter by calling snscrape's CLI commands from Python.

<b>Dependencies: </b> Your Python version must be <b>3.8</b> or higher. The development version of snscrape will not work with Python 3.7 or lower. You can download the latest Python version [here](https://www.python.org/downloads/).

In [2]:
# Run the pip install command below if you don't already have the library
# !pip install git+https://github.com/JustAnotherArchivist/snscrape.git

# Imports
import os
import pandas as pd

# Query by Username
The code below scrapes 100 tweets posted by a given username, loads them into a pandas DataFrame, and exports the result to a CSV file.

In [None]:
# Setting variables to be used in format string command below
tweet_count = 100
username = 'jack'

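# Using the os library to run the snscrape CLI command from Python.
# The shell redirect (>) writes the output to a file with one JSON object per line (JSON Lines).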
os.system("snscrape --jsonl --max-results {} twitter-search 'from:{}'> user-tweets.json".format(tweet_count, username))
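As an alternative to `os.system`, the same CLI call can be made with the standard library's `subprocess` module, which takes the command as a list of arguments and so avoids shell quoting. This is only a sketch, not part of the original article: the helper names `build_snscrape_command` and `run_snscrape` are hypothetical, and the flags are the same ones used in the cell above.

```python
import subprocess


def build_snscrape_command(query, max_results):
    """Return the snscrape CLI call as an argument list (hypothetical helper)."""
    return ["snscrape", "--jsonl", "--max-results", str(max_results),
            "twitter-search", query]


def run_snscrape(query, max_results, out_path):
    """Run snscrape and write its JSON Lines output to out_path.

    check=True raises subprocess.CalledProcessError if snscrape exits
    with an error, instead of silently leaving an empty file behind.
    """
    command = build_snscrape_command(query, max_results)
    with open(out_path, "w", encoding="utf-8") as out_file:
        subprocess.run(command, stdout=out_file, check=True)


# Building the command has no side effects, so it can be inspected first
print(build_snscrape_command("from:jack", 100))
```

Calling `run_snscrape("from:jack", 100, "user-tweets.json")` should then produce the same file as the `os.system` call, assuming snscrape is installed and on the PATH.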

In [10]:
# Reads the JSON Lines file generated by the CLI command above into a pandas DataFrame
tweets_df1 = pd.read_json('user-tweets.json', lines=True)

# Displays first 5 entries from dataframe
tweets_df1.head()

Unnamed: 0,url,date,content,renderedContent,id,user,outlinks,tcooutlinks,replyCount,retweetCount,likeCount,quoteCount,conversationId,lang,source,media,retweetedTweet,quotedTweet,mentionedUsers
0,https://twitter.com/jack/status/13291496370060...,2020-11-18 19:49:02+00:00,@NeerajKA Welcome!,@NeerajKA Welcome!,1329149637006041088,"{'username': 'jack', 'displayname': 'jack', 'i...",[],[],73,14,799,8,1329140522565439490,en,"<a href=""http://twitter.com/download/iphone"" r...",,,,"[{'username': 'NeerajKA', 'displayname': 'Neer..."
1,https://twitter.com/jack/status/13291372550263...,2020-11-18 18:59:50+00:00,Join @CashApp! #Bitcoin https://t.co/SbYANIZyix,Join @CashApp! #Bitcoin twitter.com/owenbjenni...,1329137255026311168,"{'username': 'jack', 'displayname': 'jack', 'i...",[https://twitter.com/owenbjennings/status/1329...,[https://t.co/SbYANIZyix],577,272,2488,131,1329137255026311168,en,"<a href=""http://twitter.com/download/iphone"" r...",,,{'url': 'https://twitter.com/owenbjennings/sta...,"[{'username': 'CashApp', 'displayname': 'Cash ..."
2,https://twitter.com/jack/status/13291366656847...,2020-11-18 18:57:29+00:00,@kateconger @sarahintampa Nah,@kateconger @sarahintampa Nah,1329136665684705280,"{'username': 'jack', 'displayname': 'jack', 'i...",[],[],38,5,177,10,1329126492731699203,und,"<a href=""http://twitter.com/download/iphone"" r...",,,,"[{'username': 'kateconger', 'displayname': 'o...."
3,https://twitter.com/jack/status/13291358061921...,2020-11-18 18:54:05+00:00,@mmasnick Terrible idea! And terribly false.,@mmasnick Terrible idea! And terribly false.,1329135806192107521,"{'username': 'jack', 'displayname': 'jack', 'i...",[],[],50,13,222,16,1329128773845860352,en,"<a href=""http://twitter.com/download/iphone"" r...",,,,"[{'username': 'mmasnick', 'displayname': 'Mike..."
4,https://twitter.com/jack/status/13287213055799...,2020-11-17 15:27:00+00:00,"Thank you for the time, and I look forward to ...","Thank you for the time, and I look forward to ...",1328721305579921409,"{'username': 'jack', 'displayname': 'jack', 'i...",[],[],810,112,2233,105,1328721286474788865,en,"<a href=""http://twitter.com/download/iphone"" r...",,,,
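The `lines=True` argument is needed because the file holds one JSON object per line rather than a single JSON document, and nested fields such as `user` arrive as dictionaries. The sketch below illustrates that layout with the standard `json` module. The two records are made up for illustration (they are not scraped data), and `parse_jsonl` is a hypothetical helper.

```python
import json

# Two made-up records shaped like the columns shown above (not real scraped data)
sample_jsonl = (
    '{"id": 1, "content": "first tweet", "user": {"username": "example_user"}}\n'
    '{"id": 2, "content": "second tweet", "user": {"username": "example_user"}}\n'
)


def parse_jsonl(text):
    """Parse JSON Lines text: one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


records = parse_jsonl(sample_jsonl)

# The nested user object is a dict, so its fields are reached by key
usernames = [record["user"]["username"] for record in records]
print(len(records), usernames)
```

In the DataFrame the same idea should apply: an expression along the lines of `tweets_df1['user'].apply(lambda u: u['username'])` would pull the username out into its own column.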


In [13]:
# Export dataframe into a CSV
tweets_df1.to_csv('user-tweets.csv', sep=',', index=False)

# Query by Text Search
The code below scrapes 100 tweets matching a text search, loads them into a pandas DataFrame, and exports the result to a CSV file.

In [None]:
# Setting variables to be used in format string command below
tweet_count = 100
text_query = 'coronavirus'

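# Using the os library to run the snscrape CLI command from Python.
# The shell redirect (>) writes the output to a file with one JSON object per line (JSON Lines).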
os.system("snscrape --jsonl --max-results {} twitter-search '{}'> text-query-tweets.json".format(tweet_count, text_query))
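Wrapping the query in single quotes works for a simple term like `coronavirus`, but a query that itself contains an apostrophe would break the shell command. One possible safeguard, sketched here under the assumption of a POSIX shell (it does not apply to the Windows command prompt), is to quote the query with `shlex.quote` from the standard library. The helper name `build_search_command` and the sample query are hypothetical.

```python
import shlex


def build_search_command(text_query, tweet_count, out_file):
    """Build the shell command string with the query and file name safely quoted."""
    return "snscrape --jsonl --max-results {} twitter-search {} > {}".format(
        int(tweet_count), shlex.quote(text_query), shlex.quote(out_file)
    )


# A query containing an apostrophe, which plain single quotes would not survive
command = build_search_command("it's coronavirus season", 100, "text-query-tweets.json")
print(command)
```

The resulting string can be passed to `os.system` exactly as in the cell above.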

In [14]:
# Reads the JSON Lines file generated by the CLI command above into a pandas DataFrame
tweets_df2 = pd.read_json('text-query-tweets.json', lines=True)

# Displays first 5 entries from dataframe
tweets_df2.head()

Unnamed: 0,url,date,content,renderedContent,id,user,outlinks,tcooutlinks,replyCount,retweetCount,likeCount,quoteCount,conversationId,lang,source,media,retweetedTweet,quotedTweet,mentionedUsers
0,https://twitter.com/TechKashif/status/13321906...,2020-11-27 05:13:00+00:00,Germany’s coronavirus infections pass one mill...,Germany’s coronavirus infections pass one mill...,1332190664902270976,"{'username': 'TechKashif', 'displayname': 'Tec...",[https://todayssnews.com/germanys-coronavirus-...,[https://t.co/S18ZtPNWHm],0,0,0,0,1332190664902270976,en,"<a href=""http://publicize.wp.com/"" rel=""nofoll...",,,,
1,https://twitter.com/Geography102/status/133219...,2020-11-27 05:12:58+00:00,@RepsForBiden Is he sitting is a bizarrely sho...,@RepsForBiden Is he sitting is a bizarrely sho...,1332190658078126082,"{'username': 'Geography102', 'displayname': 'C...",[],[],0,0,0,0,1332171077104046080,en,"<a href=""https://mobile.twitter.com"" rel=""nofo...",,,,"[{'username': 'RepsForBiden', 'displayname': '..."
2,https://twitter.com/himbojoseph/status/1332190...,2020-11-27 05:12:56+00:00,um amigo meu me chamou pra sair domingo e to m...,um amigo meu me chamou pra sair domingo e to m...,1332190650830385152,"{'username': 'himbojoseph', 'displayname': 'ؘ'...",[],[],0,0,0,0,1332190650830385152,pt,"<a href=""http://twitter.com/download/android"" ...",,,,
3,https://twitter.com/dembouby/status/1332190644...,2020-11-27 05:12:55+00:00,Mientras que cuando salió YHLQMDLG estaba cami...,Mientras que cuando salió YHLQMDLG estaba cami...,1332190644803166209,"{'username': 'dembouby', 'displayname': 'broco...",[],[],0,0,0,0,1332190644803166209,es,"<a href=""http://twitter.com/download/android"" ...",,,,
4,https://twitter.com/Patriotpeter2/status/13321...,2020-11-27 05:12:54+00:00,Supreme Court rules against Cuomo's coronaviru...,Supreme Court rules against Cuomo's coronaviru...,1332190641170878465,"{'username': 'Patriotpeter2', 'displayname': '...",[https://www.foxnews.com/us/supreme-court-rule...,[https://t.co/isFPR7stvz],0,0,0,0,1332190641170878465,en,"<a href=""https://mobile.twitter.com"" rel=""nofo...",,,,


In [17]:
# Export dataframe into a CSV
tweets_df2.to_csv('text-query-tweets.csv', sep=',', index=False)