# Public Sentiment Monitor for COVID-19 Vaccine Tweets
# Part I: Data Acquisition and Cleaning

## Introduction
This notebook will go through the following steps:
1. Data Acquisition 
2. Data Cleaning
3. Feature Engineering

Our goal is to collect tweets from Twitter and COVID-19 updates from The American Journal of Managed Care® (AJMC), so that we can deploy a sentiment monitor and track how public sentiment changes over time.

In [1]:
# Import the packages needed
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
pd.set_option('display.max_colwidth', 0) # To display entire text content of a column
pd.set_option('display.max_rows', None) # To display all rows

import csv

import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer
nltk.download('vader_lexicon')

[nltk_data] Downloading package vader_lexicon to
[nltk_data]     /Users/rachelchen/nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!


True

## 1. Data Acquisition
In this project, we will work with three data sets:
- COVID-19 developments timeline
- Geographic coordinates of major cities in North America
- Tweets

### Web Scraping: COVID-19 Developments Timeline

We want to analyze how updates on COVID-19 developments affect people's sentiment on Twitter. We will scrape the date and title of each event from the following AJMC pages:
- https://www.ajmc.com/view/a-timeline-of-covid19-developments-in-2020
- https://www.ajmc.com/view/a-timeline-of-covid-19-vaccine-developments-in-2021

We will use Selenium for web scraping because it drives a real browser from our code, which makes it easy to pick out elements that follow a pattern on the rendered page.

In [52]:
# Use selenium for web scraping
from selenium import webdriver

# Start a Chrome session; the argument is the local path to the chromedriver executable
driver = webdriver.Chrome('/Applications/chromedriver')

#### Timeline 2021

In [53]:
# Load the 2021 timeline page
driver.get('https://www.ajmc.com/view/a-timeline-of-covid-19-vaccine-developments-in-2021')

# Web scraping covid timeline 2021

# Dates and event titles are set in bold on the page, so collect every <strong> element
links_21 = driver.find_elements_by_tag_name('strong')

# Keep the text of each bold element, in page order
timeline_21 = [link.text for link in links_21]
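
If the bold text is already present in the page's static HTML (an assumption that is not verified here), the same extraction could be sketched without a browser using only the standard library. The class name `StrongTextParser` and the `sample_html` snippet are hypothetical and stand in for the downloaded page source; this is an alternative technique, not what the cell above does.

```python
from html.parser import HTMLParser


class StrongTextParser(HTMLParser):
    """Collect the text inside every <strong> element, in document order."""

    def __init__(self):
        super().__init__()
        self._depth = 0      # greater than 0 while inside a <strong> element
        self._buffer = []    # text fragments of the element being read
        self.texts = []      # finished strings

    def handle_starttag(self, tag, attrs):
        if tag == 'strong':
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == 'strong' and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                text = ''.join(self._buffer).strip()
                if text:
                    self.texts.append(text)
                self._buffer = []

    def handle_data(self, data):
        if self._depth > 0:
            self._buffer.append(data)


# Hypothetical snippet standing in for the real page source
sample_html = (
    '<p><strong>January 4</strong></p>'
    '<p><strong>UK Begins Distributing AstraZeneca/Oxford Vaccine</strong></p>'
    '<p>Body text that is not bold.</p>'
)

parser = StrongTextParser()
parser.feed(sample_html)
sample_timeline = parser.texts
print(sample_timeline)
```

Pages that build their content with JavaScript would still need a browser-driven tool such as Selenium.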

In [54]:
# Check the first 10 records
timeline_21[0:10]

['January 4',
 'Operation Warp Speed Initiates Talks With Moderna on Half-Dose Vaccines',
 'UK Begins Distributing AstraZeneca/Oxford Vaccine',
 'January 5',
 'FDA Advises Against Altering Vaccine Schedules',
 'Moderna to Produce 600 Million Vaccine Doses',
 'January 6',
 'HHS to Provide $22 Billion to Fund Testing, Vaccine Distribution',
 'January 7',
 'CDC: COVID-19 Vaccine Benefits Outweigh Allergic Reaction Risk']

We want one column for the date and one column for the event. We also want to combine the events that happened on the same day into one cell.

In [55]:
# Store the output in a dictionary with the date as key and a list of events as value
date_event_dict = {}
current_date = None

for element in timeline_21:
    # Heuristic: treat any string of at most 12 characters as a date
    if len(element) <= 12:
        current_date = element
        date_event_dict[current_date] = []
    else:
        date_event_dict[current_date].append(element)
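
The length check is a heuristic, so a short headline fragment can be mistaken for a date. A minimal sketch of a stricter test, which accepts a string as a date only if it parses as a full month name followed by a day, is shown here. The helper names `is_month_day` and `group_by_date` and the toy input are hypothetical, and the sketch assumes an English locale for month names.

```python
from datetime import datetime


def is_month_day(text):
    """Return True if text looks like 'January 4' (full month name plus day)."""
    try:
        # A year is appended so the day is checked against the 2021 calendar
        datetime.strptime(text.strip() + ' 2021', '%B %d %Y')
        return True
    except ValueError:
        return False


def group_by_date(items):
    """Group headline strings under the most recent date string."""
    grouped = {}
    current = None
    for item in items:
        if is_month_day(item):
            current = item
            grouped.setdefault(current, [])
        elif current is not None:
            grouped[current].append(item)
    return grouped


# Toy input modeled on the scraped list, including a short non-date fragment
sample = ['January 18', 'Reports of', 'Racial Disparities in Vaccination Rates',
          'January 19', 'California COVID-19 Variant May Be Vaccine Resistant']
sample_grouped = group_by_date(sample)
print(sample_grouped)
```

With this check the fragment stays under its real date instead of creating a bogus date key, although a headline split across two bold elements would still need to be rejoined by hand.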

In [56]:
# Show all rows to check the data
pd.set_option('display.max_rows', None)

# Create a dataframe from dictionary
df_tl_21 = pd.DataFrame(list(date_event_dict.items()),columns = ['date','covid_update']) 
# Add year to complete a date format
df_tl_21['date'] = df_tl_21['date'].apply(lambda x: x + ', 2021')

# Check the dataframe
df_tl_21.head()

Unnamed: 0,date,covid_update
0,"January 4, 2021","[Operation Warp Speed Initiates Talks With Moderna on Half-Dose Vaccines, UK Begins Distributing AstraZeneca/Oxford Vaccine]"
1,"January 5, 2021","[FDA Advises Against Altering Vaccine Schedules, Moderna to Produce 600 Million Vaccine Doses]"
2,"January 6, 2021","[HHS to Provide $22 Billion to Fund Testing, Vaccine Distribution]"
3,"January 7, 2021","[CDC: COVID-19 Vaccine Benefits Outweigh Allergic Reaction Risk, Study Shows Patients With Heart Failure Should Be Prioritized for Vaccines]"
4,"January 8, 2021","[American Hospital Association Pushes for Faster Vaccine Rollout, Pharmacies Tapped to Distribute Vaccines, Biden Plans to Rapidly Release Most COVID-19 Doses, States Face Significant Rollout Hurdles]"


The data frame looks good, except that each cell in the covid_update column is a Python list, so the text is displayed inside square brackets.

In [57]:
# Join each list of events into one comma-separated string, which removes the square brackets
df_tl_21['covid_update'] = df_tl_21['covid_update'].apply(lambda s: ', '.join([str(elem) for elem in s]))
df_tl_21

Unnamed: 0,date,covid_update
0,"January 4, 2021","Operation Warp Speed Initiates Talks With Moderna on Half-Dose Vaccines, UK Begins Distributing AstraZeneca/Oxford Vaccine"
1,"January 5, 2021","FDA Advises Against Altering Vaccine Schedules, Moderna to Produce 600 Million Vaccine Doses"
2,"January 6, 2021","HHS to Provide $22 Billion to Fund Testing, Vaccine Distribution"
3,"January 7, 2021","CDC: COVID-19 Vaccine Benefits Outweigh Allergic Reaction Risk, Study Shows Patients With Heart Failure Should Be Prioritized for Vaccines"
4,"January 8, 2021","American Hospital Association Pushes for Faster Vaccine Rollout, Pharmacies Tapped to Distribute Vaccines, Biden Plans to Rapidly Release Most COVID-19 Doses, States Face Significant Rollout Hurdles"
5,"January 11, 2021",Vaccine Doses Go Unused or Are Trashed
6,"January 12, 2021","CDC, HHS Update Vaccine Allocation Guidance"
7,"January 14, 2021","Elderly Los Angeles County Residents Report Confusion, Delays, GoodRx Report Documents Vaccine Deserts"
8,"January 18, 2021",
9,"Reports of, 2021",Racial Disparities in Vaccination Rates


Two rows are not in the correct format: row 8 has a date but no event, and row 9 has the text "Reports of" in the date column.

In [58]:
# Manually correct the two rows spotted above: move the event to row 8 and drop row 9
df_tl_21.iloc[8,1] = 'Racial Disparities in Vaccination Rate'
df_tl_21 = df_tl_21.drop([9])
df_tl_21

Unnamed: 0,date,covid_update
0,"January 4, 2021","Operation Warp Speed Initiates Talks With Moderna on Half-Dose Vaccines, UK Begins Distributing AstraZeneca/Oxford Vaccine"
1,"January 5, 2021","FDA Advises Against Altering Vaccine Schedules, Moderna to Produce 600 Million Vaccine Doses"
2,"January 6, 2021","HHS to Provide $22 Billion to Fund Testing, Vaccine Distribution"
3,"January 7, 2021","CDC: COVID-19 Vaccine Benefits Outweigh Allergic Reaction Risk, Study Shows Patients With Heart Failure Should Be Prioritized for Vaccines"
4,"January 8, 2021","American Hospital Association Pushes for Faster Vaccine Rollout, Pharmacies Tapped to Distribute Vaccines, Biden Plans to Rapidly Release Most COVID-19 Doses, States Face Significant Rollout Hurdles"
5,"January 11, 2021",Vaccine Doses Go Unused or Are Trashed
6,"January 12, 2021","CDC, HHS Update Vaccine Allocation Guidance"
7,"January 14, 2021","Elderly Los Angeles County Residents Report Confusion, Delays, GoodRx Report Documents Vaccine Deserts"
8,"January 18, 2021",Racial Disparities in Vaccination Rate
10,"January 19, 2021","California COVID-19 Variant May Be Vaccine Resistant, Pfizer, Moderna, AstraZeneca to Test Vaccines in Adolescents, Incoming CDC Director Walensky to Prioritize Vaccine Rollout"


The table looks good now.

#### Timeline 2020

Let's repeat the above steps for the 2020 timeline.

In [59]:
# Load the 2020 timeline page
driver.get('https://www.ajmc.com/view/a-timeline-of-covid19-developments-in-2020')

# Collect every <strong> element and keep its text, in page order
links_20 = driver.find_elements_by_tag_name('strong')
timeline_20 = [link.text for link in links_20]

In [60]:
# Check the first 10 records
timeline_20[0:10]

['January 9 — WHO Announces Mysterious Coronavirus-Related Pneumonia in Wuhan, China',
 'January 20 — CDC Says 3 US Airports Will Begin Screening for Coronavirus',
 'January 21 — CDC Confirms First US Coronavirus Case',
 'January 21 — Chinese Scientist Confirms COVID-19 Human Transmission',
 'January 23 — Wuhan Now Under Quarantine',
 'January 31 — WHO Issues Global Health Emergency',
 'February 2 — Global Air Travel Is Restricted',
 'February 3 — US Declares Public Health Emergency',
 'February 10 — China’s COVID-19 Deaths Exceed Those of SARS Crisis',
 'February 25 — CDC Says COVID-19 Is Heading Toward Pandemic Status']

The 2020 timeline page is formatted differently from the 2021 one: each bold string holds both the date and the event, separated by a dash. Let's clean it so that it ends up in the same format as the previous table.

In [61]:
# Store data into a dataframe
df_tl_20 = pd.DataFrame(timeline_20)
# Split string into column date and column event
df_tl_20 = df_tl_20[0].str.split(' — ', expand = True)
# Make sure we have the same column names as previous table
df_tl_20 = df_tl_20.rename(columns={0: 'date', 1: 'covid_update'})
# Add the year to complete the date format
df_tl_20['date'] = df_tl_20['date'].apply(lambda x: x + ', 2020' )

df_tl_20

Unnamed: 0,date,covid_update
0,"January 9, 2020","WHO Announces Mysterious Coronavirus-Related Pneumonia in Wuhan, China"
1,"January 20, 2020",CDC Says 3 US Airports Will Begin Screening for Coronavirus
2,"January 21, 2020",CDC Confirms First US Coronavirus Case
3,"January 21, 2020",Chinese Scientist Confirms COVID-19 Human Transmission
4,"January 23, 2020",Wuhan Now Under Quarantine
5,"January 31, 2020",WHO Issues Global Health Emergency
6,"February 2, 2020",Global Air Travel Is Restricted
7,"February 3, 2020",US Declares Public Health Emergency
8,"February 10, 2020",China’s COVID-19 Deaths Exceed Those of SARS Crisis
9,"February 25, 2020",CDC Says COVID-19 Is Heading Toward Pandemic Status
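
Splitting on the dash assumes that every string contains the separator exactly once. A small sketch, using hypothetical strings rather than scraped data, of limiting the split to the first separator and flagging rows that did not split, so that rows needing manual correction can be found programmatically:

```python
import pandas as pd

# Hypothetical strings: a normal row, a headline that contains a second dash,
# and a string with no date prefix at all
raw = pd.Series([
    'January 21 — CDC Confirms First US Coronavirus Case',
    'March 1 — Example Headline — With a Second Dash',
    'Headline With No Date Prefix',
])

# n=1 splits only at the first separator, so the result has exactly two columns
parts = raw.str.split(' — ', n=1, expand=True)
parts.columns = ['date', 'covid_update']

# Rows where nothing was split have no value in the second column
needs_review = parts[parts['covid_update'].isna()]
print(needs_review)
```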


In [62]:
# manually correct some spotted errors 
df_tl_20.iloc[107,0] = 'November 16, 2020'
df_tl_20.iloc[107,1] = 'Moderna Reveals Vaccine Efficacy Results'
df_tl_20.iloc[114,0] = 'December 11, 2020'
df_tl_20.iloc[114,1] = 'FDA Agrees to EUA for COVID-19 Vaccine From Pfizer, BioNTech'
df_tl_20.iloc[120,0] = 'December 28, 2020'
df_tl_20.iloc[120,1] = 'Novavax Starts Phase 3 Trial of COVID-19 Vaccine'
df_tl_20.iloc[124,0] = 'December 31, 2020'
df_tl_20.iloc[124,1] = 'US Falls Short of Goal to Give 20 Million Vaccinations by Year End'

df_tl_20 = df_tl_20.drop([23, 115, 121 ])

df_tl_20

Unnamed: 0,date,covid_update
0,"January 9, 2020","WHO Announces Mysterious Coronavirus-Related Pneumonia in Wuhan, China"
1,"January 20, 2020",CDC Says 3 US Airports Will Begin Screening for Coronavirus
2,"January 21, 2020",CDC Confirms First US Coronavirus Case
3,"January 21, 2020",Chinese Scientist Confirms COVID-19 Human Transmission
4,"January 23, 2020",Wuhan Now Under Quarantine
5,"January 31, 2020",WHO Issues Global Health Emergency
6,"February 2, 2020",Global Air Travel Is Restricted
7,"February 3, 2020",US Declares Public Health Emergency
8,"February 10, 2020",China’s COVID-19 Deaths Exceed Those of SARS Crisis
9,"February 25, 2020",CDC Says COVID-19 Is Heading Toward Pandemic Status


In [63]:
# Combine events that happened on the same day
df_tl_20 = df_tl_20.groupby('date')['covid_update'].apply(', '.join).reset_index()
df_tl_20

Unnamed: 0,date,covid_update
0,"April 16, 2020",“Gating Criteria” Emerge as a Way to Reopen the Economy
1,"April 28, 2020","Young, Poor Avoid Care for COVID-19 Symptoms"
2,"April 29, 2020",NIH Trial Shows Early Promise for Remdesivir
3,"April 8, 2020",Troubles With the COVID-19 Cocktail
4,"August 11, 2020",Trump Administration Reaches Deal With Moderna
5,"August 12, 2020",Severe Obesity Increases Mortality Risk From COVID-19
6,"August 13, 2020",Biden Calls for 3-Month Mask Mandate
7,"August 15, 2020",FDA Approves Saliva Test
8,"August 17, 2020",COVID-19 Now the Third-Leading Cause of Death in the US
9,"August 23, 2020",Convalescent Plasma Is Cleared for Use by FDA


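The events are now grouped by date. The rows are ordered alphabetically by the date string rather than chronologically, which we will fix after converting the column to datetime.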

#### Combine the Timeline 

In [64]:
# concat two dataframes
df_tl = pd.concat([df_tl_20, df_tl_21], ignore_index=True)
df_tl

Unnamed: 0,date,covid_update
0,"April 16, 2020",“Gating Criteria” Emerge as a Way to Reopen the Economy
1,"April 28, 2020","Young, Poor Avoid Care for COVID-19 Symptoms"
2,"April 29, 2020",NIH Trial Shows Early Promise for Remdesivir
3,"April 8, 2020",Troubles With the COVID-19 Cocktail
4,"August 11, 2020",Trump Administration Reaches Deal With Moderna
5,"August 12, 2020",Severe Obesity Increases Mortality Risk From COVID-19
6,"August 13, 2020",Biden Calls for 3-Month Mask Mandate
7,"August 15, 2020",FDA Approves Saliva Test
8,"August 17, 2020",COVID-19 Now the Third-Leading Cause of Death in the US
9,"August 23, 2020",Convalescent Plasma Is Cleared for Use by FDA


In [65]:
# check columns and dtype
df_tl.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 186 entries, 0 to 185
Data columns (total 2 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   date          186 non-null    object
 1   covid_update  186 non-null    object
dtypes: object(2)
memory usage: 3.0+ KB


In [66]:
# Convert the date strings to datetime and sort chronologically
df_tl['date'] = pd.to_datetime(df_tl['date'])
df_tl = df_tl.sort_values(by = 'date', ignore_index = True)
df_tl.head()

Unnamed: 0,date,covid_update
0,2020-01-09,"WHO Announces Mysterious Coronavirus-Related Pneumonia in Wuhan, China"
1,2020-01-20,CDC Says 3 US Airports Will Begin Screening for Coronavirus
2,2020-01-21,"CDC Confirms First US Coronavirus Case, Chinese Scientist Confirms COVID-19 Human Transmission"
3,2020-01-23,Wuhan Now Under Quarantine
4,2020-01-31,WHO Issues Global Health Emergency
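
In the cell above, pd.to_datetime infers the date format. A minimal sketch on toy date strings shows how the format can be stated explicitly and why the sort has to come after the conversion. The variable names are illustrative only.

```python
import pandas as pd

dates = pd.Series(['April 8, 2020', 'January 9, 2020', 'April 16, 2020'])

# Sorting the raw strings is alphabetical: 'April 16' comes before 'April 8'
as_text_order = sorted(dates)

# '%B %d, %Y' matches strings such as 'April 8, 2020'
parsed = pd.to_datetime(dates, format='%B %d, %Y')
chronological = parsed.sort_values(ignore_index=True).dt.strftime('%Y-%m-%d').tolist()

print(as_text_order)
print(chronological)
```

An explicit format makes the conversion fail loudly on a malformed string instead of guessing.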


The combined table looks good. However, we assume that a news item or update is usually not reflected on social media until the following day, so before joining the table to the tweet data, let's shift every date forward by one day.

In [67]:
df_tl['date'] = df_tl['date'].apply(lambda x: x + pd.Timedelta('1 day'))
df_tl.head()

Unnamed: 0,date,covid_update
0,2020-01-10,"WHO Announces Mysterious Coronavirus-Related Pneumonia in Wuhan, China"
1,2020-01-21,CDC Says 3 US Airports Will Begin Screening for Coronavirus
2,2020-01-22,"CDC Confirms First US Coronavirus Case, Chinese Scientist Confirms COVID-19 Human Transmission"
3,2020-01-24,Wuhan Now Under Quarantine
4,2020-02-01,WHO Issues Global Health Emergency


Also, the effect of a news item on social media usually lasts for a few days. Let's add a rule that every update lasts for two days unless the following day has its own update.

In [68]:
# Copy every update to the following day
one_day = pd.Timedelta('1 day')
carry_over = df_tl.assign(date = df_tl['date'] + one_day)

# Keep a copy only if that day does not already have its own update
carry_over = carry_over[~carry_over['date'].isin(df_tl['date'])]

# Add the carried-over rows to the timeline
df_tl = pd.concat([df_tl, carry_over], ignore_index = True)
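
The two-day rule can be checked on a toy timeline. This is a sketch with made-up events A, B and C, not the scraped data: A has no update on the following day and should be repeated, B is followed by C on the next day and should not be repeated, and C is the last event and should be repeated.

```python
import pandas as pd

toy = pd.DataFrame({
    'date': pd.to_datetime(['2020-01-10', '2020-01-21', '2020-01-22']),
    'covid_update': ['A', 'B', 'C'],
})

# Candidate copies of every update, dated one day later
carry = toy.assign(date=toy['date'] + pd.Timedelta('1 day'))

# Keep a copy only if the following day has no update of its own
carry = carry[~carry['date'].isin(toy['date'])]

extended = (pd.concat([toy, carry], ignore_index=True)
              .sort_values('date', ignore_index=True))
print(extended)
```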

In [73]:
df_tl = df_tl.sort_values(by = 'date', ignore_index = True)
df_tl.head(5)

Unnamed: 0,date,covid_update
0,2020-01-10,"WHO Announces Mysterious Coronavirus-Related Pneumonia in Wuhan, China"
1,2020-01-11,"WHO Announces Mysterious Coronavirus-Related Pneumonia in Wuhan, China"
2,2020-01-21,CDC Says 3 US Airports Will Begin Screening for Coronavirus
3,2020-01-22,"CDC Confirms First US Coronavirus Case, Chinese Scientist Confirms COVID-19 Human Transmission"
4,2020-01-23,"CDC Confirms First US Coronavirus Case, Chinese Scientist Confirms COVID-19 Human Transmission"


In [74]:
# Take a look at the shape of dataset
print(f'There are {df_tl.shape[0]:,} rows and {df_tl.shape[1]} columns in the dataset.')

There are 286 rows and 2 columns in the dataset.


The COVID-19 updates data set is ready. Let's store it in a CSV file for further analysis.

In [75]:
# Export to .csv
df_tl.to_csv('covid19_timeline.csv', index = False)

### Major Cities in North America 

In this project, we will analyze tweets about COVID-19 vaccines posted from major cities in North America. Let's get a list of these cities.

**Source**: 

https://www.geodatos.net/en/coordinates/canada

https://www.geodatos.net/en/coordinates/united-states

pandas' read_html could not scrape the tables from this website (HTTPError: HTTP Error 403: Forbidden), so I manually copied the data into CSV files and imported them here using pandas.
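
A 403 response is sometimes caused by the default user agent that Python's HTTP client sends. The following is only a sketch of how a request with a browser-like header could be built and its HTML passed to read_html. Whether this site would accept such a request is an assumption that was not tested here, the header value is illustrative, and the helper names `build_request` and `read_tables` are hypothetical. pd.read_html also needs an HTML parser such as lxml to be installed.

```python
import io
import urllib.request

import pandas as pd

# Assumption: an illustrative browser-like header value
BROWSER_HEADERS = {'User-Agent': 'Mozilla/5.0'}


def build_request(url):
    """Create a request object that carries the browser-like header."""
    return urllib.request.Request(url, headers=BROWSER_HEADERS)


def read_tables(url):
    """Download the page and return every HTML table as a DataFrame."""
    with urllib.request.urlopen(build_request(url)) as response:
        html = response.read().decode('utf-8', errors='replace')
    return pd.read_html(io.StringIO(html))


# Only the request is built here; read_tables(url) would perform the download
request = build_request('https://www.geodatos.net/en/coordinates/canada')
print(request.full_url)
```

If the site still refuses the request, copying the table by hand, as done in this notebook, remains the fallback.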

In [76]:
# Import major Canadian cities and their coordinates
coord_ca = pd.read_csv('/Users/rachelchen/Desktop/CapstoneProject/ca_latlng.csv')
# Import major US cities and their coordinates
coord_us = pd.read_csv('/Users/rachelchen/Desktop/CapstoneProject/us_latlng.csv')
# Combine the two tables
coord = pd.concat([coord_ca, coord_us], ignore_index=True)
coord

Unnamed: 0,City,Coordinates
0,Toronto,"43.70011, -79.4163"
1,Ottawa,"45.41117, -75.69812"
2,Montréal,"45.50884, -73.58781"
3,Edmonton,"53.55014, -113.46871"
4,Mississauga,"43.5789, -79.6583"
5,Winnipeg,"49.8844, -97.14704"
6,Vancouver,"49.24966, -123.11934"
7,Hamilton,"43.25011, -79.84963"
8,Calgary,"51.05011, -114.08529"
9,Brampton,"43.68341, -79.76633"


We want to see the tweets posted from North America's major cities and their surrounding areas. To ensure we get enough tweets to analyze, I set the search radius to 200 km.
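
With a 200 km radius, the search circles of neighbouring cities overlap. A short sketch of the haversine great-circle distance, applied to the Toronto and Mississauga coordinates from the table above, illustrates this. The function name `haversine_km` and the mean Earth radius of 6371 km are choices made for this example.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


toronto = (43.70011, -79.4163)
mississauga = (43.5789, -79.6583)

distance = haversine_km(*toronto, *mississauga)
# Two circles of radius 200 km overlap when their centres are under 400 km apart
circles_overlap = distance < 2 * 200
print(round(distance, 1), circles_overlap)
```

Because the two centres are only a few tens of kilometres apart, searches around both cities cover almost the same area and can return the same tweets.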

In [77]:
# Convert the coordinates column to a list
coord_list = coord['Coordinates'].tolist()

# Append the search radius to each coordinate pair so it will be accepted by Twint
geo_list = [coords + ', 200 km' for coords in coord_list]

geo_list

['43.70011, -79.4163, 200 km',
 '45.41117, -75.69812, 200 km',
 '45.50884, -73.58781, 200 km',
 '53.55014, -113.46871, 200 km',
 '43.5789, -79.6583, 200 km',
 '49.8844, -97.14704, 200 km',
 '49.24966, -123.11934, 200 km',
 '43.25011, -79.84963, 200 km',
 '51.05011, -114.08529, 200 km',
 '43.68341, -79.76633, 200 km',
 '49.10635, -122.82509, 200 km',
 '45.56995, -73.692, 200 km',
 '44.6464, -63.57291, 200 km',
 '42.98339, -81.23304, 200 km',
 '43.90012, -78.84957, 200 km',
 '50.36386, -119.34997, 200 km',
 '48.4359, -123.35155, 200 km',
 '42.30008, -83.01654, 200 km',
 '46.81228, -71.21454, 200 km',
 '43.86682, -79.2663, 200 km',
 '40.71427, -74.00597, 200 km',
 '34.05223, -118.24368, 200 km',
 '41.85003, -87.65005, 200 km',
 '29.76328, -95.36327, 200 km',
 '39.95233, -75.16379, 200 km',
 '33.44838, -112.07404, 200 km',
 '29.42412, -98.49363, 200 km',
 '32.71571, -117.16472, 200 km',
 '32.78306, -96.80667, 200 km',
 '40.6501, -73.94958, 200 km',
 '40.68149, -73.83652, 200 km',
 '37.3393

### Tweet Scraping

We can customize the search using Twint. Our scope is:
- Tweets discussing COVID-19 vaccines from non-verified Twitter accounts.
- Tweets written in English only.
- No retweets, because we don't want repeated information.
- Tweets posted between the date the first COVID-19 case was confirmed in the US (January 21, 2020) and the end date set in the query.
- Tweets posted from North America's major cities and their surrounding areas.

Because the tweet scraping takes a long time to run, we save the data to a CSV file so that we don't need to rerun it every time.
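
Because the search areas of nearby cities overlap, the same tweet can be returned for more than one city. A minimal sketch of removing such repeats by tweet id once the data is loaded is shown here. The rows are toy values standing in for the saved CSV, and only the `id`, `username` and `tweet` column names are taken from the fields requested in this notebook.

```python
import pandas as pd

# Toy rows standing in for the saved tweets; id 101 appears twice
tweets = pd.DataFrame({
    'id': [101, 102, 101, 103],
    'username': ['user_a', 'user_b', 'user_a', 'user_c'],
    'tweet': ['first', 'second', 'first', 'third'],
})

# Keep the first occurrence of every tweet id
unique_tweets = (tweets.drop_duplicates(subset='id', keep='first')
                       .reset_index(drop=True))
print(unique_tweets)
```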

In [115]:
import twint
import nest_asyncio

nest_asyncio.apply()

for geo in geo_list:
    c = twint.Config()
    c.Search = '(covid OR coronavirus) AND (vaccination OR vaccine OR vaccinated) AND -filter:verified'
    c.Lang = 'en'
    c.Filter_retweets = True
    c.Lowercase = True
    c.Since = '2020-01-21'
    c.Until = '2021-05-15'
    c.Custom['tweet'] = ['id', 'created_at','user_id', 'username', 'tweet', 'replies_count', 'retweets_count', 'likes_count',
                         'hashtags', 'geo']
    c.Geo = geo
    c.Limit = 10
    #c.Store_csv = True
    #c.Output = '/Users/rachelchen/Desktop/CapstoneProject/tweets.csv'   
    twint.run.Search(c)

1392992803576242177 2021-05-13 19:58:59 -0400 <Rick_City> Since fully vaccinated people can still get covid, can't they spread it even if they don't have symptoms? Or has it been proven they can't?
1392992194357776385 2021-05-13 19:56:34 -0400 <peeair> why does America think that being fully vaxxed means they’re protected from getting Covid?  full vaccinated people can still get Covid and spread it.  🤦🏻‍♀️🤦🏻‍♀️🤦🏻‍♀️
1392992185654644743 2021-05-13 19:56:32 -0400 <politixNplay> Daily COVID-19 vaccine doses administered per 100 people  https://t.co/mOv2XHe6ig
1392990538706661377 2021-05-13 19:49:59 -0400 <renne_fuller> So I officially got my 1st covid19 vaccine tonight I just got home from my covid shot. This one was safer. But mostly all get side effects such as headaches fever and chills might be sick on my 3rd  day. Or could be tonight tomorrow. But I will update it.
1392990297131569153 2021-05-13 19:49:02 -0400 <frenchnerd> Feeling very fortunate that Stephen and I got our first covid

1392992796265562112 2021-05-13 19:58:57 -0400 <NPoushinsky> @IrfanDhalla @AlexMunter This actually makes me feel a little sick, thinking of how many people have just had their AZ vaccines cancelled, and who booked those vaccination appts with full knowledge of the risk, may now contract COVID.   If they can easily/quickly get mRNA, great, if not, then what?
1392992656435859458 2021-05-13 19:58:24 -0400 <PRHhospital> The COVID-19 Vaccination Clinic Team at the PMC were very appreciative to receive a lunch at today’s clinic sponsored by Derek Nighbor and catered by The Kitchen Eatery.   @DerekNighbor  https://t.co/gpqL5Zbiyr
1392985156542210054 2021-05-13 19:28:36 -0400 <mmjmikeelkin> After 9 months of curfew and lockdowns we’ve learned r lesson. Now we’ve been blessed with mass #vaccination sites. 2nd shot June 27! How Montreal has so far dodged a third COVID-19 wave and what other cities can learn from its success - The Globe and Mail  https://t.co/YBgagFGp1m
1392981774624608262 2021-0

1392992814477332481 2021-05-13 19:59:02 -0400 <jamaislu> Les personnes de 18 ans et plus peuvent déjà prendre rendez-vous dans les différentes cliniques de vaccination contre la COVID-19 de la province, jeudi après-midi, soit quelques heures avant l’échéancier initial annoncé par le gouvernement Legault.  https://t.co/clNsfy2guP
1392992796265562112 2021-05-13 19:58:57 -0400 <NPoushinsky> @IrfanDhalla @AlexMunter This actually makes me feel a little sick, thinking of how many people have just had their AZ vaccines cancelled, and who booked those vaccination appts with full knowledge of the risk, may now contract COVID.   If they can easily/quickly get mRNA, great, if not, then what?
1392985156542210054 2021-05-13 19:28:36 -0400 <mmjmikeelkin> After 9 months of curfew and lockdowns we’ve learned r lesson. Now we’ve been blessed with mass #vaccination sites. 2nd shot June 27! How Montreal has so far dodged a third COVID-19 wave and what other cities can learn from its success - The Globe 

1392992637368504324 2021-05-13 19:58:19 -0400 <YianniMacris> Being pretty sick has sucked (it’s not COVID, don’t worry). It’s meant that I’ve had to take it easy, reschedule my vaccination, and been unable to do much exercise. But if there’s one thing that’s getting me through this, it’s listening to sea shanties.   https://t.co/05vGDNPF1g
1392992301568471042 2021-05-13 19:56:59 -0400 <natnatow> @susannchau Oh yeah 100% they’ll use it as an excuse to not wear a mask. And you can still spread and catch covid with the vaccine soooo ... I think it’s careless to issue this statement in the USA when the vaccination progress is finally going ok
1392977998098444288 2021-05-13 19:00:09 -0400 <CityNewsYEG> Expert says the late-stage trials have been going very well for Quebec-based Medicago COVID-19 vaccine.  https://t.co/94Dq25s6Hh
1392975376557563904 2021-05-13 18:49:44 -0400 <Winnie1986> The #Republicans are the only one who are not vaccinated nor wear a mask. When are #Republicans getting t

1392992803576242177 2021-05-13 19:58:59 -0400 <Rick_City> Since fully vaccinated people can still get covid, can't they spread it even if they don't have symptoms? Or has it been proven they can't?
1392992194357776385 2021-05-13 19:56:34 -0400 <peeair> why does America think that being fully vaxxed means they’re protected from getting Covid?  full vaccinated people can still get Covid and spread it.  🤦🏻‍♀️🤦🏻‍♀️🤦🏻‍♀️
1392992185654644743 2021-05-13 19:56:32 -0400 <politixNplay> Daily COVID-19 vaccine doses administered per 100 people  https://t.co/mOv2XHe6ig
1392990538706661377 2021-05-13 19:49:59 -0400 <renne_fuller> So I officially got my 1st covid19 vaccine tonight I just got home from my covid shot. This one was safer. But mostly all get side effects such as headaches fever and chills might be sick on my 3rd  day. Or could be tonight tomorrow. But I will update it.
1392990297131569153 2021-05-13 19:49:02 -0400 <frenchnerd> Feeling very fortunate that Stephen and I got our first covid

1392989238606565380 2021-05-13 19:44:49 -0400 <JohnAllen6> @vocabularyspilz @jeremyfaust This is risky. Have seen fully vaccinated getting COVID. Still need to wear a mask unfortunately
1392988254350286850 2021-05-13 19:40:54 -0400 <WanderingGirl10> @LastChanceCraft @TBots88 @WinnipegNews Nonetheless, the impact of Covid on someone with Type 1 diabetes is a WORSE risk than any vaccine. So they're still better off getting vaccinated.  Her story just makes me all the more pissed off at the anti-vaxxers who lie and lie and lie about the vaccine. They lie, people die.
1392977529506385922 2021-05-13 18:58:17 -0400 <whikloj> @jonvankin @BillWylie3rd @RadioFreeTom Health care workers probably get more than average exposure to Covid. If I use the numbers for this CDC page ( https://t.co/EJpTS8r1Q5) of 9,245 infections among 95,000,000 vaccinated Americans. Then (I think) the infection rate is 0.0097%. Which is not zero, but is pretty close.
1392962274491572224 2021-05-13 17:57:40 -0400 <LMBrys

1392992679416320000 2021-05-13 19:58:29 -0400 <motomotoyama> ALSO, you can still get Covid while being fully vaccinated, especially with the new variants.
1392992678548172802 2021-05-13 19:58:29 -0400 <motomotoyama> According to the NYT, the 7 week average for Covid deaths is over 600.  That is not insignificant, and there are still many many people who are at risk because they haven't gotten vaccinated yet, or can't.
1392992590459326467 2021-05-13 19:58:08 -0400 <957CoastFM> Feds surpass 20-M COVID-19 vaccines distributed, closely watching UK trial on vaccine mixing  https://t.co/s6v5tHBLVl
1392992290591834116 2021-05-13 19:56:57 -0400 <wa_beaver> @GovInslee You do know that even with the vaccine one can still get C19 and pass it on?  Oh wait, it's not about Covid, it's about Control!!  Given his age he had about ~0% chance of getting C19, but he actually has a slightly higher chance of getting a serious side effect from the vaccine.
1392992202171637766 2021-05-13 19:56:36 -0400 <1971

1392992803576242177 2021-05-13 19:58:59 -0400 <Rick_City> Since fully vaccinated people can still get covid, can't they spread it even if they don't have symptoms? Or has it been proven they can't?
1392992194357776385 2021-05-13 19:56:34 -0400 <peeair> why does America think that being fully vaxxed means they’re protected from getting Covid?  full vaccinated people can still get Covid and spread it.  🤦🏻‍♀️🤦🏻‍♀️🤦🏻‍♀️
1392992185654644743 2021-05-13 19:56:32 -0400 <politixNplay> Daily COVID-19 vaccine doses administered per 100 people  https://t.co/mOv2XHe6ig
1392990538706661377 2021-05-13 19:49:59 -0400 <renne_fuller> So I officially got my 1st covid19 vaccine tonight I just got home from my covid shot. This one was safer. But mostly all get side effects such as headaches fever and chills might be sick on my 3rd  day. Or could be tonight tomorrow. But I will update it.
1392990297131569153 2021-05-13 19:49:02 -0400 <frenchnerd> Feeling very fortunate that Stephen and I got our first covid

1392991433674559488 2021-05-13 19:53:32 -0400 <ConnieL63243185> Your first shot of COVID vaccine will be available in the near future (hopefully end of next week) at our office. It will be Moderna or Pfizer. At this time we will take requests from you if you are over 18yo to go on our wait list. Call 403-247-0787 or email travel@bowmont.ca
1392991146855387136 2021-05-13 19:52:24 -0400 <petrodude_> #Canada #yyc #covid ...sigh only a dream for us for now...meanwhile we got Dr. Hinshaw worried who can and can't wear a godamn mask....Fully vaccinated Americans can return to life without masks, CDC says   https://t.co/I2vTPHNZiz
1392987273516380162 2021-05-13 19:37:01 -0400 <prune_55> @jamiesgronski @BroadwayWorld @HamiltonMusical Seriously? The Covid vaccine will now possibly cause cancer ??!!
1392986798813372419 2021-05-13 19:35:07 -0400 <sup2today> @JBlackstaffe @jkenney Exactly and I think covid is a over blown flu , but if that’s who’s dying not 12 years olds why wouldn’t the old folks

1392992803576242177 2021-05-13 19:58:59 -0400 <Rick_City> Since fully vaccinated people can still get covid, can't they spread it even if they don't have symptoms? Or has it been proven they can't?
1392992194357776385 2021-05-13 19:56:34 -0400 <peeair> why does America think that being fully vaxxed means they’re protected from getting Covid?  full vaccinated people can still get Covid and spread it.  🤦🏻‍♀️🤦🏻‍♀️🤦🏻‍♀️
1392992185654644743 2021-05-13 19:56:32 -0400 <politixNplay> Daily COVID-19 vaccine doses administered per 100 people  https://t.co/mOv2XHe6ig
1392990538706661377 2021-05-13 19:49:59 -0400 <renne_fuller> So I officially got my 1st covid19 vaccine tonight I just got home from my covid shot. This one was safer. But mostly all get side effects such as headaches fever and chills might be sick on my 3rd  day. Or could be tonight tomorrow. But I will update it.
1392990297131569153 2021-05-13 19:49:02 -0400 <frenchnerd> Feeling very fortunate that Stephen and I got our first covid

1392992679416320000 2021-05-13 19:58:29 -0400 <motomotoyama> ALSO, you can still get Covid while being fully vaccinated, especially with the new variants.
1392992678548172802 2021-05-13 19:58:29 -0400 <motomotoyama> According to the NYT, the 7 week average for Covid deaths is over 600.  That is not insignificant, and there are still many many people who are at risk because they haven't gotten vaccinated yet, or can't.
1392992590459326467 2021-05-13 19:58:08 -0400 <957CoastFM> Feds surpass 20-M COVID-19 vaccines distributed, closely watching UK trial on vaccine mixing  https://t.co/s6v5tHBLVl
1392992290591834116 2021-05-13 19:56:57 -0400 <wa_beaver> @GovInslee You do know that even with the vaccine one can still get C19 and pass it on?  Oh wait, it's not about Covid, it's about Control!!  Given his age he had about ~0% chance of getting C19, but he actually has a slightly higher chance of getting a serious side effect from the vaccine.
1392992202171637766 2021-05-13 19:56:36 -0400 <1971

1392992814477332481 2021-05-13 19:59:02 -0400 <jamaislu> Les personnes de 18 ans et plus peuvent déjà prendre rendez-vous dans les différentes cliniques de vaccination contre la COVID-19 de la province, jeudi après-midi, soit quelques heures avant l’échéancier initial annoncé par le gouvernement Legault.  https://t.co/clNsfy2guP
1392992796265562112 2021-05-13 19:58:57 -0400 <NPoushinsky> @IrfanDhalla @AlexMunter This actually makes me feel a little sick, thinking of how many people have just had their AZ vaccines cancelled, and who booked those vaccination appts with full knowledge of the risk, may now contract COVID.   If they can easily/quickly get mRNA, great, if not, then what?
1392985156542210054 2021-05-13 19:28:36 -0400 <mmjmikeelkin> After 9 months of curfew and lockdowns we’ve learned r lesson. Now we’ve been blessed with mass #vaccination sites. 2nd shot June 27! How Montreal has so far dodged a third COVID-19 wave and what other cities can learn from its success - The Globe 

1392991787912998913 2021-05-13 19:54:57 -0400 <mentallyeel_> @katie24195748 @TheQuartering @POTUS I meant your risks are higher to die from covid than from the vaccine.
1392988811328643072 2021-05-13 19:43:07 -0400 <mattdagley> I guess in BC they give out stickers when you get the COVID-19 vaccine.
1392988457174282250 2021-05-13 19:41:43 -0400 <mentallyeel_> @katie24195748 @TheQuartering @POTUS The risks of covid killing you or permanently damaging you seems to be higher. Plus, with the vaccine you will be helping to stop the spread to others. Would you just rather die from a virus rather than a vaccine?
1392982829110333444 2021-05-13 19:19:21 -0400 <jump_thenfall13> @LauraTheSwiftie the bright side I see to this news is that it seems that they now have enough data to suggest that vaccinated people aren't at much risk of spreading or contracting Covid, and therefore don't pose much risk to themselves or others. that part seems like good news.
1392982589057679360 2021-05-13 19:18:24 -04

1392992803576242177 2021-05-13 19:58:59 -0400 <Rick_City> Since fully vaccinated people can still get covid, can't they spread it even if they don't have symptoms? Or has it been proven they can't?
1392992566119915524 2021-05-13 19:58:02 -0400 <pashulman> @jodiecongirl Also the issue is probably exposure. A  bunch of fully vaccinated people in close quarters breathing on each other where one person has covid probably won’t stop spreading.
1392992475845910542 2021-05-13 19:57:41 -0400 <Britt_Mont> Had my 2nd dose of the covid vaccine and I have so many pimples it’s like I’m hitting puberty all over again. Good things masks are socially acceptable now 😅
1392992194357776385 2021-05-13 19:56:34 -0400 <peeair> why does America think that being fully vaxxed means they’re protected from getting Covid?  full vaccinated people can still get Covid and spread it.  🤦🏻‍♀️🤦🏻‍♀️🤦🏻‍♀️
1392992185654644743 2021-05-13 19:56:32 -0400 <politixNplay> Daily COVID-19 vaccine doses administered per 100 people 

1392992803576242177 2021-05-13 19:58:59 -0400 <Rick_City> Since fully vaccinated people can still get covid, can't they spread it even if they don't have symptoms? Or has it been proven they can't?
1392992194357776385 2021-05-13 19:56:34 -0400 <peeair> why does America think that being fully vaxxed means they’re protected from getting Covid?  full vaccinated people can still get Covid and spread it.  🤦🏻‍♀️🤦🏻‍♀️🤦🏻‍♀️
1392992185654644743 2021-05-13 19:56:32 -0400 <politixNplay> Daily COVID-19 vaccine doses administered per 100 people  https://t.co/mOv2XHe6ig
1392990538706661377 2021-05-13 19:49:59 -0400 <renne_fuller> So I officially got my 1st covid19 vaccine tonight I just got home from my covid shot. This one was safer. But mostly all get side effects such as headaches fever and chills might be sick on my 3rd  day. Or could be tonight tomorrow. But I will update it.
1392990297131569153 2021-05-13 19:49:02 -0400 <frenchnerd> Feeling very fortunate that Stephen and I got our first covid

1392991212517216256 2021-05-13 19:52:40 -0400 <carriemae68> @twtnando88 @wendy_waters @richardzussman I never said the vaccine was 100% safe. Of course it isn’t. Nothing is. Water isn’t 100% safe. No vaccine is risk free. But the risk of serious events in approved vaccines far outweighs the risk of death &amp; disability caused by COVID-19
1392988819222265860 2021-05-13 19:43:09 -0400 <carriemae68> @twtnando88 @wendy_waters @richardzussman You posted an Alex Jones video with Tenpenny spreading disinformation about animal trials and the COVID-19 vaccine development.
1392984403811307524 2021-05-13 19:25:36 -0400 <CastanetKam> Single vaccine dose proving effective lowering COVID-19 rates  https://t.co/ewViNUKWmD  https://t.co/DtlO4nhyUT
1392976878433038337 2021-05-13 18:55:42 -0400 <carriemae68> @twtnando88 @wendy_waters @richardzussman SARS-CoV-2 is highly related to SARS-CoV-1 (see the numbering thing ?) Research into vaccines for SARS &amp; MERS started in 2003 ... the COVID-19 Vaccine

1392992679416320000 2021-05-13 19:58:29 -0400 <motomotoyama> ALSO, you can still get Covid while being fully vaccinated, especially with the new variants.
1392992678548172802 2021-05-13 19:58:29 -0400 <motomotoyama> According to the NYT, the 7 week average for Covid deaths is over 600.  That is not insignificant, and there are still many many people who are at risk because they haven't gotten vaccinated yet, or can't.
1392992590459326467 2021-05-13 19:58:08 -0400 <957CoastFM> Feds surpass 20-M COVID-19 vaccines distributed, closely watching UK trial on vaccine mixing  https://t.co/s6v5tHBLVl
1392992290591834116 2021-05-13 19:56:57 -0400 <wa_beaver> @GovInslee You do know that even with the vaccine one can still get C19 and pass it on?  Oh wait, it's not about Covid, it's about Control!!  Given his age he had about ~0% chance of getting C19, but he actually has a slightly higher chance of getting a serious side effect from the vaccine.
1392992202171637766 2021-05-13 19:56:36 -0400 <1971

1392992917304848386 2021-05-13 19:59:26 -0400 <MichiganBlues20> For the record, my first COVID vaccine dose was today, May 13, 2021.
1392992566119915524 2021-05-13 19:58:02 -0400 <pashulman> @jodiecongirl Also the issue is probably exposure. A  bunch of fully vaccinated people in close quarters breathing on each other where one person has covid probably won’t stop spreading.
1392992475845910542 2021-05-13 19:57:41 -0400 <Britt_Mont> Had my 2nd dose of the covid vaccine and I have so many pimples it’s like I’m hitting puberty all over again. Good things masks are socially acceptable now 😅
1392991854694645760 2021-05-13 19:55:13 -0400 <VickyKramer16> @Jessicam6946 @MrTAchilles Especially since people can catch Covid-19 even after getting vaccinated.
1392991631515820034 2021-05-13 19:54:20 -0400 <melissa_zaksek> @lriversiii Love this! Congrats to both you and your little one. Along the same vein - I was soooo pumped to get my covid vaccine - best feeling ever UNTIL today when my 12-YO got

1392991367756931073 2021-05-13 19:53:17 -0400 <Rakiko_Hime> The Deadly COVID-19 Vaccine Cover-up  https://t.co/q9AlLuHYve
1392983164721713154 2021-05-13 19:20:41 -0400 <sophieetkath> Finally was able to get an appointment to get the first dose of the Covid Vaccine. I can't wait for for it to be the June 6 already. (Yes only my first dose and in 3 weeks, welcome to Canada where we do not have enough doses to vaccinate every one who wants to be!)
1392983044609527808 2021-05-13 19:20:12 -0400 <AdamThouin> @JoeBiden If I may Say @POTUS @JoeBiden @VP @KamalaHarris @AOC @SenSchumer @LeaderMcConnell Finally A Vaccine for ages 12 years and Over! In Over 40 days from now in Allowance of the Governors Ease the The Heavy Burden Covid-19 has waged on Americans CDC Guidelines for All who Took the Vax  https://t.co/lS5BXSGfmZ
1392976840692834306 2021-05-13 18:55:33 -0400 <AnnieCBrisson> Début des activités pour un pôle de vaccination en entreprise dans le haut du #LacStJean Collaboration entre le cé

1392992803576242177 2021-05-13 19:58:59 -0400 <Rick_City> Since fully vaccinated people can still get covid, can't they spread it even if they don't have symptoms? Or has it been proven they can't?
1392992194357776385 2021-05-13 19:56:34 -0400 <peeair> why does America think that being fully vaxxed means they’re protected from getting Covid?  full vaccinated people can still get Covid and spread it.  🤦🏻‍♀️🤦🏻‍♀️🤦🏻‍♀️
1392992185654644743 2021-05-13 19:56:32 -0400 <politixNplay> Daily COVID-19 vaccine doses administered per 100 people  https://t.co/mOv2XHe6ig
1392990538706661377 2021-05-13 19:49:59 -0400 <renne_fuller> So I officially got my 1st covid19 vaccine tonight I just got home from my covid shot. This one was safer. But mostly all get side effects such as headaches fever and chills might be sick on my 3rd  day. Or could be tonight tomorrow. But I will update it.
1392990297131569153 2021-05-13 19:49:02 -0400 <frenchnerd> Feeling very fortunate that Stephen and I got our first covid

1392993016894435329 2021-05-13 19:59:50 -0400 <sassydogmom_> @TraumaRama_RN We got hand sanitizer and leftover pens from the covid vaccination centers
1392992972682235904 2021-05-13 19:59:39 -0400 <GeorgiaBoyCore1> People on here calling Biden a dictator because he's for COVID-19 safety til most people get vaccinated. As much as I dislike him he's right this time. Those calling him a dictator are likely all privileged cishet whites who have never experienced oppression in their lives.
1392992794579513348 2021-05-13 19:58:57 -0400 <shidlewitz> @nypost The vaccine is garbage. I know 4 family fully vaccinated got covid right afterwards... wore masks etc. Fauci sucks
1392992743991939079 2021-05-13 19:58:45 -0400 <DebraHange1> @davidhogg111 Until your spaces, areas, and state are less than 75% Fully Vaccinated, you are doing the right thing in my opinion. It's still here.  I'd hate to see an increase in spread that shows up as a new variant that is more deadly or causes more long Covid Synd

1392992763176579074 2021-05-13 19:58:49 -0400 <kasjas> @aslavitt46 This guidance will INCREASE the COVID risk in every indoor space for people who can’t be vaccinated yet (kids, immune comprised) b/c, in the real world, many  unvaccinated will just lie. It’s unfairly shifting the risk to people who have no vaccine access from those who refuse it
1392992528207687688 2021-05-13 19:57:53 -0400 <flauralll> Def gonna be late for my stream tonight :( getting my second dose of the Covid vaccine and I’m stuck downtown with only growing traffic problems
1392992326729965568 2021-05-13 19:57:05 -0400 <antlantern> @brooke_irl It does seem to be getting a bit lighter, and in fairness to laser, covid made it really hard to keep a consistent appointment schedule.
1392992276222136323 2021-05-13 19:56:53 -0400 <OutLawStarLord9> @luigifool The NY Yankees are fully vaccinated and 7 players got covid
1392992247466004482 2021-05-13 19:56:46 -0400 <Brandon_Beaber> @Jessnimm Fair enough.  I am in Los Angeles

1392992819619536910 2021-05-13 19:59:03 -0400 <LenoreBurger1> @KatiePhang Doesn't matter, as fully vaccinated people are not at risk of catching Covid
1392992707639918595 2021-05-13 19:58:36 -0400 <rwgranny> @LawrenceGostin There is no good reason to force vaccinated people to spend another minute wearing a mask.  If you want to wear a mask, go for it.  I’m fully vaccinated and already had Covid.  I’m done with it.
1392992664786718720 2021-05-13 19:58:26 -0400 <Beckrmr> @artj420 @POTUS If the virus does not die off, it will continue to mutate &amp; eventually the vaccine becomes ineffective. Brilliant scientist have been studying virus’, in particular, corona viruses for decades, so were able to develop quickly for Covid 19; vaccine for next strain may take longer.
1392992561674006532 2021-05-13 19:58:01 -0400 <iamjasonsteward> @nosoupforgeorge Getting a vaccine doesn't make you healthy, especially if you have to take it each year. Living healthier, eating healthier, and being a better

1392992929359220740 2021-05-13 19:59:29 -0400 <IzzyCEros> @nazadelic @POTUS I have been waiting for @Novavax as well, but I don’t think they have a vaccine at all...  I think they needed a financial life-line and capitalized on the Covid-craze.   They have hardly manufactured any doses and haven’t even passed crucial testing yet lol!
1392992907657940994 2021-05-13 19:59:24 -0400 <anne_halsey> Free! Covid vaccination clinic at Dunbar Recreation Center, tonight until 10:00. 12 and up! Let’s do this, y’all! 💜 #RattlerUp  https://t.co/BfKfNlKoiL
1392992256538386432 2021-05-13 19:56:49 -0400 <felicititty1> @weezersucks I kinda don’t understand why it changed. COVID only protects you, not others. So I figure unless you’re hanging with fully vaccinated people, you should still wear a mask 🧐
1392992191883132930 2021-05-13 19:56:33 -0400 <961nowsa> COVID Vaccine Not Required For Texas Students  https://t.co/M1x2bPmQzJ
1392990951711387649 2021-05-13 19:51:38 -0400 <Sundog512> @nudog71 We have a 

1392993016894435329 2021-05-13 19:59:50 -0400 <sassydogmom_> @TraumaRama_RN We got hand sanitizer and leftover pens from the covid vaccination centers
1392992972682235904 2021-05-13 19:59:39 -0400 <GeorgiaBoyCore1> People on here calling Biden a dictator because he's for COVID-19 safety til most people get vaccinated. As much as I dislike him he's right this time. Those calling him a dictator are likely all privileged cishet whites who have never experienced oppression in their lives.
1392992939186561025 2021-05-13 19:59:31 -0400 <Tyheam_Csedr> @LoMasVogue @JoeBiden That’s because the vaccine wasn’t designed to stop you from catching or transmitting it, it was designed to lessen the effect of Covid so you don’t get deathly symptoms.  It seems like not a lot of people understand that.  They turned Covid-19 into Flu-2.0
1392992897029611524 2021-05-13 19:59:21 -0400 <TheOneWhoTalks4> @Sgt_Butt Watch to reach a bit more? I am tired of morons not wearing their masks and not vaccinated then 

1392992388155461634 2021-05-13 19:57:20 -0400 <deceptibling> @ACTBrigitte cause the vaccine dosnt protect against covid
1392991952409296900 2021-05-13 19:55:36 -0400 <CarltonHawkins> Yankees Covid Outbreak: Gleyber Torres Is Eighth Member Of Club To Test Positive Despite Being Vaccinated  https://t.co/MhcW7AZgTd via @SportsMoneyBlog
1392990414647353344 2021-05-13 19:49:30 -0400 <bucks520> @POTUS “The people” doesn’t mean government! It scares me how hard your pushing this vaccination. I identify as vaccinated so leave me alone. I haven’t worn a mask and I’ve never caught covid, but if I do I know I got a 99% of living
1392990322112688128 2021-05-13 19:49:07 -0400 <gracebenanti> But they will still be at risk from wanting to quit their job in healthcare because they got vaccinated and wore their mask and didn't horde toilet paper. But there are still idiots not wearing masks and giving covid to each other. And then we run out of gasoline literally.
1392990321198333952 2021-05-13 19:49:0

1392992929359220740 2021-05-13 19:59:29 -0400 <IzzyCEros> @nazadelic @POTUS I have been waiting for @Novavax as well, but I don’t think they have a vaccine at all...  I think they needed a financial life-line and capitalized on the Covid-craze.   They have hardly manufactured any doses and haven’t even passed crucial testing yet lol!
1392992907657940994 2021-05-13 19:59:24 -0400 <anne_halsey> Free! Covid vaccination clinic at Dunbar Recreation Center, tonight until 10:00. 12 and up! Let’s do this, y’all! 💜 #RattlerUp  https://t.co/BfKfNlKoiL
1392992256538386432 2021-05-13 19:56:49 -0400 <felicititty1> @weezersucks I kinda don’t understand why it changed. COVID only protects you, not others. So I figure unless you’re hanging with fully vaccinated people, you should still wear a mask 🧐
1392992191883132930 2021-05-13 19:56:33 -0400 <961nowsa> COVID Vaccine Not Required For Texas Students  https://t.co/M1x2bPmQzJ
1392990544431882245 2021-05-13 19:50:00 -0400 <LPPanther> I'm genuinely confu

1392992763176579074 2021-05-13 19:58:49 -0400 <kasjas> @aslavitt46 This guidance will INCREASE the COVID risk in every indoor space for people who can’t be vaccinated yet (kids, immune comprised) b/c, in the real world, many  unvaccinated will just lie. It’s unfairly shifting the risk to people who have no vaccine access from those who refuse it
1392992528207687688 2021-05-13 19:57:53 -0400 <flauralll> Def gonna be late for my stream tonight :( getting my second dose of the Covid vaccine and I’m stuck downtown with only growing traffic problems
1392992326729965568 2021-05-13 19:57:05 -0400 <antlantern> @brooke_irl It does seem to be getting a bit lighter, and in fairness to laser, covid made it really hard to keep a consistent appointment schedule.
1392992276222136323 2021-05-13 19:56:53 -0400 <OutLawStarLord9> @luigifool The NY Yankees are fully vaccinated and 7 players got covid
1392992247466004482 2021-05-13 19:56:46 -0400 <Brandon_Beaber> @Jessnimm Fair enough.  I am in Los Angeles

1392989029776429063 2021-05-13 19:43:59 -0400 <ChrisRThornton> @theTonyGee Yep! Bill Maher is vaccinated and was just diagnosed with covid. Several New York Yankees also vaccinated and also tested positive for COVID.
1392988518717263878 2021-05-13 19:41:57 -0400 <Gooma2seven> @TomCottonAR as usual, your racism is showing. Exactly why are you afraid of examining what is clearly systemic racism in America?
1392988058954371076 2021-05-13 19:40:08 -0400 <kdhnews> Amazon is seeking to hire 75,000 people in a tight job market and is offering bonuses to attract workers, including $100 for new hires who are already vaccinated for COVID-19.  https://t.co/0FjwfaRPH2
1392987862631596034 2021-05-13 19:39:21 -0400 <LIFESTYLETASTE> CVS, Walgreens and Macy's said they are reviewing their requirements for facial coverings following new CDC guidance easing mask wearing for people vaccinated against Covid-19
1392987288741756928 2021-05-13 19:37:04 -0400 <TonyaLampley> @StycksOfficial @CNN Serious questi

1392993016894435329 2021-05-13 19:59:50 -0400 <sassydogmom_> @TraumaRama_RN We got hand sanitizer and leftover pens from the covid vaccination centers
1392992972682235904 2021-05-13 19:59:39 -0400 <GeorgiaBoyCore1> People on here calling Biden a dictator because he's for COVID-19 safety til most people get vaccinated. As much as I dislike him he's right this time. Those calling him a dictator are likely all privileged cishet whites who have never experienced oppression in their lives.
1392992794579513348 2021-05-13 19:58:57 -0400 <shidlewitz> @nypost The vaccine is garbage. I know 4 family fully vaccinated got covid right afterwards... wore masks etc. Fauci sucks
1392992743991939079 2021-05-13 19:58:45 -0400 <DebraHange1> @davidhogg111 Until your spaces, areas, and state are less than 75% Fully Vaccinated, you are doing the right thing in my opinion. It's still here.  I'd hate to see an increase in spread that shows up as a new variant that is more deadly or causes more long Covid Synd

1392993016894435329 2021-05-13 19:59:50 -0400 <sassydogmom_> @TraumaRama_RN We got hand sanitizer and leftover pens from the covid vaccination centers
1392992972682235904 2021-05-13 19:59:39 -0400 <GeorgiaBoyCore1> People on here calling Biden a dictator because he's for COVID-19 safety til most people get vaccinated. As much as I dislike him he's right this time. Those calling him a dictator are likely all privileged cishet whites who have never experienced oppression in their lives.
1392992794579513348 2021-05-13 19:58:57 -0400 <shidlewitz> @nypost The vaccine is garbage. I know 4 family fully vaccinated got covid right afterwards... wore masks etc. Fauci sucks
1392992743991939079 2021-05-13 19:58:45 -0400 <DebraHange1> @davidhogg111 Until your spaces, areas, and state are less than 75% Fully Vaccinated, you are doing the right thing in my opinion. It's still here.  I'd hate to see an increase in spread that shows up as a new variant that is more deadly or causes more long Covid Synd

1392992390269390848 2021-05-13 19:57:21 -0400 <britnurseUSA> @coreyreynoldsLA Have taken care of COVID patients, have had COVID and am fully vaccinated.  Working in healthcare and around people who are not vaccinated because...various reasons.  🤷‍♀️ Not sure when I will be ready to not wear a mask, even outside of work.
1392992388864307204 2021-05-13 19:57:20 -0400 <robhon_> @AlbanyCheshire @AngeleOutWest @lourdesgnavarro "You can still contract and spread COVID-19 even if you've been vaccinated..."  Based on the latest research, I don't think that's correct.  https://t.co/FfkEp1ph9F
1392992313073291265 2021-05-13 19:57:02 -0400 <bert_gilfoyle> Bill Maher tested positive for COVID during a weekly PCR test, despite being fully vaccinated.
1392991935468433409 2021-05-13 19:55:32 -0400 <Black_Action> Covid-19 reacting to news that fully vaccinated people no longer need to wear a mask. #COVID19  https://t.co/U4QzVnojMf
1392991043545563138 2021-05-13 19:51:59 -0400 <purposedrivenl> Eric Cla

1392992929359220740 2021-05-13 19:59:29 -0400 <IzzyCEros> @nazadelic @POTUS I have been waiting for @Novavax as well, but I don’t think they have a vaccine at all...  I think they needed a financial life-line and capitalized on the Covid-craze.   They have hardly manufactured any doses and haven’t even passed crucial testing yet lol!
1392992907657940994 2021-05-13 19:59:24 -0400 <anne_halsey> Free! Covid vaccination clinic at Dunbar Recreation Center, tonight until 10:00. 12 and up! Let’s do this, y’all! 💜 #RattlerUp  https://t.co/BfKfNlKoiL
1392992256538386432 2021-05-13 19:56:49 -0400 <felicititty1> @weezersucks I kinda don’t understand why it changed. COVID only protects you, not others. So I figure unless you’re hanging with fully vaccinated people, you should still wear a mask 🧐
1392992191883132930 2021-05-13 19:56:33 -0400 <961nowsa> COVID Vaccine Not Required For Texas Students  https://t.co/M1x2bPmQzJ
1392990544431882245 2021-05-13 19:50:00 -0400 <LPPanther> I'm genuinely confu

1392992463028117504 2021-05-13 19:57:38 -0400 <p_yelvington> @SophiaJunz @MeidasTouch I think because I live in Florida, which is Covid friendly because of our incompetent, lying, arrogant Governor, who is vaccinated by the way, doesn't want to slow down the spread. #IWillWearAMask
1392991722515451905 2021-05-13 19:54:41 -0400 <SECSDA> May 15, 4 pm EDT, join a COVID-19 Vaccine Symposium with healthcare, communication, and theological experts discussing how to make well-informed decisions about getting the vaccine.  Send questions: covidquestions@nadadventist.org Learn more:  https://t.co/FyNL3Ft5Et  #covid19  https://t.co/f1KBWR9FCR
1392990771377319938 2021-05-13 19:50:55 -0400 <MarkHalleyPhD> When your family member who called you a sheep for getting the vaccine posts a selfie and is on oxygen while being treated for COVID.  https://t.co/7siHB4zeVO
1392989814799151104 2021-05-13 19:47:07 -0400 <iceberg171> Anti-vaxxers fear the vaccine but not Covid. 🤷‍♂️
1392989110596292611 2021-05-1

1392992390269390848 2021-05-13 19:57:21 -0400 <britnurseUSA> @coreyreynoldsLA Have taken care of COVID patients, have had COVID and am fully vaccinated.  Working in healthcare and around people who are not vaccinated because...various reasons.  🤷‍♀️ Not sure when I will be ready to not wear a mask, even outside of work.
1392992388864307204 2021-05-13 19:57:20 -0400 <robhon_> @AlbanyCheshire @AngeleOutWest @lourdesgnavarro "You can still contract and spread COVID-19 even if you've been vaccinated..."  Based on the latest research, I don't think that's correct.  https://t.co/FfkEp1ph9F
1392992313073291265 2021-05-13 19:57:02 -0400 <bert_gilfoyle> Bill Maher tested positive for COVID during a weekly PCR test, despite being fully vaccinated.
1392989342226739201 2021-05-13 19:45:14 -0400 <jamesmack2988> @LateRoundQB @happyenchilada2 @Bease11 Look JJ if vaccinated people can catch covid the virus can mutate. It’s not hard to understand....
1392988640435785730 2021-05-13 19:42:27 -0400 <theur

1392992917304848386 2021-05-13 19:59:26 -0400 <MichiganBlues20> For the record, my first COVID vaccine dose was today, May 13, 2021.
1392992566119915524 2021-05-13 19:58:02 -0400 <pashulman> @jodiecongirl Also the issue is probably exposure. A  bunch of fully vaccinated people in close quarters breathing on each other where one person has covid probably won’t stop spreading.
1392992475845910542 2021-05-13 19:57:41 -0400 <Britt_Mont> Had my 2nd dose of the covid vaccine and I have so many pimples it’s like I’m hitting puberty all over again. Good things masks are socially acceptable now 😅
1392991854694645760 2021-05-13 19:55:13 -0400 <VickyKramer16> @Jessicam6946 @MrTAchilles Especially since people can catch Covid-19 even after getting vaccinated.
1392991631012474883 2021-05-13 19:54:20 -0400 <SusanCollings4> @RadioFreeTom A fully vaccinated person can still contract Covid. It will make them sick-just not enough to be hospitalized. It prevents Covid deaths. I will likely still mask up.

1392989029776429063 2021-05-13 19:43:59 -0400 <ChrisRThornton> @theTonyGee Yep! Bill Maher is vaccinated and was just diagnosed with covid. Several New York Yankees also vaccinated and also tested positive for COVID.
1392988518717263878 2021-05-13 19:41:57 -0400 <Gooma2seven> @TomCottonAR as usual, your racism is showing. Exactly why are you afraid of examining what is clearly systemic racism in America?
1392988058954371076 2021-05-13 19:40:08 -0400 <kdhnews> Amazon is seeking to hire 75,000 people in a tight job market and is offering bonuses to attract workers, including $100 for new hires who are already vaccinated for COVID-19.  https://t.co/0FjwfaRPH2
1392987862631596034 2021-05-13 19:39:21 -0400 <LIFESTYLETASTE> CVS, Walgreens and Macy's said they are reviewing their requirements for facial coverings following new CDC guidance easing mask wearing for people vaccinated against Covid-19
1392987060592685061 2021-05-13 19:36:10 -0400 <bookabouttrains> @ChildishHegel ok thats great be

1392990842554654720 2021-05-13 19:51:12 -0400 <Anarch_King> ...to put an end to this pandemic is for literally everyone to get vaccinated and in the meantime to wear a mask in public at all times. This is again, from the head infectious disease doctor of the entire Lexington KY hospital, who's in charge of stopping covid spreading in it.
1392990589709398016 2021-05-13 19:50:11 -0400 <poljunkie12> @BouncingHH @RoArquette You’re right on people not understanding the difference. But the CDC wouldn’t make today’s announcement if vaccinated people, who get the virus after vaccination, had severe symptoms. Sounds like the data is showing little to no effects when getting Covid after vaccination.
1392990211555135490 2021-05-13 19:48:41 -0400 <karenkb> @JennaEllisEsq They have no legal position to tell us what to do with a mask.... or our bodies.  Those of us who have had Covid are safer then those vaccinated.
1392988753761906690 2021-05-13 19:42:54 -0400 <CerasoliEvan> @Goddancess81 @gregkell

1392993017963978758 2021-05-13 19:59:50 -0400 <Geminyestarr> So vaccinated ppl can congregate indoors without masks even tho they can still catch AND spread Covid sounds like we never needed masks 🤨
1392992258715099137 2021-05-13 19:56:49 -0400 <nilocouture> Lawmakers push bills they say will protect people's right to not get COVID-19 vaccine  https://t.co/95rwiU9Vvr.  #NCRepublicans #Shame #FollowTheScience
1392992115395866629 2021-05-13 19:56:15 -0400 <BarkleyTracy> @the_right_girl4 As a kid I got the measles from the vaccine, mumps from the vaccine, older adult I get the flu from the vaccine, therefore, I'm not taking the vaccine for Covid.
1392991953281814528 2021-05-13 19:55:36 -0400 <bylaurarose> Today marks 2 weeks since my 2nd covid shot. And with the CDC officially declaring that fully vaccinated people can go mask free inside, my husband and I celebrated our immunity by eating in a restaurant and then going to one of my favorite breweries. I am so unbelievably happy.
13929895

1392993016894435329 2021-05-13 19:59:50 -0400 <sassydogmom_> @TraumaRama_RN We got hand sanitizer and leftover pens from the covid vaccination centers
1392992972682235904 2021-05-13 19:59:39 -0400 <GeorgiaBoyCore1> People on here calling Biden a dictator because he's for COVID-19 safety til most people get vaccinated. As much as I dislike him he's right this time. Those calling him a dictator are likely all privileged cishet whites who have never experienced oppression in their lives.
1392992794579513348 2021-05-13 19:58:57 -0400 <shidlewitz> @nypost The vaccine is garbage. I know 4 family fully vaccinated got covid right afterwards... wore masks etc. Fauci sucks
1392992743991939079 2021-05-13 19:58:45 -0400 <DebraHange1> @davidhogg111 Until your spaces, areas, and state are less than 75% Fully Vaccinated, you are doing the right thing in my opinion. It's still here.  I'd hate to see an increase in spread that shows up as a new variant that is more deadly or causes more long Covid Synd

## 2. Data Cleaning

Now the data is ready to be read into a pandas DataFrame.

In [79]:
# Column names for our output (the CSV file has no header row)
colnames = ['id', 'date','user_id', 'username', 'tweet', 'replies_count', 'retweets_count', 'likes_count',
            'hashtags', 'coordinates']

In [80]:
#df = pd.read_csv('/Users/rachelchen/Desktop/BrainStation/Capstone/Twitter Sentiment/tweets.csv')


df = pd.read_csv('/Users/rachelchen/Desktop/CapstoneProject/tweets.csv', names = colnames,
                   header = None)
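
As a minimal sketch of what `names=` and `header=None` do, the cell below reads a tiny in-memory CSV with the same ten-column layout. The two rows in `raw` are made-up sample values (an assumption for illustration, not rows from `tweets.csv`).

```python
import io
import pandas as pd

colnames = ['id', 'date', 'user_id', 'username', 'tweet', 'replies_count',
            'retweets_count', 'likes_count', 'hashtags', 'coordinates']

# Hypothetical sample in the same headerless layout as tweets.csv
raw = (
    '101,2021-04-14 19:59:36 EDT,1,user_a,"hello, world",0,0,1,[],"43.70011,-79.4163,200km"\n'
    '102,2021-04-14 19:59:19 EDT,2,user_b,second tweet,0,0,0,[],"43.70011,-79.4163,200km"\n'
)

# header=None: treat the first line as data; names=: supply the column labels
sample = pd.read_csv(io.StringIO(raw), names=colnames, header=None)
print(sample.shape)
print(sample.columns.tolist())
```

Quoted fields keep their embedded commas, so the tweet text and the `coordinates` string each stay in a single column.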

In [81]:
# Take a quick look at our data
df.head()

Unnamed: 0,id,date,user_id,username,tweet,replies_count,retweets_count,likes_count,hashtags,coordinates
0,1382483708784078848,2021-04-14 19:59:36 EDT,1556939665,xwoman54,How do we know the COVID vaccine won't have long-term side-effects? https://t.co/29tIN8o4J1 via @ConversationEDU Waiting for my second dose of #AstraZeneca,0,0,1,['astrazeneca'],"43.70011,-79.4163,200km"
1,1382483637401190403,2021-04-14 19:59:19 EDT,926509404165255168,marcus13781234,Did the FDA and Pfizer hold the COVID vaccine until after Trump lost re-election? - Emily Posts https://t.co/PEn72g3yjl,0,0,0,[],"43.70011,-79.4163,200km"
2,1382483631034281985,2021-04-14 19:59:17 EDT,178168932,marleersocket,COVID-19 vaccine appointment booked ☑️,0,0,15,[],"43.70011,-79.4163,200km"
3,1382483610662543369,2021-04-14 19:59:12 EDT,1322036867555053568,easyontario,@nationalpost The federal government has spent $8B on COVID vaccine that Canada hasn’t received. This is quite clear another bribery scandal this time with the pharmaceutical giants,0,0,3,[],"43.70011,-79.4163,200km"
4,1382483489870782465,2021-04-14 19:58:43 EDT,1146177193682362368,mare55742414,"Evangelical pastor says he is not a politician, he is a prophet and tells followers not to believe in Covid and not to get vaccinated. Stupid is spreading!!!",0,0,0,[],"43.70011,-79.4163,200km"


### Removing Duplicate Entries

Since we search for tweets by city and surrounding area, we expect the same tweet to show up more than once when neighbouring cities have overlapping search radii. Let's check the size of the data and look for duplicates.

In [82]:
# Check the size
df.shape

(604972, 10)

In [85]:
df.duplicated().any()

True

In [86]:
# .T is a no-op on a Series, so this repeats the check above
df.duplicated().T.any()

True

In [87]:
# Count the rows whose id repeats an earlier row
df[df.duplicated(subset='id')].shape

(379169, 10)
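
The difference between the full-row check and the `id` check can be sketched on a toy frame. The values below are invented for illustration. The assumption is that a tweet picked up by two city searches carries the same `id` but a different `coordinates` value, so a full-row comparison would likely miss it.

```python
import pandas as pd

# Toy data: id 1 was found by two different city searches,
# id 3 was found twice by the same search
toy = pd.DataFrame({
    'id': [1, 1, 2, 3, 3],
    'tweet': ['a', 'a', 'b', 'c', 'c'],
    'coordinates': ['toronto', 'montreal', 'toronto', 'nyc', 'nyc'],
})

full_row_dups = toy.duplicated().sum()       # rows identical in every column
id_dups = toy.duplicated(subset='id').sum()  # rows repeating an earlier id

print(full_row_dups, id_dups)
```

Only the `subset='id'` check flags the cross-city repeat, which is why the count above is taken on `id` rather than on whole rows.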

About 63% of our rows (379,169 of 604,972) repeat a tweet `id` that already appears earlier in the data. Let's drop them.

In [88]:
# Remove duplicate tweets, keeping the first occurrence of each id
df = df.drop_duplicates(subset ='id', keep = 'first').sort_values('id', ascending= False).reset_index(drop=True)
df.head()

Unnamed: 0,id,date,user_id,username,tweet,replies_count,retweets_count,likes_count,hashtags,coordinates
0,1392993017963978758,2021-05-13 19:59:50 EDT,233695382,geminyestarr,So vaccinated ppl can congregate indoors without masks even tho they can still catch AND spread Covid sounds like we never needed masks 🤨,0,0,0,[],"35.22709,-80.84313,200km"
1,1392993016894435329,2021-05-13 19:59:50 EDT,1057001749020598273,sassydogmom_,@TraumaRama_RN We got hand sanitizer and leftover pens from the covid vaccination centers,1,0,0,[],"40.71427,-74.00597,200km"
2,1392992972682235904,2021-05-13 19:59:39 EDT,1313268174943592454,georgiaboycore1,People on here calling Biden a dictator because he's for COVID-19 safety til most people get vaccinated. As much as I dislike him he's right this time. Those calling him a dictator are likely all privileged cishet whites who have never experienced oppression in their lives.,0,0,0,[],"40.71427,-74.00597,200km"
3,1392992939186561025,2021-05-13 19:59:31 EDT,634874773,tyheam_csedr,"@LoMasVogue @JoeBiden That’s because the vaccine wasn’t designed to stop you from catching or transmitting it, it was designed to lessen the effect of Covid so you don’t get deathly symptoms. It seems like not a lot of people understand that. They turned Covid-19 into Flu-2.0",2,0,10,[],"39.95233,-75.16379,200km"
4,1392992929359220740,2021-05-13 19:59:29 EDT,2620237274,izzyceros,"@nazadelic @POTUS I have been waiting for @Novavax as well, but I don’t think they have a vaccine at all... I think they needed a financial life-line and capitalized on the Covid-craze. They have hardly manufactured any doses and haven’t even passed crucial testing yet lol!",2,0,1,[],"29.76328,-95.36327,200km"


In [89]:
print(df.shape)

(225803, 10)


This reduces our data to 225,803 rows, which matches 604,972 − 379,169. Next, let's check the columns and their data types.
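
The same drop, sort, and re-index chain can be checked on a toy frame. This is a small sketch with made-up values, confirming that the number of rows removed equals the number of repeated ids and that `keep='first'` retains the first row seen for each id.

```python
import pandas as pd

# Toy data: ids 10 and 30 each appear twice with different coordinates
toy = pd.DataFrame({
    'id': [10, 30, 10, 20, 30],
    'coordinates': ['A', 'A', 'B', 'B', 'C'],
})

n_before = len(toy)
n_dups = toy.duplicated(subset='id').sum()

deduped = (
    toy.drop_duplicates(subset='id', keep='first')  # keep the first row per id
       .sort_values('id', ascending=False)          # newest (largest) id first
       .reset_index(drop=True)                      # rebuild a clean 0..n-1 index
)

# Rows removed should equal the number of repeated ids
assert len(deduped) == n_before - n_dups
print(deduped)
```

Note that `keep='first'` also decides which `coordinates` value survives for a tweet found by several city searches, so the retained location depends on the row order of the input file.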

### Data Type

In [90]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 225803 entries, 0 to 225802
Data columns (total 10 columns):
 #   Column          Non-Null Count   Dtype 
---  ------          --------------   ----- 
 0   id              225803 non-null  int64 
 1   date            225803 non-null  object
 2   user_id         225803 non-null  int64 
 3   username        225803 non-null  object
 4   tweet           225803 non-null  object
 5   replies_count   225803 non-null  int64 
 6   retweets_count  225803 non-null  int64 
 7   likes_count     225803 non-null  int64 
 8   hashtags        225803 non-null  object
 9   coordinates     225803 non-null  object
dtypes: int64(5), object(5)
memory usage: 17.2+ MB


The `date` column is currently of type object. Let's convert it to datetime.

In [91]:
# Keep only the calendar date because we don't need the time of day
df['date'] = pd.to_datetime(df['date']).dt.date
df.head()

Unnamed: 0,id,date,user_id,username,tweet,replies_count,retweets_count,likes_count,hashtags,coordinates
0,1392993017963978758,2021-05-13,233695382,geminyestarr,So vaccinated ppl can congregate indoors without masks even tho they can still catch AND spread Covid sounds like we never needed masks 🤨,0,0,0,[],"35.22709,-80.84313,200km"
1,1392993016894435329,2021-05-13,1057001749020598273,sassydogmom_,@TraumaRama_RN We got hand sanitizer and leftover pens from the covid vaccination centers,1,0,0,[],"40.71427,-74.00597,200km"
2,1392992972682235904,2021-05-13,1313268174943592454,georgiaboycore1,People on here calling Biden a dictator because he's for COVID-19 safety til most people get vaccinated. As much as I dislike him he's right this time. Those calling him a dictator are likely all privileged cishet whites who have never experienced oppression in their lives.,0,0,0,[],"40.71427,-74.00597,200km"
3,1392992939186561025,2021-05-13,634874773,tyheam_csedr,"@LoMasVogue @JoeBiden That’s because the vaccine wasn’t designed to stop you from catching or transmitting it, it was designed to lessen the effect of Covid so you don’t get deathly symptoms. It seems like not a lot of people understand that. They turned Covid-19 into Flu-2.0",2,0,10,[],"39.95233,-75.16379,200km"
4,1392992929359220740,2021-05-13,2620237274,izzyceros,"@nazadelic @POTUS I have been waiting for @Novavax as well, but I don’t think they have a vaccine at all... I think they needed a financial life-line and capitalized on the Covid-craze. They have hardly manufactured any doses and haven’t even passed crucial testing yet lol!",2,0,1,[],"29.76328,-95.36327,200km"


In [92]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 225803 entries, 0 to 225802
Data columns (total 10 columns):
 #   Column          Non-Null Count   Dtype 
---  ------          --------------   ----- 
 0   id              225803 non-null  int64 
 1   date            225803 non-null  object
 2   user_id         225803 non-null  int64 
 3   username        225803 non-null  object
 4   tweet           225803 non-null  object
 5   replies_count   225803 non-null  int64 
 6   retweets_count  225803 non-null  int64 
 7   likes_count     225803 non-null  int64 
 8   hashtags        225803 non-null  object
 9   coordinates     225803 non-null  object
dtypes: int64(5), object(5)
memory usage: 17.2+ MB


The `date` column is still of type object because `.dt.date` returns plain Python `date` objects. Let's convert it to datetime again.

In [93]:
df['date'] = pd.to_datetime(df['date'])

In [94]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 225803 entries, 0 to 225802
Data columns (total 10 columns):
 #   Column          Non-Null Count   Dtype         
---  ------          --------------   -----         
 0   id              225803 non-null  int64         
 1   date            225803 non-null  datetime64[ns]
 2   user_id         225803 non-null  int64         
 3   username        225803 non-null  object        
 4   tweet           225803 non-null  object        
 5   replies_count   225803 non-null  int64         
 6   retweets_count  225803 non-null  int64         
 7   likes_count     225803 non-null  int64         
 8   hashtags        225803 non-null  object        
 9   coordinates     225803 non-null  object        
dtypes: datetime64[ns](1), int64(5), object(4)
memory usage: 17.2+ MB
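
The two conversions above can also be done in one pass. The sketch below uses a small made-up series rather than the tweet data, and relies on `Series.dt.normalize()` to reset the time of day to midnight while keeping the datetime dtype, so a second `pd.to_datetime` call is not needed.

```python
import pandas as pd

# Hypothetical sample timestamps (not taken from the tweet data)
raw = pd.Series(['2021-05-13 19:59:50', '2021-05-12 08:15:00'])

# .dt.date returns plain Python date objects, so the result has object dtype
as_objects = pd.to_datetime(raw).dt.date
print(as_objects.dtype)

# .dt.normalize() resets the time to midnight and keeps the datetime dtype
dates = pd.to_datetime(raw).dt.normalize()
print(dates.dtype)
```

Either route gives the same calendar dates. The second one skips the intermediate object column.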


### Text Preprocessing
Let's clean the text of the tweets:
- Remove @mentions
- Remove punctuation
- Remove standalone numbers
- Convert to lower case
- Remove emoji
- Remove URLs

In [95]:
import re
import string

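# Create a function to clean the tweets' text.
def clean_text(txt):
    
    '''
    Lowercase a string and remove mentions, URLs, and special characters.

    Parameters
    ----------
    txt : string
        The text string to clean.

    Returns
    -------
    The same string in lower case, with mentions, URLs, and special characters
    removed and runs of whitespace collapsed to single spaces.
    '''
    
    txt = txt.lower()
    
    # Alternatives, in order: @mentions, any character that is not a letter,
    # digit, space, or tab, URLs, and standalone numbers.
    # Note: the pattern is not a raw string, so '\b' is read as a backspace
    # character and the last alternative does not match standalone numbers.
    return ' '.join(re.sub('(@[A-Za-z0-9]+)|([^0-9A-Za-z \t])|(\w+:\/\/\S+)|(\b\d+\b)', '', txt).split())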

In [96]:
# Apply the cleaning function to every tweet
df['tweet'] = df['tweet'].apply(clean_text)

In [97]:
df.head()

Unnamed: 0,id,date,user_id,username,tweet,replies_count,retweets_count,likes_count,hashtags,coordinates
0,1392993017963978758,2021-05-13,233695382,geminyestarr,so vaccinated ppl can congregate indoors without masks even tho they can still catch and spread covid sounds like we never needed masks,0,0,0,[],"35.22709,-80.84313,200km"
1,1392993016894435329,2021-05-13,1057001749020598273,sassydogmom_,rn we got hand sanitizer and leftover pens from the covid vaccination centers,1,0,0,[],"40.71427,-74.00597,200km"
2,1392992972682235904,2021-05-13,1313268174943592454,georgiaboycore1,people on here calling biden a dictator because hes for covid19 safety til most people get vaccinated as much as i dislike him hes right this time those calling him a dictator are likely all privileged cishet whites who have never experienced oppression in their lives,0,0,0,[],"40.71427,-74.00597,200km"
3,1392992939186561025,2021-05-13,634874773,tyheam_csedr,thats because the vaccine wasnt designed to stop you from catching or transmitting it it was designed to lessen the effect of covid so you dont get deathly symptoms it seems like not a lot of people understand that they turned covid19 into flu20,2,0,10,[],"39.95233,-75.16379,200km"
4,1392992929359220740,2021-05-13,2620237274,izzyceros,i have been waiting for as well but i dont think they have a vaccine at all i think they needed a financial lifeline and capitalized on the covidcraze they have hardly manufactured any doses and havent even passed crucial testing yet lol,2,0,1,[],"29.76328,-95.36327,200km"
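
In the output above, the handle `@TraumaRama_RN` leaves a stray `rn` behind because the mention pattern stops at the underscore. The sketch below is one possible stricter variant, not the function used in this notebook: the name `clean_text_strict` and the order of the steps are assumptions. It applies each pattern separately as a raw string, so `@\w+` covers underscores and `\b` acts as a word boundary.

```python
import re

# Hypothetical stricter variant of clean_text; the step order matters
URL = re.compile(r'\w+://\S+')          # URLs, removed before punctuation is stripped
MENTION = re.compile(r'@\w+')           # full handles, underscores included
NON_ALNUM = re.compile(r'[^0-9a-z \t]') # punctuation, emoji, other symbols
NUMBER = re.compile(r'\b\d+\b')         # standalone numbers only

def clean_text_strict(txt):
    txt = txt.lower()
    txt = URL.sub(' ', txt)
    txt = MENTION.sub(' ', txt)
    txt = NON_ALNUM.sub('', txt)
    txt = NUMBER.sub(' ', txt)
    # Collapse runs of whitespace into single spaces
    return ' '.join(txt.split())

print(clean_text_strict('@TraumaRama_RN We got 2 pens! https://t.co/abc COVID-19'))
```

Digits that are part of a token, such as the `19` in `covid19`, are still kept. Only numbers that stand alone are dropped.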


In [98]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 225803 entries, 0 to 225802
Data columns (total 10 columns):
 #   Column          Non-Null Count   Dtype         
---  ------          --------------   -----         
 0   id              225803 non-null  int64         
 1   date            225803 non-null  datetime64[ns]
 2   user_id         225803 non-null  int64         
 3   username        225803 non-null  object        
 4   tweet           225803 non-null  object        
 5   replies_count   225803 non-null  int64         
 6   retweets_count  225803 non-null  int64         
 7   likes_count     225803 non-null  int64         
 8   hashtags        225803 non-null  object        
 9   coordinates     225803 non-null  object        
dtypes: datetime64[ns](1), int64(5), object(4)
memory usage: 17.2+ MB


In [105]:
# Check for nulls
df.isnull().mean()

id                0.0
date              0.0
user_id           0.0
username          0.0
tweet             0.0
replies_count     0.0
retweets_count    0.0
likes_count       0.0
hashtags          0.0
coordinates       0.0
polarity          0.0
subjectivity      0.0
positive          0.0
neutral           0.0
negative          0.0
sentiment         0.0
dtype: float64

No column contains null values, so the data looks clean so far.
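
Note that `isnull()` only flags missing values. A tweet that consisted solely of mentions, URLs, or emoji becomes an empty string after cleaning, and an empty string is not null. A minimal sketch of an extra check, using a made-up series in place of `df['tweet']`:

```python
import pandas as pd

# Hypothetical cleaned tweets; the second one was reduced to an empty string
tweets = pd.Series(['we got hand sanitizer', '', 'covid vaccination centers'])

# isnull() does not count the empty string as missing
null_share = tweets.isnull().mean()

# Count the tweets that are empty after stripping whitespace
empty_count = (tweets.str.strip() == '').sum()

print(null_share, empty_count)
```

Whether such rows should be dropped depends on the analysis. They carry no text for the sentiment tools to score.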

## 3. Feature Engineering

### Adding new features

#### Polarity and Subjectivity 
Let's add a column for polarity and a column for subjectivity to analyze how the sentiment level changes over time.
- Polarity is a float in the range [-1, 1], where 1 means a positive statement and -1 means a negative statement.
- Subjectivity is a float in the range [0, 1], where 0 is very objective and 1 is very subjective. Subjective sentences generally express personal opinion, emotion, or judgment, whereas objective sentences convey factual information.

**TextBlob** is a Python library that offers a simple API for basic NLP tasks, including sentiment analysis.

In [99]:
from textblob import TextBlob

# Compute each tweet's sentiment once and reuse it for both columns
sentiments = df['tweet'].apply(lambda text: TextBlob(text).sentiment)
df['polarity'] = sentiments.apply(lambda s: s.polarity)
df['subjectivity'] = sentiments.apply(lambda s: s.subjectivity)

#### Sentiment Label
We will use VADER to create the sentiment label because we are interested in people's concerns about the vaccines, and VADER can pick up more of the negative tone in the text than TextBlob.

**VADER sentiment intensity analyzer** returns, for a given input sentence, the proportions of the text that it rates as positive, negative, and neutral (`pos`, `neg`, `neu`), along with a normalized `compound` score.

In [100]:
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# Build the analyzer once and score each tweet a single time
analyzer = SentimentIntensityAnalyzer()
scores = df['tweet'].apply(analyzer.polarity_scores)
df['positive'] = scores.apply(lambda s: s['pos'])
df['neutral'] = scores.apply(lambda s: s['neu'])
df['negative'] = scores.apply(lambda s: s['neg'])

In [101]:
# Label each tweet by comparing its positive and negative scores; ties are labeled neutral
df['sentiment'] = np.select([(df['positive'] > df['negative']), 
                             (df['positive'] < df['negative']), 
                             (df['positive'] == df['negative'])], 
                            ['positive', 'negative','neutral'])

In [102]:
df.head()

Unnamed: 0,id,date,user_id,username,tweet,replies_count,retweets_count,likes_count,hashtags,coordinates,polarity,subjectivity,positive,neutral,negative,sentiment
0,1392993017963978758,2021-05-13,233695382,geminyestarr,so vaccinated ppl can congregate indoors without masks even tho they can still catch and spread covid sounds like we never needed masks,0,0,0,[],"35.22709,-80.84313,200km",0.0,0.0,0.102,0.898,0.0,positive
1,1392993016894435329,2021-05-13,1057001749020598273,sassydogmom_,rn we got hand sanitizer and leftover pens from the covid vaccination centers,1,0,0,[],"40.71427,-74.00597,200km",0.0,0.0,0.211,0.789,0.0,positive
2,1392992972682235904,2021-05-13,1313268174943592454,georgiaboycore1,people on here calling biden a dictator because hes for covid19 safety til most people get vaccinated as much as i dislike him hes right this time those calling him a dictator are likely all privileged cishet whites who have never experienced oppression in their lives,0,0,0,[],"40.71427,-74.00597,200km",0.117143,0.627143,0.118,0.828,0.054,positive
3,1392992939186561025,2021-05-13,634874773,tyheam_csedr,thats because the vaccine wasnt designed to stop you from catching or transmitting it it was designed to lessen the effect of covid so you dont get deathly symptoms it seems like not a lot of people understand that they turned covid19 into flu20,2,0,10,[],"39.95233,-75.16379,200km",0.6,0.9,0.097,0.903,0.0,positive
4,1392992929359220740,2021-05-13,2620237274,izzyceros,i have been waiting for as well but i dont think they have a vaccine at all i think they needed a financial lifeline and capitalized on the covidcraze they have hardly manufactured any doses and havent even passed crucial testing yet lol,2,0,1,[],"29.76328,-95.36327,200km",0.127083,0.560417,0.127,0.873,0.0,positive
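
Under this rule, any tweet with a nonzero `positive` score and a zero `negative` score is labeled positive, which is why all five rows above carry that label. The sketch below replays the rule on made-up scores. It passes only two conditions and uses the `default` argument of `np.select` for the tie case, which should give the same labels as listing the equality condition explicitly.

```python
import numpy as np
import pandas as pd

# Hypothetical scores: positive wins, negative wins, and a tie
toy = pd.DataFrame({
    'positive': [0.102, 0.054, 0.000],
    'negative': [0.000, 0.118, 0.000],
})

# Rows that match neither condition fall through to the default label
toy['sentiment'] = np.select(
    [toy['positive'] > toy['negative'], toy['positive'] < toy['negative']],
    ['positive', 'negative'],
    default='neutral',
)

print(toy)
```

Because the comparison ignores the size of the gap, a score of 0.102 against 0.0 is treated the same as a much larger margin.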


### Dropping features
To keep the data set clean and neat, let's drop the columns that we don't need for further analysis.
- `id`, `user_id`, `username`, `hashtags`, and `coordinates` are identifier or categorical columns that won't help much in our further analysis
- We already have the sentiment label, so we don't need the `positive`, `neutral`, and `negative` columns, which repeat information already captured by the label

In [110]:
df.drop(columns = ['id', 'user_id', 'username', 'hashtags', 'coordinates', 'positive', 'neutral', 'negative'], 
        inplace = True)

In [109]:
df.head()

Unnamed: 0,date,tweet,replies_count,retweets_count,likes_count,polarity,subjectivity,sentiment
0,2021-05-13,so vaccinated ppl can congregate indoors without masks even tho they can still catch and spread covid sounds like we never needed masks,0,0,0,0.0,0.0,positive
1,2021-05-13,rn we got hand sanitizer and leftover pens from the covid vaccination centers,1,0,0,0.0,0.0,positive
2,2021-05-13,people on here calling biden a dictator because hes for covid19 safety til most people get vaccinated as much as i dislike him hes right this time those calling him a dictator are likely all privileged cishet whites who have never experienced oppression in their lives,0,0,0,0.117143,0.627143,positive
3,2021-05-13,thats because the vaccine wasnt designed to stop you from catching or transmitting it it was designed to lessen the effect of covid so you dont get deathly symptoms it seems like not a lot of people understand that they turned covid19 into flu20,2,0,10,0.6,0.9,positive
4,2021-05-13,i have been waiting for as well but i dont think they have a vaccine at all i think they needed a financial lifeline and capitalized on the covidcraze they have hardly manufactured any doses and havent even passed crucial testing yet lol,2,0,1,0.127083,0.560417,positive


In [111]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 225803 entries, 0 to 225802
Data columns (total 8 columns):
 #   Column          Non-Null Count   Dtype         
---  ------          --------------   -----         
 0   date            225803 non-null  datetime64[ns]
 1   tweet           225803 non-null  object        
 2   replies_count   225803 non-null  int64         
 3   retweets_count  225803 non-null  int64         
 4   likes_count     225803 non-null  int64         
 5   polarity        225803 non-null  float64       
 6   subjectivity    225803 non-null  float64       
 7   sentiment       225803 non-null  object        
dtypes: datetime64[ns](1), float64(2), int64(3), object(2)
memory usage: 13.8+ MB


In [112]:
df.to_csv(r'/Users/rachelchen/Desktop/CapstoneProject/CleanTweets.csv', index = False)
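
CSV stores dates as text, so the datetime dtype set earlier is not preserved when the file is read back. A minimal sketch of the round trip, assuming a made-up two-row frame and an in-memory buffer in place of the real file:

```python
import io
import pandas as pd

# Hypothetical two-row frame standing in for the cleaned tweets
sample = pd.DataFrame({
    'date': pd.to_datetime(['2021-05-13', '2021-05-12']),
    'sentiment': ['positive', 'negative'],
})

# Write to an in-memory buffer instead of a file on disk
buffer = io.StringIO()
sample.to_csv(buffer, index=False)

# Without parse_dates the column comes back as text
buffer.seek(0)
plain = pd.read_csv(buffer)

# With parse_dates the column is parsed back to datetime
buffer.seek(0)
parsed = pd.read_csv(buffer, parse_dates=['date'])

print(plain['date'].dtype, parsed['date'].dtype)
```

Passing `parse_dates=['date']` when loading `CleanTweets.csv` later avoids repeating the conversion step by hand.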

**We have completed our ideal data sets!**