### CUNY Data 620 - Web Analytics, Summer 2020  
**Final Project**   
**Prof:** Alain Ledon  
**Members:** Misha Kollontai, Amber Ferger, Zach Alexander, Subhalaxmi Rout 

### Research Questions

* Is there a relationship between location-specific COVID-19 sentiment and the number of positive cases within that region? 
* Does positive sentiment precede spikes in positive cases?

### The Data

We will use the Twitter API to scrape tweet data, and [Johns Hopkins COVID-19 Data](https://github.com/CSSEGISandData/COVID-19) and [Wikipedia](https://en.wikipedia.org/wiki/COVID-19_pandemic_in_the_United_States) for the COVID-19 numbers. 

### The Plan

1. Scrape Twitter data from 2 locations - perhaps NYC (severe initial wave) and New Orleans (experiencing something of a second wave)
2. Pull coronavirus case numbers for the 2 locations in question
3. Perform sentiment analysis on the collected tweets and aggregate the scores into an overall sentiment index for each day
4. Plot time series of the sentiment index against coronavirus case numbers
5. Indicate important moments on the timeline related to COVID-19 safety measures or announcements
6. Investigate potential relationships between the two sets and compare the relationships from one city to another
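
Step 3 can be sketched as follows. This is only an illustration of the aggregation, not the project's scoring method: the word lists, `score_tweet`, and `daily_sentiment_index` are hypothetical placeholders, and a real analysis would swap in a proper sentiment model. Each tweet gets a score between -1 and 1, and the daily index is the mean score of that day's tweets.

```python
from collections import defaultdict
from statistics import mean

# Placeholder word lists (assumption): stand-ins for a real sentiment model
POSITIVE = {"good", "great", "safe", "hope", "recovered"}
NEGATIVE = {"bad", "scared", "sick", "worse", "death"}

def score_tweet(text):
    """Return (positive - negative) / (positive + negative) word counts, or 0.0."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total

def daily_sentiment_index(tweets):
    """Average the per-tweet scores for each day; tweets are (day, text) pairs."""
    by_day = defaultdict(list)
    for day, text in tweets:
        by_day[day].append(score_tweet(text))
    return {day: mean(scores) for day, scores in sorted(by_day.items())}

# Made-up example tweets, for illustration only
sample = [
    ("2020-03-08", "Feeling scared and sick today"),
    ("2020-03-08", "There is hope, stay safe!"),
    ("2020-03-09", "Numbers look worse"),
]
index = daily_sentiment_index(sample)
print(index)
```

A daily mean keeps the index comparable between days with different tweet volumes, which matters when it is later plotted against case counts.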

### Import Libraries

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import warnings
import datetime
import GetOldTweets3 as got
warnings.filterwarnings('ignore')

### Functions

We'll define the following functions:
* **perdelta**: Based on a [*Stack Overflow*](https://stackoverflow.com/questions/10688006/generate-a-list-of-datetimes-between-an-interval) thread, this will be used to generate a list of date ranges for our Twitter pull. 
* **getTweets**: This will be used to pull the tweets. 

In [6]:
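################ date function
# https://stackoverflow.com/questions/10688006/generate-a-list-of-datetimes-between-an-interval
def perdelta(start, end, delta):
    # Yield start, start + delta, ... for every value strictly before end
    curr = start
    while curr < end:
        yield curr
        curr += delta
        
################ get tweets function 
def getTweets(city, startDate, endDate):
    n = 50  # maximum number of tweets to request per city and date range
    tweetCriteria = got.manager.TweetCriteria().setQuerySearch('COVID')\
    .setSince(startDate)\
    .setUntil(endDate)\
    .setMaxTweets(n)\
    .setTopTweets(1)\
    .setNear(city)
    
    # Run the search once and reuse the result instead of repeating it for every index
    try:
        tweets = got.manager.TweetManager.getTweets(tweetCriteria)
    except (Exception, SystemExit):
        # A failed request (for example HTTP 429) yields no tweets for this city.
        # SystemExit is caught as well so that a failed request cannot stop the notebook.
        tweets = []
    
    return [[tweet, city, startDate, endDate] for tweet in tweets[:n]]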

### Twitter Data

#### Largest City by State

* Read in a list of the top 1000 [cities](https://public.opendatasoft.com/explore/dataset/1000-largest-us-cities-by-population-with-geographic-coordinates/table/?sort=-rank) in the US
* Select top city by state, extract geocoordinates

In [2]:
# read in cities doc, select top city from each 
# https://stackoverflow.com/questions/50415632/how-to-select-top-n-row-from-each-group-after-group-by-in-pandas
allData = pd.read_csv('largeCities.csv', delimiter=';')
final_cities = allData.sort_values(by = ['State', 'Population'], ascending=False).groupby(['State'], sort=False).head(1)
coords = final_cities['Coordinates'].values.tolist()
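
As a small illustration of the selection step, the sketch below picks the most populous city per state with `groupby(...).idxmax()` and splits a `"lat,lon"` string into numeric columns. The data frame is synthetic (city names, populations, and coordinates are made up), and the `Lat` / `Lon` column names are our own assumption; only the `State`, `Population`, and `Coordinates` columns mirror the file used above.

```python
import pandas as pd

# Synthetic stand-in for largeCities.csv (all values are made up)
cities = pd.DataFrame({
    "City": ["Alpha", "Beta", "Gamma", "Delta"],
    "State": ["NY", "NY", "LA", "LA"],
    "Population": [500, 900, 300, 100],
    "Coordinates": ["40.0,-74.0", "41.0,-75.0", "30.0,-90.0", "31.0,-91.0"],
})

# Row label of the largest city in each state, then select those rows
largest = cities.loc[cities.groupby("State")["Population"].idxmax()]

# Split the "lat,lon" string into two float columns
latlon = largest["Coordinates"].str.split(",", expand=True).astype(float)
largest = largest.assign(Lat=latlon[0], Lon=latlon[1])

print(largest[["State", "City", "Lat", "Lon"]])
```

Keeping latitude and longitude as numbers makes it easier to reuse them later, for example when mapping results, while the original string form is what gets passed to the tweet search.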

#### Date Ranges
Next, we'll generate the date ranges for the pull. The weekly version, in which each range covers one week (Sunday to Saturday), is commented out below; for now the pull uses a single range spanning the full analysis window, from **3/8/2020** to **7/11/2020**.

In [7]:
#all_dates = []
#for result in perdelta(datetime.date(2020, 3, 8), datetime.date(2020, 7, 6), datetime.timedelta(days=7)):  
#    nextWk = result + datetime.timedelta(days=6)
#    startDt = result.strftime("%Y-%m-%d")
#    endDt = nextWk.strftime("%Y-%m-%d")   
#    all_dates.append((startDt,endDt))

all_dates = [(datetime.date(2020, 3, 8).strftime("%Y-%m-%d"), datetime.date(2020, 7, 11).strftime("%Y-%m-%d"))]
all_dates

[('2020-03-08', '2020-07-11')]
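
For reference, here is a minimal sketch of the weekly variant: it reuses `perdelta` to produce Sunday-to-Saturday `(start, end)` string pairs over the same window. The helper name `weekly_ranges` is our own, and capping the final week at the last day of the window is an assumption about the desired behavior.

```python
import datetime

def perdelta(start, end, delta):
    # Yield start, start + delta, ... for every value strictly before end
    curr = start
    while curr < end:
        yield curr
        curr += delta

def weekly_ranges(first_day, last_day):
    """Return (start, end) date strings, one pair per 7-day week."""
    one_week = datetime.timedelta(days=7)
    ranges = []
    for week_start in perdelta(first_day, last_day, one_week):
        # Each week ends six days after it starts, capped at the last day
        week_end = min(week_start + datetime.timedelta(days=6), last_day)
        ranges.append((week_start.strftime("%Y-%m-%d"),
                       week_end.strftime("%Y-%m-%d")))
    return ranges

weekly_dates = weekly_ranges(datetime.date(2020, 3, 8), datetime.date(2020, 7, 11))
print(len(weekly_dates), weekly_dates[0], weekly_dates[-1])
# 18 ('2020-03-08', '2020-03-14') ('2020-07-05', '2020-07-11')
```

Shorter ranges mean more requests per city, so this trades a finer time resolution for a higher chance of hitting rate limits.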

#### Pull Tweets

In [None]:
# Cycle through all cities
finalList = []

for c in coords:
    print(c)
    for d in all_dates:
        ls = getTweets(c,d[0],d[1])
        finalList.append(ls)

41.1399814,-104.8202462
An error occured during an HTTP request: HTTP Error 429: Too Many Requests
Try to open in browser: https://twitter.com/search?q=COVID%20near%3A%2241.1399814%2C-104.8202462%22%20within%3A15mi%20since%3A2020-03-08%20until%3A2020-07-11&src=typd
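
The pull above was rejected with HTTP 429 (Too Many Requests), which signals rate limiting. One common response is to retry with exponentially growing pauses. The sketch below is generic and uses only the standard library; `call_with_backoff` and `flaky_request` are hypothetical names, and it assumes the wrapped call raises an exception when a request fails.

```python
import time

def call_with_backoff(func, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func(); after each failure wait base_delay * 2**attempt seconds, then retry."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)

# Hypothetical stand-in for a rate-limited request: fails twice, then succeeds
attempts = {"count": 0}

def flaky_request():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("HTTP Error 429: Too Many Requests")
    return "ok"

waits = []  # record the delays instead of actually sleeping
result = call_with_backoff(flaky_request, sleep=waits.append)
print(result, waits)
# ok [1.0, 2.0]
```

Note that this only helps when the wrapped function lets errors propagate. A function that catches every error and returns an empty list, as `getTweets` does, would need to re-raise before a wrapper like this could react.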

### Step 1: Scraping Twitter Data from New York City & New Orleans

As a first step, we decided to scrape tweets from two locations, New York City and New Orleans.

##### Reading in the tweets from NYC

In [None]:
tweets = pd.read_csv('covid_tweets.csv', delimiter='\t')
tweets['City'] = 'NYC'

In [None]:
tweets.head()

In [None]:
tweets.shape

### Step 2: Pulling coronavirus case numbers for both locations

In [None]:
covid_cases = pd.read_csv('confirmed_cases.csv')

##### Filtering for NYC cases

After locating the county FIPS code for New York City (36061), we filtered the pandas dataframe down to that single row. We then transposed the row so that one column holds the date and another holds the number of confirmed cases on that date. Finally, we reset the index and converted the date column to a datetime type so that it can serve as the time axis in our plots:

In [None]:
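# Keep only the row for New York City (county FIPS code 36061)
cases_filtered = covid_cases[covid_cases['FIPS'] == 36061]
# Select the daily count columns by position
df = cases_filtered.iloc[:, 11:186]

# After transposing, the single data column is labeled with the original row index
df = df.transpose().reset_index()
df = df.rename(columns={'index': 'Date', cases_filtered.index[0]: "Confirmed_Cases"})

nyc_time_series = pd.DataFrame(df, columns = ['Date','Confirmed_Cases'])
nyc_time_series['Date'] = pd.to_datetime(nyc_time_series['Date'], format='%m/%d/%y')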

Here's a quick look at the filtered dataset with just NYC cases:

In [None]:
nyc_time_series.tail()
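
The same reshaping can also be wrapped in a reusable function so that it does not depend on fixed row labels or column positions. The sketch below uses `DataFrame.melt` on a toy frame; the function name `county_case_series`, the `id_cols` parameter, and the case numbers are our own illustrative assumptions, not part of the dataset.

```python
import pandas as pd

def county_case_series(cases, fips, id_cols):
    """Return a long Date / Confirmed_Cases frame for one county."""
    row = cases[cases["FIPS"] == fips]
    # Turn every non-identifier (date) column into its own row
    long = row.melt(id_vars=id_cols, var_name="Date", value_name="Confirmed_Cases")
    long["Date"] = pd.to_datetime(long["Date"], format="%m/%d/%y")
    return long[["Date", "Confirmed_Cases"]].reset_index(drop=True)

# Toy stand-in for confirmed_cases.csv: two counties, three made-up daily totals
toy = pd.DataFrame({
    "FIPS": [36061, 22071],
    "Admin2": ["New York", "Orleans"],
    "3/8/20": [1, 0],
    "3/9/20": [3, 1],
    "3/10/20": [7, 1],
})

nyc_toy = county_case_series(toy, 36061, id_cols=["FIPS", "Admin2"])
print(nyc_toy)
```

With the real file, `id_cols` would need to list every non-date column, and the same function could then be called once per city.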

Now, in order to find the number of new cases per day, we can utilize our confirmed cases column to take the difference between the current day and the previous day. Additionally, for our visualization, we can take the 7-day average of new cases and plot this as well, in order to obtain a better view of trends over time.
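
Written out, with $C_t$ denoting the cumulative confirmed count on day $t$ (the symbols $C_t$, $N_t$, and $A_t$ are our own shorthand, not notation from the data source), the two quantities are:

$$
N_t = C_t - C_{t-1}, \qquad A_t = \frac{1}{7}\sum_{k=0}^{6} N_{t-k}
$$

Here $N_t$ is the number of new cases on day $t$, and $A_t$ averages the current day together with the six days before it.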

In [None]:
def add_newcases(df):
    # New cases = day-over-day change in the cumulative count.
    # The first day has no previous value, so it is set to 0.
    df['New_Cases'] = df['Confirmed_Cases'].diff().fillna(0)
    return df

In [None]:
def add_sevenday(df):
    # Mean of the current day and the six days before it
    new_cases = pd.to_numeric(df['New_Cases'])
    seven_day = new_cases.rolling(window=7).mean()
    # Report 0 for the first eight rows (the warm-up period)
    seven_day.iloc[:8] = 0
    df['Seven_Day_Avg'] = seven_day
    return df

In [None]:
df = add_newcases(nyc_time_series)
df = add_sevenday(df)

After creating the `New_Cases` and `Seven_Day_Avg` columns, we can create a plot to show the case counts in New York City:

In [None]:
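def drawNewCases(df, title, fignum, var=None):
    # `var` is unused; it is kept so that existing calls keep working
    # Set the figure size when the figure is created, before anything is drawn
    plt.figure(fignum, figsize=(16, 8))
    # Bars show daily new cases; the line shows the 7-day average
    plt.bar(df['Date'], df['New_Cases'], color='indianred', alpha=0.4)
    plt.plot(df['Date'], df['Seven_Day_Avg'], c='indianred', linewidth=2)
    plt.title(title)
    plt.ylabel('Number of New Cases')
    # Label the x-axis with one tick per month, shown as the full month name
    plt.gca().xaxis.set_major_locator(mdates.MonthLocator())
    plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%B'))
    plt.show()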

In [None]:
locator = mdates.MonthLocator()
fmt = mdates.DateFormatter('%B')

nyc_time_series = pd.DataFrame(df, columns = ['Date','Confirmed_Cases', 'New_Cases', 'Seven_Day_Avg'])
nyc_time_series['Date'] = pd.to_datetime(nyc_time_series['Date'], format='%m/%d/%y')

drawNewCases(nyc_time_series, 'Number of new COVID-19 cases in New York City (Daily)', 1, 'x')

##### Filtering for New Orleans Cases

In [None]:
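# Keep only the row for Orleans Parish (New Orleans), county FIPS code 22071
cases_filtered_newo = covid_cases[covid_cases['FIPS'] == 22071]
# Select the daily count columns by position
df_newo = cases_filtered_newo.iloc[:, 11:186]

# After transposing, the single data column is labeled with the original row index
df_newo = df_newo.transpose().reset_index()
df_newo = df_newo.rename(columns={'index': 'Date', cases_filtered_newo.index[0]: "Confirmed_Cases"})

newo_time_series = pd.DataFrame(df_newo, columns = ['Date','Confirmed_Cases'])
newo_time_series['Date'] = pd.to_datetime(newo_time_series['Date'], format='%m/%d/%y')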

In [None]:
newo_time_series.tail()

In [None]:
df_newo = add_newcases(newo_time_series)
df_newo = add_sevenday(df_newo)

In [None]:
newo_time_series = pd.DataFrame(df_newo, columns = ['Date','Confirmed_Cases', 'New_Cases', 'Seven_Day_Avg'])
newo_time_series['Date'] = pd.to_datetime(newo_time_series['Date'], format='%m/%d/%y')

In [None]:
drawNewCases(newo_time_series, 'Number of new COVID-19 cases in New Orleans (Daily)', 2, 'y')

**Note from Zach**: Will remove this commented-out code later (see below), but thought I'd leave it just in case it'll be helpful for future visualizations:

In [None]:
# locator = mdates.MonthLocator()
# fmt = mdates.DateFormatter('%B')


# plt.plot(nyc_time_series['Date'], nyc_time_series['Confirmed_Cases'], c='indianred')
# plt.plot(legend=None)
# plt.title('Number of Confirmed COVID-19 Cases in New York City')
# plt.xlabel('Date')
# plt.ylabel('Number of Confirmed Cases')
# plt.gca().xaxis.set_major_formatter(fmt)
# plt.show()