### Import pandas and whatever other libraries you need

In [1]:
import pandas as pd
import numpy as np

Here's another example of reading data in _locally_.

In [8]:
df = pd.read_csv('../data/warn.csv')

You can take a look at the first or last five rows by using the the `head` and `tail` functions.

In [4]:
df.head()

Unnamed: 0,notice_date,event_number,reason,company,address,county,phone,business_type,affected,total_employees,layoff_date,dislocation,union,classification
0,4/27/2020,2019-1582,Temporary Plant Layoff,"OS Restaurant Services, LLC (Bloomin Brands- O...",Multiple Central Region locations,Onondaga County,(813) 282-1225,Restaurant,174,174,3/15/2020,Unforeseeable business circumstances prompted ...,The employees are not represented by a union.,Temporary Plant Layoff
1,4/27/2020,2019-1585,Temporary Plant Closing,"OS Restaurant Services, LLC (Bloomin Brands - ...",Multiple Capital Region locations,Albany/Saratoga/Warren County,(813) 282-1225,Restaurant,260,260,3/15/2020,Unforeseeable business circumstances prompted ...,The employees are not represented by a union.,Temporary Plant Closing
2,4/27/2020,2019-1583,Temporary Plant Closing,"OS Restaurant Services, LLC (Bloomin Brands - ...",Multiple Finger Lakes Region locations,Monroe/Ontario County,(813) 282-1225,Restaurant,239,239,3/15/2020,Unforeseeable business circumstances prompted ...,The employees are not represented by a union.,Temporary Plant Closing
3,4/27/2020,2019-1584,Temporary Plant Closing,"OS Restaurant Services, LLC (Bloomin Brands - ...",Multiple Western Region locations,Erie County,(813) 282-1225,Restaurant,289,289,3/15/2020,Unforeseeable business circumstances prompted ...,The employees are not represented by a union.,Temporary Plant Closing
4,4/27/2020,2019-1586,Temporary Plant Closing,"OS Restaurant Services, LLC (Bloomin Brands - ...",Multiple Southern Region locations,Broome/Chemung County,(813) 282-1225,Restaurant,154,154,3/15/2020,Unforeseeable business circumstances prompted ...,The employees are not represented by a union.,Temporary Plant Closing


In [5]:
df.tail()

Unnamed: 0,notice_date,event_number,reason,company,address,county,phone,business_type,affected,total_employees,layoff_date,dislocation,union,classification
1208,1/8/2020,2019-0210,Plant Closing,"Connected Ventures, LLC (CH Media)","330 W. 34th Street, 5th FloorNew York, NY 10001",New York County,(212) 314-7366,Operates online content and retail properties,39,39,Employment,Economic,The employees are not represented by a union.,Plant Closing
1209,1/6/2020,2019-0207,Plant Closing,Macy's Broadway Mall Store (Macy's Retail Hold...,"100 Broadway MallHicksville, NY 11801",Nassau County,(646) 429-7462,Retail Store,155,155,Macy's,Economic,The employees are not represented by a union.,Plant Closing
1210,12/27/2019,2019-0206,Temporary Plant Closing,Wesley Gardens Nursing Home,"3 Upton ParkRochester, NY 14607",Monroe County,(585) 241-2105,Nursing Home,132,132,Beginning,Due to a water line break,1199 SEIU,Temporary Plant Closing
1211,12/30/2019,2019-0205,Plant Closing,"127 W. 43rd St. Chophouse, Inc. (Heartland Bre...","127 West 43rd StreetNew York, NY 10018",New York County,(917) 999-6532,Restaurant,106,106,The,Economic,The employees are not represented by a union.,Plant Closing
1212,12/30/2019,2019-0201,Plant Closing,"New York Express and Logistics, LLC","292 Wolf RoadLatham, NY 12110",Albany County,(617) 968-5311,Trucking company providing freight transportat...,48,48,3/31/2020,Contract between New York Express and Logistic...,The employees are not represented by a union.,Plant Closing


You can also take a look at the type of the columns by using [`info`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.info.html).

In [6]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1213 entries, 0 to 1212
Data columns (total 14 columns):
 #   Column           Non-Null Count  Dtype 
---  ------           --------------  ----- 
 0   notice_date      1213 non-null   object
 1   event_number     1213 non-null   object
 2   reason           1213 non-null   object
 3   company          1213 non-null   object
 4   address          1212 non-null   object
 5   county           1213 non-null   object
 6   phone            1213 non-null   object
 7   business_type    1213 non-null   object
 8   affected         1213 non-null   int64 
 9   total_employees  1213 non-null   int64 
 10  layoff_date      1210 non-null   object
 11  dislocation      1213 non-null   object
 12  union            1213 non-null   object
 13  classification   1213 non-null   object
dtypes: int64(2), object(12)
memory usage: 132.8+ KB


Note that the other way you can create a DataFrame is using a list of lists—not just a list of objects. Here's an example.

In [75]:
test = pd.DataFrame([[None, 2, np.nan, 0],
       [3, 4, np.nan, 1],
       [np.nan, np.nan, np.nan, 5],
       [np.nan, 3, np.nan, 4]],
      columns=['A', 'B', 'C', 'D'])
test

Unnamed: 0,A,B,C,D
0,,2.0,,0
1,3.0,4.0,,1
2,,,,5
3,,3.0,,4


This is effectively the same thing using a list of objects, like how we scraped our data.

In [79]:
test = pd.DataFrame([{'A': None, 'B': 2, 'C': np.nan, 'D' :0},
       {'A': 3, 'B': 4, 'C': np.nan, 'D':1},
       {'A': np.nan, 'B': np.nan, 'C': np.nan, 'D': 5},
       {'A': np.nan, 'B': 3, 'C': np.nan, 'D': 4}])
test

Unnamed: 0,A,B,C,D
0,,2.0,,0
1,3.0,4.0,,1
2,,,,5
3,,3.0,,4


Say we wanted to convert columns to integers when reading in our csv. We could do this with the `dtype` parameter.

However, this won't work if we have null values or if some value cannot be converted to an integer. If that's the case, we need to use `fillna` and `to_numeric` or `astype(int)`.

In [11]:
# df = pd.read_csv('../data/warn_na.csv', dtype={'affected': int, 'total_employees': int})
df_na = pd.read_csv('../data/warn_na.csv')

In [12]:
df_na.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1213 entries, 0 to 1212
Data columns (total 14 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   notice_date      1213 non-null   object 
 1   event_number     1213 non-null   object 
 2   reason           1213 non-null   object 
 3   company          1213 non-null   object 
 4   address          1212 non-null   object 
 5   county           1213 non-null   object 
 6   phone            1213 non-null   object 
 7   business_type    1213 non-null   object 
 8   affected         1135 non-null   float64
 9   total_employees  480 non-null    float64
 10  layoff_date      1210 non-null   object 
 11  dislocation      1213 non-null   object 
 12  union            1213 non-null   object 
 13  classification   1213 non-null   object 
dtypes: float64(2), object(12)
memory usage: 132.8+ KB


In [108]:
df = df.fillna(0)

In [13]:
df['affected'] = df['affected'].astype(int)
df['total_employees'] = df['total_employees'].astype(int)

In [14]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1213 entries, 0 to 1212
Data columns (total 14 columns):
 #   Column           Non-Null Count  Dtype 
---  ------           --------------  ----- 
 0   notice_date      1213 non-null   object
 1   event_number     1213 non-null   object
 2   reason           1213 non-null   object
 3   company          1213 non-null   object
 4   address          1212 non-null   object
 5   county           1213 non-null   object
 6   phone            1213 non-null   object
 7   business_type    1213 non-null   object
 8   affected         1213 non-null   int64 
 9   total_employees  1213 non-null   int64 
 10  layoff_date      1210 non-null   object
 11  dislocation      1213 non-null   object
 12  union            1213 non-null   object
 13  classification   1213 non-null   object
dtypes: int64(2), object(12)
memory usage: 132.8+ KB


In [125]:
df['affected'] = pd.to_numeric(df['affected'])

In [126]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1213 entries, 0 to 1212
Data columns (total 14 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   notice_date      1213 non-null   object 
 1   event_number     1213 non-null   object 
 2   reason           1213 non-null   object 
 3   company          1213 non-null   object 
 4   address          1213 non-null   object 
 5   county           1213 non-null   object 
 6   phone            1213 non-null   object 
 7   business_type    1213 non-null   object 
 8   affected         1213 non-null   int64  
 9   total_employees  1213 non-null   float64
 10  layoff_date      1213 non-null   object 
 11  dislocation      1213 non-null   object 
 12  union            1213 non-null   object 
 13  classification   1213 non-null   object 
dtypes: float64(1), int64(1), object(12)
memory usage: 132.8+ KB


In [128]:
df.head()

Unnamed: 0,notice_date,event_number,reason,company,address,county,phone,business_type,affected,total_employees,layoff_date,dislocation,union,classification
0,4/27/2020,2019-1582,Temporary Plant Layoff,"OS Restaurant Services, LLC (Bloomin Brands- O...",Multiple Central Region locations,Onondaga County,(813) 282-1225,Restaurant,174,174.0,3/15/2020,Unforeseeable business circumstances prompted ...,The employees are not represented by a union.,Temporary Plant Layoff
1,4/27/2020,2019-1585,Temporary Plant Closing,"OS Restaurant Services, LLC (Bloomin Brands - ...",Multiple Capital Region locations,Albany/Saratoga/Warren County,(813) 282-1225,Restaurant,260,260.0,3/15/2020,Unforeseeable business circumstances prompted ...,The employees are not represented by a union.,Temporary Plant Closing
2,4/27/2020,2019-1583,Temporary Plant Closing,"OS Restaurant Services, LLC (Bloomin Brands - ...",Multiple Finger Lakes Region locations,Monroe/Ontario County,(813) 282-1225,Restaurant,239,239.0,3/15/2020,Unforeseeable business circumstances prompted ...,The employees are not represented by a union.,Temporary Plant Closing
3,4/27/2020,2019-1584,Temporary Plant Closing,"OS Restaurant Services, LLC (Bloomin Brands - ...",Multiple Western Region locations,Erie County,(813) 282-1225,Restaurant,289,289.0,3/15/2020,Unforeseeable business circumstances prompted ...,The employees are not represented by a union.,Temporary Plant Closing
4,4/27/2020,2019-1586,Temporary Plant Closing,"OS Restaurant Services, LLC (Bloomin Brands - ...",Multiple Southern Region locations,Broome/Chemung County,(813) 282-1225,Restaurant,154,154.0,3/15/2020,Unforeseeable business circumstances prompted ...,The employees are not represented by a union.,Temporary Plant Closing


What types of reasons are there?

In [15]:
df['reason'].unique()

array(['Temporary Plant Layoff', 'Temporary Plant Closing',
       'Plant Closing', 'Plant Layoff', 'Temporary Layoff',
       'Temporary Plant Closing and Plant Closing',
       'Temporary & Permanent Layoff', 'Temporary Plant \nClosing',
       'Temporary Plant \nLayoff',
       'Temporary Plant Layoff and Plant Closing',
       'Possible Plant Layoff', 'Temporary Reduction in Work Hours',
       'Temporary Plant Layoff/Plant Layoff',
       'Temporary Plant Layoff (Furlough)', 'Plant Layoff (Conditional)',
       'Temporary PLant Layoff', 'Temporary PlantClosing',
       'Temporary Closing', 'Possible Plant Layoff/Closing',
       'Temporary Palnt Closing', 'Plant Temporary Closing', 'Plant Sale',
       'Contract Dissolution', 'Plant Unit Closing',
       'Tempoarary Plant Closing', 'Plant Relocation',
       'Partial Temporary Closing', 'Temporary Plant Layoff/Closing',
       'Plant Demolition', 'Temporary Partial Plant Closing',
       'Temporary Plant Closing/Layoff', 'Temporar

What is the frequency of each reason? `value_counts` counts the number of times a specific value appears in a column

In [16]:
df['reason'].value_counts()

Temporary Plant Layoff                       580
Temporary Plant Closing                      436
Plant Closing                                 78
Plant Layoff                                  64
Temporary Layoff                              11
Temporary Closing                              7
Plant Unit Closing                             3
Temporary Plant Layoff (Furlough)              3
Temporary  Plant Closing                       3
Temporary PLant Layoff                         1
Temporary Plant Closing/Layoff                 1
Temporary Plant Layoff/Plant Layoff            1
Possible Plant Layoff                          1
Temporary Plant \nClosing                      1
Plant Temporary Closing                        1
Plant Sale                                     1
Partial Temporary Closing                      1
Temporary & Permanent Layoff                   1
Temporary Palnt Closing                        1
Tempoarary Plant Closing                       1
Plant Demolition    

Looks like we have some typos. We can try to reconcile some of the values by replacing text.

In [24]:
df.loc[df['reason'].isin(['Temporary PlantClosing', 
                          'TemporaryPlant Closing', 
                          'Temporary Plant \nClosing',
                          'Temporary Plant  Closing',
                          'Temporary Palnt Closing',
                          'Temporary Closing',
                          'Temporary  Plant Closing', 
                          'Temporary Unit Closing',
                          'Plant Temporary Closing',
                          'Temporary Plant Closing and Plant Closing'
                          'Temporary Partial Plant Closing',
                          'Partial Temporary Closing',
                          'Tempoarary Plant Closing']), 'reason'] = 'Temporary Plant Closing' 
df.loc[df['reason'].isin(['Temporary Plant  Layoff', 
                          'Temporary PLant Layoff', 
                          'Temporary  Plant Layoff',
                          'Temporary Plant Layoff/Plant Layoff',
                          'Temporary Plant Layoff (Furlough)',
                          'Temporary Plant \nLayoff',
                          'Temporary Layoff'
                          ]), 'reason'] = 'Temporary Plant Layoff'
df.loc[df['reason'].isin(['Plant Unit Closing']), 'reason'] = 'Plant Closing'
df.loc[df['reason'].isin(['Plant Layoff (Conditional)',
                          'Possible Plant Layoff']), 'reason'] = 'Plant Layoff'

In [32]:
reasons = df['reason'].value_counts().reset_index().rename(columns = {'index': 'reason', 'reason': 'count'})
reasons

Unnamed: 0,reason,count
0,Temporary Plant Layoff,599
1,Temporary Plant Closing,455
2,Plant Closing,81
3,Plant Layoff,66
4,Temporary Plant Layoff/Closing,1
5,Temporary Plant Closing and Plant Closing,1
6,Plant Sale,1
7,Contract Dissolution,1
8,Possible Plant Layoff/Closing,1
9,Temporary Plant Closing/Layoff,1


In [35]:
reasons.to_csv('../output/reasons.csv', index=False)

In [36]:
df[df['reason'] == 'Plant Closing']

Unnamed: 0,notice_date,event_number,reason,company,address,county,phone,business_type,affected,total_employees,layoff_date,dislocation,union,classification
9,4/30/2020,2019-1601,Plant Closing,"Truck-Lite Co, LLC","310 E. Elmwood AvenueFalconer, NY 14733",Chautauqua County,(716) 661-1141,Producer of LED safety lighting,97,97,Layoffs,Economic,International Association of Machinists and Ae...,Plant Closing
26,4/30/2020,2019-1568,Plant Closing,"Ivy Rehab Network, Inc.",Central Billing Office1377 Motor Pkwy.Islandia...,Suffolk County,(914) 777-8700,Medical Billing,36,0,5/6/2020,Unforeseeable business circumstances prompted ...,The employees are not represented by a union,Plant Closing
36,4/28/2020,2019-1537 and 2019-1579,Plant Closing,Brookset Bus Corp.,(Nine Brookset Bus Corp. locations in Long Isl...,Nassau/Suffolk County,(631) 471-4600,Transportation,132,132,March,The Company’s initial temporary layoffs were u...,Teamsters Local 1205; United Service Workers U...,Plant Closing
37,4/28/2020,2019-1535 and 2019-1578,Plant Closing,"Baumann & Sons Buses, Inc.","(Six Baumann & Sons Buses, Inc. locations in L...",Suffolk County,(631) 471-4600,Transportation,209,209,March,The Company’s initial temporary layoffs were u...,"Teamsters Local 1205, United Service Workers U...",Plant Closing
38,4/28/2020,2019-1534 and 2019-1577,Plant Closing,Acme Bus Corp.,(Seven Acme Bus Corp. loctions in Long Island ...,Suffolk County,(631) 471-4600,Transportation,814,814,March,The Company’s initial temporary layoffs were u...,Teamsters Local 1205,Plant Closing
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1207,1/7/2020,2019-0209,Plant Closing,Branson Ultrasonics Corp.,"475 Quaker Meeting House RoadHoneoye Falls, NY...",Monroe County,(203) 796-0331,Welding and Soldering Equipment Manufacturing,47,47,Separations,Economic,The employees are not represented by a union.,Plant Closing
1208,1/8/2020,2019-0210,Plant Closing,"Connected Ventures, LLC (CH Media)","330 W. 34th Street, 5th FloorNew York, NY 10001",New York County,(212) 314-7366,Operates online content and retail properties,39,39,Employment,Economic,The employees are not represented by a union.,Plant Closing
1209,1/6/2020,2019-0207,Plant Closing,Macy's Broadway Mall Store (Macy's Retail Hold...,"100 Broadway MallHicksville, NY 11801",Nassau County,(646) 429-7462,Retail Store,155,155,Macy's,Economic,The employees are not represented by a union.,Plant Closing
1211,12/30/2019,2019-0205,Plant Closing,"127 W. 43rd St. Chophouse, Inc. (Heartland Bre...","127 West 43rd StreetNew York, NY 10018",New York County,(917) 999-6532,Restaurant,106,106,The,Economic,The employees are not represented by a union.,Plant Closing


What types of businesses have been most impacted?

In [37]:
df['business_type'].unique()

array(['Restaurant', 'Producer of LED safety lighting', 'Car Rental',
       'Travel Agency', 'Family planning services', 'Transportation',
       'Retail', 'Janitorial Services', 'Hotel',
       'Gem Research, Education, and Laboratory', 'Labor Union',
       'Non-profit', 'Medical Billing', 'Commercial Printing',
       'Event Venue', 'Convention Center', 'Social Organization',
       'Cinemas', 'Catering', 'Mechanical Contractor', 'Social Club',
       'Human Services', 'Wine Wholesaler', 'General Contractor',
       'Production Company', 'Janitorial Engineering', 'Fitness Club',
       'Legal Services', 'Shoe Store', 'Carpentry Organization',
       'Lumber and Building Material Distribution',
       'Transportation Carrier', 'Food preparation and delivery services',
       'Outdoor Lounge/Restaurant', 'Bakery', 'Eye Care',
       'Plumbing, Heating, PVF, Waterworks, and Fire Protection Supplies',
       'Plumbing, Heating, PVF, Fire Protection, and Waterworks Supplies',
       'Ma

In [148]:
df['business_type'].value_counts().reset_index()

Unnamed: 0,index,business_type
0,Restaurant,481
1,Hotel,138
2,Retail,39
3,Auto Dealership,24
4,Catering,16
...,...,...
345,Duck Farm,1
346,Refrigeration Wholesaler,1
347,Wholesale Bakery,1
348,Linen and Uniforms,1


What were the dislocation reasons?

In [39]:
df['dislocation'].value_counts().reset_index().rename(columns={'index': 'dislocation', 'dislocation': 'count'})

Unnamed: 0,dislocation,count
0,Unforeseeable business circumstances prompted ...,1087
1,Economic,58
2,Unforeseeable business circumstances prompted...,8
3,Unforeseen business circumstances as a result ...,3
4,The Company’s initial temporary layoffs were u...,3
5,Sale of Business,3
6,Budgetary Deficit,2
7,Restructuring which will involve the relocatio...,2
8,EconomicUnforeseeable business circumstances p...,2
9,Contract Loss,1


Same issue. Let's try to fix by replacing all values that contain **COVID-19** with COVID-19.

In [42]:
df.loc[df['dislocation'].str.contains('COVID-19'), 'dislocation'] = 'COVID-19'
dislocation = df['dislocation'].value_counts().reset_index().rename(columns={'index': 'dislocation', 'dislocation': 'count'})

In [43]:
df[df['dislocation'] == 'Loss of Contract with New York Transit Authority']

Unnamed: 0,notice_date,event_number,reason,company,address,county,phone,business_type,affected,total_employees,layoff_date,dislocation,union,classification
493,4/3/2020,2019-1062,Plant Layoff,"Consolidated Bus Transit (CBT) Para Transit, Inc.","2382 Blackrock AvenueBronx, NY 10472",Bronx County,(718) 346-9600 Ext,Bus Transit,239,0,Separations,Loss of Contract with New York Transit Authority,Local Union No. 854; Local 553 IBT,Plant Layoff


In [45]:
df['union'].value_counts()

The employees are not represented by a union                                 866
The employees are not represented by a union.                                108
New York Hotel & Motel Trades Council, AFL-CIO                                62
The employees are not represented by a union. Non Union                       20
lUE/CWA Local 81408                                                           10
                                                                            ... 
District Council 37, Local 215                                                 1
Laundry, Distribution & Food Service Joint, Workers United, SEIU Local 99      1
ATU Local 1700                                                                 1
UAW Union, Local 481                                                           1
New York Hotel & Motel Trades Council, AFL-CIO and UNITE HERE                  1
Name: union, Length: 112, dtype: int64

In [46]:
df.loc[df['union'].isin(['The employees are not represented by a union.',
                         'The employees are not represented by a union. Non Union'
                         ]), 'union'] = 'The employees are not represented by a union' 
df['union'].value_counts().reset_index()

Unnamed: 0,index,union
0,The employees are not represented by a union,994
1,"New York Hotel & Motel Trades Council, AFL-CIO",62
2,lUE/CWA Local 81408,10
3,-----,8
4,"Local 32BJ, Service Employees International Union",8
...,...,...
105,"District Council 37, Local 215",1
106,"Laundry, Distribution & Food Service Joint, Wo...",1
107,ATU Local 1700,1
108,"UAW Union, Local 481",1


In [47]:
df['affected'].sum()

110485

How many people were affected by a WARN notice by business type?

In [58]:
df.groupby(['business_type'])['affected'].agg('sum').sort_values(ascending=False).reset_index()

Unnamed: 0,business_type,affected
0,Restaurant,40689
1,Hotel,15358
2,Catering,3217
3,Retail clothing store,2696
4,Retail,2057
...,...,...
345,Closets and Storage,0
346,Clothing Manufacturing,0
347,Museum,0
348,Ophthalmologist/Optician,0


In [59]:
df['county'].value_counts()

New York County                                595
Nassau County                                   65
Kings County                                    64
Suffolk County                                  63
Queens County                                   58
                                              ... 
Queens/Kings/New York County                     1
New York/Kings/Queens/Bronx/ County              1
Nassau /Suffolk County                           1
New York/Richmond/Queens/Kings/Bronx County      1
New York/Kings/Queens/Bronx/Richmond County      1
Name: county, Length: 114, dtype: int64

In [60]:
df['company'].value_counts().reset_index()

Unnamed: 0,index,company
0,"Abercrombie & Fitch, abercrombie kids, Hollist...",10
1,A&M Administration LLC dba Charlotte Russe,8
2,"Mid Rockland Imaging Partners, Inc.",7
3,"OS Restaurant Services, LLC (Bloomin Brands - ...",7
4,"Guess?, Inc.",6
...,...,...
1111,It's Our Pleasure Hospitality Group LLC,1
1112,China Management LLC (2 locations),1
1113,The Century Association (the Club),1
1114,"Home Box Office, Inc.",1


In [61]:
df.groupby(['company'])['affected'].agg('sum').sort_values(ascending=False).reset_index()

Unnamed: 0,company,affected
0,"Abercrombie & Fitch, abercrombie kids, Hollist...",2696
1,"OS Restaurant Services, LLC (Bloomin Brands - ...",2478
2,American Sales Management Organization LLC dba...,1321
3,Regal Cinemas,1004
4,Zara USA,918
...,...,...
1111,Le Bernardin,0
1112,Lastrada Restaurant LLC,0
1113,Highgate Hotels LP (impacted workers at The Re...,0
1114,S.P.E.A.R. Physical and Occupational Therapy PLLC,0


In [188]:
pd.set_option('display.max_rows', 84)
date = df['notice_date'].value_counts().reset_index().rename(columns = {'index': 'date', 'notice_date': 'count'})
date = date[~date['date'].str.contains('2019')]
date

Unnamed: 0,date,count
0,3/25/2020,69
1,3/20/2020,66
2,3/23/2020,64
3,3/27/2020,63
4,3/31/2020,60
5,3/18/2020,56
6,3/30/2020,54
7,3/26/2020,52
8,3/19/2020,51
9,4/17/2020,48


In [189]:
date.to_csv('../output/by_date.csv', index=False)

In [62]:
df.groupby(['notice_date'])['affected'].agg('sum').sort_values(ascending=False).reset_index()

Unnamed: 0,notice_date,affected
0,3/25/2020,7229
1,3/18/2020,6855
2,3/31/2020,6043
3,3/23/2020,5761
4,3/19/2020,5466
...,...,...
79,2/12/2020,7
80,1/30/2020,2
81,11/4/2019,2
82,3/6/2020,2
