# Covid-19 Impact on Businesses in Bellevue, WA

## Introduction
It has been more than two years since the Covid-19 pandemic started in the U.S. Though our lives are getting back to normal, Covid's impact on our society has been significant. A few weeks ago, my wife mentioned that a lot more businesses seemed to be popping up in residential areas on Google Maps, and she speculated that this was mainly due to Covid. So I decided to test her hypothesis with my data analysis skills, and in this post I will walk through how I did it.

## Dataset
Thanks to City of Bellevue | Open Data, I downloaded the Business Listing dataset on May 29, 2022. It is a tabular dataset containing business information such as names, addresses, and start dates. There are 41,407 businesses, dating all the way back to 1953.

In [1]:
#hide
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
%matplotlib inline


import json
import plotly.express as px
from naics_convert import naics2industries, naics2sectors

In [10]:
#hide
business = pd.read_csv("data/Business_Listing.csv", low_memory=False)
for date_col in ['CancelDate','FirstActivityDate','IssueDate']:
    business[date_col] = pd.to_datetime(business[date_col])

In [11]:
#hide_input
business.shape[0], business['IssueDate'].min()

(41407, Timestamp('1953-01-01 08:00:00+0000', tz='UTC'))

## Objective
The goal is to determine whether more people have been starting their own businesses in Bellevue since the pandemic began, that is, whether there are more new small businesses in residential areas. It would also be interesting to gain some insight into what these businesses do.

### Preprocess Data
Some preprocessing is needed, both to clean the dataset and to derive additional fields that the original dataset does not contain. Here are the first 5 rows:

In [12]:
#hide_input
business.head()

Unnamed: 0,X,Y,ObjectId,BusinessFactId,BusinessId,LegalEntityName,Dba,Ubi,Naic,LegalEntityType,...,MailingZip4,PhysicalAddressLine1,PhysicalAddressLine2,Textbox4,PhysicalCity,PhysicalState,PhysicalPostalCode,PhysicalZip4,ProductsAndServices,IssueDate
0,-122.118803,47.594151,1,1,171474,ABACUS SCIENTIFIC INC,ABACUS SCIENTIFIC,6047495000000000.0,541519.0,Corporation,...,5150.0,16517 SE 18th St,,16517 SE 18th St,Bellevue,WA,98008,5150.0,Research and development. Computer Software,2021-07-26 06:59:59+00:00
1,-122.132494,47.605783,2,2,171612,"BELLEVUE SUNFLOWER DAYCARE,LLC",BELLEVUE SUNFLOWER DAYCARE,6047834000000000.0,611710.0,LLC or PLLC,...,5345.0,445 156th Ave SE,,445 156th Ave SE,Bellevue,WA,98007,5345.0,Family childcare services for kids under 12-ye...,2021-08-09 06:59:59+00:00
2,-122.132017,47.622329,3,3,31781,BURGERMASTER OF BELLEVUE,,179021800.0,722513.0,,...,5098.0,1350 156TH AV NE,,1350 156TH AV NE,BELLEVUE,WA,98007,4412.0,RESTAURANT,1988-11-01 07:59:59+00:00
3,-122.190504,47.626318,4,4,38947,CARL H JELSTRUP DC PS INC,,601331900.0,621310.0,Corporation,...,,1750 112TH AV NE,D154,1750 112TH AV NE D154,BELLEVUE,WA,98005,3727.0,CHIROPRACTIC HEALTH CARE,1992-04-01 08:00:00+00:00
4,-122.117462,47.641262,5,5,167692,ICK International Inc,ICK International Inc,6044245000000000.0,541613.0,Corporation,...,6173.0,3508 167th Pl NE,,3508 167th Pl NE,Bellevue,WA,98008,6173.0,SOFTWARE DEVELOPMENT and MARKETING,2020-04-08 06:59:59+00:00


Since we are interested in Covid's influence on new businesses, IssueDate (the start date) is critical for determining whether a business started before or after the pandemic began. We will drop the 217 businesses with a missing IssueDate.

In [13]:
#hide_input
print(business['IssueDate'].isna().sum())
business = business.dropna(subset=['IssueDate'])

217


In [None]:
#hide
business["Sector"] = naics2sectors(business["Naic"])
business["Industry"] = naics2industries(business["Naic"])
business['Year'] = business['IssueDate'].dt.year
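
With a Year column in place, one way to check whether new businesses are becoming more common is to count how many were issued each year. The sketch below uses a small made-up table (the `toy` frame and its dates are hypothetical, not taken from the Bellevue dataset) and is only meant to illustrate the idea:

```python
import pandas as pd

# Toy stand-in for the business table; the dates are made up for illustration.
toy = pd.DataFrame({
    "IssueDate": pd.to_datetime(
        ["2019-05-01", "2019-11-15", "2020-06-30",
         "2021-02-10", "2021-08-09", "2021-12-01"],
        utc=True,
    )
})
toy["Year"] = toy["IssueDate"].dt.year

# Number of businesses issued in each year, indexed by year.
new_per_year = toy.groupby("Year").size()
print(new_per_year)
```

The same `groupby("Year").size()` call should work on the real `business` frame once its Year column exists.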

### Addresses
My initial approach was to fetch the residential delivery indicator (RDI) for every business, since we have their addresses. Unfortunately, that is only feasible with a paid API (which I am reluctant to buy for a hobby project). But we can still plot business locations on a map: Mapbox offers a limited number of free API requests for its awesome Maps service. In addition, we truncate the dataset to keep only businesses with I

In [None]:
# Load the Mapbox access token from a local JSON file
with open('mapbox_token.json', 'r') as openfile:
    mapbox_token = json.load(openfile)
px.set_mapbox_access_token(mapbox_token['token'])

# Plot businesses issued since 2018; Y holds latitude and X holds longitude
recent = business[business['IssueDate'] >= '2018-01-01']
fig = px.scatter_mapbox(recent, lat='Y', lon='X', color='post_covid', zoom=10,
                        hover_name='LegalEntityName', opacity=0.6,
                        title='Business Listing in Bellevue Pre/Post Covid-19')
fig.show()
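
The map colors points by a `post_covid` column, which is not created in any of the cells shown here. Below is a minimal sketch of how such a flag could be derived from IssueDate. The March 1, 2020 cutoff, the `COVID_START` name, and the toy rows are all assumptions for illustration, not the values actually used in this analysis:

```python
import pandas as pd

# Hypothetical cutoff: treat March 1, 2020 as the start of the pandemic period.
COVID_START = pd.Timestamp("2020-03-01", tz="UTC")

# Toy rows standing in for the business table (names and dates are made up).
toy = pd.DataFrame({
    "LegalEntityName": ["A", "B", "C"],
    "IssueDate": pd.to_datetime(
        ["2018-06-01", "2020-02-29", "2021-07-26"], utc=True
    ),
})

# True when the business was issued on or after the assumed cutoff.
toy["post_covid"] = toy["IssueDate"] >= COVID_START
print(toy)
```

Because IssueDate is timezone-aware (UTC) in this dataset, the cutoff is given a UTC timezone as well so the comparison is between like types.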