# Exploring the Best Location for a New Daycare in Fairfax County, Virginia

##### *By Gustave Muhoza*
July 2020


## I. Introduction and the Business Problem

Over the last two decades, the child care services industry has grown steadily. In 2019, for example, the industry was estimated to bring in 47 billion dollars in revenue [(1)](https://blogs.edweek.org/edweek/early_years/2019/02/child_care_packs_nearly_100_billion_economic_impact_report_finds.html). Although child care services is ranked as the 2nd best franchise in the United States by FranData, a franchise market research company [(2)](https://franserve.com/2020/02/the-top-franchising-categories-of-2020/), the vast majority of child care is still provided by family members or within a home, often at no monetary cost [(3)](https://www.childtrends.org/nearly-30-percent-of-infants-and-toddlers-attend-home-based-child-care-as-their-primary-arrangement). Clearly, there is room for new providers. Given the industry's size and steady growth, providing child care services can be rewarding not only because of the nature of the work (serving children) but also because it can be financially sustainable. However, as with many other small businesses, long-term financial sustainability is not guaranteed: starting a child care business requires extensive market research.

In this project, I explore one way data science can help answer what is perhaps the main market research question for any in-person service business: the question of location. Specifically, I present the case of a person in Fairfax County, Virginia who is in the process of deciding on the neighborhood in which to establish a new daycare. Using Foursquare, Census Bureau, and Fairfax County data, my goal is to answer the following question: **Which Fairfax neighborhoods are the most underserved in terms of daycare?**  


## II. The Data

To answer the above question, I want to identify the neighborhoods whose population characteristics indicate a need for additional daycares. I group these neighborhoods according to their similarity and create a visual representation of the most promising ones. Choosing a daycare location may depend on other factors, including the type of daycare, the hours during which services will be provided, marriage licenses in the area, and many other demographic factors (see [US Small Business Administration](https://www.sba.gov/sites/default/files/files/pub_mp29.pdf)). To limit the scope of this analysis, I focus on the relationship between the geographic concentration of children ages 0 to 4 and the number of daycares available in the area. I base my analysis and recommendations on the following data:

- **Census Designated Places (CDPs) data**: The data is available through [Fairfax County Open Geospatial Data](https://www.fairfaxcounty.gov/maps/open-geospatial-data) and is provided in many formats. I chose the csv version. In addition to the CDP names, which serve as neighborhoods for the purpose of my analysis, this data provides the latitudes and longitudes of the different neighborhoods for mapping purposes and for obtaining nearby daycares. Knowledge of the area confirms that Fairfax County CDPs can indeed be considered neighborhoods.

- **Census 2010 ZCTA to Place Relationship File**: This csv file is provided by [the US Census Bureau](http://www2.census.gov/geo/docs/maps-data/data/rel/zcta_place_rel_10.txt?#) and contains population data by zip code tabulation area (ZCTA), along with how the ZCTAs match the county's CDPs. For simplicity, I use “Zip Code” (postal code) and “ZCTA” interchangeably, even though I am aware of the issues that may result [(4)](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1762013/). This lack of geographic precision, however, should not undermine the analysis, because a daycare established in one geographic area can serve children residing in other areas. This file makes it possible to represent businesses by zip code (approximately) and by neighborhood at the same time. I read the relationship file from the Census Bureau and merge it with the designated places data from Fairfax County using their shared geoid. The result is a table of zip codes with their corresponding neighborhoods.

- **Census 2010 Age and Sex File**: I obtained this csv file by browsing the many tables on the US Census Bureau's [Explore Census Data](https://data.census.gov/cedsci/) site. It has data on children by census designated place, along with other details relevant to this analysis. I will use the percent of children ages 0 to 4. Due to time constraints, I was not able to learn the Census API to obtain the file programmatically, so I downloaded the file and saved it locally.

- **County Population Characteristics 2010-2019**: This csv file, available on the [US Census Bureau](https://www2.census.gov/programs-surveys/popest/datasets/2010-2019/counties/asrh/cc-est2019-alldata-51.csv) website, has population estimates at the county level.

- **Foursquare Data**: Using the explore endpoint of the Foursquare Places API with the daycare category (4f4532974b9074f6e4fb0104), I rely on the census data above to pull a list of daycares for each neighborhood and zip code.
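
The merge of the ZCTA-to-place relationship file with the CDP data on their shared geoid might look roughly like the sketch below. It uses two small in-memory tables in place of the real files. All values are made up for illustration, and the column names (`GEOID`, `ZCTA5`, `POPPT`, `NAME`, `LAT`, `LON`) are assumptions that may differ from the actual downloads.

```python
import pandas as pd

# Toy stand-in for the Fairfax County CDP file (illustrative values only).
cdps = pd.DataFrame({
    "GEOID": ["5100001", "5100002", "5100003"],
    "NAME": ["Place A", "Place B", "Place C"],
    "LAT": [38.80, 38.85, 38.90],
    "LON": [-77.10, -77.20, -77.30],
})

# Toy stand-in for the ZCTA-to-place relationship file (illustrative values only).
# The last row points at a place that is not in the CDP table.
zcta_place = pd.DataFrame({
    "ZCTA5": ["00001", "00002", "00002", "00009"],
    "GEOID": ["5100001", "5100001", "5100002", "5199999"],
    "POPPT": [1200, 300, 4500, 800],
})


def zip_to_neighborhood(cdps: pd.DataFrame, zcta_place: pd.DataFrame) -> pd.DataFrame:
    """Inner-join the two tables on GEOID, one row per (zip code, neighborhood) pair."""
    # Compare keys as strings so identifiers with leading zeros match.
    left = zcta_place.astype({"GEOID": str})
    right = cdps.astype({"GEOID": str})
    # many_to_one: several ZCTA rows may map to one CDP, but each CDP appears once.
    merged = left.merge(right, on="GEOID", how="inner", validate="many_to_one")
    return merged[["ZCTA5", "GEOID", "NAME", "LAT", "LON", "POPPT"]]


zip_neighborhoods = zip_to_neighborhood(cdps, zcta_place)
print(zip_neighborhoods)
```

When reading the real csv files, passing `dtype=str` for the identifier columns to `pd.read_csv` keeps zip codes and geoids from being parsed as integers. The inner join drops relationship rows for places outside the CDP table, which is how the sketch restricts the national file to one county.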
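
As a rough illustration of the Foursquare step, the sketch below builds one explore request URL per neighborhood without sending it. It assumes the legacy v2 `venues/explore` endpoint and that the endpoint accepts a `categoryId` filter, as the description above implies. The radius, limit, version date, placeholder credentials, and neighborhood coordinates are all hypothetical choices, not values taken from the analysis.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumed legacy v2 endpoint; check the current Foursquare documentation before use.
EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"
# Daycare category ID quoted in the data description.
DAYCARE_CATEGORY_ID = "4f4532974b9074f6e4fb0104"


def build_explore_url(lat, lon, client_id, client_secret,
                      radius_m=2000, limit=50, version="20200701"):
    """Return an explore URL for daycares around one point.

    radius_m, limit, and version are illustrative defaults, not the author's settings.
    """
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,              # API version date, YYYYMMDD
        "ll": f"{lat},{lon}",      # latitude,longitude of the neighborhood
        "radius": radius_m,        # search radius in meters
        "limit": limit,            # maximum number of venues to return
        "categoryId": DAYCARE_CATEGORY_ID,
    }
    return f"{EXPLORE_URL}?{urlencode(params)}"


# Hypothetical neighborhoods as (name, latitude, longitude).
neighborhoods = [("Place A", 38.80, -77.10), ("Place B", 38.85, -77.20)]

urls = {
    name: build_explore_url(lat, lon, "YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET")
    for name, lat, lon in neighborhoods
}

# Sanity check: decode the query string back into its parameters.
query = parse_qs(urlsplit(urls["Place A"]).query)
print(query["ll"], query["categoryId"])
```

Each URL can then be fetched with an HTTP client and the JSON response flattened into one row per daycare, tagged with the neighborhood name used to build the request.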
 