### Data analysis: Mobile homes in Paradise, California


#### By Kavish Harjai

**Question**

How does the number of mobile home lots in Paradise, California, compare to the number of mobile home lots in other cities in California with a similar population density?

**Datasets**

This project uses three datasets:

* List of mobile home parks permitted by the California Housing and Community Development Department. [Source.](https://casas.hcd.ca.gov/casas/cmirMp/onlineQuery)
* Population estimates by place in California from the 2018 ACS 5-year-estimates (via API)
    - This project uses the 2018 estimates because they reflect the population before the Camp Fire.
* California places geography. [Source.](https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.2018.html)
    - This dataset contains the land area of each place in California, which is needed to calculate population density.
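
Population density here is population divided by land area. The conversion itself happens in the wrangling notebook and is not shown here, so the following is only a minimal sketch. It assumes the land area arrives in square meters, as in the `ALAND` field of TIGER/Line files, and the function name is hypothetical.

```python
# Square meters in one square mile (1 mile = 1609.344 meters exactly).
SQ_METERS_PER_SQ_MILE = 1609.344 ** 2


def population_density(population: int, land_area_sq_m: float) -> float:
    """Return people per square mile, given land area in square meters."""
    land_area_sq_mi = land_area_sq_m / SQ_METERS_PER_SQ_MILE
    return population / land_area_sq_mi


# Paradise town figures from this notebook: 26,543 residents on 18.32 square miles.
# The area is converted to square meters only to mimic the assumed raw input.
paradise_area_sq_m = 18.32 * SQ_METERS_PER_SQ_MILE
density = population_density(26543, paradise_area_sq_m)
print(round(density, 2))
```

The result should match the `pop_density` value reported for the town of Paradise later in this notebook.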
    
**Methodology notes**

This notebook follows `/Wrangling.ipynb`. The analysis has three steps:

1. Find the percentile in which Paradise's population density falls
2. Find other cities where the population density is within five percentile points above or below Paradise's
3. Compare the number of mobile home lots in Paradise to those in cities with comparable population densities

In [19]:
import pandas as pd
import numpy as np
import os
import requests
from pprint import pprint

In [20]:
data_dir = os.environ["DATA_DIR"]
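# The trailing "" keeps the trailing separator, so file names can be appended directly.
raw_data = os.path.join(data_dir, "raw", "")
processed_data = os.path.join(data_dir, "processed", "")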

In [27]:
mh_merge = pd.read_csv(processed_data + 'mh_merge.csv')

Inspect the merged data. Some cities don't have mobile home lots: only 678 of the 1,521 places have a non-null `mh_spaces` value.

In [28]:
mh_merge.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1521 entries, 0 to 1520
Data columns (total 8 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   city          1521 non-null   object 
 1   pop_2018_est  1521 non-null   int64  
 2   place         1521 non-null   int64  
 3   city_y        1521 non-null   object 
 4   area_land     1521 non-null   float64
 5   place_type    1521 non-null   object 
 6   pop_density   1521 non-null   float64
 7   mh_spaces     678 non-null    float64
dtypes: float64(3), int64(2), object(3)
memory usage: 95.2+ KB


Drop the cities with no mobile home spaces (a null `mh_spaces` value) from the analysis.

In [29]:
# Only mh_spaces contains nulls, so restrict the drop to that column.
mh_merge_filtered = mh_merge.dropna(subset=['mh_spaces'])

There are two places called Paradise in California: the town of Paradise and a much smaller census-designated place (CDP). This analysis is concerned with the town, where the population density in 2018 was 1,448.85 people per square mile.

In [30]:
mh_merge_filtered[mh_merge_filtered.city == 'Paradise']

| | city | pop_2018_est | place | city_y | area_land | place_type | pop_density | mh_spaces |
|---|---|---|---|---|---|---|---|---|
| 183 | Paradise | 26543 | 55520 | Paradise | 18.32 | Paradise town | 1448.85 | 1586.0 |
| 874 | Paradise | 186 | 55528 | Paradise | 4.35 | Paradise CDP | 42.76 | 1586.0 |
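
Because the name alone matches two rows, later steps need a way to pick out the town unambiguously. A minimal sketch, assuming the `place_type` values shown in the output above. The small `places` frame is a stand-in built for illustration, not the notebook's `mh_merge_filtered`.

```python
import pandas as pd

# Stand-in for the two Paradise rows shown above (values copied from the output).
places = pd.DataFrame({
    "city": ["Paradise", "Paradise"],
    "place": [55520, 55528],
    "place_type": ["Paradise town", "Paradise CDP"],
    "pop_density": [1448.85, 42.76],
})

# Match on both the name and the place type so only the town remains.
paradise_town = places[
    (places["city"] == "Paradise") & (places["place_type"] == "Paradise town")
]

paradise_density = paradise_town["pop_density"].iloc[0]
print(paradise_density)
```

Filtering on the `place` code (55520) would work equally well if the code is known in advance.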


To find the range of cities to compare to Paradise, find the percentile in which the town of Paradise's population density falls.

In [8]:
# Deciles: 0.1, 0.2, ..., 1.0
mh_merge_filtered['pop_density'].quantile(q=[i / 10 for i in range(1, 11)])

0.1       65.400
0.2      181.572
0.3      489.355
0.4      940.922
0.5     1737.390
0.6     2710.674
0.7     3450.765
0.8     4322.878
0.9     6754.560
1.0    20352.540
Name: pop_density, dtype: float64

Paradise's density of 1,448.85 falls between the 40th percentile (940.92) and the 50th (1,737.39). Drill down further in one-point steps.

In [18]:
# Percentiles 40 through 52 in one-point steps: 0.40, 0.41, ..., 0.52
mh_merge_filtered['pop_density'].quantile(q=[i / 100 for i in range(40, 53)])

0.40     940.9220
0.41     967.0455
0.42    1151.2548
0.43    1221.7383
0.44    1286.7516
0.45    1318.8820
0.46    1388.6062
0.47    1484.7854
0.48    1530.8668
0.49    1660.7012
0.50    1737.3900
0.51    1862.2144
0.52    1942.4144
Name: pop_density, dtype: float64

Paradise's density (1,448.85) lies between the 46th percentile (1,388.61) and the 47th (1,484.79), and it is closest to the 47th.
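
The drill-down can also be replaced by computing a percentile rank directly. The following is a sketch rather than the method used in this notebook: it defines the rank as the share of values at or below the target, which can differ by a fraction of a point from the interpolated values that `quantile` returns. The function name and the density values are made up for illustration.

```python
import pandas as pd


def percentile_rank(values: pd.Series, target: float) -> float:
    """Return the percentage of values that are less than or equal to target."""
    return (values <= target).mean() * 100


# Hypothetical densities; only 1448.85 is taken from this notebook.
densities = pd.Series(
    [100.0, 500.0, 900.0, 1200.0, 1448.85, 1700.0, 2700.0, 3400.0, 4300.0, 6700.0]
)

rank = percentile_rank(densities, 1448.85)
print(rank)  # 50.0 (five of the ten values are at or below the target)
```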

Create a filter to show cities that have a population density between the 42nd and 52nd percentiles, which is five percentile points on either side of the 47th.

In [31]:
# Bounds are the 42nd and 52nd percentile values from the quantile output above.
comparison_range_popdens_percentile = mh_merge_filtered[
    (mh_merge_filtered['pop_density'] >= 1151.2548) &
    (mh_merge_filtered['pop_density'] <= 1942.4144)]
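
The bounds in this filter are copied by hand from the printed quantiles, which are rounded for display. An alternative is to compute the bounds from the data, as in this sketch. It is not the code that produced the results in this notebook. The helper name and the toy frame are hypothetical, and the row count could differ slightly from the hand-typed version if a value sits right at a bound.

```python
import pandas as pd


def filter_by_percentile_band(
    df: pd.DataFrame, column: str, lower_q: float, upper_q: float
) -> pd.DataFrame:
    """Keep rows whose value in `column` lies between two quantiles, inclusive."""
    lower, upper = df[column].quantile([lower_q, upper_q])
    return df[df[column].between(lower, upper)]


# Toy data: eleven evenly spaced densities from 0 to 100.
toy = pd.DataFrame({"pop_density": range(0, 110, 10)})

# The 25th and 75th percentiles are 25.0 and 75.0, so 30 through 70 are kept.
band = filter_by_percentile_band(toy, "pop_density", 0.25, 0.75)
print(band["pop_density"].tolist())
```

For this analysis the call would use `0.42` and `0.52` on `mh_merge_filtered`.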


In [32]:
comparison_range_popdens_percentile.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 68 entries, 14 to 1454
Data columns (total 8 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   city          68 non-null     object 
 1   pop_2018_est  68 non-null     int64  
 2   place         68 non-null     int64  
 3   city_y        68 non-null     object 
 4   area_land     68 non-null     float64
 5   place_type    68 non-null     object 
 6   pop_density   68 non-null     float64
 7   mh_spaces     68 non-null     float64
dtypes: float64(3), int64(2), object(3)
memory usage: 4.8+ KB


Rank the filtered list of cities by number of mobile home spaces.

In [33]:
sort_compared_popdens_percentile = (
    comparison_range_popdens_percentile
    .sort_values('mh_spaces', ascending=False)
    .reset_index()
)
sort_compared_popdens_percentile.head(20)

| | index | city | pop_2018_est | place | city_y | area_land | place_type | pop_density | mh_spaces |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 556 | Yucaipa | 53264 | 87042 | Yucaipa | 28.29 | Yucaipa city | 1882.79 | 4557.0 |
| 1 | 574 | Lancaster | 159662 | 40130 | Lancaster | 94.28 | Lancaster city | 1693.49 | 4177.0 |
| 2 | 1169 | Redding | 91327 | 59920 | Redding | 59.65 | Redding city | 1531.05 | 2569.0 |
| 3 | 14 | Palmdale | 156904 | 55156 | Palmdale | 106.08 | Palmdale city | 1479.11 | 2098.0 |
| 4 | 980 | San Jacinto | 47474 | 67112 | San Jacinto | 25.71 | San Jacinto city | 1846.52 | 1846.0 |
| 5 | 627 | Oroville | 19040 | 54386 | Oroville | 13.83 | Oroville city | 1376.72 | 1620.0 |
| 6 | 183 | Paradise | 26543 | 55520 | Paradise | 18.32 | Paradise town | 1448.85 | 1586.0 |
| 7 | 680 | Victorville | 121861 | 82590 | Victorville | 73.62 | Victorville city | 1655.27 | 1248.0 |
| 8 | 1446 | Ridgecrest | 28736 | 60704 | Ridgecrest | 20.88 | Ridgecrest city | 1376.25 | 1078.0 |
| 9 | 1227 | Auburn | 13946 | 3204 | Auburn | 7.18 | Auburn city | 1942.34 | 1006.0 |


### Conclusion

Paradise had the seventh-highest number of mobile home spaces (1,586) among the 68 California cities with a similar population density in 2018.
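
The rank can be computed rather than read off the sorted table. A minimal sketch using the ten rows shown above (the ten highest counts among the 68 cities). The `spaces` frame and the `rank` column are constructed here for illustration.

```python
import pandas as pd

# The ten rows shown in the ranked table above.
spaces = pd.DataFrame({
    "city": ["Yucaipa", "Lancaster", "Redding", "Palmdale", "San Jacinto",
             "Oroville", "Paradise", "Victorville", "Ridgecrest", "Auburn"],
    "mh_spaces": [4557, 4177, 2569, 2098, 1846, 1620, 1586, 1248, 1078, 1006],
})

# Rank 1 is the city with the most spaces; ties share the lowest rank number.
spaces["rank"] = spaces["mh_spaces"].rank(ascending=False, method="min").astype(int)

paradise_rank = spaces.loc[spaces["city"] == "Paradise", "rank"].iloc[0]
print(paradise_rank)  # 7
```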

Line for story: Paradise had one of the highest numbers of spaces available in its mobile home parks compared with more than 60 other California cities that had a similar concentration of people in 2018, an analysis of state and Census data shows. 
