Insight 5 - Which zip code areas have the lowest Rent-to-Wage ratio? (Aim: identify the best places to live for new immigrants)

In [1]:
import os
from spark_utils import *
from pyspark.sql import functions as F

spark = create_spark_session()
bucket = 's3a://helenaudacitybucket'

Note: Here we use the LCA income, which is based on the income of H-1B holders.
The prevailing wage data seems biased (several high-income data points have probably been removed from it intentionally).

In [2]:
load_zillow = spark.read.parquet(os.path.join(bucket, 'processed_data', 'Zillow_price_rent'))
load_zillow.createOrReplaceTempView('zillow')
load_lca = spark.read.parquet(os.path.join(bucket, 'processed_data', 'LCA'))
load_lca.createOrReplaceTempView('lca')

In [3]:
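# Average LCA income per worksite zip code, joined to the Zillow rent for July 2021.
# Note: 2021_07_Rent appears to be a monthly rent while the income is annual,
# so the ratio is mainly useful for ranking zip codes against each other.
output = spark.sql("""
SELECT zillow.State,
       zillow.Metro,
       zillow.CountyName,
       zillow.Zipcode,
       zillow.`2021_07_Rent`,  -- backticks: the column name starts with a digit
       (zillow.`2021_07_Rent` / lca_wage_by_zip.AVG_ANNUAL_INCOME) AS Rent_Wage_Ratio,
       lca_wage_by_zip.AVG_ANNUAL_INCOME,
       lca_wage_by_zip.INCOME_SAMPLE_SIZE
FROM zillow
JOIN (SELECT WORKSITE_POSTAL_CODE,
             AVG(ANNUAL_INCOME) AS AVG_ANNUAL_INCOME,
             COUNT(ANNUAL_INCOME) AS INCOME_SAMPLE_SIZE
      FROM lca
      GROUP BY WORKSITE_POSTAL_CODE
) lca_wage_by_zip
ON lca_wage_by_zip.WORKSITE_POSTAL_CODE = zillow.Zipcode
WHERE lca_wage_by_zip.INCOME_SAMPLE_SIZE > 10  -- skip zip codes with too few income records
ORDER BY Rent_Wage_Ratio
LIMIT 10
""")
# The query already returns at most 10 rows, so no further limit is needed.
output.toPandas()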

index,State,Metro,CountyName,Zipcode,2021_07_Rent,Rent_Wage_Ratio,AVG_ANNUAL_INCOME,INCOME_SAMPLE_SIZE
0,NY,New York-Newark-Jersey City,Westchester County,10601,2613.0,1.7e-05,153520500.0,30
1,NY,New York-Newark-Jersey City,New York County,10019,3004.0,0.000169,17821920.0,343
2,CA,Riverside-San Bernardino-Ontario,San Bernardino County,91761,2061.0,0.001748,1179054.0,22
3,AL,Birmingham-Hoover,Jefferson County,35205,1586.0,0.001781,890686.7,13
4,OH,Columbus,Franklin County,43202,1338.0,0.001803,742214.5,12
5,GA,Atlanta-Sandy Springs-Roswell,Gwinnett County,30093,1342.0,0.002038,658536.9,16
6,FL,Lakeland-Winter Haven,Polk County,33801,1190.0,0.002251,528659.6,12
7,CA,Los Angeles-Long Beach-Anaheim,Los Angeles County,90501,2324.0,0.002302,1009657.0,45
8,NJ,New York-Newark-Jersey City,Middlesex County,8817,1878.0,0.002473,759423.3,183
9,OR,Portland-Vancouver-Hillsboro,Multnomah County,97210,1588.0,0.002723,583162.3,17
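
The aggregation behind this table (group incomes by zip code, average them, keep groups with more than 10 records, join to rent, sort by ratio) can be sketched in plain Python. This is only an illustration under stated assumptions: the records, the zip codes `11111` and `22222`, and the helper `rent_wage_table` are hypothetical and are not taken from the LCA or Zillow data. The sketch also swaps in the median as an alternative aggregate, which the notebook does not use, to show how a single extreme income can change the ranking.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical records, made up for illustration only.
# Zip 11111 has 11 ordinary incomes plus one implausibly large entry.
lca_rows = [("11111", 90_000.0)] * 11 + [("11111", 90_000_000.0)]
lca_rows += [("22222", 120_000.0)] * 12
monthly_rent = {"11111": 1_500.0, "22222": 1_800.0}


def rent_wage_table(lca_rows, monthly_rent, min_samples=10, agg=mean):
    """Group incomes by zip code, aggregate them, and rank zip codes by rent / income."""
    incomes = defaultdict(list)
    for zipcode, income in lca_rows:
        incomes[zipcode].append(income)

    rows = []
    for zipcode, values in incomes.items():
        # Mirrors the inner join on zip code and the sample-size filter (> min_samples).
        if len(values) > min_samples and zipcode in monthly_rent:
            typical_income = agg(values)
            ratio = monthly_rent[zipcode] / typical_income
            rows.append((zipcode, ratio, typical_income, len(values)))
    # Lowest ratio first, as in ORDER BY Rent_Wage_Ratio.
    return sorted(rows, key=lambda row: row[1])


by_mean = rent_wage_table(lca_rows, monthly_rent, agg=mean)
by_median = rent_wage_table(lca_rows, monthly_rent, agg=median)

print("ranked by mean income:  ", [row[0] for row in by_mean])
print("ranked by median income:", [row[0] for row in by_median])
```

In this toy data the single extreme income puts zip code `11111` first when the mean is used, while the median ranks `22222` first. Whether something similar happens in the real LCA data would need to be checked against the data itself.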


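*Take home message:* If you are looking for a job and are open to relocation, check the opportunities in these zip code areas. A low Rent-to-Wage ratio suggests that a place is affordable relative to local wages, which makes it easier to save money and potentially attractive to new immigrants. Note, however, that several of the average incomes in the table above (for example, about 153.5 million for zip code 10601) are implausibly high for annual incomes, so the ranking is probably driven by outliers in the LCA data and should be read with caution.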
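
To make the ratio easier to interpret, the numbers from the last row of the table (zip code 97210) can be worked through by hand. The annualized share below, which multiplies the rent by 12, is an extra illustrative quantity that the notebook does not compute; it assumes that `2021_07_Rent` is a monthly figure.

```python
# Values copied from the last row of the output table (zip code 97210).
monthly_rent = 1588.0
avg_annual_income = 583162.3

# The ratio exactly as the query defines it: rent divided by average annual income.
rent_wage_ratio = monthly_rent / avg_annual_income

# Assumption: the rent is monthly, so 12 months of rent are compared to annual income.
annual_rent_share = 12 * monthly_rent / avg_annual_income

print(f"Rent_Wage_Ratio as in the query:  {rent_wage_ratio:.6f}")
print(f"Annual rent as a share of income: {annual_rent_share:.1%}")
```

Under that assumption, the annualized share is simply 12 times the reported ratio, so it leaves the ranking unchanged and only rescales the values into a fraction of income.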