### Study of the effect of national factors on home prices in the US


**Task:** Using publicly available data on the national factors that impact the supply and demand of homes in the US, build a data science model to study the effect of these variables on home prices.

**Approach:** The following variables are chosen for the study:

1. Unemployment Rate
2. Employment Rate
3. Per capita GDP
4. Median Household Income
5. Construction Prices
6. CPI
7. Interest Rates
8. Monthly supply of new houses
9. Working Population
10. Urban Population
11. Percentage of population above 65
12. Housing subsidies
13. Number of Households

As a proxy for home prices, the S&P **Case-Shiller Index** is used.

**Note:** Most of the data is downloaded from FRED, the Federal Reserve Bank of St. Louis economic data site (https://fred.stlouisfed.org/).

Data for all the variables is downloaded, preprocessed, and combined into a single dataset following an **Extract, Transform, Load (ETL)** workflow. The variables are published at different frequencies (monthly, quarterly, and annual), so they are aligned to a monthly index: quarterly values are interpolated, and annual values are repeated for every month of the year.
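As a rough illustration of the alignment step, the sketch below combines a monthly, a quarterly, and an annual series on one monthly index. The values and the names `monthly`, `quarterly`, `annual`, and `combined` are made up for illustration and are not taken from the FRED files.

```python
import pandas as pd

# Toy monthly series (hypothetical values, six months).
monthly = pd.DataFrame({
    "DATE": pd.date_range("2001-01-01", periods=6, freq="MS"),
    "index_value": [100.0, 101.0, 102.0, 103.0, 104.0, 105.0],
})

# Toy quarterly series: only January and April have observations.
quarterly = pd.DataFrame({
    "DATE": pd.to_datetime(["2001-01-01", "2001-04-01"]),
    "gdp": [10.0, 13.0],
})

# Toy annual series: one value per calendar year.
annual = pd.DataFrame({"Year": [2001], "households": [500.0]})

# A left merge keeps every monthly row; months without a quarterly value become NaN.
combined = monthly.merge(quarterly, on="DATE", how="left")

# Annual values are repeated for every month of the matching year.
combined["Year"] = combined["DATE"].dt.year
combined = combined.merge(annual, on="Year", how="left")

print(combined)
```

The quarterly column is left with gaps on purpose: filling them is a separate decision (interpolation here), while the annual column is simply broadcast across the year.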


#### Importing necessary libraries

In [5]:
import numpy as np
import pandas as pd

#### Perform ETL

In [6]:
# Reading CASE-SHILLER Index into a dataframe
# (raw string so the backslashes in the Windows path are not treated as escapes)
df_CS = pd.read_csv(r"D:\_SANKET_DATA_SCIENCE_\Interview\Home.LLC\DATASETS\CSUSHPISA.csv")

# Changing dtype of date column
df_CS["DATE"] = pd.to_datetime(df_CS["DATE"])

# Selecting data up to April 2024; .copy() avoids SettingWithCopyWarning when columns are added below
mask = df_CS["DATE"] <= "2024-04-01"
df_CS = df_CS[mask].copy()

# Resetting index
df_CS = df_CS.reset_index(drop = True)

# Creating "Year" and "Month" columns
df_CS["Year"] = pd.DatetimeIndex(df_CS["DATE"]).year
df_CS["Month"] = pd.DatetimeIndex(df_CS["DATE"]).month
print("Shape of the CASE-SHILLER Index:- ", df_CS.shape)
df_CS.tail()


Shape of the CASE-SHILLER Index:-  (280, 4)


Unnamed: 0,DATE,CSUSHPISA,Year,Month
275,2023-12-01,314.443,2023,12
276,2024-01-01,315.728,2024,1
277,2024-02-01,317.257,2024,2
278,2024-03-01,318.217,2024,3
279,2024-04-01,319.048,2024,4


In [12]:
# Reading Unemployment Rate Data into a dataframe
df_unemp = pd.read_csv(r"D:\_SANKET_DATA_SCIENCE_\Interview\Home.LLC\DATASETS\UNRATE.csv")
df_unemp.drop([280,281], inplace = True)
print("Unemployment Rate Data:- ", df_unemp.shape)
df_unemp.tail()

Unemployment Rate Data:-  (280, 2)


Unnamed: 0,DATE,UNRATE
275,12/1/2023,3.7
276,1/1/2024,3.7
277,2/1/2024,3.9
278,3/1/2024,3.8
279,4/1/2024,3.9


In [13]:
# Reading Employment Rate Data into a dataframe
df_emp = pd.read_csv(r"D:\_SANKET_DATA_SCIENCE_\Interview\Home.LLC\DATASETS\EMPRATE.csv")
df_emp = df_emp.rename(columns={'LREM64TTUSM156S': 'EmpRate'})
df_emp.drop([280,281], inplace = True)
print("shape of the Employment Rate Data:- ", df_emp.shape)
df_emp.tail()

shape of the Employment Rate Data:-  (280, 2)


Unnamed: 0,DATE,EmpRate
275,12/1/2023,71.81763
276,1/1/2024,72.01261
277,2/1/2024,71.88552
278,3/1/2024,72.00176
279,4/1/2024,72.02491


In [14]:
# Reading Per Capita GDP Data into a dataframe
df_pcgdp = pd.read_csv(r"D:\_SANKET_DATA_SCIENCE_\Interview\Home.LLC\DATASETS\GDP.csv", names = ["DATE", "A939RX0Q048SBEA"], skiprows = 1)
df_pcgdp = df_pcgdp.rename(columns={'A939RX0Q048SBEA': 'Per_Capita_GDP'})
print("Shape of the Per Capita GDP Data:- ", df_pcgdp.shape)
df_pcgdp.tail()

Shape of the Per Capita GDP Data:-  (93, 2)


Unnamed: 0,DATE,Per_Capita_GDP
88,1/1/2023,66096
89,4/1/2023,66357
90,7/1/2023,67050
91,10/1/2023,67513
92,1/1/2024,67672


The data is quarterly. We will impute for other months using linear interpolation after we create the final dataframe combining all the data.


In [16]:
# Interest Rate Data
df_Fed_rate = pd.read_csv(r"D:\_SANKET_DATA_SCIENCE_\Interview\Home.LLC\DATASETS\FEDFUNDS.csv").drop([280,281])
print("Shape of the Interest rate data:- ",df_Fed_rate.shape)
df_Fed_rate.tail()

Shape of the Interest rate data:-  (280, 2)


Unnamed: 0,DATE,FEDFUNDS
275,12/1/2023,5.33
276,1/1/2024,5.33
277,2/1/2024,5.33
278,3/1/2024,5.33
279,4/1/2024,5.33


In [17]:
# Reading Construction Material Data into a dataframe
df_cons_price_index = pd.read_csv(r"D:\_SANKET_DATA_SCIENCE_\Interview\Home.LLC\DATASETS\WPUSI012011.csv", names = ["DATE", "WPUSI012011"], skiprows = 1)
df_cons_price_index = df_cons_price_index.rename(columns={'WPUSI012011': 'Cons_Material'})
df_cons_price_index.drop([280,281], inplace = True)
print("Shape of the Construction Material Data:- ", df_cons_price_index.shape)
df_cons_price_index.tail()

Shape of the Construction Material Data:-  (280, 2)


Unnamed: 0,DATE,Cons_Material
275,12/1/2023,327.644
276,1/1/2024,334.374
277,2/1/2024,337.766
278,3/1/2024,330.965
279,4/1/2024,330.166


In [20]:
# Consumer Price Index
df_CPI = pd.read_csv(r"D:\_SANKET_DATA_SCIENCE_\Interview\Home.LLC\DATASETS\CPIAUCSL.csv", names = ["DATE", "CPIAUCSL"], skiprows = 1).drop([268,269])
df_CPI = df_CPI.rename(columns={'CPIAUCSL': 'CPI'})
print("Shape of the Consumer Price Index:- ", df_CPI.shape)
df_CPI.tail()

Shape of the Consumer Price Index:-  (268, 2)


Unnamed: 0,DATE,CPI
263,12/1/2023,308.742
264,1/1/2024,309.685
265,2/1/2024,311.054
266,3/1/2024,312.23
267,4/1/2024,313.207


In [22]:
# Monthly supply of new houses (MSACSR is measured in months of supply, not a count of houses)
df_house = pd.read_csv(r"D:\_SANKET_DATA_SCIENCE_\Interview\Home.LLC\DATASETS\MSACSR.csv", names = ["DATE", "MSACSR"], skiprows = 1).drop([280])
df_house = df_house.rename(columns={'MSACSR': 'Houses'})
print("Shape of the monthly house supply data:- ", df_house.shape)
df_house.tail()


Shape of the monthly house supply data:-  (280, 2)


Unnamed: 0,DATE,Houses
275,12/1/2023,8.2
276,1/1/2024,8.3
277,2/1/2024,8.7
278,3/1/2024,8.2
279,4/1/2024,8.1


In [23]:
# Population above 65

df_oldpop = pd.read_csv(r"D:\_SANKET_DATA_SCIENCE_\Interview\Home.LLC\DATASETS\old pops.csv", names = ["DATE", "old pops"], skiprows = 1)
#df_oldpop['DATE'] = pd.to_datetime(df_oldpop['DATE'], format="%d-%m-%Y").dt.strftime("%Y-%m-%d")
print("Shape of the population data age above 65:- ", df_oldpop.shape)
df_oldpop.tail()

Shape of the population data age above 65:-  (23, 2)


Unnamed: 0,DATE,old pops
18,1/1/2019,15.791801
19,1/1/2020,16.2234
20,1/1/2021,16.678895
21,1/1/2022,17.128121
22,1/1/2023,17.58792


In [30]:
# Urban Population Percent

df_urban = pd.read_csv(r"D:\_SANKET_DATA_SCIENCE_\Interview\Home.LLC\DATASETS\urban_pops.csv", names = ["DATE", "urban pops"], skiprows = 1)
df_urban['DATE'] = pd.to_datetime(df_urban['DATE'], format="%Y").dt.strftime("%Y-%m-%d")
print("Shape of the urban population percent data:- ", df_urban.shape)
df_urban.tail()

Shape of the urban population percent data:-  (23, 2)


Unnamed: 0,DATE,urban pops
18,2019-01-01,82.459
19,2020-01-01,82.664
20,2021-01-01,82.873
21,2022-01-01,83.084
22,2023-01-01,83.298


In [31]:
# Housing Subsidies

df_subsidy = pd.read_csv(r"D:\_SANKET_DATA_SCIENCE_\Interview\Home.LLC\DATASETS\Housing Subsidies.csv", names = ["DATE", "Subsidy"], skiprows = 1)
print("Shape of the housing subsidies:- ", df_subsidy.shape)
df_subsidy.tail()


Shape of the housing subsidies:-  (22, 2)


Unnamed: 0,DATE,Subsidy
17,1/1/2018,38.859
18,1/1/2019,40.185
19,1/1/2020,44.147
20,1/1/2021,45.299
21,1/1/2022,48.021


In [34]:
# Working age population

df_working = pd.read_csv(r"D:\_SANKET_DATA_SCIENCE_\Interview\Home.LLC\DATASETS\LFWA64TTUSM647S.csv", names = ["DATE", "LFWA64TTUSM647S"], skiprows = 1).drop([280,281])
df_working = df_working.rename(columns={'LFWA64TTUSM647S': 'working_age_pop'})
print("Shape of the working age population:- ", df_working.shape)
df_working.tail()

Shape of the working age population:-  (280, 2)


Unnamed: 0,DATE,working_age_pop
275,12/1/2023,209117700
276,1/1/2024,208630800
277,2/1/2024,208655500
278,3/1/2024,208606600
279,4/1/2024,208586500


In [35]:
# Real Median Household Income

df_income = pd.read_csv(r"D:\_SANKET_DATA_SCIENCE_\Interview\Home.LLC\DATASETS\Real_Median_House.csv", names = ["DATE", "MEHOINUSA672N"], skiprows = 1)
df_income = df_income.rename(columns={'MEHOINUSA672N': 'median_income'})
print("Shape of the median household income data:- ", df_income.shape)
df_income.tail()


Shape of the median household income data:-  (22, 2)


Unnamed: 0,DATE,median_income
17,1/1/2018,73030
18,1/1/2019,78250
19,1/1/2020,76660
20,1/1/2021,76330
21,1/1/2022,74580


In [36]:
# Total number of households

df_households = pd.read_csv(r"D:\_SANKET_DATA_SCIENCE_\Interview\Home.LLC\DATASETS\Households.csv", names = ["DATE", "TTLHH"], skiprows = 1)
df_households = df_households.rename(columns={'TTLHH': 'Num_Households'})
print("Shape of the total households data:- ", df_households.shape)
df_households.tail()


Shape of the total households data:-  (23, 2)


Unnamed: 0,DATE,Num_Households
18,1/1/2019,128579
19,1/1/2020,128451
20,1/1/2021,129224
21,1/1/2022,131202
22,1/1/2023,131434


In [37]:
# Merging Per Capita GDP (quarterly data); the left join keeps every month and leaves NaN where no quarterly value exists
df_pcgdp["DATE"] = pd.to_datetime(df_pcgdp["DATE"])
df_CS = pd.merge(df_CS, df_pcgdp, how = "left", on = "DATE")
df_CS.head()


Unnamed: 0,DATE,CSUSHPISA,Year,Month,Per_Capita_GDP
0,2001-01-01,109.846,2001,1,49911.0
1,2001-02-01,110.501,2001,2,
2,2001-03-01,111.108,2001,3,
3,2001-04-01,111.652,2001,4,50105.0
4,2001-05-01,112.164,2001,5,


In [38]:
# Concatenating dataframes having monthly data to create one dataframe
df = pd.DataFrame()
df_bymonth = [df_CS, df_working, df_house, df_CPI, df_unemp, df_emp, df_cons_price_index, df_Fed_rate]
for df1 in df_bymonth:
    df1["DATE"] = pd.to_datetime(df1["DATE"])
    df1 = df1.set_index("DATE")
    df = pd.concat([df,df1], axis = 1)
print(df.shape)
df.head()


(280, 11)


DATE,CSUSHPISA,Year,Month,Per_Capita_GDP,working_age_pop,Houses,CPI,UNRATE,EmpRate,Cons_Material,FEDFUNDS
2001-01-01,109.846,2001,1,49911.0,180348200,3.8,,4.2,73.96542,142.0,5.98
2001-02-01,110.501,2001,2,,180505900,3.7,,4.2,73.87383,142.4,5.49
2001-03-01,111.108,2001,3,,180595600,3.8,,4.3,73.90762,142.4,5.31
2001-04-01,111.652,2001,4,50105.0,180851800,3.9,,4.4,73.55553,142.5,4.8
2001-05-01,112.164,2001,5,,181013500,4.0,,4.3,73.39594,144.2,4.21


In [39]:
# Merging annual dataframes on "Year", so each annual value is repeated for every month of that year
others = [df_urban, df_households, df_income, df_subsidy, df_oldpop]
for df1 in others:
    # Derive the merge key from the date if it is not already present
    if "Year" not in df1.columns:
        df1["Year"] = pd.DatetimeIndex(df1["DATE"]).year
    df1.set_index("DATE", inplace = True)
    df = pd.merge(df, df1, how = "left", on = "Year")
# Merging on a column resets the index, so restore the monthly dates from df_CS
df["DATE"] = df_CS["DATE"]
df.set_index("DATE", inplace = True)
df.head()

DATE,CSUSHPISA,Year,Month,Per_Capita_GDP,working_age_pop,Houses,CPI,UNRATE,EmpRate,Cons_Material,FEDFUNDS,urban pops,Num_Households,median_income,Subsidy,old pops
2001-01-01,109.846,2001,1,49911.0,180348200,3.8,,4.2,73.96542,142.0,5.98,79.234,108209.0,66360.0,20.573,12.296945
2001-02-01,110.501,2001,2,,180505900,3.7,,4.2,73.87383,142.4,5.49,79.234,108209.0,66360.0,20.573,12.296945
2001-03-01,111.108,2001,3,,180595600,3.8,,4.3,73.90762,142.4,5.31,79.234,108209.0,66360.0,20.573,12.296945
2001-04-01,111.652,2001,4,50105.0,180851800,3.9,,4.4,73.55553,142.5,4.8,79.234,108209.0,66360.0,20.573,12.296945
2001-05-01,112.164,2001,5,,181013500,4.0,,4.3,73.39594,144.2,4.21,79.234,108209.0,66360.0,20.573,12.296945


In [40]:
print(df.shape)

(280, 16)


Check for missing values (NaN):

In [41]:
df.isna().sum()

CSUSHPISA            0
Year                 0
Month                0
Per_Capita_GDP     187
working_age_pop      0
Houses               0
CPI                 12
UNRATE               0
EmpRate              0
Cons_Material        0
FEDFUNDS             0
urban pops           4
Num_Households       4
median_income       16
Subsidy             16
old pops             4
dtype: int64

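The "Per_Capita_GDP" column has missing values because the data is quarterly. CPI is missing for 2001 because the downloaded series starts in January 2002, and the annual series (urban population, number of households, median income, housing subsidies, population above 65) are missing for the most recent months because newer figures were not yet available. We first fill the gaps in the "Per_Capita_GDP" column using linear interpolation, then drop the rows that still have missing values in any other column. The final dataset therefore covers January 2002 to December 2022.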
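One way to see which date range survives `dropna()` is to look at the first and last valid date of each column. The sketch below uses a hypothetical miniature frame (`toy`, with made-up column names and values) rather than the real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the combined frame: one column starts late,
# another stops early.
idx = pd.date_range("2001-01-01", periods=6, freq="MS")
toy = pd.DataFrame(
    {
        "full": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
        "starts_late": [np.nan, np.nan, 3.0, 4.0, 5.0, 6.0],
        "ends_early": [1.0, 2.0, 3.0, 4.0, np.nan, np.nan],
    },
    index=idx,
)

# First and last date with a value, per column.
coverage = pd.DataFrame({
    "first_valid": toy.apply(lambda col: col.first_valid_index()),
    "last_valid": toy.apply(lambda col: col.last_valid_index()),
})
print(coverage)

# dropna() keeps only rows where every column is present,
# i.e. the overlap of all the per-column ranges.
complete = toy.dropna()
print(complete.index.min(), complete.index.max())
```

Applied to the real frame, the same check would show the late-starting and early-ending columns that determine the final range.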

**Interpolation:**

Interpolation estimates missing values from neighboring data points. Linear interpolation, used here, places each missing value on a straight line between the nearest known values before and after it.
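A small worked example, using the January and April 2001 per capita GDP values from the merged table above. The two trailing NaNs are added only to illustrate how pandas treats months after the last known value; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# January and April 2001 are known; the months in between are missing.
# The two trailing NaNs stand in for months after the last published quarter.
gdp = pd.Series(
    [49911.0, np.nan, np.nan, 50105.0, np.nan, np.nan],
    index=pd.date_range("2001-01-01", periods=6, freq="MS"),
)

# Default behaviour: gaps are filled linearly (rows treated as equally spaced),
# and NaNs after the last valid value are filled with that last value.
filled = gdp.interpolate()

# limit_area="inside" fills only gaps surrounded by valid values
# and leaves the trailing months as NaN.
inside_only = gdp.interpolate(limit_area="inside")

print(filled)
print(inside_only)
```

The step between known points is (50105 - 49911) / 3, so February comes out at about 49975.67 and March at about 50040.33. With the defaults, months after the last quarterly observation repeat that observation, which is why the 2024 rows in the notebook all show 67672. Whether that carry-forward is acceptable is a modelling choice; `limit_area="inside"` is one way to avoid it.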



In [42]:
# Filling missing values in the Per_Capita_GDP column using linear interpolation
df["Per_Capita_GDP"] = df["Per_Capita_GDP"].interpolate()

In [43]:
df

DATE,CSUSHPISA,Year,Month,Per_Capita_GDP,working_age_pop,Houses,CPI,UNRATE,EmpRate,Cons_Material,FEDFUNDS,urban pops,Num_Households,median_income,Subsidy,old pops
2001-01-01,109.846,2001,1,49911.000000,180348200,3.8,,4.2,73.96542,142.000,5.98,79.234,108209.0,66360.0,20.573,12.296945
2001-02-01,110.501,2001,2,49975.666667,180505900,3.7,,4.2,73.87383,142.400,5.49,79.234,108209.0,66360.0,20.573,12.296945
2001-03-01,111.108,2001,3,50040.333333,180595600,3.8,,4.3,73.90762,142.400,5.31,79.234,108209.0,66360.0,20.573,12.296945
2001-04-01,111.652,2001,4,50105.000000,180851800,3.9,,4.4,73.55553,142.500,4.80,79.234,108209.0,66360.0,20.573,12.296945
2001-05-01,112.164,2001,5,49994.666667,181013500,4.0,,4.3,73.39594,144.200,4.21,79.234,108209.0,66360.0,20.573,12.296945
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2023-12-01,314.443,2023,12,67619.000000,209117700,8.2,308.742,3.7,71.81763,327.644,5.33,83.298,131434.0,,,17.587920
2024-01-01,315.728,2024,1,67672.000000,208630800,8.3,309.685,3.7,72.01261,334.374,5.33,,,,,
2024-02-01,317.257,2024,2,67672.000000,208655500,8.7,311.054,3.9,71.88552,337.766,5.33,,,,,
2024-03-01,318.217,2024,3,67672.000000,208606600,8.2,312.230,3.8,72.00176,330.965,5.33,,,,,


In [44]:
df.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 280 entries, 2001-01-01 to 2024-04-01
Data columns (total 16 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   CSUSHPISA        280 non-null    float64
 1   Year             280 non-null    int32  
 2   Month            280 non-null    int32  
 3   Per_Capita_GDP   280 non-null    float64
 4   working_age_pop  280 non-null    int64  
 5   Houses           280 non-null    float64
 6   CPI              268 non-null    float64
 7   UNRATE           280 non-null    float64
 8   EmpRate          280 non-null    float64
 9   Cons_Material    280 non-null    float64
 10  FEDFUNDS         280 non-null    float64
 11  urban pops       276 non-null    float64
 12  Num_Households   276 non-null    float64
 13  median_income    264 non-null    float64
 14  Subsidy          264 non-null    float64
 15  old pops         276 non-null    float64
dtypes: float64(13), int32(2), int64(1)

In [45]:
df.dropna(inplace = True)

In [46]:
df.isna().sum()

CSUSHPISA          0
Year               0
Month              0
Per_Capita_GDP     0
working_age_pop    0
Houses             0
CPI                0
UNRATE             0
EmpRate            0
Cons_Material      0
FEDFUNDS           0
urban pops         0
Num_Households     0
median_income      0
Subsidy            0
old pops           0
dtype: int64

In [47]:
df

DATE,CSUSHPISA,Year,Month,Per_Capita_GDP,working_age_pop,Houses,CPI,UNRATE,EmpRate,Cons_Material,FEDFUNDS,urban pops,Num_Households,median_income,Subsidy,old pops
2002-01-01,117.143,2002,1,50091.000000,182669900,4.2,177.700,5.7,72.03249,142.000,1.73,79.409,109297.0,65820.0,24.183,12.287458
2002-02-01,117.844,2002,2,50156.000000,182822700,4.0,178.000,5.7,72.33837,142.200,1.74,79.409,109297.0,65820.0,24.183,12.287458
2002-03-01,118.687,2002,3,50221.000000,183078000,4.1,178.500,5.7,72.15660,143.200,1.73,79.409,109297.0,65820.0,24.183,12.287458
2002-04-01,119.611,2002,4,50286.000000,183316800,4.3,179.300,5.9,71.90249,143.500,1.75,79.409,109297.0,65820.0,24.183,12.287458
2002-05-01,120.724,2002,5,50311.333333,183463400,4.0,179.500,5.8,72.01910,143.800,1.75,79.409,109297.0,65820.0,24.183,12.287458
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2022-08-01,301.029,2022,8,65579.000000,207438500,8.6,295.209,3.6,71.47372,342.753,2.33,83.084,131202.0,74580.0,48.021,17.128121
2022-09-01,299.006,2022,9,65689.000000,207503400,9.9,296.341,3.5,71.43250,336.464,2.56,83.084,131202.0,74580.0,48.021,17.128121
2022-10-01,298.612,2022,10,65799.000000,207522800,9.7,297.863,3.6,71.29188,333.796,3.08,83.084,131202.0,74580.0,48.021,17.128121
2022-11-01,298.332,2022,11,65898.000000,207587800,9.2,298.648,3.6,71.30185,330.369,3.78,83.084,131202.0,74580.0,48.021,17.128121


In [48]:
print("Shape of the dataframe after preprocessing:- ", df.shape)

Shape of the dataframe after preprocessing:-  (252, 16)


This is our preprocessed dataset. Let's save it as "prepared_dataset.csv".
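When a saved file is read back, the `DATE` index arrives as plain text unless pandas is asked to parse it. A minimal sketch of the round trip, assuming a temporary file and two rows copied from the table above (the file name `toy_dataset.csv` is hypothetical):

```python
import os
import tempfile

import pandas as pd

# Two rows from the prepared dataset, indexed by date.
toy = pd.DataFrame(
    {"CSUSHPISA": [117.143, 117.844]},
    index=pd.DatetimeIndex(["2002-01-01", "2002-02-01"], name="DATE"),
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_dataset.csv")  # hypothetical file name
    toy.to_csv(path)

    # Without parse_dates, the index is read back as text.
    as_text = pd.read_csv(path).set_index("DATE")

    # index_col plus parse_dates restores a DatetimeIndex.
    as_dates = pd.read_csv(path, index_col="DATE", parse_dates=True)

print(type(as_text.index).__name__)
print(type(as_dates.index).__name__)
```

A parsed index matters if later steps rely on date-based slicing or resampling; for a plain tabular view, the text index works as well.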


In [49]:
df.to_csv("prepared_dataset.csv")

In [50]:
us_house_price_df = pd.read_csv("prepared_dataset.csv").set_index("DATE")
us_house_price_df.head()

DATE,CSUSHPISA,Year,Month,Per_Capita_GDP,working_age_pop,Houses,CPI,UNRATE,EmpRate,Cons_Material,FEDFUNDS,urban pops,Num_Households,median_income,Subsidy,old pops
2002-01-01,117.143,2002,1,50091.0,182669900,4.2,177.7,5.7,72.03249,142.0,1.73,79.409,109297.0,65820.0,24.183,12.287458
2002-02-01,117.844,2002,2,50156.0,182822700,4.0,178.0,5.7,72.33837,142.2,1.74,79.409,109297.0,65820.0,24.183,12.287458
2002-03-01,118.687,2002,3,50221.0,183078000,4.1,178.5,5.7,72.1566,143.2,1.73,79.409,109297.0,65820.0,24.183,12.287458
2002-04-01,119.611,2002,4,50286.0,183316800,4.3,179.3,5.9,71.90249,143.5,1.75,79.409,109297.0,65820.0,24.183,12.287458
2002-05-01,120.724,2002,5,50311.333333,183463400,4.0,179.5,5.8,72.0191,143.8,1.75,79.409,109297.0,65820.0,24.183,12.287458


In [51]:
df.to_csv(r"D:\_SANKET_DATA_SCIENCE_\Interview\Home.LLC\final.csv")