# BlueFin: Redfin Scraper

BlueFin is a companion tool for the unofficial Redfin API (https://github.com/reteps/redfin) that lets users look up houses and calculate and store their financial viability.

BlueFin stores information in two ways: the House class holds the details of a single house, and the Homes class holds data for many houses in pandas dataframes. The Homes class also provides REET (Real Estate Evaluation Tool) functionality.

In [3]:
from src.homes import Homes

In [4]:
house_list = Homes()  # create a new instance of the Homes class

In [5]:
house_name = input("Enter an address please")

The __add_info_from_address__ function first uses the unofficial Redfin API to find information on a house, then sorts that information into two dataframes.

The first dataframe, denoted as *house_data* within the Homes class, contains basic information about the house, from the number of beds to the square footage. It can be retrieved via __.get_data()__, and it is also what the __repr__ function prints.

The second dataframe, denoted as *reet*, contains the financial information. It assumes a 30-year fixed loan and closing costs of $6,000, collects the remaining inputs from the Redfin API, and calculates the derived values. It can be queried via __.get_reet()__.
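
As a rough illustration of the loan assumptions above, the following sketch computes a fixed-rate monthly payment with the standard amortization formula. It is not BlueFin's own code: the function name `monthly_mortgage_payment` is hypothetical, and financing the closing costs as part of the loan is an assumption inferred from the 5811 Potomac Ave NW row of the `get_reet()` output.

```python
def monthly_mortgage_payment(loan_amount: float, annual_rate_pct: float, term_years: int) -> float:
    """Fixed-rate monthly payment from the standard amortization formula."""
    n_payments = term_years * 12
    monthly_rate = annual_rate_pct / 100 / 12
    if monthly_rate == 0:
        # Zero-interest edge case: repay the principal in equal parts
        return loan_amount / n_payments
    return loan_amount * monthly_rate / (1 - (1 + monthly_rate) ** -n_payments)


# Inputs taken from the 5811 Potomac Ave NW row of the get_reet() output
purchase_price = 5969085.56
down_payment = purchase_price * 0.20  # down% of 20
closing_costs = 6000
# Assumption: closing costs are added to the financed amount
loan_amount = purchase_price - down_payment + closing_costs

payment = monthly_mortgage_payment(loan_amount, 7.176, 30)
print(round(loan_amount, 3), round(payment, 2))
```

With these inputs the result should land close to the `loan_amount` and `monthly_mortgage` columns for that house.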

In [None]:
house_list.add_info_from_address(house_name)

In [None]:
house_list.get_reet()

Unnamed: 0,address,purchase_price,down%,down_payment,closing_costs,loan_amount,loan_term,interest_rate,monthly_mortgage,monthly_hoa,property_tax_rate,monthly_property_tax,monthly_home_insurance,monthly_rent,annural_rent,monthly_expenses,monthly_cash_flow,annual_cash_flow,cap_rate,cash_on_cash_return
0,5811 Potomac Ave NW,5969085.56,20.0,1193817.112,6000,4781268.448,30,7.176,32377.042233,5.0,1.29,6416.766977,1193.817112,12701.0,152412.0,39992.626322,-27291.626322,-327499.515863,1.021324,-27.295786
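
The cash flow columns in the row above can be re-derived by hand. The sketch below is a worked example under stated assumptions, not BlueFin's implementation: it assumes monthly expenses are the sum of mortgage, HOA, property tax and insurance, and that the cash invested is the down payment plus closing costs. Those assumptions are inferred from the numbers in the row.

```python
# Values copied from the 5811 Potomac Ave NW row
monthly_rent = 12701.0
monthly_mortgage = 32377.042233
monthly_hoa = 5.0
monthly_property_tax = 6416.766977
monthly_home_insurance = 1193.817112
down_payment = 1193817.112
closing_costs = 6000

# Assumption: expenses are the sum of the four monthly cost columns
monthly_expenses = (
    monthly_mortgage + monthly_hoa + monthly_property_tax + monthly_home_insurance
)
monthly_cash_flow = monthly_rent - monthly_expenses
annual_cash_flow = monthly_cash_flow * 12

# Assumption: cash invested = down payment + closing costs
cash_on_cash_return = annual_cash_flow / (down_payment + closing_costs) * 100

print(round(monthly_expenses, 2), round(monthly_cash_flow, 2), round(cash_on_cash_return, 2))
```

A negative `cash_on_cash_return` means the expected rent does not cover the monthly costs under these loan terms.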


In [None]:
house_list.get_data()

Unnamed: 0,address,city,state,zip_code,beds,baths,zestimate,last_price,parking,parking_type,sq_ft,lot_size,home_type
0,5811 Potomac Ave NW,Washington,DC,20016,5,6.5,"$5,969,086 (+$3.4M since last sold)",2499900,Detached Garage,Detached,7357,7049,Stucco


In [None]:
house_list

               address        city state zip_code  beds  baths                            zestimate  last_price          parking parking_type  sq_ft lot_size home_type
0  5811 Potomac Ave NW  Washington    DC    20016     5    6.5  $5,969,086 (+$3.4M since last sold)     2499900  Detached Garage     Detached   7357    7,049    Stucco

__add_info_from_address__ does not add to either dataframe if the address doesn't exist or isn't available via the Redfin API. It outputs an empty House value instead.

In [None]:
house_list.add_info_from_address("123 Housie Lane")

None None, None None: None

In [None]:
house_array = ["4544 Radnor St, Detroit, Michigan", "184 W Viola St, Mountain House, California", "834 N Museo Dr, Mountain House, California", "2 Buttons Rd, Chapel Hill, NC", "5517 Barbee Chapel Rd", "217 Summerwalk Cir #0", "511 Hillsborough St", "120 Rolling Meadows Ln"]
for house in house_array:
    house_list.add_info_from_address(house)  # some houses aren't available via the API; the function skips these and continues


In [None]:
house_list.get_reet()

Unnamed: 0,address,purchase_price,down%,down_payment,closing_costs,loan_amount,loan_term,interest_rate,monthly_mortgage,monthly_hoa,property_tax_rate,monthly_property_tax,monthly_home_insurance,monthly_rent,annural_rent,monthly_expenses,monthly_cash_flow,annual_cash_flow,cap_rate,cash_on_cash_return
0,5811 Potomac Ave NW,5969085.56,20.0,1193817.112,6000,4781268.448,30,7.176,32377.042233,5.0,1.29,6416.766977,1193.817112,12701.0,152412.0,39992.626322,-27291.626322,-327499.515863,1.021324,-27.295786
1,4544 Radnor St,58317.38,20.0,11663.476,6000,52653.904,30,7.267,359.799777,0.0,1.29,62.691183,25.270865,0.0,0.0,447.761825,0.0,0.0,-1.64115,0.0
2,184 W Viola St,809959.33,20.0,161991.866,6000,653967.464,30,7.199,4438.611342,0.0,1.25,843.707635,215.989155,3089.0,37068.0,5498.308132,-2409.308132,-28911.697589,2.984418,-17.210177
3,834 N Museo Dr,908496.29,20.0,181699.258,6000,732797.032,30,7.188,4968.188414,0.0,1.25,946.350302,242.265677,3223.0,38676.0,6156.804393,-2933.804393,-35205.652718,2.669514,-18.756415
4,2 Buttons Rd,1300323.01,20.0,260064.602,6000,1046258.408,30,7.17,7080.643136,1.0,1.29,1397.847236,509.293179,2852.0,34224.0,8988.78355,-6136.78355,-73641.402604,0.867038,-27.678016
5,5517 Barbee Chapel Rd,159168.13,20.0,31833.626,6000,133334.504,30,7.17,902.352644,1.0,1.29,171.10574,62.340851,1635.0,19620.0,1136.799234,498.200766,5978.409187,10.175474,15.80184
6,511 Hillsborough St,745719.77,20.0,149143.954,6000,602575.816,30,7.182,4082.873691,330.0,1.29,801.648753,292.073577,3117.0,37404.0,5506.59602,-2389.59602,-28675.152244,2.703046,-18.482932


To return a dataframe containing only the houses that meet a specific criterion, you can use the following two functions. __.query_r__ checks a given column in __reet__ against a condition, given as a string. __.query_h__ checks a given column in __house_data__ against the condition.
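
To make the idea of a condition string concrete, here is a minimal sketch of how a string such as `">=0"` could be turned into a filter without using `eval`. It works on plain lists of dictionaries rather than on BlueFin's dataframes, and `filter_rows` is a hypothetical helper, not part of the Homes class.

```python
import operator
import re

# Comparison operators that a condition string may start with
_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
}


def filter_rows(rows, column, condition):
    """Keep the rows whose `column` value satisfies a condition such as '>=0'."""
    match = re.fullmatch(r"\s*(>=|<=|==|!=|>|<)\s*(-?\d+(?:\.\d+)?)\s*", condition)
    if match is None:
        raise ValueError(f"Unsupported condition: {condition!r}")
    compare = _OPS[match.group(1)]
    threshold = float(match.group(2))
    return [row for row in rows if compare(row[column], threshold)]


# A few values taken from the get_reet() output
rows = [
    {"address": "5811 Potomac Ave NW", "cash_on_cash_return": -27.295786},
    {"address": "4544 Radnor St", "cash_on_cash_return": 0.0},
    {"address": "5517 Barbee Chapel Rd", "cash_on_cash_return": 15.80184},
]
matches = filter_rows(rows, "cash_on_cash_return", ">=0")
print([row["address"] for row in matches])
```

Parsing the operator explicitly keeps the accepted conditions to simple numeric comparisons, which is usually safer than evaluating arbitrary strings.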

In [None]:
house_list.query_r('cash_on_cash_return', ">=0")  # returns the houses that break even or have a positive return

Unnamed: 0,address,purchase_price,down%,down_payment,closing_costs,loan_amount,loan_term,interest_rate,monthly_mortgage,monthly_hoa,property_tax_rate,monthly_property_tax,monthly_home_insurance,monthly_rent,annural_rent,monthly_expenses,monthly_cash_flow,annual_cash_flow,cap_rate,cash_on_cash_return
1,4544 Radnor St,58317.38,20.0,11663.476,6000,52653.904,30,7.267,359.799777,0.0,1.29,62.691183,25.270865,0.0,0.0,447.761825,0.0,0.0,-1.64115,0.0
5,5517 Barbee Chapel Rd,159168.13,20.0,31833.626,6000,133334.504,30,7.17,902.352644,1.0,1.29,171.10574,62.340851,1635.0,19620.0,1136.799234,498.200766,5978.409187,10.175474,15.80184


In [None]:
house_list.query_h('sq_ft', ">=2000")  # returns the houses that have at least 2000 square feet

Unnamed: 0,address,city,state,zip_code,beds,baths,zestimate,last_price,parking,parking_type,sq_ft,lot_size,home_type
0,5811 Potomac Ave NW,Washington,DC,20016,5,6.5,"$5,969,086 (+$3.4M since last sold)",2499900.0,Detached Garage,Detached,7357,7049,Stucco
3,834 N Museo Dr,Mountain House,CA,95391,3,2.5,"$908,496",,2,,2111,6637,New
4,2 Buttons Rd,Chapel Hill,NC,27514,5,3.0,"$1,300,323",,ROCKY RIDGE,3399,3594,30928,Hip


If you want to remove a specific address from a dataframe, you can do so via the __remove_house_reet__ or __remove_house_data__ functions. Each function takes an address and removes every occurrence of it from the corresponding dataframe. To remove a house from both dataframes, call the __remove_house__ function with the same input. If a house has been duplicated, you can use the __remove_duplicates__ function.
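
The sketch below illustrates the two operations on plain lists of dictionaries. It is not BlueFin's implementation, and the helper names are hypothetical. It removes duplicates by address rather than by whole row, on the assumption that a repeated lookup of the same house can return slightly different values (in the reet output in this document, the two 4544 Radnor St rows differ in `interest_rate`).

```python
def drop_duplicate_addresses(rows):
    """Keep only the first row seen for each address."""
    seen = set()
    unique = []
    for row in rows:
        if row["address"] not in seen:
            seen.add(row["address"])
            unique.append(row)
    return unique


def remove_address(rows, address):
    """Drop every row whose address matches exactly."""
    return [row for row in rows if row["address"] != address]


# Two lookups of the same house with different interest rates
rows = [
    {"address": "4544 Radnor St", "interest_rate": 7.267},
    {"address": "184 W Viola St", "interest_rate": 7.199},
    {"address": "4544 Radnor St", "interest_rate": 7.204},
]
deduped = drop_duplicate_addresses(rows)
remaining = remove_address(deduped, "184 W Viola St")
print(deduped)
print(remaining)
```

Comparing whole rows would treat the two 4544 Radnor St entries as distinct, so keying on the address is the more forgiving choice here.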

In [None]:
house_array = ["4544 Radnor St, Detroit, Michigan", "184 W Viola St, Mountain House, California", "834 N Museo Dr, Mountain House, California", "2 Buttons Rd, Chapel Hill, NC", "5517 Barbee Chapel Rd", "217 Summerwalk Cir #0", "511 Hillsborough St", "120 Rolling Meadows Ln"]
for house in house_array:
    house_list.add_info_from_address(house) #Imagine this snippet was called again


house_list.get_data()

Unnamed: 0,address,city,state,zip_code,beds,baths,zestimate,last_price,parking,parking_type,sq_ft,lot_size,home_type
0,5811 Potomac Ave NW,Washington,DC,20016,5,6.5,"$5,969,086 (+$3.4M since last sold)",2499900.0,Detached Garage,Detached,7357,7049.0,Stucco
1,4544 Radnor St,Detroit,MI,48224,4,2.0,"$58,317 (+$28K since last sold)",30000.0,1,Detached Garage,656,4356.0,Single Family
2,184 W Viola St,Mountain House,CA,95391,3,2.5,"$809,959",,2,,1957,3870.0,New
3,834 N Museo Dr,Mountain House,CA,95391,3,2.5,"$908,496",,2,,2111,6637.0,New
4,2 Buttons Rd,Chapel Hill,NC,27514,5,3.0,"$1,300,323",,ROCKY RIDGE,3399,3594,30928.0,Hip
5,5517 Barbee Chapel Rd,Chapel Hill,NC,27517,2,1.0,"$159,168",,Attached Carport,414,921,24786.0,Concrete Block
6,511 Hillsborough St,Chapel Hill,NC,27514,3,3.0,"$745,720 (+$295K since last sold)",450000.0,,,1780,,New
7,4544 Radnor St,Detroit,MI,48224,4,2.0,"$58,317 (+$28K since last sold)",30000.0,1,Detached Garage,656,4356.0,Single Family
8,184 W Viola St,Mountain House,CA,95391,3,2.5,"$809,959",,2,,1957,3870.0,New
9,834 N Museo Dr,Mountain House,CA,95391,3,2.5,"$908,496",,2,,2111,6637.0,New


In [None]:
house_list.remove_duplicates()
house_list.get_data()

Unnamed: 0,address,city,state,zip_code,beds,baths,zestimate,last_price,parking,parking_type,sq_ft,lot_size,home_type
0,5811 Potomac Ave NW,Washington,DC,20016,5,6.5,"$5,969,086 (+$3.4M since last sold)",2499900.0,Detached Garage,Detached,7357,7049.0,Stucco
1,4544 Radnor St,Detroit,MI,48224,4,2.0,"$58,317 (+$28K since last sold)",30000.0,1,Detached Garage,656,4356.0,Single Family
2,184 W Viola St,Mountain House,CA,95391,3,2.5,"$809,959",,2,,1957,3870.0,New
3,834 N Museo Dr,Mountain House,CA,95391,3,2.5,"$908,496",,2,,2111,6637.0,New
4,2 Buttons Rd,Chapel Hill,NC,27514,5,3.0,"$1,300,323",,ROCKY RIDGE,3399,3594,30928.0,Hip
5,5517 Barbee Chapel Rd,Chapel Hill,NC,27517,2,1.0,"$159,168",,Attached Carport,414,921,24786.0,Concrete Block
6,511 Hillsborough St,Chapel Hill,NC,27514,3,3.0,"$745,720 (+$295K since last sold)",450000.0,,,1780,,New


In [None]:
# To remove a specific house, pass its address to remove_house
house_list.remove_house('834 N Museo Dr')

In [None]:
house_list.get_reet()

Unnamed: 0,address,purchase_price,down%,down_payment,closing_costs,loan_amount,loan_term,interest_rate,monthly_mortgage,monthly_hoa,property_tax_rate,monthly_property_tax,monthly_home_insurance,monthly_rent,annural_rent,monthly_expenses,monthly_cash_flow,annual_cash_flow,cap_rate,cash_on_cash_return
0,5811 Potomac Ave NW,5969085.56,20.0,1193817.112,6000,4781268.448,30,7.176,32377.042233,5.0,1.29,6416.766977,1193.817112,12701.0,152412.0,39992.626322,-27291.626322,-327499.515863,1.021324,-27.295786
1,4544 Radnor St,58317.38,20.0,11663.476,6000,52653.904,30,7.267,359.799777,0.0,1.29,62.691183,25.270865,0.0,0.0,447.761825,0.0,0.0,-1.64115,0.0
2,184 W Viola St,809959.33,20.0,161991.866,6000,653967.464,30,7.199,4438.611342,0.0,1.25,843.707635,215.989155,3089.0,37068.0,5498.308132,-2409.308132,-28911.697589,2.984418,-17.210177
3,2 Buttons Rd,1300323.01,20.0,260064.602,6000,1046258.408,30,7.17,7080.643136,1.0,1.29,1397.847236,509.293179,2852.0,34224.0,8988.78355,-6136.78355,-73641.402604,0.867038,-27.678016
4,5517 Barbee Chapel Rd,159168.13,20.0,31833.626,6000,133334.504,30,7.17,902.352644,1.0,1.29,171.10574,62.340851,1635.0,19620.0,1136.799234,498.200766,5978.409187,10.175474,15.80184
5,511 Hillsborough St,745719.77,20.0,149143.954,6000,602575.816,30,7.182,4082.873691,330.0,1.29,801.648753,292.073577,3117.0,37404.0,5506.59602,-2389.59602,-28675.152244,2.703046,-18.482932
6,4544 Radnor St,58317.38,20.0,11663.476,6000,52653.904,30,7.204,357.55108,0.0,1.29,62.691183,25.270865,0.0,0.0,445.513129,0.0,0.0,-1.64115,0.0
7,184 W Viola St,809959.33,20.0,161991.866,6000,653967.464,30,7.188,4433.742818,0.0,1.25,843.707635,215.989155,3089.0,37068.0,5493.439608,-2404.439608,-28853.275296,2.984418,-17.1754
8,2 Buttons Rd,1300323.01,20.0,260064.602,6000,1046258.408,30,7.225,7119.594072,1.0,1.29,1397.847236,509.293179,2852.0,34224.0,9027.734487,-6175.734487,-74108.813845,0.867038,-27.853692
9,5517 Barbee Chapel Rd,159168.13,20.0,31833.626,6000,133334.504,30,7.182,903.434761,1.0,1.29,171.10574,62.340851,1635.0,19620.0,1137.881352,497.118648,5965.423777,10.175474,15.767518


In [None]:
house_list.add_info_from_address("34 Matrix Ct")
house_list

NameError: name 'house_list' is not defined
