# This Jupyter Notebook imports an Excel file with housing options

## and picks the best ones based on the data...

I was personally looking for a 1-bedroom or a studio for up to USD 1,500/month.

In addition, I wanted a place with good reviews.

Also, I only needed it for 3 months, so it wasn't that easy...

I expect to choose 5 places out of 150 :)

Let's start by importing our Excel file:

In [1]:
import pandas as pd
from IPython.display import display
houses = pd.read_excel('Housing.xlsx')
display(houses.head())

Unnamed: 0,Count,Name,S/1b/2b,$/month,Minimum Term,By Bike,By Bus,#buses,Reviews Google,Yelp Reviews,webpage,emailed?,called?,Love?,Comments
0,1,Heights Park Row,,,,22 minutes,1:30hr,2.0,Bad,,https://www.heightsparkrow.com/Marketing/Contact,Yes,No,No,
1,2,14220 Apartments at Park Row,,,,27 min,1:20hr,2.0,Terrible,,https://www.14220houstonapartments.com/houston...,No,No,No,Overcharged when moving out
2,3,The Legend at Park Ten Apartments,,,,31 min,1:40hr,2.0,Bad,,https://www.legendatparkten.com/?utm_source=Go...,No,No,No,Don’t allow other people to sleep in apparantl...
3,4,Park Place Houston,,,,,,,Terrible,,,No,No,No,"Crime, bugs"
4,5,Marquis on Park Row,,,,,,,Terrible,,https://www.cwsapartments.com/marquis-on-park-...,No,No,No,"Crime, roaches, no maintanence"


First, let's get rid of the NaNs and drop the Count column (which is quite useless here):

In [2]:
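# Drop the running counter; the DataFrame index already numbers the rows
del houses['Count']
# Unknown prices become 0 so the column stays numeric...
houses['$/month'] = houses['$/month'].fillna(0)
# ...and every other missing value becomes a '?' placeholder
houses = houses.fillna('?')
display(houses.head())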

Unnamed: 0,Name,S/1b/2b,$/month,Minimum Term,By Bike,By Bus,#buses,Reviews Google,Yelp Reviews,webpage,emailed?,called?,Love?,Comments
0,Heights Park Row,?,0,?,22 minutes,1:30hr,2,Bad,?,https://www.heightsparkrow.com/Marketing/Contact,Yes,No,No,?
1,14220 Apartments at Park Row,?,0,?,27 min,1:20hr,2,Terrible,?,https://www.14220houstonapartments.com/houston...,No,No,No,Overcharged when moving out
2,The Legend at Park Ten Apartments,?,0,?,31 min,1:40hr,2,Bad,?,https://www.legendatparkten.com/?utm_source=Go...,No,No,No,Don’t allow other people to sleep in apparantl...
3,Park Place Houston,?,0,?,?,?,?,Terrible,?,?,No,No,No,"Crime, bugs"
4,Marquis on Park Row,?,0,?,?,?,?,Terrible,?,https://www.cwsapartments.com/marquis-on-park-...,No,No,No,"Crime, roaches, no maintanence"


I think that's it so far for preprocessing :)

Let's now select the best candidates...

## Shortlist1: The best candidates as of right now

Idea1: I need a Minimum Term of 3 months or less.

Idea2: I need a price of at most $1,500/month.

Idea3: Reviews should be decent.

The first 2 ideas seem straightforward. How to go about the third? I used word labels for the reviews there. Let's see all the options:

In [3]:
houses['Reviews Google'].unique()

array(['Bad', 'Terrible', 'Ok', '?', 'Awesome', 'Good', 'The worst lol',
       'The worst', 'No Raiting', 'Amazing'], dtype=object)

My idea of decent would be:

Ok
Good
Awesome
Amazing

I also want to give a chance to:

?
No Raiting (raiting eh?)

In other words, I say 'No' to:

Bad
Terrible
The worst
The worst lol

In [4]:
good_rating = ['Ok', 'Good', '?', 'Awesome', 'No Raiting', 'Amazing']
shortlist1 = houses.loc[houses['Reviews Google'].isin(good_rating)]
del shortlist1['webpage']
del shortlist1['emailed?']
del shortlist1['called?']
display(shortlist1.head())

Unnamed: 0,Name,S/1b/2b,$/month,Minimum Term,By Bike,By Bus,#buses,Reviews Google,Yelp Reviews,Love?,Comments
7,Broadstone Energy Park,?,0,12 months,34 min,1:35hr,2,Ok,?,No,Roaches
10,Sunrise Briar Forest,?,1140,3 months,?,?,?,Ok,?,No,"Rude, roaches, spiders, lots of 5-star revies ..."
11,Vista Energy Corridor,1b,1500,3 months,?,?,?,?,?,?,?
13,Aliso Briar Forest,?,1907,3 months,?,?,?,Awesome,?,?,?
14,The Grand on Memorial,?,1105,3 months,?,?,?,?,?,?,?
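
The word classification above can also be kept in one lookup table instead of a hand-written list. This is only a minimal sketch: the numeric scores, the `RATING_SCORE` name and the `rating_is_acceptable` helper are my assumptions, not part of the notebook; only the labels come from the `Reviews Google` column.

```python
import pandas as pd

# Hypothetical scores; the labels are the ones seen in 'Reviews Google'
RATING_SCORE = {
    'The worst lol': 0, 'The worst': 0, 'Terrible': 1, 'Bad': 2,
    'Ok': 3, 'Good': 4, 'Awesome': 5, 'Amazing': 5,
}

def rating_is_acceptable(label, threshold=3):
    """Labels without a score ('?', 'No Raiting') get the benefit of the doubt."""
    score = RATING_SCORE.get(label)
    return score is None or score >= threshold

# Toy data, not the real spreadsheet ('No Raiting' is spelled as in the data)
toy = pd.DataFrame({
    'Name': ['A', 'B', 'C', 'D'],
    'Reviews Google': ['Bad', 'Ok', '?', 'No Raiting'],
})
kept = toy.loc[toy['Reviews Google'].map(rating_is_acceptable)]
print(kept['Name'].tolist())  # ['B', 'C', 'D']
```

With a table like this, making the shortlist stricter would just mean raising `threshold`.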


Now let's think about how we can select 3 months and below.

Let's actually see if there are values below 3 months:

In [5]:
shortlist1['Minimum Term'].unique()

array(['12 months', '3 months', '?', '1 month', '6 months', '4 months',
       '5 months', '3 Months', '10 months', '7 months'], dtype=object)

I think a simple regular expression will work very nicely here :)

In [6]:
import re

def min_term_processing(value):
    # Pull the leading number out of strings like '3 months' or '10 months'
    m = re.match(r"^(\d+)\s+", str(value))
    if m is None:
        # No usable number (e.g. '?'): assume a 12-month lease
        return 12
    return int(m.group(1))

# Work on an explicit copy so the assignment below doesn't trigger SettingWithCopyWarning
shortlist1 = shortlist1.copy()
shortlist1['Minimum Term'] = shortlist1['Minimum Term'].map(min_term_processing)
shortlist1 = shortlist1.loc[shortlist1['Minimum Term'] < 4]
display(shortlist1)

Unnamed: 0,Name,S/1b/2b,$/month,Minimum Term,By Bike,By Bus,#buses,Reviews Google,Yelp Reviews,Love?,Comments
10,Sunrise Briar Forest,?,1140,3,?,?,?,Ok,?,No,"Rude, roaches, spiders, lots of 5-star revies ..."
11,Vista Energy Corridor,1b,1500,3,?,?,?,?,?,?,?
13,Aliso Briar Forest,?,1907,3,?,?,?,Awesome,?,?,?
14,The Grand on Memorial,?,1105,3,?,?,?,?,?,?,?
16,7-Seventy,?,?,1,?,?,?,?,?,?,3 Months or even 1 month based on their webpage
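
As a quick sanity check, here is a minimal sketch of a slightly stricter parser, run against the exact values that `unique()` listed above. The helper name `parse_min_term` and its `default` parameter are hypothetical; the 12-month fallback for unparseable values mirrors the choice made in the notebook.

```python
import re

def parse_min_term(value, default=12):
    """Return the lease length in months, or `default` when none is found."""
    m = re.match(r"^\s*(\d+)\s*months?\b", str(value), flags=re.IGNORECASE)
    return int(m.group(1)) if m else default

# The values observed in the 'Minimum Term' column
observed = ['12 months', '3 months', '?', '1 month', '6 months', '4 months',
            '5 months', '3 Months', '10 months', '7 months']

parsed = {v: parse_min_term(v) for v in observed}
short_enough = [v for v, months in parsed.items() if months <= 3]
print(short_enough)  # ['3 months', '1 month', '3 Months']
```

Matching one or more digits (`\d+`) means two-digit terms such as '10 months' are parsed as 10 rather than falling through to the default.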


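Now let's get rid of the ones that are too expensive:

In [10]:
# Keep only rows whose price is an actual number (drops leftover '?' entries)
shortlist1 = shortlist1[pd.to_numeric(shortlist1['$/month'], errors='coerce').notnull()]
# Keep places that cost at most $1,500/month
shortlist1 = shortlist1.loc[shortlist1['$/month'] < 1501]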

Unnamed: 0,Name,S/1b/2b,$/month,Minimum Term,By Bike,By Bus,#buses,Reviews Google,Yelp Reviews,Love?,Comments
10,Sunrise Briar Forest,?,1140,3,?,?,?,Ok,?,No,"Rude, roaches, spiders, lots of 5-star revies ..."
11,Vista Energy Corridor,1b,1500,3,?,?,?,?,?,?,?
14,The Grand on Memorial,?,1105,3,?,?,?,?,?,?,?
36,The Retreat at Eldridge Apartments,1b,1270,3,?,?,?,Ok,?,?,?
48,Mandalay at Shadow Lake,1b,1153,3,36 min,?,?,Good,?,Yes,?
61,Plaza at Westchase,?,1440,3,?,?,?,Good,?,?,"So its 1,400 for 2 months, but for 3 months it..."
64,Marquis at Westchase,?,1384,3,?,?,?,Awesome,?,?,?
78,District at Memorial,1b,1350,3,?,?,?,?,?,Yes,?
80,Creekstone,?,1365,3,?,?,?,Awesome,?,?,?
93,Walden of Westchase,2b,1102,3,?,?,?,Ok,?,No,Amenities are horrible tbh
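
One thing to watch: missing prices were filled with 0 during preprocessing, and a 0 would happily pass a `< 1501` check. Below is a minimal sketch of a filter that treats both 0 and '?' as "price unknown"; the toy data and the variable names are made up for illustration, and whether you want to drop or keep the unknowns is your call.

```python
import pandas as pd

# Toy data, not the real spreadsheet
toy = pd.DataFrame({
    'Name': ['A', 'B', 'C', 'D'],
    '$/month': [1140, '?', 1907, 0],   # 0 stands in for "price unknown"
})

price = pd.to_numeric(toy['$/month'], errors='coerce')  # '?' becomes NaN
affordable = toy.loc[price.between(1, 1500)]            # NaN and 0 fall outside the range
unknown = toy.loc[price.isna() | (price == 0)]          # candidates worth a phone call

print(affordable['Name'].tolist())  # ['A']
print(unknown['Name'].tolist())     # ['B', 'D']
```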


## AND here we are! Shortlist #1, Ladies and Gentlemen :)

In [13]:
display(shortlist1.sort_values(['Love?', '$/month'], ascending=[False, True]))

Unnamed: 0,Name,S/1b/2b,$/month,Minimum Term,By Bike,By Bus,#buses,Reviews Google,Yelp Reviews,Love?,Comments
48,Mandalay at Shadow Lake,1b,1153,3,36 min,?,?,Good,?,Yes,?
78,District at Memorial,1b,1350,3,?,?,?,?,?,Yes,?
101,The Lodge at Spring Shadows,1b,1381,3,?,?,?,Good,?,Yes,?
93,Walden of Westchase,2b,1102,3,?,?,?,Ok,?,No,Amenities are horrible tbh
10,Sunrise Briar Forest,?,1140,3,?,?,?,Ok,?,No,"Rude, roaches, spiders, lots of 5-star revies ..."
118,Zocalo,?,986,3,?,?,?,Awesome,?,?,Idk man the reviews arefrom the guys with 0 re...
14,The Grand on Memorial,?,1105,3,?,?,?,?,?,?,?
36,The Retreat at Eldridge Apartments,1b,1270,3,?,?,?,Ok,?,?,?
80,Creekstone,?,1365,3,?,?,?,Awesome,?,?,?
64,Marquis at Westchase,?,1384,3,?,?,?,Awesome,?,?,?
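
The order above (Yes, then No, then ?) comes from plain string comparison, since `'Y' > 'N' > '?'` when sorting in descending order. If you'd rather rank undecided places above rejected ones, an ordered categorical makes the preference explicit. This is a sketch on toy data; the `order` list is my assumption about what a sensible ranking could be.

```python
import pandas as pd

# Toy data, not the real spreadsheet
toy = pd.DataFrame({
    'Name': ['A', 'B', 'C', 'D'],
    '$/month': [1384, 1153, 1102, 986],
    'Love?': ['?', 'Yes', 'No', '?'],
})

# Hypothetical preference: favourites first, undecided next, rejected last
order = ['Yes', '?', 'No']
toy['Love?'] = pd.Categorical(toy['Love?'], categories=order, ordered=True)

# Categoricals sort by their declared order; ties are broken by price
ranked = toy.sort_values(['Love?', '$/month'], ascending=[True, True])
print(ranked['Name'].tolist())  # ['B', 'D', 'A', 'C']
```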


## Shortlist2: potential candidates

This list includes the places that have a really nice rating but no information about the minimum lease length.

So, potentially, one of them could be your new home; you just need to call them :)

Idea1: Minimum Term is '?'

Idea2: Rating is 'Good' or 'Awesome' or 'Amazing' (and maybe '?' too :) )

In [16]:
shortlist2 = houses.copy()
shortlist2 = shortlist2.loc[shortlist2['Minimum Term'] == '?']
perfect_rating = ['Good', '?', 'Awesome', 'Amazing']
shortlist2 = shortlist2.loc[shortlist2['Reviews Google'].isin(perfect_rating)]
shortlist2 = shortlist2.loc[shortlist2['Name'] != '?']
del shortlist2['webpage']
display(shortlist2)

Unnamed: 0,Name,S/1b/2b,$/month,Minimum Term,By Bike,By Bus,#buses,Reviews Google,Yelp Reviews,emailed?,called?,Love?,Comments
15,Broadstone Memorial,?,0,?,?,?,?,?,?,Yes,?,?,?
54,Ashford Briar Point,?,0,?,?,?,?,Awesome,?,Yes,?,Yes,?
68,Westchase Forest,?,0,?,?,?,?,Good,?,Yes,?,?,?
70,Upland Park Townhomes,?,0,?,?,?,?,?,?,?,?,?,?
71,Memorial Fountain,?,0,?,?,?,?,Good,?,Yes,No,Yes,Klie's recommendation
116,Laguna Vista,?,0,?,?,?,?,Good,?,Yes,?,?,?
127,Alexan Enclave,?,0,?,?,?,?,Good,?,Yes,?,Yes,?
131,Arrabella,?,0,?,?,?,?,Good,?,?,?,Yes,?
133,Jackson Hill Apartments,?,0,?,?,?,?,Awesome,?,Yes,?,Yes,?
134,Fairmont Museum District,?,0,?,?,?,?,Awesome,?,Yes,?,Yes,?
