<a href="https://colab.research.google.com/github/jsmazorra/DS-Unit-2-Linear-Models/blob/master/module2-regression-2/Johan_Mazorra_LS_DS13_212_Assignment.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Lambda School Data Science

*Unit 2, Sprint 1, Module 2*

---

# Regression 2

## Assignment

You'll continue to **predict how much it costs to rent an apartment in NYC,** using the dataset from renthop.com.

- [ ] Do train/test split. Use data from April & May 2016 to train. Use data from June 2016 to test.
- [ ] Engineer at least two new features. (See below for explanation & ideas.)
- [ ] Fit a linear regression model with at least two features.
- [ ] Get the model's coefficients and intercept.
- [ ] Get regression metrics RMSE, MAE, and $R^2$, for both the train and test data.
- [ ] What's the best test MAE you can get? Share your score and features used with your cohort on Slack!
- [ ] As always, commit your notebook to your fork of the GitHub repo.
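
For reference, the three metrics in the checklist can be computed directly from their definitions. The sketch below uses invented numbers (not the RentHop data) and a hypothetical helper named `regression_metrics`; `sklearn.metrics` offers equivalent functions.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return (RMSE, MAE, R^2) computed from their definitions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_true - y_pred
    rmse = np.sqrt(np.mean(errors ** 2))            # root of the mean squared error
    mae = np.mean(np.abs(errors))                   # mean absolute error
    ss_res = np.sum(errors ** 2)                    # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    r2 = 1 - ss_res / ss_tot
    return rmse, mae, r2

# Toy rents and predictions, invented for illustration only.
rmse, mae, r2 = regression_metrics([2000, 3000, 4000], [2100, 2900, 4300])
print(f'RMSE={rmse:.2f}  MAE={mae:.2f}  R^2={r2:.3f}')
```

RMSE and MAE are in the target's units (dollars of rent here), while $R^2$ is unitless and compares the model against always predicting the mean.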


#### [Feature Engineering](https://en.wikipedia.org/wiki/Feature_engineering)

> "Some machine learning projects succeed and some fail. What makes the difference? Easily the most important factor is the features used." — Pedro Domingos, ["A Few Useful Things to Know about Machine Learning"](https://homes.cs.washington.edu/~pedrod/papers/cacm12.pdf)

> "Coming up with features is difficult, time-consuming, requires expert knowledge. 'Applied machine learning' is basically feature engineering." — Andrew Ng, [Machine Learning and AI via Brain simulations](https://forum.stanford.edu/events/2011/2011slides/plenary/2011plenaryNg.pdf) 

> Feature engineering is the process of using domain knowledge of the data to create features that make machine learning algorithms work. 

#### Feature Ideas
- Does the apartment have a description?
- How long is the description?
- How many total perks does each apartment have?
- Are cats _or_ dogs allowed?
- Are cats _and_ dogs allowed?
- Total number of rooms (beds + baths)
- Ratio of beds to baths
- What's the neighborhood, based on address or latitude & longitude?
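
Several of these ideas take only a line or two of pandas each. The sketch below runs on a tiny invented frame whose column names follow the RentHop data; the new column names (`has_description`, `description_length`, `cats_or_dogs`, `cats_and_dogs`, `total_rooms`) are illustrative choices, not part of the dataset.

```python
import pandas as pd

# Toy listings, invented for illustration; column names follow the RentHop data.
listings = pd.DataFrame({
    'description': ['Sunny one bedroom near the park', None, '   '],
    'bedrooms': [1, 0, 2],
    'bathrooms': [1.0, 1.0, 1.5],
    'cats_allowed': [1, 0, 1],
    'dogs_allowed': [0, 0, 1],
})

# Treat missing or whitespace-only descriptions as "no description".
description = listings['description'].fillna('').str.strip()
listings['has_description'] = (description != '').astype(int)
listings['description_length'] = description.str.len()

# Pets: either species allowed vs. both species allowed.
cats = listings['cats_allowed'] == 1
dogs = listings['dogs_allowed'] == 1
listings['cats_or_dogs'] = (cats | dogs).astype(int)
listings['cats_and_dogs'] = (cats & dogs).astype(int)

# Total number of rooms (beds + baths).
listings['total_rooms'] = listings['bedrooms'] + listings['bathrooms']
```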

## Stretch Goals
- [ ] If you want more math, skim [_An Introduction to Statistical Learning_](http://faculty.marshall.usc.edu/gareth-james/ISL/ISLR%20Seventh%20Printing.pdf), Chapter 3.1, Simple Linear Regression, & Chapter 3.2, Multiple Linear Regression
- [ ] If you want more introduction, watch [Brandon Foltz, Statistics 101: Simple Linear Regression](https://www.youtube.com/watch?v=ZkjP5RJLQF4) (20 minutes, over 1 million views)
- [ ] Add your own stretch goal(s)!

In [0]:
%%capture
import sys

# If you're on Colab:
if 'google.colab' in sys.modules:
    DATA_PATH = 'https://raw.githubusercontent.com/LambdaSchool/DS-Unit-2-Applied-Modeling/master/data/'
    !pip install category_encoders==2.*

# If you're working locally:
else:
    DATA_PATH = '../data/'
    
# Ignore this Numpy warning when using Plotly Express:
# FutureWarning: Method .ptp is deprecated and will be removed in a future version. Use numpy.ptp instead.
import warnings
warnings.filterwarnings(action='ignore', category=FutureWarning, module='numpy')

In [0]:
import numpy as np
import pandas as pd

# Read New York City apartment rental listing data
df = pd.read_csv(DATA_PATH+'apartments/renthop-nyc.csv')
assert df.shape == (49352, 34)

# Remove the most extreme 1% prices,
# the most extreme .1% latitudes, &
# the most extreme .1% longitudes
df = df[(df['price'] >= np.percentile(df['price'], 0.5)) & 
        (df['price'] <= np.percentile(df['price'], 99.5)) & 
        (df['latitude'] >= np.percentile(df['latitude'], 0.05)) & 
        (df['latitude'] < np.percentile(df['latitude'], 99.95)) &
        (df['longitude'] >= np.percentile(df['longitude'], 0.05)) & 
        (df['longitude'] <= np.percentile(df['longitude'], 99.95))]
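
The filter above keeps rows whose value lies between two percentiles of each column. The same idea can be written as a reusable mask; the helper name `within_percentiles` and the toy prices below are assumptions for illustration. Note that `Series.between` is inclusive at both ends by default, whereas the upper latitude bound above uses a strict `<`.

```python
import numpy as np
import pandas as pd

def within_percentiles(series, lower, upper):
    """Boolean mask: True where values lie within the [lower, upper] percentiles."""
    lo, hi = np.percentile(series, [lower, upper])
    return series.between(lo, hi)  # inclusive on both ends

# Toy prices, invented for illustration: one very low and one very high outlier.
prices = pd.Series([100] + [3000] * 998 + [1_000_000])
mask = within_percentiles(prices, 0.5, 99.5)
trimmed = prices[mask]
print(len(prices), '->', len(trimmed))
```

Masks for several columns can be combined with `&`, as in the cell above.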

In [3]:
print(df.shape)
df.head()

(48817, 34)


Unnamed: 0,bathrooms,bedrooms,created,description,display_address,latitude,longitude,price,street_address,interest_level,elevator,cats_allowed,hardwood_floors,dogs_allowed,doorman,dishwasher,no_fee,laundry_in_building,fitness_center,pre-war,laundry_in_unit,roof_deck,outdoor_space,dining_room,high_speed_internet,balcony,swimming_pool,new_construction,terrace,exclusive,loft,garden_patio,wheelchair_access,common_outdoor_space
0,1.5,3,2016-06-24 07:54:24,A Brand New 3 Bedroom 1.5 bath ApartmentEnjoy ...,Metropolitan Avenue,40.7145,-73.9425,3000,792 Metropolitan Avenue,medium,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,1.0,2,2016-06-12 12:19:27,,Columbus Avenue,40.7947,-73.9667,5465,808 Columbus Avenue,low,1,1,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,1.0,1,2016-04-17 03:26:41,"Top Top West Village location, beautiful Pre-w...",W 13 Street,40.7388,-74.0018,2850,241 W 13 Street,high,0,0,1,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,1.0,1,2016-04-18 02:22:02,Building Amenities - Garage - Garden - fitness...,East 49th Street,40.7539,-73.9677,3275,333 East 49th Street,low,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,1.0,4,2016-04-28 01:32:41,Beautifully renovated 3 bedroom flex 4 bedroom...,West 143rd Street,40.8241,-73.9493,3350,500 West 143rd Street,low,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [4]:
df.describe()

Unnamed: 0,bathrooms,bedrooms,latitude,longitude,price,elevator,cats_allowed,hardwood_floors,dogs_allowed,doorman,dishwasher,no_fee,laundry_in_building,fitness_center,pre-war,laundry_in_unit,roof_deck,outdoor_space,dining_room,high_speed_internet,balcony,swimming_pool,new_construction,terrace,exclusive,loft,garden_patio,wheelchair_access,common_outdoor_space
count,48817.0,48817.0,48817.0,48817.0,48817.0,48817.0,48817.0,48817.0,48817.0,48817.0,48817.0,48817.0,48817.0,48817.0,48817.0,48817.0,48817.0,48817.0,48817.0,48817.0,48817.0,48817.0,48817.0,48817.0,48817.0,48817.0,48817.0,48817.0,48817.0
mean,1.201794,1.537149,40.75076,-73.97276,3579.585247,0.524838,0.478276,0.478276,0.447631,0.424852,0.415081,0.367085,0.052769,0.268452,0.185653,0.175902,0.132761,0.138394,0.102833,0.087203,0.060471,0.055206,0.051908,0.046193,0.043305,0.042711,0.039331,0.027224,0.026241
std,0.470711,1.106087,0.038954,0.028883,1762.430772,0.499388,0.499533,0.499533,0.497255,0.494326,0.492741,0.482015,0.223573,0.443158,0.38883,0.380741,0.33932,0.345317,0.303744,0.282136,0.238359,0.228385,0.221844,0.209905,0.203544,0.202206,0.194382,0.162738,0.159852
min,0.0,0.0,40.5757,-74.0873,1375.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
25%,1.0,1.0,40.7283,-73.9918,2500.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
50%,1.0,1.0,40.7517,-73.978,3150.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
75%,1.0,2.0,40.774,-73.955,4095.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
max,10.0,8.0,40.9894,-73.7001,15500.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0


In [0]:
# I want to separate the data by date, so let me check the dtypes and convert `created` accordingly.

df.dtypes

bathrooms               float64
bedrooms                  int64
created                  object
description              object
display_address          object
latitude                float64
longitude               float64
price                     int64
street_address           object
interest_level           object
elevator                  int64
cats_allowed              int64
hardwood_floors           int64
dogs_allowed              int64
doorman                   int64
dishwasher                int64
no_fee                    int64
laundry_in_building       int64
fitness_center            int64
pre-war                   int64
laundry_in_unit           int64
roof_deck                 int64
outdoor_space             int64
dining_room               int64
high_speed_internet       int64
balcony                   int64
swimming_pool             int64
new_construction          int64
terrace                   int64
exclusive                 int64
loft                      int64
garden_p

In [0]:
df['created'] = pd.to_datetime(df['created'], infer_datetime_format=True)

In [30]:
# Okay, now `created` is in datetime format.
df.dtypes

bathrooms                        float64
bedrooms                           int64
created                   datetime64[ns]
description                       object
display_address                   object
latitude                         float64
longitude                        float64
price                              int64
street_address                    object
interest_level                    object
elevator                           int64
cats_allowed                       int64
hardwood_floors                    int64
dogs_allowed                       int64
doorman                            int64
dishwasher                         int64
no_fee                             int64
laundry_in_building                int64
fitness_center                     int64
pre-war                            int64
laundry_in_unit                    int64
roof_deck                          int64
outdoor_space                      int64
dining_room                        int64
high_speed_inter

In [6]:
df.head()

Unnamed: 0,bathrooms,bedrooms,created,description,display_address,latitude,longitude,price,street_address,interest_level,elevator,cats_allowed,hardwood_floors,dogs_allowed,doorman,dishwasher,no_fee,laundry_in_building,fitness_center,pre-war,laundry_in_unit,roof_deck,outdoor_space,dining_room,high_speed_internet,balcony,swimming_pool,new_construction,terrace,exclusive,loft,garden_patio,wheelchair_access,common_outdoor_space
0,1.5,3,2016-06-24 07:54:24,A Brand New 3 Bedroom 1.5 bath ApartmentEnjoy ...,Metropolitan Avenue,40.7145,-73.9425,3000,792 Metropolitan Avenue,medium,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,1.0,2,2016-06-12 12:19:27,,Columbus Avenue,40.7947,-73.9667,5465,808 Columbus Avenue,low,1,1,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,1.0,1,2016-04-17 03:26:41,"Top Top West Village location, beautiful Pre-w...",W 13 Street,40.7388,-74.0018,2850,241 W 13 Street,high,0,0,1,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,1.0,1,2016-04-18 02:22:02,Building Amenities - Garage - Garden - fitness...,East 49th Street,40.7539,-73.9677,3275,333 East 49th Street,low,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,1.0,4,2016-04-28 01:32:41,Beautifully renovated 3 bedroom flex 4 bedroom...,West 143rd Street,40.8241,-73.9493,3350,500 West 143rd Street,low,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [7]:
splitdate = '2016-06-01'
splitdate = pd.to_datetime(splitdate, infer_datetime_format=True)

splitdate

Timestamp('2016-06-01 00:00:00')

In [0]:
# Now, I'm going to make two new features

df['total_features'] = df['elevator'] + df['cats_allowed'] + df['dogs_allowed'] + df['dishwasher'] + df['laundry_in_unit'] + df['dining_room'] + df['high_speed_internet']
df['bathrooms_per_bedrooms'] = df['bathrooms'] / df['bedrooms']

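def replace_nans_and_infs(inp):
  # 0 baths / 0 beds gives NaN: treat the ratio as 0.
  if np.isnan(inp):
    return 0
  # baths / 0 beds gives inf: use 6 as a stand-in, larger than any finite ratio in the data (the maximum is 5.0).
  if inp == np.inf:
    return 6
  return inp
df['bathrooms_per_bedrooms'] = df['bathrooms_per_bedrooms'].apply(replace_nans_and_infs)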

In [9]:
df['bathrooms_per_bedrooms'].value_counts(dropna=False)

1.000000    19014
0.500000    12240
6.000000     9166
0.333333     3603
0.666667     2772
0.250000      366
0.750000      343
0.000000      304
2.000000      206
1.500000      188
0.400000      129
0.833333      128
1.250000       83
0.375000       61
0.625000       38
0.600000       35
0.800000       31
0.875000       27
1.166667       24
3.000000       14
1.333333       12
0.200000       10
1.125000        7
0.300000        5
0.700000        3
2.500000        2
1.750000        1
0.416667        1
4.500000        1
5.000000        1
0.166667        1
0.428571        1
Name: bathrooms_per_bedrooms, dtype: int64
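
In the counts above, 9166 listings carry the placeholder value 6.0, which is every listing with zero bedrooms and at least a fraction of a bathroom. A linear model reads that placeholder as a real ratio larger than any other in the data. One alternative, sketched here on invented rows, is to leave undefined ratios at 0 and flag zero-bedroom listings in a separate column; the column name `is_studio` is hypothetical.

```python
import numpy as np
import pandas as pd

# Toy rows, invented for illustration.
toy = pd.DataFrame({'bathrooms': [1.0, 1.0, 0.0, 2.0],
                    'bedrooms':  [2,   0,   0,   4]})

# Divide only where bedrooms > 0; undefined ratios become NaN instead of inf.
ratio = toy['bathrooms'] / toy['bedrooms'].where(toy['bedrooms'] > 0)
toy['bathrooms_per_bedrooms'] = ratio.fillna(0.0)

# Hypothetical indicator so the model can give zero-bedroom listings their own offset.
toy['is_studio'] = (toy['bedrooms'] == 0).astype(int)
```

Whether this helps the test MAE on the real data is an open question that would need to be checked by refitting.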

In [10]:
# Next, let's separate the training and testing data.
train = df[df['created'] < splitdate]
test = df[df['created'] >= splitdate]

print(train['created'].dt.month.value_counts())
print(test['created'].dt.month.value_counts())

4    16217
5    15627
Name: created, dtype: int64
6    16973
Name: created, dtype: int64
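
Splitting on time rather than at random means the model is scored on listings created after everything it was trained on. A minimal sketch of the same cutoff logic on invented rows (the names `toy`, `toy_train`, `toy_test`, and `cutoff` are illustrative) shows where the boundary rows land:

```python
import pandas as pd

# Toy listings, invented for illustration.
toy = pd.DataFrame({
    'created': pd.to_datetime(['2016-04-30 23:59:59', '2016-05-31 23:59:59',
                               '2016-06-01 00:00:00', '2016-06-15 12:00:00']),
    'price': [2500, 3000, 3200, 4100],
})

cutoff = pd.Timestamp('2016-06-01')
toy_train = toy[toy['created'] < cutoff]    # April and May
toy_test = toy[toy['created'] >= cutoff]    # June onward

# Every training row should predate every test row.
assert toy_train['created'].max() < toy_test['created'].min()
print(sorted(toy_train['created'].dt.month.unique()),
      sorted(toy_test['created'].dt.month.unique()))
```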


In [11]:
train.head()

Unnamed: 0,bathrooms,bedrooms,created,description,display_address,latitude,longitude,price,street_address,interest_level,elevator,cats_allowed,hardwood_floors,dogs_allowed,doorman,dishwasher,no_fee,laundry_in_building,fitness_center,pre-war,laundry_in_unit,roof_deck,outdoor_space,dining_room,high_speed_internet,balcony,swimming_pool,new_construction,terrace,exclusive,loft,garden_patio,wheelchair_access,common_outdoor_space,total_features,bathrooms_per_bedrooms
2,1.0,1,2016-04-17 03:26:41,"Top Top West Village location, beautiful Pre-w...",W 13 Street,40.7388,-74.0018,2850,241 W 13 Street,high,0,0,1,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1.0
3,1.0,1,2016-04-18 02:22:02,Building Amenities - Garage - Garden - fitness...,East 49th Street,40.7539,-73.9677,3275,333 East 49th Street,low,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1.0
4,1.0,4,2016-04-28 01:32:41,Beautifully renovated 3 bedroom flex 4 bedroom...,West 143rd Street,40.8241,-73.9493,3350,500 West 143rd Street,low,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.25
5,2.0,4,2016-04-19 04:24:47,,West 18th Street,40.7429,-74.0028,7995,350 West 18th Street,medium,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.5
6,1.0,2,2016-04-27 03:19:56,Stunning unit with a great location and lots o...,West 107th Street,40.8012,-73.966,3600,210 West 107th Street,low,0,1,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0.5


In [12]:
test.head()

Unnamed: 0,bathrooms,bedrooms,created,description,display_address,latitude,longitude,price,street_address,interest_level,elevator,cats_allowed,hardwood_floors,dogs_allowed,doorman,dishwasher,no_fee,laundry_in_building,fitness_center,pre-war,laundry_in_unit,roof_deck,outdoor_space,dining_room,high_speed_internet,balcony,swimming_pool,new_construction,terrace,exclusive,loft,garden_patio,wheelchair_access,common_outdoor_space,total_features,bathrooms_per_bedrooms
0,1.5,3,2016-06-24 07:54:24,A Brand New 3 Bedroom 1.5 bath ApartmentEnjoy ...,Metropolitan Avenue,40.7145,-73.9425,3000,792 Metropolitan Avenue,medium,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.5
1,1.0,2,2016-06-12 12:19:27,,Columbus Avenue,40.7947,-73.9667,5465,808 Columbus Avenue,low,1,1,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,0.5
11,1.0,1,2016-06-03 03:21:22,Check out this one bedroom apartment in a grea...,W. 173rd Street,40.8448,-73.9396,1675,644 W. 173rd Street,low,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1.0
14,1.0,1,2016-06-01 03:11:01,Spacious 1-Bedroom to fit King-sized bed comfo...,East 56th St..,40.7584,-73.9648,3050,315 East 56th St..,low,1,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,1.0
24,2.0,4,2016-06-07 04:39:56,SPRAWLING 2 BEDROOM FOUND! ENJOY THE LUXURY OF...,W 18 St.,40.7391,-73.9936,7400,30 W 18 St.,medium,1,1,1,1,1,1,0,0,1,0,0,0,1,0,1,1,0,0,1,0,0,0,0,0,5,0.5


In [20]:
# Now let's do the linear regression on the two features that were just made.
# Import the necessary libraries
import itertools
import plotly.express as px
import plotly.graph_objects as go
from sklearn.linear_model import LinearRegression

# 3D scatter: the two engineered features on x and y, price on z.
fig = px.scatter_3d(train, x='total_features', y='bathrooms_per_bedrooms', z='price', color='total_features', title='Apartment Rent by Bathroom to Bedroom Ratio and Number of Features')

# Fit the linear regression
model = LinearRegression()
model.fit(train[['total_features', 'bathrooms_per_bedrooms']], train[['price']])

# Build a 100 x 100 grid spanning the observed range of both features.
xmin, xmax = train['total_features'].min(), train['total_features'].max()
ymin, ymax = train['bathrooms_per_bedrooms'].min(), train['bathrooms_per_bedrooms'].max()
xcoords = np.linspace(xmin, xmax, 100)
ycoords = np.linspace(ymin, ymax, 100)
coords = list(itertools.product(xcoords, ycoords))

# Predict the price at every grid point.
pred = model.predict(coords)

# go.Surface expects z[i][j] at (x[j], y[i]), hence the transpose.
Z = pred.reshape(100, 100).T
fig.add_trace(go.Surface(x=xcoords, y=ycoords, z=Z))

In [0]:
# Now let's print the coefficients and intercept.
# coef_ follows the column order passed to fit: total_features, then bathrooms_per_bedrooms.

x_coef, y_coef = model.coef_[0]
print(f"The coefficient for 'Total Features' is {x_coef:.2f}")
print(f"The coefficient for 'Bathrooms per Bedroom' is {y_coef:.2f}")
print(f"The intercept is {model.intercept_[0]:.2f}")

The coefficient for 'Total Features' is 301.94
The coefficient for 'Bathrooms per Bedroom' is -242.89
The intercept is 3324.86
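
Because `coef_` is ordered like the columns passed to `fit`, pairing names and values programmatically avoids mislabeling a coefficient. A minimal sketch on synthetic data with known coefficients (the names `a`, `b`, `features`, and `toy_model` are illustrative):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic data with known coefficients: y = 10 + 3*a - 2*b (no noise).
rng = np.random.default_rng(0)
X = pd.DataFrame({'a': rng.normal(size=50), 'b': rng.normal(size=50)})
y = 10 + 3 * X['a'] - 2 * X['b']

features = ['a', 'b']
toy_model = LinearRegression().fit(X[features], y)

# coef_ follows the column order passed to fit, so zip keeps names and values aligned.
coefficients = dict(zip(features, toy_model.coef_))
for name, value in coefficients.items():
    print(f'{name}: {value:.2f}')
print(f'intercept: {toy_model.intercept_:.2f}')
```

Here the target is a 1-D Series, so `coef_` is 1-D and `intercept_` is a scalar; fitting on a one-column DataFrame target, as in the cell above, adds an outer dimension to both.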


In [23]:
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Test the predictions against the training data.
pred = model.predict(train[['total_features', 'bathrooms_per_bedrooms']])
rmse = np.sqrt(mean_squared_error(train['price'], pred))
mae = mean_absolute_error(train['price'], pred)
r2 = r2_score(train['price'], pred)
print('Training Data:')
print('\tRoot Mean Squared Error:',rmse)
print('\tMean Absolute Error:',mae)
print('\tR^2:',r2)

print()

pred = model.predict(test[['total_features', 'bathrooms_per_bedrooms']])
rmse = np.sqrt(mean_squared_error(test['price'], pred))
mae = mean_absolute_error(test['price'], pred)
r2 = r2_score(test['price'], pred)
print('Testing Data:')
print('\tRoot Mean Squared Error:',rmse)
print('\tMean Absolute Error:',mae)
print('\tR^2:',r2)

Training Data:
	Root Mean Squared Error: 4154.180257381969
	Mean Absolute Error: 3266.128901167188
	R^2: -4.557828270733797

Testing Data:
	Root Mean Squared Error: 4088.363802156753
	Mean Absolute Error: 3220.72550080284
	R^2: -4.377929279851546
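
A least-squares fit with an intercept never does worse on its own training data than predicting the mean, so its training $R^2$ lies between 0 and 1. A negative training $R^2$ like the one above therefore suggests that `model` was not fit on these two features when the cell ran, for example because another cell had refit it in between. The sketch below illustrates the bound on synthetic data, with `numpy.linalg.lstsq` standing in for `LinearRegression`; the "stale" coefficients are invented.

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 2))
y = rng.normal(size=200)  # pure noise, unrelated to X

# Ordinary least squares with an intercept column.
A = np.column_stack([np.ones(len(X)), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
pred = A @ beta

ss_res = np.sum((y - pred) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
train_r2 = 1 - ss_res / ss_tot

# Scoring with coefficients that came from a different fit can go negative.
stale_pred = A @ np.array([5.0, 1.0, -1.0])
stale_r2 = 1 - np.sum((y - stale_pred) ** 2) / ss_tot

print(f'fresh fit R^2: {train_r2:.4f}, stale coefficients R^2: {stale_r2:.2f}')
```

Rerunning the fit cell immediately before the metrics cell is one way to rule this out.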


In [0]:
def get_errors(features, target):
  """Fit the shared `model` on train[features] and print metrics for train and test."""
  train_features = train[features]
  train_target = train[target]
  test_features = test[features]
  test_target = test[target]

  model.fit(train_features, train_target)

  # This is for the training data predictions.
  pred = model.predict(train_features)
  rmse = np.sqrt(mean_squared_error(train_target, pred))
  mae = mean_absolute_error(train_target, pred)
  r2 = r2_score(train_target, pred)
  print('Training Data:')
  print('\tIntercept:',model.intercept_)
  for feature, coef in zip(features, model.coef_):
    print(f'\tCoef {feature}: {coef}')
  print('\tRoot Mean Squared Error:',rmse)
  print('\tMean Absolute Error:',mae)
  print('\tR^2:',r2,'\n')

  # This is for the testing data predictions.
  pred = model.predict(test_features)
  rmse = np.sqrt(mean_squared_error(test_target, pred))
  mae = mean_absolute_error(test_target, pred)
  r2 = r2_score(test_target, pred)
  print('Testing Data:')
  print('\tIntercept:',model.intercept_)
  for feature, coef in zip(features, model.coef_):
    print(f'\tCoef {feature}: {coef}')
  print('\tRoot Mean Squared Error:',rmse)
  print('\tMean Absolute Error:',mae)
  print('\tR^2:',r2)
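
Because `get_errors` reads `train`, `test`, and `model` from the notebook's global scope, its result depends on cell execution order. One alternative, sketched here with the hypothetical name `fit_and_score` and invented toy data, builds a fresh model per call and returns the metrics instead of printing them:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

def fit_and_score(train_df, test_df, features, target):
    """Fit a new LinearRegression on train_df and return it with metrics for both splits."""
    model = LinearRegression().fit(train_df[features], train_df[target])
    results = {}
    for split_name, split in [('train', train_df), ('test', test_df)]:
        pred = model.predict(split[features])
        results[split_name] = {
            'rmse': np.sqrt(mean_squared_error(split[target], pred)),
            'mae': mean_absolute_error(split[target], pred),
            'r2': r2_score(split[target], pred),
        }
    return model, results

# Toy data, invented for illustration: price = 1000 + 500 * bedrooms exactly.
toy_train = pd.DataFrame({'bedrooms': [0, 1, 2, 3], 'price': [1000, 1500, 2000, 2500]})
toy_test = pd.DataFrame({'bedrooms': [4, 5], 'price': [3000, 3500]})
fitted, scores = fit_and_score(toy_train, toy_test, ['bedrooms'], 'price')
print(scores['test'])
```

Returning a dictionary also makes it easy to collect several feature sets into one table for comparison.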

In [14]:
df.head()

Unnamed: 0,bathrooms,bedrooms,created,description,display_address,latitude,longitude,price,street_address,interest_level,elevator,cats_allowed,hardwood_floors,dogs_allowed,doorman,dishwasher,no_fee,laundry_in_building,fitness_center,pre-war,laundry_in_unit,roof_deck,outdoor_space,dining_room,high_speed_internet,balcony,swimming_pool,new_construction,terrace,exclusive,loft,garden_patio,wheelchair_access,common_outdoor_space,total_features,bathrooms_per_bedrooms
0,1.5,3,2016-06-24 07:54:24,A Brand New 3 Bedroom 1.5 bath ApartmentEnjoy ...,Metropolitan Avenue,40.7145,-73.9425,3000,792 Metropolitan Avenue,medium,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.5
1,1.0,2,2016-06-12 12:19:27,,Columbus Avenue,40.7947,-73.9667,5465,808 Columbus Avenue,low,1,1,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,0.5
2,1.0,1,2016-04-17 03:26:41,"Top Top West Village location, beautiful Pre-w...",W 13 Street,40.7388,-74.0018,2850,241 W 13 Street,high,0,0,1,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1.0
3,1.0,1,2016-04-18 02:22:02,Building Amenities - Garage - Garden - fitness...,East 49th Street,40.7539,-73.9677,3275,333 East 49th Street,low,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1.0
4,1.0,4,2016-04-28 01:32:41,Beautifully renovated 3 bedroom flex 4 bedroom...,West 143rd Street,40.8241,-73.9493,3350,500 West 143rd Street,low,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.25


In [25]:
# Testing for bathrooms and bedrooms.
get_errors(['bathrooms', 'bedrooms'], 'price')

Training Data:
	Intercept: 485.71869002322865
	Coef bathrooms: 2072.610116385187
	Coef bedrooms: 389.32489590255614
	Root Mean Squared Error: 1232.0225917223484
	Mean Absolute Error: 818.5310213271714
	R^2: 0.5111543084316607 

Testing Data:
	Intercept: 485.71869002322865
	Coef bathrooms: 2072.610116385187
	Coef bedrooms: 389.32489590255614
	Root Mean Squared Error: 1219.719357233823
	Mean Absolute Error: 825.8987822403527
	R^2: 0.5213303957090345


In [28]:
# Testing for cats and dogs allowed.
get_errors(['cats_allowed', 'dogs_allowed'], 'price')

Training Data:
	Intercept: 3487.3129660323707
	Coef cats_allowed: -138.11834995802153
	Coef dogs_allowed: 345.8311113485951
	Root Mean Squared Error: 1758.6703392715874
	Mean Absolute Error: 1197.3520771096757
	R^2: 0.0038991129252357037 

Testing Data:
	Intercept: 3487.3129660323707
	Coef cats_allowed: -138.11834995802153
	Coef dogs_allowed: 345.8311113485951
	Root Mean Squared Error: 1759.6374029126257
	Mean Absolute Error: 1194.168712367809
	R^2: 0.0037636415955266678


In [27]:
# Testing for exclusive and pre-war.
get_errors(['exclusive', 'pre-war'], 'price')

Training Data:
	Intercept: 3606.2819323601757
	Coef exclusive: -114.54008843113444
	Coef pre-war: -138.9523551847877
	Root Mean Squared Error: 1761.1790945063656
	Mean Absolute Error: 1200.9433969564448
	R^2: 0.001055196518376711 

Testing Data:
	Intercept: 3606.2819323601757
	Coef exclusive: -114.54008843113444
	Coef pre-war: -138.9523551847877
	Root Mean Squared Error: 1762.0131864488862
	Mean Absolute Error: 1196.451288039743
	R^2: 0.0010716781983625134


In [26]:
# And finally, testing for latitude and longitude.
get_errors(['latitude', 'longitude'], 'price')

Training Data:
	Intercept: -1285931.9851655478
	Coef latitude: 2208.1897189587876
	Coef longitude: -16215.705413894815
	Root Mean Squared Error: 1704.207955259876
	Mean Absolute Error: 1147.1493278231892
	R^2: 0.06463820907125306 

Testing Data:
	Intercept: -1285931.9851655478
	Coef latitude: 2208.1897189587876
	Coef longitude: -16215.705413894815
	Root Mean Squared Error: 1703.0806186783304
	Mean Absolute Error: 1139.700457630833
	R^2: 0.06677485649195447
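
To compare the four feature pairs side by side, the test metrics printed above can be collected into one table and sorted by MAE. The values below are copied from those outputs and rounded; the names `results`, `ranked`, and `best` are illustrative.

```python
import pandas as pd

# Test-set MAE and R^2 copied (rounded) from the four get_errors runs above.
results = pd.DataFrame([
    {'features': 'bathrooms, bedrooms', 'test_mae': 825.90, 'test_r2': 0.5213},
    {'features': 'cats_allowed, dogs_allowed', 'test_mae': 1194.17, 'test_r2': 0.0038},
    {'features': 'exclusive, pre-war', 'test_mae': 1196.45, 'test_r2': 0.0011},
    {'features': 'latitude, longitude', 'test_mae': 1139.70, 'test_r2': 0.0668},
])

# Lower MAE is better, so the first row after sorting is the best pair.
ranked = results.sort_values('test_mae').reset_index(drop=True)
best = ranked.loc[0, 'features']
print(ranked)
```

Of these pairs, bathrooms and bedrooms give the lowest test MAE (about 826) and the only $R^2$ above 0.5.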
