<a href="https://colab.research.google.com/github/mkhalil7625/DS-Unit-2-Linear-Models/blob/master/module2-regression-2/Copy_of_LS_DS_212_assignment.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Lambda School Data Science

*Unit 2, Sprint 1, Module 2*

---

# Regression 2

## Assignment

You'll continue to **predict how much it costs to rent an apartment in NYC,** using the dataset from renthop.com.

- [ ] Do train/test split. Use data from April & May 2016 to train. Use data from June 2016 to test.
- [ ] Engineer at least two new features. (See below for explanation & ideas.)
- [ ] Fit a linear regression model with at least two features.
- [ ] Get the model's coefficients and intercept.
- [ ] Get regression metrics RMSE, MAE, and $R^2$, for both the train and test data.
- [ ] What's the best test MAE you can get? Share your score and features used with your cohort on Slack!
- [ ] As always, commit your notebook to your fork of the GitHub repo.


#### [Feature Engineering](https://en.wikipedia.org/wiki/Feature_engineering)

> "Some machine learning projects succeed and some fail. What makes the difference? Easily the most important factor is the features used." — Pedro Domingos, ["A Few Useful Things to Know about Machine Learning"](https://homes.cs.washington.edu/~pedrod/papers/cacm12.pdf)

> "Coming up with features is difficult, time-consuming, requires expert knowledge. 'Applied machine learning' is basically feature engineering." — Andrew Ng, [Machine Learning and AI via Brain simulations](https://forum.stanford.edu/events/2011/2011slides/plenary/2011plenaryNg.pdf) 

> Feature engineering is the process of using domain knowledge of the data to create features that make machine learning algorithms work. 

#### Feature Ideas
- Does the apartment have a description?
- How long is the description?
- How many total perks does each apartment have?
- Are cats _or_ dogs allowed?
- Are cats _and_ dogs allowed?
- Total number of rooms (beds + baths)
- Ratio of beds to baths
- What's the neighborhood, based on address or latitude & longitude?
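
A few of the ideas above can be sketched in pandas as follows. This is only an illustration: the three rows of `toy` are made-up values, and the new column names (`has_description`, `description_length`, `beds_per_bath`, and so on) are hypothetical choices, not names the assignment requires. The input column names follow the renthop dataset.

```python
import numpy as np
import pandas as pd

# Made-up toy rows that mimic a few columns of the renthop dataset
toy = pd.DataFrame({
    'description': ['Sunny two bedroom near the park', '        ', None],
    'bedrooms': [2, 1, 0],
    'bathrooms': [1.0, 1.0, 1.0],
    'cats_allowed': [1, 0, 1],
    'dogs_allowed': [1, 0, 0],
})

# Treat missing and whitespace-only descriptions as "no description"
desc = toy['description'].fillna('').str.strip()
toy['has_description'] = (desc != '').astype(int)
toy['description_length'] = desc.str.len()

# Pets: either animal allowed vs. both animals allowed
cats = toy['cats_allowed'] == 1
dogs = toy['dogs_allowed'] == 1
toy['cats_or_dogs'] = (cats | dogs).astype(int)
toy['cats_and_dogs'] = (cats & dogs).astype(int)

# Room counts; replace 0 baths with NaN to avoid dividing by zero
toy['total_rooms'] = toy['bedrooms'] + toy['bathrooms']
toy['beds_per_bath'] = toy['bedrooms'] / toy['bathrooms'].replace(0, np.nan)

print(toy.drop(columns='description'))
```

Note that a ratio feature can produce `NaN` for listings with zero bathrooms, and `LinearRegression` does not accept missing values, so such rows would need to be filled or dropped before fitting.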

## Stretch Goals
- [ ] If you want more math, skim [_An Introduction to Statistical Learning_](http://faculty.marshall.usc.edu/gareth-james/ISL/ISLR%20Seventh%20Printing.pdf), Chapter 3.1, Simple Linear Regression, & Chapter 3.2, Multiple Linear Regression
- [ ] If you want more introduction, watch [Brandon Foltz, Statistics 101: Simple Linear Regression](https://www.youtube.com/watch?v=ZkjP5RJLQF4) (20 minutes, over 1 million views)
- [ ] Add your own stretch goal(s)!

In [1]:
%%capture
import sys

# If you're on Colab:
if 'google.colab' in sys.modules:
    DATA_PATH = 'https://raw.githubusercontent.com/LambdaSchool/DS-Unit-2-Applied-Modeling/master/data/'
    !pip install category_encoders==2.*

# If you're working locally:
else:
    DATA_PATH = '../data/'
    
# Ignore this Numpy warning when using Plotly Express:
# FutureWarning: Method .ptp is deprecated and will be removed in a future version. Use numpy.ptp instead.
import warnings
warnings.filterwarnings(action='ignore', category=FutureWarning, module='numpy')

In [2]:
import numpy as np
import pandas as pd

# Read New York City apartment rental listing data
df = pd.read_csv(DATA_PATH+'apartments/renthop-nyc.csv')
assert df.shape == (49352, 34)

# Remove the most extreme 1% prices,
# the most extreme .1% latitudes, &
# the most extreme .1% longitudes
df = df[(df['price'] >= np.percentile(df['price'], 0.5)) & 
        (df['price'] <= np.percentile(df['price'], 99.5)) & 
        (df['latitude'] >= np.percentile(df['latitude'], 0.05)) & 
        (df['latitude'] < np.percentile(df['latitude'], 99.95)) &
        (df['longitude'] >= np.percentile(df['longitude'], 0.05)) & 
        (df['longitude'] <= np.percentile(df['longitude'], 99.95))]

In [3]:
df.head()

Unnamed: 0,bathrooms,bedrooms,created,description,display_address,latitude,longitude,price,street_address,interest_level,elevator,cats_allowed,hardwood_floors,dogs_allowed,doorman,dishwasher,no_fee,laundry_in_building,fitness_center,pre-war,laundry_in_unit,roof_deck,outdoor_space,dining_room,high_speed_internet,balcony,swimming_pool,new_construction,terrace,exclusive,loft,garden_patio,wheelchair_access,common_outdoor_space
0,1.5,3,2016-06-24 07:54:24,A Brand New 3 Bedroom 1.5 bath ApartmentEnjoy ...,Metropolitan Avenue,40.7145,-73.9425,3000,792 Metropolitan Avenue,medium,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,1.0,2,2016-06-12 12:19:27,,Columbus Avenue,40.7947,-73.9667,5465,808 Columbus Avenue,low,1,1,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,1.0,1,2016-04-17 03:26:41,"Top Top West Village location, beautiful Pre-w...",W 13 Street,40.7388,-74.0018,2850,241 W 13 Street,high,0,0,1,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,1.0,1,2016-04-18 02:22:02,Building Amenities - Garage - Garden - fitness...,East 49th Street,40.7539,-73.9677,3275,333 East 49th Street,low,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,1.0,4,2016-04-28 01:32:41,Beautifully renovated 3 bedroom flex 4 bedroom...,West 143rd Street,40.8241,-73.9493,3350,500 West 143rd Street,low,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [4]:
df.tail()

Unnamed: 0,bathrooms,bedrooms,created,description,display_address,latitude,longitude,price,street_address,interest_level,elevator,cats_allowed,hardwood_floors,dogs_allowed,doorman,dishwasher,no_fee,laundry_in_building,fitness_center,pre-war,laundry_in_unit,roof_deck,outdoor_space,dining_room,high_speed_internet,balcony,swimming_pool,new_construction,terrace,exclusive,loft,garden_patio,wheelchair_access,common_outdoor_space
49347,1.0,2,2016-06-02 05:41:05,"30TH/3RD, MASSIVE CONV 2BR IN LUXURY FULL SERV...",E 30 St,40.7426,-73.979,3200,230 E 30 St,medium,1,0,1,0,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0
49348,1.0,1,2016-04-04 18:22:34,"HIGH END condo finishes, swimming pool, and ki...",Rector Pl,40.7102,-74.0163,3950,225 Rector Place,low,1,1,0,1,1,0,0,1,1,0,0,0,1,0,0,0,0,0,0,1,0,0,0,1
49349,1.0,1,2016-04-16 02:13:40,Large Renovated One Bedroom Apartment with Sta...,West 45th Street,40.7601,-73.99,2595,341 West 45th Street,low,1,1,0,1,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0
49350,1.0,0,2016-04-08 02:13:33,Stylishly sleek studio apartment with unsurpas...,Wall Street,40.7066,-74.0101,3350,37 Wall Street,low,1,1,0,1,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0
49351,1.0,2,2016-04-12 02:48:07,Look no further!!! This giant 2 bedroom apart...,Park Terrace East,40.8699,-73.9172,2200,30 Park Terrace East,low,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [5]:
df['description'][1]

'        '

In [6]:
# Create a new column for description availability:
# 0 if the description is null or shorter than 10 characters, otherwise 1
df['desc_avail'] = np.where(np.logical_or(df['description'].str.len() < 10, df['description'].isnull()), 0, 1)

In [7]:
df['desc_avail'].value_counts()

1    45524
0     3293
Name: desc_avail, dtype: int64

In [8]:
df[['description','desc_avail']].sample(10)

Unnamed: 0,description,desc_avail
20898,This is a renovated TRUE 2 Bedroom! Granite-ti...,1
23762,Semi-Attached Brick Townhouse On The F...,1
29869,I have total market coverage in Manhattan with...,1
36481,NO FEE Prestigious luxury high rise located in...,1
23418,"We are in the heart of the Upper West side, lo...",1
34859,Gorgeous 2 full bed with a large bathroom - la...,1
39458,Massive 2500 sq 3 Bedroom apartment with an sp...,1
33528,The midtown skyline to the East; the river one...,1
91,Renovated one bedroom apartment features deco ...,1
31659,Beautiful no fee huge one bedroom convertible ...,1


In [9]:
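# Sum of perks: add up the binary amenity columns (elevator onward).
# Note: iloc[:, 10:] also includes desc_avail, which was appended above,
# so totalperks here equals the perk count plus desc_avail.
df['totalperks']=df.iloc[:,10:].sum(axis=1)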
df.head()

Unnamed: 0,bathrooms,bedrooms,created,description,display_address,latitude,longitude,price,street_address,interest_level,elevator,cats_allowed,hardwood_floors,dogs_allowed,doorman,dishwasher,no_fee,laundry_in_building,fitness_center,pre-war,laundry_in_unit,roof_deck,outdoor_space,dining_room,high_speed_internet,balcony,swimming_pool,new_construction,terrace,exclusive,loft,garden_patio,wheelchair_access,common_outdoor_space,desc_avail,totalperks
0,1.5,3,2016-06-24 07:54:24,A Brand New 3 Bedroom 1.5 bath ApartmentEnjoy ...,Metropolitan Avenue,40.7145,-73.9425,3000,792 Metropolitan Avenue,medium,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1
1,1.0,2,2016-06-12 12:19:27,,Columbus Avenue,40.7947,-73.9667,5465,808 Columbus Avenue,low,1,1,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5
2,1.0,1,2016-04-17 03:26:41,"Top Top West Village location, beautiful Pre-w...",W 13 Street,40.7388,-74.0018,2850,241 W 13 Street,high,0,0,1,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,4
3,1.0,1,2016-04-18 02:22:02,Building Amenities - Garage - Garden - fitness...,East 49th Street,40.7539,-73.9677,3275,333 East 49th Street,low,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,3
4,1.0,4,2016-04-28 01:32:41,Beautifully renovated 3 bedroom flex 4 bedroom...,West 143rd Street,40.8241,-73.9493,3350,500 West 143rd Street,low,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,2


In [10]:
df['interest_level'].value_counts()

low       33946
medium    11181
high       3690
Name: interest_level, dtype: int64

In [11]:
df['cats_or_dogs']= np.where(np.logical_or(df['cats_allowed'],df['dogs_allowed'])==1,1,0)
df.sample(6)

Unnamed: 0,bathrooms,bedrooms,created,description,display_address,latitude,longitude,price,street_address,interest_level,elevator,cats_allowed,hardwood_floors,dogs_allowed,doorman,dishwasher,no_fee,laundry_in_building,fitness_center,pre-war,laundry_in_unit,roof_deck,outdoor_space,dining_room,high_speed_internet,balcony,swimming_pool,new_construction,terrace,exclusive,loft,garden_patio,wheelchair_access,common_outdoor_space,desc_avail,totalperks,cats_or_dogs
5925,2.0,2,2016-04-13 02:49:36,Beautifully renovated 2 bedroom has a large pr...,233 E 29th St.,40.7421,-73.9788,4095,233 E 29th St.,medium,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,4,0
3219,2.0,2,2016-04-23 01:12:44,Beautifully renovated 2 bedroom 2 bath apartme...,Cheever Place,40.6862,-73.9996,3800,30 Cheever Place,low,0,1,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,4,1
46980,1.0,2,2016-04-07 02:18:56,Great Deal 2 Bedroom/1 Bath Available in the h...,28th St,40.7742,-73.9158,2000,23-57 28th St,medium,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0
29713,2.0,2,2016-05-14 01:22:14,Large two bedroom with floor to ceiling window...,East 92nd Street,40.7805,-73.9464,5650,408 East 92nd Street,low,0,1,0,1,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,5,1
29975,1.0,2,2016-05-03 03:22:44,"Unparalleled access to transportation, Walk to...",W 31 St.,40.7486,-73.9903,3800,125 W 31 St.,high,1,1,1,1,1,1,1,0,1,0,1,0,1,0,1,0,0,0,0,0,0,0,0,0,1,12,1
14142,1.0,0,2016-06-17 01:20:04,Large studio in Murray Hill with A LOT of clos...,East 39th Street,40.7479,-73.9739,2550,250 East 39th Street,low,0,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,4,1


In [12]:
df['cats_and_dogs']= np.where(np.logical_and(df['cats_allowed'],df['dogs_allowed'])==1,1,0)
df.sample(6)

Unnamed: 0,bathrooms,bedrooms,created,description,display_address,latitude,longitude,price,street_address,interest_level,elevator,cats_allowed,hardwood_floors,dogs_allowed,doorman,dishwasher,no_fee,laundry_in_building,fitness_center,pre-war,laundry_in_unit,roof_deck,outdoor_space,dining_room,high_speed_internet,balcony,swimming_pool,new_construction,terrace,exclusive,loft,garden_patio,wheelchair_access,common_outdoor_space,desc_avail,totalperks,cats_or_dogs,cats_and_dogs
46685,1.0,4,2016-04-18 01:20:18,This 2 queen bedroom apartment is located in P...,8th Avenue,40.6664,-73.9784,3400,723 8th Avenue,low,0,1,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,4,1,1
22270,1.0,2,2016-06-21 06:46:16,,W 10th St,40.7347,-74.0009,4150,141 W 10th St,low,0,1,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,1,0
26075,1.0,1,2016-05-26 04:10:30,Newly renovated 1 bedroom on 31st street and 3...,3026 31st St.,40.7663,-73.9222,2050,3026 31st St.,low,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,2,0,0
9558,1.0,2,2016-04-14 06:19:03,Renovated truly two bedroom apartment located ...,W 47th St.,40.7586,-73.9838,3450,150 W 47th St.,low,1,0,1,0,1,1,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,7,0,0
7721,1.0,2,2016-04-22 02:36:11,Take advantage of this amazing deal. This Amaz...,E 34th St.,40.7436,-73.9727,3097,401 E 34th St.,medium,1,0,1,0,1,1,1,0,1,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,1,9,0,0
43799,2.0,1,2016-04-24 03:11:46,This is more than one of our newest luxury ren...,W 42nd St.,40.7592,-73.9948,5020,450 W 42nd St.,low,1,1,0,1,1,1,0,0,1,1,1,1,0,0,1,0,1,0,0,0,0,0,0,0,1,12,1,1


In [13]:
# Total rooms = bathrooms + bedrooms (the first two columns)
df['total_rooms'] = df['bathrooms'] + df['bedrooms']
df.sample(6)

Unnamed: 0,bathrooms,bedrooms,created,description,display_address,latitude,longitude,price,street_address,interest_level,elevator,cats_allowed,hardwood_floors,dogs_allowed,doorman,dishwasher,no_fee,laundry_in_building,fitness_center,pre-war,laundry_in_unit,roof_deck,outdoor_space,dining_room,high_speed_internet,balcony,swimming_pool,new_construction,terrace,exclusive,loft,garden_patio,wheelchair_access,common_outdoor_space,desc_avail,totalperks,cats_or_dogs,cats_and_dogs,total_rooms
7341,1.0,1,2016-04-04 10:51:40,This apartment is located in a laundry & eleva...,585 East 21st St. #5H,40.6396,-73.9577,1550,585 East 21st St,high,1,1,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,5,1,1,2.0
26416,1.0,2,2016-05-28 03:46:09,**NO FEE PLUS 1 MONTH FREE!!!... Great Upper ...,E 102 St.,40.7897,-73.9488,2995,120 E 102 St.,low,0,0,1,0,0,1,1,0,0,1,1,0,1,1,0,1,0,0,1,0,0,1,0,0,1,11,0,0,3.0
20726,1.0,0,2016-06-05 01:21:38,Fabulous recently renovated studio in the mid ...,East 63rd Street,40.7636,-73.9642,1995,210 East 63rd Street,low,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1.0
17088,1.0,1,2016-06-08 04:58:34,Available July 1st!<br><br>Size and location!<...,196 5th Avenue,40.6769,-73.9804,2650,196 5th Avenue,medium,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,4,1,1,2.0
41258,2.0,4,2016-05-06 05:25:43,AMAZING 4 BR FLEX WITH FULL WALL IN THE HEART ...,N Moore St,40.7196,-74.0109,5795,80 N Moore St,high,1,0,1,0,0,1,1,0,0,0,1,0,1,1,0,1,1,0,1,0,0,0,0,0,1,11,0,0,6.0
18001,1.0,0,2016-06-29 01:33:27,This gorgeous studio apartment features beauti...,Amsterdam Avenue,40.7976,-73.968,2325,850 Amsterdam Avenue,low,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,3,1,1,1.0


In [14]:
type(df['created'][0])

str

In [15]:
# Parse the timestamp strings; pandas infers this ISO-like format on its own
df['created'] = pd.to_datetime(df['created'])

In [16]:
type(df['created'][0])

pandas._libs.tslibs.timestamps.Timestamp

In [17]:
df.head()

Unnamed: 0,bathrooms,bedrooms,created,description,display_address,latitude,longitude,price,street_address,interest_level,elevator,cats_allowed,hardwood_floors,dogs_allowed,doorman,dishwasher,no_fee,laundry_in_building,fitness_center,pre-war,laundry_in_unit,roof_deck,outdoor_space,dining_room,high_speed_internet,balcony,swimming_pool,new_construction,terrace,exclusive,loft,garden_patio,wheelchair_access,common_outdoor_space,desc_avail,totalperks,cats_or_dogs,cats_and_dogs,total_rooms
0,1.5,3,2016-06-24 07:54:24,A Brand New 3 Bedroom 1.5 bath ApartmentEnjoy ...,Metropolitan Avenue,40.7145,-73.9425,3000,792 Metropolitan Avenue,medium,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,4.5
1,1.0,2,2016-06-12 12:19:27,,Columbus Avenue,40.7947,-73.9667,5465,808 Columbus Avenue,low,1,1,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,1,1,3.0
2,1.0,1,2016-04-17 03:26:41,"Top Top West Village location, beautiful Pre-w...",W 13 Street,40.7388,-74.0018,2850,241 W 13 Street,high,0,0,1,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,4,0,0,2.0
3,1.0,1,2016-04-18 02:22:02,Building Amenities - Garage - Garden - fitness...,East 49th Street,40.7539,-73.9677,3275,333 East 49th Street,low,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,3,0,0,2.0
4,1.0,4,2016-04-28 01:32:41,Beautifully renovated 3 bedroom flex 4 bedroom...,West 143rd Street,40.8241,-73.9493,3350,500 West 143rd Street,low,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,2,0,0,5.0


In [18]:
cutoff = pd.to_datetime('2016-06-01')
train = df[df.created < cutoff]
test  = df[df.created >= cutoff]
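
A quick sanity check on a date-based split like the one above is to confirm which months ended up on each side of the cutoff. The sketch below uses a made-up four-row frame (`toy`, `train_toy`, and `test_toy` are hypothetical names, and the prices are invented) rather than the real data.

```python
import pandas as pd

# Made-up rows: two before the cutoff, two on or after it
toy = pd.DataFrame({
    'created': pd.to_datetime(['2016-04-17 03:26:41', '2016-05-31 23:59:59',
                               '2016-06-01 00:00:00', '2016-06-24 07:54:24']),
    'price': [2850, 3100, 2900, 3000],
})

cutoff = pd.Timestamp('2016-06-01')
train_toy = toy[toy['created'] < cutoff]    # April and May
test_toy = toy[toy['created'] >= cutoff]    # June

# Every row lands in exactly one split, and the months do not overlap
print(sorted(train_toy['created'].dt.month.unique()))
print(sorted(test_toy['created'].dt.month.unique()))
print(len(train_toy) + len(test_toy) == len(toy))
```

Using `<` for train and `>=` for test means a listing created exactly at midnight on June 1 goes to the test set, and no row is dropped or duplicated.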

In [19]:
train.sample(5)

Unnamed: 0,bathrooms,bedrooms,created,description,display_address,latitude,longitude,price,street_address,interest_level,elevator,cats_allowed,hardwood_floors,dogs_allowed,doorman,dishwasher,no_fee,laundry_in_building,fitness_center,pre-war,laundry_in_unit,roof_deck,outdoor_space,dining_room,high_speed_internet,balcony,swimming_pool,new_construction,terrace,exclusive,loft,garden_patio,wheelchair_access,common_outdoor_space,desc_avail,totalperks,cats_or_dogs,cats_and_dogs,total_rooms
28768,1.0,0,2016-05-24 04:58:11,- Light-filled residence<br>- Luxury building<...,Water Street,40.7032,-73.9914,3096,60 Water Street,low,1,1,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,8,1,1,1.0
44112,1.0,2,2016-04-20 02:24:59,Bright and sunny two bedroom apartment with se...,York Avenue,40.77,-73.9517,2595,1443 York Avenue,medium,0,0,1,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,4,0,0,3.0
33379,1.0,2,2016-05-21 02:12:51,Beautiful 2 BR Duplex in Crown Heights. Key-le...,Classon Ave,40.6728,-73.9606,2500,811 Classon Ave,medium,0,1,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,4,1,1,3.0
6749,1.0,0,2016-04-06 02:35:20,Newly renovated studio apartment with condo fi...,East 10th Street,40.7274,-73.9807,2299,325 East 10th Street,low,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,3,1,1,1.0
32812,1.0,3,2016-05-25 06:47:48,3 BEDROOM/1BATH * WASHER/DRYER * DISHWASHER * ...,East 106th Street,40.79,-73.9416,2895,314 East 106th Street,low,0,0,1,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,6,0,0,4.0


In [20]:
target = ['price']
features = ['bathrooms','bedrooms','totalperks','desc_avail']

X_train = train[features]
y_train = train[target]

X_test = test[features]
y_test = test[target]


In [21]:
# Baseline: always guess the mean price of the training data
guess = y_train.mean()
guess


price    3575.604007
dtype: float64

In [22]:
# Baseline MAE on the training data
from sklearn.metrics import mean_absolute_error
y_pred = [guess]*len(y_train)
mae_train=mean_absolute_error(y_train,y_pred)
mae_train

1201.8811133682555

In [23]:
# Baseline MAE on the test data (still using the training mean as the guess)
y_pred = [guess]*len(y_test)
mae_test=mean_absolute_error(y_test,y_pred)
mae_test

1197.7088871089013
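
The mean baseline can also be written with plain NumPy, which makes the arithmetic explicit. This is a minimal sketch on made-up prices; `mae`, `y_train_toy`, and `y_test_toy` are hypothetical names, and the formula is the usual mean of absolute differences.

```python
import numpy as np

# Made-up prices standing in for the train and test targets
y_train_toy = np.array([2000.0, 3000.0, 4000.0])
y_test_toy = np.array([2500.0, 3500.0])

def mae(y_true, y_pred):
    """Mean absolute error: average of |actual - predicted|."""
    return np.mean(np.abs(y_true - y_pred))

# The baseline guess is learned from the training data only
baseline = y_train_toy.mean()

# Predict the same constant for every listing in each split
train_mae = mae(y_train_toy, np.full_like(y_train_toy, baseline))
test_mae = mae(y_test_toy, np.full_like(y_test_toy, baseline))

print(baseline, train_mae, test_mae)
```

Any model worth keeping should beat this constant guess on the test data.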

In [24]:
#import estimator
from sklearn.linear_model import LinearRegression

In [25]:
#instantiate
model=LinearRegression()

In [26]:
#fit the model
model.fit(X_train,y_train)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)

In [27]:
#apply the model to the test data
y_pred = model.predict(X_test)
mae=mean_absolute_error(y_test,y_pred)
mae

791.9012546154527

In [30]:
model.intercept_, model.coef_

(array([703.35569216]),
 array([[1931.06715864,  394.72194569,   83.59209578, -563.49692245]]))

In [31]:
print('Intercept', model.intercept_)
beta0=model.intercept_
betas = model.coef_
print(betas)
# coefficients = pd.Series(model.coef_, features)
# print(coefficients)
# print(coefficients.to_string())
print(f'y = {beta0} + {betas[0][0]}x1 + {betas[0][1]}x2 + {betas[0][2]}x3 + {betas[0][3]}x4')

Intercept [703.35569216]
[[1931.06715864  394.72194569   83.59209578 -563.49692245]]
y = [703.35569216] + 1931.0671586383478x1 + 394.72194569409834x2 + 83.59209578057346x3 + -563.4969224495203x4
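
Because `y_train` is a one-column DataFrame, `model.coef_` comes back as a 2-D array, so it needs flattening before it can be paired with the feature names. The sketch below copies the intercept and coefficients from the output above into arrays instead of refitting the model; the `listing` values are a made-up example.

```python
import numpy as np
import pandas as pd

features = ['bathrooms', 'bedrooms', 'totalperks', 'desc_avail']

# Values copied from the fitted model's output above
intercept = np.array([703.35569216])
coef = np.array([[1931.06715864, 394.72194569, 83.59209578, -563.49692245]])

# Flatten the (1, 4) array to length 4 and label each coefficient
coefficients = pd.Series(coef.ravel(), index=features)
print(coefficients.to_string())

# Hypothetical listing: 1 bath, 2 beds, totalperks of 4, has a description
listing = pd.Series([1, 2, 4, 1], index=features)

# Prediction = intercept + sum of (coefficient * feature value)
predicted_price = float(intercept[0] + (coefficients * listing).sum())
print(round(predicted_price, 2))
```

Read this way, each coefficient is the change in predicted rent for a one-unit increase in that feature, holding the other features fixed.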


In [32]:
# Print regression metrics for train
from sklearn.metrics import mean_absolute_error,mean_squared_error,r2_score
# Predict on the training features. Reusing y_pred (the test-set predictions)
# here raises a ValueError because its length does not match y_train.
y_pred_train = model.predict(X_train)
mse = mean_squared_error(y_train, y_pred_train)
rmse = np.sqrt(mse)
mae = mean_absolute_error(y_train, y_pred_train)
r2 = r2_score(y_train, y_pred_train)
print('Mean Squared Error:', mse)
print('Root Mean Squared Error:', rmse)
print('Mean Absolute Error:', mae)
print('R^2:', r2)

In [33]:
# Print regression metrics for test
from sklearn.metrics import mean_absolute_error,mean_squared_error,r2_score
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print('Mean Squared Error:', mse)
print('Root Mean Squared Error:', rmse)
print('Mean Absolute Error:', mae)
print('R^2:', r2)

Mean Squared Error: 1400716.664151628
Root Mean Squared Error: 1183.518763751394
Mean Absolute Error: 791.9012546154527
R^2: 0.5493220465896518
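
To avoid repeating the metric code for each split, the calculations can be wrapped in one function. The sketch below uses NumPy only so the formulas are visible; `regression_metrics` is a hypothetical helper and the arrays are made-up. The `sklearn.metrics` functions used above should give the same values.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, and R^2 for one set of predictions."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    residuals = y_true - y_pred

    mse = np.mean(residuals ** 2)
    ss_res = np.sum(residuals ** 2)                 # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation
    return {
        'MSE': mse,
        'RMSE': np.sqrt(mse),
        'MAE': np.mean(np.abs(residuals)),
        'R^2': 1 - ss_res / ss_tot,
    }

# Made-up actual and predicted values
m = regression_metrics([3, 5, 7], [2, 5, 9])
for name, value in m.items():
    print(f'{name}: {value:.4f}')
```

Calling the helper once with the train targets and train predictions, and once with the test targets and test predictions, keeps each split paired with its own predictions.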
