# Property Investment in Summit County

# Problem Statement:

I have been involved in Summit County, CO since 2002 and bought my first investment condominium at Copper Mountain in 2004. I wanted to develop a model that uses public property data and rental data to help me make an informed property investment decision in Summit County, one that maximizes my return to value.


# Data Collection:

- MLS website, where I was able to pull three years' worth of public data on properties sold in Summit County.
- AirBnB API website (through a third party), where I was able to pull current listings in Summit County.
- CO census tract website, to gather GeoJSON data containing the boundaries of Summit County.
- Carbonate Real Estate and SkyRun Rental, which provided rental information.


# EDA:

## Data Cleaning:

### Public Record:
- Gathered geometry information (latitude and longitude) for each property based on its address.
- Removed outliers caused by inconsistencies in the data.
- Established lower and upper bounds for the sale prices.
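
The price-bound step above could be sketched roughly as follows. This is a minimal illustration in plain Python, not the code used in the project: the function name `filter_by_sale_price`, the sample records, and the bound values are all hypothetical; only the `tax_id` and `last_sale_price` column names come from the data definition below.

```python
def filter_by_sale_price(records, lower, upper):
    """Keep records whose last_sale_price falls within [lower, upper]."""
    return [
        r for r in records
        if r.get("last_sale_price") is not None
        and lower <= r["last_sale_price"] <= upper
    ]

# Hypothetical records and bounds, for illustration only.
sales = [
    {"tax_id": 1, "last_sale_price": 10},          # far below the lower bound
    {"tax_id": 2, "last_sale_price": 425_000},
    {"tax_id": 3, "last_sale_price": 25_000_000},  # far above the upper bound
    {"tax_id": 4, "last_sale_price": None},        # missing price
]
cleaned = filter_by_sale_price(sales, lower=50_000, upper=5_000_000)
```

Choosing the bounds is the judgment call; the values here are placeholders rather than the ones used for the models.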

### CO Census Tract GeoJson data:
- Extracted the five census tracts contained within Summit County from the CO census tract dataset.
- Extracted the outside borders of each of the five census tracts and 'zipped' them into a new GeoJSON.
- Saved the new GeoJSON of the Summit County borders in a new file for display purposes.
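
As a rough sketch of the first step above (pulling one county's tracts out of a statewide file), the snippet below filters a GeoJSON `FeatureCollection` by a county code. It assumes each feature carries a `COUNTYFP` property and that Summit County's code is `117`; both are assumptions to check against the actual file, and the function name and sample features are hypothetical. Merging the tract borders into a single outline is not shown.

```python
import json

def extract_county_tracts(collection, county_fips, key="COUNTYFP"):
    """Return a new FeatureCollection with only the tracts in one county."""
    features = [
        feature for feature in collection["features"]
        if feature["properties"].get(key) == county_fips
    ]
    return {"type": "FeatureCollection", "features": features}

# Tiny stand-in for the statewide tract file (hypothetical values).
colorado = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "properties": {"COUNTYFP": "117", "NAME": "1"},
         "geometry": {"type": "Polygon",
                      "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]}},
        {"type": "Feature",
         "properties": {"COUNTYFP": "031", "NAME": "2"},
         "geometry": {"type": "Polygon",
                      "coordinates": [[[2, 2], [3, 2], [3, 3], [2, 2]]]}},
    ],
}

summit = extract_county_tracts(colorado, "117")
# Serialized text that could be written to a new .geojson file for display.
summit_geojson = json.dumps(summit)
```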

### AirBnB:
- Downloaded data by city, daily, within a one-week timeframe.
- Combined the city data by run date to create one file for each day.
- Combined the daily files into one file and removed any duplicates.
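
The combine-and-deduplicate steps above might look something like this in plain Python. It is only a sketch: the write-up does not say how duplicates were identified, so treating rows with the same `property_id` as duplicates (and keeping the first one seen) is an assumption, and `combine_listings` and the sample rows are hypothetical.

```python
def combine_listings(daily_batches, key="property_id"):
    """Merge batches of listing rows, keeping the first row seen per key."""
    seen = {}
    for batch in daily_batches:
        for row in batch:
            # setdefault stores the row only if this key is new.
            seen.setdefault(row[key], row)
    return list(seen.values())

# Hypothetical daily pulls; listing 102 appears on both days.
day_1 = [
    {"property_id": 101, "city": "Breckenridge", "rate": 250.0},
    {"property_id": 102, "city": "Frisco", "rate": 180.0},
]
day_2 = [
    {"property_id": 102, "city": "Frisco", "rate": 180.0},
    {"property_id": 103, "city": "Dillon", "rate": 210.0},
]
listings = combine_listings([day_1, day_2])
```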

### Property Rental information:
- Entered hard-copy rental information from past years into a spreadsheet.
- Gathered geometry information (latitude and longitude) for each property based on its address.
    
## Data Definition:

### Public Record

|Column|Type|Description|
|---|---|---|
|tax_id|int|Key identifier|
|address|Object|Physical address of property|
|city|Object|Physical city of property|
|property_zip|int|Postal zip code assigned to property|
|sqft|int|Square footage of the property|
|property_type|Object|Description of property|
|beds|int|Number of beds on property|
|baths|int|Number of baths on property|
|year_built|int|Year when property was built|
|sale_date|date|Date when property was sold|
|last_sale_price|int|Property sold price|
|latitude|float|Latitude coordinate|
|longitude|float|Longitude coordinate|



### AirBnB

|Column|Type|Description|
|---|---|---|
|property_id|int|Key identifier|
|property_type|Object|Description of rental property|
|city|Object|City of rental|
|bedroom|int|Number of beds on rental property|
|bathroom|int|Number of baths on rental property|
|person_capacity|int|Max number of people in rental property|
|rate_Type|int|Rate code for rental property|
|rate|float|Rate per night for rental property|
|min_Night|int|Minimum number of nights per stay|
|max_Night|int|Maximum number of nights per stay|
|latitude|float|Latitude coordinate|
|longitude|float|Longitude coordinate|


### Carbonate Real Estate

|Column|Type|Description|
|---|---|---|
|property_id|int|Key identifier|
|property_Name|Object|Description of the building where the rental property resides|
|property_Unit|Object|Unit number of rental property|
|bedroom|int|Number of beds on rental property|
|bathroom|int|Number of baths on rental property|
|sleep|int|Max number of people in rental property|
|occupancy_rate|float|Percentage of occupancy per year|
|rate_catagory|int|Foreign key to Rate_Table|
|latitude|float|Latitude coordinate|
|longitude|float|Longitude coordinate|

|Column|Type|Description|
|---|---|---|
|rate_catagory|int|Key identifier|
|rate|float|Rate per night for rental property|
|start_date|date|Start date of the rate period|
|end_date|date|End date of the rate period|

#### Note: Data withheld per owner's request



### SkyRun Property Rental

|Column|Type|Description|
|---|---|---|
|property_id|int|Key identifier|
|property_Name|Object|Description of rental property|
|bedroom|int|Number of beds on rental property|
|bathroom|int|Number of baths on rental property|
|occupancy_rate|float|Percentage of occupancy per year|
|latitude|float|Latitude coordinate|
|longitude|float|Longitude coordinate|

#### Note: Data withheld per owner's request


# Data Visualization:


![](image/img001.png?raw=true)




# Modeling:

Using linear regression:

My first model was fit on the original data (before removing the outliers) to get a baseline:

- Training Score: 0.2718
- Testing Score: 0.1727
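
The write-up does not say how the training and testing scores were computed. If they are R² values (the coefficient of determination, which is what scikit-learn regressors return from `score`; that this project used it is an assumption), the metric can be sketched in plain Python as below. The function name and sample values are hypothetical.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical sale prices (in $1,000s) and predictions.
actual = [300, 450, 600]
perfect = r_squared(actual, [300, 450, 600])   # predictions match exactly
baseline = r_squared(actual, [450, 450, 450])  # always predict the mean
```

Under this reading, a score of 1 means the predictions match the sale prices exactly, and a score of 0 means the model does no better than always predicting the average price.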

My second model was fit after removing the outliers:

- Training Score: 0.5984
- Testing Score: 0.6192


My third model used only the features selected after evaluating the correlations between each feature and the sale price.
It used Property_Type_Sfr with the beds, baths, and sqft features:

- Training Score: 0.6017
- Testing Score: 0.6207
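
To illustrate the kind of correlation check described for the third model, here is a minimal sketch that computes the Pearson correlation of each candidate feature against the sale price and ranks the features by it. The data values are made up, and whether Pearson correlation was the measure actually used is an assumption.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical sample: four sold properties.
features = {
    "sqft": [600, 900, 1200, 1500],
    "beds": [1, 2, 2, 3],
}
last_sale_price = [300_000, 450_000, 600_000, 750_000]

correlations = {
    name: pearson(values, last_sale_price)
    for name, values in features.items()
}
# Feature names ordered from strongest to weakest correlation with price.
ranked = sorted(correlations, key=lambda name: abs(correlations[name]), reverse=True)
```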

My fourth model used DBSCAN to detect clusters and kept only the data points within the clusters.
It used the beds, baths, sqft, and property_type features:

- Training Score: 0.6117
- Testing Score: 0.5967


My last attempt separated the data by the year sold.

Model 5, year 2017:

- Training Score: 0.5732
- Testing Score: 0.5008

Model 6, year 2018:

- Training Score: 0.6293
- Testing Score: 0.4706

Model 7, year 2019:

- Training Score: 0.6539
- Testing Score: 0.6316



# Rental Data Analysis:

Properties (1 bed/1 bath and 2 bed/2 bath) ran at about 43% occupancy for the 2019 calendar year.

The majority of rentals occur during the ski season, from Nov 10 through April 15.

Peak rental rates are during Christmas and New Year's week.

The second peak is during March and Presidents' Day.

The base rate applies during the non-ski season.


Average Potential Gross Revenue for:

-	1 bedroom and 1 bathroom: $37,800
-	2 bedroom and 2 bathroom: $50,890
-	3 bedroom and 3 bathroom: $44,000  (note: has a lower occupancy rate: 31 percent) 

The average sale price in 2019:

-	1 bedroom and 1-2 bath: $350,600
-	2 bedroom and 1-3 bath: $513,300
-	3 bedroom and 2-4 bath: $784,600 

Based on the data that I was able to collect and analyze, I would target the following:
- A 2 bedroom/1-3 bath property as my primary goal
- A 1 bedroom/1-2 bath property as my secondary goal
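
One rough way to compare the three property sizes is gross yield: average potential gross revenue divided by average 2019 sale price. The sketch below uses only the figures listed above. It ignores expenses, financing, and owner usage, and it pairs revenue figures quoted for 1/1, 2/2, and 3/3 units with sale prices quoted for a range of bath counts, so the results are indicative at best.

```python
# Figures taken from the two lists above.
gross_revenue = {"1 bed": 37_800, "2 bed": 50_890, "3 bed": 44_000}
avg_sale_price = {"1 bed": 350_600, "2 bed": 513_300, "3 bed": 784_600}

# Gross yield = potential gross revenue / average sale price.
gross_yield = {
    size: gross_revenue[size] / avg_sale_price[size]
    for size in gross_revenue
}

for size, value in gross_yield.items():
    print(f"{size}: {value:.1%}")
# 1 bed: 10.8%
# 2 bed: 9.9%
# 3 bed: 5.6%
```

On these numbers the 1 and 2 bedroom properties both gross roughly 10% of their purchase price per year, while the 3 bedroom properties gross under 6%.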

Potentially, a 2 bed/2 bath property could bring in about $50,000 per year in gross rental revenue.


However, the best score I achieved with my linear regression models was only about 0.65 (the training score of the 2019 model).

There is more data to gather and analyze before using the model as a tool for evaluating investment opportunities.


# Future Improvement:


More features should be included in the model, such as:
- Condition of Property
- Condition of Appliances
- Garage/Parking
- Forced Air/Boiler/Electric Heating
- Association Dues
- Hot Tub
- Porch
- Level
- HOA
    
Quality of the data:
- I was disappointed in the quality of the data from the public records. I will need to investigate how to separate 'true' property sales from other types of transactions.

- I need to differentiate actual rental usage from the owner's usage to get a true sense of the rental market.

- I need to collect more historical property data. Unfortunately, I was restricted to gathering three years' worth of data per month. Over time, with enough data, I could map out the pricing trend.
