
shaanjahan/MATH742MachineLearning


Model Comparison: Linear vs Regression Tree vs The Random Forest

This paper compares three machine learning models used in data science and statistics: linear regression, a regression tree, and a random forest. Many factors affect the final sale price of an apartment, but data can only be collected on what is known and reported, so the dataset has many missing pieces. Each model is fit to the dataset features to predict apartment prices, and the three are set side by side to see which gives the most accurate price. The dataset comes from Amazon MTurk and contains data from February 2016 until February 2017.

Features

  • zipcode
  • approx_year_built
  • dining_room_type
  • fuel_type
  • kitchen_type
  • maintenance_cost
  • num_bedrooms
  • num_floors_in_building
  • num_full_bathrooms
  • num_total_rooms
  • parking_charges
  • sale_price
  • sq_footage
  • walk_score
  • price_listings
  • avg_prices
  • cats_allowed
  • dogs_allowed
  • coop_condo
  • price_per_sqft

Roadmap

  • Collecting data from Amazon MTurk.

  • Cleaning the Amazon MTurk data, including handling missing values.

  • Researching missing values

  • Imputing missing numerical values with missForest and MICE.

  • Visualizing the data with heatmaps to identify highly correlated variables and avoid overfitting.

  • Implementing the Linear Regression model to assess its predictive power and RMSE.

  • Implementing the Regression Tree model to assess its predictive power and RMSE.

  • Implementing the Random Forest Regression model to assess its predictive power and RMSE.

  • Declaring which model has the lowest RMSE (Root Mean Squared Error) and the best predictive power.
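
The final roadmap step, picking the model with the lowest RMSE, can be illustrated with a minimal sketch. It is not the project's code: the function names `rmse` and `best_model`, the labels `model_a`, `model_b`, and `model_c` (stand-ins for the three models), and all numbers are hypothetical, invented only to show the calculation. It assumes each model has already produced predictions for the same held-out apartments.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def best_model(actual, predictions_by_model):
    """Score every model by RMSE and return (winner_name, all_scores)."""
    scores = {
        name: rmse(actual, preds)
        for name, preds in predictions_by_model.items()
    }
    winner = min(scores, key=scores.get)  # lowest RMSE wins
    return winner, scores


# Made-up toy values, NOT results from the project's dataset.
held_out_prices = [300.0, 450.0, 520.0, 610.0]
toy_predictions = {
    "model_a": [320.0, 430.0, 500.0, 650.0],
    "model_b": [280.0, 480.0, 560.0, 600.0],
    "model_c": [310.0, 445.0, 530.0, 600.0],
}

winner, scores = best_model(held_out_prices, toy_predictions)
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: RMSE = {score:.2f}")
print("lowest RMSE:", winner)
```

Because RMSE is in the same units as the target, the scores can be read directly as a typical prediction error in sale price. Comparing all three models on the same held-out rows keeps the comparison fair.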

About

This project uses data from Amazon MTurk to predict apartment values in the Queens, New York area. It also compares three models: linear regression, a regression tree, and a random forest.
