# OR 568 - Rogers Rangers
## NYC Rolling Sales Dataset

James Baker

Arturo Davila-Adino

Gridihar Kaushik-Ramachandran (GK)

Andrew So

## Developer Notes

*James B*: For tables, I generally build them in Excel and then convert them to Markdown tables at https://tableconvert.com/.

# Initial Setup

In [None]:
# To Pretty-Print this as a PDF later
options(jupyter.plot_mimetypes = c("text/plain", "image/png" ))

# ipak: install and load multiple R packages.
# Checks whether each package is installed, installs any that are missing, then loads them all into the R session.
ipak <- function(pkg){
    new.pkg <- pkg[!(pkg %in% installed.packages()[, "Package"])]
    if (length(new.pkg)) 
        install.packages(new.pkg, dependencies = TRUE)
    sapply(pkg, require, character.only = TRUE)
}

# usage
packages <- c(
  "ggplot2", 
  "plyr", 
  "mlbench", 
  "e1071", 
  "dplyr", 
  "caret", #  Contains functions to streamline the model training process for complex regression and classification problems
  "pls", # For Partial Least Squares 
  "lars", # For Penalized Models
  "elasticnet", # For Penalized Models
  "AppliedPredictiveModeling"  
)
ipak(packages)

## Dataset Import
Dataset found at: https://www.kaggle.com/new-york-city/nyc-property-sales#

In [9]:
# This will directly reference the .csv embedded in our Project's Google Drive.
fileID <- "1t7APs7P43E7dQ_4JUUwFjvHHGruRixDh"
filePath <- sprintf("https://drive.google.com/uc?id=%s",fileID)

cat(sprintf("Reading file `%s` at %s...\n","nyc-rolling-sales.csv",filePath))
nyc.data <- read.csv(filePath)

cat("Number of Rows: ",nrow(nyc.data), "\nNumber of Cols: ", ncol(nyc.data),"\nShowing first 2 rows...")
head(nyc.data, 2)

Reading file `nyc-rolling-sales.csv` at https://drive.google.com/uc?id=1t7APs7P43E7dQ_4JUUwFjvHHGruRixDh...
Number of Rows:  84548 
Number of Cols:  22 
Showing first 2 rows...

| | X | BOROUGH | NEIGHBORHOOD | BUILDING.CLASS.CATEGORY | TAX.CLASS.AT.PRESENT | BLOCK | LOT | EASE.MENT | BUILDING.CLASS.AT.PRESENT | ADDRESS | ⋯ | RESIDENTIAL.UNITS | COMMERCIAL.UNITS | TOTAL.UNITS | LAND.SQUARE.FEET | GROSS.SQUARE.FEET | YEAR.BUILT | TAX.CLASS.AT.TIME.OF.SALE | BUILDING.CLASS.AT.TIME.OF.SALE | SALE.PRICE | SALE.DATE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | `<int>` | `<int>` | `<fct>` | `<fct>` | `<fct>` | `<int>` | `<int>` | `<lgl>` | `<fct>` | `<fct>` | ⋯ | `<int>` | `<int>` | `<int>` | `<fct>` | `<fct>` | `<int>` | `<int>` | `<fct>` | `<fct>` | `<fct>` |
| 1 | 4 | 1 | ALPHABET CITY | 07 RENTALS - WALKUP APARTMENTS | 2A | 392 | 6 | | C2 | 153 AVENUE B | ⋯ | 5 | 0 | 5 | 1633 | 6440 | 1900 | 2 | C2 | 6625000 | 2017-07-19 00:00:00 |
| 2 | 5 | 1 | ALPHABET CITY | 07 RENTALS - WALKUP APARTMENTS | 2 | 399 | 26 | | C7 | 234 EAST 4TH STREET | ⋯ | 28 | 3 | 31 | 4616 | 18690 | 1900 | 2 | C7 | - | 2016-12-14 00:00:00 |
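
The preview above shows `SALE.PRICE`, `LAND.SQUARE.FEET`, `GROSS.SQUARE.FEET`, and `SALE.DATE` read in as factors, and the second row uses `-` in place of a sale price. One possible cleanup is sketched below on a toy data frame that mirrors only those two preview rows. The helper name `factor_to_numeric` and the object `toy` are hypothetical, and treating `-` as a missing value is an assumption about what that placeholder means rather than something the dataset documents.

```r
# Toy data frame mirroring the two preview rows (NOT the full dataset).
toy <- data.frame(
  SALE.PRICE        = factor(c("6625000", "-")),
  GROSS.SQUARE.FEET = factor(c("6440", "18690")),
  SALE.DATE         = factor(c("2017-07-19 00:00:00", "2016-12-14 00:00:00"))
)

# Hypothetical helper: factor -> character -> numeric.
# Assumption: "-" (or an empty string) marks a missing value.
factor_to_numeric <- function(x) {
  x <- trimws(as.character(x))
  x[x %in% c("-", "")] <- NA
  as.numeric(x)
}

toy$SALE.PRICE        <- factor_to_numeric(toy$SALE.PRICE)
toy$GROSS.SQUARE.FEET <- factor_to_numeric(toy$GROSS.SQUARE.FEET)

# Parse the timestamp strings and keep only the calendar date.
toy$SALE.DATE <- as.Date(as.character(toy$SALE.DATE),
                         format = "%Y-%m-%d %H:%M:%S")

str(toy)
```

Going through `as.character()` first matters: calling `as.numeric()` directly on a factor returns the internal level codes, not the values printed in the table. The same helper could be applied to the corresponding columns of `nyc.data` if this interpretation of `-` holds.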


## Data Dictionary

In [None]:
names(nyc.data)
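
The dictionary below uses the original CSV headers (with spaces and hyphens), while `names(nyc.data)` reports dotted names such as `EASE.MENT`. That difference comes from `read.csv`, which by default passes headers through `make.names()`. A small sketch of the mapping, using a hand-picked subset of headers (the vector `csv_headers` is illustrative, and the empty first header is an assumption about how the CSV stores its unnamed index column):

```r
# A few headers as they would appear in the raw CSV.
# Assumption: the leading index column has an empty header.
csv_headers <- c("", "EASE-MENT", "SALE PRICE", "TAX CLASS AT PRESENT")

# make.names() is what read.csv(check.names = TRUE) applies by default.
r_names <- make.names(csv_headers)

# Show original header next to the name R will use.
print(data.frame(csv = csv_headers, r = r_names))
```

Passing `check.names = FALSE` to `read.csv` would keep the original headers, at the cost of needing backticks to reference columns whose names contain spaces.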

| **Column Name**                  | **Description** |
|----------------------------------|-----------------|
| `ADDRESS`                        |  |
| `APARTMENT NUMBER`               |  |
| `BLOCK`                          | The block number (up to 5 digits). |
| `BOROUGH`                        | A numeric code for the borough in which the property is located.<br> In order, these are:<br>(1) Manhattan,<br>(2) Bronx,<br>(3) Brooklyn,<br>(4) Queens, and<br>(5) Staten Island. |
| `BUILDING CLASS AT PRESENT`      | Refer to the codes here: <br>https://www1.nyc.gov/assets/finance/jump/hlpbldgcode.html |
| `BUILDING CLASS AT TIME OF SALE` | Refer to the codes here: <br>https://www1.nyc.gov/assets/finance/jump/hlpbldgcode.html |
| `BUILDING CLASS CATEGORY`        | Refer to the codes here: <br>https://www1.nyc.gov/assets/finance/jump/hlpbldgcode.html |
| `COMMERCIAL UNITS`               |  |
| `EASE-MENT`                      |  |
| `GROSS SQUARE FEET`              |  |
| `H1`                             |  |
| `LAND SQUARE FEET`               |  |
| `LOT`                            | The lot number (up to 4 digits). |
| `NEIGHBORHOOD`                   |  |
| `RESIDENTIAL UNITS`              | The number of houses/apartments intended for use as a place of residence at the address.<br>(https://www.lawinsider.com/dictionary/residential-unit) |
| `SALE DATE`                      |  |
| `SALE PRICE`                     |  |
| `TAX CLASS AT PRESENT`           |  |
| `TAX CLASS AT TIME OF SALE`      |  |
| `TOTAL UNITS`                    |  |
| `YEAR BUILT`                     |  |
| `ZIP CODE`                       | The 5-digit zip code of the address. |
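
Because `BOROUGH` is stored as an integer code, it can be convenient to attach the borough names from the dictionary above as factor labels. The following is a minimal sketch on a made-up vector of codes. The names `borough_labels`, `codes`, and `borough` are illustrative, and recoding the column this way is a suggestion rather than a step the notebook already performs.

```r
# Borough names in code order (1 to 5), as listed in the data dictionary.
borough_labels <- c("Manhattan", "Bronx", "Brooklyn", "Queens", "Staten Island")

# Made-up example codes standing in for nyc.data$BOROUGH.
codes <- c(1, 3, 5, 1)

# Map the integer codes to a labelled factor.
borough <- factor(codes, levels = 1:5, labels = borough_labels)

print(borough)
print(table(borough))
```

Fixing `levels = 1:5` keeps all five boroughs as factor levels even when a subset of the data contains only some of them, which keeps plots and contingency tables consistent across subsets.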