# Predicting Bulldozer Sales Price

In this notebook, we will explore a dataset and build machine learning models to predict the auction sale price of bulldozers 🚜. This idea and the necessary data were taken from this [Kaggle Competition](https://www.kaggle.com/c/bluebook-for-bulldozers/).

## 1. Problem definition

The company Fast Iron is creating a "blue book for bulldozers", which aims to inform its customers about what their heavy equipment is worth at auction. The company needs a model that predicts this price based on usage, equipment type, and configuration.

## 2. Data

As mentioned before, the data was taken from [Kaggle](https://www.kaggle.com/c/bluebook-for-bulldozers/).

There are 3 main datasets:

- **Train.csv:** Training set, which contains data through the end of 2011.
- **Valid.csv:** Validation set, which contains data from January 1, 2012 - April 30, 2012.
- **Test.csv:** Test set, which contains data from May 1, 2012 - November 2012.

Key fields:

- SalesID: Unique identifier of the sale.
- MachineID: Unique identifier of a machine. One machine can be sold multiple times.
- saleprice: What the machine sold for at auction.
- saledate: Date of the sale.
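
Because the three files are separated by sale date rather than at random, any custom split should respect the same ordering. The sketch below maps a sale date to the file it would belong to under the date ranges listed above. The function name `assign_split` and the constants are hypothetical, and since the end of the test range is only given as "November 2012", every date from May 1, 2012 onward is treated as test data here.

```python
from datetime import date

# Boundaries taken from the dataset descriptions above.
TRAIN_END = date(2011, 12, 31)  # Train.csv: data through the end of 2011
VALID_END = date(2012, 4, 30)   # Valid.csv: January 1, 2012 - April 30, 2012


def assign_split(sale_date):
    """Return "train", "valid" or "test" for a given sale date.

    Assumption: anything after VALID_END counts as test data, because the
    exact last day of the test range is not stated.
    """
    if sale_date <= TRAIN_END:
        return "train"
    if sale_date <= VALID_END:
        return "valid"
    return "test"


print(assign_split(date(2011, 6, 15)))  # train
print(assign_split(date(2012, 2, 1)))   # valid
print(assign_split(date(2012, 5, 1)))   # test
```

Splitting on time like this keeps the model from seeing sales that happened after the ones it is asked to predict.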

## 3. Evaluation

The evaluation metric is the Root Mean Squared Log Error (RMSLE) between the actual and predicted auction prices. As with most regression metrics, lower is better: the goal is to minimize the error of the predictions relative to the actual values.
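
To make the metric concrete, here is a minimal sketch of RMSLE using only the standard library. It assumes the usual definition, the square root of the mean of `(log(1 + predicted) - log(1 + actual))^2`, and the function name `rmsle` is our own choice rather than something provided by the competition.

```python
import math


def rmsle(y_true, y_pred):
    """Root Mean Squared Log Error between actual and predicted prices.

    Assumes all values are non-negative, which holds for sale prices.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must not be empty")
    # log1p(x) computes log(1 + x), so a value of 0 is still well defined.
    squared_log_errors = [
        (math.log1p(pred) - math.log1p(actual)) ** 2
        for actual, pred in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


print(rmsle([100, 200], [100, 200]))  # 0.0 for perfect predictions
```

Because the error is computed on logarithms, RMSLE penalizes relative differences rather than absolute ones: being off by 10% on a cheap machine costs about as much as being off by 10% on an expensive one. In practice, scikit-learn's `mean_squared_log_error` followed by a square root should give the same value.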

## 4. Features

| Variable                 	| Description                                                                                                                                                                                                                     	|
|--------------------------	|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------	|
| SalesID                  	|   unique identifier of a particular sale of a machine at auction                                                                                                                                                                	|
| MachineID                	|   identifier for a particular machine;  machines may have multiple sales                                                                                                                                                        	|
| ModelID                  	|   identifier for a unique machine model (i.e. fiModelDesc)                                                                                                                                                                      	|
| datasource               	|   source of the sale record;  some sources are more diligent about reporting attributes of the machine than others.  Note that a particular datasource may report on multiple auctioneerIDs.                                    	|
| auctioneerID             	|   identifier of a particular auctioneer, i.e. company that sold the machine at auction.  Not the same as datasource.                                                                                                            	|
| YearMade                 	|   year of manufacture of the machine                                                                                                                                                                                            	|
| MachineHoursCurrentMeter 	|   current usage of the machine in hours at time of sale (saledate);  null or 0 means no hours have been reported for that sale                                                                                                  	|
| UsageBand                	|   value (low, medium, high) calculated comparing this particular Machine-Sale hours to average usage for the fiBaseModel;  e.g. 'Low' means this machine has fewer hours given its lifespan relative to average of fiBaseModel. 	|
| Saledate                 	|   time of sale                                                                                                                                                                                                                  	|
| Saleprice                	|   cost of sale in USD                                                                                                                                                                                                           	|
| fiModelDesc              	|   Description of a unique machine model (see ModelID); concatenation of fiBaseModel & fiSecondaryDesc & fiModelSeries & fiModelDescriptor                                                                                       	|
| fiBaseModel              	|   disaggregation of fiModelDesc                                                                                                                                                                                                 	|
| fiSecondaryDesc          	|   disaggregation of fiModelDesc                                                                                                                                                                                                 	|
| fiModelSeries            	|   disaggregation of fiModelDesc                                                                                                                                                                                                 	|
| fiModelDescriptor        	|   disaggregation of fiModelDesc                                                                                                                                                                                                 	|
| ProductSize              	| The size class grouping for a product group. Subsets within product group.                                                                                                                                                      	|
| ProductClassDesc         	|   description of 2nd level hierarchical grouping (below ProductGroup) of fiModelDesc                                                                                                                                            	|
| State                    	|   US State in which sale occurred                                                                                                                                                                                               	|
| ProductGroup             	|   identifier for top-level hierarchical grouping of fiModelDesc                                                                                                                                                                 	|
| ProductGroupDesc         	|   description of top-level hierarchical grouping of fiModelDesc                                                                                                                                                                 	|
| Drive_System             	| machine configuration;  typically describes whether 2 or 4 wheel drive                                                                                                                                                          	|
| Enclosure                	| machine configuration - does machine have an enclosed cab or not                                                                                                                                                                	|
| Forks                    	| machine configuration - attachment used for lifting                                                                                                                                                                             	|
| Pad_Type                 	| machine configuration - type of treads a crawler machine uses                                                                                                                                                                   	|
| Ride_Control             	| machine configuration - optional feature on loaders to make the ride smoother                                                                                                                                                   	|
| Stick                    	| machine configuration - type of control                                                                                                                                                                                         	|
| Transmission             	| machine configuration - describes type of transmission;  typically automatic or manual                                                                                                                                          	|
| Turbocharged             	| machine configuration - engine naturally aspirated or turbocharged                                                                                                                                                              	|
| Blade_Extension          	| machine configuration - extension of standard blade                                                                                                                                                                             	|
| Blade_Width              	| machine configuration - width of blade                                                                                                                                                                                          	|
| Enclosure_Type           	| machine configuration - does machine have an enclosed cab or not                                                                                                                                                                	|
| Engine_Horsepower        	| machine configuration - engine horsepower rating                                                                                                                                                                                	|
| Hydraulics               	| machine configuration - type of hydraulics                                                                                                                                                                                      	|
| Pushblock                	| machine configuration - option                                                                                                                                                                                                  	|
| Ripper                   	| machine configuration - implement attached to machine to till soil                                                                                                                                                              	|
| Scarifier                	| machine configuration - implement attached to machine to condition soil                                                                                                                                                         	|
| Tip_control              	| machine configuration - type of blade control                                                                                                                                                                                   	|
| Tire_Size                	| machine configuration - size of primary tires                                                                                                                                                                                   	|
| Coupler                  	| machine configuration - type of implement interface                                                                                                                                                                             	|
| Coupler_System           	| machine configuration - type of implement interface                                                                                                                                                                             	|
| Grouser_Tracks           	| machine configuration - describes ground contact interface                                                                                                                                                                      	|
| Hydraulics_Flow          	| machine configuration - normal or high flow hydraulic system                                                                                                                                                                    	|
| Track_Type               	| machine configuration - type of treads a crawler machine uses                                                                                                                                                                   	|
| Undercarriage_Pad_Width  	| machine configuration - width of crawler treads                                                                                                                                                                                 	|
| Stick_Length             	| machine configuration - length of machine digging implement                                                                                                                                                                     	|
| Thumb                    	| machine configuration - attachment used for grabbing                                                                                                                                                                            	|
| Pattern_Changer          	| machine configuration - can adjust the operator control configuration to suit the user                                                                                                                                          	|
| Grouser_Type             	| machine configuration - type of treads a crawler machine uses                                                                                                                                                                   	|
| Backhoe_Mounting         	| machine configuration - optional interface used to add a backhoe attachment                                                                                                                                                     	|
| Blade_Type               	| machine configuration - describes type of blade                                                                                                                                                                                 	|
| Travel_Controls          	| machine configuration - describes operator control configuration                                                                                                                                                                	|
| Differential_Type        	| machine configuration - differential type, typically locking or standard                                                                                                                                                        	|
| Steering_Controls        	| machine configuration - describes operator control configuration                                                                                                                                                                	|