In [1]:
import pandas as pd

# Forest Cover

**Goal:** predict the forest cover type (the predominant kind of tree cover) from strictly cartographic variables (as opposed to remotely sensed data).

In [2]:
# load the data
url = 'https://raw.githubusercontent.com/um-perez-alvaro/Data-Science-Practice/master/Data/forest_cover.csv'
forest_cover = pd.read_csv(url)
forest_cover.head()

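| | Id | Elevation | Aspect | Slope | Horizontal_Distance_To_Hydrology | Vertical_Distance_To_Hydrology | Horizontal_Distance_To_Roadways | Hillshade_9am | Hillshade_Noon | Hillshade_3pm | ... | Soil_Type32 | Soil_Type33 | Soil_Type34 | Soil_Type35 | Soil_Type36 | Soil_Type37 | Soil_Type38 | Soil_Type39 | Soil_Type40 | Cover_Type |
| -: | -: | -: | -: | -: | -: | -: | -: | -: | -: | -: | -: | -: | -: | -: | -: | -: | -: | -: | -: | -: | -: |
| 0 | 1 | 2596 | 51 | 3 | 258 | 0 | 510 | 221 | 232 | 148 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5 |
| 1 | 2 | 2590 | 56 | 2 | 212 | -6 | 390 | 220 | 235 | 151 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5 |
| 2 | 3 | 2804 | 139 | 9 | 268 | 65 | 3180 | 234 | 238 | 135 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 |
| 3 | 4 | 2785 | 155 | 18 | 242 | 118 | 3090 | 238 | 238 | 122 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 |
| 4 | 5 | 2595 | 45 | 2 | 153 | -1 | 391 | 220 | 234 | 150 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5 |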


This data includes four wilderness areas located in the Roosevelt National Forest of northern Colorado.

**Data Description**

| Feature | Description |
| :- | -: |
| Elevation | Elevation in meters |
| Aspect | Aspect in degrees azimuth |
| Slope | Slope in degrees |
| Horizontal_Distance_To_Hydrology | Horizontal distance to the nearest surface water feature (in meters) |
| Vertical_Distance_To_Hydrology | Vertical distance to the nearest surface water feature (in meters) |
| Horizontal_Distance_To_Roadways | Horizontal distance to the nearest roadway (in meters) |
| Hillshade_9am | Hillshade index at 9am, summer solstice |
| Hillshade_Noon | Hillshade index at noon, summer solstice |
| Hillshade_3pm | Hillshade index at 3pm, summer solstice |
| Horizontal_Distance_To_Fire_Points | Horizontal distance to the nearest wildfire ignition point (in meters) |
| Wilderness_Area (4 binary columns) | 0 (absence) or 1 (presence) / Wilderness area designation |
| Soil_Type (40 binary columns) | 0 (absence) or 1 (presence) / Soil type designation |
| Cover_Type (target vector) | Forest cover type designation |
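
For exploration it can be handy to collapse a group of binary indicator columns back into a single integer label. The sketch below is one possible way to do that; the helper name `collapse_one_hot` and the tiny `demo` frame are hypothetical, and it assumes the indicator columns share a prefix followed by a number (as the `Soil_Type32` ... `Soil_Type40` columns in the preview suggest).

```python
import pandas as pd

def collapse_one_hot(df, prefix):
    """Return, for each row, the numeric suffix of the `prefix` column that holds the 1."""
    cols = [c for c in df.columns if c.startswith(prefix)]
    # idxmax(axis=1) gives the name of the first column containing the row maximum
    winners = df[cols].idxmax(axis=1)
    # Strip the prefix and keep the trailing number, e.g. "Soil_Type3" -> 3
    return winners.str[len(prefix):].astype(int)

# Hypothetical three-row stand-in for forest_cover
demo = pd.DataFrame({
    "Soil_Type1": [1, 0, 0],
    "Soil_Type2": [0, 0, 1],
    "Soil_Type3": [0, 1, 0],
})
soil = collapse_one_hot(demo, "Soil_Type")
print(soil.tolist())
```

A row whose indicators are all 0 would be assigned the first column of the group, so it is worth checking that every row sums to exactly 1 before relying on the result.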

The seven **cover types**, encoded as the integers 1 to 7 in `Cover_Type` in the order listed, are:

 - Spruce/Fir
 - Lodgepole Pine
 - Ponderosa Pine
 - Cottonwood/Willow
 - Aspen
 - Douglas-fir
 - Krummholz
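
To make class counts and plots easier to read, the integer codes can be mapped to names. This is a small sketch that assumes the codes in `Cover_Type` follow the order of the list above (1 = Spruce/Fir through 7 = Krummholz); the dictionary name `COVER_TYPE_NAMES` is hypothetical, and `codes` simply repeats the five labels visible in the preview.

```python
import pandas as pd

# Assumed encoding: position in the list above, starting at 1
COVER_TYPE_NAMES = {
    1: "Spruce/Fir",
    2: "Lodgepole Pine",
    3: "Ponderosa Pine",
    4: "Cottonwood/Willow",
    5: "Aspen",
    6: "Douglas-fir",
    7: "Krummholz",
}

# The Cover_Type values of the five rows shown by forest_cover.head()
codes = pd.Series([5, 5, 2, 2, 5], name="Cover_Type")
names = codes.map(COVER_TYPE_NAMES)
print(names.value_counts())
```

With the real data, `forest_cover['Cover_Type'].map(COVER_TYPE_NAMES)` would apply the same lookup to the whole column.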

The four **wilderness areas**, in the order of the binary `Wilderness_Area` columns, are:

 - Rawah Wilderness Area
 - Neota Wilderness Area
 - Comanche Peak Wilderness Area
 - Cache la Poudre Wilderness Area

The 40 **soil types**, in the order of the binary columns `Soil_Type1` to `Soil_Type40`, are:

- Cathedral family - Rock outcrop complex, extremely stony.
- Vanet - Ratake families complex, very stony.
- Haploborolls - Rock outcrop complex, rubbly.
- Ratake family - Rock outcrop complex, rubbly.
- Vanet family - Rock outcrop complex, rubbly.
- Vanet - Wetmore families - Rock outcrop complex, stony.
- Gothic family.
- Supervisor - Limber families complex.
- Troutville family, very stony.
- Bullwark - Catamount families - Rock outcrop complex, rubbly.
- Bullwark - Catamount families - Rock land complex, rubbly.
- Legault family - Rock land complex, stony.
- Catamount family - Rock land - Bullwark family complex, rubbly.
- Pachic Argiborolls - Aquolls complex.
- Unspecified in the USFS Soil and ELU Survey.
- Cryaquolls - Cryoborolls complex.
- Gateview family - Cryaquolls complex.
- Rogert family, very stony.
- Typic Cryaquolls - Borohemists complex.
- Typic Cryaquepts - Typic Cryaquolls complex.
- Typic Cryaquolls - Leighcan family, till substratum complex.
- Leighcan family, till substratum, extremely bouldery.
- Leighcan family, till substratum - Typic Cryaquolls complex.
- Leighcan family, extremely stony.
- Leighcan family, warm, extremely stony.
- Granile - Catamount families complex, very stony.
- Leighcan family, warm - Rock outcrop complex, extremely stony.
- Leighcan family - Rock outcrop complex, extremely stony.
- Como - Legault families complex, extremely stony.
- Como family - Rock land - Legault family complex, extremely stony.
- Leighcan - Catamount families complex, extremely stony.
- Catamount family - Rock outcrop - Leighcan family complex, extremely stony.
- Leighcan - Catamount families - Rock outcrop complex, extremely stony.
- Cryorthents - Rock land complex, extremely stony.
- Cryumbrepts - Rock outcrop - Cryaquepts complex.
- Bross family - Rock land - Cryumbrepts complex, extremely stony.
- Rock outcrop - Cryumbrepts - Cryorthents complex, extremely stony.
- Leighcan - Moran families - Cryaquolls complex, extremely stony.
- Moran family - Cryorthents - Leighcan family complex, extremely stony.
- Moran family - Cryorthents - Rock land complex, extremely stony.

**Step 1:** Define `X` and `y` from the `forest_cover` DataFrame, then split them into training and test sets.
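
One possible way to approach this step is sketched below. The helper `make_split`, its default arguments, and the synthetic `demo` frame are all hypothetical; the sketch assumes `Cover_Type` is the target and that `Id` is a row identifier that should not be used as a feature.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

def make_split(df, target="Cover_Type", id_cols=("Id",), test_size=0.2, seed=0):
    """Return X_train, X_test, y_train, y_test for the given DataFrame."""
    y = df[target]
    # Drop the target and any identifier columns present in the frame
    to_drop = [target] + [c for c in id_cols if c in df.columns]
    X = df.drop(columns=to_drop)
    # stratify=y keeps the class proportions similar in both subsets
    return train_test_split(X, y, test_size=test_size, stratify=y, random_state=seed)

# Synthetic stand-in so the sketch runs offline; use forest_cover in the notebook
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "Id": np.arange(1, 101),
    "Elevation": rng.integers(2000, 3500, size=100),
    "Slope": rng.integers(0, 40, size=100),
    "Cover_Type": np.repeat([1, 2, 3, 4, 5], 20),
})
X_train, X_test, y_train, y_test = make_split(demo)
print(X_train.shape, X_test.shape)
```

In the notebook the call would be `make_split(forest_cover)`. The 80/20 split and the stratification are choices made for this sketch, not requirements of the exercise.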

**Step 2:** Train a Random Forest classifier on the training set, and use *accuracy* to evaluate its performance on the test set.
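
A minimal sketch of this step with scikit-learn follows. To stay runnable offline it uses a synthetic dataset from `make_classification` as a stand-in; in the notebook, `X_train`, `X_test`, `y_train`, and `y_test` would come from the split of `forest_cover`. The hyperparameter values are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the forest cover features and labels
X, y = make_classification(n_samples=600, n_features=12, n_informative=6,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Fit the forest on the training set only
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)

# Accuracy = fraction of test samples whose predicted label is correct
y_pred = rf.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
print(f"test accuracy: {test_accuracy:.3f}")
```

For a classifier, `rf.score(X_test, y_test)` returns the same accuracy without the explicit `predict` call.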

**Step 3:** Use *grid search* to tune the hyperparameters of your Random Forest.
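
The sketch below shows one way to run a cross-validated grid search with scikit-learn's `GridSearchCV`. The parameter grid is a deliberately small, illustrative assumption, and the data is again a synthetic stand-in so the example runs offline; with the real data the grid and the number of folds would likely need adjusting.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the forest cover features and labels
X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Every combination of these values is tried: 2 * 2 * 2 = 8 candidates
param_grid = {
    "n_estimators": [25, 50],
    "max_depth": [None, 5],
    "max_features": ["sqrt", 0.5],
}

# Each candidate is scored by 3-fold cross-validated accuracy on the training set
grid = GridSearchCV(RandomForestClassifier(random_state=0), param_grid,
                    scoring="accuracy", cv=3)
grid.fit(X_train, y_train)

print("best parameters:", grid.best_params_)
print("best CV accuracy:", round(grid.best_score_, 3))

# By default the best candidate is refit on the whole training set,
# so grid.score evaluates that refit model on the held-out test set
test_accuracy = grid.score(X_test, y_test)
print("test accuracy:", round(test_accuracy, 3))
```

Keeping the test set out of the search means the final accuracy is measured on data that played no part in choosing the hyperparameters.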