**This notebook is an exercise in the [Introduction to Machine Learning](https://www.kaggle.com/learn/intro-to-machine-learning) course.  You can reference the tutorial at [this link](https://www.kaggle.com/dansbecker/your-first-machine-learning-model).**

---


## Recap
So far, you have loaded your data and reviewed it with the following code. Run this cell to set up your coding environment where the previous step left off.

In [1]:
# Code you have previously used to load data
import pandas as pd  # import the pandas library under the alias pd

# Path of the file to read
iowa_file_path = '../input/home-data-for-ml-course/train.csv'  # path of the CSV file to be read

home_data = pd.read_csv(iowa_file_path)  # read train.csv and store the result in home_data

# Set up code checking
from learntools.core import binder  # import the binder module from learntools.core
binder.bind(globals())  # call the bind() function from the binder module
from learntools.machine_learning.ex3 import *  # import every object from the ex3 module of learntools.machine_learning

print("Setup Complete")  # print "Setup Complete" to the output

Setup Complete


# Exercises

## Step 1: Specify Prediction Target
Select the target variable, which corresponds to the sales price. Save this to a new variable called `y`. You'll need to print a list of the columns to find the name of the column you need.


In [None]:
# print the list of columns in the dataset to find the name of the prediction target
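
One way to list the columns, sketched on a tiny stand-in DataFrame. The `toy_data` frame and its values are made up for illustration; in this notebook you would call the same attribute on `home_data`.

```python
import pandas as pd

# Hypothetical stand-in for home_data; the values are toy numbers, not real data.
toy_data = pd.DataFrame({
    "LotArea": [8000, 9500, 11000],
    "YearBuilt": [2003, 1976, 2001],
    "SalePrice": [200000, 180000, 220000],
})

# .columns holds the column labels; in this toy frame the target is the last one.
column_names = list(toy_data.columns)
print(column_names)
```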


In [8]:
y = home_data.SalePrice  # take the "SalePrice" column from the home_data DataFrame and store it in y

# Check your answer
step_1.check()  # call the check() method of the step_1 object

<span style="color:#33cc33">Correct</span>

In [7]:
# The lines below will show you a hint or the solution.
step_1.hint() 
step_1.solution()

<span style="color:#3366cc">Hint:</span> Use `print(home_data.columns)`. The column you want is at the end of the list. Use the dot notation to pull out this column from the DataFrame

<span style="color:#33cc99">Solution:</span> 
```python
y = home_data.SalePrice
```

## Step 2: Create X
Now you will create a DataFrame called `X` holding the predictive features.

Since you want only some columns from the original data, you'll first create a list with the names of the columns you want in `X`.

You'll use just the following columns in the list (you can copy and paste the whole list to save some typing, though you'll still need to add quotes):
  * LotArea
  * YearBuilt
  * 1stFlrSF
  * 2ndFlrSF
  * FullBath
  * BedroomAbvGr
  * TotRmsAbvGrd

After you've created that list of features, use it to create the DataFrame that you'll use to fit the model.

In [13]:
# Create the list of features below
feature_names = ["LotArea", "YearBuilt", "1stFlrSF", "2ndFlrSF",
                      "FullBath", "BedroomAbvGr", "TotRmsAbvGrd"]  # list holding the names of the feature columns

# Select data corresponding to features in feature_names
X = home_data[feature_names]  # X holds the feature columns selected from home_data

# Check your answer
step_2.check()  # call the check() method of the step_2 object

<span style="color:#33cc33">Correct</span>

In [12]:
step_2.hint()
step_2.solution()

<span style="color:#3366cc">Hint:</span> Capitalization and spelling are important when specifying variable names. Use the brackets notation when specifying data for X.

<span style="color:#33cc99">Solution:</span> 
```python
feature_names = ["LotArea", "YearBuilt", "1stFlrSF", "2ndFlrSF",
                      "FullBath", "BedroomAbvGr", "TotRmsAbvGrd"]

X=home_data[feature_names]
```
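
The hint recommends bracket notation for `X`. A minimal sketch of why, using a made-up `toy_data` frame (hypothetical names and values): column names such as `1stFlrSF` start with a digit, so they are not valid Python identifiers and cannot be reached with dot notation.

```python
import pandas as pd

# Hypothetical stand-in for home_data; the values are toy numbers.
toy_data = pd.DataFrame({
    "LotArea": [8000, 9500, 11000],
    "1stFlrSF": [850, 1200, 900],
    "2ndFlrSF": [800, 0, 850],
})

# Dot notation works only for names that are valid Python identifiers.
lot_area = toy_data.LotArea

# toy_data.1stFlrSF would be a SyntaxError, so use brackets instead.
first_floor = toy_data["1stFlrSF"]

# A list of names inside the brackets returns a DataFrame with those columns.
toy_feature_names = ["LotArea", "1stFlrSF", "2ndFlrSF"]
X_toy = toy_data[toy_feature_names]
print(X_toy.shape)
```

A single name in brackets gives a Series, while a list of names gives a DataFrame, which is the shape a scikit-learn model expects for its features.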

## Review Data
Before building a model, take a quick look at **X** to verify that it looks sensible.

In [None]:
# Review data
# print description or statistics from X
#print(_)

# print the top few lines
#print(_)
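
A possible way to fill in the review cell, sketched on a small made-up frame called `X_toy` (hypothetical values). In the notebook, the same two calls would be made on `X`.

```python
import pandas as pd

# Hypothetical stand-in for X; the values are toy numbers.
X_toy = pd.DataFrame({
    "LotArea": [8000, 9500, 11000, 9000],
    "YearBuilt": [2003, 1976, 2001, 1915],
    "FullBath": [2, 2, 2, 1],
})

# describe() reports count, mean, std, min, quartiles and max per numeric column.
summary = X_toy.describe()
print(summary)

# head() returns the first five rows by default (all four rows here).
top_rows = X_toy.head()
print(top_rows)
```

Things worth scanning for in the summary include a `count` lower than the number of rows (missing values) and minimum or maximum values that look implausible.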

## Step 3: Specify and Fit Model
Create a `DecisionTreeRegressor` and save it to `iowa_model`. Ensure you've done the relevant import from sklearn to run this command.

Then fit the model you just created using the data in `X` and `y` that you saved above.

In [15]:
from sklearn.tree import DecisionTreeRegressor  # import the DecisionTreeRegressor class from scikit-learn's tree module
# Specify the model.
# For model reproducibility, set a numeric value for random_state when specifying the model
iowa_model = DecisionTreeRegressor(random_state=1)  # create a decision tree regressor and store it in iowa_model

# Fit the model
iowa_model.fit(X, y)  # have iowa_model learn patterns from the features in X and the target y

# Check your answer
step_3.check()  # call the check() method of the step_3 object

<span style="color:#33cc33">Correct</span>

In [14]:
# step_3.hint()
step_3.solution()

<span style="color:#33cc99">Solution:</span> 
```python
from sklearn.tree import DecisionTreeRegressor
iowa_model = DecisionTreeRegressor(random_state=1)
iowa_model.fit(X, y)
```
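
The reason for fixing `random_state` can be illustrated with a small sketch. The data below (`X_toy`, `y_toy`) is made up, and the two model names are hypothetical; the point is only that two models built with the same `random_state` and fitted on the same data give the same predictions.

```python
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

# Hypothetical toy data; the values are not from the course dataset.
X_toy = pd.DataFrame({
    "LotArea": [8000, 9500, 11000, 9000, 14000],
    "YearBuilt": [2003, 1976, 2001, 1915, 2000],
})
y_toy = pd.Series([200000, 180000, 220000, 140000, 250000], name="SalePrice")

# Two separate models with the same seed, fitted on the same data.
model_a = DecisionTreeRegressor(random_state=1).fit(X_toy, y_toy)
model_b = DecisionTreeRegressor(random_state=1).fit(X_toy, y_toy)

# With identical seeds and data, the predictions agree element by element.
same_predictions = bool((model_a.predict(X_toy) == model_b.predict(X_toy)).all())
print(same_predictions)
```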

## Step 4: Make Predictions
Make predictions with the model's `predict` command using `X` as the data. Save the results to a variable called `predictions`.

In [17]:
predictions = iowa_model.predict(X)  # use the fitted iowa_model to make predictions for the rows in X
print(predictions)  # print the model's predictions

# Check your answer
step_4.check()  # call the check() method of the step_4 object

[208500. 181500. 223500. ... 266500. 142125. 147500.]


<span style="color:#33cc33">Correct</span>

In [16]:
# step_4.hint()
step_4.solution()

<span style="color:#33cc99">Solution:</span> 
```python
iowa_model.predict(X)
```

## Think About Your Results

Use the `head` method to compare the top few predictions to the actual home values (in `y`) for those same homes. Anything surprising?


In [None]:
# You can write code in this cell
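
One way to set up this comparison, sketched on made-up data (`X_toy`, `y_toy`, and the `comparison` frame are hypothetical names). In the notebook, the same idea would use `y.head()` alongside the first few entries of `predictions`.

```python
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

# Hypothetical toy data with five distinct homes; values are not real.
X_toy = pd.DataFrame({
    "LotArea": [8000, 9500, 11000, 9000, 14000],
    "YearBuilt": [2003, 1976, 2001, 1915, 2000],
})
y_toy = pd.Series([200000, 180000, 220000, 140000, 250000], name="SalePrice")

model = DecisionTreeRegressor(random_state=1)
model.fit(X_toy, y_toy)
toy_predictions = model.predict(X_toy)

# Put the actual values and the predictions for the same rows side by side.
comparison = pd.DataFrame({
    "actual": y_toy.head(),
    "predicted": toy_predictions[:5],
})
print(comparison)
```

In this sketch the model is predicting the same rows it was trained on, and an unconstrained decision tree can usually reproduce those targets, so a close match here says little about how the model would do on homes it has not seen.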


It's natural to ask how accurate the model's predictions will be and how you can improve that. That will be your next step.

# Keep Going

You are ready for **[Model Validation](https://www.kaggle.com/dansbecker/model-validation).**


---




*Have questions or comments? Visit the [course discussion forum](https://www.kaggle.com/learn/intro-to-machine-learning/discussion) to chat with other learners.*