# Getting started: Regression

This tutorial uses Safe-DS on **house sales data** to predict house prices.


1. Load your data into a `Table`. The data is available under `docs/tutorials/data/house_sales.csv`:


In [56]:
from safeds.data.tabular.containers import Table

pricing = Table.from_csv_file("data/house_sales.csv")
# For visualisation purposes we only print out the first 15 rows.
pricing.slice_rows(0, 15)

2. Split the house sales dataset into two tables: a training set containing 60% of the data, which we will use later to train a model that predicts the house price, and a testing set containing the rest of the data.
Remove the `price` column from the test set so that it can be predicted later:


In [57]:
split_tuple = pricing.split(0.60)

train_table = split_tuple[0]
testing_table = split_tuple[1]

test_table = testing_table.remove_columns(["price"]).shuffle_rows()
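To make the 60/40 ratio concrete, here is a minimal plain-Python sketch of a ratio-based split. It is an illustration only, not Safe-DS's implementation: the helper `split_rows` is hypothetical, and it cuts the rows in their existing order, which may differ from what `Table.split` does internally.

```python
def split_rows(rows, ratio):
    """Split a list of rows into two lists at the given ratio (hypothetical helper)."""
    if not 0 < ratio < 1:
        raise ValueError("ratio must be strictly between 0 and 1")
    # Index of the first row that belongs to the second part.
    cut = round(len(rows) * ratio)
    return rows[:cut], rows[cut:]


rows = list(range(10))  # stand-in for ten table rows
train_rows, test_rows = split_rows(rows, 0.60)
print(len(train_rows), len(test_rows))
```

With ten rows and a ratio of 0.60, six rows land in the training part and four in the testing part.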

3. Tag the `price` `Column` as the target variable to be predicted. Use all remaining `Column`s except `id` as features, which the model will use to predict the target:


In [58]:
# Every column except the target ("price") and the identifier ("id") is a feature.
feature_columns = set(train_table.column_names) - {"price", "id"}

tagged_train_table = train_table.tag_columns("price", feature_names=[*feature_columns])
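Because a Python `set` has no guaranteed order, the feature list built above can come out in a different order from run to run. If a stable order matters to you, one option is to filter the column names with a list comprehension instead. The sketch below uses plain Python, and the column names in it are made up for illustration; they are not taken from the dataset.

```python
# Hypothetical column names, used only to illustrate the filtering.
column_names = ["id", "price", "year", "sqft_living", "bedrooms"]
excluded = {"price", "id"}

# Keep the original column order while dropping the excluded names.
feature_names = [name for name in column_names if name not in excluded]
print(feature_names)
```

The resulting list could be passed as `feature_names` in the same way as the unpacked set.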


4. Use a `DecisionTree` regressor as the model for the regression. Pass the `tagged_train_table` to the `fit` function of the model:


In [59]:
from safeds.ml.classical.regression import DecisionTree

model = DecisionTree()
fitted_model = model.fit(tagged_train_table)

5. Use the fitted decision tree regression model, which we trained on the training dataset, to predict the price of each house in the test dataset:


In [60]:
prediction = fitted_model.predict(
    test_table
)
# For visualisation purposes we only print out the first 15 rows.
prediction.slice_rows(0, 15)

6. You can compute the mean absolute error of the model on the initial `testing_table` as follows:


In [61]:
tagged_test_table = testing_table.tag_columns("price", feature_names=[*feature_columns])

fitted_model.mean_absolute_error(tagged_test_table)


105595.6001735107
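The mean absolute error is the average of the absolute differences between the predicted and the actual prices, so the value above means the predictions are off by roughly 105,600 (in the dataset's price unit) on average. The following plain-Python sketch shows the calculation on three made-up price pairs. The function and the numbers are illustrative assumptions, not Safe-DS code or values from the dataset.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one value is required")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up prices for illustration only.
actual = [200_000, 350_000, 500_000]
predicted = [210_000, 330_000, 560_000]

mae = mean_absolute_error(actual, predicted)
print(mae)  # (10_000 + 20_000 + 60_000) / 3 = 30000.0
```

A lower value indicates predictions that are closer to the actual prices on average.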