<img src="https://docs.actable.ai/_images/logo.png" style="object-fit: cover; max-width:100%; height:300px;" />

# AAIRegressionTask

This notebook shows how to run a regression automatically with
[Actable AI](https://actable.ai).

In this example we train a model to predict the rental prices of apartments and
then use it to predict the rental prices of new apartments.

### Imports

This part imports the Python modules.
The last line imports the regression task from actableai.

In [11]:
import pandas as pd

from actableai.tasks.regression import AAIRegressionTask

### Importing the data

This part loads the first 100 rows of the data and splits them into two parts.\
The first part is used for training, and the second part
is used to showcase the predictive power of the newly generated model.

In [12]:
df = pd.read_csv("https://raw.githubusercontent.com/Actable-AI/public-datasets/master/apartments.csv").head(100)
train_ratio = 0.8
df_train = df.iloc[:int(train_ratio * len(df))]
df_prediction = df.iloc[int(train_ratio * len(df)):]
print(f"Number of features : {df.shape[1]}, Number of rows : {df.shape[0]}")
df.head(5)

Number of features : 8, Number of rows : 100


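| | number_of_rooms | number_of_bathrooms | sqft | location | days_on_market | initial_price | neighborhood | rental_price |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 4848 | great | 10 | 2271 | south_side | 2271.0 |
| 1 | 1 | 1 | 674 | good | 1 | 2167 | | 2167.0 |
| 2 | 1 | 1 | 554 | poor | 19 | 1883 | | 1883.0 |
| 3 | 0 | 1 | 529 | great | 3 | 2431 | | 2431.0 |
| 4 | 3 | 2 | 1219 | great | 3 | 5510 | | 5510.0 |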
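The split above is positional rather than random: the first 80% of the rows go to training and the rest are held out. The sketch below reproduces that arithmetic on plain row indices. `sequential_split` is a hypothetical helper written for illustration; it is not part of pandas or actableai.

```python
def sequential_split(rows, train_ratio):
    """Split a sequence by position: the first part for training, the rest held out."""
    cut = int(train_ratio * len(rows))  # same cut point as the iloc slices above
    return rows[:cut], rows[cut:]


# Stand-in for the 100 row labels of the DataFrame
row_ids = list(range(100))
train_ids, prediction_ids = sequential_split(row_ids, 0.8)

print(f"Training rows: {len(train_ids)}, held-out rows: {len(prediction_ids)}")
print(f"First held-out row label: {prediction_ids[0]}")
```

Because no shuffling takes place, the held-out rows are simply the last ones in the file, so any ordering in the CSV carries over to the split.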


### Calling Actable AI task

This part calls the Actable AI regression analysis.\
To learn more about the available parameters, consult the [API Documentation](https://lib.actable.ai/actableai.tasks.html#module-actableai.tasks.classification).

In [None]:
# Here df is the DataFrame containing our training data (df_train)
# target is "rental_price" because we want to predict the rental price
# features set to None means that every available feature is used
result = AAIRegressionTask().run(
    df=df_train,
    target="rental_price",
    features=None,
)

### Evaluation of the generated model

In this part we take a look at the metrics computed for the model on the validation set.\
The validation set is created internally, so you don't need to specify it.

In [14]:
evaluation = result["data"]["evaluate"]
metrics = evaluation["metrics"]
print(metrics)
pd.DataFrame(result["data"]["importantFeatures"])

                    metric      value
0  Root Mean Squared Error  86.478663
1                       R2   0.995603
2      Mean Absolute Error  56.216107
3    Median Absolute Error  32.314331


| | feature | importance | p_value |
|---|---|---|---|
| 0 | initial_price | 1.139217 | 0.000546 |
| 1 | number_of_rooms | 0.064834 | 0.002703 |
| 2 | number_of_bathrooms | 0.002851 | 0.004495 |
| 3 | location | 0.002824 | 0.00297 |
| 4 | days_on_market | 0.002626 | 0.013342 |
| 5 | sqft | 0.001197 | 0.003808 |
| 6 | neighborhood | -0.000272 | 0.980925 |


### Prediction with the generated model

Finally, we show how to use the generated model to make predictions\
on unseen data. In this example the held-out rows already contain the true rental prices,\
but the same call works for any new incoming data points.

In [15]:
model = result["model"]
prediction = model.predict(df_prediction)
# Work on a copy so that adding a column does not trigger pandas' SettingWithCopyWarning
df_prediction = df_prediction.copy()
df_prediction["Predicted rental_price"] = prediction
df_prediction.head(5)


| | number_of_rooms | number_of_bathrooms | sqft | location | days_on_market | initial_price | neighborhood | rental_price | Predicted rental_price |
|---|---|---|---|---|---|---|---|---|---|
| 80 | 0 | 1 | 245 | great | 4 | 2094 | south_side | 2094.0 | 2100.2771 |
| 81 | 3 | 2 | 1216 | great | 5 | 5495 | south_side | 5495.0 | 5282.26123 |
| 82 | 0 | 1 | 381 | poor | 28 | 1483 | westbrae | 1459.272 | 1395.069336 |
| 83 | 2 | 1 | 819 | great | 7 | 3806 | south_side | 3806.0 | 3575.249756 |
| 84 | 2 | 1 | 787 | good | 9 | 3332 | downtown | 3332.0 | 3295.577881 |
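Because the held-out rows already contain `rental_price`, the four metrics reported in the evaluation section can also be computed by hand for these predictions. The sketch below uses the standard textbook definitions and only the five rows displayed above. `regression_metrics` is a hypothetical helper, not an actableai function, and the library's own computation is assumed, not confirmed, to follow the same definitions.

```python
import math
import statistics


def regression_metrics(y_true, y_pred):
    """Return RMSE, R2, mean and median absolute error for paired values."""
    y_true = [float(v) for v in y_true]
    y_pred = [float(v) for v in y_pred]
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    abs_errors = [abs(e) for e in errors]
    mean_true = statistics.fmean(y_true)
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares

    return {
        "Root Mean Squared Error": math.sqrt(ss_res / len(errors)),
        # R2 is undefined when all true values are identical
        "R2": 1 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
        "Mean Absolute Error": statistics.fmean(abs_errors),
        "Median Absolute Error": statistics.median(abs_errors),
    }


# The five held-out rows displayed above (actual vs. predicted rental price)
actual = [2094.0, 5495.0, 1459.272, 3806.0, 3332.0]
predicted = [2100.2771, 5282.26123, 1395.069336, 3575.249756, 3295.577881]

holdout_metrics = regression_metrics(actual, predicted)
for name, value in holdout_metrics.items():
    print(f"{name}: {value:.3f}")
```

In the notebook itself, the two lists can be replaced with `df_prediction["rental_price"]` and `df_prediction["Predicted rental_price"]` to score all held-out rows. Five rows are too few for a reliable estimate, so these numbers should not be expected to match the validation metrics.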
