# Home Sale Forecasting
In this tutorial, we demonstrate how to use the forecasting capability of EvaDB to predict home sale prices.
<table align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/georgia-tech-db/eva/blob/master/tutorials/16-homesale-forecasting.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" /> Run on Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/georgia-tech-db/eva/blob/master/tutorials/16-homesale-forecasting.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" /> View source on GitHub</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/georgia-tech-db/eva/raw/master/tutorials/16-homesale-forecasting.ipynb"><img src="https://www.tensorflow.org/images/download_logo_32px.png" /> Download notebook</a>
  </td>
</table><br><br>

## Setup
We first set up the backend PostgreSQL database and EvaDB to bring AI inside database systems.

### Start Postgres

In [1]:
!apt -qq install postgresql
!service postgresql start


### Create User and Database

In [2]:
!sudo -u postgres psql -c "CREATE USER eva WITH SUPERUSER PASSWORD 'password'"
!sudo -u postgres psql -c "CREATE DATABASE evadb"

CREATE ROLE
CREATE DATABASE


### Prettify Output

In [3]:
import warnings
warnings.filterwarnings("ignore")

### Install EvaDB

We install EvaDB with the extra Postgres and forecasting dependencies.

In [4]:
%pip install --quiet "evadb[postgres,forecasting] @ git+https://github.com/georgia-tech-db/evadb.git@e19f13da3624bc8bb19292b121a12fa8c5ea6b36"

import evadb
cursor = evadb.connect().cursor()



## Prepare Data
We then prepare the dataset used in this time series forecasting use case.

### Create Data Source in EvaDB
We use a data source to connect EvaDB directly to an underlying database system such as Postgres.

In [5]:
params = {
    "user": "eva",
    "password": "password",
    "host": "localhost",
    "port": "5432",
    "database": "evadb",
}
query = f"CREATE DATABASE postgres_data WITH ENGINE = 'postgres', PARAMETERS = {params};"
cursor.query(query).df()

Unnamed: 0,0
0,The database postgres_data has been successful...
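
To see what the f-string in the cell above sends to EvaDB, the statement can be rendered without a running database. This is a minimal sketch that only formats the string; nothing here calls EvaDB, and the quoting shown follows from how Python prints a `dict`.

```python
# Render the CREATE DATABASE statement exactly as the f-string in the cell does.
params = {
    "user": "eva",
    "password": "password",
    "host": "localhost",
    "port": "5432",
    "database": "evadb",
}

# Python formats the dict with its repr, so keys and values appear in single quotes.
query = f"CREATE DATABASE postgres_data WITH ENGINE = 'postgres', PARAMETERS = {params};"
print(query)
```

Printing the query before running it is a cheap way to spot a wrong port or database name.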


### Load the Datasets
We load the [House Property Sales Time Series](https://www.kaggle.com/datasets/htagholdings/property-sales?resource=download) dataset into our PostgreSQL database.

In [6]:
!mkdir -p /content
!wget -qnc -O /content/home_sales.csv "https://www.dropbox.com/scl/fi/2e9yyzymm0rwzria2kvzo/raw_sales.csv?rlkey=lfdr9th7csw7ru42mtaw00hx1&dl=0"

In [7]:
cursor.query("""
  USE postgres_data {
    CREATE TABLE IF NOT EXISTS home_sales (datesold VARCHAR(64), postcode INT, price INT, propertyType VARCHAR(64), bedrooms INT)
  }
""").df()

Unnamed: 0,status
0,success


In [8]:
cursor.query("""
  USE postgres_data {
    COPY home_sales(datesold, postcode, price, propertyType, bedrooms)
    FROM '/content/home_sales.csv'
    DELIMITER ',' CSV HEADER
  }
""").df()

Unnamed: 0,status
0,success


### Preview the Data
The `home_sales` table contains 5 columns.
- postcode: 4-digit postcode of the suburb where the property was sold
- price: Price for which the property was sold
- bedrooms: Number of bedrooms
- datesold: Date on which the property was sold
- propertytype: Property type, i.e., house or unit

In [9]:
cursor.query("SELECT * FROM postgres_data.home_sales LIMIT 3;").df()

Unnamed: 0,postcode,price,bedrooms,datesold,propertytype
0,2607,525000,4,2007-02-07 00:00:00,house
1,2906,290000,3,2007-02-27 00:00:00,house
2,2905,328000,3,2007-03-07 00:00:00,house
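
Before relying on the loaded table, it can help to check the CSV locally with pandas. The sketch below uses the three preview rows above as an in-memory stand-in for `/content/home_sales.csv`. The column order is an assumption taken from the `COPY` statement, and the check itself is not part of EvaDB.

```python
import io

import pandas as pd

# Three rows copied from the preview, arranged in the column order used by
# the COPY statement (assumed to match the CSV header).
sample_csv = io.StringIO(
    "datesold,postcode,price,propertyType,bedrooms\n"
    "2007-02-07 00:00:00,2607,525000,house,4\n"
    "2007-02-27 00:00:00,2906,290000,house,3\n"
    "2007-03-07 00:00:00,2905,328000,house,3\n"
)

# Parse the sale date as a timestamp so that it can serve as the time column.
df = pd.read_csv(sample_csv, parse_dates=["datesold"])

# Verify that every column the CREATE TABLE statement expects is present.
expected = ["datesold", "postcode", "price", "propertyType", "bedrooms"]
missing = [name for name in expected if name not in df.columns]

print(df.dtypes)
print("missing columns:", missing)
```

To run the same check on the real file, replace `sample_csv` with the path to the downloaded CSV.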


## Analyze Data with EvaDB

We then use EvaDB to train a model to forecast the home price.

### Train the Forecast Model
We use the [statsforecast](https://github.com/Nixtla/statsforecast) engine to train a time series forecast model for the sale prices of homes with three bedrooms in the postcode `2607` area.

In [10]:
cursor.query("""
  CREATE OR REPLACE FUNCTION HomeSaleForecast FROM
    (
      SELECT propertytype, datesold, price
      FROM postgres_data.home_sales
      WHERE bedrooms = 3 AND postcode = 2607
    )
  TYPE Forecasting
  PREDICT 'price'
  HORIZON 3
  TIME 'datesold'
  ID 'propertytype'
  FREQUENCY 'W'
""").df()

Unnamed: 0,0
0,Function HomeSaleForecast added to the database.
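
statsforecast works on a long-format table with the columns `unique_id`, `ds`, and `y`. The sketch below shows how the `ID`, `TIME`, and `PREDICT` clauses could map onto that layout. It illustrates the idea with sample rows from the preview and is not EvaDB's actual implementation; the helper name `to_long_format` is hypothetical.

```python
import pandas as pd


def to_long_format(df, id_col, time_col, target_col):
    """Rename columns to the unique_id/ds/y layout used by statsforecast."""
    out = df[[id_col, time_col, target_col]].rename(
        columns={id_col: "unique_id", time_col: "ds", target_col: "y"}
    )
    # The time column must be a real timestamp for frequency-based forecasting.
    out["ds"] = pd.to_datetime(out["ds"])
    return out.sort_values(["unique_id", "ds"]).reset_index(drop=True)


# Sample rows taken from the preview of the home_sales table.
rows = pd.DataFrame(
    {
        "propertytype": ["house", "house", "house"],
        "datesold": ["2007-02-07", "2007-02-27", "2007-03-07"],
        "price": [525000, 290000, 328000],
    }
)

long_df = to_long_format(rows, "propertytype", "datesold", "price")
print(long_df)
```

Each distinct value of the `ID` column becomes its own time series, which is why the forecast returns separate rows per property type.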


### Use the Forecast Model
We then use the trained `HomeSaleForecast` model to predict the sale price of homes with three bedrooms for the next three weeks.

In [11]:
cursor.query("SELECT HomeSaleForecast();").df()

Unnamed: 0,propertytype,datesold,price
0,house,2019-07-21,766572.9375
1,house,2019-07-28,766572.9375
2,house,2019-08-04,766572.9375
3,unit,2018-12-23,417229.78125
4,unit,2018-12-30,409601.65625
5,unit,2019-01-06,402112.96875


We can use `ORDER BY` to find the property types and weeks with the lowest predicted prices.

In [None]:
cursor.query("SELECT HomeSaleForecast() ORDER BY price;").df()

Unnamed: 0,propertytype,datesold,price
0,unit,2019-01-06,402112.96875
1,unit,2018-12-30,409601.65625
2,unit,2018-12-23,417229.78125
3,house,2019-07-21,766572.9375
4,house,2019-07-28,766572.9375
5,house,2019-08-04,766572.9375
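
Because `.df()` returns a pandas DataFrame, the result can also be processed in Python. As a minimal sketch, the snippet below rebuilds the forecast shown above by hand and picks the cheapest predicted week for each property type. In a live session the DataFrame would come from `cursor.query(...).df()` instead.

```python
import pandas as pd

# Forecast rows copied from the query output above.
forecast = pd.DataFrame(
    {
        "propertytype": ["house", "house", "house", "unit", "unit", "unit"],
        "datesold": [
            "2019-07-21", "2019-07-28", "2019-08-04",
            "2018-12-23", "2018-12-30", "2019-01-06",
        ],
        "price": [
            766572.9375, 766572.9375, 766572.9375,
            417229.78125, 409601.65625, 402112.96875,
        ],
    }
)

# idxmin returns the row label of the lowest price within each group;
# on ties (the flat house forecast) it keeps the first occurrence.
cheapest = forecast.loc[forecast.groupby("propertytype")["price"].idxmin()]
print(cheapest)
```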


### Try Neuralforecast Instead
By default, EvaDB uses [statsforecast](https://github.com/Nixtla/statsforecast) for time series forecasting. In the above example, we notice that the predictions for `propertytype = house` are flat. Let's try [Neuralforecast](https://github.com/Nixtla/neuralforecast) instead to improve the prediction. We turn off the automatic hyperparameter optimization (`AUTO 'F'`) for faster results.

In [12]:
cursor.query("""
  CREATE OR REPLACE FUNCTION HomeSaleNeuralForecast FROM
    (
      SELECT propertytype, datesold, price
      FROM postgres_data.home_sales
      WHERE bedrooms = 3 AND postcode = 2607
    )
  TYPE Forecasting
  LIBRARY 'neuralforecast'
  PREDICT 'price'
  HORIZON 3
  TIME 'datesold'
  ID 'propertytype'
  FREQUENCY 'W'
  AUTO 'F'
""").df()

Unnamed: 0,0
0,Function HomeSaleNeuralForecast added to the d...


As the following query shows, Neuralforecast no longer produces flat predictions for the home price. The better prediction comes with a trade-off: the training time (28 minutes) is much longer than with statsforecast (21 seconds).

In [13]:
cursor.query("SELECT HomeSaleNeuralForecast() ORDER BY price;").df()


Unnamed: 0,propertytype,datesold,price
0,unit,2018-12-23,498183.21875
1,unit,2019-01-06,499462.1875
2,unit,2018-12-30,525323.25
3,house,2019-07-28,693413.5
4,house,2019-08-04,714673.5625
5,house,2019-07-21,725615.25
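
The "flat prediction" observation can be checked programmatically. The sketch below is one simple way to do it, using the two forecast outputs shown in this tutorial: a series counts as flat when all of its predicted values are identical. The helper name `is_flat` is hypothetical.

```python
from collections import defaultdict


def is_flat(rows):
    """Map each series id to True when all of its forecast values are identical."""
    values = defaultdict(set)
    for series_id, price in rows:
        values[series_id].add(price)
    return {series_id: len(prices) == 1 for series_id, prices in values.items()}


# (propertytype, price) pairs copied from the statsforecast output.
stats_rows = [
    ("house", 766572.9375), ("house", 766572.9375), ("house", 766572.9375),
    ("unit", 417229.78125), ("unit", 409601.65625), ("unit", 402112.96875),
]

# (propertytype, price) pairs copied from the Neuralforecast output.
neural_rows = [
    ("unit", 498183.21875), ("unit", 499462.1875), ("unit", 525323.25),
    ("house", 693413.5), ("house", 714673.5625), ("house", 725615.25),
]

print("statsforecast:", is_flat(stats_rows))
print("neuralforecast:", is_flat(neural_rows))
```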


### Predict the Home Sale Price at Different Postcodes

By choosing a different ID column (i.e., `ID 'postcode'`), we can predict the home sale price per postcode rather than per property type.


In [15]:
cursor.query("""
  CREATE OR REPLACE FUNCTION HomeSalePostcodeForecast FROM
    (
      SELECT postcode, datesold, price
      FROM postgres_data.home_sales
      WHERE bedrooms = 3 AND propertytype = "house"
    )
  TYPE Forecasting
  PREDICT 'price'
  HORIZON 3
  TIME 'datesold'
  ID 'postcode'
  FREQUENCY 'W'
""").df()

Unnamed: 0,0
0,Function HomeSalePostcodeForecast added to the...
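
The three training queries in this tutorial differ only in a few clauses, so they can be generated from parameters. The helper below is a hypothetical convenience, not part of EvaDB. It only assembles a string, and it emits the clauses in the same order as the queries above on the assumption that this order is what the parser expects.

```python
def build_forecast_query(name, select_sql, predict, time, series_id,
                         horizon=3, frequency="W", library=None, auto=None):
    """Assemble a CREATE FUNCTION ... TYPE Forecasting statement as a string."""
    lines = [
        f"CREATE OR REPLACE FUNCTION {name} FROM",
        f"  ({select_sql})",
        "TYPE Forecasting",
    ]
    # LIBRARY is optional; the tutorial omits it when using statsforecast.
    if library is not None:
        lines.append(f"LIBRARY '{library}'")
    lines += [
        f"PREDICT '{predict}'",
        f"HORIZON {horizon}",
        f"TIME '{time}'",
        f"ID '{series_id}'",
        f"FREQUENCY '{frequency}'",
    ]
    # AUTO is optional; the tutorial sets it to 'F' for Neuralforecast.
    if auto is not None:
        lines.append(f"AUTO '{auto}'")
    return "\n".join(lines)


postcode_query = build_forecast_query(
    name="HomeSalePostcodeForecast",
    select_sql=(
        "SELECT postcode, datesold, price FROM postgres_data.home_sales "
        'WHERE bedrooms = 3 AND propertytype = "house"'
    ),
    predict="price",
    time="datesold",
    series_id="postcode",
)
print(postcode_query)
```

The resulting string could then be passed to `cursor.query(...)` in the same way as the handwritten queries.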


We can now use the following query to find the postcodes and dates with the lowest predicted prices.

In [24]:
cursor.query("""
  SELECT * FROM (
    SELECT HomeSalePostcodeForecast()
  ) AS T
  WHERE price > 0
  ORDER BY price
  LIMIT 5;
""").df()

Unnamed: 0,postcode,datesold,price
0,2609,2017-07-02,255000.0
1,2609,2017-06-25,255000.0
2,2609,2017-06-18,255000.0
3,2616,2011-03-06,391000.0
4,2616,2011-03-13,391000.0
