# House Prices

In [None]:
from tidychef import acquire, preview
from tidychef.selection import XlsSelectable

table: XlsSelectable = acquire.xls.http("https://raw.githubusercontent.com/mikeAdamss/tidychef/main/tests/fixtures/xls/house-prices.xls", tables="Table 11")
preview(table, bounded="A1:M20")

From an xls source, which can be [downloaded here](https://raw.githubusercontent.com/mikeAdamss/tidychef/main/tests/fixtures/xls/house-prices.xls).

## Requirements

- We'll take "Year" and "Quarter" from the appropriate values in columns B and C.
- We'll take populated cells on row 4 as "Housing" and we'll strip the "4" notation away.
- We'll take "Area" and "Area Code" from column A (see United Kingdom and K02000001 as the examples).
- We'll call the observations column "Value" and we'll strip any trailing ".0"s.
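
The two "strip" requirements above can be sketched in plain Python, independent of tidychef. The helper names and the sample strings are hypothetical and chosen only for illustration. The point is that removing a *trailing* `.0` is not the same as replacing every `.0` in the string:

```python
def strip_trailing_zero_decimal(value: str) -> str:
    """Drop one trailing ".0" and leave every other value untouched."""
    return value[:-2] if value.endswith(".0") else value


def strip_footnote_marker(label: str, marker: str = "4") -> str:
    """Remove a trailing footnote marker (assumed here to be "4") from a header."""
    return label.rstrip(marker).rstrip()


# Hypothetical sample values, not taken from the spreadsheet.
print(strip_trailing_zero_decimal("1992.0"))   # trailing ".0" is removed
print(strip_trailing_zero_decimal("100.05"))   # left alone: ".0" is not at the end
print("100.05".replace(".0", ""))              # a blanket replace corrupts the value
print(strip_footnote_marker("New dwellings4"))
```

A blanket `str.replace(".0", "")` is shorter to write, but it also rewrites values that merely contain `.0` somewhere in the middle.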

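The key lesson here is the use of `closest`, which the code below uses to resolve "Area", "Area Code" and "Year" (the quarter is resolved with `directly`, as it sits on the same row as its observations). Remember that the "closest" you can be to something on a directional axis is _level with it_, so an observation on the same row as a header cell resolves "closest above" to that cell rather than to one further up.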
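
The "level with it" rule can be sketched in plain Python. This is only an illustration of the idea, not how tidychef implements it. The function name, the row numbers and the labels are all hypothetical:

```python
from bisect import bisect_right


def closest_above(header_rows: dict[int, str], observation_row: int) -> str:
    """Return the header on the nearest row at or above the observation row."""
    rows = sorted(header_rows)
    # bisect_right counts every header row <= observation_row, so a header
    # that is level with the observation is included and wins.
    index = bisect_right(rows, observation_row)
    if index == 0:
        raise LookupError(f"no header at or above row {observation_row}")
    return header_rows[rows[index - 1]]


# Hypothetical layout: sparse header cells on rows 9 and 12.
headers = {9: "first header", 12: "second header"}

print(closest_above(headers, 9))   # level with row 9, so row 9's header
print(closest_above(headers, 11))  # still row 9's header
print(closest_above(headers, 12))  # level with row 12, so row 12's header
```

Treating "level with" as the closest possible position is what lets a sparsely populated column (one value followed by blanks) label every row beneath it, including the row it sits on.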

In [None]:
from tidychef import acquire, preview, filters
from tidychef.direction import up, down, right
from tidychef.output import TidyData, Column
from tidychef.selection import XlsSelectable

table: XlsSelectable = acquire.xls.http("https://raw.githubusercontent.com/mikeAdamss/tidychef/main/tests/fixtures/xls/house-prices.xls", tables="Table 11")

housing = table.re('New dwellings').assert_one().expand(right).is_not_blank().label_as("Housing")
area_code = table.excel_ref("A").is_not_blank().re("[A-Z][0-9].*").label_as("Area Code")
area = area_code.shift(up).label_as("Area")
year = area.shift(right).expand(down).is_not_blank().label_as("Year")
quarter = year.shift(right).expand(down).is_not_blank().label_as("Quarter")
observations = quarter.fill(right).is_not_blank().filter(filters.is_numeric).label_as("Value")

# Create a bounded preview inline but also write the full preview to path
preview(observations, housing, area_code, area, year, quarter, bounded="A1:M20")
preview(observations, housing, area_code, area, year, quarter, path="house-prices.html")

tidy_data = TidyData(
    observations,
    Column(housing.finds_observations_directly(down), apply=lambda x: x.rstrip("4")),
    Column(area.finds_observations_closest(down)),
    Column(area_code.finds_observations_closest(down)),
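    # Strip only a trailing ".0"; a blanket replace would also alter values such as "100.05"
    Column(year.finds_observations_closest(down), apply=lambda x: x[:-2] if x.endswith(".0") else x),
    Column(quarter.finds_observations_directly(right)),
    obs_apply = lambda x: x[:-2] if x.endswith(".0") else x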
)

tidy_data.to_csv("house-prices.csv")
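
The `area_code` selection above relies on the pattern `[A-Z][0-9].*` to tell codes apart from names in column A. As a rough check of that pattern using only the standard library, the sketch below tests it against the two example values from the requirements. Whether tidychef anchors the pattern at the start of the cell value, as `re.match` does here, is an assumption. The helper name is hypothetical:

```python
import re

# Same pattern as the area_code selection: a capital letter, then a digit.
AREA_CODE_PATTERN = re.compile(r"[A-Z][0-9].*")


def looks_like_area_code(cell_value: str) -> bool:
    """True when the value starts with a capital letter followed by a digit."""
    return AREA_CODE_PATTERN.match(cell_value) is not None


# The two example values named in the requirements.
column_a = ["United Kingdom", "K02000001"]
matches = [value for value in column_a if looks_like_area_code(value)]
print(matches)
```

Because area names never place a digit straight after a capital letter, the same pattern separates the two kinds of cell without needing a list of known codes.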

## Outputs

The full preview can be [downloaded here](./house-prices.html).

The tidy data can be [downloaded here](./house-prices.csv). A full inline preview of the generated tidy data is also shown below for those who'd prefer to scroll.

In [None]:
print(tidy_data)