# Assessment Quality
This notebook runs a few quick studies on already-cleaned and prepared data.

You will need to have previously run the pipeline through the end of either 1-assemble or 2-clean.  
Horizontal equity studies are only available if you have run through 2-clean, but the rest will work with just 1-assemble.

**User settings**

Change the below to whatever you need:

In [None]:
# The slug of the locality you are currently working on
locality = "us-nc-guilford"

# The model group we're operating on
model_group = "single_family"

# General settings:
prediction_field = "assr_market_value"
land_prediction_field = "assr_land_value"
sales_field = "sale_price"

# Study period:
end_date = "2025-01-01"
start_date = "2024-01-01"

# Ratio study settings:
max_trim = 0.05

In [None]:
import init_notebooks
init_notebooks.setup_environment()

# Calculations
Don't change anything below this line. This does the actual calculations.

In [None]:
# Load functions from OpenAVMKit
from openavmkit.pipeline import (
    init_notebook,
    read_sales_univ, 
    run_ratio_study, 
    run_horizontal_equity_study, 
    run_vertical_equity_study, 
    plot_prediction_vs_sales
)

In [None]:
init_notebook(locality)

In [None]:
from openavmkit.pipeline import load_settings
from openavmkit.data import read_predictions
from openavmkit.filters import select_filter

# Read our data
try:
    sales_univ_pair = read_sales_univ("out/look/3-spatial-lag-")
    print("Read files from notebook 3")
except ValueError as e:
    try:
        sales_univ_pair = read_sales_univ("out/look/2-clean-")
        print("Read files from notebook 2")
    except ValueError as e:
        sales_univ_pair = read_sales_univ("out/look/1-assemble-")
        print("Error reading files from notebook 2.\nBacking up to notebook 1, basic tests will work but horizontal equity won't be available.")

if prediction_field == "prediction":
    df_sales = sales_univ_pair.sales
    df_pred_sales = read_predictions("ensemble", model_group, "sales")
    df_sales = df_sales.merge(df_pred_sales, on="key_sale", how="left")
    df_sales = df_sales[~df_sales["prediction"].isna()]
    if len(sales_filters) > 0:
        df_sales = select_filter(df_sales, sales_filters)
    sales_univ_pair.sales = df_sales
    
    df_univ = sales_univ_pair.universe
    df_pred_univ = read_predictions("ensemble", model_group, "universe")
    df_univ = df_univ.merge(df_pred_univ, on="key", how="left")
    df_univ = df_univ[~df_univ["prediction"].isna()]
    if len(univ_filters) > 0:
        df_univ = select_filter(df_univ, univ_filters)
    sales_univ_pair.universe = df_univ

## Accuracy -- Ratio Study

In [None]:
ratio_study = run_ratio_study(
    sales_univ_pair,
    model_group,
    prediction_field,
    sales_field,
    start_date,
    end_date,
    land_only = False,
    max_trim = max_trim
)

In [None]:
display(ratio_study.summary())

# Is that good?

You want to see these values:
 - Median ratio within bounds
 - COD below target
 - If the main stat is out of range, you still pass if the target is within your 95% CI
 - IAAO standard grades you on the *trimmed* score

**Example:**  
Large to mid-sized single family residential improved should have a COD below `15` and a median ratio between `0.90` and `1.10`, according to IAAO guidelines.  
Local standards vary widely; many insist on a median ratio between `0.95` and `1.05`, while others deliberately target a lower figure such as `0.9`, `0.8`, or `0.7`.

Things to consider:
- Low COD (< 5) *can* indicate sales chasing; a COD very close to 0 is particularly suspicious
  - Horizontal equity study (below) can tell you more about that
- Local standards often copy IAAO guidelines, but not always
- Always look at untrimmed scores to keep yourself honest
- **Always check local standards**
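
As a rough illustration of the two headline statistics, here is a minimal sketch that computes a median ratio and a COD by hand using the standard definitions (COD is the average absolute deviation from the median ratio, expressed as a percentage of the median). The function name `ratio_stats` and the toy numbers are hypothetical; this is not how `run_ratio_study` is implemented, and it applies no trimming or date filtering.

```python
from statistics import median

def ratio_stats(predictions, sale_prices):
    """Return (median_ratio, cod) for paired predictions and sale prices."""
    ratios = [p / s for p, s in zip(predictions, sale_prices)]
    med = median(ratios)
    # COD: mean absolute deviation from the median ratio, as a percentage of the median
    avg_abs_dev = sum(abs(r - med) for r in ratios) / len(ratios)
    cod = 100.0 * avg_abs_dev / med
    return med, cod

# Hypothetical toy data: three sales valued at 90%, 100%, and 110% of price
toy_predictions = [90_000, 200_000, 330_000]
toy_sale_prices = [100_000, 200_000, 300_000]

toy_median, toy_cod = ratio_stats(toy_predictions, toy_sale_prices)
print(f"median ratio = {toy_median:.3f}, COD = {toy_cod:.2f}")
```

In this toy case the median ratio sits at the center of the `0.90` to `1.10` band, and the COD is below the `15` target quoted above.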


Below you will find a summary of the [IAAO Standard on Ratio Studies](https://www.iaao.org/wp-content/uploads/Standard_on_Ratio_Studies.pdf), Table 2-3, page 34.

------

<table style="margin-left:0; margin-right:auto;">
  <tr>
      <th style="text-align:left;">Jurisdiction Size/Profile/Market Activity</th>
      <th style="text-align:left;">Abbreviation</th>
  </tr>
  <tr>
      <td>Very large jurisdictions/densely populated/newer properties/active markets</td>
      <td>3</td>
  </tr>
  <tr>
      <td>Large to mid-sized jurisdictions/older &amp; newer properties/less active markets</td>
      <td>2</td>
  </tr>
  <tr>
      <td>Rural or small jurisdictions/little development/depressed markets</td>
      <td>1</td>
  </tr>
</table>

<table style="margin-left:0; margin-right:auto;">
    <tr>
        <th style="text-align:left;">General Property Class</th>
        <th style="text-align:left;">Examples</th>
    </tr>
    <tr>
        <td>Residential Improved</td>
        <td>single family dwellings, condominiums, manuf. housing, 2-4 family units</td>
    </tr>
    <tr>
        <td>Income-producing properties</td>
        <td>commercial, industrial, apartments</td>
    </tr>
</table>

*"The COD performance recommendations are based upon representative and adequate sample sizes, with outliers trimmed and a 95% level of confidence."*

<table style="margin-left:0; margin-right:auto;">
  <tr><th style="text-align:left;">Class + Size/Profile/Activity</th><th style="text-align:left;">COD Range</th></tr>
  <tr><td>Residential improved</td><td></td></tr>
  <tr><td>3</td><td>5.0 - 10.0</td></tr>
  <tr><td>2</td><td>5.0 - 15.0</td></tr>
  <tr><td>1</td><td>5.0 - 20.0</td></tr>
  <tr><td>Income-producing properties</td><td></td></tr>
  <tr><td>3</td><td>5.0 - 15.0</td></tr>
  <tr><td>2</td><td>5.0 - 20.0</td></tr>
  <tr><td>1</td><td>5.0 - 25.0</td></tr>
  <tr><td>Residential vacant land</td><td></td></tr>
  <tr><td>3</td><td>5.0 - 15.0</td></tr>
  <tr><td>2</td><td>5.0 - 20.0</td></tr>
  <tr><td>1</td><td>5.0 - 25.0</td></tr>
  <tr><td>Other (non-agricultural) vacant land</td><td></td></tr>
  <tr><td>3</td><td>5.0 - 20.0</td></tr>
  <tr><td>2</td><td>5.0 - 25.0</td></tr>
  <tr><td>1</td><td>5.0 - 30.0</td></tr>
</table>

*"These types of property are provided for general guidance only and may not represent jurisdictional requirements"*  
*"CODs lower than 5.0 may indicate sales chasing or non-representative samples"*  
*"Appraisal level recommendation for each type of property shown should be between `0.90` and `1.10`"*  
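
If you want to check a COD against the table programmatically, the ranges above can be transcribed into a small lookup. This is only a convenience sketch: the dictionary keys and the `grade_cod` helper are hypothetical names, not part of OpenAVMKit, and local standards may differ from these figures.

```python
# COD ranges transcribed from the table above:
# {property class: {size/profile/activity code: (low, high)}}
IAAO_COD_RANGES = {
    "residential_improved": {3: (5.0, 10.0), 2: (5.0, 15.0), 1: (5.0, 20.0)},
    "income_producing": {3: (5.0, 15.0), 2: (5.0, 20.0), 1: (5.0, 25.0)},
    "residential_vacant_land": {3: (5.0, 15.0), 2: (5.0, 20.0), 1: (5.0, 25.0)},
    "other_vacant_land": {3: (5.0, 20.0), 2: (5.0, 25.0), 1: (5.0, 30.0)},
}

def grade_cod(cod, property_class, size_code):
    """Classify a COD against the transcribed range for a class and size code."""
    low, high = IAAO_COD_RANGES[property_class][size_code]
    if cod < low:
        # The standard flags CODs under 5.0 as a possible sign of sales chasing
        return "below range (possible sales chasing or non-representative sample)"
    if cod > high:
        return "above range"
    return "within range"

# A mid-sized jurisdiction (code 2) with a residential improved COD of 12.0
print(grade_cod(12.0, "residential_improved", 2))
```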


## Consistency -- Horizontal Equity

If you have run notebook 2, this part will tell you about the *consistency* of your valuations, regardless of how close they are to sale prices.  
If you have not run notebook 2, this part will be skipped.

In [None]:
he_study = run_horizontal_equity_study(
    sales_univ_pair, 
    model_group, 
    prediction_field
)

In [None]:
if he_study is not None:
    display(he_study.summary.print())
else:
    print("No horizontal equity id found, skipping...")

## Detect sales chasing
With a horizontal equity study in hand, we can check for sales chasing by seeing whether any clusters that include sales have very different valuations from physically similar properties.

# Is that good?

OpenAVMKit's horizontal equity study is not an IAAO standard and is not written into any local standards that we know of.

Here are our own recommendations:

- Median CHD should be about the same as or less than the IAAO COD guideline for the property type (e.g. `15` for typical single family)
- 75th percentile CHD should not be extremely high
- A very low COD (< 5) *and* a high median CHD together are a strong indicator of sales chasing
  - This is because properties that sold are basically copying the sale price, whereas unsold properties have wildly different valuations, even when their characteristics and locations are similar

The test works by dividing your model group into clusters of similarly-located, physically similar properties (e.g. same land use, same location, same size, same type, same age, etc). You would naturally expect similar valuations for such properties.

- The CHD measures variation in valuation within a cluster
- It has nothing to do with sale prices
- It has nothing to do with accuracy
- It only measures horizontal uniformity (consistency)

The test is not completely objective because drawing clusters is always subjective. This test is only as good as the quality of the clusters.
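
To make the idea concrete, here is a minimal sketch of a per-cluster dispersion measure. It assumes the CHD is calculated the same way as a COD, but over the valuations inside each cluster rather than over sales ratios; that formula, the function names, and the toy data are assumptions for illustration and may not match what `run_horizontal_equity_study` does internally.

```python
from collections import defaultdict
from statistics import median

def cluster_chd(values):
    """Mean absolute deviation from the cluster median, as a percentage of the median."""
    med = median(values)
    return 100.0 * sum(abs(v - med) for v in values) / len(values) / med

def chd_by_cluster(rows):
    """Group (cluster_id, valuation) pairs and compute a CHD for each cluster."""
    clusters = defaultdict(list)
    for cluster_id, value in rows:
        clusters[cluster_id].append(value)
    return {cid: cluster_chd(vals) for cid, vals in clusters.items()}

# Hypothetical toy data: cluster "A" has uneven valuations, cluster "B" is uniform
toy_rows = [
    ("A", 90_000), ("A", 100_000), ("A", 110_000),
    ("B", 200_000), ("B", 200_000), ("B", 200_000),
]

toy_chds = chd_by_cluster(toy_rows)
toy_median_chd = median(toy_chds.values())
print(toy_chds, toy_median_chd)
```

Note that no sale prices appear anywhere in this calculation, which is why the measure says nothing about accuracy.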

## Fairness -- Vertical Equity

In [None]:
ve_study = run_vertical_equity_study(
    sales_univ_pair,
    model_group,
    prediction_field,
    sales_field,
    "census_tract",
    start_date,
    end_date
)

In [None]:
ve_study.summary()

# Is that good?

IAAO guidelines:
- PRD:
  - fails if outside `0.98` to `1.03`
  - passes if 95% CI is within tolerance range
  - if the 95% CI includes 1.0, there is no statistically significant evidence of vertical inequity
- PRB:
  - fails if outside `-0.10` to `0.10`
  - passes if 95% CI is within tolerance range
  - should ideally fall between `-0.05` and `0.05`
  - if the 95% CI includes 0.0, there is no statistically significant evidence of vertical inequity
- PRB is generally preferred to PRD
- **always consult your local standards**
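
For reference, the point estimates behind these two measures can be sketched as follows. The PRD is the mean ratio divided by the sale-weighted mean ratio. The PRB is the slope from regressing each ratio's percentage difference from the median on the base-2 log of a value proxy (half the sale price plus half the valuation divided by the median ratio). The function names and toy data are hypothetical, confidence intervals are not computed, and this is not necessarily how `run_vertical_equity_study` is implemented.

```python
import math
from statistics import mean, median

def prd(predictions, sale_prices):
    """Price-related differential: mean ratio / weighted mean ratio."""
    ratios = [p / s for p, s in zip(predictions, sale_prices)]
    weighted_mean = sum(predictions) / sum(sale_prices)
    return mean(ratios) / weighted_mean

def prb(predictions, sale_prices):
    """Price-related bias: OLS slope of relative ratio deviation on log2(value proxy)."""
    ratios = [p / s for p, s in zip(predictions, sale_prices)]
    med = median(ratios)
    y = [(r - med) / med for r in ratios]
    # Value proxy averages the sale price and the level-adjusted valuation
    x = [math.log2(0.5 * s + 0.5 * p / med) for p, s in zip(predictions, sale_prices)]
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical regressive example: cheap homes over-valued, expensive homes under-valued
toy_pred = [110_000, 200_000, 360_000]
toy_sales = [100_000, 200_000, 400_000]

toy_prd = prd(toy_pred, toy_sales)
toy_prb = prb(toy_pred, toy_sales)
print(f"PRD = {toy_prd:.4f}, PRB = {toy_prb:.4f}")
```

In this toy case the PRD lands above `1.03` and the PRB is negative, both pointing toward regressivity.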

Quotes from the IAAO standard on ratio studies:

*"As a general matter, the PRB coefficient should fall between –0.05 and 0.05. PRBs for which 95% confidence intervals fall outside of this range indicate that one can reasonably conclude that assessment levels change by more than 5% when values are halved or doubled."*

*"PRBs for which 95% confidence intervals fall outside the range of –0.10 to 0.10 indicate unacceptable vertical inequities"*

*"PRD standards are not absolute and may be less meaningful when samples are small or when wide variation in prices exist. In such cases, statistical tests of vertical equity hypotheses should be substituted."*

*"Alternatively, assessing officials can rely on the PRB, which is less sensitive to atypical prices and ratios"*

# Visualizations

In [None]:
plot_prediction_vs_sales(
    sales_univ_pair,
    model_group,
    prediction_field,
    sales_field,
    start_date,
    end_date
)

In [None]:
plot_prediction_vs_sales(
    sales_univ_pair,
    model_group,
    land_prediction_field,
    sales_field,
    start_date,
    end_date,
    land_only=True
)

In [None]:
ve_study.plot_quantiles(ylim=None)
ve_study.plot_quantiles(ylim=[0.6,0.8], ci_bounds=True)
ve_study.plot_quantiles(ylim=None, grouped=True)
ve_study.plot_quantiles(ylim=[0.6,0.8], ci_bounds=True, grouped=True)