# User inferer

This example shows how a user-inferer can be implemented to augment an existing BRAILS-created inventory with a new attribute. This feature is useful when the user knows the rulesets for inferring a new attribute (e.g. contents value) from existing attributes (e.g. occupancy type and building replacement cost).

## Formatting the user-defined inferer file

Let's first set the path to the user-inferer Python file. We will use "content_value_inferer.py" to infer the contents value from the occupancy type and building replacement cost provided by NSI.

In [1]:
import os
cwd = os.getcwd()
filepath =  os.path.join(cwd,"content_value_inferer.py")

The user-inferer Python file should contain a function named **user_inferer**. Let's first see what the function looks like. The input **inventory_dict** and the output **new_features** are both dictionaries; examples of each are given below.

In [2]:
# Read the file contents
with open(filepath, 'r') as f:
    file_contents = f.read()

# Display the content with Markdown
from IPython.display import display, Markdown
display(Markdown(f'The contents of {filepath}'))
display(Markdown(f'---'))
display(Markdown(f'```python\n{file_contents}\n```'))
display(Markdown(f'---'))


The contents of C:\Users\SimCenter\Sangri\BrailsPlusPlus2\examples\inference\content_value_inferer.py

---

```python
import numpy as np
def user_inferer(inventory_dict):
    #
    # Defining my mapping following Table 6-10 in Hazus Inventory Technical Manual 6
    # (Baseline Hazus Contents Value as Percent of Structure Value)
    #
    contents_value_over_str_value = {
        "RES1": 0.50,
        "RES2": 0.50,
        "RES3A": 0.50,
        "RES3B": 0.50,
        "RES3C": 0.50,
        "RES3D": 0.50,
        "RES3E": 0.50,
        "RES3F": 0.50,
        "RES3": 0.50,
        "RES4": 0.50,
        "RES5": 0.50,
        "RES6": 0.50,
        "COM1": 1.00,
        "COM2": 1.00,
        "COM3": 1.00,
        "COM4": 1.00,
        "COM5": 1.00,
        "COM6": 1.50,
        "COM7": 1.50,
        "COM8": 1.00,
        "COM9": 1.00,
        "COM10": 0.50,
        "IND1": 1.50,
        "IND2": 1.50,
        "IND3": 1.50,
        "IND4": 1.50,
        "IND5": 1.50,
        "IND6": 1.00,
        "AGR1": 1.00,
        "REL1": 1.00,
        "GOV1": 1.00,
        "GOV2": 1.50,
        "EDU1": 1.00,
        "EDU2": 1.50
    }
    new_features = {}
    for idx,bldg in inventory_dict.items():
        occ_type = bldg["properties"]["occupancy"]
        contents_value_ratio = contents_value_over_str_value.get(occ_type, np.nan)
        contents_value = contents_value_ratio * bldg["properties"]["repaircost"]
        new_features[idx] = {"contentsValue": contents_value}

    return new_features

```

---

The function iterates over the existing inventory in **inventory_dict** and builds the **new_features** dictionary, which holds the new attributes for each building.

An example of the **inventory_dict** provided by BRAILS is shown below.

---
```json
inventory_json = {
    0: {
        "type": "Building",
        "properties": {
            "type": "Building",
            "buildingheight": "NA",
            "erabuilt": 1983,
            "numstories": 1,
            "roofshape": "flat",
            "fpAreas": 27433,
            "lon": -81.92019722,
            "lat": 26.43725715,
            "fparea": 32663.7,
            "repaircost": 3968655.62,
            "constype": "W1",
            "occupancy": "COM1",
            "fd_id": 497575843
        },
        "geometry": {
            "type": "Polygon",
            "coordinates": [
                [-81.9202572, 26.4375827],
                [-81.920495, 26.4370076],
                [-81.9201985, 26.4369093],
                [-81.9201437, 26.4368912],
                [-81.919906, 26.4374663],
                [-81.9202572, 26.4375827]
            ]
        }
    },
    1: {
        "type": "Building",
        "properties": {
            "type": "Building",
            "buildingheight": "NA",
            "erabuilt": 1983.0,
            "numstories": 1.0,
            "roofshape": "flat",
            "fpAreas": 8238,
            "fparea": 605.35504,
            "repaircost": 212759.348,
            "constype": "W1",
            "occupancy": "RES1"
        },
        "geometry": {
            "type": "Polygon",
            "coordinates": [
                [-81.9191106, 26.438107],
                [-81.9190345, 26.4381961],
                [-81.9189942, 26.4382432],
                [-81.9189849, 26.4382368],
                [-81.9189378, 26.4382919],
                [-81.9188165, 26.4382088],
                [-81.9188396, 26.4381817],
                [-81.9187882, 26.4381466],
                [-81.9188034, 26.4381288],
                [-81.9187612, 26.4380998],
                [-81.9187527, 26.4380479],
                [-81.918853, 26.4379305],
                [-81.9191106, 26.438107]
            ]
        }
    },
    .....
}
```
---

Note that the dictionary is keyed by building id, and the existing attributes of each building are stored under the key "properties".

The resulting **new_features** would look like this:

---
```json
new_features = {
    0: {
        "contentsValue": 3968655.62
    },
    1: {
        "contentsValue": 106379.674
    },
    .....
}
```
---
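
As a quick check, the function can be exercised on a trimmed copy of the two example buildings above. The sketch below is self-contained and makes a few simplifications that are assumptions rather than part of the original file: the ratio table is shortened to the two occupancy types it needs, `math.nan` stands in for `np.nan` (both are the same floating-point NaN) so that no third-party import is required, and building 2 with occupancy "UNKNOWN" is a hypothetical entry added only to show what happens for an unmapped occupancy type.

```python
import math

# Shortened copy of the ratio table from content_value_inferer.py
# (only the two occupancy types used below).
CONTENTS_VALUE_OVER_STR_VALUE = {"RES1": 0.50, "COM1": 1.00}


def user_inferer(inventory_dict):
    """Return {building id: {"contentsValue": value}} for each building."""
    new_features = {}
    for idx, bldg in inventory_dict.items():
        props = bldg["properties"]
        # Unmapped occupancy types fall back to NaN, which propagates
        # through the multiplication.
        ratio = CONTENTS_VALUE_OVER_STR_VALUE.get(props["occupancy"], math.nan)
        new_features[idx] = {"contentsValue": ratio * props["repaircost"]}
    return new_features


# Buildings 0 and 1 use the values from the example inventory above;
# building 2 is hypothetical and only illustrates the NaN fallback.
sample_inventory = {
    0: {"properties": {"occupancy": "COM1", "repaircost": 3968655.62}},
    1: {"properties": {"occupancy": "RES1", "repaircost": 212759.348}},
    2: {"properties": {"occupancy": "UNKNOWN", "repaircost": 100000.0}},
}

result = user_inferer(sample_inventory)
for idx, features in result.items():
    print(idx, features)
```

Buildings 0 and 1 reproduce the values listed in **new_features** above, while the hypothetical building 2 gets NaN, so it may be worth checking for NaN entries after running an inferer built this way.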
    

## Generation of base inventory

Before running user-inferer, let's create a baseline inventory using NSI attributes, OSM footprint info, and imputation.

### Scraping OSM

In [3]:
import sys
import copy

sys.path.insert(0, "../../")
from brails.utils import Importer
from brails.types.image_set import ImageSet    
from brails.types.asset_inventory import Asset, AssetInventory
importer = Importer()

INFO:numexpr.utils:NumExpr defaulting to 8 threads.


In [4]:
region_data = {"type": "locationName", "data": "Fort Myers Beach"}
region_boundary_class = importer.get_class("RegionBoundary")
region_boundary_object = region_boundary_class(region_data)

In [5]:
#
# Get Footprints using OSM
#

print("Trying OSM_FootprintsScraper ...")

osm_class = importer.get_class("OSM_FootprintScraper")
osm_data = {"length": "ft"}
osm = osm_class(osm_data)
osm_inventory = osm.get_footprints(region_boundary_object)


Trying OSM_FootprintsScraper ...

Searching for Fort Myers Beach...
Found Fort Myers Beach, Lee County, Florida, 33931, United States

Found a total of 2766 building footprints in Fort Myers Beach


### Scraping NSI attributes and merging them with OSM

In [6]:
nsi_class = importer.get_class("NSI_Parser")
nsi = nsi_class()

In [7]:
my_inventory = nsi.get_filtered_data_given_inventory(osm_inventory, "ft")


Getting National Structure Inventory (NSI) building data for the entered location...
Found a total of 2503 building points in NSI that match the footprint data.


In [8]:
# There can be missing attributes
my_inventory.get_asset_features(1)

(True,
 {'type': 'Building',
  'buildingheight': 'NA',
  'erabuilt': 'NA',
  'numstories': 'NA',
  'roofshape': 'NA',
  'fpAreas': 8238})

### Imputing missing attributes

In [9]:
knn_imputer_class = importer.get_class("KnnImputer")
imputer=knn_imputer_class(my_inventory,n_possible_worlds=10, exclude_features=["lon","lat","fd_id"])
fort_myers_imputed = imputer.impute()


Features with no reference data cannot be imputed. Removing them from the imputation target: buildingheight
Missing percentages among 2766 assets
erabuilt: 9.51%
numstories: 9.51%
roofshape: 99.82%
fparea: 9.51%
repaircost: 9.51%
constype: 9.51%
occupancy: 9.51%
found_ht: 9.51%
Primitive imputation done.
Running the main imputation. This may take a while.
Done imputation. It took 0.10 mins
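
The repeated 9.51% is consistent with the counts reported earlier: OSM returned 2766 footprints and NSI matched 2503 of them, so footprints without an NSI match would lack every NSI-derived attribute. This reading is inferred from the printed numbers and is not stated by the log itself; the check is simple arithmetic:

```python
# Counts taken from the scraper outputs above.
n_footprints = 2766   # OSM building footprints
n_nsi_matched = 2503  # NSI points matched to a footprint

n_unmatched = n_footprints - n_nsi_matched
missing_pct = 100 * n_unmatched / n_footprints

print(f"{n_unmatched} unmatched footprints -> {missing_pct:.2f}% missing")
# prints: 263 unmatched footprints -> 9.51% missing
```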


In [10]:
fort_myers_imputed.get_asset_features(1)

(True,
 {'type': 'Building',
  'buildingheight': 'NA',
  'erabuilt': 1983.0,
  'numstories': 1.0,
  'roofshape': 'flat',
  'fpAreas': 8238,
  'fparea': [3048.0,
   522.0,
   1248.0,
   1248.0,
   1248.0,
   1248.0,
   1248.0,
   1248.0,
   1248.0,
   1248.0],
  'repaircost': [224669.279,
   115813.171,
   124700.024,
   124700.024,
   124700.024,
   124700.024,
   124700.024,
   124700.024,
   124700.024,
   124700.024],
  'constype': 'W1',
  'occupancy': ['REL1',
   'RES1',
   'RES1',
   'RES1',
   'RES1',
   'RES1',
   'RES1',
   'RES1',
   'RES1',
   'RES1'],
  'found_ht': [1.5, 0.5, 0.5, 0.5, 1.5, 1.5, 1.5, 0.5, 0.5, 8.0]})

The base inventory of Fort Myers Beach is now created. The filtered NSI inventory does not contain information on "contentsValue". We want to add this attribute through the user-inferer.

# Example 1: Run user-inferer to update the content values

In [11]:
user_inferer_class = importer.get_class("UserInferer")
inferer=user_inferer_class(fort_myers_imputed,filepath)
fort_myers_inferred = inferer.infer()

All assets are updated


In [12]:
fort_myers_inferred.get_asset_features(55)[1]

{'type': 'Building',
 'buildingheight': 'NA',
 'erabuilt': [1973.0,
  1973.0,
  1973.0,
  1973.0,
  1973.0,
  1973.0,
  1974.0,
  1973.0,
  1973.0,
  1973.0],
 'numstories': [1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0],
 'roofshape': 'flat',
 'fpAreas': 564,
 'fparea': [2868.42,
  6294.28172,
  6294.28172,
  6294.28172,
  6294.28172,
  6294.28172,
  6294.28172,
  6294.28172,
  6294.28172,
  8691.0],
 'repaircost': [550882.859,
  550882.859,
  944314.165,
  550882.859,
  971575.799,
  550882.859,
  550882.859,
  550882.859,
  550882.859,
  550882.859],
 'constype': 'W1',
 'occupancy': ['COM4',
  'COM4',
  'RES1',
  'RES1',
  'COM4',
  'COM4',
  'RES1',
  'COM4',
  'COM4',
  'RES1'],
 'found_ht': 0.5,
 'contentsValue': [550882.859,
  550882.859,
  472157.0825,
  275441.4295,
  971575.799,
  550882.859,
  275441.4295,
  550882.859,
  550882.859,
  275441.4295]}

The 'contentsValue' attribute is now added. Note that, because there are multiple possible worlds of occupancy type and replacement cost (coming from probabilistic imputation), the contents value can be evaluated differently in each world.
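
To make the per-world evaluation concrete, the sketch below recomputes 'contentsValue' for asset 55 from the 'occupancy' and 'repaircost' lists printed above. It is an illustration only: `value_in_world` is a hypothetical helper, and how `UserInferer` handles possible worlds internally is not shown in this example. The assumption is that list-valued attributes hold one entry per world and scalar attributes apply to every world.

```python
# Ratios for the two occupancy types that appear for asset 55
# (taken from content_value_inferer.py).
RATIO = {"COM4": 1.00, "RES1": 0.50}

# Attributes of asset 55 as printed above (10 possible worlds).
occupancy = ["COM4", "COM4", "RES1", "RES1", "COM4",
             "COM4", "RES1", "COM4", "COM4", "RES1"]
repaircost = [550882.859, 550882.859, 944314.165, 550882.859, 971575.799,
              550882.859, 550882.859, 550882.859, 550882.859, 550882.859]


def value_in_world(value, world):
    """Hypothetical helper: pick the entry for one world, or reuse a scalar."""
    return value[world] if isinstance(value, list) else value


n_worlds = len(occupancy)
contents_value = [
    RATIO[value_in_world(occupancy, w)] * value_in_world(repaircost, w)
    for w in range(n_worlds)
]
print(contents_value)
```

World 2, for example, pairs 'RES1' with a replacement cost of 944314.165, which gives 0.50 × 944314.165 = 472157.0825, the third entry of the 'contentsValue' list above.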

# Example 2: Run user-inferer to update the floor area

Let us use another user-inferer script, which estimates the average and maximum plan areas ('fpAreas' and 'fpAreas_max') from the occupancy type.

Note that 'fpAreas' already exists in the inventory. You can choose whether or not to overwrite it; by default, the inferer overwrites existing values.
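
The effect of the overwrite setting can be pictured as a dictionary merge. The following is a minimal sketch, not the `UserInferer` implementation: `merge_features` is a hypothetical helper, and the scalar values are simplified from the values shown in this example.

```python
def merge_features(existing, new, overwrite=True):
    """Hypothetical helper: merge new attributes into a copy of existing ones.

    With overwrite=False, attributes already present keep their value and
    only attributes that do not exist yet are added.
    """
    merged = dict(existing)
    for key, value in new.items():
        if overwrite or key not in merged:
            merged[key] = value
    return merged


existing = {"fpAreas": 8238, "occupancy": "RES1"}
inferred = {"fpAreas": 1500, "fpAreas_max": 5000}

# Default behaviour: 'fpAreas' is replaced, 'fpAreas_max' is added.
print(merge_features(existing, inferred))

# overwrite=False: 'fpAreas' keeps 8238, 'fpAreas_max' is still added.
print(merge_features(existing, inferred, overwrite=False))
```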

In [13]:
import os
cwd = os.getcwd()
filepath_fp =  os.path.join(cwd,"floor_area_inferer.py")

In [14]:
inferer=user_inferer_class(fort_myers_imputed,filepath_fp)
fort_myers_inferred_fp = inferer.infer()

All assets are updated


In [15]:
fort_myers_inferred_fp.get_asset_features(1)[1]

{'type': 'Building',
 'buildingheight': 'NA',
 'erabuilt': 1983.0,
 'numstories': 1.0,
 'roofshape': 'flat',
 'fpAreas': ['NA', 1500, 1500, 1500, 1500, 1500, 1500, 1500, 1500, 1500],
 'fparea': [3048.0,
  522.0,
  1248.0,
  1248.0,
  1248.0,
  1248.0,
  1248.0,
  1248.0,
  1248.0,
  1248.0],
 'repaircost': [224669.279,
  115813.171,
  124700.024,
  124700.024,
  124700.024,
  124700.024,
  124700.024,
  124700.024,
  124700.024,
  124700.024],
 'constype': 'W1',
 'occupancy': ['REL1',
  'RES1',
  'RES1',
  'RES1',
  'RES1',
  'RES1',
  'RES1',
  'RES1',
  'RES1',
  'RES1'],
 'found_ht': [1.5, 0.5, 0.5, 0.5, 1.5, 1.5, 1.5, 0.5, 0.5, 8.0],
 'fpAreas_max': ['NA', 5000, 5000, 5000, 5000, 5000, 5000, 5000, 5000, 5000]}
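
The contents of floor_area_inferer.py are not listed in this example. The output above (1500 and 5000 in worlds where the occupancy is 'RES1', 'NA' in the world where it is 'REL1') is consistent with a lookup along the following lines. This is a hypothetical sketch: the table holds only the one entry that can be read off the output, and the actual script may cover more occupancy types or use different logic.

```python
# Hypothetical lookup: (average plan area, maximum plan area) per occupancy.
# Only the RES1 entry can be read off the output above.
PLAN_AREA_BY_OCCUPANCY = {"RES1": (1500, 5000)}


def user_inferer(inventory_dict):
    """Return 'fpAreas' and 'fpAreas_max' per building, 'NA' if unmapped."""
    new_features = {}
    for idx, bldg in inventory_dict.items():
        occ_type = bldg["properties"]["occupancy"]
        fp_avg, fp_max = PLAN_AREA_BY_OCCUPANCY.get(occ_type, ("NA", "NA"))
        new_features[idx] = {"fpAreas": fp_avg, "fpAreas_max": fp_max}
    return new_features


# Two single-world buildings, one mapped and one unmapped.
sample = {
    0: {"properties": {"occupancy": "REL1"}},
    1: {"properties": {"occupancy": "RES1"}},
}
print(user_inferer(sample))
```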

### You can also avoid overwriting existing values

In [16]:
inferer=user_inferer_class(fort_myers_imputed,filepath_fp, overwrite=False)
fort_myers_inferred_fp2 = inferer.infer()

All assets are updated


In [17]:
fort_myers_inferred_fp2.get_asset_features(1)[1]

{'type': 'Building',
 'buildingheight': 'NA',
 'erabuilt': 1983.0,
 'numstories': 1.0,
 'roofshape': 'flat',
 'fpAreas': 8238,
 'fparea': [3048.0,
  522.0,
  1248.0,
  1248.0,
  1248.0,
  1248.0,
  1248.0,
  1248.0,
  1248.0,
  1248.0],
 'repaircost': [224669.279,
  115813.171,
  124700.024,
  124700.024,
  124700.024,
  124700.024,
  124700.024,
  124700.024,
  124700.024,
  124700.024],
 'constype': 'W1',
 'occupancy': ['REL1',
  'RES1',
  'RES1',
  'RES1',
  'RES1',
  'RES1',
  'RES1',
  'RES1',
  'RES1',
  'RES1'],
 'found_ht': [1.5, 0.5, 0.5, 0.5, 1.5, 1.5, 1.5, 0.5, 0.5, 8.0],
 'fpAreas_max': ['NA', 5000, 5000, 5000, 5000, 5000, 5000, 5000, 5000, 5000]}