## Install BRAILS++ and Folium

Before running the following cells, install the BRAILS++ package along with `folium` for interactive mapping. 

In [1]:
!pip install brails
!pip install folium



# Import Required Packages

In [25]:
import numpy as np
from brails.utils import Importer
import folium

## Define Location and Scraper Information

Specify the target area, output file and number of realizations (worlds) for inventory creation.  

- **LOCATION_NAME**: The name of the target area (here, `'San Francisco California'`).  
- **LOCATION_TYPE**: Specifies how the location is defined, by name (`'locationName'`) or through polygon coordinates (`'locationPolygon'`). 
- **INVENTORY_OUTPUT**: Filename for saving the retrieved building footprint inventory in GeoJSON format.
- **NO_POSSIBLE_WORLDS**: Number of inventory realizations to generate.
- **LENGTH_UNIT**: Base unit to be used for attributes that involve length or area measurements.

In [3]:
LOCATION_NAME = 'San Francisco California'
LOCATION_TYPE = 'locationName'
INVENTORY_OUTPUT = 'SFInventory_EQ.geojson'
NO_POSSIBLE_WORLDS = 1
LENGTH_UNIT = 'ft'
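
As a hedged illustration of the two location types, the sketch below builds the kind of dictionary that is later passed to `RegionBoundary`. The helper `make_region_data` is hypothetical (it is not part of BRAILS++), and the polygon format shown, a flat tuple of longitude/latitude values, is an assumption that should be checked against the BRAILS++ documentation.

```python
# Hypothetical helper, not part of BRAILS++: builds the region dictionary
# and rejects location types other than the two named above.
VALID_LOCATION_TYPES = ('locationName', 'locationPolygon')


def make_region_data(location_type, data):
    """Return a {'type': ..., 'data': ...} dictionary after basic checks."""
    if location_type not in VALID_LOCATION_TYPES:
        raise ValueError(
            f"Unknown location type {location_type!r}; "
            f"expected one of {VALID_LOCATION_TYPES}"
        )
    if location_type == 'locationPolygon' and len(data) % 2 != 0:
        # Assumed format: (lon1, lat1, lon2, lat2, ...)
        raise ValueError('Polygon data must contain lon/lat pairs')
    return {'type': location_type, 'data': data}


by_name = make_region_data('locationName', 'San Francisco California')

# Assumed polygon input roughly covering San Francisco (made-up values):
by_polygon = make_region_data(
    'locationPolygon', (-122.52, 37.70, -122.35, 37.83)
)

print(by_name)
print(by_polygon)
```

Checking the location type up front surfaces a typo in `LOCATION_TYPE` before any data is requested from a remote service.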

# Create an Importer Object to Pull In Required BRAILS Modules

In [4]:
importer = Importer()

## Create a Region Boundary for the Area of Interest

Using the `Importer` instance created above, we define the region data and then create a `RegionBoundary` object for the target area. Subsequent modules will use the `region_boundary_object` to limit inventory creation to the specified location.

In [5]:
# Define the region of interest:
region_data = {"type": LOCATION_TYPE, "data": LOCATION_NAME}

# Create a region boundary:
region_boundary_class = importer.get_class("RegionBoundary")
region_boundary_object = region_boundary_class(region_data)

# Get Raw NSI Data for the Defined Region

To begin creating our inventory, we first fetch the National Structure Inventory (NSI) data for the specified region using the `NSI_Parser` module in BRAILS++. This data serves as the baseline building attribute information for the area.

The `get_raw_data` method retrieves the NSI data exactly as it exists, without applying any modifications.

In [6]:
nsi = importer.get_class('NSI_Parser')()
nsi_inventory = nsi.get_raw_data(region_boundary_object)


Searching for San Francisco California...
Found San Francisco, California, United States


INFO:root:
Getting National Structure Inventory (NSI) building data for the entered location...



Found a total of 167283 building points in NSI that are within the entered region of interest


## Create a Building Footprint Scraper and Retrieve the Building Footprints for the Specified Region

First, select a scraper class to obtain geometric footprint data for buildings within a given region. Available footprint scraper classes:  
- `OSM_FootprintScraper`: Gets OpenStreetMap data  
- `USA_FootprintScraper`: Retrieves FEMA USA Structures dataset  
- `MS_FootprintScraper`: Uses Microsoft building footprints  
- `OvertureMapsFootprintScraper`: Uses Overture Maps data  

In the example below, we dynamically load the chosen scraper class, set the output length units to feet, and then request all building footprints within the specified `region_boundary_object`. The result, `footprint_inventory`, contains the geometric data for further analysis.

In [7]:
footprint_scraper = importer.get_class('OSM_FootprintScraper')({'length': LENGTH_UNIT})
footprint_inventory = footprint_scraper.get_footprints(region_boundary_object)


Searching for San Francisco California...
Found San Francisco, California, United States

Found a total of 160427 building footprints in San Francisco


# Create a Baseline Inventory by Merging NSI Raw Data and Extracted Footprint Data

Next, we combine the NSI data with the building footprint data (`footprint_inventory`) to create a baseline building inventory. This is done using the `get_filtered_data_given_inventory` method.

The `get_extended_features` argument allows the inclusion of additional details, such as whether a building is split-level or has a basement.

In [8]:
nsi_inventory = nsi.get_filtered_data_given_inventory(
    footprint_inventory, 
    LENGTH_UNIT, 
    get_extended_features=True
)


Getting National Structure Inventory (NSI) building data for the entered location...
Found a total of 147446 building points in NSI that match the footprint data.
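
To make the idea of matching point data to footprints concrete, here is a minimal sketch that assigns attribute points to the footprint polygons that contain them, using a ray-casting point-in-polygon test. This is an illustration under stated assumptions, not the matching logic used by `get_filtered_data_given_inventory`; the footprints, points, and the `point_in_polygon` helper are made up for the example.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) lies inside the polygon.

    polygon is a list of (lon, lat) vertices; the ring need not be closed.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Made-up footprints and attribute points for illustration only:
footprints = {
    0: [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)],
    1: [(3.0, 0.0), (5.0, 0.0), (5.0, 2.0), (3.0, 2.0)],
}
points = [
    {'lon': 1.0, 'lat': 1.0, 'numstories': 2},
    {'lon': 9.0, 'lat': 9.0, 'numstories': 5},  # outside every footprint
]

# Attach each point's attributes to the first footprint that contains it:
matched = {}
for point in points:
    for fp_id, polygon in footprints.items():
        if point_in_polygon(point['lon'], point['lat'], polygon):
            matched[fp_id] = point
            break

print(matched)
```

In this toy case footprint 1 receives no point, so its attributes stay empty. In the run above, the logs report 160427 footprints and 147446 matching NSI points, which leaves 12981 footprints (about 8.09%) without NSI attributes, consistent with the 8.09% missing share shown in the imputation log.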


# Fill Missing Values in the Baseline Inventory Using KNN Imputation

After creating our baseline inventory, some building attributes may still be missing. We use a K-Nearest Neighbors (KNN) approach provided by the BRAILS++ module `KnnImputer`. This step fills in missing values and generates complete inventory realizations.

The process to run KNN imputation works as follows:
1. **Get the KNN imputer class** using the `Importer`.  
2. **Create an imputer instance**, providing:  
   - `nsi_inventory`: the baseline NSI inventory  
   - `n_possible_worlds`: the number of inventory realizations to generate  
   - `exclude_features`: features to skip during imputation (`'lat'`, `'lon'`, `'fd_id'`)  
3. **Run the imputer** by calling the `impute` method, which fills in the missing attributes and produces the `imputed_inventory`.  
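
The core idea of nearest-neighbor imputation can be sketched in a few lines of plain Python. The example below fills a missing categorical attribute with the most common value among the `k` spatially closest records that have it. It is a simplified illustration only: the `knn_impute` function, the sample records, and the use of latitude/longitude distance are assumptions, and the sketch does not reproduce the clustering or multiple-realization logic of `KnnImputer`.

```python
import math
from collections import Counter

# Made-up records: four have a known construction type, one does not.
buildings = [
    {'lat': 37.770, 'lon': -122.420, 'constype': 'W'},
    {'lat': 37.771, 'lon': -122.421, 'constype': 'W'},
    {'lat': 37.772, 'lon': -122.419, 'constype': 'M'},
    {'lat': 37.800, 'lon': -122.400, 'constype': 'S'},
    {'lat': 37.7705, 'lon': -122.4205, 'constype': None},  # to be imputed
]


def knn_impute(records, feature, k=3):
    """Fill missing `feature` values with the most common value among the
    k nearest records (by lat/lon distance) that do have the feature."""
    known = [r for r in records if r[feature] is not None]
    filled = []
    for rec in records:
        if rec[feature] is not None:
            filled.append(dict(rec))
            continue
        nearest = sorted(
            known,
            key=lambda r: math.hypot(r['lat'] - rec['lat'],
                                     r['lon'] - rec['lon']),
        )[:k]
        votes = Counter(r[feature] for r in nearest)
        filled.append({**rec, feature: votes.most_common(1)[0][0]})
    return filled


imputed = knn_impute(buildings, 'constype', k=3)
print(imputed[-1]['constype'])
```

A majority vote suits categorical attributes such as construction type; for numeric attributes such as building height, an average or a random draw from the neighbors would be the analogous choice.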


In [9]:
knn_imputer_class = importer.get_class('KnnImputer')

imputer = knn_imputer_class(
    nsi_inventory, 
    n_possible_worlds=NO_POSSIBLE_WORLDS,
    exclude_features=['lat', 'lon', 'fd_id']
)
imputed_inventory = imputer.impute()



Existing worlds: 1
New worlds per existing world: 1
world # 0
Features with no reference data cannot be imputed. Removing them from the imputation target: FloodZone
Missing percentages among 160427 assets
buildingheight: 12.81%
erabuilt: 8.05%
numstories: 7.88%
roofshape: 99.86%
fparea: 8.09%
repaircost: 8.09%
constype: 8.09%
occupancy: 8.09%
found_ht: 8.09%
ground_elv: 8.09%
splitlevel: 8.09%
basement: 28.09%
Primitive imputation done.
Running the main imputation. This may take a while.
Enumerating clusters: 20 among 321
Enumerating clusters: 40 among 321
Enumerating clusters: 60 among 321
Enumerating clusters: 80 among 321
Enumerating clusters: 100 among 321
Enumerating clusters: 120 among 321
Enumerating clusters: 140 among 321
Enumerating clusters: 160 among 321
Enumerating clusters: 180 among 321
Enumerating clusters: 200 among 321
Enumerating clusters: 220 among 321
Enumerating clusters: 240 among 321
Enumerating clusters: 260 among 321
Enumerating clusters: 280 among 321
Enumera

# Add Household Income Feature Using a Lognormal Distribution
To enrich the inventory with socioeconomic data, we add a household income attribute by sampling from a **lognormal distribution**. Income is typically right-skewed, making the lognormal a natural choice.

We begin by defining the state average household income (`CA_AVG`) and assuming a 50% coefficient of variation, which sets the standard deviation (`CA_STD_DEV`) to half the mean. From these, we calculate the parameters (`mu` and `sigma`) of the underlying normal distribution. Finally, we draw a lognormal sample for each building and assign it as the `Income` feature in the imputed inventory.

In [10]:
CA_AVG = 78672  # state average
CA_STD_DEV = CA_AVG*0.5  # standard deviation for a 50% coefficient of variation

# Step 1: Calculate the parameters of the underlying normal distribution:
mu = np.log(CA_AVG**2 /
            np.sqrt(CA_STD_DEV**2 + CA_AVG**2))
sigma = np.sqrt(np.log(1 + (CA_STD_DEV**2 / CA_AVG**2)))

# Step 2: Generate the lognormal sample using the parameters of the normal
# distribution:
for key, val in imputed_inventory.inventory.items():
    lognormal_sample = np.random.lognormal(
        mean=mu, 
        sigma=sigma, 
        size=NO_POSSIBLE_WORLDS
    )
    val.add_features({"Income": lognormal_sample[0]})
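
As a quick check on the moment-matching formulas, the sketch below recomputes `mu` and `sigma` and confirms that a lognormal distribution with these parameters has the intended mean and standard deviation. It uses only the Python standard library rather than NumPy so that it runs on its own; the seed and sample size are arbitrary choices.

```python
import math
import random

CA_AVG = 78672             # target mean of the income distribution
CA_STD_DEV = CA_AVG * 0.5  # target standard deviation (50% CoV)

# Same moment-matching formulas as in the cell above:
mu = math.log(CA_AVG**2 / math.sqrt(CA_STD_DEV**2 + CA_AVG**2))
sigma = math.sqrt(math.log(1 + CA_STD_DEV**2 / CA_AVG**2))

# Closed-form moments of a lognormal with parameters (mu, sigma):
analytic_mean = math.exp(mu + sigma**2 / 2)
analytic_std = analytic_mean * math.sqrt(math.exp(sigma**2) - 1)

# Empirical check with a seeded generator (seed chosen arbitrarily):
rng = random.Random(0)
samples = [rng.lognormvariate(mu, sigma) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)

print(f'analytic mean = {analytic_mean:.1f}')
print(f'analytic std  = {analytic_std:.1f}')
print(f'sample mean   = {sample_mean:.1f}')
```

The analytic values recover `CA_AVG` and `CA_STD_DEV` up to floating-point error, which follows from substituting the two formulas into the lognormal mean, exp(mu + sigma^2 / 2), and variance, (exp(sigma^2) - 1) * mean^2.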

# Change Attribute Keys for Compatibility with R2D 

In this step, we specify the keys that will be used when enriching the inventory.  Some of these attributes represent new attributes to be inferred, while others correspond to existing attributes that will serve as input to rulesets to predict the derived attributes.

In [11]:
# The names of NEW keys to be inferred:
STRUCTURE_TYPE_KEY = 'StructureTypeHazus'      # Instead of NSI "constype"
REPLACEMENT_COST_KEY = 'ReplacementCostHazus'  # Instead of NSI "repaircost"

# The names of existing keys to be used as "predictors":
YEAR_BUILT_KEY = 'erabuilt'
OCCUPANCY_CLASS_KEY = 'occupancy'
INCOME_KEY = 'Income'
NUMBER_OF_STORIES_KEY = 'numstories'
PLAN_AREA_KEY = 'fpAreas'
SPLIT_LEVEL_KEY = 'splitlevel'

# Infer Hazus-Compatible Features for Earthquake Analysis

With the baseline inventory complete, the next step is to infer Hazus-compatible features that are required for earthquake loss analysis. We use the `HazusInfererEarthquake` class from BRAILS++ to get these attributes.  

The `HazusInfererEarthquake` class takes the enriched inventory (in this case, `imputed_inventory`) and uses predictors such as year built, occupancy class, number of stories, and income to infer key attributes like structure type and replacement cost, which are required for Hazus-style damage and loss analysis. In this example, the `clean_features` argument is set to `False` when initializing `HazusInfererEarthquake`. This ensures that both the original predictors and the newly inferred features are retained in `hazus_inferred_inventory`, rather than limiting the dataset to only the attributes needed for Hazus damage and loss analysis.

The resulting `hazus_inferred_inventory`, produced by the `infer` method, is fully aligned with Hazus requirements and ready for use in R2D for regional-scale seismic loss analysis.

In [12]:
infer_features_for_hazuseq = importer.get_class("HazusInfererEarthquake")

inferer = infer_features_for_hazuseq(
    input_inventory=imputed_inventory,
    n_possible_worlds=NO_POSSIBLE_WORLDS,
    yearBuilt_key=YEAR_BUILT_KEY,
    occupancyClass_key=OCCUPANCY_CLASS_KEY,
    numberOfStories_key=NUMBER_OF_STORIES_KEY,
    income_key=INCOME_KEY,
    splitLevel_key=SPLIT_LEVEL_KEY,
    structureType_key=STRUCTURE_TYPE_KEY,
    replacementCost_key=REPLACEMENT_COST_KEY,
    planArea_key=PLAN_AREA_KEY,
    clean_features=False
)

hazus_inferred_inventory = inferer.infer()

>> Step1 : Checking if OccupancyClass (occupancy) exist.
>> Step2-1 : Checking if StructureType (StructureTypeHazus) and ReplacementCost (ReplacementCostHazus) exist
>> Step2-2 : Inferring {'StructureTypeHazus', 'ReplacementCostHazus'}




Done inference. It took 1.50 mins
>> Step3-1 : Checking if HeightClass (HeightClass), DesignLevel (DesignLevel) and FoundationType (FoundationType) exist
>> Step3-2 : Inferring {'HeightClass', 'DesignLevel', 'FoundationType'}




The feature StructureTypeHazus is missing in many buildings including:  [18, 132, 163, 188, 243, 279, 289, 296, 316, 318]
>> Step4 : Changing feature names to what R2D (pelicun) can recognize
Done inference. It took 2.92 mins




As the warning message indicates, the inference rulesets in BRAILS++ could not infer a Hazus `StructureType` for several building classes (e.g., West Coast-IND1-mid_rise-pre_1950), because some of these classes do not exist in the Hazus inventory definition. The failed inference has also led to missing `DesignLevel` values in the inventory produced here. Below, imputation is used to estimate the structure types that the rulesets could not assign, and the inferer is then run again to estimate `DesignLevel`.
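
Before imputing, it can help to quantify how many assets lack a given feature. The sketch below does this on a plain dictionary that stands in for the inventory; the data, the `missing_ids` helper, and the id-to-features layout are made up for illustration and are not the BRAILS++ inventory API.

```python
# Made-up stand-in for an inventory: asset id -> feature dictionary.
inventory = {
    0: {'StructureType': 'W1', 'DesignLevel': 'HC'},
    1: {'StructureType': None, 'DesignLevel': None},
    2: {'StructureType': 'C2', 'DesignLevel': 'MC'},
    3: {'DesignLevel': None},  # feature key absent altogether
}


def missing_ids(assets, feature):
    """Return the ids of assets whose `feature` is absent or None."""
    return [
        asset_id for asset_id, features in assets.items()
        if features.get(feature) is None
    ]


missing = missing_ids(inventory, 'StructureType')
share = 100 * len(missing) / len(inventory)
print(f'StructureType missing for ids {missing} ({share:.2f}% of assets)')
```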

# Re-run KNN Imputation to Fill Remaining Missing Values

Some attributes in the Hazus-inferred inventory may still be missing because they cannot be determined directly from Hazus rulesets. We use the K-Nearest Neighbors (KNN) imputer to fill these remaining gaps. This step ensures that all missing attributes are estimated, producing a fully populated inventory. 

In [13]:
imputer = knn_imputer_class(
    hazus_inferred_inventory, 
    n_possible_worlds=NO_POSSIBLE_WORLDS
)

hazus_inferred_inventory_imputed = imputer.impute()



Existing worlds: 1
New worlds per existing world: 1
world # 0
Features with no reference data cannot be imputed. Removing them from the imputation target: FloodZone
Missing percentages among 160427 assets
lon: 8.09%
lat: 8.09%
fd_id: 8.09%
StructureType: 0.11%
Primitive imputation done.
Running the main imputation. This may take a while.
Enumerating clusters: 20 among 321
Enumerating clusters: 40 among 321
Enumerating clusters: 60 among 321
Enumerating clusters: 80 among 321
Enumerating clusters: 100 among 321
Enumerating clusters: 120 among 321
Enumerating clusters: 140 among 321
Enumerating clusters: 160 among 321
Enumerating clusters: 180 among 321
Enumerating clusters: 200 among 321
Enumerating clusters: 220 among 321
Enumerating clusters: 240 among 321
Enumerating clusters: 260 among 321
Enumerating clusters: 280 among 321
Enumerating clusters: 300 among 321
Enumerating clusters: 320 among 321
Done imputation. It took 0.19 mins


# Generate the Final Hazus-Compatible Inventory for Damage and Loss Analysis

In this step, we initialize a new `HazusInfererEarthquake` instance to create the final inventory for Hazus-based damage and loss modeling. We use the `'StructureType'` data from the previous KNN imputation combined with Hazus rulesets to complete the inventory.

In [14]:
# Initialize the Hazus Inferer for damage and loss analysis (HazusDL):
HazusDLInferer = importer.get_class('HazusInfererEarthquake')

# Create an instance of the inferer using the imputed Hazus inventory.
# In this step, we use the StructureType data obtained from the previous imputation
# with the Hazus rulesets to generate the final Hazus-compatible inventory:
inferer = HazusDLInferer(
    input_inventory=hazus_inferred_inventory_imputed,
    n_possible_worlds=NO_POSSIBLE_WORLDS,
    yearBuilt_key='erabuilt',
    structureType_key='StructureType',
    clean_features=False
)

# Run the inference to produce the final Hazus-compatible inventory:
hazus_inventory_final = inferer.infer()

>> Step1 : Checking if OccupancyClass (OccupancyClass) exist.
>> Step2-1 : Checking if StructureType (StructureType) and ReplacementCost (ReplacementCost) exist
>> Step2-2 : Inferring {'ReplacementCost'}




Done inference. It took 1.11 mins
>> Step3-1 : Checking if HeightClass (HeightClass), DesignLevel (DesignLevel) and FoundationType (FoundationType) exist
>> Step3-2 : Inferring {'DesignLevel'}
>> Step4 : Changing feature names to what R2D (pelicun) can recognize
Done inference. It took 2.58 mins




# Change Attribute Names to Make Them Compatible with R2D
To prepare the final inventory for R2D analysis, we perform two key steps:
1. Rename Features to Match R2D Naming Conventions
   - `'erabuilt'` → `'YearBuilt'`  
   - `'lat'` → `'Latitude'`  
   - `'lon'` → `'Longitude'`  
   - `'fpAreas'` → `'PlanArea'`  
   - `'numstories'` → `'NumberOfStories'`
2. Assign Unique IDs to Each Building

In [15]:
# Rename selected features to match R2D naming conventions:
hazus_inventory_final.change_feature_names({
    'erabuilt': 'YearBuilt',
    'lat': 'Latitude',
    'lon': 'Longitude',
    'fpAreas': 'PlanArea',
    'numstories': 'NumberOfStories'
})

# Assign a unique ID to each building in the inventory:
for idx, (_, val) in enumerate(hazus_inventory_final.inventory.items()):
    val.add_features({"id": idx})
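
The effect of the two steps above can be illustrated on plain dictionaries. The sketch below applies the same renaming map and sequential IDs to made-up feature dictionaries; it mimics the outcome only and does not call BRAILS++.

```python
RENAME_MAP = {
    'erabuilt': 'YearBuilt',
    'lat': 'Latitude',
    'lon': 'Longitude',
    'fpAreas': 'PlanArea',
    'numstories': 'NumberOfStories',
}

# Made-up feature dictionaries standing in for two assets:
assets = [
    {'erabuilt': 1925, 'lat': 37.77, 'lon': -122.42, 'numstories': 3},
    {'erabuilt': 1988, 'lat': 37.78, 'lon': -122.41, 'numstories': 12},
]

renamed = []
for idx, features in enumerate(assets):
    # Keys not listed in RENAME_MAP are kept unchanged.
    new_features = {RENAME_MAP.get(k, k): v for k, v in features.items()}
    new_features['id'] = idx  # sequential, zero-based identifier
    renamed.append(new_features)

print(renamed[0])
```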

# Write the Created Inventory in a GeoJSON File
After completing all imputation, inference, and feature standardization steps, we save the final Hazus-compatible inventory to a GeoJSON file for use in R2D or visualization.

In [27]:
geojson_data = hazus_inventory_final.write_to_geojson(
    output_file=INVENTORY_OUTPUT
)

Wrote 160427 assets to /home/bacetiner/Documents/BrailsPlusPlus/examples/inventory_creation/SFInventory_EQ.geojson
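
A quick sanity check of the written file is to load it with the standard `json` module and summarize a property. The sketch below does this on a tiny hand-written `FeatureCollection` so that it runs on its own; the feature values are made up, and the property names are assumed to match those used in the tooltip fields later in this notebook.

```python
import json
from collections import Counter

# Tiny stand-in for the real output. To inspect the real file instead,
# load it with json.load(open(INVENTORY_OUTPUT)).
sample = {
    'type': 'FeatureCollection',
    'features': [
        {
            'type': 'Feature',
            'geometry': {
                'type': 'Polygon',
                'coordinates': [[[-122.42, 37.77], [-122.41, 37.77],
                                 [-122.41, 37.78], [-122.42, 37.77]]],
            },
            'properties': {'StructureType': 'W1', 'YearBuilt': 1925},
        },
        {
            'type': 'Feature',
            'geometry': {'type': 'Point', 'coordinates': [-122.40, 37.80]},
            'properties': {'StructureType': 'C2', 'YearBuilt': 1988},
        },
        {
            'type': 'Feature',
            'geometry': {'type': 'Point', 'coordinates': [-122.43, 37.76]},
            'properties': {'StructureType': 'W1', 'YearBuilt': 1940},
        },
    ],
}

# Round-trip through JSON text, as would happen when reading from disk:
collection = json.loads(json.dumps(sample))

features = collection['features']
counts = Counter(f['properties'].get('StructureType') for f in features)

print(f'{len(features)} features')
print(counts.most_common())
```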


# Plot the Created Inventory
This section provides a sample workflow for visualizing the created building inventory on an interactive map. The process consists of the following steps:
1. **Extract Footprint Coordinates**: Retrieve the coordinates of all building footprints from the inventory.
2. **Flatten Coordinate Lists**: Flatten the nested lists of points from each building footprint into a single list for processing.
3. **Calculate Map Center**: Compute the geographic center of all footprints to center the map appropriately.
4. **Initialize the Interactive Map**: Create a map centered on the computed location, using a clean, light basemap for clarity.
5. **Add Building Footprints with Tooltips**: Add the building footprints as a GeoJSON layer, including tooltips that display key attributes such as income, structure type, year built, and number of stories.
6. **Display the Map**: Render the interactive map for exploration.

In [None]:
# Extract building footprint coordinates from the inventory
inventory_footprints, _ = hazus_inventory_final.get_coordinates()

# Flatten the nested coordinate lists into a single list of points
all_coords = [coord for path in inventory_footprints for coord in path]

# Calculate the geographic center of all footprints for map centering
center_lat = sum(point[1] for point in all_coords) / len(all_coords)
center_lon = sum(point[0] for point in all_coords) / len(all_coords)

# Initialize an interactive map centered on the footprints
m = folium.Map(
    location=(center_lat, center_lon),
    tiles="cartodbpositron",  # Light, clean basemap style
    zoom_start=13
)

# Add building footprints as a GeoJSON layer with tooltips showing key attributes
folium.GeoJson(
    geojson_data,
    name="geojson",
    tooltip=folium.GeoJsonTooltip(
        fields=[
            'Income',
            'StructureType',
            'HeightClass',
            'DesignLevel',
            'FoundationType',
            'OccupancyClass',
            'YearBuilt',
            'Latitude',
            'Longitude',
            'NumberOfStories'
        ],
        sticky=False
    )
).add_to(m)

# Display the interactive map:
m
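
Averaging all vertices, as done above, weights footprints with many vertices more heavily. An alternative worth considering is the midpoint of the bounding box, sketched below on made-up footprints in the same (longitude, latitude) order used in the cell above.

```python
# Made-up footprints: each is a list of (lon, lat) vertices. The first has
# more vertices than the second.
footprints = [
    [(-122.42, 37.77), (-122.42, 37.78), (-122.41, 37.78),
     (-122.41, 37.77), (-122.415, 37.765), (-122.42, 37.77)],
    [(-122.40, 37.80), (-122.39, 37.80), (-122.39, 37.81)],
]

# Flatten the nested coordinate lists into a single list of points:
all_coords = [coord for path in footprints for coord in path]

# Vertex average, as in the cell above:
mean_lon = sum(lon for lon, _ in all_coords) / len(all_coords)
mean_lat = sum(lat for _, lat in all_coords) / len(all_coords)

# Bounding-box midpoint: unaffected by how many vertices each footprint has.
lons = [lon for lon, _ in all_coords]
lats = [lat for _, lat in all_coords]
bbox_lon = (min(lons) + max(lons)) / 2
bbox_lat = (min(lats) + max(lats)) / 2

# Southwest and northeast corners in (lat, lon) order:
bounds = [(min(lats), min(lons)), (max(lats), max(lons))]

print('vertex average:', (mean_lat, mean_lon))
print('bbox midpoint: ', (bbox_lat, bbox_lon))
```

With folium, the `bounds` list in (lat, lon) order can also be passed to the map's `fit_bounds` method so that the initial view covers the whole inventory instead of relying on a fixed `zoom_start`.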