# Analyzing FEMA's National Flood Insurance Program (NFIP) Data With DuckDB
Author: Mark Bauer

# Introduction

The FEMA National Flood Insurance Program (NFIP) website offers a trove of valuable information. Among its highlights is the [Flood Insurance Data and Analytics](https://nfipservices.floodsmart.gov/reports-flood-insurance-data) section, featuring data visualizations, tables, and reports. This project was inspired by these resources, particularly two Excel files: Financial Losses by State, and Policy and Loss Statistics by Flood Zone.

This analysis focuses solely on the [NFIP Redacted Claims](https://www.fema.gov/openfema-data-page/fima-nfip-redacted-claims-v2) dataset, which is conveniently available as a Parquet file. Given the size of this dataset, it was a good opportunity to use and learn more about [DuckDB](https://duckdb.org/), an in-process SQL OLAP database management system.

# OpenFEMA Dataset: FIMA NFIP Redacted Claims - v2
Federal Emergency Management Agency (FEMA), OpenFEMA Dataset: FIMA NFIP Redacted Claims - v2. Retrieved from https://www.fema.gov/openfema-data-page/fima-nfip-redacted-claims-v2. This product uses the FEMA OpenFEMA API, but is not endorsed by FEMA. The Federal Government or FEMA cannot vouch for the data or analyses derived from these data after the data have been retrieved from the Agency's website(s).

Read more about [OpenFEMA Terms and Conditions](https://www.fema.gov/about/openfema/terms-conditions).

Dataset Description:
>This dataset provides details on NFIP claims transactions. It is derived from the NFIP system of record, staged in the NFIP reporting platform and redacted to protect policy holder personally identifiable information.
>
>This dataset is not intended to be an official federal report, and should not be considered an official federal report.

About the National Flood Insurance Program:  
>Congress passed the National Flood Insurance Act (NFIA), 42 U.S.C. 4001 in 1968, creating the National Flood Insurance Program (NFIP) in order to reduce future flood losses through flood hazard identification, floodplain management, and providing insurance protection. The Department of Housing and Urban Development (HUD) originally administered the NFIP, and Congress subsequently transferred the NFIP to FEMA upon its creation in 1979. FEMA and insurance companies participating in FEMA's Write Your Own (WYO) program offer NFIP insurance coverage for building structures as well as for contents and personal property within the building structures to eligible and insurable properties. The WYO program began in 1983 with NFIP operating under Part B of the NFIA and allows FEMA to authorize private insurance companies to issue the Standard Flood Insurance Policy (SFIP) as FEMA's fiduciary and fiscal agent. FEMA administers NFIP by ensuring insurance applications are processed properly; determining correct premiums; renewing, reforming, and cancelling insurance policies; transferring policies from the seller of the property to the purchaser of the property in certain circumstances; and processing insurance claims.
>
>The paid premiums of SFIPs and claims payments for damaged property are processed through the National Flood Insurance Fund (NFIF). NFIF was established by the National Flood Insurance Act of 1968 (42 U.S.C. 4001, et seq.), and is a centralized premium revenue and fee-generated fund that supports NFIP, which holds these U.S. Treasury funds.
>
>The Flood Insurance Claims Manual (https://nfipservices.floodsmart.gov/insurance-manuals) provides claims guidance to WYOs, vendors, adjusters, and examiners so that policyholders experience consistent and reliable service. The Manual provides processes for handling claims from the notice of loss to final payment. The NFIP has provided answers to Frequently Asked Questions (FAQs) to assist the public in understanding and navigating the data our program makes available: https://www.fema.gov/sites/default/files/documents/fema_nfip-data-faqs.pdf.

Source: [OpenFEMA Dataset: FIMA NFIP Redacted Claims - v2](https://www.fema.gov/openfema-data-page/fima-nfip-redacted-claims-v2)

# Data Dictionary
View the data dictionary on OpenFEMA under the [Data Fields](https://www.fema.gov/openfema-data-page/fima-nfip-redacted-claims-v2) section.


In [1]:
# import libraries
import duckdb
import pandas as pd

In [2]:
# reproducibility
%reload_ext watermark
%watermark -v -p duckdb,pandas

Python implementation: CPython
Python version       : 3.11.0
IPython version      : 8.6.0

duckdb: 1.0.0
pandas: 1.5.1



In [3]:
# data path
%ls data/

FimaNfipClaims.parquet


# Create a DuckDB database instance using the Python client

In [4]:
# create a DuckDB database instance
con = duckdb.connect()

# create the claims table from the Parquet file
con.execute("""
    CREATE TABLE claims AS
        FROM read_parquet('data/FimaNfipClaims.parquet')
""")

# sanity check
con.sql("""
    SELECT *
    FROM claims
    LIMIT 5
""").show()

┌──────────────────────┬──────────────────────┬──────────────────────┬───┬──────────────────────┬──────────┬───────────┐
│          id          │ agricultureStructu…  │       asOfDate       │ … │ censusBlockGroupFips │ latitude │ longitude │
│       varchar        │       boolean        │      timestamp       │   │       varchar        │  double  │  double   │
├──────────────────────┼──────────────────────┼──────────────────────┼───┼──────────────────────┼──────────┼───────────┤
│ e7af3d9f-b605-4653…  │ false                │ 2020-11-13 14:50:3…  │ … │ 010030114073         │     30.3 │     -87.7 │
│ bbaeaf64-c162-41bf…  │ false                │ 2020-12-11 16:25:4…  │ … │ 010030114072         │     30.3 │     -87.7 │
│ 256da746-b30b-4129…  │ false                │ 2020-03-27 12:15:4…  │ … │ 040131089022         │     33.5 │    -112.1 │
│ e3dcbb27-a2d0-4a9e…  │ false                │ 2020-03-27 12:15:4…  │ … │ 040131089022         │     33.5 │    -112.1 │
│ f77efb94-0188-4fa0…  │ false  

In [5]:
# list tables and schemas
con.sql("SHOW ALL TABLES").df()

Unnamed: 0,database,schema,name,column_names,column_types,temporary
0,memory,main,claims,"[id, agricultureStructureIndicator, asOfDate, ...","[VARCHAR, BOOLEAN, TIMESTAMP, SMALLINT, SMALLI...",False


In [6]:
# count of rows
con.sql("""
    SELECT
        COUNT(*) AS count_rows
    FROM claims
""")

┌────────────┐
│ count_rows │
│   int64    │
├────────────┤
│    2671486 │
└────────────┘

In [7]:
# count of columns
con.sql("""
    SELECT
        COUNT(column_name) AS count_columns
    FROM (DESCRIBE FROM claims)
""")

┌───────────────┐
│ count_columns │
│     int64     │
├───────────────┤
│            73 │
└───────────────┘

# Examine Dataset

## Column Info

In [8]:
# examine column datatypes
con.sql("""
    SELECT
        column_name,
        column_type
    FROM (DESCRIBE claims)
""").show(max_rows=80)

┌────────────────────────────────────────────┬─────────────┐
│                column_name                 │ column_type │
│                  varchar                   │   varchar   │
├────────────────────────────────────────────┼─────────────┤
│ id                                         │ VARCHAR     │
│ agricultureStructureIndicator              │ BOOLEAN     │
│ asOfDate                                   │ TIMESTAMP   │
│ basementEnclosureCrawlspaceType            │ SMALLINT    │
│ policyCount                                │ SMALLINT    │
│ crsClassificationCode                      │ SMALLINT    │
│ dateOfLoss                                 │ DATE        │
│ elevatedBuildingIndicator                  │ BOOLEAN     │
│ elevationCertificateIndicator              │ VARCHAR     │
│ elevationDifference                        │ DOUBLE      │
│ baseFloodElevation                         │ DOUBLE      │
│ ratedFloodZone                             │ VARCHAR     │
│ houseWorship          

In [9]:
# examine column null percentage
con.sql("""
    SELECT
        column_name,
        null_percentage
    FROM (SUMMARIZE FROM claims)
    WHERE null_percentage > 0
    ORDER BY null_percentage DESC
""").show(max_rows=80)

┌────────────────────────────────────────────┬─────────────────┐
│                column_name                 │ null_percentage │
│                  varchar                   │  decimal(9,2)   │
├────────────────────────────────────────────┼─────────────────┤
│ floodCharacteristicsIndicator              │           98.51 │
│ eventDesignationNumber                     │           94.34 │
│ lowestAdjacentGrade                        │           81.26 │
│ crsClassificationCode                      │           80.65 │
│ nonPaymentReasonBuilding                   │           78.02 │
│ elevationCertificateIndicator              │           77.72 │
│ lowestFloorElevation                       │           76.38 │
│ baseFloodElevation                         │           75.70 │
│ elevationDifference                        │           72.95 │
│ floodZoneCurrent                           │           72.15 │
│ nfipCommunityNumberCurrent                 │           72.10 │
│ nfipCommunityName      
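
One way to act on these percentages is to shortlist columns that may be too sparse for a given analysis. The sketch below copies the first few rows of the output above into a dictionary and applies a cutoff. The 80% threshold and the names `THRESHOLD` and `sparse_columns` are illustrative assumptions, not part of the original notebook.

```python
# null_percentage values copied from the first rows of the output above
null_percentage = {
    "floodCharacteristicsIndicator": 98.51,
    "eventDesignationNumber": 94.34,
    "lowestAdjacentGrade": 81.26,
    "crsClassificationCode": 80.65,
    "nonPaymentReasonBuilding": 78.02,
    "elevationCertificateIndicator": 77.72,
}

# hypothetical cutoff: treat columns with more than 80% NULLs as too sparse
THRESHOLD = 80.0

sparse_columns = [
    name for name, pct in null_percentage.items() if pct > THRESHOLD
]
print(sparse_columns)
```

Whether a sparse column should actually be dropped depends on the question being asked; a mostly empty indicator can still be informative for the rows where it is present.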

## Preview Data

In [10]:
# preview data
con.sql("""
    SELECT *
    FROM claims
    LIMIT 5
""").show()

┌──────────────────────┬──────────────────────┬──────────────────────┬───┬──────────────────────┬──────────┬───────────┐
│          id          │ agricultureStructu…  │       asOfDate       │ … │ censusBlockGroupFips │ latitude │ longitude │
│       varchar        │       boolean        │      timestamp       │   │       varchar        │  double  │  double   │
├──────────────────────┼──────────────────────┼──────────────────────┼───┼──────────────────────┼──────────┼───────────┤
│ e7af3d9f-b605-4653…  │ false                │ 2020-11-13 14:50:3…  │ … │ 010030114073         │     30.3 │     -87.7 │
│ bbaeaf64-c162-41bf…  │ false                │ 2020-12-11 16:25:4…  │ … │ 010030114072         │     30.3 │     -87.7 │
│ 256da746-b30b-4129…  │ false                │ 2020-03-27 12:15:4…  │ … │ 040131089022         │     33.5 │    -112.1 │
│ e3dcbb27-a2d0-4a9e…  │ false                │ 2020-03-27 12:15:4…  │ … │ 040131089022         │     33.5 │    -112.1 │
│ f77efb94-0188-4fa0…  │ false  

In [11]:
# preview data as pandas dataframe for readability
sql = """
    SELECT *
    FROM claims
    LIMIT 5
"""

# examine each column in sections because of large number of columns
con.sql(sql).df().iloc[:, :15]

Unnamed: 0,id,agricultureStructureIndicator,asOfDate,basementEnclosureCrawlspaceType,policyCount,crsClassificationCode,dateOfLoss,elevatedBuildingIndicator,elevationCertificateIndicator,elevationDifference,baseFloodElevation,ratedFloodZone,houseWorship,locationOfContents,lowestAdjacentGrade
0,e7af3d9f-b605-4653-958f-bc48f413766c,False,2020-11-13 14:50:38.288,2,1,,2020-09-16,False,,6.0,10.0,AE,False,,0.0
1,bbaeaf64-c162-41bf-a399-862d0e832447,False,2020-12-11 16:25:40.587,0,1,,2020-09-16,True,,4.0,10.0,AE,False,7.0,4.4
2,256da746-b30b-4129-a391-93f50a6190c8,False,2020-03-27 12:15:45.887,0,1,,2016-08-02,False,2.0,,,AH,False,3.0,
3,e3dcbb27-a2d0-4a9e-a70d-ff7e578d4e6a,False,2020-03-27 12:15:45.887,0,1,,2014-09-08,False,2.0,,,AH,False,3.0,
4,f77efb94-0188-4fa0-b52e-3aa71499baeb,False,2020-03-26 12:56:27.476,0,1,,2018-01-09,True,,,,AE,False,,


In [12]:
# slice through columns
con.sql(sql).df().iloc[:, 15:30]

Unnamed: 0,lowestFloorElevation,numberOfFloorsInTheInsuredBuilding,nonProfitIndicator,obstructionType,occupancyType,originalConstructionDate,originalNBDate,amountPaidOnBuildingClaim,amountPaidOnContentsClaim,amountPaidOnIncreasedCostOfComplianceClaim,postFIRMConstructionIndicator,rateMethod,smallBusinessIndicatorBuilding,totalBuildingInsuranceCoverage,totalContentsInsuranceCoverage
0,16.0,2,False,,2,1975-01-01,2008-10-20,2695.44,0.0,0.0,True,8,False,7400,0
1,14.4,3,False,54.0,1,1991-09-17,2017-10-05,2750.18,121.0,0.0,True,8,False,10800,700
2,,1,False,,1,1949-01-01,2014-09-02,,,,False,B,False,153500,15800
3,,1,False,,1,1949-01-01,2014-09-02,25156.27,0.0,0.0,False,B,False,139500,15000
4,,1,False,10.0,1,1960-01-01,2001-08-05,,,,False,1,False,250000,0


In [13]:
# slice through columns
con.sql(sql).df().iloc[:, 30:45]

Unnamed: 0,yearOfLoss,primaryResidenceIndicator,buildingDamageAmount,buildingDeductibleCode,netBuildingPaymentAmount,buildingPropertyValue,causeOfDamage,condominiumCoverageTypeCode,contentsDamageAmount,contentsDeductibleCode,netContentsPaymentAmount,contentsPropertyValue,disasterAssistanceCoverageRequired,eventDesignationNumber,ficoNumber
0,2020,False,3945.0,F,2695.44,167531.0,1,N,0.0,,0.0,0.0,0,AL0520,
1,2020,True,4000.0,F,2750.18,278027.0,1,N,1371.0,F,121.0,50000.0,0,AL0520,
2,2016,True,1698.0,5,0.0,140422.0,4,N,3690.0,5,0.0,21500.0,0,,
3,2014,True,30098.0,5,25156.27,140919.0,4,N,,5,0.0,,0,,
4,2018,True,,2,0.0,,4,N,,0,0.0,,0,,368.0


In [14]:
# slice through columns
con.sql(sql).df().iloc[:, 45:60]

Unnamed: 0,floodCharacteristicsIndicator,floodWaterDuration,floodproofedIndicator,floodEvent,iccCoverage,netIccPaymentAmount,nfipRatedCommunityNumber,nfipCommunityNumberCurrent,nfipCommunityName,nonPaymentReasonContents,nonPaymentReasonBuilding,numberOfUnits,buildingReplacementCost,contentsReplacementCost,replacementCostBasis
0,,,False,Hurricane Sally,30000,0.0,15005,0,,,,0,185457.0,0.0,A
1,,,False,Hurricane Sally,30000,0.0,15005,0,,,,0,314835.0,75000.0,A
2,,0.0,False,,30000,0.0,40051,0,,1.0,1.0,0,175528.0,0.0,R
3,,0.0,False,,30000,0.0,40051,0,,,,0,173974.0,,R
4,,0.0,False,Mid-Winter California Flooding,30000,0.0,60331,0,,,6.0,0,,,A


In [15]:
# slice through columns
con.sql(sql).df().iloc[:, 60:75]

Unnamed: 0,stateOwnedIndicator,waterDepth,floodZoneCurrent,buildingDescriptionCode,rentalPropertyIndicator,state,reportedCity,reportedZipCode,countyCode,censusTract,censusBlockGroupFips,latitude,longitude
0,False,0,,1,False,AL,Currently Unavailable,36542,1003,1003011407,10030114073,30.3,-87.7
1,False,0,AE,1,False,AL,Currently Unavailable,36542,1003,1003011407,10030114072,30.3,-87.7
2,False,1,,1,False,AZ,Currently Unavailable,85015,4013,4013108902,40131089022,33.5,-112.1
3,False,1,,1,False,AZ,Currently Unavailable,85015,4013,4013108902,40131089022,33.5,-112.1
4,False,0,AE,1,False,CA,Currently Unavailable,93108,6083,6083001402,60830014022,34.4,-119.6
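
In the rendered preview above, `countyCode`, `censusTract`, and `censusBlockGroupFips` appear without their leading zeros (for example `1003`), even though the columns are stored as VARCHAR and the DuckDB output shows values such as `010030114073`. This is probably a rendering artifact, which is an assumption on my part. If such codes ever do end up stripped, a minimal sketch for restoring them could look like this; `pad_fips` and `split_county_fips` are hypothetical helper names.

```python
def pad_fips(code, width):
    """Left-pad a numeric FIPS-style code with zeros to `width` characters."""
    text = str(code).strip()
    if not text.isdigit():
        raise ValueError(f"not a numeric code: {code!r}")
    return text.zfill(width)


def split_county_fips(county_code):
    """Split a county FIPS code into (state part, county part)."""
    padded = pad_fips(county_code, 5)  # 2-digit state + 3-digit county
    return padded[:2], padded[2:]


print(pad_fips("1003", 5))
print(pad_fips("10030114073", 12))  # block group codes are 12 digits
print(split_county_fips("48201"))
```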


In [16]:
# count duplicate IDs
con.sql("""
    SELECT
        id,
        COUNT(id) AS count
    FROM claims
    GROUP BY id
    HAVING count > 1
""")

┌─────────┬───────┐
│   id    │ count │
│ varchar │ int64 │
├─────────┴───────┤
│     0 rows      │
└─────────────────┘
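
The `GROUP BY ... HAVING count > 1` pattern above keeps only the ids that occur more than once, so an empty result means every id is unique. The same logic on a small hypothetical list, using only the standard library (`find_duplicates` is an illustrative name):

```python
from collections import Counter


def find_duplicates(ids):
    """Return {id: count} for every id that appears more than once."""
    counts = Counter(ids)
    return {key: n for key, n in counts.items() if n > 1}


# hypothetical ids, not taken from the dataset
print(find_duplicates(["a1", "b2", "c3"]))
print(find_duplicates(["a1", "b2", "a1"]))
```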

In [17]:
# latest record effective date
sql = """
    SELECT
        asOfDate,
    FROM claims
    ORDER BY asOfDate DESC
    LIMIT 1
"""

record_last_updated = (
    con
    .sql(sql)
    .df()
    .loc[:, ['asOfDate']]
    .values[0][0]
)

print(f"Record last updated at: {record_last_updated}")

Record last updated at: 2024-10-04T00:59:55.552000000
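
The value printed above is rendered with nanosecond precision. If a standard-library `datetime` is easier to format, one possible approach is to trim the fraction to microseconds before parsing. This is a sketch that works on the printed string only; `parse_nanosecond_iso` is a hypothetical helper name.

```python
from datetime import datetime


def parse_nanosecond_iso(text):
    """Parse 'YYYY-MM-DDTHH:MM:SS.fffffffff', dropping digits past microseconds."""
    head, _, fraction = text.partition(".")
    micro = fraction[:6].ljust(6, "0")  # keep at most six fractional digits
    return datetime.fromisoformat(f"{head}.{micro}")


parsed = parse_nanosecond_iso("2024-10-04T00:59:55.552000000")
print(parsed.strftime("%Y-%m-%d %H:%M:%S"))
```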


In [18]:
# earliest and latest record effective date
con.sql("""
    SELECT
        min(asOfDate) AS earliestEffectiveDate,
        max(asOfDate) AS latestEffectiveDate
    FROM claims
""")

┌─────────────────────────┬─────────────────────────┐
│  earliestEffectiveDate  │   latestEffectiveDate   │
│        timestamp        │        timestamp        │
├─────────────────────────┼─────────────────────────┤
│ 2019-09-19 06:12:43.388 │ 2024-10-04 00:59:55.552 │
└─────────────────────────┴─────────────────────────┘

In [19]:
# top 5 most recent effective date claim records
con.sql("""
    SELECT
        asOfDate,
        dateOfLoss,
        floodEvent,
        state,
        ROUND(
            amountPaidOnBuildingClaim
            + amountPaidOnContentsClaim
            + amountPaidOnIncreasedCostOfComplianceClaim)::BIGINT AS paidTotalClaim
    FROM claims
    ORDER BY asOfDate DESC
    LIMIT 5
""")

┌─────────────────────────┬────────────┬────────────────────────┬─────────┬────────────────┐
│        asOfDate         │ dateOfLoss │       floodEvent       │  state  │ paidTotalClaim │
│        timestamp        │    date    │        varchar         │ varchar │     int64      │
├─────────────────────────┼────────────┼────────────────────────┼─────────┼────────────────┤
│ 2024-10-04 00:59:55.552 │ 2024-09-16 │ Tropical Cyclone Eight │ NC      │              0 │
│ 2024-10-04 00:58:55.563 │ 2024-09-11 │ Hurricane Francine     │ LA      │           2901 │
│ 2024-10-04 00:57:54.885 │ 2024-09-11 │ Hurricane Francine     │ LA      │              0 │
│ 2024-10-04 00:57:54.695 │ 2024-09-26 │ Hurricane Helene       │ FL      │           7500 │
│ 2024-10-04 00:56:54.47  │ 2024-09-11 │ Hurricane Francine     │ LA      │          57179 │
└─────────────────────────┴────────────┴────────────────────────┴─────────┴────────────────┘

In [20]:
# earliest and latest date of loss in dataset
con.sql("""
    SELECT
        min(dateOfLoss) AS earliestDateOfLoss,
        max(dateOfLoss) AS latestDateOfLoss
    FROM claims
""")

┌────────────────────┬──────────────────┐
│ earliestDateOfLoss │ latestDateOfLoss │
│        date        │       date       │
├────────────────────┼──────────────────┤
│ 1978-01-01         │ 2024-10-03       │
└────────────────────┴──────────────────┘

In [21]:
# top 5 latest claim records by date of loss 
con.sql("""
    SELECT
        dateOfLoss,
        asOfDate,
        floodEvent,
        state,
        ROUND(
            amountPaidOnBuildingClaim
            + amountPaidOnContentsClaim
            + amountPaidOnIncreasedCostOfComplianceClaim)::BIGINT AS paidTotalClaim
    FROM claims
    ORDER BY dateOfLoss DESC
    LIMIT 5
""")

┌────────────┬─────────────────────────┬────────────┬─────────┬────────────────┐
│ dateOfLoss │        asOfDate         │ floodEvent │  state  │ paidTotalClaim │
│    date    │        timestamp        │  varchar   │ varchar │     int64      │
├────────────┼─────────────────────────┼────────────┼─────────┼────────────────┤
│ 2024-10-03 │ 2024-10-03 21:40:08.5   │ NULL       │ FL      │           NULL │
│ 2024-10-03 │ 2024-10-03 22:38:35.955 │ NULL       │ FL      │           NULL │
│ 2024-10-03 │ 2024-10-03 17:21:21.871 │ NULL       │ FL      │           NULL │
│ 2024-10-02 │ 2024-10-03 21:40:01.559 │ NULL       │ VA      │           NULL │
│ 2024-10-02 │ 2024-10-03 21:40:08.5   │ NULL       │ FL      │           NULL │
└────────────┴─────────────────────────┴────────────┴─────────┴────────────────┘
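
The NULL values in `paidTotalClaim` follow from SQL arithmetic: adding NULL to anything yields NULL, so a row with any missing component has no total. The sketch below contrasts that strict sum with a `COALESCE` variant on a hypothetical three-row table. It uses Python's built-in `sqlite3` purely as a stand-in so it runs without the Parquet file; the table `toy_claims` and its columns are made up, and treating a missing amount as 0 is a modeling choice rather than something the original notebook does.

```python
import sqlite3

con_demo = sqlite3.connect(":memory:")
con_demo.execute(
    "CREATE TABLE toy_claims (building REAL, contents REAL, icc REAL)"
)
con_demo.executemany(
    "INSERT INTO toy_claims VALUES (?, ?, ?)",
    [
        (2695.44, 0.0, 0.0),  # all three amounts present
        (None, None, None),   # all three amounts missing
        (1000.0, None, 0.0),  # one amount missing
    ],
)

rows = con_demo.execute("""
    SELECT
        building + contents + icc AS strictTotal,
        COALESCE(building, 0)
            + COALESCE(contents, 0)
            + COALESCE(icc, 0) AS lenientTotal
    FROM toy_claims
    ORDER BY rowid
""").fetchall()
con_demo.close()

for strict_total, lenient_total in rows:
    print(strict_total, lenient_total)
```

The strict form keeps "unknown" distinct from "zero paid", while the lenient form is convenient for rankings but can hide missing data.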

In [22]:
# total insured units in dataset
con.sql("""
    SELECT SUM(policyCount) AS totalPolicyCount
    FROM claims
""")

┌──────────────────┐
│ totalPolicyCount │
│      int128      │
├──────────────────┤
│          3462669 │
└──────────────────┘

## Summary Statistics

In [23]:
# calculate summary statistics of each column
con.sql("""
    SELECT *
    FROM (SUMMARIZE claims)
""").df()

Unnamed: 0,column_name,column_type,min,max,approx_unique,avg,std,q25,q50,q75,count,null_percentage
0,id,VARCHAR,0000094b-1d1a-4dbe-a944-97c62b225b9c,fffffe7c-7ca2-4431-b88b-8a8216ffc934,2623176,,,,,,2671486,0.00
1,agricultureStructureIndicator,BOOLEAN,false,true,2,,,,,,2671486,0.00
2,asOfDate,TIMESTAMP,2019-09-19 06:12:43.388,2024-10-04 00:59:55.552,479449,,,,,,2671486,0.00
3,basementEnclosureCrawlspaceType,SMALLINT,0,4,4,1.195105377264796,1.0618378842847138,0,1,2,2671486,69.92
4,policyCount,SMALLINT,1,1090,399,1.2961583927447122,6.742744486702094,1,1,1,2671486,0.00
...,...,...,...,...,...,...,...,...,...,...,...,...
68,countyCode,VARCHAR,01001,78030,2946,,,,,,2671486,2.32
69,censusTract,VARCHAR,01001020100,78030961200,59577,,,,,,2671486,5.16
70,censusBlockGroupFips,VARCHAR,010010201001,780309612002,117722,,,,,,2671486,5.16
71,latitude,DOUBLE,-36.0,69.9,334,33.75590024438222,5.827326735346191,29.821735515602324,30.65736762121356,39.66343321382219,2671486,1.51


In [24]:
# examine each column in sections because of large number of columns
sql = """
    SELECT *
    FROM (SUMMARIZE claims)
"""

# slice through rows of the summary (one row per dataset column)
con.sql(sql).df().iloc[:25, :]

Unnamed: 0,column_name,column_type,min,max,approx_unique,avg,std,q25,q50,q75,count,null_percentage
0,id,VARCHAR,0000094b-1d1a-4dbe-a944-97c62b225b9c,fffffe7c-7ca2-4431-b88b-8a8216ffc934,2623176,,,,,,2671486,0.0
1,agricultureStructureIndicator,BOOLEAN,false,true,2,,,,,,2671486,0.0
2,asOfDate,TIMESTAMP,2019-09-19 06:12:43.388,2024-10-04 00:59:55.552,479449,,,,,,2671486,0.0
3,basementEnclosureCrawlspaceType,SMALLINT,0,4,4,1.195105377264796,1.0618378842847214,0.0,1.0,2.0,2671486,69.92
4,policyCount,SMALLINT,1,1090,399,1.2961583927447122,6.74274448670211,1.0,1.0,1.0,2671486,0.0
5,crsClassificationCode,SMALLINT,1,10,10,6.708258215272243,1.5263043059761665,5.0,7.0,8.0,2671486,80.65
6,dateOfLoss,DATE,1978-01-01,2024-10-03,16868,,,,,,2671486,0.0
7,elevatedBuildingIndicator,BOOLEAN,false,true,2,,,,,,2671486,0.0
8,elevationCertificateIndicator,VARCHAR,1,E,9,,,,,,2671486,77.72
9,elevationDifference,DOUBLE,-9989.0,998.0,371,1.253409617633019,29.09140773741413,0.0,1.0,2.8109199444058177,2671486,72.95


In [25]:
# slice through rows of the summary (one row per dataset column)
con.sql(sql).df().iloc[25:50, :]

Unnamed: 0,column_name,column_type,min,max,approx_unique,avg,std,q25,q50,q75,count,null_percentage
25,postFIRMConstructionIndicator,BOOLEAN,false,true,2,,,,,,2671486,0.0
26,rateMethod,VARCHAR,1,W,22,,,,,,2671486,1.85
27,smallBusinessIndicatorBuilding,BOOLEAN,false,true,2,,,,,,2671486,0.0
28,totalBuildingInsuranceCoverage,BIGINT,0,243903000,12425,172980.48504127294,1253722.2614965667,39817.0,100309.0,221975.0,2671486,0.0
29,totalContentsInsuranceCoverage,BIGINT,0,6000000,3064,31127.78768283439,50873.49942270765,0.0,11715.0,46213.0,2671486,0.0
30,yearOfLoss,SMALLINT,1978,2024,47,2002.6466337461625,12.783941740449391,1993.0,2005.0,2012.0,2671486,0.0
31,primaryResidenceIndicator,BOOLEAN,false,true,2,,,,,,2671486,0.0
32,buildingDamageAmount,BIGINT,0,927700000,202772,36354.07504655704,812334.7780210063,3430.0,11213.0,40260.0,2671486,23.76
33,buildingDeductibleCode,VARCHAR,0,H,15,,,,,,2671486,11.77
34,netBuildingPaymentAmount,DOUBLE,-162432.16,10006145.71,1286178,24319.109697074808,59591.4128871141,0.0089236811693579,4435.547361433496,24077.63331608833,2671486,0.0


In [26]:
# slice through rows of the summary (one row per dataset column)
con.sql(sql).df().iloc[50:, :]

Unnamed: 0,column_name,column_type,min,max,approx_unique,avg,std,q25,q50,q75,count,null_percentage
50,netIccPaymentAmount,DOUBLE,-6450.0,60000.0,8709,353.69498594789553,3066.941960282318,0.0,0.0,0.0,2671486,0.0
51,nfipRatedCommunityNumber,VARCHAR,000000,999999,16299,,,,,,2671486,0.0
52,nfipCommunityNumberCurrent,VARCHAR,0000,815000,12065,,,,,,2671486,72.1
53,nfipCommunityName,VARCHAR,ABBEVILLE COUNTY *,"ZUMBRO FALLS, CITY OF",9870,,,,,,2671486,70.76
54,nonPaymentReasonContents,VARCHAR,01,99,23,,,,,,2671486,68.95
55,nonPaymentReasonBuilding,VARCHAR,01,99,23,,,,,,2671486,78.02
56,numberOfUnits,INTEGER,0,99999,450,1.3967815247755893,65.44451776434683,1.0,1.0,1.0,2671486,0.11
57,buildingReplacementCost,BIGINT,0,9999000000,447053,8569836.741231738,235005475.3540741,0.0,120927.0,223990.0,2671486,23.76
58,contentsReplacementCost,BIGINT,0,20000000,8887,2809.0997141881894,46529.60714476826,0.0,0.0,0.0,2671486,59.75
59,replacementCostBasis,VARCHAR,A,R,2,,,,,,2671486,8.56


## Amount Paid on Claims Summary Statistics

**amountPaidOnBuildingClaim**: Dollar amount paid on the building claim. In some instances, a negative amount may appear, which occurs when a check issued to a policy holder is not cashed and has to be re-issued.

**amountPaidOnContentsClaim**: Dollar amount paid on the contents claim. In some instances, a negative amount may appear, which occurs when a check issued to a policy holder is not cashed and has to be re-issued.

**amountPaidOnIncreasedCostOfComplianceClaim**: ICC coverage is one of several flood insurance resources for policyholders who need additional help rebuilding after a flood. It provides up to $30,000 to help cover the cost of mitigation measures that will reduce the flood risk.

Source: [Data Dictionary](https://www.fema.gov/openfema-data-page/fima-nfip-redacted-claims-v2)

In [27]:
# examine summary statistics on paid total claims
con.sql("""
    SELECT
        column_name, column_type, min, max, approx_unique,
        ROUND(avg::DOUBLE, 2) AS avg,
        ROUND(std::DOUBLE, 2) AS std,
        ROUND(q25::DOUBLE, 2) AS q25,
        ROUND(q50::DOUBLE, 2) AS q50,
        ROUND(q75::DOUBLE, 2) AS q75,
        count, null_percentage
    FROM (SUMMARIZE claims)
    WHERE column_name IN (
        'amountPaidOnBuildingClaim',
        'amountPaidOnContentsClaim',
        'amountPaidOnIncreasedCostOfComplianceClaim'
    )
""").df()

Unnamed: 0,column_name,column_type,min,max,approx_unique,avg,std,q25,q50,q75,count,null_percentage
0,amountPaidOnBuildingClaim,DOUBLE,-162432.16,10006145.71,1286108,31531.11,66190.18,2319.32,9616.89,36623.13,2671486,22.78
1,amountPaidOnContentsClaim,DOUBLE,-41276.32,757048.95,472789,7096.52,22183.21,0.0,0.0,4788.26,2671486,22.78
2,amountPaidOnIncreasedCostOfComplianceClaim,DOUBLE,-6450.0,60000.0,8752,459.73,3489.78,0.0,0.0,0.0,2671486,22.78
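
The summary shows both negative minimums, which the data dictionary attributes to re-issued checks, and an ICC maximum of 60000.0, which is above the $30,000 figure quoted in the field description. A small sketch for labeling such values before aggregating is given below. The function name `flag_icc_payment` and the label strings are hypothetical, and the sketch only flags values; it does not explain why a payment exceeds the quoted figure.

```python
ICC_LIMIT = 30_000  # figure quoted in the field description above


def flag_icc_payment(amount):
    """Label one ICC payment amount for manual review."""
    if amount is None:
        return "missing"
    if amount < 0:
        return "negative"
    if amount > ICC_LIMIT:
        return "above_limit"
    return "ok"


# min and max from the summary table, plus two ordinary cases
for amount in (-6450.0, 60000.0, 0.0, None):
    print(amount, flag_icc_payment(amount))
```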


In [28]:
# top 5 paid total claim records
con.sql("""
    SELECT
        dateOfLoss,
        asOfDate,
        floodEvent,
        state,
        ROUND(
            amountPaidOnBuildingClaim
            + amountPaidOnContentsClaim
            + amountPaidOnIncreasedCostOfComplianceClaim, 0)::BIGINT AS paidTotalClaim
    FROM claims
    ORDER BY paidTotalClaim DESC
    LIMIT 5
""")

┌────────────┬─────────────────────────┬────────────────────────┬─────────┬────────────────┐
│ dateOfLoss │        asOfDate         │       floodEvent       │  state  │ paidTotalClaim │
│    date    │        timestamp        │        varchar         │ varchar │     int64      │
├────────────┼─────────────────────────┼────────────────────────┼─────────┼────────────────┤
│ 2022-09-28 │ 2024-05-22 15:40:06.671 │ Hurricane Ian          │ FL      │       10106146 │
│ 2005-08-29 │ 2020-01-22 16:55:53.194 │ Hurricane Katrina      │ MS      │       10000000 │
│ 2012-10-29 │ 2020-01-22 16:55:53.194 │ Hurricane Sandy        │ NY      │        9467720 │
│ 2004-09-15 │ 2020-01-22 16:55:53.194 │ Hurricane Ivan         │ FL      │        9169507 │
│ 2001-06-09 │ 2020-01-22 16:55:53.194 │ Tropical Storm Allison │ TX      │        9023558 │
└────────────┴─────────────────────────┴────────────────────────┴─────────┴────────────────┘

## Building Damage, Values, and Replacement Costs Summary Statistics

In [29]:
# examine summary statistics on building damage amounts, property values and replacement costs columns
con.sql("""
    SELECT
        column_name, column_type, min, max, approx_unique,
        ROUND(avg::DOUBLE, 2) AS avg,
        ROUND(std::DOUBLE, 2) AS std,
        ROUND(q25::DOUBLE, 2) AS q25,
        ROUND(q50::DOUBLE, 2) AS q50,
        ROUND(q75::DOUBLE, 2) AS q75,
        count, null_percentage
    FROM (SUMMARIZE claims)
    WHERE column_name IN (
        'buildingPropertyValue',
        'buildingReplacementCost',
        'buildingDamageAmount'
    )
""").df()

Unnamed: 0,column_name,column_type,min,max,approx_unique,avg,std,q25,q50,q75,count,null_percentage
0,buildingPropertyValue,BIGINT,0,9950000000,420035,7148318.89,199917800.0,60871.0,109827.0,190193.0,2671486,23.76
1,buildingReplacementCost,BIGINT,0,9999000000,447053,8569836.74,235005500.0,0.0,120942.0,224120.0,2671486,23.76
2,buildingDamageAmount,BIGINT,0,927700000,202772,36354.08,812334.8,3445.0,11188.0,39945.0,2671486,23.76


## Contents Damage, Values, and Replacement Costs Summary Statistics

In [30]:
# examine summary statistics on contents amounts, property values and replacement costs columns
con.sql("""
    SELECT
        column_name, column_type, min, max, approx_unique,
        ROUND(avg::DOUBLE, 2) AS avg,
        ROUND(std::DOUBLE, 2) AS std,
        ROUND(q25::DOUBLE, 2) AS q25,
        ROUND(q50::DOUBLE, 2) AS q50,
        ROUND(q75::DOUBLE, 2) AS q75,
        count, null_percentage
    FROM (SUMMARIZE claims)
    WHERE column_name IN (
        'contentsPropertyValue',
        'contentsReplacementCost',
        'contentsDamageAmount'
    )
""").df()

Unnamed: 0,column_name,column_type,min,max,approx_unique,avg,std,q25,q50,q75,count,null_percentage
0,contentsPropertyValue,BIGINT,-1,281895323,59971,31580.59,488372.36,0.0,0.0,19908.0,2671486,59.72
1,contentsReplacementCost,BIGINT,0,20000000,8887,2809.1,46529.61,0.0,0.0,0.0,2671486,59.75
2,contentsDamageAmount,BIGINT,0,19230507,102299,18331.89,86944.02,1617.0,5675.0,18472.0,2671486,59.72


## Elevation and Water Depth Summary Statistics

In [31]:
# examine summary statistics on elevation and water depth columns
con.sql("""
    SELECT
        column_name, column_type, min, max, approx_unique,
        ROUND(avg::DOUBLE, 2) AS avg,
        ROUND(std::DOUBLE, 2) AS std,
        ROUND(q25::DOUBLE, 2) AS q25,
        ROUND(q50::DOUBLE, 2) AS q50,
        ROUND(q75::DOUBLE, 2) AS q75,
        count, null_percentage
    FROM (SUMMARIZE claims)
    WHERE column_name IN (
        'baseFloodElevation',
        'waterDepth',
        'lowestAdjacentGrade',
        'lowestFloorElevation',
        'elevationDifference',
    )
""").df()

Unnamed: 0,column_name,column_type,min,max,approx_unique,avg,std,q25,q50,q75,count,null_percentage
0,elevationDifference,DOUBLE,-9989.0,998.0,371,1.25,29.09,0.0,1.0,2.78,2671486,72.95
1,baseFloodElevation,DOUBLE,-9999.0,9998.0,10427,126.73,767.44,6.87,9.0,13.96,2671486,75.7
2,lowestAdjacentGrade,DOUBLE,-99999.9,9998.9,12638,49.7,1483.25,3.15,6.07,10.77,2671486,81.26
3,lowestFloorElevation,DOUBLE,-9999.0,9998.9,13294,104.33,616.41,6.53,10.27,17.29,2671486,76.38
4,waterDepth,SMALLINT,-999.0,999.0,474,4.29,16.45,0.0,1.0,2.0,2671486,9.4
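
The extremes in this table (for example -9999.0 and 9998.0 for `baseFloodElevation`) pull the averages far from the medians: 126.73 against 9.0 for that column. One possible reading is that some of these extremes are placeholder codes and not measurements; that is an assumption here and is not confirmed by the data dictionary. The sketch below uses a hypothetical four-value sample and an arbitrary cutoff to show how strongly a single extreme value moves the mean while barely moving the median.

```python
from statistics import mean, median

# hypothetical sample: three plausible elevations plus one extreme value
elevations = [6.0, 9.0, 14.0, 9998.0]

# assumption for illustration only: magnitudes of 9000 or more are placeholders
PLACEHOLDER_CUTOFF = 9000.0
plausible = [v for v in elevations if abs(v) < PLACEHOLDER_CUTOFF]

print("mean, all values:      ", mean(elevations))
print("median, all values:    ", median(elevations))
print("mean, plausible only:  ", round(mean(plausible), 2))
print("median, plausible only:", median(plausible))
```

For columns like these, the quartiles in the summary are likely a safer first description than `avg` and `std`.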


# Analysis

## Paid Total Claims

In [32]:
con.sql("""
    SELECT
        COUNT(id) AS countClaims,
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidTotalClaim, 
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS paidBuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS paidContentsClaim, 
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidICC
    FROM claims
""")

┌─────────────┬────────────────┬───────────────────┬───────────────────┬───────────┐
│ countClaims │ paidTotalClaim │ paidBuildingClaim │ paidContentsClaim │  paidICC  │
│    int64    │     int64      │       int64       │       int64       │   int64   │
├─────────────┼────────────────┼───────────────────┼───────────────────┼───────────┤
│     2671486 │    80638352603 │       65049593608 │       14640321888 │ 948437107 │
└─────────────┴────────────────┴───────────────────┴───────────────────┴───────────┘
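
As a quick arithmetic check, the three components in the output above can be added back up and expressed as shares of the total. The figures are copied from that output; the variable names are illustrative. Because each figure was rounded separately, a difference of a dollar or two would not indicate a problem.

```python
# figures copied from the query output above
paid_building = 65_049_593_608
paid_contents = 14_640_321_888
paid_icc = 948_437_107
paid_total = 80_638_352_603

# the components should reproduce the total, up to rounding
difference = paid_total - (paid_building + paid_contents + paid_icc)
print("difference:", difference)

shares = {
    "building": paid_building / paid_total,
    "contents": paid_contents / paid_total,
    "icc": paid_icc / paid_total,
}
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```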

## Total Paid Claims by State

In [33]:
con.sql("""
    SELECT
        state,
        COUNT(id) AS countClaims,
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidTotalClaim, 
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS paidBuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS paidContentsClaim, 
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidICC
    FROM claims
    GROUP BY state
    ORDER BY paidTotalClaim DESC
    LIMIT 20
""")

┌─────────┬─────────────┬────────────────┬───────────────────┬───────────────────┬───────────┐
│  state  │ countClaims │ paidTotalClaim │ paidBuildingClaim │ paidContentsClaim │  paidICC  │
│ varchar │    int64    │     int64      │       int64       │       int64       │   int64   │
├─────────┼─────────────┼────────────────┼───────────────────┼───────────────────┼───────────┤
│ LA      │      483956 │    20869805385 │       16459981751 │        4132267467 │ 277556167 │
│ TX      │      390889 │    17218491455 │       13253973375 │        3906593054 │  57925025 │
│ FL      │      415410 │    11433783265 │        9699132485 │        1696552579 │  38098201 │
│ NJ      │      201234 │     6470109817 │        5349181327 │         870522071 │ 250406420 │
│ NY      │      174823 │     5727350837 │        4955149755 │         719127080 │  53074002 │
│ MS      │       64129 │     3122839913 │        2386323813 │         675594228 │  60921873 │
│ NC      │      107850 │     2009961692 │        

## Total Paid Claims by County

In [34]:
con.sql("""
    SELECT
        state,
        countyCode,
        COUNT(id) AS countClaims,
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidTotalClaim,  
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS paidBuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS paidContentsClaim, 
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidICC
    FROM claims
    GROUP BY state, countyCode
    ORDER BY paidTotalClaim DESC
    LIMIT 20
""")

┌─────────┬────────────┬─────────────┬────────────────┬───────────────────┬───────────────────┬───────────┐
│  state  │ countyCode │ countClaims │ paidTotalClaim │ paidBuildingClaim │ paidContentsClaim │  paidICC  │
│ varchar │  varchar   │    int64    │     int64      │       int64       │       int64       │   int64   │
├─────────┼────────────┼─────────────┼────────────────┼───────────────────┼───────────────────┼───────────┤
│ TX      │ 48201      │      170625 │     8748468467 │        6794800334 │        1933648655 │  20019478 │
│ LA      │ 22071      │      127137 │     7296273827 │        5892518413 │        1293261898 │ 110493515 │
│ LA      │ 22051      │      135361 │     3542723793 │        2633465878 │         847684749 │  61573166 │
│ FL      │ 12071      │       43354 │     3413583297 │        3050925028 │         360290319 │   2367950 │
│ NJ      │ 34029      │       52795 │     2607964807 │        2160867823 │         275256283 │ 171840701 │
│ TX      │ 48167      │    

## Total Paid Claims by Flood Event

In [35]:
con.sql("""
    SELECT
        floodEvent,
        yearOfLoss,
        COUNT(id) AS countClaims,
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidTotalClaim,  
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS paidBuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS paidContentsClaim, 
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidICC
    FROM claims
    WHERE floodEvent IS NOT NULL
    GROUP BY floodEvent, yearOfLoss
    ORDER BY paidTotalClaim DESC
    LIMIT 20
""").df()

|   | floodEvent | yearOfLoss | countClaims | paidTotalClaim | paidBuildingClaim | paidContentsClaim | paidICC |
|---|---|---:|---:|---:|---:|---:|---:|
| 0 | Hurricane Katrina | 2005 | 208348 | 16261697056 | 12659081935 | 3360020221 | 242594900 |
| 1 | Hurricane Harvey | 2017 | 92396 | 9055522699 | 6925370027 | 2115077279 | 15075393 |
| 2 | Hurricane Sandy | 2012 | 144848 | 8956609943 | 7707744797 | 951637071 | 297228075 |
| 3 | Hurricane Ian | 2022 | 48721 | 4757384580 | 4229314739 | 525543744 | 2526097 |
| 4 | Hurricane Ike | 2008 | 58126 | 2702511916 | 2073801567 | 577791589 | 50918760 |
| 5 | Mid-summer severe storms | 2016 | 30017 | 2533431383 | 2175218939 | 347263588 | 10948856 |
| 6 | Hurricane Irene | 2011 | 52493 | 1347399996 | 1139897981 | 183189121 | 24312894 |
| 7 | Hurricane Ida | 2021 | 28317 | 1346749832 | 1115951795 | 229650151 | 1147886 |
| 8 | Hurricane Ivan | 2004 | 20137 | 1325419294 | 1083795424 | 221959720 | 19664150 |
| 9 | Hurricane Irma | 2017 | 33339 | 1114668772 | 945821300 | 162148464 | 6699008 |


## Total Paid Claims by Flood Event and State

In [36]:
con.sql("""
    SELECT
        state,
        floodEvent,
        yearOfLoss,
        COUNT(id) AS countClaims,
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidTotalClaim,  
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS paidBuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS paidContentsClaim, 
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidICC
    FROM claims
    WHERE floodEvent IS NOT NULL
    GROUP BY 1, 2, 3
    ORDER BY paidTotalClaim DESC
    LIMIT 20
""").df()

|   | state | floodEvent | yearOfLoss | countClaims | paidTotalClaim | paidBuildingClaim | paidContentsClaim | paidICC |
|---|---|---|---:|---:|---:|---:|---:|---:|
| 0 | LA | Hurricane Katrina | 2005 | 176276 | 13347164644 | 10441743931 | 2721451049 | 183969663 |
| 1 | TX | Hurricane Harvey | 2017 | 91870 | 9040181063 | 6912549831 | 2112555840 | 15075393 |
| 2 | FL | Hurricane Ian | 2022 | 47457 | 4711517234 | 4187737200 | 521388936 | 2391097 |
| 3 | NJ | Hurricane Sandy | 2012 | 74983 | 4372115685 | 3673280769 | 457640033 | 241194883 |
| 4 | NY | Hurricane Sandy | 2012 | 57405 | 4214352197 | 3702376632 | 465410811 | 46564754 |
| 5 | LA | Mid-summer severe storms | 2016 | 30017 | 2533431383 | 2175218939 | 347263588 | 10948856 |
| 6 | MS | Hurricane Katrina | 2005 | 19051 | 2521512578 | 1910623703 | 558996059 | 51892816 |
| 7 | TX | Hurricane Ike | 2008 | 44095 | 2230307106 | 1723323896 | 481446923 | 25536286 |
| 8 | FL | Hurricane Irma | 2017 | 28755 | 985551621 | 834411306 | 145153321 | 5986994 |
| 9 | TX | Tropical Storm Allison | 2001 | 26821 | 978096837 | 723017851 | 250198245 | 4880741 |


## Total Paid Claims by Rated Flood Zone
>Formerly called floodZone. NFIP Flood Zone derived from the Flood Insurance Rate Map (FIRM) used to rate the insured property.
>
>- A - Special Flood with no Base Flood Elevation on FIRM
>- AE, A1-A30 - Special Flood with Base Flood Elevation on FIRM
>- A99 - Special Flood with Protection Zone
>- AH, AHB* - Special Flood with Shallow Ponding
>- AO, AOB* - Special Flood with Sheet Flow
>- X, B - Moderate Flood from primary water source. Pockets of areas subject to drainage problems
>- X, C - Minimal Flood from primary water source. Pockets of areas subject to drainage problems
>- D - Possible Flood
>- V - Velocity Flood with no Base Flood Elevation on FIRM
>- VE, V1-V30 - Velocity Flood with Base Flood Elevation on FIRM
>- AE, VE, X - New zone designations used on new maps starting January 1, 1986, in lieu of A1-A30, V1-V30, and B and C
>- AR - A Special Flood Hazard Area that results from the decertification of a previously accredited flood protection system that is determined to be in the process of being restored to provide base flood protection
>- AR Dual Zones - (AR/AE, AR/A1-A30, AR/AH, AR/AO, AR/A) Areas subject to flooding from failure of the flood protection system (Zone AR) which also overlap an existing Special Flood Hazard Area as a dual zone
>
>*AHB, AOB, ARE, ARH, ARO, and ARA are not risk zones shown on a map, but are acceptable values for rating purposes*

In [37]:
con.sql("""
    SELECT
        ratedFloodZone,
        COUNT(id) AS countClaims,
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidTotalClaim,  
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS paidBuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS paidContentsClaim, 
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidICC
    FROM claims
    GROUP BY 1
    ORDER BY paidTotalClaim DESC
    LIMIT 20
""").df()

|   | ratedFloodZone | countClaims | paidTotalClaim | paidBuildingClaim | paidContentsClaim | paidICC |
|---|---|---:|---:|---:|---:|---:|
| 0 | AE | 902694 | 35479231566 | 29574222634 | 5389301443 | 515707490 |
| 1 | X | 398598 | 13436983455 | 10383896784 | 3024472374 | 28614297 |
| 2 | B | 114822 | 3826110386 | 2835364624 | 971908875 | 18836887 |
| 3 | A | 198639 | 3130755422 | 2486940415 | 606541394 | 37273612 |
| 4 | C | 163832 | 2853210440 | 2119312629 | 723803760 | 10094051 |
| 5 | A04 | 51806 | 2185871995 | 1800832671 | 364399597 | 20639726 |
| 6 | A03 | 41872 | 1650992713 | 1337863894 | 284704083 | 28424737 |
| 7 | VE | 46418 | 1604044125 | 1427115498 | 158958851 | 17969776 |
| 8 | A01 | 51362 | 1526019156 | 1188933368 | 320999694 | 16086094 |
| 9 | A08 | 45657 | 1455607732 | 1185003260 | 237429765 | 33174708 |
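
The field description notes that AE, VE, and X replaced A1-A30, V1-V30, and B and C on maps issued from January 1, 1986, which is why legacy codes such as A04 and B appear alongside AE and X in the results. As a rough sketch, the legacy codes could be folded into their modern equivalents before grouping. The function name `modern_zone` is hypothetical, and this consolidation is an illustrative assumption rather than an official FEMA mapping; in particular, collapsing B and C into X discards the moderate versus minimal distinction.

```python
import re

# Legacy numbered zones look like A01..A30 or V01..V30 (with or without zero padding).
_LEGACY_NUMBERED = re.compile(r"^([AV])(\d{1,2})$")


def modern_zone(zone):
    """Map a legacy rated flood zone to its post-1986 designation (illustrative only)."""
    if zone is None:
        return None
    z = zone.strip().upper()
    if z in {"B", "C"}:
        return "X"
    match = _LEGACY_NUMBERED.match(z)
    # Only 1-30 are legacy numbered zones; A99 is its own designation and is left alone.
    if match and 1 <= int(match.group(2)) <= 30:
        return match.group(1) + "E"
    return z


for code in ["A04", "V12", "B", "C", "A99", "AE"]:
    print(code, "->", modern_zone(code))
```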


## Total Paid Claims by Post-FIRM Construction Indicator
>Indicates whether construction was started before or after publication of the FIRM. Yes is indicated with either a 'true' or '1'. No is indicated with either a 'false' or '0'. For insurance rating purposes, buildings for which the start of construction or substantial improvement was after December 31, 1974, or on or after the effective date of the initial FIRM for the community, whichever is later, are considered Post-FIRM construction.

In [38]:
con.sql("""
    SELECT
        postFIRMConstructionIndicator,
        COUNT(id) AS countClaims,
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidTotalClaim,  
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS paidBuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS paidContentsClaim, 
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidICC
    FROM claims
    GROUP BY 1
    ORDER BY paidTotalClaim DESC
""")

┌───────────────────────────────┬─────────────┬────────────────┬───────────────────┬───────────────────┬───────────┐
│ postFIRMConstructionIndicator │ countClaims │ paidTotalClaim │ paidBuildingClaim │ paidContentsClaim │  paidICC  │
│            boolean            │    int64    │     int64      │       int64       │       int64       │   int64   │
├───────────────────────────────┼─────────────┼────────────────┼───────────────────┼───────────────────┼───────────┤
│ false                         │     1961171 │    53017758311 │       43012194948 │        9378595126 │ 626968236 │
│ true                          │      710315 │    27620594292 │       22037398660 │        5261726761 │ 321468871 │
└───────────────────────────────┴─────────────┴────────────────┴───────────────────┴───────────────────┴───────────┘
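
The Post-FIRM rule in the field description can be read as a single date comparison: the cutoff is the later of January 1, 1975 and the community's initial FIRM effective date. The sketch below encodes that reading. The function and parameter names are hypothetical, and the dataset itself only exposes the resulting boolean, so this is an illustration of the definition rather than how FEMA computes the indicator.

```python
from datetime import date

# First day "after December 31, 1974".
POST_FIRM_FLOOR = date(1975, 1, 1)


def is_post_firm(construction_start, initial_firm_effective):
    """Return True if construction started on or after the later of the two cutoffs."""
    threshold = max(POST_FIRM_FLOOR, initial_firm_effective)
    return construction_start >= threshold


# Community whose initial FIRM took effect on June 1, 1978 (made-up dates).
print(is_post_firm(date(1980, 1, 1), date(1978, 6, 1)))  # built after the FIRM
print(is_post_firm(date(1976, 1, 1), date(1978, 6, 1)))  # built before the FIRM
```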

## Total Paid Claims by Basement Enclosure Crawlspace Type
>Basement is defined for purposes of the NFIP as any level or story which has its floor subgrade on all sides. Basement structure values are as follows: 0 - None; 1 - Finished Basement/Enclosure; 2 - Unfinished Basement/Enclosure; 3 - Crawlspace; 4 - Subgrade Crawlspace

In [39]:
con.sql("""
    SELECT
        basementEnclosureCrawlspaceType,
        COUNT(id) AS countClaims,
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidTotalClaim,  
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS paidBuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS paidContentsClaim, 
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidICC
    FROM claims
    GROUP BY 1
    ORDER BY 2 DESC
    LIMIT 20
""").df()

|   | basementEnclosureCrawlspaceType | countClaims | paidTotalClaim | paidBuildingClaim | paidContentsClaim | paidICC |
|---|---:|---:|---:|---:|---:|---:|
| 0 |  | 1867992 | 55736598435 | 43974709709 | 11010313677 | 751575049 |
| 1 | 2.0 | 283047 | 6023112182 | 5309665486 | 621057933 | 92388762 |
| 2 | 0.0 | 258080 | 14118916150 | 11621906529 | 2431677559 | 65332063 |
| 3 | 1.0 | 218434 | 3411584798 | 2953210350 | 440219829 | 18154620 |
| 4 | 4.0 | 43933 | 1348141039 | 1190101534 | 137052891 | 20986614 |
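
Because the column contains missing values, the `.df()` output shows the codes as floats (for example `2.0`) next to an empty entry. A small lookup can turn those values into the labels from the field description. This is a sketch only: `BASEMENT_LABELS` and `basement_label` are hypothetical names, and "Not reported" is an assumed label for missing values, not a FEMA category.

```python
import math

# Labels taken from the field description quoted above.
BASEMENT_LABELS = {
    0: "None",
    1: "Finished Basement/Enclosure",
    2: "Unfinished Basement/Enclosure",
    3: "Crawlspace",
    4: "Subgrade Crawlspace",
}


def basement_label(value):
    """Translate a basementEnclosureCrawlspaceType value into a readable label."""
    # pandas represents missing numeric values as NaN; DuckDB may return None.
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return "Not reported"
    return BASEMENT_LABELS.get(int(value), f"Unknown code {value}")


for raw in [float("nan"), 2.0, 0.0, 1.0, 4.0]:
    print(raw, "->", basement_label(raw))
```

On a pandas DataFrame named `df`, this could be applied with `df["basementEnclosureCrawlspaceType"].map(basement_label)`.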


## Total Paid Claims by Elevated Building Indicator
>Indicates whether a building meets the NFIP definition of an elevated building. Yes is indicated with either a 'true' or '1'. No is indicated with either a 'false' or '0'. An elevated building is a no-basement building that was constructed so as to meet the following criteria: 1. The top of the elevated floor (all A zones) or the bottom of the lowest horizontal structural member of the lowest floor (all V zones) is above ground level. 2. The building is adequately anchored. 3. The method of elevation is pilings, columns (posts and piers), shear walls (not in V zones), or solid foundation perimeter walls (not in V zones).

In [40]:
con.sql("""
    SELECT
        elevatedBuildingIndicator,
        COUNT(id) AS countClaims,
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidTotalClaim,  
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS paidBuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS paidContentsClaim, 
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidICC
    FROM claims
    GROUP BY 1
    ORDER BY paidTotalClaim DESC
""")

┌───────────────────────────┬─────────────┬────────────────┬───────────────────┬───────────────────┬───────────┐
│ elevatedBuildingIndicator │ countClaims │ paidTotalClaim │ paidBuildingClaim │ paidContentsClaim │  paidICC  │
│          boolean          │    int64    │     int64      │       int64       │       int64       │   int64   │
├───────────────────────────┼─────────────┼────────────────┼───────────────────┼───────────────────┼───────────┤
│ false                     │     2187042 │    66083874242 │       52806310817 │       12838681833 │ 438881592 │
│ true                      │      484444 │    14554478361 │       12243282792 │        1801640054 │ 509555515 │
└───────────────────────────┴─────────────┴────────────────┴───────────────────┴───────────────────┴───────────┘

## Total Paid Claims by Occupancy Type
>Code indicating the use and occupancy type of the insured structure. Note, 2-digit codes are for Risk Rating 2.0 policies.
>
>- 1 = Single family residence
>- 2 = 2 to 4 unit residential building
>- 3 = Residential building with more than 4 units
>- 4 = Non-residential building
>- 6 = Non Residential - Business
>- 11 = Single-family residential building with the exception of a mobile home or a single residential unit within a multi unit building
>- 12 = A residential non-condo building with 2, 3, or 4 units seeking insurance on all units
>- 13 = A residential non-condo building with 5 or more units seeking insurance on all units
>- 14 = Residential mobile/manufactured home
>- 15 = Residential condo association seeking coverage on a building with one or more units
>- 16 = Single residential unit within a multi-unit building
>- 17 = Non-residential mobile/manufactured home
>- 18 = A non-residential building
>- 19 = A non-residential unit within a multi-unit building

In [41]:
con.sql("""
    SELECT
        occupancyType,
        COUNT(id) AS countClaims,
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidTotalClaim,  
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS paidBuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS paidContentsClaim, 
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidICC
    FROM claims
    GROUP BY 1
    ORDER BY paidTotalClaim DESC
    LIMIT 20
""")

┌───────────────┬─────────────┬────────────────┬───────────────────┬───────────────────┬───────────┐
│ occupancyType │ countClaims │ paidTotalClaim │ paidBuildingClaim │ paidContentsClaim │  paidICC  │
│     int16     │    int64    │     int64      │       int64       │       int64       │   int64   │
├───────────────┼─────────────┼────────────────┼───────────────────┼───────────────────┼───────────┤
│             1 │     2085505 │    58074889678 │       46438196346 │       10745221948 │ 891471384 │
│             4 │      174509 │     6399512277 │        4292656524 │        2096028112 │  10827641 │
│             3 │       93350 │     4398976557 │        4253333229 │         138336377 │   7306952 │
│             2 │      148432 │     3733231026 │        3410379315 │         287688446 │  35163265 │
│             6 │       44282 │     3360242402 │        2512042450 │         846579743 │   1620209 │
│            11 │       94093 │     2822556575 │        2447270648 │         373666665 │   

# Additional Analysis

## Total Paid Claims by Cause of Damage Code
>Indicates the method by which the insured's property and contents were damaged. Legal values (value : description):
>
>- 0 : Other causes
>- 1 : Tidal water overflow
>- 2 : Stream, river, or lake overflow
>- 3 : Alluvial fan overflow
>- 4 : Accumulation of rainfall or snowmelt
>- 7 : Erosion-demolition
>- 8 : Erosion-removal
>- 9 : Earth movement, landslide, land subsidence, sinkholes, etc.
>- A : Closed basin lake
>- B : Expedited claim handling process without site inspection
>- C : Expedited claim handling process follow-up site inspection
>- D : Expedited claim handling process by Adjusting Process Pilot Program (Remote Adjustment)
>
>NOTE: Due to certain provisions of the Upton Jones Amendment to the National Flood Insurance Act, cause of loss codes 7 and 8 may be used only if the date of loss is prior to September 23, 1995. More than one cause of loss code can be selected. For example, you may select 2 (stream, river, or lake overflow) and D (remote adjustment) on a single claim, or any combination of letters/numbers as appropriate.

In [42]:
con.sql("""
    SELECT
        causeOfDamage,
        COUNT(id) AS countClaims,
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidTotalClaim,  
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS paidBuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS paidContentsClaim, 
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidICC
    FROM claims
    GROUP BY 1
    ORDER BY paidTotalClaim DESC
    LIMIT 20
""")

┌───────────────┬─────────────┬────────────────┬───────────────────┬───────────────────┬───────────┐
│ causeOfDamage │ countClaims │ paidTotalClaim │ paidBuildingClaim │ paidContentsClaim │  paidICC  │
│    varchar    │    int64    │     int64      │       int64       │       int64       │   int64   │
├───────────────┼─────────────┼────────────────┼───────────────────┼───────────────────┼───────────┤
│ 4             │     1165067 │    32297114786 │       25920976704 │        6094556798 │ 281581284 │
│ 1             │      652400 │    21153634243 │       17340791855 │        3445277646 │ 367564742 │
│ 2             │      535954 │    18498106661 │       14798505265 │        3483557466 │ 216043929 │
│ 0             │      219652 │     6152450677 │        4980477176 │        1103245478 │  68728023 │
│ B             │       15809 │     2101842424 │        1650532490 │         440000279 │  11309655 │
│ 3             │       12895 │      236698761 │         184969651 │          49490096 │   

## Total Paid Claims by Flood Characteristics Indicator
>Indicates characteristics of the flood waters. Legal values (value : description): 1 : Velocity Flow; 2 : Low-Velocity Flow or Ponding; 3 : Wave Action; 4 : Mudflow; 5 : Erosion;

In [43]:
con.sql("""
    SELECT
        floodCharacteristicsIndicator,
        COUNT(id) AS countClaims,
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidTotalClaim,  
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS paidBuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS paidContentsClaim, 
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidICC
    FROM claims
    GROUP BY 1
    ORDER BY paidTotalClaim DESC
    LIMIT 20
""")

┌───────────────────────────────┬─────────────┬────────────────┬───────────────────┬───────────────────┬───────────┐
│ floodCharacteristicsIndicator │ countClaims │ paidTotalClaim │ paidBuildingClaim │ paidContentsClaim │  paidICC  │
│             int16             │    int64    │     int64      │       int64       │       int64       │   int64   │
├───────────────────────────────┼─────────────┼────────────────┼───────────────────┼───────────────────┼───────────┤
│                          NULL │     2631633 │    80040375269 │       64608337763 │       14488023555 │ 944013951 │
│                             2 │       29123 │      392139926 │         284532994 │         105348196 │   2258735 │
│                             1 │        8026 │      169487124 │         126607466 │          41429777 │   1449881 │
│                             3 │        1818 │       28177518 │          23333657 │           4129321 │    714539 │
│                             4 │         649 │        6794025 │

## Total Paid Claims by Floodproofed Indicator
>Indicates if the insured structure is floodproofed. Yes is indicated with either a 'true' or '1'. No is indicated with either a 'false' or '0'. NOTE: Floodproofing is any combination of structural and non-structural additions, changes, or adjustments to structures which reduce or eliminate flood damage to a property. Floodproofing may be an alternative to elevating a building to or above the BFE; however, the NFIP requires a Floodproofing Certificate prior to considering floodproofing mitigation measures in rating a structure

In [44]:
con.sql("""
    SELECT
        floodproofedIndicator,
        COUNT(id) AS countClaims,
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidTotalClaim,  
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS paidBuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS paidContentsClaim, 
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidICC
    FROM claims
    GROUP BY 1
    ORDER BY paidTotalClaim DESC
""")

┌───────────────────────┬─────────────┬────────────────┬───────────────────┬───────────────────┬───────────┐
│ floodproofedIndicator │ countClaims │ paidTotalClaim │ paidBuildingClaim │ paidContentsClaim │  paidICC  │
│        boolean        │    int64    │     int64      │       int64       │       int64       │   int64   │
├───────────────────────┼─────────────┼────────────────┼───────────────────┼───────────────────┼───────────┤
│ false                 │     2671109 │    80619514527 │       65035738829 │       14635368591 │ 948407107 │
│ true                  │         377 │       18838076 │          13854779 │           4953296 │     30000 │
└───────────────────────┴─────────────┴────────────────┴───────────────────┴───────────────────┴───────────┘

## Total Paid Claims by Building Description Code
>Indicates the description of the use of the insured building. Legal values (value : description):
>
>- 01 : Main House
>- 02 : Detached Guest House
>- 03 : Detached Garage
>- 04 : Agricultural Building
>- 05 : Warehouse
>- 06 : Pool House, Clubhouse, Recreation Building
>- 07 : Tool/Storage Shed
>- 08 : Other
>- 09 : Barn
>- 10 : Apartment Building
>- 11 : Apartment - Unit
>- 12 : Cooperative Building
>- 13 : Cooperative - Unit
>- 14 : Commercial Building
>- 15 : Condominium (Entire Building)
>- 16 : Condominium - Unit
>- 17 : House of Worship
>- 18 : Manufactured (Mobile) Home
>- 19 : Travel Trailer
>- 20 : Townhouse/Rowhouse

In [45]:
con.sql("""
    SELECT
        buildingDescriptionCode,
        COUNT(id) AS countClaims,
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidTotalClaim,  
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS paidBuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS paidContentsClaim, 
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidICC
    FROM claims
    GROUP BY 1
    ORDER BY paidTotalClaim DESC
    LIMIT 20
""").df()

|   | buildingDescriptionCode | countClaims | paidTotalClaim | paidBuildingClaim | paidContentsClaim | paidICC |
|---|---:|---:|---:|---:|---:|---:|
| 0 | 1.0 | 905145 | 39094112092 | 32540928483 | 6132748786 | 420434822 |
| 1 |  | 1717297 | 38230967250 | 29855329355 | 7851193461 | 524444435 |
| 2 | 8.0 | 37859 | 2754729151 | 2195108224 | 556837496 | 2783431 |
| 3 | 5.0 | 2367 | 199062000 | 130086791 | 68975209 | 0 |
| 4 | 10.0 | 2330 | 188592994 | 187206254 | 1386741 | 0 |
| 5 | 2.0 | 2258 | 58607574 | 49616808 | 8324777 | 665989 |
| 6 | 6.0 | 1324 | 56060632 | 47843930 | 8162650 | 54053 |
| 7 | 7.0 | 957 | 14025659 | 9901318 | 4095639 | 28703 |
| 8 | 3.0 | 760 | 13168321 | 11508338 | 1647109 | 12875 |
| 9 | 12.0 | 137 | 8563373 | 7334598 | 1215975 | 12800 |


## Total Paid Claims by Rental Property Indicator
>Indicates if the property is a rental property. Yes is indicated with either a 'true' or '1'. No is indicated with either a 'false' or '0'. NOTE: A rental property is a property from which the owner receives payment from the occupant(s), known as tenants, in return for occupying or using the property. Rental properties may be either residential or commercial.

In [46]:
con.sql("""
    SELECT
        rentalPropertyIndicator,
        COUNT(id) AS countClaims,
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidTotalClaim,  
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS paidBuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS paidContentsClaim, 
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS paidICC
    FROM claims
    GROUP BY 1
    ORDER BY paidTotalClaim DESC
""").df()

|   | rentalPropertyIndicator | countClaims | paidTotalClaim | paidBuildingClaim | paidContentsClaim | paidICC |
|---|---|---:|---:|---:|---:|---:|
| 0 | False | 2638216 | 79198225176 | 63777570301 | 14474933322 | 945721553 |
| 1 | True | 33270 | 1440127427 | 1272023307 | 165388566 | 2715554 |
