# Analyzing FEMA's National Flood Insurance Program (NFIP) Data With DuckDB
Author: Mark Bauer

# Introduction

The FEMA National Flood Insurance Program (NFIP) website offers a trove of valuable information. Among its highlights is the [Flood Insurance Data and Analytics](https://nfipservices.floodsmart.gov/reports-flood-insurance-data) section, which features data visualizations, tables, and reports. This project was inspired by these resources, particularly two Excel files: Financial Losses by State, and Policy and Loss Statistics by Flood Zone.

This analysis uses only the [NFIP Redacted Claims](https://www.fema.gov/openfema-data-page/fima-nfip-redacted-claims-v2) dataset, which is conveniently available as a Parquet file. Given the size of the dataset, it was a good opportunity to use and learn more about [DuckDB](https://duckdb.org/).

# Data
Title: **OpenFEMA Dataset: FIMA NFIP Redacted Claims - v2**  

You can find information about the data, as well as the data dictionary, here: https://www.fema.gov/openfema-data-page/fima-nfip-redacted-claims-v2.

In [1]:
# import libraries
import duckdb
import pandas as pd

In [2]:
# reproducibility
%reload_ext watermark
%watermark -v -p duckdb,pandas

Python implementation: CPython
Python version       : 3.11.0
IPython version      : 8.6.0

duckdb: 1.0.0
pandas: 1.5.1



In [3]:
ls data/

FimaNfipClaims.parquet


# Create Table Using DuckDB

In [4]:
# connect to an in-memory DuckDB database
con = duckdb.connect()

# create table of the claims dataset
con.execute("""
    CREATE TABLE claims AS
    FROM read_parquet('data/FimaNfipClaims.parquet')
""")

# sanity check, examine count of rows
con.sql("""
    SELECT COUNT(*) AS count_rows FROM claims
""").show()

┌────────────┐
│ count_rows │
│   int64    │
├────────────┤
│    2671486 │
└────────────┘



# Examine and Describe Data Set

In [5]:
# preview data
con.sql("""
    SELECT *
    FROM claims
    LIMIT 5
""").show()

┌──────────────────────┬──────────────────────┬──────────────────────┬───┬──────────────────────┬──────────┬───────────┐
│          id          │ agricultureStructu…  │       asOfDate       │ … │ censusBlockGroupFips │ latitude │ longitude │
│       varchar        │       boolean        │      timestamp       │   │       varchar        │  double  │  double   │
├──────────────────────┼──────────────────────┼──────────────────────┼───┼──────────────────────┼──────────┼───────────┤
│ e7af3d9f-b605-4653…  │ false                │ 2020-11-13 14:50:3…  │ … │ 010030114073         │     30.3 │     -87.7 │
│ bbaeaf64-c162-41bf…  │ false                │ 2020-12-11 16:25:4…  │ … │ 010030114072         │     30.3 │     -87.7 │
│ 256da746-b30b-4129…  │ false                │ 2020-03-27 12:15:4…  │ … │ 040131089022         │     33.5 │    -112.1 │
│ e3dcbb27-a2d0-4a9e…  │ false                │ 2020-03-27 12:15:4…  │ … │ 040131089022         │     33.5 │    -112.1 │
│ f77efb94-0188-4fa0…  │ false  

In [6]:
# preview the same rows as a pandas DataFrame, 15 columns at a time
sql = """
    SELECT *
    FROM claims
    LIMIT 5
"""

con.sql(sql).df().iloc[:, :15]

Unnamed: 0,id,agricultureStructureIndicator,asOfDate,basementEnclosureCrawlspaceType,policyCount,crsClassificationCode,dateOfLoss,elevatedBuildingIndicator,elevationCertificateIndicator,elevationDifference,baseFloodElevation,ratedFloodZone,houseWorship,locationOfContents,lowestAdjacentGrade
0,e7af3d9f-b605-4653-958f-bc48f413766c,False,2020-11-13 14:50:38.288,2,1,,2020-09-16,False,,6.0,10.0,AE,False,,0.0
1,bbaeaf64-c162-41bf-a399-862d0e832447,False,2020-12-11 16:25:40.587,0,1,,2020-09-16,True,,4.0,10.0,AE,False,7.0,4.4
2,256da746-b30b-4129-a391-93f50a6190c8,False,2020-03-27 12:15:45.887,0,1,,2016-08-02,False,2.0,,,AH,False,3.0,
3,e3dcbb27-a2d0-4a9e-a70d-ff7e578d4e6a,False,2020-03-27 12:15:45.887,0,1,,2014-09-08,False,2.0,,,AH,False,3.0,
4,f77efb94-0188-4fa0-b52e-3aa71499baeb,False,2020-03-26 12:56:27.476,0,1,,2018-01-09,True,,,,AE,False,,


In [7]:
con.sql(sql).df().iloc[:, 15:30]

Unnamed: 0,lowestFloorElevation,numberOfFloorsInTheInsuredBuilding,nonProfitIndicator,obstructionType,occupancyType,originalConstructionDate,originalNBDate,amountPaidOnBuildingClaim,amountPaidOnContentsClaim,amountPaidOnIncreasedCostOfComplianceClaim,postFIRMConstructionIndicator,rateMethod,smallBusinessIndicatorBuilding,totalBuildingInsuranceCoverage,totalContentsInsuranceCoverage
0,16.0,2,False,,2,1975-01-01,2008-10-20,2695.44,0.0,0.0,True,8,False,7400,0
1,14.4,3,False,54.0,1,1991-09-17,2017-10-05,2750.18,121.0,0.0,True,8,False,10800,700
2,,1,False,,1,1949-01-01,2014-09-02,,,,False,B,False,153500,15800
3,,1,False,,1,1949-01-01,2014-09-02,25156.27,0.0,0.0,False,B,False,139500,15000
4,,1,False,10.0,1,1960-01-01,2001-08-05,,,,False,1,False,250000,0


In [8]:
con.sql(sql).df().iloc[:, 30:45]

Unnamed: 0,yearOfLoss,primaryResidenceIndicator,buildingDamageAmount,buildingDeductibleCode,netBuildingPaymentAmount,buildingPropertyValue,causeOfDamage,condominiumCoverageTypeCode,contentsDamageAmount,contentsDeductibleCode,netContentsPaymentAmount,contentsPropertyValue,disasterAssistanceCoverageRequired,eventDesignationNumber,ficoNumber
0,2020,False,3945.0,F,2695.44,167531.0,1,N,0.0,,0.0,0.0,0,AL0520,
1,2020,True,4000.0,F,2750.18,278027.0,1,N,1371.0,F,121.0,50000.0,0,AL0520,
2,2016,True,1698.0,5,0.0,140422.0,4,N,3690.0,5,0.0,21500.0,0,,
3,2014,True,30098.0,5,25156.27,140919.0,4,N,,5,0.0,,0,,
4,2018,True,,2,0.0,,4,N,,0,0.0,,0,,368.0


In [9]:
con.sql(sql).df().iloc[:, 45:60]

Unnamed: 0,floodCharacteristicsIndicator,floodWaterDuration,floodproofedIndicator,floodEvent,iccCoverage,netIccPaymentAmount,nfipRatedCommunityNumber,nfipCommunityNumberCurrent,nfipCommunityName,nonPaymentReasonContents,nonPaymentReasonBuilding,numberOfUnits,buildingReplacementCost,contentsReplacementCost,replacementCostBasis
0,,,False,Hurricane Sally,30000,0.0,15005,0,,,,0,185457.0,0.0,A
1,,,False,Hurricane Sally,30000,0.0,15005,0,,,,0,314835.0,75000.0,A
2,,0.0,False,,30000,0.0,40051,0,,1.0,1.0,0,175528.0,0.0,R
3,,0.0,False,,30000,0.0,40051,0,,,,0,173974.0,,R
4,,0.0,False,Mid-Winter California Flooding,30000,0.0,60331,0,,,6.0,0,,,A


In [10]:
con.sql(sql).df().iloc[:, 60:75]

Unnamed: 0,stateOwnedIndicator,waterDepth,floodZoneCurrent,buildingDescriptionCode,rentalPropertyIndicator,state,reportedCity,reportedZipCode,countyCode,censusTract,censusBlockGroupFips,latitude,longitude
0,False,0,,1,False,AL,Currently Unavailable,36542,1003,1003011407,10030114073,30.3,-87.7
1,False,0,AE,1,False,AL,Currently Unavailable,36542,1003,1003011407,10030114072,30.3,-87.7
2,False,1,,1,False,AZ,Currently Unavailable,85015,4013,4013108902,40131089022,33.5,-112.1
3,False,1,,1,False,AZ,Currently Unavailable,85015,4013,4013108902,40131089022,33.5,-112.1
4,False,0,AE,1,False,CA,Currently Unavailable,93108,6083,6083001402,60830014022,34.4,-119.6


In [11]:
# examine column metadata and push to pandas df
describe_df = con.sql("DESCRIBE claims").df()

describe_df

Unnamed: 0,column_name,column_type,null,key,default,extra
0,id,VARCHAR,YES,,,
1,agricultureStructureIndicator,BOOLEAN,YES,,,
2,asOfDate,TIMESTAMP,YES,,,
3,basementEnclosureCrawlspaceType,SMALLINT,YES,,,
4,policyCount,SMALLINT,YES,,,
...,...,...,...,...,...,...
68,countyCode,VARCHAR,YES,,,
69,censusTract,VARCHAR,YES,,,
70,censusBlockGroupFips,VARCHAR,YES,,,
71,latitude,DOUBLE,YES,,,


In [12]:
# examine each column in sections because of large number of columns
describe_df.iloc[:25, :2]

Unnamed: 0,column_name,column_type
0,id,VARCHAR
1,agricultureStructureIndicator,BOOLEAN
2,asOfDate,TIMESTAMP
3,basementEnclosureCrawlspaceType,SMALLINT
4,policyCount,SMALLINT
5,crsClassificationCode,SMALLINT
6,dateOfLoss,DATE
7,elevatedBuildingIndicator,BOOLEAN
8,elevationCertificateIndicator,VARCHAR
9,elevationDifference,DOUBLE


In [13]:
describe_df.iloc[25:50, :2]

Unnamed: 0,column_name,column_type
25,postFIRMConstructionIndicator,BOOLEAN
26,rateMethod,VARCHAR
27,smallBusinessIndicatorBuilding,BOOLEAN
28,totalBuildingInsuranceCoverage,BIGINT
29,totalContentsInsuranceCoverage,BIGINT
30,yearOfLoss,SMALLINT
31,primaryResidenceIndicator,BOOLEAN
32,buildingDamageAmount,BIGINT
33,buildingDeductibleCode,VARCHAR
34,netBuildingPaymentAmount,DOUBLE


In [14]:
describe_df.iloc[50:, :2]

Unnamed: 0,column_name,column_type
50,netIccPaymentAmount,DOUBLE
51,nfipRatedCommunityNumber,VARCHAR
52,nfipCommunityNumberCurrent,VARCHAR
53,nfipCommunityName,VARCHAR
54,nonPaymentReasonContents,VARCHAR
55,nonPaymentReasonBuilding,VARCHAR
56,numberOfUnits,INTEGER
57,buildingReplacementCost,BIGINT
58,contentsReplacementCost,BIGINT
59,replacementCostBasis,VARCHAR


# Summary Statistics

In [15]:
# calculate summary statistics of each column
summarize_df = con.sql("SUMMARIZE claims").df()

summarize_df

Unnamed: 0,column_name,column_type,min,max,approx_unique,avg,std,q25,q50,q75,count,null_percentage
0,id,VARCHAR,0000094b-1d1a-4dbe-a944-97c62b225b9c,fffffe7c-7ca2-4431-b88b-8a8216ffc934,2623176,,,,,,2671486,0.00
1,agricultureStructureIndicator,BOOLEAN,false,true,2,,,,,,2671486,0.00
2,asOfDate,TIMESTAMP,2019-09-19 06:12:43.388,2024-10-04 00:59:55.552,479449,,,,,,2671486,0.00
3,basementEnclosureCrawlspaceType,SMALLINT,0,4,4,1.195105377264796,1.0618378842847205,0,1,2,2671486,69.92
4,policyCount,SMALLINT,1,1090,399,1.2961583927447122,6.742744486702108,1,1,1,2671486,0.00
...,...,...,...,...,...,...,...,...,...,...,...,...
68,countyCode,VARCHAR,01001,78030,2946,,,,,,2671486,2.32
69,censusTract,VARCHAR,01001020100,78030961200,59577,,,,,,2671486,5.16
70,censusBlockGroupFips,VARCHAR,010010201001,780309612002,117722,,,,,,2671486,5.16
71,latitude,DOUBLE,-36.0,69.9,334,33.755900244381905,5.827326735346326,29.81055607671001,30.64570817945568,39.68492345385303,2671486,1.51


In [16]:
# examine each column in sections because of large number of columns
summarize_df.iloc[:25, :]

Unnamed: 0,column_name,column_type,min,max,approx_unique,avg,std,q25,q50,q75,count,null_percentage
0,id,VARCHAR,0000094b-1d1a-4dbe-a944-97c62b225b9c,fffffe7c-7ca2-4431-b88b-8a8216ffc934,2623176,,,,,,2671486,0.0
1,agricultureStructureIndicator,BOOLEAN,false,true,2,,,,,,2671486,0.0
2,asOfDate,TIMESTAMP,2019-09-19 06:12:43.388,2024-10-04 00:59:55.552,479449,,,,,,2671486,0.0
3,basementEnclosureCrawlspaceType,SMALLINT,0,4,4,1.195105377264796,1.0618378842847205,0.0,1.0,2.0,2671486,69.92
4,policyCount,SMALLINT,1,1090,399,1.2961583927447122,6.742744486702108,1.0,1.0,1.0,2671486,0.0
5,crsClassificationCode,SMALLINT,1,10,10,6.708258215272243,1.526304305976166,5.0,7.0,8.0,2671486,80.65
6,dateOfLoss,DATE,1978-01-01,2024-10-03,16868,,,,,,2671486,0.0
7,elevatedBuildingIndicator,BOOLEAN,false,true,2,,,,,,2671486,0.0
8,elevationCertificateIndicator,VARCHAR,1,E,9,,,,,,2671486,77.72
9,elevationDifference,DOUBLE,-9989.0,998.0,371,1.253409617633019,29.091407737414052,0.0,1.0,2.868431722185448,2671486,72.95


In [17]:
summarize_df.iloc[25:50, :]

Unnamed: 0,column_name,column_type,min,max,approx_unique,avg,std,q25,q50,q75,count,null_percentage
25,postFIRMConstructionIndicator,BOOLEAN,false,true,2,,,,,,2671486,0.0
26,rateMethod,VARCHAR,1,W,22,,,,,,2671486,1.85
27,smallBusinessIndicatorBuilding,BOOLEAN,false,true,2,,,,,,2671486,0.0
28,totalBuildingInsuranceCoverage,BIGINT,0,243903000,12425,172980.48504127294,1253722.2614965644,39847.0,100216.0,222092.0,2671486,0.0
29,totalContentsInsuranceCoverage,BIGINT,0,6000000,3064,31127.78768283439,50873.49942270788,0.0,11685.0,46063.0,2671486,0.0
30,yearOfLoss,SMALLINT,1978,2024,47,2002.6466337461625,12.783941740451548,1993.0,2005.0,2012.0,2671486,0.0
31,primaryResidenceIndicator,BOOLEAN,false,true,2,,,,,,2671486,0.0
32,buildingDamageAmount,BIGINT,0,927700000,202772,36354.07504655704,812334.7780210067,3436.0,11193.0,40094.0,2671486,23.76
33,buildingDeductibleCode,VARCHAR,0,H,15,,,,,,2671486,11.77
34,netBuildingPaymentAmount,DOUBLE,-162432.16,10006145.71,1286178,24319.109697074862,59591.412887114006,0.0138960770064135,4444.778030483547,24100.00475721046,2671486,0.0


In [18]:
summarize_df.iloc[50:, :]

Unnamed: 0,column_name,column_type,min,max,approx_unique,avg,std,q25,q50,q75,count,null_percentage
50,netIccPaymentAmount,DOUBLE,-6450.0,60000.0,8709,353.6949859478956,3066.941960282329,0.0,0.0,0.0,2671486,0.0
51,nfipRatedCommunityNumber,VARCHAR,000000,999999,16299,,,,,,2671486,0.0
52,nfipCommunityNumberCurrent,VARCHAR,0000,815000,12065,,,,,,2671486,72.1
53,nfipCommunityName,VARCHAR,ABBEVILLE COUNTY *,"ZUMBRO FALLS, CITY OF",9870,,,,,,2671486,70.76
54,nonPaymentReasonContents,VARCHAR,01,99,23,,,,,,2671486,68.95
55,nonPaymentReasonBuilding,VARCHAR,01,99,23,,,,,,2671486,78.02
56,numberOfUnits,INTEGER,0,99999,450,1.3967815247755893,65.4445177643468,1.0,1.0,1.0,2671486,0.11
57,buildingReplacementCost,BIGINT,0,9999000000,447053,8569836.741231738,235005475.3540723,0.0,121109.0,224205.0,2671486,23.76
58,contentsReplacementCost,BIGINT,0,20000000,8887,2809.0997141881894,46529.6071447682,0.0,0.0,0.0,2671486,59.75
59,replacementCostBasis,VARCHAR,A,R,2,,,,,,2671486,8.56


In [19]:
# examine columns with high null percentage
(summarize_df
 .sort_values(by='null_percentage', ascending=False)
 .loc[:, ['column_name', 'null_percentage']]
 .head(20)
)

Unnamed: 0,column_name,null_percentage
45,floodCharacteristicsIndicator,98.51
43,eventDesignationNumber,94.34
14,lowestAdjacentGrade,81.26
5,crsClassificationCode,80.65
55,nonPaymentReasonBuilding,78.02
8,elevationCertificateIndicator,77.72
15,lowestFloorElevation,76.38
10,baseFloodElevation,75.7
9,elevationDifference,72.95
62,floodZoneCurrent,72.15


## Amount Paid on Claims Summary Statistics

In [20]:
# examine summary statistics for the amount-paid-on-claim columns
col_names = [
    'amountPaidOnBuildingClaim',
    'amountPaidOnContentsClaim',
    'amountPaidOnIncreasedCostOfComplianceClaim'
]

cols = [
    'column_name', 'min', 'max', 'avg', 'std',
    'q25', 'q50', 'q75', 'null_percentage'
]

astype = {
    'avg':'float',
    'std':'float',
    'q25':'float',
    'q50':'float',
    'q75':'float'
}

cols_round = {'avg':0, 'std':0, 'q25':0, 'q50':0, 'q75':0}

(summarize_df
 .loc[summarize_df['column_name'].isin(col_names), cols]
 .astype(astype)
 .round(cols_round)
 .set_index('column_name')
)

Unnamed: 0_level_0,min,max,avg,std,q25,q50,q75,null_percentage
column_name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
amountPaidOnBuildingClaim,-162432.16,10006145.71,31531.0,66190.0,2312.0,9634.0,36629.0,22.78
amountPaidOnContentsClaim,-41276.32,757048.95,7097.0,22183.0,0.0,0.0,4809.0,22.78
amountPaidOnIncreasedCostOfComplianceClaim,-6450.0,60000.0,460.0,3490.0,0.0,0.0,0.0,22.78
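
The filter, cast, and round steps in the cell above recur in the next few sections, with only `col_names` changing. One way to avoid repeating them is a small helper. The sketch below is an assumption about how that could look, not part of the original notebook: the name `summarize_subset` and the toy DataFrame are hypothetical, and the toy frame only mimics the shape of the `SUMMARIZE` output, where the statistic columns appear to arrive as strings (which is why the original casts them to float).

```python
import pandas as pd

# statistic columns that are cast to float and rounded
STAT_COLS = ["avg", "std", "q25", "q50", "q75"]


def summarize_subset(summarize_df, col_names, decimals=0):
    """Return rounded summary statistics for the selected columns only."""
    cols = ["column_name", "min", "max", *STAT_COLS, "null_percentage"]
    return (
        summarize_df
        .loc[summarize_df["column_name"].isin(col_names), cols]
        .astype({c: "float" for c in STAT_COLS})
        .round({c: decimals for c in STAT_COLS})
        .set_index("column_name")
    )


# toy stand-in for summarize_df; the names and numbers are made up
toy = pd.DataFrame({
    "column_name": ["toyAmount", "toyLabel"],
    "column_type": ["DOUBLE", "VARCHAR"],
    "min": ["-5.0", "a"],
    "max": ["120.0", "z"],
    "avg": ["31.2", None],
    "std": ["16.7", None],
    "q25": ["2.1", None],
    "q50": ["9.6", None],
    "q75": ["36.3", None],
    "null_percentage": [22.78, 0.0],
})

result = summarize_subset(toy, ["toyAmount"])
print(result)
```

With the real data, the call would be `summarize_subset(summarize_df, col_names)` for each list of column names.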


## Building Damage, Values, and Replacement Costs Summary Statistics

In [21]:
# examine summary statistics on building damage amounts, property values, and replacement cost columns
col_names = [
    'buildingDamageAmount',
    'netBuildingPaymentAmount',
    'buildingPropertyValue',
    'buildingReplacementCost'
]

cols = [
    'column_name', 'min', 'max', 'avg', 'std',
    'q25', 'q50', 'q75', 'null_percentage'
]

astype = {
    'avg':'float',
    'std':'float',
    'q25':'float',
    'q50':'float',
    'q75':'float'
}

cols_round = {'avg':0, 'std':0, 'q25':0, 'q50':0, 'q75':0}

(summarize_df
 .loc[summarize_df['column_name'].isin(col_names), cols]
 .astype(astype)
 .round(cols_round)
 .set_index('column_name')
)

Unnamed: 0_level_0,min,max,avg,std,q25,q50,q75,null_percentage
column_name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
buildingDamageAmount,0.0,927700000.0,36354.0,812335.0,3436.0,11193.0,40094.0,23.76
netBuildingPaymentAmount,-162432.16,10006145.71,24319.0,59591.0,0.0,4445.0,24100.0,0.0
buildingPropertyValue,0.0,9950000000.0,7148319.0,199917804.0,60948.0,109815.0,190068.0,23.76
buildingReplacementCost,0.0,9999000000.0,8569837.0,235005475.0,0.0,121109.0,224205.0,23.76


## Contents Damage, Values, and Replacement Costs Summary Statistics

In [22]:
# examine summary statistics on contents damage amounts, property values, and replacement cost columns
col_names = [
    'contentsDamageAmount',
    'netContentsPaymentAmount',
    'contentsPropertyValue',
    'contentsReplacementCost'
]

cols = [
    'column_name', 'min', 'max', 'avg', 'std',
    'q25', 'q50', 'q75', 'null_percentage'
]

astype = {
    'avg':'float',
    'std':'float',
    'q25':'float',
    'q50':'float',
    'q75':'float'
}

cols_round = {'avg':0, 'std':0, 'q25':0, 'q50':0, 'q75':0}

(summarize_df
 .loc[summarize_df['column_name'].isin(col_names), cols]
 .astype(astype)
 .round(cols_round)
 .set_index('column_name')
)

Unnamed: 0_level_0,min,max,avg,std,q25,q50,q75,null_percentage
column_name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
contentsDamageAmount,0.0,19230507.0,18332.0,86944.0,1612.0,5646.0,18548.0,59.72
netContentsPaymentAmount,-41276.32,757048.95,5459.0,19697.0,0.0,0.0,2078.0,0.0
contentsPropertyValue,-1.0,281895323.0,31581.0,488372.0,0.0,0.0,19608.0,59.72
contentsReplacementCost,0.0,20000000.0,2809.0,46530.0,0.0,0.0,0.0,59.75


## Elevation and Water Depth Summary Statistics

In [23]:
# examine summary statistics on elevation and water depth columns
col_names = [
    'lowestAdjacentGrade',
    'lowestFloorElevation',
    'baseFloodElevation',
    'elevationDifference',
    'waterDepth'
]

cols = [
    'column_name', 'min', 'max', 'avg', 'std',
    'q25', 'q50', 'q75', 'null_percentage'
]

astype = {
    'avg':'float',
    'std':'float',
    'q25':'float',
    'q50':'float',
    'q75':'float'
}

cols_round = {'avg':0, 'std':0, 'q25':0, 'q50':0, 'q75':0}

(summarize_df
 .loc[summarize_df['column_name'].isin(col_names), cols]
 .astype(astype)
 .round(cols_round)
 .set_index('column_name')
)

Unnamed: 0_level_0,min,max,avg,std,q25,q50,q75,null_percentage
column_name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
elevationDifference,-9989.0,998.0,1.0,29.0,0.0,1.0,3.0,72.95
baseFloodElevation,-9999.0,9998.0,127.0,767.0,7.0,9.0,14.0,75.7
lowestAdjacentGrade,-99999.9,9998.9,50.0,1483.0,3.0,6.0,11.0,81.26
lowestFloorElevation,-9999.0,9998.9,104.0,616.0,7.0,10.0,17.0,76.38
waterDepth,-999.0,999.0,4.0,16.0,0.0,1.0,2.0,9.4
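
The minimum and maximum values in this table (for example -9999.0, 9998.0, and -99999.9) look more like placeholder codes than measured elevations, and they may explain why the averages sit far above the medians. That reading is an assumption; nothing in the notebook confirms it. The sketch below uses toy numbers and a hypothetical helper, `mean_without_placeholders`, to show how much a single such value can move a mean.

```python
from statistics import mean

# Assumed placeholder codes, guessed from the min/max columns above.
# They are NOT confirmed by the data dictionary.
SUSPECTED_PLACEHOLDERS = {-99999.9, -9999.0, -999.0, 999.0, 9998.0, 9998.9}


def mean_without_placeholders(values, placeholders=SUSPECTED_PLACEHOLDERS):
    """Average the values after dropping None and suspected placeholder codes."""
    kept = [v for v in values if v is not None and v not in placeholders]
    return mean(kept) if kept else None


# toy elevations, not taken from the dataset
toy_elevations = [7.0, 9.0, 14.0, 9998.0, None]

raw_mean = mean(v for v in toy_elevations if v is not None)
clean_mean = mean_without_placeholders(toy_elevations)
print(raw_mean, clean_mean)
```

Whether to drop, cap, or keep such values depends on what the data dictionary says about each column, so the set above should be treated as a starting guess only.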


# Analysis

In [24]:
# preview column names to use for Analysis section
(con
 .sql("SELECT * FROM claims LIMIT 10")
 .df()
 .info()
)

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 73 columns):
 #   Column                                      Non-Null Count  Dtype         
---  ------                                      --------------  -----         
 0   id                                          10 non-null     object        
 1   agricultureStructureIndicator               10 non-null     bool          
 2   asOfDate                                    10 non-null     datetime64[ns]
 3   basementEnclosureCrawlspaceType             10 non-null     int16         
 4   policyCount                                 10 non-null     int16         
 5   crsClassificationCode                       0 non-null      float64       
 6   dateOfLoss                                  10 non-null     datetime64[ns]
 7   elevatedBuildingIndicator                   10 non-null     bool          
 8   elevationCertificateIndicator               3 non-null      object        
 9   elevationDiff

## Total Claims

In [25]:
con.sql(
    """
    SELECT
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS TotalClaim, 
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS BuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS ContentsClaim, 
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS IncreasedCostOfComplianceClaim
    FROM claims
    """
)

┌─────────────┬───────────────┬───────────────┬────────────────────────────────┐
│ TotalClaim  │ BuildingClaim │ ContentsClaim │ IncreasedCostOfComplianceClaim │
│    int64    │     int64     │     int64     │             int64              │
├─────────────┼───────────────┼───────────────┼────────────────────────────────┤
│ 80638352603 │   65049593608 │   14640321888 │                      948437107 │
└─────────────┴───────────────┴───────────────┴────────────────────────────────┘
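
The query above builds `TotalClaim` by adding three per-column `SUM`s rather than summing a per-row total. Since `SUM` skips NULLs while `+` propagates them, the two forms can give different answers when some rows have a NULL in only one of the columns. The sketch below illustrates this on toy data. It uses the standard-library `sqlite3` module purely as a stand-in so that it runs without extra dependencies; DuckDB is expected to follow the same standard SQL rules for these expressions, but the sketch does not verify that. The table `toy_claims` and its values are made up.

```python
import sqlite3

# sqlite3 is only a stand-in engine here; the table and values are toy data
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE toy_claims (grp TEXT, building REAL, contents REAL)")
con.executemany(
    "INSERT INTO toy_claims VALUES (?, ?, ?)",
    [
        ("a", 100.0, None),  # contents is NULL on this row only
        ("a", 50.0, 25.0),
        ("b", 10.0, None),   # group "b" has no non-NULL contents at all
    ],
)

query = """
    SELECT
        grp,
        SUM(building) + SUM(contents) AS sum_of_sums,
        SUM(building + contents) AS sum_of_row_totals,
        COALESCE(SUM(building), 0) + COALESCE(SUM(contents), 0) AS coalesced
    FROM toy_claims
    GROUP BY grp
    ORDER BY grp
"""

rows = con.execute(query).fetchall()
for row in rows:
    print(row)
```

For group `a`, the sum of sums keeps the 100.0 that the row-wise form drops. For group `b`, where one column is entirely NULL, the sum of sums is itself NULL, and wrapping each `SUM` in `COALESCE(..., 0)` is one way to keep such a group in the ranking.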

## Total Claims by State

In [26]:
con.sql(
    """
    SELECT
        state,
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS TotalClaim, 
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS BuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS ContentsClaim, 
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS IncreasedCostOfComplianceClaim
    FROM claims
    GROUP BY state
    ORDER BY TotalClaim DESC
    LIMIT 20
    """
)

┌─────────┬─────────────┬───────────────┬───────────────┬────────────────────────────────┐
│  state  │ TotalClaim  │ BuildingClaim │ ContentsClaim │ IncreasedCostOfComplianceClaim │
│ varchar │    int64    │     int64     │     int64     │             int64              │
├─────────┼─────────────┼───────────────┼───────────────┼────────────────────────────────┤
│ LA      │ 20869805385 │   16459981751 │    4132267467 │                      277556167 │
│ TX      │ 17218491455 │   13253973375 │    3906593054 │                       57925025 │
│ FL      │ 11433783265 │    9699132485 │    1696552579 │                       38098201 │
│ NJ      │  6470109817 │    5349181327 │     870522071 │                      250406420 │
│ NY      │  5727350837 │    4955149755 │     719127080 │                       53074002 │
│ MS      │  3122839913 │    2386323813 │     675594228 │                       60921873 │
│ NC      │  2009961692 │    1698489347 │     274446907 │                       37025438 │

## Total Claims by State for Fiscal Year 2024

In [27]:
con.sql(
    """
    SELECT
        state, 
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS TotalClaim,  
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS BuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS ContentsClaim, 
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS IncreasedCostOfComplianceClaim
    FROM claims
    WHERE dateOfLoss BETWEEN '2023-10-01' AND '2024-09-30'
    GROUP BY state
    ORDER BY TotalClaim DESC
    LIMIT 20
    """
)

┌─────────┬────────────┬───────────────┬───────────────┬────────────────────────────────┐
│  state  │ TotalClaim │ BuildingClaim │ ContentsClaim │ IncreasedCostOfComplianceClaim │
│ varchar │   int64    │     int64     │     int64     │             int64              │
├─────────┼────────────┼───────────────┼───────────────┼────────────────────────────────┤
│ FL      │  179496222 │     159954749 │      19499473 │                          42000 │
│ TX      │  141641632 │     122577288 │      19064344 │                              0 │
│ LA      │   36697924 │      34205554 │       2492369 │                              0 │
│ ME      │   34229651 │      31037588 │       3182043 │                          10019 │
│ NY      │   33349928 │      30184898 │       3143712 │                          21319 │
│ CA      │   30242469 │      28002393 │       2240076 │                              0 │
│ NJ      │   23724751 │      20819633 │       2905118 │                              0 │
│ SC      
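
The date range in the `WHERE` clause above corresponds to the U.S. federal fiscal year, which runs from October 1 of the previous calendar year through September 30. To repeat the query for other fiscal years, the bounds can be computed rather than typed by hand. The helper names below are hypothetical and the sketch uses only the standard library.

```python
from datetime import date


def fiscal_year_bounds(fiscal_year: int) -> tuple[date, date]:
    """Return the first and last day of a U.S. federal fiscal year."""
    return date(fiscal_year - 1, 10, 1), date(fiscal_year, 9, 30)


def fiscal_year_of(day: date) -> int:
    """Return the federal fiscal year that a calendar date falls in."""
    return day.year + 1 if day.month >= 10 else day.year


start, end = fiscal_year_bounds(2024)
print(start.isoformat(), end.isoformat())

# the latest dateOfLoss reported in the summary statistics is 2024-10-03
print(fiscal_year_of(date(2024, 10, 3)))
```

The two dates could then be passed to the query as parameters, for example with `?` placeholders and `con.execute(query, [start, end])`, instead of being hard-coded in the SQL string.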

## Total Claims by County

In [28]:
con.sql(
    """
    SELECT
        state,
        countyCode,
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS TotalClaim,  
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS BuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS ContentsClaim, 
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS IncreasedCostOfComplianceClaim
    FROM claims
    GROUP BY state, countyCode
    ORDER BY TotalClaim DESC
    LIMIT 20
    """
)

┌─────────┬────────────┬────────────┬───────────────┬───────────────┬────────────────────────────────┐
│  state  │ countyCode │ TotalClaim │ BuildingClaim │ ContentsClaim │ IncreasedCostOfComplianceClaim │
│ varchar │  varchar   │   int64    │     int64     │     int64     │             int64              │
├─────────┼────────────┼────────────┼───────────────┼───────────────┼────────────────────────────────┤
│ TX      │ 48201      │ 8748468467 │    6794800334 │    1933648655 │                       20019478 │
│ LA      │ 22071      │ 7296273827 │    5892518413 │    1293261898 │                      110493515 │
│ LA      │ 22051      │ 3542723793 │    2633465878 │     847684749 │                       61573166 │
│ FL      │ 12071      │ 3413583297 │    3050925028 │     360290319 │                        2367950 │
│ NJ      │ 34029      │ 2607964807 │    2160867823 │     275256283 │                      171840701 │
│ TX      │ 48167      │ 2466669297 │    1896169966 │     549528562 │    
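
The `countyCode` values are five-digit county FIPS codes: the first two digits identify the state and the last three identify the county within it. The sketch below splits a code into those parts and attaches names to a few codes from the output above. The helper name is hypothetical and the name lookup is hand-entered for illustration; it does not come from the claims dataset.

```python
def split_county_fips(county_code: str) -> tuple[str, str]:
    """Split a county FIPS code into (state FIPS, county FIPS)."""
    # zfill restores a leading zero that is lost if the code was ever
    # stored or displayed as a number (e.g. 1003 instead of 01003)
    code = str(county_code).strip().zfill(5)
    if len(code) != 5 or not code.isdigit():
        raise ValueError(f"not a 5-digit county FIPS code: {county_code!r}")
    return code[:2], code[2:]


# hand-entered names for a few codes from the output above (illustrative only)
COUNTY_NAMES = {
    "48201": "Harris County, TX",
    "22071": "Orleans Parish, LA",
    "22051": "Jefferson Parish, LA",
    "12071": "Lee County, FL",
    "34029": "Ocean County, NJ",
    "48167": "Galveston County, TX",
}

for code, name in COUNTY_NAMES.items():
    state_fips, county_fips = split_county_fips(code)
    print(code, state_fips, county_fips, name)
```

For the full dataset, joining against a complete county FIPS reference table would be more practical than a hand-written dictionary.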

## Total Claims by Flood Event

In [29]:
con.sql(
    """
    SELECT
        floodEvent,
        yearOfLoss,
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS TotalClaim,  
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS BuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS ContentsClaim, 
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS ICCClaim
    FROM claims
    WHERE floodEvent IS NOT NULL
    GROUP BY floodEvent, yearOfLoss
    ORDER BY TotalClaim DESC
    LIMIT 20
    """
)

┌──────────────────────────┬────────────┬─────────────┬───────────────┬───────────────┬───────────┐
│        floodEvent        │ yearOfLoss │ TotalClaim  │ BuildingClaim │ ContentsClaim │ ICCClaim  │
│         varchar          │   int16    │    int64    │     int64     │     int64     │   int64   │
├──────────────────────────┼────────────┼─────────────┼───────────────┼───────────────┼───────────┤
│ Hurricane Katrina        │       2005 │ 16261697056 │   12659081935 │    3360020221 │ 242594900 │
│ Hurricane Harvey         │       2017 │  9055522699 │    6925370027 │    2115077279 │  15075393 │
│ Hurricane Sandy          │       2012 │  8956609943 │    7707744797 │     951637071 │ 297228075 │
│ Hurricane Ian            │       2022 │  4757384580 │    4229314739 │     525543744 │   2526097 │
│ Hurricane Ike            │       2008 │  2702511916 │    2073801567 │     577791589 │  50918760 │
│ Mid-summer severe storms │       2016 │  2533431383 │    2175218939 │     347263588 │  10948856 │


In [30]:
con.sql(
    """
    SELECT
        floodEvent,
        yearOfLoss,
        state,
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS TotalClaim,  
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS BuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS ContentsClaim, 
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS ICCClaim
    FROM claims
    WHERE floodEvent IS NOT NULL
    GROUP BY 1, 2, 3
    ORDER BY 4 DESC
    LIMIT 20
    """
)

┌──────────────────────────┬────────────┬─────────┬─────────────┬───────────────┬───────────────┬───────────┐
│        floodEvent        │ yearOfLoss │  state  │ TotalClaim  │ BuildingClaim │ ContentsClaim │ ICCClaim  │
│         varchar          │   int16    │ varchar │    int64    │     int64     │     int64     │   int64   │
├──────────────────────────┼────────────┼─────────┼─────────────┼───────────────┼───────────────┼───────────┤
│ Hurricane Katrina        │       2005 │ LA      │ 13347164644 │   10441743931 │    2721451049 │ 183969663 │
│ Hurricane Harvey         │       2017 │ TX      │  9040181063 │    6912549831 │    2112555840 │  15075393 │
│ Hurricane Ian            │       2022 │ FL      │  4711517234 │    4187737200 │     521388936 │   2391097 │
│ Hurricane Sandy          │       2012 │ NJ      │  4372115685 │    3673280769 │     457640033 │ 241194883 │
│ Hurricane Sandy          │       2012 │ NY      │  4214352197 │    3702376632 │     465410811 │  46564754 │
│ Mid-summ

## Total Claims by Flood Zone
>Formerly called floodZone. NFIP Flood Zone derived from the Flood Insurance Rate Map (FIRM) used to rate the insured property.
>
>- A - Special Flood with no Base Flood Elevation on FIRM
>- AE, A1-A30 - Special Flood with Base Flood Elevation on FIRM
>- A99 - Special Flood with Protection Zone
>- AH, AHB* - Special Flood with Shallow Ponding
>- AO, AOB* - Special Flood with Sheet Flow
>- X, B - Moderate Flood from primary water source. Pockets of areas subject to drainage problems
>- X, C - Minimal Flood from primary water source. Pockets of areas subject to drainage problems
>- D - Possible Flood
>- V - Velocity Flood with no Base Flood Elevation on FIRM
>- VE, V1-V30 - Velocity Flood with Base Flood Elevation on FIRM
>- AE, VE, X - New zone designations used on new maps starting January 1, 1986, in lieu of A1-A30, V1-V30, and B and C
>- AR - A Special Flood Hazard Area that results from the decertification of a previously accredited flood protection system that is determined to be in the process of being restored to provide base flood protection
>- AR Dual Zones - (AR/AE, AR/A1-A30, AR/AH, AR/AO, AR/A) Areas subject to flooding from failure of the flood protection system (Zone AR) which also overlap an existing Special Flood Hazard Area as a dual zone
>
>*AHB, AOB, ARE, ARH, ARO, and ARA are not risk zones shown on a map, but are acceptable values for rating purposes*
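
Because the data dictionary lists many individual zone codes, it can help to collapse them into a few broad families before comparing totals. The sketch below is one possible grouping based on the descriptions quoted above. The family names, the prefix matching, and the function name `zone_family` are assumptions made for illustration, not an official NFIP classification.

```python
def zone_family(zone):
    """Collapse a rated flood zone code into a broad family (illustrative grouping)."""
    if zone is None:
        return "Unknown"
    code = zone.strip().upper()
    if code == "":
        return "Unknown"
    if code.startswith("V"):
        # V, VE, V1-V30
        return "Velocity Flood"
    if code == "D":
        return "Possible Flood"
    if code in {"X", "B", "C"}:
        return "Moderate or Minimal Flood"
    if code.startswith("A"):
        # A, AE, A1-A30 (also written zero-padded, e.g. A04), A99, AH, AO, AR, ...
        return "Special Flood"
    return "Other"


for example in ["AE", "A04", "VE", "X", "D", "AR/AE", None]:
    print(example, "->", zone_family(example))
```

Applied to the `ratedFloodZone` column, a mapping like this would let the totals be grouped by family instead of by individual code.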

In [31]:
con.sql(
    """
    SELECT
        ratedFloodZone,
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS TotalClaim,  
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS BuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS ContentsClaim, 
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS IncreasedCostOfComplianceClaim
    FROM claims
    GROUP BY 1
    ORDER BY 2 DESC
    LIMIT 20
    """
)

┌────────────────┬─────────────┬───────────────┬───────────────┬────────────────────────────────┐
│ ratedFloodZone │ TotalClaim  │ BuildingClaim │ ContentsClaim │ IncreasedCostOfComplianceClaim │
│    varchar     │    int64    │     int64     │     int64     │             int64              │
├────────────────┼─────────────┼───────────────┼───────────────┼────────────────────────────────┤
│ AE             │ 35479231566 │   29574222634 │    5389301443 │                      515707490 │
│ X              │ 13436983455 │   10383896784 │    3024472374 │                       28614297 │
│ B              │  3826110386 │    2835364624 │     971908875 │                       18836887 │
│ A              │  3130755422 │    2486940415 │     606541394 │                       37273612 │
│ C              │  2853210440 │    2119312629 │     723803760 │                       10094051 │
│ A04            │  2185871995 │    1800832671 │     364399597 │                       20639726 │
│ A03            │  

## Total Claims by Occupancy Type
>Code indicating the use and occupancy type of the insured structure. Note, 2-digit codes are for Risk Rating 2.0 policies.
>
>- 1 = single family residence
>- 2 = 2 to 4 unit residential building
>- 3 = residential building with more than 4 units
>- 4 = Non-residential building
>- 6 = Non Residential - Business
>- 11 = Single-family residential building with the exception of a mobile home or a single residential unit within a multi unit building
>- 12 = A residential non-condo building with 2, 3, or 4 units seeking insurance on all units
>- 13 = A residential non-condo building with 5 or more units seeking insurance on all units
>- 14 = Residential mobile/manufactured home
>- 15 = Residential condo association seeking coverage on a building with one or more units
>- 16 = Single residential unit within a multi-unit building
>- 17 = Non-residential mobile/manufactured home
>- 18 = A non-residential building
>- 19 = a non-residential unit within a multi-unit building
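
The query output for this section shows only numeric codes, so a lookup table makes the results easier to read. The dictionary below is transcribed from the data dictionary text quoted above, with some labels shortened. The names `OCCUPANCY_LABELS` and `occupancy_label` are hypothetical, and the fallback for codes missing from the quote is an assumption.

```python
# labels transcribed (some shortened) from the data dictionary quoted above
OCCUPANCY_LABELS = {
    1: "Single family residence",
    2: "2 to 4 unit residential building",
    3: "Residential building with more than 4 units",
    4: "Non-residential building",
    6: "Non-residential - business",
    11: "Single-family residential building",
    12: "Residential non-condo building with 2 to 4 units",
    13: "Residential non-condo building with 5 or more units",
    14: "Residential mobile/manufactured home",
    15: "Residential condo association",
    16: "Single residential unit within a multi-unit building",
    17: "Non-residential mobile/manufactured home",
    18: "Non-residential building",
    19: "Non-residential unit within a multi-unit building",
}


def occupancy_label(code):
    """Return a readable label for an occupancyType code."""
    if code is None:
        return "Missing"
    return OCCUPANCY_LABELS.get(int(code), f"Unlisted code {code}")


for code in [1, 4, 15, 5, None]:
    print(code, "->", occupancy_label(code))
```

Codes 1 through 6 and codes 11 through 19 overlap in meaning (for example 4 and 18), so combining them into shared categories would be a further analysis choice.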

In [32]:
con.sql(
    """
    SELECT
        occupancyType,
        -- total paid = building + contents + Increased Cost of Compliance (ICC)
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS TotalClaim,
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS BuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS ContentsClaim,
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS IncreasedCostOfComplianceClaim
    FROM claims
    GROUP BY 1       -- occupancyType
    ORDER BY 2 DESC  -- TotalClaim, largest first
    LIMIT 20
    """
)

┌───────────────┬─────────────┬───────────────┬───────────────┬────────────────────────────────┐
│ occupancyType │ TotalClaim  │ BuildingClaim │ ContentsClaim │ IncreasedCostOfComplianceClaim │
│     int16     │    int64    │     int64     │     int64     │             int64              │
├───────────────┼─────────────┼───────────────┼───────────────┼────────────────────────────────┤
│             1 │ 58074889678 │   46438196346 │   10745221948 │                      891471384 │
│             4 │  6399512277 │    4292656524 │    2096028112 │                       10827641 │
│             3 │  4398976557 │    4253333229 │     138336377 │                        7306952 │
│             2 │  3733231026 │    3410379315 │     287688446 │                       35163265 │
│             6 │  3360242402 │    2512042450 │     846579743 │                        1620209 │
│            11 │  2822556575 │    2447270648 │     373666665 │                        1619262 │
│            15 │   946313442 

## Total Claims by Flood Zone and Occupancy Type
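
Each query in this part of the notebook has the same shape: one or more grouping columns followed by the same four payout aggregates, ordered by the total. As a sketch, a small helper could build that SQL for any list of grouping columns. `claims_summary_sql` is a hypothetical name introduced here, and column names are interpolated without quoting, so it is only suitable for trusted, hard-coded names.

```python
def claims_summary_sql(group_cols, table="claims", limit=20):
    """Build the payout-summary query for the given grouping columns.

    Hypothetical helper, not part of the notebook or of DuckDB.
    """
    if not group_cols:
        raise ValueError("group_cols must contain at least one column name")

    # Output alias -> source column, in the order the notebook displays them.
    paid = {
        "BuildingClaim": "amountPaidOnBuildingClaim",
        "ContentsClaim": "amountPaidOnContentsClaim",
        "IncreasedCostOfComplianceClaim": "amountPaidOnIncreasedCostOfComplianceClaim",
    }

    keys = ",\n        ".join(group_cols)
    total = " + ".join(f"SUM({col})" for col in paid.values())
    parts = ",\n        ".join(
        f"ROUND(SUM({col}), 0)::BIGINT AS {alias}" for alias, col in paid.items()
    )
    # Positional references: grouping columns come first, TotalClaim right after.
    positions = ", ".join(str(i) for i in range(1, len(group_cols) + 1))
    total_position = len(group_cols) + 1

    return f"""
    SELECT
        {keys},
        ROUND({total}, 0)::BIGINT AS TotalClaim,
        {parts}
    FROM {table}
    GROUP BY {positions}
    ORDER BY {total_position} DESC
    LIMIT {limit}
    """


sql = claims_summary_sql(["ratedFloodZone", "occupancyType"])
print(sql)
# In the notebook, the string could then be passed to the existing connection:
# con.sql(sql).show()
```

Building the string separately from running it keeps the helper easy to inspect before it touches the database.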

In [33]:
con.sql(
    """
    SELECT
        ratedFloodZone,
        occupancyType,
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS TotalClaim,
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS BuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS ContentsClaim,
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS IncreasedCostOfComplianceClaim
    FROM claims
    GROUP BY 1, 2
    ORDER BY 3 DESC
    LIMIT 20
    """
)

┌────────────────┬───────────────┬─────────────┬───────────────┬───────────────┬────────────────────────────────┐
│ ratedFloodZone │ occupancyType │ TotalClaim  │ BuildingClaim │ ContentsClaim │ IncreasedCostOfComplianceClaim │
│    varchar     │     int16     │    int64    │     int64     │     int64     │             int64              │
├────────────────┼───────────────┼─────────────┼───────────────┼───────────────┼────────────────────────────────┤
│ AE             │             1 │ 24292519550 │   19994275329 │    3805913053 │                      492331168 │
│ X              │             1 │ 10708475718 │    8230626419 │    2450440203 │                       27409096 │
│ B              │             1 │  2959370101 │    2183341811 │     758742954 │                       17285336 │
│ A              │             1 │  2377100074 │    1942533526 │     398588313 │                       35978235 │
│ AE             │            11 │  2243498692 │    1942575827 │     299391808 │        

## Total Claims by Basement Enclosure Crawlspace Type
>Basement is defined for purposes of the NFIP as any level or story which has its floor subgrade on all sides. Basement structure values are as follows: 0 - None; 1 - Finished Basement/Enclosure; 2 - Unfinished Basement/Enclosure; 3 - Crawlspace; 4 - Subgrade Crawlspace

In [34]:
con.sql(
    """
    SELECT
        basementEnclosureCrawlspaceType,
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS TotalClaim,
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS BuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS ContentsClaim,
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS IncreasedCostOfComplianceClaim
    FROM claims
    GROUP BY 1
    ORDER BY 2 DESC
    LIMIT 20
    """
)

┌─────────────────────────────────┬─────────────┬───────────────┬───────────────┬────────────────────────────────┐
│ basementEnclosureCrawlspaceType │ TotalClaim  │ BuildingClaim │ ContentsClaim │ IncreasedCostOfComplianceClaim │
│              int16              │    int64    │     int64     │     int64     │             int64              │
├─────────────────────────────────┼─────────────┼───────────────┼───────────────┼────────────────────────────────┤
│                            NULL │ 55736598435 │   43974709709 │   11010313677 │                      751575049 │
│                               0 │ 14118916150 │   11621906529 │    2431677559 │                       65332063 │
│                               2 │  6023112182 │    5309665486 │     621057933 │                       92388762 │
│                               1 │  3411584798 │    2953210350 │     440219829 │                       18154620 │
│                               4 │  1348141039 │    1190101534 │     137052891 

## Total Claims by Cause of Damage
>Indicates the method by which the insured's property and contents were damaged. Legal values (value : description):
>
>- 0 : Other causes
>- 1 : Tidal water overflow
>- 2 : Stream, river, or lake overflow
>- 3 : Alluvial fan overflow
>- 4 : Accumulation of rainfall or snowmelt
>- 7 : Erosion-demolition
>- 8 : Erosion-removal
>- 9 : Earth movement, landslide, land subsidence, sinkholes, etc.
>- A : Closed basin lake
>- B : Expedited claim handling process without site inspection
>- C : Expedited claim handling process follow-up site inspection
>- D : Expedited claim handling process by Adjusting Process Pilot Program (Remote Adjustment)
>
>NOTE: Due to certain provisions of the Upton Jones Amendment to the National Flood Insurance Act, cause of loss codes 7 and 8 may be used only if the date of loss is prior to September 23, 1995. More than one cause of loss code can be selected. For example, you may select 2 (stream, river, or lake overflow) and D (remote adjustment) on a single claim, or any combination of letters/numbers as appropriate.
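
The NOTE in the description above states a date rule for the two erosion codes. As a sketch of how that rule could be checked, the function below takes the individual codes selected on a claim and the date of loss. `cause_codes_allowed` is a hypothetical helper, and it assumes the codes have already been separated into single values, since the description does not say how multiple codes are stored in the column.

```python
from datetime import date

# Codes 7 and 8 may be used only if the date of loss is prior to this date.
UPTON_JONES_CUTOFF = date(1995, 9, 23)
EROSION_CODES = {"7", "8"}


def cause_codes_allowed(codes, date_of_loss):
    """Return True if the selected cause codes satisfy the erosion date rule.

    `codes` is an iterable of single code strings such as ["2", "D"].
    Hypothetical helper; the pre-split input format is an assumption.
    """
    for code in codes:
        if code in EROSION_CODES and date_of_loss >= UPTON_JONES_CUTOFF:
            return False
    return True


print(cause_codes_allowed(["7"], date(1995, 9, 22)))
print(cause_codes_allowed(["8"], date(1995, 9, 23)))
print(cause_codes_allowed(["2", "D"], date(2020, 1, 1)))
```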

In [35]:
con.sql(
    """
    SELECT
        causeOfDamage,
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS TotalClaim,
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS BuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS ContentsClaim,
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS IncreasedCostOfComplianceClaim
    FROM claims
    GROUP BY 1
    ORDER BY 2 DESC
    LIMIT 20
    """
)

┌───────────────┬─────────────┬───────────────┬───────────────┬────────────────────────────────┐
│ causeOfDamage │ TotalClaim  │ BuildingClaim │ ContentsClaim │ IncreasedCostOfComplianceClaim │
│    varchar    │    int64    │     int64     │     int64     │             int64              │
├───────────────┼─────────────┼───────────────┼───────────────┼────────────────────────────────┤
│ 4             │ 32297114786 │   25920976704 │    6094556798 │                      281581284 │
│ 1             │ 21153634243 │   17340791855 │    3445277646 │                      367564742 │
│ 2             │ 18498106661 │   14798505265 │    3483557466 │                      216043929 │
│ 0             │  6152450677 │    4980477176 │    1103245478 │                       68728023 │
│ B             │  2101842424 │    1650532490 │     440000279 │                       11309655 │
│ 3             │   236698761 │     184969651 │      49490096 │                        2239014 │
│ NULL          │    56071884 

## Total Claims by Building Description
>Indicates the description of the use of the insured building. Legal values (value : description):
>
>- 01 : Main House
>- 02 : Detached Guest House
>- 03 : Detached Garage
>- 04 : Agricultural Building
>- 05 : Warehouse
>- 06 : Pool House, Clubhouse, Recreation Building
>- 07 : Tool/Storage Shed
>- 08 : Other
>- 09 : Barn
>- 10 : Apartment Building
>- 11 : Apartment - Unit
>- 12 : Cooperative Building
>- 13 : Cooperative - Unit
>- 14 : Commercial Building
>- 15 : Condominium (Entire Building)
>- 16 : Condominium - Unit
>- 17 : House of Worship
>- 18 : Manufactured (Mobile) Home
>- 19 : Travel Trailer
>- 20 : Townhouse/Rowhouse
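
The data dictionary writes these codes as zero-padded strings (`01`, `08`), while the query output for this column reports an `int16` type and shows `1`, `8`. A small sketch of bridging the two is shown here. `building_label` is a hypothetical helper, the fallback strings are choices made for illustration, and only a few of the codes from the description are included to keep the example short.

```python
# Subset of the data dictionary entries, keyed as they appear in the description.
BUILDING_DESCRIPTIONS = {
    "01": "Main House",
    "02": "Detached Guest House",
    "05": "Warehouse",
    "08": "Other",
    "10": "Apartment Building",
}


def building_label(code):
    """Map an integer buildingDescriptionCode to its dictionary description.

    Hypothetical helper; NULL codes arrive as None in Python.
    """
    if code is None:
        return "Not reported"
    key = f"{int(code):02d}"  # 1 -> "01", 10 -> "10"
    return BUILDING_DESCRIPTIONS.get(key, f"Code {key}")


for code in (1, 8, 10, None):
    print(code, "->", building_label(code))
```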

In [36]:
con.sql(
    """
    SELECT
        buildingDescriptionCode,
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS TotalClaim,
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS BuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS ContentsClaim,
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS IncreasedCostOfComplianceClaim
    FROM claims
    GROUP BY 1
    ORDER BY 2 DESC
    LIMIT 20
    """
)

┌─────────────────────────┬─────────────┬───────────────┬───────────────┬────────────────────────────────┐
│ buildingDescriptionCode │ TotalClaim  │ BuildingClaim │ ContentsClaim │ IncreasedCostOfComplianceClaim │
│          int16          │    int64    │     int64     │     int64     │             int64              │
├─────────────────────────┼─────────────┼───────────────┼───────────────┼────────────────────────────────┤
│                       1 │ 39094112092 │   32540928483 │    6132748786 │                      420434822 │
│                    NULL │ 38230967250 │   29855329355 │    7851193461 │                      524444435 │
│                       8 │  2754729151 │    2195108224 │     556837496 │                        2783431 │
│                       5 │   199062000 │     130086791 │      68975209 │                              0 │
│                      10 │   188592994 │     187206254 │       1386741 │                              0 │
│                       2 │    586075

## Total Claims by Rental Property Indicator
>Indicates if the property is a rental property. Yes is indicated with either a 'true' or '1'. No is indicated with either a 'false' or '0'. NOTE: A rental property is a property from which the owner receives payment from the occupant(s), known as tenants, in return for occupying or using the property. Rental properties may be either residential or commercial.

In [37]:
con.sql(
    """
    SELECT
        rentalPropertyIndicator,
        ROUND(
            SUM(amountPaidOnBuildingClaim)
            + SUM(amountPaidOnContentsClaim)
            + SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS TotalClaim,
        ROUND(SUM(amountPaidOnBuildingClaim), 0)::BIGINT AS BuildingClaim,
        ROUND(SUM(amountPaidOnContentsClaim), 0)::BIGINT AS ContentsClaim,
        ROUND(SUM(amountPaidOnIncreasedCostOfComplianceClaim), 0)::BIGINT AS IncreasedCostOfComplianceClaim
    FROM claims
    GROUP BY 1
    ORDER BY 2 DESC
    LIMIT 20
    """
)

┌─────────────────────────┬─────────────┬───────────────┬───────────────┬────────────────────────────────┐
│ rentalPropertyIndicator │ TotalClaim  │ BuildingClaim │ ContentsClaim │ IncreasedCostOfComplianceClaim │
│         boolean         │    int64    │     int64     │     int64     │             int64              │
├─────────────────────────┼─────────────┼───────────────┼───────────────┼────────────────────────────────┤
│ false                   │ 79198225176 │   63777570301 │   14474933322 │                      945721553 │
│ true                    │  1440127427 │    1272023307 │     165388566 │                        2715554 │
└─────────────────────────┴─────────────┴───────────────┴───────────────┴────────────────────────────────┘
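
To put the two rows in proportion, the figures can be copied from the output above and worked through in plain Python. This is only arithmetic on the printed totals, not a new query; the dictionary layout and variable names are illustrative.

```python
# Whole-dollar totals copied from the query output above.
paid = {
    False: {
        "building": 63_777_570_301,
        "contents": 14_474_933_322,
        "icc": 945_721_553,
        "total": 79_198_225_176,
    },
    True: {
        "building": 1_272_023_307,
        "contents": 165_388_566,
        "icc": 2_715_554,
        "total": 1_440_127_427,
    },
}

# Sanity check: the three components add up to the reported total in each row.
for row in paid.values():
    assert row["building"] + row["contents"] + row["icc"] == row["total"]

grand_total = sum(row["total"] for row in paid.values())
rental_share = paid[True]["total"] / grand_total

print(f"Total paid across both rows: ${grand_total:,}")
print(f"Share paid on rental properties: {rental_share:.2%}")
```

Claims flagged as rental properties account for a little under 2% of the roughly $80.6 billion paid.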