## ProSafe Auto Insure Dataset
The ProSafe Auto Insure dataset is a large, likely synthetic collection of auto insurance policy records intended to help insurers understand what goes into the pricing of auto insurance premiums. Despite its probable synthetic origin, the [Kaggle page hosting the data source](https://www.kaggle.com/datasets/zeesolver/prosafe-auto-insure) claims that the dataset is recognized industry-wide and is cited in 131 scholarly articles.

In [1]:
import pandas as pd
from pandasql import sqldf

As in the Allstate example, the datasets can be imported with pandas' `read_csv()` function, using the `base_dir` variable to build each file path:

In [2]:
base_dir = "../data_sources/prosafe_dataset"
policy_from_2011_to_2014 = pd.read_csv(f"{base_dir}/policy_from_2011_to_2014.csv")
policy_from_2014_to_2018 = pd.read_csv(f"{base_dir}/policy_from_2014_to_2018.csv")
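A side note on loading: when a CSV has been written together with its DataFrame index, `read_csv()` surfaces that index as an extra column named `Unnamed: 0`. The sketch below illustrates this with a small in-memory CSV (a hypothetical stand-in, not the real ProSafe files) and shows how `index_col=0` avoids the extra column. Whether this applies to the ProSafe files depends on how they were exported, so treat it as an assumption to check.

```python
import io

import pandas as pd

# Hypothetical miniature CSV whose first header cell is empty,
# which is what a saved DataFrame index looks like on disk.
csv_text = ",SEX,PREMIUM\n0,0,7209.14\n1,0,7203.89\n"

# Default behaviour: the saved index comes back as a regular column.
raw = pd.read_csv(io.StringIO(csv_text))
print(list(raw.columns))

# With index_col=0 the first column is used as the row index instead.
clean = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(list(clean.columns))
```

The same `index_col=0` argument could be passed to the two `read_csv()` calls above if the extra column turns out to be unwanted.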

As before, we can execute SQL statements against the DataFrames using the `sqldf` function:

In [3]:
sqldf("SELECT * FROM policy_from_2011_to_2014 LIMIT 10;")

Unnamed: 0,SEX,INSR_BEGIN,INSR_END,EFFECTIVE_YR,INSR_TYPE,INSURED_VALUE,PREMIUM,OBJECT_ID,PROD_YEAR,SEATS_NUM,CARRYING_CAPACITY,TYPE_VEHICLE,CCM_TON,MAKE,USAGE,CLAIM_PAID
0,0,08-AUG-13,07-AUG-14,8,1202,519755.22,7209.14,5000029885,2007.0,4.0,6.0,Pick-up,3153.0,NISSAN,Own Goods,
1,0,08-AUG-12,07-AUG-13,8,1202,519755.22,7203.89,5000029885,2007.0,4.0,6.0,Pick-up,3153.0,NISSAN,Own Goods,
2,0,08-AUG-11,07-AUG-12,8,1202,519755.22,7045.804,5000029885,2007.0,4.0,6.0,Pick-up,3153.0,NISSAN,Own Goods,
3,0,08-JUL-11,07-AUG-11,8,1202,519755.22,287.25,5000029885,2007.0,4.0,6.0,Pick-up,3153.0,NISSAN,Own Goods,
4,0,08-AUG-13,07-AUG-14,8,1202,285451.24,4286.9,5000029901,2010.0,4.0,7.0,Pick-up,2494.0,TOYOTA,Own Goods,19894.43
5,0,08-AUG-12,07-AUG-13,8,1202,285451.24,4286.65,5000029901,2010.0,4.0,7.0,Pick-up,2494.0,TOYOTA,Own Goods,26916.44
6,0,08-AUG-11,07-AUG-12,8,1202,285451.24,4123.564,5000029901,2010.0,4.0,7.0,Pick-up,2494.0,TOYOTA,Own Goods,
7,0,08-JUL-11,07-AUG-11,8,1202,285451.24,155.01,5000029901,2010.0,4.0,7.0,Pick-up,2494.0,TOYOTA,Own Goods,
8,0,08-JUL-12,07-JUL-13,11,1201,200000.0,3452.65,5000030346,1982.0,10.0,,Station Wagones,4164.0,TOYOTA,Private,
9,0,08-AUG-11,07-JUL-12,11,1201,200000.0,3077.54,5000030346,1982.0,10.0,,Station Wagones,4164.0,TOYOTA,Private,
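To make it clearer what a call like `sqldf(...)` does, here is a rough sketch of the same idea using only pandas and the standard-library `sqlite3` module. By default `pandasql` runs queries against an in-memory SQLite database; the helper `run_sql` below is a hypothetical name introduced for illustration and is not part of `pandasql`. The tiny `policies` frame reuses a few values from the sample rows above.

```python
import sqlite3

import pandas as pd

# Small stand-in frame; values copied from the sample output above.
policies = pd.DataFrame({
    "OBJECT_ID": [5000029885, 5000029901, 5000030346],
    "MAKE": ["NISSAN", "TOYOTA", "TOYOTA"],
    "PREMIUM": [7209.14, 4286.90, 3452.65],
})


def run_sql(query, frames):
    """Hypothetical helper: load each frame into an in-memory SQLite
    database under the given table name, then run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        for name, frame in frames.items():
            frame.to_sql(name, conn, index=False)
        return pd.read_sql_query(query, conn)
    finally:
        conn.close()


result = run_sql(
    "SELECT MAKE, COUNT(*) AS n FROM policies GROUP BY MAKE ORDER BY MAKE;",
    {"policies": policies},
)
print(result)
```

Unlike this sketch, `sqldf` can find the DataFrames by their variable names in the calling scope, which is why the notebook cells can reference `policy_from_2011_to_2014` directly in the SQL text.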


In [4]:
sqldf("SELECT * FROM policy_from_2014_to_2018 LIMIT 10;")

Unnamed: 0,SEX,INSR_BEGIN,INSR_END,EFFECTIVE_YR,INSR_TYPE,INSURED_VALUE,PREMIUM,OBJECT_ID,PROD_YEAR,SEATS_NUM,CARRYING_CAPACITY,TYPE_VEHICLE,CCM_TON,MAKE,USAGE,CLAIM_PAID
0,0,08-AUG-17,07-AUG-18,8,1202,519755.22,5097.83,5000029885,2007.0,4.0,6.0,Pick-up,3153.0,NISSAN,Own Goods,
1,0,08-AUG-16,07-AUG-17,8,1202,519755.22,6556.52,5000029885,2007.0,4.0,6.0,Pick-up,3153.0,NISSAN,Own Goods,
2,0,08-AUG-15,07-AUG-16,8,1202,519755.22,6556.52,5000029885,2007.0,4.0,6.0,Pick-up,3153.0,NISSAN,Own Goods,
3,0,08-AUG-14,07-AUG-15,8,1202,519755.22,5102.83,5000029885,2007.0,4.0,6.0,Pick-up,3153.0,NISSAN,Own Goods,
4,0,08-AUG-17,07-AUG-18,8,1202,1400000.0,13304.87,5000029901,2010.0,4.0,7.0,Pick-up,2494.0,TOYOTA,Own Goods,
5,0,08-AUG-16,07-AUG-17,8,1202,1400000.0,16438.15,5000029901,2010.0,4.0,7.0,Pick-up,2494.0,TOYOTA,Own Goods,
6,0,08-AUG-15,07-AUG-16,8,1202,1400000.0,16438.15,5000029901,2010.0,4.0,7.0,Pick-up,2494.0,TOYOTA,Own Goods,365250.0
7,0,08-AUG-14,07-AUG-15,8,1202,285451.24,3931.23,5000029901,2010.0,4.0,7.0,Pick-up,2494.0,TOYOTA,Own Goods,12152.73
8,1,24-NOV-17,23-NOV-18,12,1202,3400000.0,26804.72,5000030358,2012.0,0.0,220.0,Truck,12880.0,IVECO,General Cartage,
9,1,24-NOV-16,23-NOV-17,12,1202,3400000.0,26804.72,5000030358,2012.0,0.0,220.0,Truck,12880.0,IVECO,General Cartage,
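The `INSR_BEGIN` and `INSR_END` columns are read as plain strings such as `08-AUG-17`. As a minimal sketch, assuming every value follows this day, abbreviated English month, two-digit year pattern, they can be converted with `pd.to_datetime()` and an explicit format, which also makes it possible to compute the length of each cover period. The `COVER_DAYS` column name is introduced here for illustration only.

```python
import pandas as pd

# Two rows copied from the 2014-2018 sample output above.
sample = pd.DataFrame({
    "INSR_BEGIN": ["08-AUG-17", "24-NOV-17"],
    "INSR_END": ["07-AUG-18", "23-NOV-18"],
})

# Assumed pattern: day, abbreviated month name, two-digit year.
date_format = "%d-%b-%y"
for column in ("INSR_BEGIN", "INSR_END"):
    sample[column] = pd.to_datetime(sample[column], format=date_format)

# Illustrative derived column: days between start and end of cover.
sample["COVER_DAYS"] = (sample["INSR_END"] - sample["INSR_BEGIN"]).dt.days
print(sample)
```

Month-name parsing with `%b` depends on the active locale, so this assumes an English locale; values that do not match the pattern would raise an error unless `errors="coerce"` is passed.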
