# Coding Assignment #1

Welcome to your first coding assignment! You will work with the provided dataset, which contains information about roof insurance claims. In this assignment, you will:
1. Load and inspect the dataset.
2. Perform basic data exploration.
3. Practice Python programming skills, including:
   - Using basic arithmetic and comparison operators.
   - Creating and working with lists.
   - Using booleans.
   - Writing for loops.
   - Using if-elif-else statements.
4. Develop a Python-based workflow to prepare the roof insurance claim dataset for analysis. This assignment focuses on learning how to:
   - Extract, clean, and transform data.
   - Identify and handle missing values.
   - Filter and organize data for analysis.

## Dataset
The dataset you'll be working with is `final_insurance_fraud.xlsx`. This file contains information about insurance claims, including details like claim type, amount, and fraud status.



## Case Introduction

Casey Lee, an insurance claims processor, was reviewing claims received from a recent storm before finalizing authorization for roof replacements. She pulled up and reread the U.S. National Weather Service announcement:

&nbsp;&nbsp; TORNADO WARNING  
&nbsp;&nbsp; NATIONAL WEATHER SERVICE CHICAGO/ROMEOVILLE   
&nbsp;&nbsp;1215 AM CDT THU SEP 12 20XX  

&nbsp;&nbsp;THE NATIONAL WEATHER SERVICE IN CHICAGO HAS ISSUED A   
&nbsp;&nbsp;*TORNADO WARNING FOR...    
&nbsp;&nbsp;CENTRAL DEKALB COUNTY IN NORTH CENTRAL ILLINOIS...    
&nbsp;&nbsp;UNTIL 530 PM CDT.  

&nbsp;&nbsp;*AT 1218 AM CDT, A SEVERE THUNDERSTORM CAPABLE OF PRODUCING A  
&nbsp;&nbsp;TORNADO WAS LOCATED NEAR SYCAMORE,  
&nbsp;&nbsp;OR NEAR SHABBONA, MOVING SOUTHWEST AT 2 MPH.  
&nbsp;&nbsp;&nbsp;&nbsp;  HAZARD...TORNADO AND QUARTER-SIZED HAIL.  
&nbsp;&nbsp;&nbsp;&nbsp;  SOURCE...RADAR INDICATED ROTATION.  
&nbsp;&nbsp;&nbsp;&nbsp; IMPACT...FLYING DEBRIS WILL BE DANGEROUS TO THOSE CAUGHT WITHOUT SHELTER.   
 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;MOBILE HOMES WILL BE DAMAGED OR DESTROYED.  
   &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;   DAMAGE TO ROOFS, WINDOWS, AND VEHICLES WILL OCCUR. TREE DAMAGE IS LIKELY.  

&nbsp;&nbsp;*THIS DANGEROUS STORM WILL BE NEAR...  
&nbsp;&nbsp;SYCAMORE AROUND 1240 AM CDT.   
&nbsp;&nbsp;DEKALB AROUND 600 AM CDT.  
&nbsp;&nbsp;COURTLAND AROUND 1140 AM.     

&nbsp;&nbsp;PRECAUTIONARY/PREPAREDNESS ACTIONS...   

&nbsp;&nbsp;TAKE COVER NOW! MOVE TO A BASEMENT OR AN INTERIOR ROOM  
&nbsp;&nbsp;ON THE LOWEST FLOOR OF A STURDY BUILDING.  
&nbsp;&nbsp;AVOID WINDOWS. IF YOU ARE OUTDOORS, IN A MOBILE HOME, OR IN A VEHICLE,   
&nbsp;&nbsp;MOVE TO THE CLOSEST SUBSTANTIAL SHELTER AND PROTECT YOURSELF FROM FLYING DEBRIS.    

Indeed, it appeared to be a bad storm, which could substantiate the large number of claims that she received for new roofs from hail and wind damage. Yet, she felt that something could be off.

While Casey could not process data from multiple companies herself, she knew that the National Insurance Crime Bureau might be able to help by aggregating data from insurance companies across the area hit by the storm and evaluating it for anomalies. Casey's request landed on your desk. As a new fraud specialist, you have been hired to investigate claims following storm damage, with the goal of reducing payouts to false claimants. You know you have to act fast. You begin by pulling the claims data for roofs. You have also received a database that shows the actual path of the storm. Your task is to sort through the claims to see whether there are any unusual claim patterns from this recent weather event.

---
Case introduction and dataset come from: Cheng, C., & Lee, C.-C. (2023). A Case Study Using Data Analytics to Detect Hail Damage Insurance Claim Fraud. *Journal of Forensic Accounting Research, 8,* 287–306.

# **Instructions: Steps to Complete**

### 1. **Load the Dataset:**
   - Load the roof insurance claim dataset (provided in `.xlsx` format) into a Pandas DataFrame named `df`
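
As a rough sketch of the loading step, assuming the workbook has been uploaded to the Colab session's working directory and that the claims sit on the first worksheet (both are assumptions; adjust the path or pass `sheet_name=` if your setup differs). The helper name `load_claims` is hypothetical and only used for illustration.

```python
from pathlib import Path

import pandas as pd

# Assumed location: the file sits in the current working directory.
DATA_PATH = Path("final_insurance_fraud.xlsx")


def load_claims(path):
    """Read the first worksheet of an Excel workbook into a DataFrame."""
    return pd.read_excel(path)


if DATA_PATH.exists():
    df = load_claims(DATA_PATH)
    print(df.shape)  # (rows, columns) as a quick sanity check
else:
    print(f"{DATA_PATH} not found; upload it to the session first.")
```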

### 2. **Inspect the Dataset**
- Print the first 5 rows of the dataset to inspect its structure.
- Print all column names.
- Display the structure and data types for all variables.
- Display summary statistics for all numeric columns.
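
The four inspection calls might look like the sketch below. It runs on a tiny made-up DataFrame rather than the real file, so the values are placeholders; only the column names are taken from this assignment.

```python
import pandas as pd

# Toy stand-in for the claims data; all values are invented for illustration.
df = pd.DataFrame({
    "City": ["Sycamore", "DeKalb", "Shabbona"],
    "Wind Speed": [55, 62, 48],
    "Home Square Feet": [1800, 4000, 2500],
})

print(df.head())            # first 5 rows (only 3 exist in this toy frame)
print(df.columns.tolist())  # all column names as a plain list
df.info()                   # structure and data types; prints directly
summary = df.describe()     # summary statistics for numeric columns
print(summary)
```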

### 3. **Use Operators**
- Use basic *arithmetic* operators to:
  - Add 10 to the `Wind Speed` for each claim and print the result.

- Use *comparison* operators to:
  - Show all rows of the dataset where `Home Square Feet` is greater than or equal to 3750.  
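
A minimal sketch of both operator tasks on made-up values: arithmetic on a column applies to every row at once, and a comparison produces a boolean mask that can be used to select rows.

```python
import pandas as pd

# Invented values; only the column names come from the assignment.
df = pd.DataFrame({
    "Wind Speed": [55, 62, 48],
    "Home Square Feet": [1800, 4000, 3750],
})

# Arithmetic operator: adds 10 to each value without changing df itself.
adjusted_wind = df["Wind Speed"] + 10
print(adjusted_wind)

# Comparison operator: the mask is True where the condition holds.
large_homes = df[df["Home Square Feet"] >= 3750]
print(large_homes)
```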


### 4. **Create and Work with a List**
- Create and print out a `list` of all *unique* values from the `City` column. Call the list `city_names`
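
One possible way to build the list, shown on a toy column: `unique()` returns the distinct values in order of first appearance, and `tolist()` converts the result to a regular Python `list`.

```python
import pandas as pd

# Invented city values with one repeat.
df = pd.DataFrame({"City": ["Sycamore", "DeKalb", "Sycamore", "Shabbona"]})

city_names = df["City"].unique().tolist()
print(city_names)
```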

### 5. **For Loop**
- Use a `for` loop to iterate through the `Rainfall` column and print:
  - `"The amount of rainfall was <rainfall> inches."`, where `<rainfall>` refers to the amount of rainfall.
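
A short sketch of the loop, using invented rainfall amounts. Iterating over a column yields its values one at a time, and an f-string drops each value into the sentence.

```python
import pandas as pd

# Invented rainfall amounts, in inches.
df = pd.DataFrame({"Rainfall": [1.2, 0.8, 2.5]})

for rainfall in df["Rainfall"]:
    print(f"The amount of rainfall was {rainfall} inches.")
```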

### 6. **If-Elif-Else Statement**
- Write a simple conditional statement to check the `Age of roof` for the first 5 rows:
  - If less than 5, print `"New Roof"`.
  - If between 5 and 10, print `"Moderately New Roof"`.
  - Otherwise, print `"Old Roof"`.
- Print the result for the first 5 rows.
- *Hint*: the `if-elif-else` statement is nested under a `for` loop.
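
The nested structure from the hint could be sketched as follows. The ages are made up, and the sketch assumes "between 5 and 10" includes both endpoints; check that reading against your instructor's intent. The `labels` list is an extra added here so the results can be inspected afterwards.

```python
import pandas as pd

# Invented roof ages, in years.
df = pd.DataFrame({"Age of roof": [3, 5, 10, 12, 7, 20]})

labels = []
for age in df["Age of roof"].head(5):  # first 5 rows only
    if age < 5:
        label = "New Roof"
    elif age <= 10:  # assumption: 5 through 10 inclusive
        label = "Moderately New Roof"
    else:
        label = "Old Roof"
    print(label)
    labels.append(label)
```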

### 7. **Filter Invalid Records:**
   - Use pandas to remove rows where the `Policy Number` is missing (`null`).
   - How many roof claim records remain after applying this filter?
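
A minimal sketch of this filter on toy data: `dropna` with `subset=` removes only the rows where the named column is missing, and `len` gives the remaining record count. The policy numbers are invented.

```python
import pandas as pd

# Invented policy numbers; None marks a missing value.
df = pd.DataFrame({
    "Policy Number": ["P-001", None, "P-003", None],
    "City": ["Sycamore", "DeKalb", "Shabbona", "Sycamore"],
})

valid = df.dropna(subset=["Policy Number"])
print(len(valid))  # number of records that remain
```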

### 8. **Data Cleaning:**
   - Use pandas to replace `null` values with `0` in the `Estimated cost to repair` and `Estimated cost to replace` columns. For each claim, only one of these two columns will have data, depending on the adjuster's recommendation.
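
One way to do this, sketched on made-up costs: restricting `fillna(0)` to the two cost columns leaves missing values in any other column untouched.

```python
import pandas as pd

# Invented costs; each claim has a value in only one of the two columns.
df = pd.DataFrame({
    "Estimated cost to repair": [1200.0, None, 800.0],
    "Estimated cost to replace": [None, 15000.0, None],
})

cost_cols = ["Estimated cost to repair", "Estimated cost to replace"]
df[cost_cols] = df[cost_cols].fillna(0)
print(df)
```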

### 9. **Identify Duplicate Claims:**
   - Use pandas to identify whether there are any duplicate claims in your dataset based on `House/Apartment Number`, `Street Address`, `City`, and `Zip Code`.
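
A sketch of the duplicate check on invented addresses. Passing `keep=False` to `duplicated` flags every row in a duplicated group rather than only the later copies; that is a choice made here so all matching claims can be reviewed side by side, not something the assignment requires.

```python
import pandas as pd

# Invented addresses; rows 0 and 2 share the same address.
df = pd.DataFrame({
    "House/Apartment Number": [101, 202, 101],
    "Street Address": ["Main St", "Oak Ave", "Main St"],
    "City": ["Sycamore", "DeKalb", "Sycamore"],
    "Zip Code": [60178, 60115, 60178],
})

address_cols = ["House/Apartment Number", "Street Address", "City", "Zip Code"]
is_duplicate = df.duplicated(subset=address_cols, keep=False)

print(is_duplicate.any())  # True if at least one address repeats
print(df[is_duplicate])    # the rows involved
```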

### 10. **Export Data**
   - Export the cleaned dataset to a new `CSV` file.
   - Example:
     ```python
     # Assumes the cleaned data is still in the DataFrame named df from step 1.
     df.to_csv('cleaned_roof_insurance_claims.csv', index=False)
     ```

---

## **Deliverables**
1. Submit the link to your Google Colab notebook in the assignment area in Canvas.
2. Include comments in your code to explain each step.

## **Tips**
- Use small chunks of code and inspect your dataset frequently.
- Handle missing and invalid data systematically to maintain data integrity.

Good luck! 🚀

## Submission
- Submit your completed Colab notebook with all code cells executed.
- Ensure your notebook includes helpful explanations (as Markdown cells) for each step.