## Problem Statement
The aim is to provide a report of the following indicators: Gross Written Premium cancelled and Gross Written Premium booked. Both indicators will have to be broken down by month and line of business (LOB).

Note that this report should run automatically at the end of every calendar month. The code should not need to be told which month to update: at each month end, the report must show the current year's figures (up to the current month) and last year's figures for the same period.
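A minimal sketch of a month-end guard, assuming the notebook is triggered daily by some external scheduler (the scheduler itself is not specified in this report). The helper name `is_month_end` is hypothetical and is not used elsewhere in the notebook.

```python
import calendar
from datetime import date

def is_month_end(day: date) -> bool:
    """Return True when `day` is the last calendar day of its month."""
    # monthrange returns (weekday of the 1st, number of days in the month)
    last_day = calendar.monthrange(day.year, day.month)[1]
    return day.day == last_day

# Example: only build the report on the last day of the month.
if is_month_end(date.today()):
    print('Month end: run the report')
```

Because the year and month are derived from the run date, nothing in the code has to be edited from one month to the next.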

Summary:  
- 3 months' worth of transactions  
- 2 types of LOB: Travel and PA
- ENDO and CANC are the only transaction types with negative GWP
- Information utilised will be:
    - Policy table:
        - Policy Number
        - LOB
    - Premium Table:
        - Transaction Type
        - GWP
        - YrM

Deductions and Assumptions:  
- GWP Cancelled will be the SUM of GWP where TRANTYPE = CANC, grouped by LOB and by Month
- GWP Booked will be the SUM of GWP where TRANTYPE != CANC, grouped by LOB and by Month
- Together, these two indicators use all of the available data
- All values are assumed to be correct, with no nulls
- Year and Month will be extracted from YrM, the transaction year-month
- D_tran, D_eff, D_com, D_exp are unnecessary for this analysis
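The "correct and no Null" assumption can be checked before any aggregation. The sketch below is one possible check, not part of the original workflow: the function name `check_assumptions` is hypothetical, the set of transaction types is the one that appears in this notebook's TRANTYPE mapping, and the sample rows are made up for illustration.

```python
import pandas as pd

# Transaction types that appear in this notebook's TRANTYPE mapping.
KNOWN_TRANTYPES = {'CANC', 'ENDO', 'RNWL', 'NWBS', 'REIN'}

def check_assumptions(premium: pd.DataFrame) -> list:
    """Return a list of problems found; an empty list means the assumptions hold."""
    problems = []
    # Assumption: no nulls in the columns used by the report.
    for col in ['YrM', 'TRANTYPE', 'GWP']:
        n_null = int(premium[col].isna().sum())
        if n_null:
            problems.append(f'{col}: {n_null} null value(s)')
    # Assumption: every transaction type is one we know how to classify.
    unknown = set(premium['TRANTYPE'].dropna()) - KNOWN_TRANTYPES
    if unknown:
        problems.append(f'unexpected TRANTYPE value(s): {sorted(unknown)}')
    return problems

# Toy data for illustration only, not taken from the workbook.
sample = pd.DataFrame({
    'YrM': [202001, 202001, 202002],
    'TRANTYPE': ['NWBS', 'CANC', 'XXXX'],
    'GWP': [100.0, -40.0, None],
})
print(check_assumptions(sample))
```

Running a check like this on the real PREMIUM sheet would turn the stated assumptions into something that fails loudly if the data changes.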

Table will be in the below format:

year|month|LOB|GWP Cancelled|GWP Booked
---|---|---|---|---

In [1]:
#importing libraries
import pandas as pd
import numpy as np
from datetime import datetime

In [2]:
#acquiring the datetime now
now = datetime.now()
#acquiring the year now
year_now = now.year
#acquiring the month now
mth_now = now.month
#subtracting by one to get last year
last_year = year_now - 1

2019
2020
9


In [3]:
#since the date values are in int
#we want to manipulate them so that they represent YrM from the start of the year to the current month
#for both this year and last year
year_now_start = year_now * 100 + 1
last_year_start = last_year * 100 + 1
year_now_end = year_now * 100 + mth_now
last_year_end = last_year * 100 + mth_now

202001
201901
202009
201909
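The same YrM arithmetic can be wrapped in a small function so that it can be tried against any run date. This is only a sketch: `yrm_window` is a hypothetical name, and it assumes YrM is stored as a YYYYMM integer, as in the cells above.

```python
from datetime import date

def yrm_window(today: date):
    """Return ((start, end) for this year, (start, end) for last year) as YYYYMM integers."""
    this_year = (today.year * 100 + 1, today.year * 100 + today.month)
    last_year = ((today.year - 1) * 100 + 1, (today.year - 1) * 100 + today.month)
    return this_year, last_year

print(yrm_window(date(2020, 9, 30)))
# ((202001, 202009), (201901, 201909))
```

Passing the date in as an argument, instead of calling `datetime.now()` inside, makes it easy to reproduce a past month's report.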


In [4]:
##importing the xlsx and reading the sheets into dataframes
xls = pd.ExcelFile('PREMIUM_POLICY_DATA.xlsx')
policy = pd.read_excel(xls, 'POLICY')
premium = pd.read_excel(xls, 'PREMIUM')

In [5]:
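#acquiring the lob values from the policy table
#(assignment aligns on the row index, so this assumes both sheets list the same policies in the same order)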
premium['lob'] = policy['lob']
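If the two sheets are not guaranteed to be in the same row order, a key-based join is safer than assigning the column directly. The sketch below uses toy tables and assumes both sheets carry a policy number column; the name `POLICY_NO` is hypothetical and the real column name may differ.

```python
import pandas as pd

# Toy tables; `POLICY_NO` is a hypothetical key name.
policy = pd.DataFrame({
    'POLICY_NO': ['P1', 'P2'],
    'lob': ['017-Travel', '018-Personal Accident'],
})
premium = pd.DataFrame({
    'POLICY_NO': ['P2', 'P1', 'P2'],
    'TRANTYPE': ['NWBS', 'NWBS', 'CANC'],
    'GWP': [200.0, 100.0, -50.0],
    'YrM': [202001, 202001, 202002],
})

# Left join keeps every premium row and looks up its LOB by key, not by row position.
# validate='many_to_one' raises if a policy number is duplicated in the policy table.
premium = premium.merge(policy[['POLICY_NO', 'lob']], on='POLICY_NO',
                        how='left', validate='many_to_one')
print(premium['lob'].tolist())
# ['018-Personal Accident', '017-Travel', '018-Personal Accident']
```

A left join also copes with several premium transactions per policy, which a positional assignment cannot do.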

In [6]:
#acquiring only the important columns
data = premium[['YrM','TRANTYPE', 'GWP', 'lob']]

In [8]:
data = data.copy()

In [9]:
#Creating a separate column to wrangle TRANTYPE
data['TXN'] = data['TRANTYPE'].copy()

In [11]:
#Replacing all non CANC transaction type as BOOKED
data['TXN'] = data['TXN'].replace({'ENDO':'BOOKED', 'RNWL':'BOOKED', 'NWBS':'BOOKED', 'REIN':'BOOKED'})
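Since GWP Booked is defined as everything where TRANTYPE != CANC, the same classification can be written as a single condition. This is an alternative sketch using `numpy.where` on toy data; unlike the explicit mapping above, it also labels any transaction type not listed there as BOOKED.

```python
import numpy as np
import pandas as pd

# Toy rows for illustration only.
data = pd.DataFrame({'TRANTYPE': ['NWBS', 'CANC', 'ENDO', 'REIN', 'RNWL']})

# CANC stays CANC; every other transaction type becomes BOOKED.
data['TXN'] = np.where(data['TRANTYPE'] == 'CANC', 'CANC', 'BOOKED')
print(data['TXN'].tolist())
# ['BOOKED', 'CANC', 'BOOKED', 'BOOKED', 'BOOKED']
```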

In [15]:
#filtering by the start of the year to current month
data1 = data[(data['YrM']>=year_now_start ) & (data['YrM']<=year_now_end)]

In [16]:
#filtering by the start of last year to current month
data2 = data[(data['YrM']>=last_year_start ) & (data['YrM']<=last_year_end)]

In [17]:
#merging these 2 filtered dataframes
data = pd.concat([data1, data2], axis=0, sort=False)
#drop=True discards the old index instead of keeping it as an extra column
data.reset_index(drop=True, inplace=True)

In [19]:
#creating an empty dataframe for our final table
final_df = pd.DataFrame(columns= ['Year', 'Month','YrM', 'LOB', 'GWP Cancelled', 'GWP Booked'])

In [20]:
#listing unique YrM values
yearmonth = data['YrM'].unique().tolist()

[202001, 202002]

In [21]:
#listing unique LOB values
lobiz = data['lob'].unique().tolist()

['017-Travel', '018-Personal Accident']

In [22]:
#entries for YrM
YrM = sorted(yearmonth*len(lobiz))

In [24]:
#entries for LOB
LOB = lobiz*len(yearmonth)

['017-Travel', '018-Personal Accident', '017-Travel', '018-Personal Accident']
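The two repeated lists above form every (YrM, LOB) combination. As a sketch of an alternative, `pd.MultiIndex.from_product` builds the same grid in one step; the names `grid` and `scaffold` are illustrative and do not appear elsewhere in the notebook.

```python
import pandas as pd

yearmonth = [202001, 202002]
lobiz = ['017-Travel', '018-Personal Accident']

# Every (YrM, LOB) combination; sorting yearmonth first keeps the months in order.
grid = pd.MultiIndex.from_product([sorted(yearmonth), lobiz], names=['YrM', 'LOB'])

# Turn the index levels into ordinary columns.
scaffold = grid.to_frame(index=False)
print(scaffold)
```

This keeps the pairing of months and lines of business explicit, so the two columns cannot drift out of step.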

In [25]:
#assigning the values in our YrM and LOB columns
final_df['YrM'] = YrM
final_df['LOB'] = LOB

In [26]:
#splitting YrM into separate Year and Month string columns
#(vectorised string slicing, no row-by-row assignment needed)
final_df['Year'] = final_df['YrM'].astype(str).str[0:4]
final_df['Month'] = final_df['YrM'].astype(str).str[4:]


In [27]:
#filling our null fields with a 0.0 float
final_df.fillna(value=0.0, inplace=True)

In [29]:
#defining function
def gwp_calc(row):
    year = str(row['YrM'])[0:4]  #getting the year from each row
    month = str(row['YrM'])[4:]  #getting the month from each row
    lob = row['lob']  #getting lob from each row
    gwp_canc = 0  #empty variable for GWP Cancelled
    gwp_booked = 0  #empty variable for GWP Booked
    
    if row['TXN']== 'CANC':  #condition for filter
        gwp_canc = row['GWP']  #assigning value
    elif row['TXN']== 'BOOKED':  #condition for filter
        gwp_booked = row['GWP']  #assigning value
    
    #selecting the final_df row that matches this year, month and LOB
    mask = (final_df["Year"]==year) & (final_df["Month"]==month) & (final_df["LOB"]==lob)

    #adding this transaction's GWP to the running totals of the matching row
    final_df.loc[mask, "GWP Cancelled"] += gwp_canc
    final_df.loc[mask, "GWP Booked"] += gwp_booked


In [31]:
#applying the function for every row in the data
for i in range(len(data)):
    gwp_calc(data.loc[i])
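The row-by-row accumulation above can also be expressed as a single aggregation. The sketch below is an alternative on toy data, not the method used in this notebook: it pivots TXN into columns with `pivot_table` and then renames them to the report headings. Here Year and Month come out as integers rather than the strings used above.

```python
import pandas as pd

# Toy rows in the shape of `data` once the TXN column has been built.
data = pd.DataFrame({
    'YrM': [202001, 202001, 202001, 202002],
    'lob': ['017-Travel', '017-Travel', '018-Personal Accident', '017-Travel'],
    'TXN': ['BOOKED', 'CANC', 'BOOKED', 'BOOKED'],
    'GWP': [100.0, -40.0, 250.0, 80.0],
})

summary = (
    # Sum GWP per (YrM, lob), with one column per TXN value.
    data.pivot_table(index=['YrM', 'lob'], columns='TXN', values='GWP',
                     aggfunc='sum', fill_value=0.0)
        # Guarantee both columns exist even if one TXN value never occurs.
        .reindex(columns=['CANC', 'BOOKED'], fill_value=0.0)
        .rename(columns={'CANC': 'GWP Cancelled', 'BOOKED': 'GWP Booked'})
        .reset_index()
)

# Split the YYYYMM integer into year and month.
summary['Year'] = summary['YrM'] // 100
summary['Month'] = summary['YrM'] % 100
summary = summary.rename(columns={'lob': 'LOB'})[
    ['Year', 'Month', 'LOB', 'GWP Cancelled', 'GWP Booked']]
print(summary)
```

One difference to keep in mind: a (YrM, LOB) combination with no transactions at all does not appear in the pivot, whereas the pre-built `final_df` shows it with zeros.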

In [39]:
final_df = final_df[['Year', 'Month', 'LOB', 'GWP Cancelled', 'GWP Booked']]

In [40]:
#saving the final table to csv for management perusal
final_df.to_csv('report.csv', index = False)