# Cleaning up inactive customers and old products from the loyalty program partner database

#### Problem Statement: Finding inactive, old, or ineligible credit card rewards products in the Bank's loyalty program partner database
The loyalty program partner provides a variety of products, offerings, and flexible redemption options to its client (the Bank) and to the Bank's customers and employees. Redemption options that fit everyone's needs help the Bank retain employees and customers. However, this comes at a cost: the Bank is charged for every customer held in the partner database. To reduce that cost, the inactive, old, or ineligible credit cards in the Bank's loyalty program partner database need to be identified.

##### Datasets used as part of this process
1. Credit card customer database (Excel sheet)
2. Reward points available for credit card customers, from the rewards vendor database (Excel sheet)

Note: For PCI compliance, banks do not share customer data or make it public, so the datasets used in this analysis are dummy datasets and contain no actual customer data. Because the real datasets are large, the data used here is limited to 500 customers.

##### Details about the datasets
1. The rewards partner (a third-party vendor) is not authorized to store credit card numbers, and the Bank is not supposed to share them with the partner.
2. Only the customer ID can be passed to the rewards vendor.
3. The rewards partner database holds Customer ID, Rewards ID, Rewards Points Total, and Redeemed Points, along with the rewards status (Active (A), Block (B), or Closed (C)) and many other fields that the Bank cannot access.
4. Each customer enrolled with the Bank is assigned a customer ID.
5. Each customer can hold several credit cards or debit cards, depending on eligibility and the bank offers availed. This data is available in the Bank's customer database.
6. Each credit card or debit card has a Rewards ID and a Customer ID; together they uniquely identify a row in the table.
7. The same Rewards ID and Customer ID combination is available in the rewards partner's database.
8. The Bank does not have access to the third-party database, and the third party does not have access to the Bank's customer database.
9. Customer transactions are sent to the rewards partner.
10. The rewards partner reviews the transactions and calculates points based on the loyalty program and the customer's eligibility for rewards products. Redemptions are also applied, and the total points change accordingly.
11. The rewards partner shares only limited information back to the Bank: fields such as Customer ID, Rewards ID, Rewards Points Total, Redeemed Points, and the rewards status (Active (A), Block (B), or Closed (C)).
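
The record layout described in this list can be sanity-checked with a small sketch. The column names follow the rewards extract used later in this notebook; the rows are invented for illustration, and the helper `check_rewards_extract` is hypothetical, not part of the original analysis.

```python
import pandas as pd

VALID_STATUSES = {"A", "B", "C"}  # Active, Block, Closed

def check_rewards_extract(df):
    """Return a list of problems found in a rewards extract (empty if none)."""
    problems = []
    # The (Reward_ID, Cust_ID) pair is expected to identify exactly one card.
    if df.duplicated(subset=["Reward_ID", "Cust_ID"]).any():
        problems.append("duplicate Reward_ID/Cust_ID pair")
    # Only the three documented status codes should appear.
    if not set(df["Rewards Status"]).issubset(VALID_STATUSES):
        problems.append("unknown rewards status")
    return problems

# Invented rows: one customer holding two cards, another holding one.
sample = pd.DataFrame({
    "Reward_ID": [1001, 1002, 1003],
    "Cust_ID": ["CC00001", "CC00001", "CC00002"],
    "Rewards Status": ["A", "B", "C"],
})
print(check_rewards_extract(sample))
```

Running a check like this before any filtering would flag extracts that break the uniqueness assumption the later merges rely on.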

##### Reading CSV File into pandas DataFrame

In [3]:
pip install pandas

Note: you may need to restart the kernel to use updated packages.


In [4]:
pip install -U pandasql

Note: you may need to restart the kernel to use updated packages.


In [6]:
import pandas as pd
import pandasql as ps

In [19]:
#import the rewards CSV file as a DataFrame
rewards_dataset = pd.read_csv('Rewards4.csv')

In [20]:
#print the first 5 records to see the structure of the dataset
rewards_dataset.head()                 

Unnamed: 0,Reward_ID,Card_Family,Cust_ID,Reward Points Total,Rewards Point Available,Rewards Status,Redeemed Points
0,18196,Premium,CC67088,2257,1616,A,641
1,31515,Gold,CC12076,2810,2195,A,615
2,13530,Premium,CC97173,6397,5630,A,767
3,25706,Gold,CC55858,2144,1579,A,565
4,98219,Platinum,CC90518,9360,8959,A,401
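
In the five rows shown, the available balance equals the total minus the redeemed points (for example, 2257 - 641 = 1616). A minimal sketch of that consistency check follows; it assumes the same relationship is meant to hold for every row, which the notebook does not state explicitly.

```python
import pandas as pd

# The five rows displayed by rewards_dataset.head() above.
head_rows = pd.DataFrame({
    "Reward_ID": [18196, 31515, 13530, 25706, 98219],
    "Reward Points Total": [2257, 2810, 6397, 2144, 9360],
    "Rewards Point Available": [1616, 2195, 5630, 1579, 8959],
    "Redeemed Points": [641, 615, 767, 565, 401],
})

# Expected balance: total points minus redeemed points.
expected = head_rows["Reward Points Total"] - head_rows["Redeemed Points"]

# Rows where the stored available balance disagrees with the expected one.
mismatched = head_rows[expected != head_rows["Rewards Point Available"]]
print(len(mismatched))
```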


In [24]:
#re-read the file with only a few columns and print them
rewards_dataset = pd.read_csv('Rewards4.csv', usecols=['Reward_ID', 'Card_Family','Cust_ID','Rewards Status'])
print(rewards_dataset)

     Reward_ID Card_Family  Cust_ID Rewards Status
0        18196     Premium  CC67088              A
1        31515        Gold  CC12076              A
2        13530     Premium  CC97173              A
3        25706        Gold  CC55858              A
4        98219    Platinum  CC90518              A
..         ...         ...      ...            ...
495      16022     Premium  CC64993              A
496       2925        Gold  CC26787              A
497      73899     Premium  CC32532              A
498      75046     Premium  CC90246              A
499      87718        Gold  CC37803              A

[500 rows x 4 columns]


In [47]:
#find the cards with closed rewards status
closed = rewards_dataset.loc[(rewards_dataset['Rewards Status'] == 'C')] 
closed

Unnamed: 0,Reward_ID,Card_Family,Cust_ID,Reward Points Total,Rewards Point Available,Rewards Status,Redeemed Points
215,18084,Gold,CC16029,3548,2593,C,955
232,93624,Premium,CC18795,3573,3145,C,428
233,83377,Premium,CC99130,2323,1932,C,391
234,31891,Platinum,CC15310,8575,8100,C,475
235,8539,Premium,CC88046,2284,1467,C,817
236,1188,Gold,CC95099,9979,9265,C,714
237,34210,Premium,CC91963,8936,8747,C,189
240,39052,Premium,CC68522,8570,7784,C,786
241,44891,Gold,CC13173,5631,5186,C,445
242,26053,Gold,CC85400,9684,9364,C,320


In [51]:
#find the cards with blocked rewards status
blocked = rewards_dataset.loc[(rewards_dataset['Rewards Status'] == 'B')] 
blocked

Unnamed: 0,Reward_ID,Card_Family,Cust_ID,Reward Points Total,Rewards Point Available,Rewards Status,Redeemed Points
190,19719,Platinum,CC91963,5220,4463,B,757
199,5747,Gold,CC28847,9784,9151,B,633
210,45918,Platinum,CC43841,5529,4994,B,535
216,69001,Platinum,CC82406,4479,3646,B,833
217,84862,Premium,CC19277,3372,2664,B,708
218,85128,Premium,CC86189,5390,5232,B,158
219,46228,Gold,CC59641,7872,7616,B,256
220,85963,Gold,CC93075,5615,4915,B,700
221,19467,Premium,CC34817,5660,4898,B,762
222,42551,Premium,CC16420,7621,7324,B,297


In [53]:
#find the total number of closed cards
len(closed)

29

In [54]:
#find the number of blocked cards
len(blocked)

36

In [70]:
#find the blocked cards that belong to the Gold card family
#Gold Card_Family is a rewards product that has been discontinued by the Bank
#build the mask from `blocked` itself so the mask and the frame share the same index
Goldblocked = blocked.loc[blocked['Card_Family'] == 'Gold']
Goldblocked


Unnamed: 0,Reward_ID,Card_Family,Cust_ID,Reward Points Total,Rewards Point Available,Rewards Status,Redeemed Points
199,5747,Gold,CC28847,9784,9151,B,633
219,46228,Gold,CC59641,7872,7616,B,256
220,85963,Gold,CC93075,5615,4915,B,700
226,18577,Gold,CC80950,8251,8024,B,227
258,99734,Gold,CC50124,7790,7634,B,156
270,53251,Gold,CC46077,5640,4864,B,776
271,33387,Gold,CC68752,6963,6523,B,440
287,12028,Gold,CC39362,9530,8956,B,574


In [71]:
#find the number of Gold blocked cards
len(Goldblocked)

8

In [72]:
#find the closed cards that belong to the Gold card family
#Gold Card_Family is a rewards product that has been discontinued by the Bank
#build the mask from `closed` itself so the mask and the frame share the same index
Goldclosed = closed.loc[closed['Card_Family'] == 'Gold']
Goldclosed


Unnamed: 0,Reward_ID,Card_Family,Cust_ID,Reward Points Total,Rewards Point Available,Rewards Status,Redeemed Points
215,18084,Gold,CC16029,3548,2593,C,955
236,1188,Gold,CC95099,9979,9265,C,714
241,44891,Gold,CC13173,5631,5186,C,445
242,26053,Gold,CC85400,9684,9364,C,320
243,39968,Gold,CC29028,4138,3261,C,877
249,81250,Gold,CC67129,7156,6821,C,335
250,42086,Gold,CC36447,9328,8656,C,672
304,5513,Gold,CC66648,5297,4493,C,804
318,81318,Gold,CC48782,6472,5656,C,816
408,27199,Gold,CC76008,5566,4945,C,621


In [73]:
#find the number of Gold closed cards
len(Goldclosed)

12
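
The separate filters and `len` calls above can also be summarized in a single table. The sketch below uses `pd.crosstab` on invented rows with the same two column names; it is an alternative view, not the method used in this notebook.

```python
import pandas as pd

# Invented rows; the notebook reads the real values from Rewards4.csv.
sample = pd.DataFrame({
    "Card_Family": ["Gold", "Gold", "Premium", "Platinum", "Gold"],
    "Rewards Status": ["B", "C", "C", "B", "C"],
})

# Rows are card families, columns are status codes, cells are card counts.
summary = pd.crosstab(sample["Card_Family"], sample["Rewards Status"])
print(summary)
```

Reading the `Gold` row of such a table gives the blocked and closed counts for the discontinued product in one step.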

##### Inference for credit cards with rewards status "C" (closed):
The Gold card family is a rewards product that the Bank has discontinued, so all Gold cards that are already closed can be removed from the rewards partner database.
The analysis above shows 12 credit cards that are closed and belong to the Gold card family. These results have to be reported to the Line of Business (LOB), which notifies the rewards partner to delete the cards from its database.

##### Inference for credit cards with rewards status "B" (blocked):
The Gold card family is a rewards product that the Bank has discontinued, so Gold cards that are already blocked can be removed from the rewards partner database, provided the reason for the block is identified.
A card may be blocked because it was misplaced, lost or stolen, or involved in fraud.
1. If the card was misplaced, the customer might get it unblocked and the rewards status might change later, so these cards should not be cleaned from the rewards vendor database.
2. If the card was lost or stolen, the customer might already have a replacement card. This needs to be verified before the card is cleaned from the rewards vendor database.
3. If the reason is fraud, the cards are to be identified and reported to the LOB so that, after approval, they can be removed from the rewards vendor database.
4. Out of scope: any reason other than misplaced, lost/stolen, or fraud needs to be studied separately and is not considered in this project.

The analysis above shows 8 credit cards that are blocked and belong to the Gold card family. These results have to be reported to the Line of Business, which notifies the rewards partner to delete the cards from its database.
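
The rules for blocked cards can be written down as a small lookup. This is only a sketch: the action strings and the `action_for` helper are hypothetical, and the `"Misplaced"` label is an assumption, since the dummy data shown in this notebook only contains `Fraud`, `Lost`, `Stolen`, and `Good` as reasons.

```python
# Hypothetical mapping from the Bank's Reason value to the follow-up action
# described in the numbered list above.
ACTION_BY_REASON = {
    "Misplaced": "keep",  # the card may be unblocked later
    "Lost": "verify replacement, then remove",
    "Stolen": "verify replacement, then remove",
    "Fraud": "report to LOB, remove after approval",
}

def action_for(reason):
    """Return the follow-up action for a blocked card, or 'out of scope'."""
    return ACTION_BY_REASON.get(reason, "out of scope")

for reason in ["Fraud", "Lost", "Stolen", "Misplaced", "Expired"]:
    print(reason, "->", action_for(reason))
```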

In [115]:
#import the customer database CSV file as a DataFrame
cardbase_dataset = pd.read_csv('Cardbase.csv')

In [116]:
cardbase_dataset.head()

Unnamed: 0,Card_Number,Card_Family,Cust_ID,Account Status,Reason_Code,Reason
0,8638-5407-3631-8196,Platinum,CC67088,A,G,Good
1,7106-4239-7093-1515,Gold,CC12076,A,G,Good
2,6492-5655-8241-3530,Premium,CC97173,A,G,Good
3,2868-5606-5152-5706,Gold,CC55858,A,G,Good
4,1438-6906-2509-8219,Platinum,CC90518,A,G,Good


In [117]:
cdgoldclosed = cardbase_dataset.merge(Goldclosed, on='Cust_ID', how='inner')
cdgoldclosed

Unnamed: 0,Card_Number,Card_Family_x,Cust_ID,Account Status,Reason_Code,Reason,Reward_ID,Card_Family_y,Reward Points Total,Rewards Point Available,Rewards Status,Redeemed Points
0,9604-6821-2861-8084,Gold,CC16029,C,F,Fraud,18084,Gold,3548,2593,C,955
1,4138-5166-7490-1188,Gold,CC95099,C,F,Fraud,1188,Gold,9979,9265,C,714
2,9162-3465-6654-4891,Gold,CC13173,C,F,Fraud,44891,Gold,5631,5186,C,445
3,7815-2405-2962-6053,Gold,CC85400,C,F,Fraud,26053,Gold,9684,9364,C,320
4,7047-9622-9693-9968,Gold,CC29028,C,F,Fraud,39968,Gold,4138,3261,C,877
5,4530-1687-6778-1250,Gold,CC67129,C,S,Stolen,81250,Gold,7156,6821,C,335
6,4973-1293-1664-2086,Gold,CC36447,C,S,Stolen,42086,Gold,9328,8656,C,672
7,5811-4353-3490-5513,Gold,CC66648,C,S,Stolen,5513,Gold,5297,4493,C,804
8,6765-2732-8888-1318,Gold,CC48782,C,S,Stolen,81318,Gold,6472,5656,C,816
9,3295-6390-4452-7199,Gold,CC76008,C,S,Stolen,27199,Gold,5566,4945,C,621


In [118]:
len(cdgoldclosed)

12
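
The merge above joins on `Cust_ID` alone. Since a customer can hold several cards, a repeated `Cust_ID` on either side would multiply rows in the result. One way to guard against that is the `validate` argument of `DataFrame.merge`, sketched here on invented data.

```python
import pandas as pd

# Invented data: customer CC00002 holds two cards in the bank's table.
cards = pd.DataFrame({
    "Cust_ID": ["CC00001", "CC00002", "CC00002"],
    "Card_Family": ["Gold", "Gold", "Premium"],
})
rewards = pd.DataFrame({
    "Cust_ID": ["CC00001", "CC00002"],
    "Reward_ID": [1001, 1002],
})

try:
    # validate="one_to_one" raises MergeError if Cust_ID repeats on either side.
    merged = cards.merge(rewards, on="Cust_ID", how="inner", validate="one_to_one")
    duplicate_keys = False
except pd.errors.MergeError:
    duplicate_keys = True

print(duplicate_keys)
```

If the check fails on real data, the join key would need to be narrowed, for example by filtering both sides to the same card family first.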

In [119]:
#Verify that the cards with closed rewards status in the partner database are also closed in the Bank's database
closed1 = cdgoldclosed.loc[(cdgoldclosed['Account Status'] == 'C') & (cdgoldclosed['Rewards Status'] == 'C')] 
closed1

Unnamed: 0,Card_Number,Card_Family_x,Cust_ID,Account Status,Reason_Code,Reason,Reward_ID,Card_Family_y,Reward Points Total,Rewards Point Available,Rewards Status,Redeemed Points
0,9604-6821-2861-8084,Gold,CC16029,C,F,Fraud,18084,Gold,3548,2593,C,955
1,4138-5166-7490-1188,Gold,CC95099,C,F,Fraud,1188,Gold,9979,9265,C,714
2,9162-3465-6654-4891,Gold,CC13173,C,F,Fraud,44891,Gold,5631,5186,C,445
3,7815-2405-2962-6053,Gold,CC85400,C,F,Fraud,26053,Gold,9684,9364,C,320
4,7047-9622-9693-9968,Gold,CC29028,C,F,Fraud,39968,Gold,4138,3261,C,877
5,4530-1687-6778-1250,Gold,CC67129,C,S,Stolen,81250,Gold,7156,6821,C,335
6,4973-1293-1664-2086,Gold,CC36447,C,S,Stolen,42086,Gold,9328,8656,C,672
7,5811-4353-3490-5513,Gold,CC66648,C,S,Stolen,5513,Gold,5297,4493,C,804
8,6765-2732-8888-1318,Gold,CC48782,C,S,Stolen,81318,Gold,6472,5656,C,816
9,3295-6390-4452-7199,Gold,CC76008,C,S,Stolen,27199,Gold,5566,4945,C,621


In [120]:
len(closed1)


12

##### Inference:
The 12 cards that are closed in the loyalty partner database are also closed in the Bank's customer database. These cards are to be reported to the LOB for approval and then shared with the loyalty program partner so that they can be deleted from the partner database.
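
The comparison here rests on the two row counts being equal. An inner merge followed by a filter drops non-matching rows silently, so a sketch that lists them explicitly may be useful. It uses the `indicator` option of `DataFrame.merge` on invented data; the variable names are illustrative.

```python
import pandas as pd

# Invented data: CC00002 is still active at the bank, CC00003 is unknown to it.
bank = pd.DataFrame({
    "Cust_ID": ["CC00001", "CC00002"],
    "Account Status": ["C", "A"],
})
partner_closed = pd.DataFrame({
    "Cust_ID": ["CC00001", "CC00002", "CC00003"],
    "Rewards Status": ["C", "C", "C"],
})

# A left join keeps every partner row; _merge records whether the bank had a match.
check = partner_closed.merge(bank, on="Cust_ID", how="left", indicator=True)

# Partner rows with no counterpart in the bank's table.
missing_in_bank = check.loc[check["_merge"] == "left_only", "Cust_ID"].tolist()

# Matched rows whose status differs between the two systems.
status_differs = check.loc[
    (check["_merge"] == "both")
    & (check["Account Status"] != check["Rewards Status"]),
    "Cust_ID",
].tolist()

print(missing_in_bank, status_differs)
```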

In [123]:
#print only the columns that need to be sent to the LOB
closedrpt = closed1[['Card_Number','Reward_ID','Cust_ID','Card_Family_x','Rewards Status','Account Status']]
print(closedrpt)

            Card_Number  Reward_ID  Cust_ID Card_Family_x Rewards Status  \
0   9604-6821-2861-8084      18084  CC16029          Gold              C   
1   4138-5166-7490-1188       1188  CC95099          Gold              C   
2   9162-3465-6654-4891      44891  CC13173          Gold              C   
3   7815-2405-2962-6053      26053  CC85400          Gold              C   
4   7047-9622-9693-9968      39968  CC29028          Gold              C   
5   4530-1687-6778-1250      81250  CC67129          Gold              C   
6   4973-1293-1664-2086      42086  CC36447          Gold              C   
7   5811-4353-3490-5513       5513  CC66648          Gold              C   
8   6765-2732-8888-1318      81318  CC48782          Gold              C   
9   3295-6390-4452-7199      27199  CC76008          Gold              C   
10  3697-6001-4909-5350      95350  CC62261          Gold              C   
11  6620-4005-4574-6263      46263  CC95042          Gold              C   

   Account Status  
0               C  
1               C  
2               C  
3               C  
4               C  
5               C  
6               C  
7               C  
8               C  
9               C  
10              C  
11              C  

In [125]:
#Match merge the cardbase dataset with the Gold family blocked cards on customer ID extracted from rewards db
cdgoldblocked = cardbase_dataset.merge(Goldblocked, on='Cust_ID', how='inner')
cdgoldblocked

Unnamed: 0,Card_Number,Card_Family_x,Cust_ID,Account Status,Reason_Code,Reason,Reward_ID,Card_Family_y,Reward Points Total,Rewards Point Available,Rewards Status,Redeemed Points
0,7296-3224-2880-5747,Gold,CC28847,B,F,Fraud,5747,Gold,9784,9151,B,633
1,2158-2612-7934-6228,Gold,CC59641,B,F,Fraud,46228,Gold,7872,7616,B,256
2,1655-7617-4318-5963,Gold,CC93075,B,F,Fraud,85963,Gold,5615,4915,B,700
3,4933-8895-6001-8577,Gold,CC80950,B,L,Lost,18577,Gold,8251,8024,B,227
4,6570-8163-4369-9734,Gold,CC50124,B,S,Stolen,99734,Gold,7790,7634,B,156
5,6876-7378-4945-3251,Gold,CC46077,B,S,Stolen,53251,Gold,5640,4864,B,776
6,1290-5480-3763-3387,Gold,CC68752,B,S,Stolen,33387,Gold,6963,6523,B,440
7,9304-8255-4381-2028,Gold,CC39362,B,S,Stolen,12028,Gold,9530,8956,B,574


In [127]:
#Verify that the cards with blocked rewards status in the partner database are also blocked in the Bank's database
blocked1 = cdgoldblocked.loc[(cdgoldblocked['Account Status'] == 'B') & (cdgoldblocked['Rewards Status'] == 'B')] 
blocked1

Unnamed: 0,Card_Number,Card_Family_x,Cust_ID,Account Status,Reason_Code,Reason,Reward_ID,Card_Family_y,Reward Points Total,Rewards Point Available,Rewards Status,Redeemed Points
0,7296-3224-2880-5747,Gold,CC28847,B,F,Fraud,5747,Gold,9784,9151,B,633
1,2158-2612-7934-6228,Gold,CC59641,B,F,Fraud,46228,Gold,7872,7616,B,256
2,1655-7617-4318-5963,Gold,CC93075,B,F,Fraud,85963,Gold,5615,4915,B,700
3,4933-8895-6001-8577,Gold,CC80950,B,L,Lost,18577,Gold,8251,8024,B,227
4,6570-8163-4369-9734,Gold,CC50124,B,S,Stolen,99734,Gold,7790,7634,B,156
5,6876-7378-4945-3251,Gold,CC46077,B,S,Stolen,53251,Gold,5640,4864,B,776
6,1290-5480-3763-3387,Gold,CC68752,B,S,Stolen,33387,Gold,6963,6523,B,440
7,9304-8255-4381-2028,Gold,CC39362,B,S,Stolen,12028,Gold,9530,8956,B,574


Note: Assuming all of these cards have been replaced (the customer has asked for a replacement card), we choose to clean them up from the rewards vendor database.

In [128]:
#print only the columns that need to be sent to the LOB
blockedrpt = blocked1[['Card_Number','Reward_ID','Cust_ID','Card_Family_x','Rewards Status','Account Status']]
print(blockedrpt)

           Card_Number  Reward_ID  Cust_ID Card_Family_x Rewards Status  \
0  7296-3224-2880-5747       5747  CC28847          Gold              B   
1  2158-2612-7934-6228      46228  CC59641          Gold              B   
2  1655-7617-4318-5963      85963  CC93075          Gold              B   
3  4933-8895-6001-8577      18577  CC80950          Gold              B   
4  6570-8163-4369-9734      99734  CC50124          Gold              B   
5  6876-7378-4945-3251      53251  CC46077          Gold              B   
6  1290-5480-3763-3387      33387  CC68752          Gold              B   
7  9304-8255-4381-2028      12028  CC39362          Gold              B   

  Account Status  
0              B  
1              B  
2              B  
3              B  
4              B  
5              B  
6              B  
7              B  


In [129]:
len(blockedrpt) + len(closedrpt)

20

Number of cards to be cleaned from the rewards vendor database = 20.
Print the final report for LOB review.

In [132]:
Final_rpt = pd.concat([blockedrpt, closedrpt], ignore_index=True)
Final_rpt

Unnamed: 0,Card_Number,Reward_ID,Cust_ID,Card_Family_x,Rewards Status,Account Status
0,7296-3224-2880-5747,5747,CC28847,Gold,B,B
1,2158-2612-7934-6228,46228,CC59641,Gold,B,B
2,1655-7617-4318-5963,85963,CC93075,Gold,B,B
3,4933-8895-6001-8577,18577,CC80950,Gold,B,B
4,6570-8163-4369-9734,99734,CC50124,Gold,B,B
5,6876-7378-4945-3251,53251,CC46077,Gold,B,B
6,1290-5480-3763-3387,33387,CC68752,Gold,B,B
7,9304-8255-4381-2028,12028,CC39362,Gold,B,B
8,9604-6821-2861-8084,18084,CC16029,Gold,C,C
9,4138-5166-7490-1188,1188,CC95099,Gold,C,C


In [134]:
#Export the dataframe to csv
Final_rpt.to_csv('Final_rpt.csv', index=False)
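
The exported report includes `Card_Number`, which suits internal LOB review. Because the rewards partner is not authorized to hold card numbers, any file passed on to the partner would need that column removed. The sketch below shows one way to derive such an extract; the choice of columns is an assumption, and the rows are invented.

```python
import pandas as pd

# Invented rows with the same columns as the LOB report above.
lob_report = pd.DataFrame({
    "Card_Number": ["0000-0000-0000-0001", "0000-0000-0000-0002"],
    "Reward_ID": [1001, 1002],
    "Cust_ID": ["CC00001", "CC00002"],
    "Card_Family_x": ["Gold", "Gold"],
    "Rewards Status": ["B", "C"],
    "Account Status": ["B", "C"],
})

# Keep only identifiers the partner already holds; the card number is dropped.
partner_extract = lob_report[["Cust_ID", "Reward_ID", "Rewards Status"]]

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = partner_extract.to_csv(index=False)
print(csv_text)
```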

##### Conclusion:
After the LOB's review and approval, the loyalty partner is asked to remove these records from its database, so that going forward the Bank is not charged for credit cards that were ineligible (Gold card family) and also blocked or closed.