This notebook adds a QA layer to member onboarding. The person currently responsible for follow-up is Becky Akinyode. The report surfaces prospective members who experienced a payment failure. Some of them go on to record a successful payment, which results in a membership or 6-month trial activation (verified against the New Member report), while others never record an activation. The latter are sent to Becky for follow-up. The relevant CIVI source reports are **'New And Re-Activated Members'** (make sure the date filters are appropriate) and **'Contribution Details - Failures'**.

In [1]:
import pandas as pd
import datetime
import os
import itertools

In [2]:
os.chdir('/home/candela/Documents/greeneHill/membershipReportsCIVI/membership_QA_process')

In [8]:
new_member = pd.read_csv('./newMemberReports_rawCIVI/Report_20240109-1846.csv')
failures = pd.read_csv('./contributionsFailureReports_rawCIVI/Report_20240109-1848.csv')

In [9]:
new_member.columns = [i.replace(' ','_') for i in new_member.columns]
failures.columns = [i.replace(' ','_') for i in failures.columns]

In [10]:
#inspect dtypes first: the date-like fields are still strings (object) and get converted to a date datatype further down
failures.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 42 entries, 0 to 41
Data columns (total 6 columns):
 #   Column          Non-Null Count  Dtype 
---  ------          --------------  ----- 
 0   Donor_Name      42 non-null     object
 1   Donor_Email     42 non-null     object
 2   Donor_Phone     42 non-null     object
 3   Financial_Type  42 non-null     object
 4   Date_Received   42 non-null     object
 5   Amount          42 non-null     object
dtypes: object(6)
memory usage: 2.1+ KB
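The `info()` output above shows that `Amount` is also read as `object`, since the values are strings such as `$ 80.00`. If numeric amounts are ever needed (for totals, say), a minimal sketch of one way to parse them follows. The toy values and the thousands-separator case are assumptions, not taken from the report.

```python
import pandas as pd

# Toy data in the "$ 80.00" style seen in the failures report (illustrative values;
# the "1,250.00" thousands-separator format is an assumption).
amounts = pd.Series(["$ 80.00", "$ 21.43", "$ 1,250.00"])

# Strip the dollar sign, commas, and whitespace, then convert to float.
amount_numeric = pd.to_numeric(amounts.str.replace(r"[$,\s]", "", regex=True))
print(amount_numeric.tolist())
```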


In [22]:
new_member.head()
#pd.to_datetime(new_member['Start_Date'])

Unnamed: 0,Contact_Name,First_Name,Last_Name,Membership_Type,Start_Date,End_Date,Status,Email,Phone
0,"Saffold, Sydney",Sydney,Saffold,Zucchini Plan,2023-12-27,2024-06-26,New,s.saffold91@gmail.com,4785088710
1,"Obrien, Kate",Kate,Obrien,Zucchini Plan,2023-12-14,2024-06-13,New,katemaryobrien@gmail.com,2158721756
2,"Funk, Lindsay",Lindsay,Funk,Zucchini Plan,2023-12-07,2024-06-06,New,lfunk.stanford@gmail.com,4255015012
3,"Graney, Trevor",Trevor,Graney,Zucchini Plan,2023-11-29,2024-05-28,New,tgraney11@gmail.com,9787717429
4,"Josties, Bettine",Bettine,Josties,Zucchini Plan,2023-11-27,2024-05-26,New,bettine.josties@gmail.com,9293055818


In [12]:
#adjust all the "date" fields simultaneously
#index a list via another bool list
date_fields = list(itertools.compress(new_member.columns,['date' in i.lower() for i in new_member.columns]))
new_member[date_fields] = new_member[date_fields].apply(pd.to_datetime)

date_fields2 = list(itertools.compress(failures.columns,['date' in i.lower() for i in failures.columns]))
failures[date_fields2] = failures[date_fields2].apply(pd.to_datetime)

In [31]:
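#Start_Date holds dates only, while Date_Received is a full timestamp
#normalize() drops the time of day but keeps the datetime64 dtype, so the later comparison with Start_Date stays valid
failures['Date_Received'] = failures['Date_Received'].dt.normalize()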
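The date handling in the cells above can also be sketched on a toy frame. This is an illustrative alternative, not the notebook's own code: it selects the date-like columns by name with `str.contains` and truncates timestamps with `dt.normalize()`. The frame `toy` and its values are made up.

```python
import pandas as pd

# Toy frame standing in for the failures report (illustrative values only).
toy = pd.DataFrame({
    "Donor_Email": ["a@example.com", "b@example.com"],
    "Date_Received": ["2023-12-21 14:05:00", "2023-11-15 09:30:00"],
})

# Pick every column whose name contains "date", case-insensitively.
date_cols = toy.columns[toy.columns.str.contains("date", case=False)]
toy[date_cols] = toy[date_cols].apply(pd.to_datetime)

# normalize() sets the time component to midnight and keeps the datetime64 dtype,
# so the column remains directly comparable with other datetime64 columns.
toy["Date_Received"] = toy["Date_Received"].dt.normalize()
print(toy.dtypes)
```

Keeping the column as `datetime64` rather than Python `date` objects avoids mixed-type comparisons later on.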

Collapse the failures dataset to one row per person and compile metadata for each: donor name, donor email, donor phone, financial type, number of failures, the list of dollar amounts, and the latest failure date.

In [13]:
failures.columns

Index(['Donor_Name', 'Donor_Email', 'Donor_Phone', 'Financial_Type',
       'Date_Received', 'Amount'],
      dtype='object')

In [33]:
#one row per donor and financial type, with the latest failure date, the list of amounts, and the failure count
failures_grouped = (
    failures
    .groupby(['Donor_Email','Donor_Name','Donor_Phone','Financial_Type'])
    .agg(
        latestFailureDate = ('Date_Received','max'),
        amountArray = ('Amount',lambda x: x.tolist()),
        numOfInstances = ('Amount','size'),
    )
    .reset_index()
    .rename(columns = {'Financial_Type':'Financial_Type_meta'})
)
#'dateLastFail','chargesArray','count'

In [34]:
failures_grouped.head()

Unnamed: 0,Donor_Email,Donor_Name,Donor_Phone,Financial_Type_meta,latestFailureDate,amountArray,numOfInstances
0,aaron1215@me.com,"Alcouloumre, Aaron",9496370120,Member Investment,2024-01-08,[$ 80.00],1
1,alexmcqw@yahoo.com,"McQuilkin, Alexander",6507043952,Member Investment,2023-12-06,"[$ 21.43, $ 21.43, $ 21.43, $ 21.43]",4
2,amanda.brianna@me.com,"Escamilla, Amanda",5616282407,Member Investment,2023-11-14,"[$ 14.29, $ 14.29, $ 14.29, $ 50.00]",4
3,amelia.h.clark@gmail.com,"Clark, Amelia",2032169492,Member Investment,2023-12-09,[$ 100.00],1
4,anjblue8@gmail.com,"Krishnakumar, Anjali",3474221871,Member Investment,2023-11-17,[$ 30.00],1
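To make the shape of this per-person summary easier to follow, here is a small self-contained sketch of the same kind of aggregation on made-up data, using pandas named aggregation. The frame `toy_failures` and its values are illustrative assumptions.

```python
import pandas as pd

# Made-up failure records: one donor with two failures, one donor with a single failure.
toy_failures = pd.DataFrame({
    "Donor_Email": ["a@example.com", "a@example.com", "b@example.com"],
    "Donor_Name": ["Doe, Ann", "Doe, Ann", "Roe, Bob"],
    "Date_Received": pd.to_datetime(["2023-11-01", "2023-11-08", "2023-12-09"]),
    "Amount": ["$ 14.29", "$ 50.00", "$ 100.00"],
})

summary = (
    toy_failures
    .groupby(["Donor_Email", "Donor_Name"])
    .agg(
        latestFailureDate=("Date_Received", "max"),    # most recent failure
        amountArray=("Amount", lambda s: s.tolist()),  # every failed amount, in row order
        numOfInstances=("Amount", "size"),             # number of failure rows
    )
    .reset_index()
)
print(summary)
```

Each output column is named directly in the `agg` call, so no follow-up renaming of aggregated columns is needed.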


In [76]:
new_member.head()

Unnamed: 0,Contact_Name,First_Name,Last_Name,Membership_Type,Start_Date,End_Date,Status,Email,Phone
0,"Saffold, Sydney",Sydney,Saffold,Zucchini Plan,2023-12-27,2024-06-26,New,s.saffold91@gmail.com,4785088710
1,"Obrien, Kate",Kate,Obrien,Zucchini Plan,2023-12-14,2024-06-13,New,katemaryobrien@gmail.com,2158721756
2,"Funk, Lindsay",Lindsay,Funk,Zucchini Plan,2023-12-07,2024-06-06,New,lfunk.stanford@gmail.com,4255015012
3,"Graney, Trevor",Trevor,Graney,Zucchini Plan,2023-11-29,2024-05-28,New,tgraney11@gmail.com,9787717429
4,"Josties, Bettine",Bettine,Josties,Zucchini Plan,2023-11-27,2024-05-26,New,bettine.josties@gmail.com,9293055818


In [35]:
merged = failures_grouped.merge(new_member, how = 'left', left_on = 'Donor_Email',right_on = 'Email')
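Because the join key is an email address, differences in letter case or stray whitespace between the two reports would show up as false "no activation" rows. The sketch below shows one possible guard, assuming such differences can occur in the CIVI exports (not verified here). `normalize_email` and `email_key` are hypothetical names introduced for illustration.

```python
import pandas as pd

# Made-up data: the first email differs from the member record only by case and a trailing space.
toy_failures = pd.DataFrame({"Donor_Email": ["Ann@Example.com ", "bob@example.com"]})
toy_members = pd.DataFrame({
    "Email": ["ann@example.com"],
    "Contact_Name": ["Doe, Ann"],
})

def normalize_email(col: pd.Series) -> pd.Series:
    """Hypothetical helper: trim whitespace and lower-case an email column."""
    return col.str.strip().str.lower()

toy_failures["email_key"] = normalize_email(toy_failures["Donor_Email"])
toy_members["email_key"] = normalize_email(toy_members["Email"])

# validate="m:1" raises a MergeError if the member side contains duplicate keys,
# which would otherwise silently duplicate failure rows in the result.
toy_merged = toy_failures.merge(toy_members, how="left", on="email_key", validate="m:1")
print(toy_merged[["Donor_Email", "Contact_Name"]])
```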

In [26]:
merged.columns

Index(['Donor_Email', 'Donor_Name', 'Donor_Phone', 'Financial_Type_meta',
       'latestFailureDate', 'amountArray', 'numOfInstances', 'Contact_Name',
       'First_Name', 'Last_Name', 'Membership_Type', 'Start_Date', 'End_Date',
       'Status', 'Email', 'Phone'],
      dtype='object')

In [30]:
merged.dtypes

Donor_Email                    object
Donor_Name                     object
Donor_Phone                    object
Financial_Type_meta            object
latestFailureDate      datetime64[ns]
amountArray                    object
numOfInstances                  int64
Contact_Name                   object
First_Name                     object
Last_Name                      object
Membership_Type                object
Start_Date             datetime64[ns]
End_Date               datetime64[ns]
Status                         object
Email                          object
Phone                          object
dtype: object

The problematic cases are rows with no (null) data from the new_member dataframe, which indicates that the person was never provisioned an account, and, arguably, rows where the latest failure (the maximum Date_Received) falls after the account creation date ("Start_Date" in new_member).

In [36]:
merged.loc[(merged['Contact_Name'].isnull()) | (merged['latestFailureDate'] > merged['Start_Date']),:]



Unnamed: 0,Donor_Email,Donor_Name,Donor_Phone,Financial_Type_meta,latestFailureDate,amountArray,numOfInstances,Contact_Name,First_Name,Last_Name,Membership_Type,Start_Date,End_Date,Status,Email,Phone
20,kvyas19@yahoo.com,"Vyas, Karishma",3237727664,Donation,2023-12-11,[$ 25.00],1,,,,,NaT,NaT,,,
29,rcawdette.9@gmail.com,"Cawdette, Roger",8577191723,Member Investment,2023-12-21,[$ 50.00],1,"Cawdette, Roger",Roger,Cawdette,Trial Membership,2023-12-19,2024-06-19,New,rcawdette.9@gmail.com,8577191723.0
31,simonewforsberg@gmail.com,"Forsberg, Simone",8186369856,Member Investment,2023-11-15,[$ 100.00],1,,,,,NaT,NaT,,,
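The two conditions used in the filter above can also be recorded as an explicit reason column, which may make the follow-up file easier to read. The following is a minimal sketch on made-up data. The column name `qa_flag` and its labels are assumptions introduced for illustration, while `indicator=True` is the standard pandas merge option that adds a `_merge` column.

```python
import numpy as np
import pandas as pd

# Made-up inputs: "a" never activated, "b" failed after activation, "c" failed before activation.
toy_failures = pd.DataFrame({
    "Donor_Email": ["a@example.com", "b@example.com", "c@example.com"],
    "latestFailureDate": pd.to_datetime(["2023-12-11", "2023-12-21", "2023-11-10"]),
})
toy_members = pd.DataFrame({
    "Email": ["b@example.com", "c@example.com"],
    "Start_Date": pd.to_datetime(["2023-12-19", "2023-11-27"]),
})

# indicator=True adds a "_merge" column showing whether each row found a match.
toy_merged = toy_failures.merge(
    toy_members, how="left", left_on="Donor_Email", right_on="Email", indicator=True
)

no_activation = toy_merged["_merge"] == "left_only"
# Comparisons against NaT evaluate to False, so unmatched rows are not flagged here.
failed_after_start = toy_merged["latestFailureDate"] > toy_merged["Start_Date"]

# The first matching condition wins, so "no_activation" takes priority.
toy_merged["qa_flag"] = np.select(
    [no_activation, failed_after_start],
    ["no_activation", "failure_after_start"],
    default="ok",
)
print(toy_merged[["Donor_Email", "qa_flag"]])
```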


In [None]:

merged.loc[merged['Contact_Name'].isnull(),:].to_csv('/home/candela/Documents/greeneHill/membershipReportsCIVI/membership_QA_process/failures_qa_01072024_Becky.csv',index = False)

In [81]:
merged.to_csv('/home/candela/Documents/greeneHill/membershipReportsCIVI/membership_QA_process/failures_qa_01072024.csv',index = False)