# Trigger Script

The "Trigger Script" is the single script that compiles the slides created by scripts in the "Graph Creation" folder of this repository. 

For copyright reasons, the code has been heavily modified and generalized so that it does not reveal corporate information. However, I hope the logic and planned structure of Capital One's First Party Fraud Monthly Business Report repository still come through.

## Script Outline

The script is organized as follows:

        1. Set-Up (imports, connections, creating variables)
        2. Slide Creation
        3. Emailing the deliverable



## Set-Up Explanation

Before this script can run, several steps must be completed to connect to the data and execute code:

        Running the credentials file
        Running utility scripts
        Installing the Capital One built package pptmaker
        Importing packages
        Creating useful variables

In [2]:
#Step 1, run credentials files to connect to Capital One's Data infrastructure
%run "Users/[EID]/creds"

#If you are cloning this repository you will have to change the above to specify your EID

ERROR:root:File `'Users/[EID]/creds.py'` not found.


In [None]:
#Step 2, run helpful utility scripts that predefine functions used throughout the script
%run "./Utilities/fraud_helper_fx"

In [None]:
%run "./Utilities/MBR_fx"

In [None]:
#Step 3, install Capital One internally created package that can create a .pptx file of graphs/tables
dbutils.library.installPyPi("pptmaker", repo='....')

In [2]:
#Step 4, import packages and create helpful variables

from pptmaker import pptMaker
import pyspark.sql.functions as F
from pyspark.sql import DataFrameStatFunctions as FS
from pyspark.sql.functions import *
from pyspark.sql.types import *
import requests
import pandas as pd
import numpy as np
from datetime import datetime, timedelta
from dateutil.relativedelta import relativedelta
import re
import json
import pytz
import os.path
from pytz import timezone

#name developers and recipients -- change this if you are cloning the repository

dev_email = ['joby.george@capitalone.com']
recipients = ['joby.george@capitalone.com']

#set timezone to US Eastern Time (covers both EST and EDT)
tz = pytz.timezone('America/New_York')




In [None]:
#create useful date variables for monthly reporting
today = datetime.today() #pd.datetime is not available in recent pandas versions
this_month = today.replace(day=1)
#relativedelta shifts by whole calendar months, unlike a fixed 30- or 60-day offset
two_months = this_month + relativedelta(months=2)
one_month = this_month + relativedelta(months=1)
one_month_ago = today + relativedelta(months=-1)
two_months_ago = today + relativedelta(months=-2)
two_years_ago = this_month + relativedelta(years=-2)
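
As a hedged aside on the date arithmetic in this cell: a fixed offset such as `timedelta(days=30)` can skip a calendar month near month-end, whereas shifting by whole months cannot. The sketch below uses only the standard library, a hypothetical helper name (`month_start`), and a fixed reference date chosen for illustration; `relativedelta(months=n)` followed by `.replace(day=1)` should land on the same month.

```python
from datetime import datetime, timedelta


def month_start(reference, months=0):
    """Return the first day of the month `months` away from `reference`."""
    # Count months from year 0 so the shift can cross year boundaries
    total = reference.year * 12 + (reference.month - 1) + months
    year, month_index = divmod(total, 12)
    return datetime(year, month_index + 1, 1)


# Fixed reference date near a month end (an assumption for illustration)
reference = datetime(2021, 1, 31)

# Fixed 30-day offset: 31 Jan + 30 days = 2 Mar, so "next month" lands in March
naive_next = (reference + timedelta(days=30)).replace(day=1)

# Calendar-aware offset: one month after January is February
calendar_next = month_start(reference, months=1)

print(naive_next.date())     # 2021-03-01
print(calendar_next.date())  # 2021-02-01
```

The same helper accepts negative offsets, so `month_start(reference, -2)` gives the first day of the month two months back.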

In [None]:
#set up connection to snowflake so we can access productionized data
snowflake_source_name = "net.snowflake.spark.snowflake"
sfOptions = {
    "sfUrl":"...",
    "sfUser":username, #accessed from running creds file
    "sfPassword":password,#accessed from running creds file
    "sfDatabase":"...",
    "sfSchema":"USER_{}".format(username),
}

#the JVM gateway on a SparkSession is exposed as _jvm
Utils = spark._jvm.net.snowflake.spark.snowflake.Utils
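
For context, connection options like these are normally handed to Spark's DataFrame reader. The sketch below wraps that call in a hypothetical helper, `read_snowflake`; it assumes an active `spark` session with the Snowflake connector installed, and the query in the usage comment is a placeholder rather than a real table.

```python
SNOWFLAKE_SOURCE_NAME = "net.snowflake.spark.snowflake"


def read_snowflake(spark, query, options):
    """Run `query` in Snowflake and return the result as a Spark DataFrame."""
    return (
        spark.read
        .format(SNOWFLAKE_SOURCE_NAME)  # use the Snowflake connector
        .options(**options)             # connection settings such as sfUrl, sfUser
        .option("query", query)         # let Snowflake execute the SQL
        .load()
    )


# Usage inside the notebook (placeholder query, not a real table):
# losses_df = read_snowflake(spark, "SELECT ... FROM ...", sfOptions)
```

Passing `spark` and the options in as arguments keeps the helper free of notebook globals, which makes it easier to reuse from the graph-creation scripts.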



In [None]:
#define the list of values that will result in the script sending out its results (unit test)
unit_test_acpt_values = ['Yes','YES','y', 'yup', '1', 'yes']

In [None]:
#create a powerpoint object that will generate slides in the following commands
ppt = pptMaker.pptMaker()

## Slide Creation

The slide creation process starts with a title slide, highlighting the month of the report, and the author of the report (First Party Fraud Intent). 

From there we run the scripts that are in the graph creation folder.

Note that this is just an example of the trigger script concept using three scripts. 

The First Party Fraud MBR repository runs 13 scripts, pulling from 4 folders rather than a single folder called _Graph Creation_.

The final output is a 50-slide PowerPoint file that makes up the majority of the team's monthly reporting. From there, analysts add commentary on what the data in the slides actually show.

The three scripts used to highlight the concept are:

        Fraud Losses Graphs
        Root Cause
        Defense Performance
        
The three scripts answer pivotal questions for the team:

        1. How has the KPI, fraud losses, changed this month?
        2. Which tactics were driving the fraud losses?
        3. How did the team's fraud defenses, designed to mitigate fraud losses, perform?
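
The overall pattern can be sketched in plain Python. Everything below is a stand-in written for illustration: `Deck` is a hypothetical placeholder for the `pptmaker` object (whose real interface is not shown here), and the three functions mimic the graph-creation notebooks, each adding slides to the one shared deck.

```python
class Deck:
    """Hypothetical stand-in for the shared presentation object."""

    def __init__(self):
        self.slides = []

    def add_slide(self, title):
        self.slides.append(title)


def fraud_losses_graphs(deck):
    deck.add_slide("Fraud losses: monthly trend")


def root_cause(deck):
    deck.add_slide("Root cause: losses by tactic")


def defense_performance(deck):
    deck.add_slide("Defense performance")


def build_report(scripts):
    """Run each script in order against one deck and return the deck."""
    deck = Deck()
    deck.add_slide("Title slide")
    for script in scripts:
        script(deck)
    return deck


deck = build_report([fraud_losses_graphs, root_cause, defense_performance])
print(len(deck.slides))  # 4
```

Because every script writes into the same object, the order of the list is the order of the slides in the final file.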
        
## The Unit Test Concept
        
A powerful feature of Databricks is the ability to pass parameters when running a command ([source](https://forums.databricks.com/questions/176/how-do-i-pass-argumentsvariables-to-notebooks.html)).
 
The main drawback of the trigger script concept is that the whole job can fail if any one of the scripts errors out, which can mean hours of run time with no output.

The unit_test parameter spares analysts from redesigning existing scripts when they want a single script's output. If **unit_test** is set to one of the values in _unit_test_acpt_values_, created two commands earlier, the if logic in that script evaluates to true and the script produces its output independently.
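
To make the gate concrete, here is a minimal sketch of the check a graph-creation script might run. The function name `should_send_output` is hypothetical, and the sketch assumes the parameter arrives as a plain string; in a Databricks notebook that value would typically be read with `dbutils.widgets.get("unit_test")`, which is left out so the example runs anywhere.

```python
# Accepted values copied from the set-up section of this script
unit_test_acpt_values = ['Yes', 'YES', 'y', 'yup', '1', 'yes']


def should_send_output(unit_test, accepted=unit_test_acpt_values):
    """Return True when the script should build and send its own output."""
    # Ignore stray whitespace around the value; matching is otherwise exact
    return str(unit_test).strip() in accepted


# Called from the trigger script: slides go into the shared deck only
print(should_send_output('no'))   # False

# Run on its own by an analyst: the script sends its slides by itself
print(should_send_output('yes'))  # True
```

Matching stays exact on purpose, so a value outside the list (for example `'Y'`) is treated the same as `'no'`.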
        

In [14]:
#the slide should consist of the main text saying: First Party Fraud Monthly Business Report, last month's performance

ts_main_text = 'First Party Fraud Monthly Business Report\n{0} {1} Performance'.format(
    one_month_ago.strftime('%B'), one_month_ago.strftime('%Y')
)

#the sub_text should detail the author, First Party Fraud Intent, and the current month
ts_sub_text = """First Party Fraud Intent\n{} {}""".format(today.strftime('%B'), today.strftime('%Y')) 

#the following commands will validate the strings are right
#print(ts_main_text) 
#print(ts_sub_text)

#create the title slide using the two strings and the createSlide() function
ppt.createSlide(slide_name = ts_main_text, notes = ts_sub_text)

First Party Fraud Monthly Business Report
November 2020 Performance
First Party Fraud Intent
December 2020


In [None]:
%run "./Graph_Creation/Fraud_Losses_Graphs" $unit_test="no"

In [None]:
%run "./Graph_Creation/Root_Cause" $unit_test="no"

In [None]:
%run "./Graph_Creation/Defense_Performance" $unit_test="no"

## Emailing the deliverable

To send out the slides generated by the above scripts, all we have to do is run a single command.

The command emails the slides as a single attached PowerPoint file.

In [None]:
#email the powerpoint
#build the timestamped name once so the filename and subject always match
#tz is the Eastern timezone object created during set-up
report_name = 'First Party Fraud Monthly Business Report ' + datetime.now(tz).strftime('%Y_%m_%d_%H_%M')
ppt.createDeck(filename = report_name,
                email_to = recipients, #email recipients created in command 4
                email_from = dev_email,
                email_subject = report_name,
                ppt_attach = True)
