# Duplicating statistical test cards programmatically

This notebook introduces the functions and methods available in DSS for duplicating statistical cards in an existing worksheet. One typical use case is running the same statistical test on many variables without having to create each card by hand in the UI.

We recommend creating the first card in the UI before following the steps below. The settings of that card (a Python dictionary) can then be retrieved and used as a template, with its values changed to suit each new card.
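
As a rough illustration of treating card settings as a template, the sketch below copies a dictionary and changes a few values. The dictionary is a hypothetical stand-in: only the `id`, `xColumn`, `yColumn`, and `name` keys appear in this notebook, while the `type` value and the helper name `make_card_variant` are assumptions made for illustration. Real card settings will contain more fields.

```python
from copy import deepcopy

# Hypothetical stand-in for the dictionary returned by card.get_raw();
# real card settings contain more fields than shown here.
template = {
    "id": "abc123",
    "type": "example_test",  # placeholder value, not a real DSS card type
    "xColumn": {"name": "col_a"},
    "yColumn": {"name": "col_b"},
}


def make_card_variant(raw, suffix, x_col, y_col):
    """Return a modified deep copy of `raw`, leaving `raw` itself untouched."""
    variant = deepcopy(raw)
    variant["id"] = raw["id"] + str(suffix)
    variant["xColumn"]["name"] = x_col
    variant["yColumn"]["name"] = y_col
    return variant


variant = make_card_variant(template, 1, "col_c", "col_d")
print(variant["id"], variant["xColumn"]["name"], variant["yColumn"]["name"])
print(template["xColumn"]["name"])  # the template keeps its original column
```

A deep copy is used because `xColumn` and `yColumn` are nested dictionaries. A plain assignment or shallow copy would share them between the template and every variant, so renaming a column in one card would silently rename it in all of them.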

The documentation for the Python APIs covering statistical worksheets and cards in DSS can be found [here](https://doc.dataiku.com/dss/latest/python-api/statistics.html#statistics-worksheets).

# 1. Importing libraries

In [1]:
import dataiku
from dataiku import pandasutils as pdu
import pandas as pd
from itertools import combinations

# 2. Establishing connection to DSS Instance

In [2]:
# from within DSS
client = dataiku.api_client()

# # from outside DSS (requires the dataikuapi package)
# import dataikuapi
# host = "http://localhost:11200"
# apiKey = "<your-api-key>"  # replace with your own key; avoid committing real keys
# client = dataikuapi.DSSClient(host, apiKey)
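
When connecting from outside DSS, one option is to keep the host and API key out of the notebook by reading them from environment variables. The sketch below assumes two hypothetical variable names, `DSS_HOST` and `DSS_API_KEY`, and two hypothetical helper functions. The only DSS call used is `dataikuapi.DSSClient(host, api_key)`, as in the cell above.

```python
import os


def read_dss_credentials(env=os.environ):
    """Read the DSS host and API key from a mapping (os.environ by default).

    DSS_HOST and DSS_API_KEY are hypothetical names chosen for this sketch.
    """
    host = env.get("DSS_HOST", "http://localhost:11200")
    api_key = env.get("DSS_API_KEY")
    if not api_key:
        raise RuntimeError("Set DSS_API_KEY before connecting to DSS")
    return host, api_key


def connect_from_outside_dss():
    """Build a client from the environment; needs the dataikuapi package."""
    import dataikuapi  # imported lazily so the helper above works without it

    host, api_key = read_dss_credentials()
    return dataikuapi.DSSClient(host, api_key)


# Demonstration with an explicit mapping instead of the real environment
host, api_key = read_dss_credentials(
    {"DSS_HOST": "http://dss.example.com:11200", "DSS_API_KEY": "example-key"}
)
print(host)
```

With the variables set in the shell, `client = connect_from_outside_dss()` would replace the commented-out lines in the cell above.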

# 3. Retrieving Project, Statistical Worksheet & Card

In [3]:
# Get project in DSS
project = client.get_project("CCFRAUDAVDCORESTART") #replace with your project key

In [8]:
# Get the dataset which has the statistical worksheet & card created
dataset = project.get_dataset("transactions_joined_prepared") #replace with the dataset name

# Get the statistical worksheet & its settings
stats_worksheet = dataset.get_statistics_worksheet("c5E2Tv0415") #replace with your statistics worksheet ID (found in the worksheet URL)
ws = stats_worksheet.get_settings()

# Get the first card in the worksheet
card = ws.list_cards()[0]

# # Display card settings (if required)
# card.get_raw()
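
Taking `list_cards()[0]` works when the template card is the first one in the worksheet. If the worksheet holds several cards, a lookup by card type may be more robust. The sketch below works on plain dictionaries and assumes each raw card dictionary has a `type` key, which this notebook does not show. The type strings and the helper name `first_card_of_type` are purely illustrative.

```python
def first_card_of_type(raw_cards, card_type):
    """Return the first raw card dict whose 'type' matches, or None."""
    for raw in raw_cards:
        if raw.get("type") == card_type:
            return raw
    return None


# Hypothetical raw card dictionaries standing in for
# [c.get_raw() for c in ws.list_cards()]
raw_cards = [
    {"id": "card_1", "type": "summary_card"},
    {"id": "card_2", "type": "test_card"},
    {"id": "card_3", "type": "test_card"},
]

match = first_card_of_type(raw_cards, "test_card")
print(match["id"])
missing = first_card_of_type(raw_cards, "unknown_card")
print(missing)
```

Uncommenting `card.get_raw()` in the cell above is the simplest way to check which keys and values your own card actually contains before relying on a lookup like this.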

# 4. Duplicating the Statistical Card

In [9]:
# Load the data and build the column pairs to run the chi-squared test on
# TO DO: Replace with the list of column pairs you want to test
from copy import deepcopy

dataset_to_load = dataiku.Dataset("transactions_joined_prepared")
df = dataset_to_load.get_dataframe()
cat_cols = df.select_dtypes(include='object').columns
unique_combinations = combinations(cat_cols, 2)

# For each combination, create a chi-squared test card on the same worksheet
for seq, (x_col, y_col) in enumerate(unique_combinations):
    # Deep-copy the template so the nested xColumn/yColumn dictionaries are not
    # shared with the original card or with the other new cards
    new_card_dict = deepcopy(card.get_raw())

    # Give the copy a new id (required) and update the variables to use
    new_card_dict['id'] = new_card_dict['id'] + str(seq + 1)
    new_card_dict['xColumn']['name'] = x_col
    new_card_dict['yColumn']['name'] = y_col

    # Add card to worksheet
    ws.add_card(new_card_dict)

    # Save the worksheet settings
    ws.save()

    # Stop after creating the 5th card (for demo purposes; delete to create cards for all combinations)
    if seq == 4:
        break
    
# # Run worksheet to compute calculation (if required)
# stats_worksheet.run_worksheet()
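
Before removing the five-card limit, it can help to check how many cards the loop would create, since the number of column pairs grows quickly with the number of columns. The sketch below uses a hypothetical list of column names in place of `cat_cols` and relies only on the standard library. It also shows `itertools.islice` as one possible alternative to the `if seq == 4: break` guard.

```python
from itertools import combinations, islice
from math import comb

# Hypothetical categorical column names standing in for cat_cols
example_cols = ["merchant_state", "card_type", "purchase_channel", "weekday"]

# Every unordered pair of columns: n * (n - 1) / 2 pairs for n columns
pairs = list(combinations(example_cols, 2))
print(len(pairs), comb(len(example_cols), 2))

# Take only the first five pairs without an explicit break
first_five = list(islice(combinations(example_cols, 2), 5))
for seq, (x_col, y_col) in enumerate(first_five):
    print(seq + 1, x_col, y_col)
```

With 20 categorical columns the same calculation gives 190 pairs, so capping the loop or filtering the column list first is worth considering.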