# Trial Balance Formatting: Preparing QuickBooks Export Data for Import into CCH Engagement

This notebook contains example code for formatting trial balance data, with limited comments. The same code is also available as a .py file in this repository (trial_balance_etl.py). For a more detailed explanation of the process and instructions, see tutorial_notebook_tb_formatting.ipynb in the same repository.

## Example Code

### Import Libraries

In [1]:
import pandas as pd
import os

### Helper Functions

In [14]:
# Build the values for a new dataframe column by splitting an existing column.
# Arguments: df, the column to split, the delimiter to split on, and the index of the item to keep from each .split() result.
# index defaults to -1, which selects the last item in each split list.

def new_col_from_split(df, split_col, delim, index=-1):
    return [x[index] for x in df[split_col].astype(str).str.split(delim)]
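
As a quick illustration of how the helper behaves, the sketch below runs it on two made-up account strings. The sample rows are hypothetical and only assume the `parent:number · name` layout that the formatting function further down relies on.

```python
import pandas as pd

def new_col_from_split(df, split_col, delim, index=-1):
    return [x[index] for x in df[split_col].astype(str).str.split(delim)]

# Hypothetical sample rows; real QuickBooks exports may differ
sample = pd.DataFrame({
    'Unnamed: 1': ['1000 · Cash', 'Assets:1100 · Accounts Receivable']
})

# Keep only the text after the last ':' (drops any parent account prefix)
sample['_col'] = new_col_from_split(sample, 'Unnamed: 1', ':')

# First item of the ' · ' split is the account number, last item is the name
account_numbers = new_col_from_split(sample, '_col', ' · ', index=0)
account_names = new_col_from_split(sample, '_col', ' · ')

print(account_numbers)  # ['1000', '1100']
print(account_names)    # ['Cash', 'Accounts Receivable']
```

Because the default index is -1, a row without a parent prefix passes through the first split unchanged.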

In [19]:
# This function creates a dictionary with files to format and account suffixes that will be added onto the account numbers in the file

def create_entity_dict(df, entity_column, suffix_column, data_folder='./quickbooks_data/'):
    file_list = os.listdir(data_folder)
    return {x + '.xlsx': '.' + y for x, y in zip(df[entity_column], df[suffix_column].astype(str)) if x + '.xlsx' in file_list}
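
The sketch below shows the filtering behavior of this function using a temporary folder and a made-up two-row key table (both are assumptions for illustration, not files from this repository). Only entities with a matching .xlsx file in the folder end up in the dictionary, so a folder with no matching files produces an empty dictionary.

```python
import os
import tempfile
import pandas as pd

def create_entity_dict(df, entity_column, suffix_column, data_folder='./quickbooks_data/'):
    file_list = os.listdir(data_folder)
    return {x + '.xlsx': '.' + y for x, y in zip(df[entity_column], df[suffix_column].astype(str)) if x + '.xlsx' in file_list}

# Hypothetical key table with two entities
keys = pd.DataFrame({
    'Entity': ['ABC Subsidiary', 'DEF Subsidiary'],
    'Acronym': ['ABC', 'DEF'],
})

with tempfile.TemporaryDirectory() as folder:
    # Create an empty placeholder file for ABC only; DEF has no file
    open(os.path.join(folder, 'ABC Subsidiary.xlsx'), 'w').close()
    result = create_entity_dict(keys, 'Entity', 'Acronym', data_folder=folder)

    # An empty folder matches nothing
    empty_folder = os.path.join(folder, 'empty')
    os.mkdir(empty_folder)
    empty_result = create_entity_dict(keys, 'Entity', 'Acronym', data_folder=empty_folder)

print(result)        # {'ABC Subsidiary.xlsx': '.ABC'}
print(empty_result)  # {}
```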

In [16]:
# This function performs all of the standard formatting changes necessary to prepare QuickBooks exports for CCH Engagement
# Minor changes may be necessary from project to project

def format_tbs(entities, data_folder='./quickbooks_data/'):
    
    # Create the output folders if they do not already exist
    os.makedirs('./ready_for_tb_import/', exist_ok=True)
    os.makedirs('./processed_quickbooks_files/', exist_ok=True)
        
    for entity, suffix in entities.items():
        
        # Print statement to help with debugging if one of the QuickBooks files is formatted differently
        print(f'formatting {entity}')
        
        # Create a dataframe from the QuickBooks export file
        df = pd.read_excel(f'{data_folder}{entity}', sheet_name='Sheet1', skiprows=4)
        
        # Drop the unneeded Total row
        if 'total' in str(df.iloc[len(df) - 1, 0]).lower():
            df.drop(index=len(df) - 1, inplace=True)
            
        # Replace nan values with 0 
        df.fillna(0, inplace=True)
        
        # Split the combined name and account column into separate name and account columns, adding the suffix to the end of the account numbers.
        df['_col'] = new_col_from_split(df, 'Unnamed: 1', ':')
        df['account_number'] = [account + suffix for account in new_col_from_split(df, '_col', ' · ', index=0)]
        df['account_name'] = new_col_from_split(df, '_col', ' · ')
        
        # Combine debit and credit columns into a single balance column
        df['balance'] = df['Debit'] - df['Credit']
        
        # Export account number, account name, and balance columns to a new Excel file in the ready_for_tb_import folder
        df[['account_number', 'account_name', 'balance']].to_excel(f'./ready_for_tb_import/formatted_tb_{entity}', index=False)
        
        # Move the QuickBooks Excel file to the processed_quickbooks_files folder
        os.rename(os.path.join(data_folder, entity), f'./processed_quickbooks_files/{entity}')
        
        # Print statement confirming successful formatting to help with debugging
        print(f'formatted_tb_{entity} successfully created')
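
To make the transformation steps easier to follow without any Excel files, here is a minimal in-memory sketch of the same sequence: drop the Total row, fill blanks, split the account column, and net debits against credits. The sample rows and the '.ABC' suffix are hypothetical. As a simplification that differs from the function above, it uses pandas `.str` accessors instead of the helper and fills blanks only in the Debit and Credit columns.

```python
import numpy as np
import pandas as pd

# Hypothetical rows shaped like a QuickBooks export after read_excel(skiprows=4)
df = pd.DataFrame({
    'Unnamed: 0': [np.nan, np.nan, 'TOTAL'],
    'Unnamed: 1': ['1000 · Cash', 'Income:4000 · Sales', np.nan],
    'Debit': [500.0, np.nan, 500.0],
    'Credit': [np.nan, 500.0, 500.0],
})
suffix = '.ABC'  # hypothetical entity suffix

# Drop the trailing Total row if present
if 'total' in str(df.iloc[-1, 0]).lower():
    df = df.drop(index=df.index[-1])

# Treat blank debit/credit cells as zero
df[['Debit', 'Credit']] = df[['Debit', 'Credit']].fillna(0)

# Keep the text after the last ':' and split it into number and name
last_segment = df['Unnamed: 1'].astype(str).str.split(':').str[-1]
parts = last_segment.str.split(' · ')
df['account_number'] = parts.str[0] + suffix
df['account_name'] = parts.str[-1]

# Net debits against credits into a single signed balance
df['balance'] = df['Debit'] - df['Credit']

formatted = df[['account_number', 'account_name', 'balance']]
print(formatted)
```

Credit balances come out negative, so a complete trial balance should sum to zero in the balance column.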
        

### Create Dictionary of Files to Format

In [17]:
df_keys = pd.read_excel('account_keys.xlsx')
df_keys.head()

| Unnamed: 0 | Acronym | Trial Balance | Entity |
|---|---|---|---|
| 0 | ABC | 34-ABC | ABC Subsidiary |
| 1 | DEF | 34-DEF | DEF Subsidiary |
| 2 | GHI | 34-GHI | GHI Subsidiary |
| 3 | JKL | 34-GPD | JKL Subsidiary |
| 4 | MNO | 34-MNO | MNO Subsidiary |


In [12]:
entity_dict = create_entity_dict(df_keys, entity_column='Entity', suffix_column='Acronym')
entity_dict

{}

### Format and Export TBs

In [7]:
format_tbs(entity_dict)

formatting ABC Subsidiary.xlsx
formatted_tb_ABC Subsidiary.xlsx successfully created
formatting DEF Subsidiary.xlsx
formatted_tb_DEF Subsidiary.xlsx successfully created


## Demonstrating Scalability
The more_quickbooks_data folder contains additional example QuickBooks exports from the same company. Because the export format is the same, the transformations from the first run can be repeated for the rest of these subsidiaries by passing the folder name to the `data_folder` parameter of both functions.

In [20]:
entity_dict = create_entity_dict(df_keys, entity_column='Entity', suffix_column='Acronym', data_folder='./more_quickbooks_data/')
entity_dict

{'JKL Subsidiary.xlsx': '.JKL',
 'MNO Subsidiary.xlsx': '.MNO',
 'PQR Subsidiary.xlsx': '.PQR',
 'STU Subsidiary.xlsx': '.STU',
 'VWX Subsidiary.xlsx': '.VWX',
 'YZ Subsidiary.xlsx': '.YZ',
 '123 Subsidiary.xlsx': '.123',
 '234 Subsidiary.xlsx': '.234',
 '345 Subsidiary.xlsx': '.345',
 '456 Subsidiary.xlsx': '.456',
 '567 Subsidiary.xlsx': '.567',
 '678 Subsidiary.xlsx': '.678',
 '789 Subsidiary.xlsx': '.789',
 '890 Subsidiary.xlsx': '.890',
 '999 Subsidiary.xlsx': '.999'}

In [10]:
format_tbs(entity_dict, data_folder='./more_quickbooks_data/')

formatting GHI Subsidiary.xlsx
formatted_tb_GHI Subsidiary.xlsx successfully created
