USC Children's Data Network

__RECORD RECONCILIATION PROJECT__

Date: 08 January 2020

__Description__ : This module contains the functions necessary to perform the masking procedures for the dashboard reports of each program. Primary masking hides low counts in the range (0,10] by changing the cell value to _*1_. Secondary masking is performed within each group of related categories, and within each column, to prevent the primarily masked cell from being deduced by arithmetic. In other words, if a cell in a related group is primarily masked, then the next lowest cell count in that group will be secondarily masked.

        Example: A row of data has values [5,*1] for columns [Male, Female]. The masked Female value (stored as -999) can be calculated by subtracting 5 from the row total. Secondary masking will transform this group to [*2,*1].
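
The deduction risk in the example can be reproduced in a few lines. This is only an illustrative sketch: the row total of 8 and the helper name `recover_masked` are assumptions made up for the demonstration and are not part of the module.

```python
# Hypothetical row: Male = 5, Female is primarily masked, row total = 8 (made-up numbers).
row = {"Male": 5, "Female": "*1"}
row_total = 8

def recover_masked(cells, total):
    """Return the hidden value if exactly one cell is masked, else None."""
    masked = [k for k, v in cells.items() if isinstance(v, str)]
    if len(masked) != 1:
        return None  # two or more masked cells cannot be separated by subtraction
    known = sum(v for v in cells.values() if not isinstance(v, str))
    return total - known

print(recover_masked(row, row_total))                             # 3
print(recover_masked({"Male": "*2", "Female": "*1"}, row_total))  # None
```

With a single masked cell the hidden count falls out by subtraction; once a second cell in the group is masked, only the sum of the two hidden cells is known.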

__General Procedure__: The module accepts unmasked Excel report files and splits each report into separate dataframes, one per geography level (e.g., State Assembly, State Senate, etc.). Note that the unmasked Excel files are already primarily masked, but with values of -999 instead of _*1_. The function _createGroupdict_ generates a Python dictionary of related cell groups, depending on the program. This dictionary is used to perform horizontal masking, which iterates through each row, group by group, to check whether secondary masking is needed. After the rows have been checked, each column is tested by similar logic, as arithmetic deduction may be done vertically as well. Finally, the separate dataframes are re-joined to match the original shape.

__General Sequence__:

*   Step 1. Run createGroupdict to create a dictionary of horizontal groups _([Male,Female],[Asian, Black, White, Hispanic, Native Am./other/missing])_.
*   Step 2. Run batchMask, which first calls splitReport to separate the dataframe by geographic level, then calls horizontalMask and verticalMask sequentially on each part, and finally re-joins the parts into a single dataframe.
*   Step 3. Export the masked dataframe to a new Excel file.

In [1]:
import pandas as pd
import numpy as np
import os

### FUNCTION INVENTORY ###

In [2]:
def splitReport(df:pd.DataFrame, geo=["State Assembly Total","State Senate Total"
                                      ,"U.S. Congress Total"
                                     ,"County Total"]):
    '''
    Accept a DF that is the original Excel report and
    split it into N sub-reports based on the geographies
    listed in the geo parameter.
    :param df: Pandas Dataframe.
    :param geo: List of strings, names of each level of geography.
    :return split_dfs: List of split dataframes.
    '''
    #We can identify each section of the report here...
    indices = np.zeros(len(geo),dtype='int')
    split_dfs = []
    #Look up the index of the last row of each geography level, usually subtotal rows.
    for idx in range(len(geo)):
        print("GEOGRAPHY level:", geo[idx])
        indices[idx] = df.index[df.Number == geo[idx]].tolist()[0]
        if idx == 0:
            split_dfs.append(df.iloc[:indices[idx]+1])
        else:
            split_dfs.append(df.iloc[indices[idx-1]+1:indices[idx]+1])
    return split_dfs
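
The slicing idea behind `splitReport` can be tried on a toy report. The sketch below is a simplified restatement under stated assumptions, not the function above: the data are made up, `split_by_totals` is a hypothetical name, and the dataframe is assumed to carry the default 0..N-1 index so that index labels coincide with row positions.

```python
import pandas as pd

# Toy report: two geography levels, each ending in a subtotal row (made-up data).
toy = pd.DataFrame({
    "Number": ["District 01", "District 02", "State Assembly Total",
               "District 01", "State Senate Total"],
    "Male": [12, 30, 42, 25, 25],
})

def split_by_totals(df, totals):
    """Slice df into consecutive blocks, each ending at one of the subtotal rows."""
    blocks, start = [], 0
    for label in totals:
        # Index of the first row whose 'Number' equals the subtotal label
        end = df.index[(df["Number"] == label).to_numpy()][0]
        blocks.append(df.iloc[start:end + 1])
        start = end + 1
    return blocks

parts = split_by_totals(toy, ["State Assembly Total", "State Senate Total"])
print([len(p) for p in parts])  # [3, 2]
```

Each block keeps its subtotal row as the last row, which matches the "last row of each geography level" lookup described in the comments above.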

In [3]:
def createGroupdict(prog):
    '''
    For each program, create a dictionary of groups of related variables. This is necessary because there are slight 
    variations between programs.
    :param prog: A string. Name of program (ex: "WIC","CalFresh" etc...)
    :return group_dict: A Python dictionary of related groups of variables for each program.
    '''
    group_dict = dict()
    #Populate dictionary
    group_dict["race"] =  ['Asian_PI','Black', 'Hispanic', 'Native American/Other/Unknown',"White"]
    group_dict["prog"] = ['1 program', '2 programs', '3 programs','4 programs', '5+ programs']
    group_dict["agy"] = ['1 department', '2 departments','3 departments', '4 departments']
    #Variable fields...
    group_dict["age1"] = ['17 and Under', '18 and Over']
    group_dict["age2"] = ['18-64','65 and Over']
    group_dict["gender"] = ['Female','Male', 'Unknown/Other Gender']
    #Special modifications for each program 
    if prog.upper() == 'WIC': 
        group_dict["gender"] = ['Female','Male']
        group_dict["age1"]  = ['less than 19', '20 to 24', '25 to 29', '30 to 34','35 and over']
        group_dict["age2"]  = ['wic age 0','wic age 1','wic age 2','wic age 3','wic age 4','wic age 5 to 19',
                               'wic age 20 and Over']
    elif prog.upper() == "CWS_CMS" :
        group_dict["age1"] = ['age1 missing','17 and Under', '18 and Over']
        group_dict["age2"] = ['age2 missing', '18-64', '65 and Over']
        group_dict["fcage"] = ['fc age missing','age 0', '1 to 2', '3 to 5','6 to 10', '11 to 15', '16 to 17', '18 to 20']
    elif prog.upper() == "FC" :
        del group_dict['age2'] #no need to mask group with only 1 variable
        group_dict["fcage"] = ['age 0', '1 to 2', '3 to 5','6 to 10', '11 to 15', '16 to 17', '18 to 20']
    elif prog.upper() =='FPACT':
        group_dict["gender"] = ['Female','Male']
    elif prog.upper() =='MEDICAL':
        group_dict["gender"] = ['Female','Male']
    elif prog.upper() == "CALFRESH":
        group_dict["gender"] = ['Female','Male']
        group_dict["age1"]  = ['17 and Under','18 to 59', '60 and Over']
    elif prog.upper() in  ['IHSS','CALWORKS','DDS']: 
        group_dict["gender"] = ['Female','Male']
    

    return group_dict  
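
Because the groups differ by program, it can help to confirm that every column named in a group actually exists in the report before masking; selecting a list of columns with `.loc` should raise a `KeyError` in recent pandas versions if any of them is missing. The helper below is a hypothetical sketch (the name `missing_group_columns` and the sample header are assumptions), not part of the module.

```python
def missing_group_columns(columns, group_dict):
    """Map each group name to the member columns that are absent from the report."""
    present = set(columns)
    return {name: [c for c in members if c not in present]
            for name, members in group_dict.items()
            if any(c not in present for c in members)}

# Made-up report header and groups for illustration
columns = ["Female", "Male", "17 and Under", "18 and Over"]
groups = {"gender": ["Female", "Male", "Unknown/Other Gender"],
          "age1": ["17 and Under", "18 and Over"]}

print(missing_group_columns(columns, groups))
# {'gender': ['Unknown/Other Gender']}
```

An empty result means every group can be passed to the horizontal masking step as is.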

In [4]:
def verticalMask(df, drop_cols = ['Program', 'Level', 'Number']):
    '''Assuming the dataframe only contains 1 geography,
    iterate through each column to perform vertical masking.
    :param df: Dataframe
    :param drop_cols: list of columns to NOT be masked/ignored (non-data columns)
    :return d: Dataframe, vertically masked
    '''
    d = df.copy()
    #Iterate through every report column
    cols = list(set(d.columns.tolist()) - set(drop_cols))
    
    for c in cols:
        print("...Processing column",c)
        #Only mask if the negative placeholders in that column sum to -999,
        #i.e. a single -999 and no -100 left by horizontal masking
#         print(np.sum(d.loc[d[c] < 0, c]))
        if sum(d.loc[d[c] < 0, c]) == -999: 
#             print("Single -999 value found. Secondary masking done!")
            # Only mask 0 cell when no other non-zero cell present
            next_min_a = np.min(d.loc[d[c] >= 0 , c])
            next_min_b = np.min(d.loc[d[c] > 0 , c])
            #Only secondary mask 0 if it is the true next smallest. 
#             print("min_a",next_min_a,"min_b",next_min_b)
            next_min = max(next_min_a, next_min_b )
#             next_min_idx = np.where(d[c].values  ==next_min)[0][0] 
            col_values = d[c]
#             print("col_values",col_values)
            next_min_idx = col_values[col_values == next_min].index.tolist()[0]
            print("The next smallest value for column {} is at index {} of value {}".format(c,next_min_idx,next_min))
            #Secondary mask 
            d.loc[next_min_idx, c] = -100 #placeholder value
    return d
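
The trigger condition used in `verticalMask` (the negative entries of a column sum to exactly -999) can be checked on two toy columns. This is a sketch with made-up values; `needs_vertical_mask` is a hypothetical name, and it assumes -999 and -100 are the only negative values that can occur in a data column.

```python
import pandas as pd

col = pd.Series([-999, 14, 0, 52])          # one primary mask, nothing else hidden
col_done = pd.Series([-999, -100, 0, 52])   # a secondary mask is already present

def needs_vertical_mask(s):
    """True when the only negative (placeholder) entry is a single -999."""
    return s[s < 0].sum() == -999

print(needs_vertical_mask(col), needs_vertical_mask(col_done))  # True False
```

A column that already holds a -100 from horizontal masking has two hidden cells, so the sum is -1099 and no further cell is masked.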

In [5]:
def secondMask(row):
    '''
    Helper function. Perform secondary masking for a single row of one group by checking whether 
    exactly one element is -999 (placeholder for original primary masking).
    :param row: a pandas Series (one row of a column group, as passed by DataFrame.apply with axis=1)
    :return row: secondarily masked row
    '''
    #If exactly one -999 exists, find the position of the next smallest cell to mask
    if len(row[row ==  -999]) == 1:
#         print("Row before:\n", row)
        # Only mask a 0 cell when no non-zero cells are present
        next_min_a = np.min(row[row >= 0])
        next_min_b = np.min(row[row > 0])
        next_min  = max(next_min_a, next_min_b) 
        next_min_idx = np.where(row==next_min)[0][0] #Remember the position of the next-smallest cell
#         next_min_idx = row[row==next_min].index.tolist()[0]
        print("Next smallest at index {} of value {}".format(next_min_idx, next_min))
        #Change value of masked cell, temporarily to -100 (numeric values are easier to keep track of)
        #Use .iloc because next_min_idx is a position, not a column label
        row.iloc[next_min_idx] = -100
#         print("Row after:\n", row)
    return row
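
The selection rule in `secondMask` amounts to: pick the smallest positive count, and fall back to a zero cell only when no positive count exists. The sketch below restates that rule on its own with explicit empty-selection checks; `next_smallest` is a hypothetical helper written for illustration, not a function of this module.

```python
import numpy as np

def next_smallest(values):
    """Value of the cell to secondary-mask: smallest positive count, else 0, else None."""
    values = np.asarray(values)
    positive = values[values > 0]
    if positive.size > 0:
        return positive.min()
    non_negative = values[values >= 0]
    return non_negative.min() if non_negative.size > 0 else None

print(next_smallest([-999, 0, 12, 40]))  # 12
print(next_smallest([-999, 0, 0]))       # 0
```

Returning `None` for a group with no unmasked cells makes the single-variable case explicit instead of failing during the lookup.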

In [6]:
def horizontalMask(df, grp:dict):
    '''
    For each group of related columns, apply the secondMask 
    function to every row, so that secondary masking is 
    performed row by row within each group. 
    :param df: Pandas dataframe
    :param grp: dictionary of group names, from createGroupdict.
    :return d: Dataframe, horizontally masked.
    '''
    d = df.copy()
    #Apply secondMask for each row
    for k,v in grp.items():
        print("Now looking at group,",k)
        d.loc[:,v] = d.loc[:,v].apply(secondMask, axis=1)
    return d

In [7]:
def batchMask(df, grp):
    '''
    Wrapper function. Performs both horizontal and vertical secondary masking for an entire dataframe 
    by calling previous helper functions.
    :param df: Dataframe, from original Excel file. 
    :param grp: dictionary, from createGroupdict
    :return final_df: Dataframe, masked and concatenated
    '''
    masked_df = splitReport(df)
    final_df = pd.DataFrame()
    #Perform masking for each split frame 
    for i in range(len(masked_df)):
#         print("-----------NOW LOOKING AT GEOGRAPHY-------", masked_df[i]["Level"])
        #Horizontal masking
        masked_df[i] = horizontalMask(masked_df[i], grp )
        #Vertical masking
        masked_df[i] = verticalMask(masked_df[i])   
        #Concatenate frames together
        final_df = pd.concat([final_df, masked_df[i]], ignore_index=True)
        
    #Finally, change all -999 and -100 into *1 and *2, respectively
    final_df = final_df.replace(-999,"*1")
    final_df = final_df.replace(-100,"*2")

#     if ! assert df.shape == final_df.shape: 
#         print("Final dataframe does NOT preserve original shape!!")
    final_df = final_df.reset_index(drop=True)
    return final_df
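
The final relabelling step, together with the shape check that is commented out in `batchMask`, can be sketched on a toy frame. The values are made up, and passing a single dictionary to `DataFrame.replace` is an alternative to the two separate calls above, not the module's own code.

```python
import pandas as pd

# Toy frame after secondary masking: -999 = primary mask, -100 = secondary mask
masked = pd.DataFrame({"Male": [-100, 30], "Female": [-999, 12]})

# Map both numeric placeholders to their display labels in one call
labelled = masked.replace({-999: "*1", -100: "*2"})

# Relabelling must not add or drop rows or columns
assert labelled.shape == masked.shape
print(labelled)
```

The same assertion, applied to the input and output of the whole masking run, would confirm that the re-joined dataframe preserves the original shape.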

## PRODUCTION TEST

## MANUAL FOR EACH PROGRAM

In [195]:
prog = "FC"
year = '18'
d = pd.read_excel('Dashboard_'+ year+"//" + prog +"_Dashboard_" + year+ ".xlsx")
d.columns 

Index(['fileyear', 'Program', 'Level', 'Number', 'Child Welfare', 'Medi-Cal',
       'FPACT', 'CalFresh', 'CalWorks', 'IHSS', 'DDS', 'WIC',
       'Native American/Other/Unknown', 'Black', 'White', 'Hispanic',
       'Asian_PI', 'Female', 'Male', 'Unknown/Other Gender', '17 and Under',
       '18 and Over', '18-64', 'age 0', '1 to 2', '3 to 5', '6 to 10',
       '11 to 15', '16 to 17', '18 to 20', '1 program', '2 programs',
       '3 programs', '4 programs', '5+ programs', '1 department',
       '2 departments', '3 departments', '4 departments'],
      dtype='object')

In [196]:
group_dict = createGroupdict(prog)

In [197]:
r = batchMask(d, group_dict)

GEOGRAPHY level: State Assembly Total
GEOGRAPHY level: State Senate Total
GEOGRAPHY level: U.S. Congress Total
GEOGRAPHY level: County Total
Now looking at group, race
Next smallest at index 3 of value 25
Next smallest at index 3 of value 17
Next smallest at index 3 of value 16
Next smallest at index 0 of value 20
Next smallest at index 0 of value 69
Next smallest at index 0 of value 43
Next smallest at index 1 of value 23
Next smallest at index 0 of value 37
Next smallest at index 0 of value 19
Next smallest at index 0 of value 23
Next smallest at index 0 of value 32
Next smallest at index 0 of value 24
Next smallest at index 0 of value 18
Next smallest at index 0 of value 24
Next smallest at index 3 of value 24
Next smallest at index 0 of value 11
Next smallest at index 3 of value 13
Next smallest at index 0 of value 30
Next smallest at index 0 of value 29
Next smallest at index 1 of value 11
Next smallest at index 0 of value 18
Next smallest at index 0 of value 36
...

Next smallest at index 0 of value 720
Next smallest at index 0 of value 31490
Now looking at group, fcage
...Processing column Asian_PI
...Processing column 3 departments
...Processing column CalWorks
...Processing column Hispanic
...Processing column Black
...Processing column 16 to 17
...Processing column 18 to 20
...Processing column 5+ programs
...Processing column 4 departments
...Processing column fileyear
...Processing column 18 and Over
...Processing column DDS
...Processing column 1 program
...Processing column 6 to 10
...Processing column 1 to 2
...Processing column IHSS
...Processing column Female
...Processing column 3 to 5
...Processing column 11 to 15
...Processing column 2 programs
...Processing column 18-64
...Processing column FPACT
...Processing column 3 programs
...Processing column 2 departments
...Processing column 4 programs
...Processing column 17 and Under
...Processing column WIC
...Processing column Child Welfare
...Processing column 1 department
...

In [198]:
r.head()

Unnamed: 0,fileyear,Program,Level,Number,Child Welfare,Medi-Cal,FPACT,CalFresh,CalWorks,IHSS,...,18 to 20,1 program,2 programs,3 programs,4 programs,5+ programs,1 department,2 departments,3 departments,4 departments
0,2018,Foster Care,State Assembly,-9,3770,3552,45,684,597,*1,...,333,161,1253,1196,803,357,163,1625,1335,647
1,2018,Foster Care,State Assembly,District 01,880,861,15,265,172,0,...,105,15,412,227,161,65,16,605,217,42
2,2018,Foster Care,State Assembly,District 02,1115,1087,33,348,235,0,...,116,20,504,309,196,86,20,780,273,42
3,2018,Foster Care,State Assembly,District 03,1185,1166,31,349,221,0,...,128,13,553,306,216,97,13,801,290,81
4,2018,Foster Care,State Assembly,District 04,624,606,27,166,114,0,...,83,13,297,180,93,41,13,452,124,35


In [199]:
r.to_excel('Dashboard_'+ year+"//"+"Masked_"+prog+"_Dashboard_" + year + ".xlsx",index=False)

# AUTOMATED TEST

## Loop through Each Year and Mask

In [37]:
year = "18"

In [38]:
for filename in os.listdir('Dashboard_'+ year+"//"):
    if 'Masked' not in filename and 'test' not in filename:
        prog = filename.split('_')[0]
        if prog == 'CWS':
            prog = 'CWS_CMS'
        print("Processing",filename)
        t = pd.read_excel('Dashboard_'+ year+"//"+filename)
        group_dict = createGroupdict(prog)
        if year == '15':
            group_dict['agy'].remove('4 departments')
        v = batchMask(t,group_dict)
        v.to_excel('Dashboard_'+ year+"//Masked_"+filename)

Processing CalFresh_Dashboard_18.xlsx
GEOGRAPHY level: State Assembly Total
GEOGRAPHY level: State Senate Total
GEOGRAPHY level: U.S. Congress Total
GEOGRAPHY level: County Total
Now looking at group, race
Now looking at group, prog
Now looking at group, agy
Now looking at group, age1
Now looking at group, age2
Now looking at group, gender
...Processing column 65 and Over
...Processing column Foster Care
...Processing column DDS
...Processing column White
...Processing column Female
...Processing column 1 program
...Processing column IHSS
...Processing column 17 and Under
...Processing column 2 departments
...Processing column 3 departments
...Processing column Person
...Processing column fileyear
...Processing column 4 departments
...Processing column Native American/Other/Unknown
...Processing column FPACT
...Processing column Asian_PI
...Processing column Hispanic
...Processing column 2 programs
...Processing column WIC
...Processing column 5+ programs
...Processing column 1 department
...

Next smallest at index 0 of value 4019
Next smallest at index 0 of value 1903
Next smallest at index 0 of value 2125
Next smallest at index 0 of value 5598
Next smallest at index 0 of value 1658
Next smallest at index 0 of value 2953
Next smallest at index 0 of value 2431
Next smallest at index 0 of value 2710
Next smallest at index 0 of value 5515
Next smallest at index 0 of value 1521
Next smallest at index 0 of value 975
Next smallest at index 0 of value 1958
Next smallest at index 0 of value 751
Next smallest at index 0 of value 1434
Next smallest at index 0 of value 2353
Next smallest at index 0 of value 3458
Next smallest at index 0 of value 1049
Next smallest at index 0 of value 365
Next smallest at index 0 of value 374
Next smallest at index 0 of value 516
Next smallest at index 0 of value 1370
Next smallest at index 0 of value 2083
Now looking at group, gender
...Processing column 65 and Over
...Processing column Foster Care
...Processing column DDS
...Processing column White


...Processing column 65 and Over
...Processing column Foster Care
...Processing column DDS
...Processing column White
...Processing column Female
...Processing column 1 program
...Processing column IHSS
...Processing column 17 and Under
...Processing column 2 departments
...Processing column 3 departments
...Processing column Person
...Processing column fileyear
...Processing column 4 departments
...Processing column Native American/Other/Unknown
...Processing column FPACT
...Processing column Asian_PI
...Processing column Hispanic
...Processing column 2 programs
...Processing column CalFresh
...Processing column WIC
...Processing column 5+ programs
...Processing column 1 department
...Processing column 3 programs
...Processing column 4 programs
...Processing column Male
...Processing column Child Welfare
...Processing column 18 and Over
...Processing column 18-64
...Processing column Cases
...Processing column Black
...Processing column Medi-Cal
Now looking at group, race
...

...Processing column age 0
...Processing column 65 and Over
...Processing column Foster Care
...Processing column DDS
...Processing column White
...Processing column 1 program
...Processing column IHSS
...Processing column 17 and Under
...Processing column 2 departments
...Processing column 3 departments
...Processing column Person
...Processing column fileyear
...Processing column 4 departments
...Processing column age1 missing
...Processing column Native American/Other/Unknown
...Processing column 11 to 15
...Processing column 1 to 2
...Processing column 18 to 20
...Processing column FPACT
...Processing column 6 to 10
...Processing column Asian_PI
...Processing column Hispanic
...Processing column 2 programs
...Processing column CalFresh
...Processing column WIC
...Processing column 3 to 5
...Processing column 5+ programs
...Processing column 16 to 17
...Processing column 1 department
...Processing column 3 programs
...Processing column age2 missing
...Processing column 4 programs
...

Now looking at group, gender
Next smallest at index 0 of value 89
Next smallest at index 0 of value 841
Next smallest at index 0 of value 137
Next smallest at index 0 of value 444
Next smallest at index 0 of value 2952
Next smallest at index 0 of value 816
Next smallest at index 0 of value 654
Next smallest at index 0 of value 507
Next smallest at index 0 of value 242
Next smallest at index 0 of value 479
Next smallest at index 0 of value 178
Next smallest at index 1 of value 35
Next smallest at index 0 of value 383
Next smallest at index 0 of value 852
Next smallest at index 0 of value 692
Next smallest at index 1 of value 262
Next smallest at index 0 of value 102
Next smallest at index 0 of value 464
Next smallest at index 0 of value 109
Next smallest at index 1 of value 566
Next smallest at index 0 of value 667
Next smallest at index 0 of value 400
Next smallest at index 0 of value 636
Next smallest at index 0 of value 133
Next smallest at index 0 of value 728
...

Processing FC_Dashboard_18.xlsx
GEOGRAPHY level: State Assembly Total
GEOGRAPHY level: State Senate Total
GEOGRAPHY level: U.S. Congress Total
GEOGRAPHY level: County Total
Now looking at group, race
Next smallest at index 3 of value 25
Next smallest at index 3 of value 17
Next smallest at index 3 of value 16
Next smallest at index 0 of value 20
Next smallest at index 0 of value 69
Next smallest at index 0 of value 43
Next smallest at index 1 of value 23
Next smallest at index 0 of value 37
Next smallest at index 0 of value 19
Next smallest at index 0 of value 23
Next smallest at index 0 of value 32
Next smallest at index 0 of value 24
Next smallest at index 0 of value 18
Next smallest at index 0 of value 24
Next smallest at index 3 of value 24
Next smallest at index 0 of value 11
Next smallest at index 3 of value 13
Next smallest at index 0 of value 30
Next smallest at index 0 of value 29
Next smallest at index 1 of value 11
Next smallest at index 0 of value 18
...

Next smallest at index 3 of value 80
Next smallest at index 3 of value 64
Next smallest at index 3 of value 76
Next smallest at index 3 of value 44
Next smallest at index 3 of value 23
Next smallest at index 3 of value 56
Next smallest at index 3 of value 96
Now looking at group, age1
Now looking at group, gender
Next smallest at index 0 of value 1714
Next smallest at index 0 of value 1714
Next smallest at index 0 of value 790
Next smallest at index 1 of value 863
Next smallest at index 0 of value 720
Next smallest at index 0 of value 31490
Now looking at group, fcage
...Processing column age 0
...Processing column White
...Processing column DDS
...Processing column 1 program
...Processing column IHSS
...Processing column 17 and Under
...Processing column 2 departments
...Processing column 3 departments
...Processing column Person
...Processing column fileyear
...Processing column 4 departments
...Processing column Native American/Other/Unknown
...Processing column 11 to 15
...

Next smallest at index 0 of value 11141
Next smallest at index 0 of value 49
Next smallest at index 0 of value 1565
Next smallest at index 1 of value 0
Next smallest at index 0 of value 31490
Now looking at group, fcage
Next smallest at index 1 of value 36
Next smallest at index 1 of value 19
Next smallest at index 0 of value 0
...Processing column age 0
...Processing column White
...Processing column DDS
...Processing column 1 program
...Processing column IHSS
...Processing column 17 and Under
...Processing column 2 departments
...Processing column 3 departments
...Processing column Person
...Processing column fileyear
...Processing column 4 departments
...Processing column Native American/Other/Unknown
...Processing column 11 to 15
...Processing column 1 to 2
...Processing column 18 to 20
...Processing column FPACT
...Processing column 6 to 10
...Processing column Asian_PI
...Processing column Hispanic
...Processing column 2 programs
...Processing column CalFresh
...

Next smallest at index 2 of value 309
Next smallest at index 2 of value 290
Next smallest at index 2 of value 243
Now looking at group, age1
Now looking at group, age2
Next smallest at index 0 of value 11229
Next smallest at index 0 of value 8912
Next smallest at index 0 of value 3766
Next smallest at index 0 of value 9223
Next smallest at index 0 of value 5105
Next smallest at index 0 of value 8478
Next smallest at index 0 of value 9816
Next smallest at index 0 of value 8576
Next smallest at index 0 of value 11279
Next smallest at index 0 of value 10672
Next smallest at index 0 of value 8262
Next smallest at index 0 of value 5602
Next smallest at index 0 of value 6905
Next smallest at index 0 of value 17278
Next smallest at index 0 of value 10895
Next smallest at index 0 of value 7875
Next smallest at index 0 of value 18741
Next smallest at index 0 of value 10138
Next smallest at index 0 of value 11671
Next smallest at index 0 of value 7063
Next smallest at index 0 of value 12100
...

...Processing column 65 and Over
...Processing column Foster Care
...Processing column DDS
...Processing column White
...Processing column 1 program
...Processing column 17 and Under
...Processing column 2 departments
...Processing column 3 departments
...Processing column Person
...Processing column fileyear
...Processing column 4 departments
...Processing column Native American/Other/Unknown
...Processing column FPACT
...Processing column Asian_PI
...Processing column Hispanic
...Processing column 2 programs
...Processing column CalFresh
...Processing column WIC
The next smallest value for column WIC is at index 95 of value 16
...Processing column 5+ programs
...Processing column 1 department
...Processing column 3 programs
...Processing column 4 programs
...Processing column Male
...Processing column Child Welfare
...Processing column 18 and Over
...Processing column CalWorks
...Processing column 18-64
...Processing column Female
...Processing column Black
...Processing column Medi-Cal
...

Now looking at group, gender
...Processing column 65 and Over
...Processing column Foster Care
...Processing column DDS
...Processing column White
...Processing column 1 program
...Processing column IHSS
...Processing column 17 and Under
...Processing column 2 departments
...Processing column 3 departments
...Processing column Person
...Processing column fileyear
...Processing column 4 departments
...Processing column Native American/Other/Unknown
...Processing column FPACT
...Processing column Asian_PI
...Processing column Hispanic
...Processing column 2 programs
...Processing column ACA
...Processing column CalFresh
...Processing column WIC
...Processing column 5+ programs
...Processing column 1 department
...Processing column 3 programs
...Processing column 4 programs
...Processing column Male
...Processing column Child Welfare
...Processing column 18 and Over
...Processing column CalWorks
...Processing column 18-64
...Processing column Female
...Processing column Black
...

## Stacking

In [39]:
ls = []
df_ls = []
for filename in os.listdir('Dashboard_'+ year+"//"):
    if 'Masked' in filename and 'zip' not in filename:
        print(filename)
        ls.append(filename)
        df_ls.append(pd.read_excel('Dashboard_'+ year+"//"+filename))

Masked_CalFresh_Dashboard_18.xlsx
Masked_CalWorks_Dashboard_18.xlsx
Masked_CWS_CMS_Dashboard_18.xlsx
Masked_DDS_Dashboard_18.xlsx
Masked_FC_Dashboard_18.xlsx
Masked_FPACT_Dashboard_18.xlsx
Masked_IHSS_Dashboard_18.xlsx
Masked_Medical_Dashboard_18.xlsx
Masked_WIC_Dashboard_18.xlsx


In [40]:
df = pd.concat(df_ls,axis=0,sort=False)

In [41]:
correct_order = ['fileyear','Program','Level','Number','Person','Cases','Medi-Cal','ACA',
                 'FPACT','CalFresh','CalWorks','IHSS','Child Welfare','Foster Care','DDS',
                 'WIC','Black','White','Hispanic','Asian_PI','Native American/Other/Unknown'
                 ,'Female','Male','Unknown/Other Gender','age1 missing','17 and Under',
                 '18 and Over','age2 missing','18-64','65 and Over','18 to 59','60 and Over',
                 '1 program','2 programs','3 programs','4 programs','5+ programs','1 department',
                 '2 departments','3 departments','4 departments','fc age missing',
                 'age 0','1 to 2','3 to 5','6 to 10','11 to 15','16 to 17','18 to 20','less than 19',
                 '20 to 24','25 to 29','30 to 34','35 and over','wic age 0','wic age 1','wic age 2',
                 'wic age 3','wic age 4','wic age 5 to 19','wic age 20 and Over']
if year =='15':
    correct_order.remove('4 departments')

In [42]:
df = df[correct_order]

In [43]:
df.to_excel('Dashboard_'+ year+"//"+"Test_stacked_"+year+".xlsx")

In [200]:
#********************************END OF CODE***********************************#