# EA Performance Score by High School Campus 2018-2019
**This notebook creates campus-level attainment scores for all North Carolina high school campuses that report EA performance indicators. The high school EA performance indicators used in this report cover the following topics:**

* **ACT:**
 * ACT Composite Score - NCDPI mapping pct_acall_all_act
 * ACT Score - NCDPI mapping pct_acco_all_act
 * Pct Met ACT WorkKeys - NCDPI mapping pct_met_act_workkeys_part 
 * Pct Met The ACT - NCDPI mapping pct_met_the_act_part 

* **Attendance:**
 * Avg Daily Attendance - NCDPI mapping ada_adm_ratio_2019_attendance 
 * Chronic Absence - NCDPI mapping pct_all_chron_absent  

* **Graduation:**
 * 4yr Graduation Rate - NCDPI mapping pct_std_all_cgr 
 * 5yr Graduation Rate - NCDPI mapping pct_ext_all_cgr 

* **Discipline:**
 * Crime Rate Per 1000 - NCDPI mapping act_per1000_all_inc2 
 * Short Term Susp Per 1000 - NCDPI mapping sts_per1000_all_inc2  
 * Long Term Susp Per 1000 - NCDPI mapping lts_per1000_all_inc2 
 * Expulsion Per 1000 - NCDPI mapping exp_per1000_all_inc2 
 
* **College Enrollment:**
 * Pct Enrl College - NCDPI mapping pct_enrolled_2018_enroll_all_college 
 
* **School Performance:**
 * Spg Score - NCDPI mapping spg_score_all_spg2 
 
**During the 2018-19 school year there were 358 high school campuses reporting all of the EA Performance indicators listed above. Steps for creating the blended EA Performance Score are as follows:**

1. Each individual indicator is mapped to the range 0-100 using a MinMaxScaler. This process works as follows: 
 * Let min = 0 and max = 100.
 * Let X.min = the minimum value and X.max = the maximum value of each individual indicator column during 2018-19.
 * Each value in the indicator column is then transformed to the range 0-100 by:
   * X_std = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
   * X_scaled = X_std * (max - min) + min
 * See the docs here: [Sklearn MinMaxScaler](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html#sklearn.preprocessing.MinMaxScaler.fit_transform)


2. Scaled scores are reversed for indicators where a lower value is better. Each reversed score is computed as (100 - scaled indicator value). Reversed indicators include:
 * Chronic Absence 
 * Crime Rate Per 1000
 * Short Term Susp Per 1000 
 * Long Term Susp Per 1000 
 * Expulsion Per 1000


3. Within each topic that has multiple indicators, the scaled (and, where applicable, reversed) indicators are averaged into a single blended topical score. This gives every topic equal weight in the final blended score, regardless of how many indicators it contains.


4. The final score ranges from 0 to 100 and is the mean of the following blended topics and individual indicators:
 * ACT - 4 blended indicators included (see above)
 * Attendance - 2 blended indicators included (see above)
 * Graduation - 2 blended indicators included (see above)
 * Discipline - 4 blended indicators included (see above)
 * College Enrollment - 1 indicator included (see above)
 * SPG Score - 1 indicator included (see above)
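
The four steps above can be sketched end to end on a toy table. Everything in this example is made up for illustration: the campus names, the values, and the `topics` mapping are assumptions, and plain pandas arithmetic stands in for `MinMaxScaler`, which applies the same formula.

```python
import pandas as pd

# Hypothetical toy data: three campuses, four indicators (not real NCDPI values).
raw = pd.DataFrame(
    {
        "4yr Graduation Rate": [60.0, 80.0, 90.0],
        "5yr Graduation Rate": [70.0, 80.0, 95.0],
        "Chronic Absence": [0.40, 0.10, 0.20],
        "Spg Score": [50.0, 70.0, 75.0],
    },
    index=["Campus A", "Campus B", "Campus C"],
)

# Step 1: map each column's observed min..max onto 0..100.
scaled = (raw - raw.min()) / (raw.max() - raw.min()) * 100

# Step 2: reverse indicators where a lower raw value is better.
for col in ["Chronic Absence"]:
    scaled[col] = 100 - scaled[col]

# Step 3: average the indicators within each topic (illustrative topic mapping).
topics = {
    "Graduation Score": ["4yr Graduation Rate", "5yr Graduation Rate"],
    "Attendance Score": ["Chronic Absence"],
    "Spg Score": ["Spg Score"],
}
blended = pd.DataFrame(
    {name: scaled[cols].mean(axis="columns") for name, cols in topics.items()}
)

# Step 4: the final score is the mean of the topic scores.
blended["EA Performance Score"] = blended.mean(axis="columns")
print(blended.round(2))
```

In this toy table Campus A has the worst observed value on every indicator, so its final score is 0, even though its raw graduation rates are 60 and 70.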
 
**Important Notes**
* MinMax scaling may not behave as expected at first glance. 
* For example, graduation rates are already on a 0-100 scale. In practice, however, no school has a graduation rate below 50%, so only a small portion of the 0-100 range is actually observed.
* The MinMaxScaler considers only the observed minimum and maximum values across all campuses during 2018-2019 and maps that range (much narrower than 0-100) onto the scale 0-100.
* Each school therefore receives a score from 0 to 100 for each individual EA performance indicator, where 0 is the lowest-performing campus and 100 is the highest-performing campus on that indicator.
* This is useful because, to the untrained eye, a high school whose graduation rate ranks in the bottom 10% can have a raw value that appears very close to that of a campus in the top 20%.
* For example, a 4 year graduation rate of 68.7% for 2018-19 maps to a scaled MinMax score of 33.25.
* The scaled scores highlight differences in performance across all metrics on a common 0-100 range, where 0 equals the lowest and 100 equals the highest campus-level observed value for each individual performance attribute in the final blended score.
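
The graduation-rate example in these notes can be reproduced by hand. This is a sketch under one assumption: the observed 2018-19 minimum and maximum 4 year graduation rates are taken to be 55.6 and 95.0. Those two values are inferred from the before and after scaling tables shown later in this notebook; they are not stated anywhere in it. The helper name `min_max_scale` is illustrative.

```python
def min_max_scale(x, x_min, x_max, lo=0.0, hi=100.0):
    """Map x from the observed range [x_min, x_max] onto [lo, hi]."""
    x_std = (x - x_min) / (x_max - x_min)
    return x_std * (hi - lo) + lo


# Assumed observed range for '4yr Graduation Rate' (inferred, not stated in the notebook).
GRAD_MIN, GRAD_MAX = 55.6, 95.0

# Graham High: raw 4 year graduation rate of 68.7.
graham_high = min_max_scale(68.7, GRAD_MIN, GRAD_MAX)
print(round(graham_high, 2))  # 33.25
```

Under that assumption a raw rate of 68.7 sits about one third of the way between the lowest and highest observed rates, which is what the scaled table shows for Graham High (33.248731).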

In [447]:
#import required Libraries
import pandas as pd
import numpy as np

#**********************************************************************************
# Set the following variables before running this code!!!
#**********************************************************************************

dataDir = 'C:/Users/Jake/Documents/GitHub/EducationDataNC/2019/School Datasets/'
outputDirParent = 'D:/BenepactLLC/Belk/NC_Report_Card_Data/2020/October 2020/' 

#All raw data files are filtered for the year below
schoolYear = 2019

In [448]:
# read in the public schools file.
filePath = dataDir + 'PublicHighSchools' + str(schoolYear) +'.csv'
schFile = pd.read_csv(filePath, dtype={'unit_code': object}, low_memory=False)
# Filter out any state or national level records
schFile = schFile[(schFile.agency_code != 'NATION') & (schFile.agency_code != 'NC-SEA')]

## Isolate EA Performance Indicators

In [449]:
topLineFields = ['agency_code','lea_name','name_loc','county_loc',
                 #'pct_ccr_rd_04_all_pc', #'EOGReadingGr4_CACR_All',
                 #'pct_glp_rd_04_all_pc', #'EOGReadingGr4_GLP_All',
                 #'pct_ccr_ma_08_all_pc', #'EOGMathGr8_CACR_All',
                 #'pct_glp_ma_08_all_pc', #'EOGMathGr8_GLP_All',
                 #'count_all_pk_enroll',
                 'pct_acall_all_act',
                 'pct_acco_all_act', #'pct_The ACT_ALL_PART_DET',
                 'pct_met_act_workkeys_part',
                 'pct_met_the_act_part',
                 'ada_adm_ratio_2019_attendance', #'pct_att',
                 'pct_all_chron_absent',
                 'pct_std_all_cgr',
                 'pct_ext_all_cgr',
                 'pct_enrolled_2018_enroll_all_college',
                 # Removed due to data errors - 'pct_ccp_t_All_courses2', #'pct_univ_coll_crs',
                 'spg_score_all_spg2',
                 #'ma_spg_score_all_spg2',
                 #'rd_spg_score_all_spg2',
                 'act_per1000_all_inc2', # 'crime',
                 'sts_per1000_all_inc2',  # 'short_term',
                 'lts_per1000_all_inc2', # 'long_term',
                 'exp_per1000_all_inc2'  #'expulsion'
                ]

csvColNames = ['unit_code','Lea_Name','School_Name','county',
               #'4th Reading CACR',
               #'4th Reading GLP',
               #'8th Math CACR',
               #'8th Math GLP',
               #'PK Enrollment',
               'ACT Composite Score',
               'ACT Score',
               'Pct Met ACT WorkKeys',
               'Pct Met The ACT',
               'Avg Daily Attendance',
               'Chronic Absence',
               '4yr Graduation Rate',
               '5yr Graduation Rate',
               'Pct Enrl College',
               # 'Pct Enrl College Courses',
               'Spg Score',
               #'Math Spg Score',
               #'Reading Spg Score',
               'Crime Rate Per 1000',
               'Short Term Susp Per 1000',
               'Long Term Susp Per 1000',
               'Expulsion Per 1000'
               ]

## Remove Any High Schools Missing Any EA Performance Indicator

In [450]:
# Select only the high school EA performance indicators
eaScoreData = schFile[topLineFields].copy()
eaScoreData.columns = csvColNames

In [451]:
# Remove all schools with any missing indicators
eaScoreData.dropna(inplace=True)

In [452]:
# Check the number of high schools and indicators remaining
eaScoreData.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 358 entries, 4 to 689
Data columns (total 18 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   unit_code                 358 non-null    object 
 1   Lea_Name                  358 non-null    object 
 2   School_Name               358 non-null    object 
 3   county                    358 non-null    object 
 4   ACT Composite Score       358 non-null    float64
 5   ACT Score                 358 non-null    float64
 6   Pct Met ACT WorkKeys      358 non-null    float64
 7   Pct Met The ACT           358 non-null    float64
 8   Avg Daily Attendance      358 non-null    float64
 9   Chronic Absence           358 non-null    float64
 10  4yr Graduation Rate       358 non-null    float64
 11  5yr Graduation Rate       358 non-null    float64
 12  Pct Enrl College          358 non-null    float64
 13  Spg Score                 358 non-null    float64
 14  Crime Rate

In [453]:
# Check that all indicators are being reported
eaScoreData.isna().sum()

unit_code                   0
Lea_Name                    0
School_Name                 0
county                      0
ACT Composite Score         0
ACT Score                   0
Pct Met ACT WorkKeys        0
Pct Met The ACT             0
Avg Daily Attendance        0
Chronic Absence             0
4yr Graduation Rate         0
5yr Graduation Rate         0
Pct Enrl College            0
Spg Score                   0
Crime Rate Per 1000         0
Short Term Susp Per 1000    0
Long Term Susp Per 1000     0
Expulsion Per 1000          0
dtype: int64
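
Because `dropna` removes a campus when any single indicator is missing, it can help to see which indicators account for the most removals before dropping rows. The sketch below shows one way to do that on made-up data; `toy` and its values are hypothetical stand-ins for `eaScoreData` as it exists before the `dropna` cell.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for eaScoreData before dropna (values are illustrative).
toy = pd.DataFrame(
    {
        "School_Name": ["A", "B", "C", "D"],
        "Spg Score": [67.0, np.nan, 59.0, 70.0],
        "4yr Graduation Rate": [83.2, 68.7, np.nan, np.nan],
    }
)

# Count missing values per column, worst offenders first.
missing_per_indicator = toy.isna().sum().sort_values(ascending=False)

# Keep only campuses that report every indicator, and count the removals.
complete = toy.dropna()
n_dropped = len(toy) - len(complete)

print(missing_per_indicator)
print(f"Dropped {n_dropped} of {len(toy)} campuses")
```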

## What the Data Looks like before Scaling

In [454]:
#Save a copy of this data for later
eaScoreDataBefore = eaScoreData.copy()  
eaScoreData

Unnamed: 0,unit_code,Lea_Name,School_Name,county,ACT Composite Score,ACT Score,Pct Met ACT WorkKeys,Pct Met The ACT,Avg Daily Attendance,Chronic Absence,4yr Graduation Rate,5yr Graduation Rate,Pct Enrl College,Spg Score,Crime Rate Per 1000,Short Term Susp Per 1000,Long Term Susp Per 1000,Expulsion Per 1000
4,10324,Alamance-Burlington Schools,Eastern Alamance High,Alamance,26.3,51.1,100.0,83.3,91.94,0.2921,83.2,86.0,0.633218,67.0,14.2737,80.605,0.8396,0.0
5,10348,Alamance-Burlington Schools,Graham High,Alamance,10.0,31.8,0.0,0.0,91.42,0.3446,68.7,75.3,0.443787,56.0,12.2616,174.387,1.3624,0.0
6,10360,Alamance-Burlington Schools,Hugh M Cummings High,Alamance,6.1,18.1,100.0,100.0,88.52,0.4881,73.6,72.2,0.378788,59.0,15.0215,175.966,1.073,0.0
8,10388,Alamance-Burlington Schools,Southern Alamance High,Alamance,21.5,52.9,100.0,50.0,90.64,0.3335,85.9,81.9,0.58908,69.0,16.4286,125.0,0.0,0.0
9,10396,Alamance-Burlington Schools,Walter M Williams High,Alamance,20.5,43.7,100.0,20.0,88.76,0.4135,88.4,84.4,0.55036,75.0,6.1457,201.054,0.878,0.0
10,10400,Alamance-Burlington Schools,Western Alamance High,Alamance,29.3,58.0,100.0,80.0,91.68,0.2495,88.4,91.0,0.644654,69.0,9.839,100.179,0.0,0.0
14,20302,Alexander County Schools,Alexander Central High,Alexander,21.4,51.1,100.0,100.0,94.54,0.1739,83.5,85.1,0.6,70.0,11.7647,88.235,0.0,0.0
16,30304,Alleghany County Schools,Alleghany High,Alleghany,17.0,43.6,100.0,100.0,95.23,0.1551,89.3,92.0,0.51087,71.0,31.0263,66.826,0.0,0.0
19,40306,Anson County Schools,Anson High School,Anson,7.0,20.0,100.0,100.0,87.39,0.4971,86.2,91.0,0.463855,51.0,10.1449,1052.174,13.0435,0.0
21,50302,Ashe County Schools,Ashe County High,Ashe,27.1,53.8,66.7,100.0,98.33,0.1218,86.0,89.1,0.630542,64.0,11.8343,68.639,0.0,0.0


## Scale the indicators by mapping all values to the range 0-100
**Translate each column individually such that it is in the given range between 0 and 100. The scaling transformation is given by:**
* X_std = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
* X_scaled = X_std * (max - min) + min
* where min, max = 0, 100 and X.min, X.max = the column's minimum and maximum values. 
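
The two formula lines above are written in NumPy notation and can be run almost verbatim. The sketch below applies them to a small made-up matrix; the array `X` and the names `feature_min` and `feature_max` are assumptions for illustration, with `axis=0` taking the minimum and maximum down each column.

```python
import numpy as np

# Hypothetical raw indicator matrix: rows are campuses, columns are indicators.
X = np.array(
    [
        [60.0, 0.40],
        [80.0, 0.10],
        [90.0, 0.20],
    ]
)
feature_min, feature_max = 0, 100

# Column-wise min-max scaling, exactly as written in the formula above.
X_std = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
X_scaled = X_std * (feature_max - feature_min) + feature_min

print(X_scaled.round(2))
```

Each column is scaled independently, so every column ends up with at least one 0 and at least one 100. A column in which every campus has the same value would make the denominator zero in this hand-written version.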

In [455]:
from sklearn.preprocessing import MinMaxScaler
# Create a MinMaxScaler in the range 0-100
scaler = MinMaxScaler(feature_range=(0,100))
# Get only the columns containing EA Performance scores
eaScoreCols = eaScoreData.columns[(eaScoreData.dtypes.values == np.dtype('float64'))]
# Perform scaling mapping each column's scores to the range 0-100
eaScoreData[eaScoreCols] = scaler.fit_transform(eaScoreData[eaScoreCols])

## What the Data Looks like after Scaling

In [456]:
eaScoreData

Unnamed: 0,unit_code,Lea_Name,School_Name,county,ACT Composite Score,ACT Score,Pct Met ACT WorkKeys,Pct Met The ACT,Avg Daily Attendance,Chronic Absence,4yr Graduation Rate,5yr Graduation Rate,Pct Enrl College,Spg Score,Crime Rate Per 1000,Short Term Susp Per 1000,Long Term Susp Per 1000,Expulsion Per 1000
4,10324,Alamance-Burlington Schools,Eastern Alamance High,Alamance,24.287343,50.451467,100.0,83.3,44.928685,47.481776,70.050761,73.293769,54.239978,43.859649,30.402953,6.077088,4.357529,0.0
5,10348,Alamance-Burlington Schools,Graham High,Alamance,5.701254,28.668172,0.0,0.0,40.808241,56.179589,33.248731,41.543027,26.876154,24.561404,26.117184,13.67108,7.070864,0.0
6,10360,Alamance-Burlington Schools,Hugh M Cummings High,Alamance,1.254276,13.205418,100.0,100.0,17.828843,79.953612,45.685279,32.344214,17.486857,29.824561,31.995766,13.798939,5.568877,0.0
8,10388,Alamance-Burlington Schools,Southern Alamance High,Alamance,18.814139,52.48307,100.0,50.0,34.627575,54.340623,76.903553,61.127596,47.864192,47.368421,34.992886,9.671971,0.0,0.0
9,10396,Alamance-Burlington Schools,Walter M Williams High,Alamance,17.673888,42.099323,100.0,20.0,19.730586,67.594433,83.248731,68.545994,42.270874,57.894737,13.090329,15.830438,4.556825,0.0
10,10400,Alamance-Burlington Schools,Western Alamance High,Alamance,27.708096,58.239278,100.0,80.0,42.868463,40.424122,83.248731,88.130564,55.891953,47.368421,20.957051,7.662092,0.0,0.0
14,20302,Alexander County Schools,Alexander Central High,Alexander,18.700114,50.451467,100.0,100.0,65.530903,27.899271,70.812183,70.623145,49.441549,49.122807,25.058788,6.694927,0.0,0.0
16,30304,Alleghany County Schools,Alleghany High,Alleghany,13.68301,41.986456,100.0,100.0,70.998415,24.784626,85.532995,91.097923,36.566415,50.877193,66.085958,4.961334,0.0,0.0
19,40306,Anson County Schools,Anson High School,Anson,2.280502,15.349887,100.0,100.0,8.874802,81.444665,77.664975,88.130564,29.775094,15.789474,21.608617,84.749828,67.695845,0.0
21,50302,Ashe County Schools,Ashe County High,Ashe,25.199544,53.498871,66.7,100.0,95.562599,19.267727,77.15736,82.492582,53.853405,38.596491,25.207036,5.108142,0.0,0.0


## Create a blended ACT score that considers each ACT attribute.  
* This keeps the multiple ACT attributes from overshadowing the other EA performance indicators in the final score. 
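
To see why blending by topic matters, the sketch below compares the weight each topic would carry in a flat mean of all 14 indicators with its weight in the topic-blended score. The `topics` dictionary restates the indicator counts listed at the top of this notebook; the variable names are illustrative.

```python
from fractions import Fraction

# Number of indicators per topic, as listed at the top of this notebook.
topics = {
    "ACT": 4,
    "Attendance": 2,
    "Graduation": 2,
    "Discipline": 4,
    "College Enrollment": 1,
    "SPG Score": 1,
}
n_indicators = sum(topics.values())

# Flat mean of all indicators: a topic's weight grows with its indicator count.
flat_topic_weight = {t: Fraction(n, n_indicators) for t, n in topics.items()}

# Blended by topic first: every topic gets the same weight.
blended_topic_weight = {t: Fraction(1, len(topics)) for t in topics}

# Weight of a single indicator inside the blended score.
per_indicator_weight = {t: Fraction(1, len(topics) * n) for t, n in topics.items()}

for t in topics:
    print(t, flat_topic_weight[t], blended_topic_weight[t], per_indicator_weight[t])
```

Under a flat mean the four ACT indicators together would carry 4/14 of the score. After blending, the ACT topic carries 1/6, so each ACT indicator counts for 1/24 while College Enrollment alone counts for 1/6.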

In [457]:
# Take the mean of all the scaled ACT attributes
actCols = ['ACT Composite Score','ACT Score',
           'Pct Met ACT WorkKeys','Pct Met The ACT']
eaScoreData['ACT EA Score'] = eaScoreData[actCols].mean(axis='columns')

**Check that the new score performs as expected:**

In [458]:
eaScoreData[actCols + ['ACT EA Score']]

Unnamed: 0,ACT Composite Score,ACT Score,Pct Met ACT WorkKeys,Pct Met The ACT,ACT EA Score
4,24.287343,50.451467,100.0,83.3,64.509703
5,5.701254,28.668172,0.0,0.0,8.592356
6,1.254276,13.205418,100.0,100.0,53.614923
8,18.814139,52.48307,100.0,50.0,55.324302
9,17.673888,42.099323,100.0,20.0,44.943303
10,27.708096,58.239278,100.0,80.0,66.486843
14,18.700114,50.451467,100.0,100.0,67.287895
16,13.68301,41.986456,100.0,100.0,63.917367
19,2.280502,15.349887,100.0,100.0,54.407597
21,25.199544,53.498871,66.7,100.0,61.349604


In [459]:
# Remove the ACT attributes now that we have a blended score 
eaScoreData.drop(actCols, axis=1, inplace=True)

## Reverse the Direction of Scores when Lower is Better
**The following EA performance indicators should be minimized. In each case we subtract the score from 100 to ensure that all scores move in the same direction:**
 * Chronic Absence 
 * Crime Rate Per 1000
 * Short Term Susp Per 1000 
 * Long Term Susp Per 1000 
 * Expulsion Per 1000
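
The reversal can also be written as a loop over a mapping from each source column to its score column. This is a minimal sketch on made-up scaled values; the `scaled` frame and the `reversed_names` mapping are assumptions for illustration, not part of the notebook's code.

```python
import pandas as pd

# Hypothetical scaled values (0-100) for two lower-is-better indicators.
scaled = pd.DataFrame(
    {
        "Chronic Absence": [47.5, 56.2, 80.0],
        "Expulsion Per 1000": [0.0, 0.0, 100.0],
    }
)

# Illustrative mapping: source column -> reversed score column.
reversed_names = {
    "Chronic Absence": "Chronic Absence Score",
    "Expulsion Per 1000": "Expulsion Score",
}

for source_col, score_col in reversed_names.items():
    # Subtracting from 100 flips the direction while keeping the 0-100 range.
    scaled[score_col] = 100 - scaled[source_col]

print(scaled)
```

After the flip, the campus with the highest scaled chronic absence has the lowest Chronic Absence Score, and each source and score pair sums to 100.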

In [460]:
# Reverse the scores
revScoreCols = ['Chronic Absence','Crime Rate Per 1000','Short Term Susp Per 1000',
                'Long Term Susp Per 1000','Expulsion Per 1000']

eaScoreData['Chronic Absence Score'] = 100 - eaScoreData['Chronic Absence']
eaScoreData['Crime Rate Score'] = 100 - eaScoreData['Crime Rate Per 1000']
eaScoreData['Short Term Susp Score'] = 100 - eaScoreData['Short Term Susp Per 1000']
eaScoreData['Long Term Susp Score'] = 100 - eaScoreData['Long Term Susp Per 1000']
eaScoreData['Expulsion Score'] = 100 - eaScoreData['Expulsion Per 1000']

**Check that the new scores perform as expected:**

In [461]:
# Look at the unscaled values
eaScoreDataBefore['Chronic Absence']

4      0.2921
5      0.3446
6      0.4881
8      0.3335
9      0.4135
10     0.2495
14     0.1739
16     0.1551
19     0.4971
21     0.1218
23     0.1889
31     0.1649
32     0.2355
33     0.1862
36     0.3777
37     0.1469
38     0.1277
43     0.3324
44     0.2616
45     0.3972
47     0.1370
50     0.1239
51     0.1530
52     0.1170
53     0.1651
54     0.1557
55     0.0652
56     0.2574
57     0.0647
62     0.1643
63     0.1731
64     0.1341
66     0.0876
67     0.1042
68     0.1575
69     0.2650
70     0.2693
72     0.2309
73     0.2120
74     0.1028
78     0.2948
81     0.0884
84     0.2366
85     0.2616
86     0.2816
87     0.1630
90     0.2235
91     0.2182
93     0.2416
94     0.2738
95     0.1586
96     0.2002
99     0.1435
100    0.2830
101    0.2416
103    0.3405
105    0.3084
108    0.1287
109    0.1428
110    0.1574
117    0.2059
119    0.1824
120    0.1850
121    0.2604
123    0.3150
125    0.2017
127    0.2555
129    0.3723
130    0.2223
132    0.3268
134    0.1925
138   

In [462]:
#Look at the 0 to 100 scaled value and the scaled value reversed
eaScoreData[['Chronic Absence Score','Chronic Absence']]

Unnamed: 0,Chronic Absence Score,Chronic Absence
4,52.518224,47.481776
5,43.820411,56.179589
6,20.046388,79.953612
8,45.659377,54.340623
9,32.405567,67.594433
10,59.575878,40.424122
14,72.100729,27.899271
16,75.215374,24.784626
19,18.555335,81.444665
21,80.732273,19.267727


In [463]:
# Look at the unscaled values
eaScoreDataBefore['Short Term Susp Per 1000']

4        80.605
5       174.387
6       175.966
8       125.000
9       201.054
10      100.179
14       88.235
16       66.826
19     1052.174
21       68.639
23       90.909
31      253.807
32      142.857
33      244.470
36     1007.952
37      154.818
38      118.943
43      110.803
44      108.028
45      123.734
47      146.199
50      141.873
51      245.300
52      190.979
53      155.359
54      174.363
55       10.444
56      168.563
57       88.235
62      164.319
63       72.056
64      120.861
66       40.741
67       95.572
68      146.875
69      252.508
70      194.467
72      148.810
73      205.813
74       82.640
78      229.100
81       17.699
84      147.929
85       77.035
86      143.662
87       79.137
90      183.633
91      197.287
93      253.690
94      257.493
95      118.990
96      100.719
99      239.954
100     181.373
101     180.776
103     227.362
105     213.450
108     111.386
109     334.554
110     116.618
117      31.847
119     154.386
120     

In [464]:
#Look at the 0 to 100 scaled value and the scaled value reversed
eaScoreData[['Short Term Susp Score','Short Term Susp Per 1000']]

Unnamed: 0,Short Term Susp Score,Short Term Susp Per 1000
4,93.922912,6.077088
5,86.32892,13.67108
6,86.201061,13.798939
8,90.328029,9.671971
9,84.169562,15.830438
10,92.337908,7.662092
14,93.305073,6.694927
16,95.038666,4.961334
19,15.250172,84.749828
21,94.891858,5.108142


In [465]:
# Remove the original lower-is-better columns now that we have reversed scores
eaScoreData.drop(revScoreCols, axis=1, inplace=True)

## Create a Blended Discipline Score  
* This keeps multiple discipline attributes from overshadowing the other EA performance indicators in the final score.

In [466]:
# Take the mean of all the scaled and reversed discipline attributes
disiplineCols = ['Crime Rate Score','Short Term Susp Score',
                'Long Term Susp Score','Expulsion Score']
eaScoreData['Disipline Score'] = eaScoreData[disiplineCols].mean(axis='columns')

**Check that the new score performs as expected:**

In [467]:
eaScoreData[disiplineCols + ['Disipline Score']]

Unnamed: 0,Crime Rate Score,Short Term Susp Score,Long Term Susp Score,Expulsion Score,Disipline Score
4,69.597047,93.922912,95.642471,100.0,89.790607
5,73.882816,86.32892,92.929136,100.0,88.285218
6,68.004234,86.201061,94.431123,100.0,87.159105
8,65.007114,90.328029,100.0,100.0,88.833786
9,86.909671,84.169562,95.443175,100.0,91.630602
10,79.042949,92.337908,100.0,100.0,92.845214
14,74.941212,93.305073,100.0,100.0,92.061571
16,33.914042,95.038666,100.0,100.0,82.238177
19,78.391383,15.250172,32.304155,100.0,56.486428
21,74.792964,94.891858,100.0,100.0,92.421206


In [468]:
# Remove the individual attributes now that we have a blended score 
eaScoreData.drop(disiplineCols, axis=1, inplace=True)

## Create a Blended Attendance Score  
* This keeps multiple attendance attributes from overshadowing the other EA performance indicators in the final score.

In [469]:
# Take the mean of the scaled attendance attributes
absCols = ['Avg Daily Attendance','Chronic Absence Score']
eaScoreData['Attendance Score'] = eaScoreData[absCols].mean(axis='columns')

**Check that the new score performs as expected:**

In [470]:
# Look at the unscaled values
eaScoreDataBefore[['Avg Daily Attendance','Chronic Absence']]

Unnamed: 0,Avg Daily Attendance,Chronic Absence
4,91.94,0.2921
5,91.42,0.3446
6,88.52,0.4881
8,90.64,0.3335
9,88.76,0.4135
10,91.68,0.2495
14,94.54,0.1739
16,95.23,0.1551
19,87.39,0.4971
21,98.33,0.1218


In [471]:
eaScoreData[absCols + ['Attendance Score']]

Unnamed: 0,Avg Daily Attendance,Chronic Absence Score,Attendance Score
4,44.928685,52.518224,48.723454
5,40.808241,43.820411,42.314326
6,17.828843,20.046388,18.937616
8,34.627575,45.659377,40.143476
9,19.730586,32.405567,26.068076
10,42.868463,59.575878,51.22217
14,65.530903,72.100729,68.815816
16,70.998415,75.215374,73.106895
19,8.874802,18.555335,13.715068
21,95.562599,80.732273,88.147436


In [472]:
# Remove the individual attributes now that we have a blended score 
eaScoreData.drop(absCols, axis=1, inplace=True)

## Create a Blended Graduation Score  
* This keeps multiple graduation attributes from overshadowing the other EA performance indicators in the final score.

In [473]:
# Take the mean of the scaled graduation attributes
gradCols = ['4yr Graduation Rate','5yr Graduation Rate']
eaScoreData['Graduation Score'] = eaScoreData[gradCols].mean(axis='columns')

**Check that the new score performs as expected:**

In [474]:
# Look at the unscaled values
eaScoreDataBefore[gradCols]

Unnamed: 0,4yr Graduation Rate,5yr Graduation Rate
4,83.2,86.0
5,68.7,75.3
6,73.6,72.2
8,85.9,81.9
9,88.4,84.4
10,88.4,91.0
14,83.5,85.1
16,89.3,92.0
19,86.2,91.0
21,86.0,89.1


In [475]:
eaScoreData[gradCols + ['Graduation Score']]

Unnamed: 0,4yr Graduation Rate,5yr Graduation Rate,Graduation Score
4,70.050761,73.293769,71.672265
5,33.248731,41.543027,37.395879
6,45.685279,32.344214,39.014746
8,76.903553,61.127596,69.015575
9,83.248731,68.545994,75.897363
10,83.248731,88.130564,85.689647
14,70.812183,70.623145,70.717664
16,85.532995,91.097923,88.315459
19,77.664975,88.130564,82.897769
21,77.15736,82.492582,79.824971


In [476]:
# Remove the individual attributes now that we have a blended score 
eaScoreData.drop(gradCols, axis=1, inplace=True)

## Create a mean EA Performance Score from the scaled and reversed EA indicators

In [477]:
# Get only the columns containing EA Performance scores
eaFinalScoreCols = eaScoreData.columns[(eaScoreData.dtypes.values == np.dtype('float64'))] 
eaScoreData['EA Performance Score'] = eaScoreData[eaFinalScoreCols].mean(axis='columns')

In [478]:
pd.set_option('display.max_rows', 500)
eaScoreData.sort_values('EA Performance Score',ascending=False)

Unnamed: 0,unit_code,Lea_Name,School_Name,county,Pct Enrl College,Spg Score,ACT EA Score,Disipline Score,Attendance Score,Graduation Score,EA Performance Score
298,410569,Guilford County Schools,STEM Early College @ NC A&T SU,Guilford,100.0,87.719298,100.0,100.0,100.0,100.0,97.953216
55,110500,Buncombe County Schools,Nesbitt Discovery Academy,Buncombe,83.147183,100.0,99.344356,99.901049,86.021391,100.0,94.735663
249,360418,Gaston County Schools,Highland Sch of Technology,Gaston,91.172334,100.0,91.847206,96.802942,80.939285,100.0,93.460295
602,900366,Union County Schools,Central Academy of Technology and Arts,Union,91.521254,94.736842,90.779026,95.36023,82.718901,100.0,92.519375
606,900393,Union County Schools,Marvin Ridge High,Union,94.957782,92.982456,68.22014,99.508428,90.963228,100.0,91.105339
421,600508,Charlotte-Mecklenburg County Schools,Providence High,Mecklenburg,87.537696,85.964912,87.110423,95.777431,87.645311,100.0,90.672629
594,900311,Union County Schools,Cuthbertson High,Union,85.222653,89.473684,88.815857,95.997464,84.404127,100.0,90.652298
395,600302,Charlotte-Mecklenburg County Schools,Ardrey Kell High,Mecklenburg,89.27289,89.473684,85.130577,98.700801,80.303625,100.0,90.480263
624,920441,Wake County Schools,Green Hope High,Wake,87.007106,80.701754,87.857913,96.929569,84.180955,100.0,89.446216
604,900377,Union County Schools,Weddington High,Union,73.967334,85.964912,91.091436,95.866169,89.243448,100.0,89.35555
