The school year profile CSVs were downloaded from the [Chicago Data Portal](https://data.cityofchicago.org/).

Preprocessing is performed by the `SchoolYear` class, which is imported from the `src.preprocessing.preprocessing_schoolid` module.

In [8]:
import os, sys
# Build the absolute path to the project root and add it to sys.path so that `src` can be imported
full_path = os.getcwd()
home_folder = 'CPS_GradRate_Analysis'
root = full_path.split(home_folder)[0] + home_folder + '/'
sys.path.append(root)
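
The string split above works as long as the working directory sits somewhere inside the project folder. As an alternative sketch (not the notebook's own code), the root can also be located by walking up the parent directories with `pathlib`. The helper name `find_project_root` is hypothetical and introduced here only for illustration.

```python
from pathlib import Path


def find_project_root(start, home_folder="CPS_GradRate_Analysis"):
    """Return `start` or its nearest ancestor whose name is `home_folder`."""
    start = Path(start)
    for candidate in (start, *start.parents):
        if candidate.name == home_folder:
            return candidate
    # Fail loudly instead of silently building a wrong path
    raise FileNotFoundError(f"{home_folder!r} is not an ancestor of {start}")


# Example with a made-up working directory
root = find_project_root("/home/user/CPS_GradRate_Analysis/notebooks/eda")
print(root.name)
```

In the notebook this could be called as `sys.path.append(str(find_project_root(Path.cwd())))`. Unlike the split, it picks the nearest matching ancestor and raises an error when the folder is not found.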

In [9]:
%load_ext autoreload
%autoreload 2

from src.preprocessing.preprocessing_schoolid import SchoolYear


The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


The SchoolYear class represents data from a CPS school gathered during an individual school year.  

The SchoolYear class takes two arguments: 
  - a path to a School Profile CSV from a given year.
  - a path to a Progress Report CSV from the same year.
  
These files can be obtained from the Chicago Data Portal.


  - [2016-2017 Profile](https://data.cityofchicago.org/Education/Chicago-Public-Schools-School-Profile-Information-/8i6r-et8s)
  - [2017-2018 Profile](https://data.cityofchicago.org/Education/Chicago-Public-Schools-School-Profile-Information-/w4qj-h7bg)
  - [2018-2019 Profile](https://data.cityofchicago.org/Education/Chicago-Public-Schools-School-Profile-Information-/kh4r-387c)

Files should be downloaded and placed in the `data/chicago_data_portal_csv_files` folder.

In [11]:
# After downloading the CSVs, instantiate a SchoolYear object (profile path first, then progress report path)
path_to_pr_1819 = '../data/chicago_data_portal_csv_files/Chicago_Public_Schools_-_School_Progress_Reports_SY1819.csv'
path_to_sp_1819 = '../data/chicago_data_portal_csv_files/Chicago_Public_Schools_-_School_Profile_Information_SY1819.csv'
sy_1819 = SchoolYear(path_to_sp_1819, path_to_pr_1819)

The original data has been converted into dataframes, which can be accessed by the `sp_df` and `pr_df` attributes.

In [13]:
sy_1819.sp_df.sample()

Unnamed: 0,School_ID,Legacy_Unit_ID,Finance_ID,Short_Name,Long_Name,Primary_Category,Is_High_School,Is_Middle_School,Is_Elementary_School,Is_Pre_School,Summary,Administrator_Title,Administrator,Secondary_Contact_Title,Secondary_Contact,Address,City,State,Zip,Phone,Fax,CPS_School_Profile,Website,Facebook,Twitter,Youtube,Pinterest,Attendance_Boundaries,Grades_Offered_All,Grades_Offered,Student_Count_Total,Student_Count_Low_Income,Student_Count_Special_Ed,Student_Count_English_Learners,Student_Count_Black,Student_Count_Hispanic,Student_Count_White,Student_Count_Asian,Student_Count_Native_American,Student_Count_Other_Ethnicity,Student_Count_Asian_Pacific_Islander,Student_Count_Multi,Student_Count_Hawaiian_Pacific_Islander,Student_Count_Ethnicity_Not_Available,Statistics_Description,Demographic_Description,Dress_Code,PreK_School_Day,Kindergarten_School_Day,School_Hours,Freshman_Start_End_Time,After_School_Hours,Earliest_Drop_Off_Time,Classroom_Languages,Bilingual_Services,Refugee_Services,Title_1_Eligible,PreSchool_Inclusive,Preschool_Instructional,Significantly_Modified,Hard_Of_Hearing,Visual_Impairments,Transportation_Bus,Transportation_El,Transportation_Metra,School_Latitude,School_Longitude,Average_ACT_School,Mean_ACT,College_Enrollment_Rate_School,College_Enrollment_Rate_Mean,Graduation_Rate_School,Graduation_Rate_Mean,Overall_Rating,Rating_Status,Rating_Statement,Classification_Description,School_Year,Third_Contact_Title,Third_Contact_Name,Fourth_Contact_Title,Fourth_Contact_Name,Fifth_Contact_Title,Fifth_Contact_Name,Sixth_Contact_Title,Sixth_Contact_Name,Seventh_Contact_Title,Seventh_Contact_Name,Network,Is_GoCPS_Participant,Is_GoCPS_PreK,Is_GoCPS_Elementary,Is_GoCPS_High_School,Open_For_Enrollment_Date,Closed_For_Enrollment_Date
423,609759,1840,51091,CLEMENTE HS,Roberto Clemente Community Academy High School,HS,True,False,False,False,Roberto Clemente Community Academy is a wall-t...,Principal,Fernando S Mojica,Assistant Principal,Amber Henderson,1147 N WESTERN AVE,Chicago,Illinois,60622,7735344000.0,7735344000.0,http://cps.edu/Schools/Pages/school.aspx?Schoo...,http://www.rccachicago.org,http://www.facebook.com/rccachicago,http://twitter.com/rccachicago,,,True,9101112,9-12,685,595,168,78,199,465,12,2,3,0,0,3,1,0,There are 685 students enrolled at CLEMENTE HS...,The largest demographic at CLEMENTE HS is Hisp...,True,,,8:00 AM - 3:15 PM,,,7:30 AM,"Spanish, Spanish for Heritage Speakers",True,False,True,,,,,,"49, 70","Blue, Red",,41.902626,-87.686906,,,68.1,68.2,70.0,78.2,Level 2,PROVISIONAL SUPPORT,"This school received a Level 2 rating, which i...",Schools that have an attendance boundary. Gene...,School Year 2018-2019,Assistant Principal,Brad Rossi,IB Coordinator,Andrea Kulas,Case Manager,Thomas Keddy,Spanish-speaking contact,Norma Guzman,,,Network 15,True,False,False,True,09/01/2004 12:00:00 AM,


In [15]:
# For the 2018/2019 school year, there are 660 records and 95 columns in the school profile csv: 
print(sy_1819.sp_df.shape)

(660, 95)


In [17]:
# For the 2018/2019 school year, there are 654 records and 182 columns in the progress report csv: 
print(sy_1819.pr_df.shape)

(654, 182)


The `merge_pr_and_sp` method merges the Progress Report (`pr_df`) and School Profile (`sp_df`) dataframes on `School_ID`.
It is called in the object's `__init__` method. Two copies of the result are stored as attributes:
   - `_original_merged_df`: a copy that is never changed and is used for reference only.
   - `merged_df`: a dataframe that will be altered during preprocessing.
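
The merge described above can be sketched on toy data. This is a minimal illustration, not the actual `merge_pr_and_sp` implementation: the `_sp` and `_pr` suffixes match the column names visible in the merged output, but the `how="inner"` join type and the toy values are assumptions.

```python
import pandas as pd

# Toy stand-ins for the School Profile and Progress Report dataframes
sp_df = pd.DataFrame({
    "School_ID": [1, 2, 3],
    "Short_Name": ["A", "B", "C"],
    "Student_Count_Total": [100, 200, 300],
})
pr_df = pd.DataFrame({
    "School_ID": [2, 3, 4],
    "Short_Name": ["B", "C", "D"],
    "School_Type": ["Charter", "Neighborhood", "Magnet"],
})

# Columns present in both frames (other than the key) receive a suffix.
# Assumption: an inner join, which keeps only schools found in both files.
merged_df = sp_df.merge(pr_df, on="School_ID", how="inner", suffixes=("_sp", "_pr"))

# Keep an untouched copy for reference, as the class is described as doing
original_merged_df = merged_df.copy()

print(merged_df.shape)
print(list(merged_df.columns))
```

With a single shared key, the merged column count is the sum of both column counts minus one, which is why two 3-column frames give 5 columns here.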

In [20]:
sy_1819.merged_df.sample()

Unnamed: 0,School_ID,Legacy_Unit_ID,Finance_ID,Short_Name_sp,Long_Name_sp,Primary_Category_sp,Is_High_School,Is_Middle_School,Is_Elementary_School,Is_Pre_School,Summary,Administrator_Title,Administrator,Secondary_Contact_Title,Secondary_Contact,Address_sp,City_sp,State_sp,Zip_sp,Phone_sp,Fax_sp,CPS_School_Profile_sp,Website_sp,Facebook,Twitter,Youtube,Pinterest,Attendance_Boundaries,Grades_Offered_All,Grades_Offered,Student_Count_Total,Student_Count_Low_Income,Student_Count_Special_Ed,Student_Count_English_Learners,Student_Count_Black,Student_Count_Hispanic,Student_Count_White,Student_Count_Asian,Student_Count_Native_American,Student_Count_Other_Ethnicity,Student_Count_Asian_Pacific_Islander,Student_Count_Multi,Student_Count_Hawaiian_Pacific_Islander,Student_Count_Ethnicity_Not_Available,Statistics_Description,Demographic_Description,Dress_Code,PreK_School_Day,Kindergarten_School_Day,School_Hours,Freshman_Start_End_Time,After_School_Hours,Earliest_Drop_Off_Time,Classroom_Languages,Bilingual_Services,Refugee_Services,Title_1_Eligible,PreSchool_Inclusive,Preschool_Instructional,Significantly_Modified,Hard_Of_Hearing,Visual_Impairments,Transportation_Bus,Transportation_El,Transportation_Metra,School_Latitude_sp,School_Longitude_sp,Average_ACT_School,Mean_ACT,College_Enrollment_Rate_School,College_Enrollment_Rate_Mean,Graduation_Rate_School,Graduation_Rate_Mean,Overall_Rating,Rating_Status,Rating_Statement,Classification_Description,School_Year,Third_Contact_Title,Third_Contact_Name,Fourth_Contact_Title,Fourth_Contact_Name,Fifth_Contact_Title,Fifth_Contact_Name,Sixth_Contact_Title,Sixth_Contact_Name,Seventh_Contact_Title,Seventh_Contact_Name,Network,Is_GoCPS_Participant,Is_GoCPS_PreK,Is_GoCPS_Elementary,Is_GoCPS_High_School,Open_For_Enrollment_Date,Closed_For_Enrollment_Date,Short_Name_pr,Long_Name_pr,School_Type,Primary_Category_pr,Address_pr,City_pr,State_pr,Zip_pr,Phone_pr,Fax_pr,CPS_School_Profile_pr,Website_pr,Progress_Report_Year,Blue_Ribbon_Award_Year,Excelerate_Award_Gold_Year,Spot_Light_Award_Year,Improvement_Award_Year,Excellence_Award_Year,Student_Growth_Rating,Student_Growth_Description,Growth_Reading_Grades_Tested_Pct_ES,Growth_Reading_Grades_Tested_Label_ES,Growth_Math_Grades_Tested_Pct_ES,Growth_Math_Grades_Tested_Label_ES,Student_Attainment_Rating,Student_Attainment_Description,Attainment_Reading_Pct_ES,Attainment_Reading_Lbl_ES,Attainment_Math_Pct_ES,Attainment_Math_Lbl_ES,Culture_Climate_Rating,Culture_Climate_Description,School_Survey_Student_Response_Rate_Pct,School_Survey_Student_Response_Rate_Avg_Pct,School_Survey_Teacher_Response_Rate_Pct,School_Survey_Teacher_Response_Rate_Avg_Pct,School_Survey_Parent_Response_Rate_Pct,School_Survey_Parent_Response_Rate_Avg_Pct,Healthy_School_Certification,Healthy_School_Certification_Description,Creative_School_Certification,Creative_School_Certification_Description,NWEA_Reading_Growth_Grade_3_Pct,NWEA_Reading_Growth_Grade_3_Lbl,NWEA_Reading_Growth_Grade_4_Pct,NWEA_Reading_Growth_Grade_4_Lbl,NWEA_Reading_Growth_Grade_5_Pct,NWEA_Reading_Growth_Grade_5_Lbl,NWEA_Reading_Growth_Grade_6_Pct,NWEA_Reading_Growth_Grade_6_Lbl,NWEA_Reading_Growth_Grade_7_Pct,NWEA_Reading_Growth_Grade_7_Lbl,NWEA_Reading_Growth_Grade_8_Pct,NWEA_Reading_Growth_Grade_8_Lbl,NWEA_Math_Growth_Grade_3_Pct,NWEA_Math_Growth_Grade_3_Lbl,NWEA_Math_Growth_Grade_4_Pct,NWEA_Math_Growth_Grade_4_Lbl,NWEA_Math_Growth_Grade_5_Pct,NWEA_Math_Growth_Grade_5_Lbl,NWEA_Math_Growth_Grade_6_Pct,NWEA_Math_Growth_Grade_6_Lbl,NWEA_Math_Growth_Grade_7_Pct,NWEA_Math_Growth_Grade_7_Lbl,NWEA_Math_Growth_Grade_8_Pct,NWEA_Math_Growth_Grade_8_Lbl,NWEA_Reading_Attainment_Grade_2_Pct,NWEA_Reading_Attainment_Grade_2_Lbl,NWEA_Reading_Attainment_Grade_3_Pct,NWEA_Reading_Attainment_Grade_3_Lbl,NWEA_Reading_Attainment_Grade_4_Pct,NWEA_Reading_Attainment_Grade_4_Lbl,NWEA_Reading_Attainment_Grade_5_Pct,NWEA_Reading_Attainment_Grade_5_Lbl,NWEA_Reading_Attainment_Grade_6_Pct,NWEA_Reading_Attainment_Grade_6_Lbl,NWEA_Reading_Attainment_Grade_7_Pct,NWEA_Reading_Attainment_Grade_7_Lbl,NWEA_Reading_Attainment_Grade_8_Pct,NWEA_Reading_Attainment_Grade_8_Lbl,NWEA_Math_Attainment_Grade_2_Pct,NWEA_Math_Attainment_Grade_2_Lbl,NWEA_Math_Attainment_Grade_3_Pct,NWEA_Math_Attainment_Grade_3_Lbl,NWEA_Math_Attainment_Grade_4_Pct,NWEA_Math_Attainment_Grade_4_Lbl,NWEA_Math_Attainment_Grade_5_Pct,NWEA_Math_Attainment_Grade_5_Lbl,NWEA_Math_Attainment_Grade_6_Pct,NWEA_Math_Attainment_Grade_6_Lbl,NWEA_Math_Attainment_Grade_7_Pct,NWEA_Math_Attainment_Grade_7_Lbl,NWEA_Math_Attainment_Grade_8_Pct,NWEA_Math_Attainment_Grade_8_Lbl,School_Survey_Involved_Families,School_Survey_Supportive_Environment,School_Survey_Ambitious_Instruction,School_Survey_Effective_Leaders,School_Survey_Collaborative_Teachers,School_Survey_Safety,Suspensions_Per_100_Students_Year_1_Pct,Suspensions_Per_100_Students_Year_2_Pct,Suspensions_Per_100_Students_Avg_Pct,Misconducts_To_Suspensions_Year_1_Pct,Misconducts_To_Suspensions_Year_2_Pct,Misconducts_To_Suspensions_Avg_Pct,Average_Length_Suspension_Year_1_Pct,Average_Length_Suspension_Year_2_Pct,Average_Length_Suspension_Avg_Pct,Behavior_Discipline_Year_1,Behavior_Discipline_Year_2,School_Survey_School_Community,School_Survey_Parent_Teacher_Partnership,School_Survey_Quality_Of_Facilities,Student_Attendance_Year_1_Pct,Student_Attendance_Year_2_Pct,Student_Attendance_Avg_Pct,Teacher_Attendance_Year_1_Pct,Teacher_Attendance_Year_2_Pct,Teacher_Attendance_Avg_Pct,One_Year_Dropout_Rate_Year_1_Pct,One_Year_Dropout_Rate_Year_2_Pct,One_Year_Dropout_Rate_Avg_Pct,Other_Metrics_Year_1,Other_Metrics_Year_2,Freshmen_On_Track_School_Pct_Year_2,Freshmen_On_Track_CPS_Pct_Year_2,Freshmen_On_Track_School_Pct_Year_1,Freshmen_On_Track_CPS_Pct_Year_1,Graduation_4_Year_School_Pct_Year_2,Graduation_4_Year_CPS_Pct_Year_2,Graduation_4_Year_School_Pct_Year_1,Graduation_4_Year_CPS_Pct_Year_1,Graduation_5_Year_School_Pct_Year_2,Graduation_5_Year_CPS_Pct_Year_2,Graduation_5_Year_School_Pct_Year_1,Graduation_5_Year_CPS_Pct_Year_1,College_Enrollment_School_Pct_Year_2,College_Enrollment_CPS_Pct_Year_2,College_Enrollment_School_Pct_Year_1,College_Enrollment_CPS_Pct_Year_1,College_Persistence_School_Pct_Year_2,College_Persistence_CPS_Pct_Year_2,College_Persistence_School_Pct_Year_1,College_Persistence_CPS_Pct_Year_1,Progress_Toward_Graduation_Year_1,Progress_Toward_Graduation_Year_2,State_School_Report_Card_URL,Mobility_Rate_Pct,Chronic_Truancy_Pct,Empty_Progress_Report_Message,School_Survey_Rating_Description,Supportive_School_Award,Supportive_School_Award_Desc,Parent_Survey_Results_Year,School_Latitude_pr,School_Longitude_pr,PSAT_Grade_9_Score_School_Avg,PSAT_Grade_10_Score_School_Avg,SAT_Grade_11_Score_School_Avg,SAT_Grade_11_Score_CPS_Avg,Growth_PSAT_Grade_9_School_Pct,Growth_PSAT_Grade_9_School_Lbl,Growth_PSAT_Reading_Grade_10_School_Pct,Growth_PSAT_Reading_Grade_10_School_Lbl,Growth_SAT_Grade_11_School_Pct,Growth_SAT_Grade_11_School_Lbl,Attainment_PSAT_Grade_9_School_Pct,Attainment_PSAT_Grade_9_School_Lbl,Attainment_PSAT_Grade_10_School_Pct,Attainment_PSAT_Grade_10_School_Lbl,Attainment_SAT_Grade_11_School_Pct,Attainment_SAT_Grade_11_School_Lbl,Attainment_All_Grades_School_Pct,Attainment_All_Grades_School_Lbl,Growth_PSAT_Math_Grade_10_School_Pct,Growth_PSAT_Math_Grade_10_School_Lbl,Growth_SAT_Reading_Grade_11_School_Pct,Growth_SAT_Reading_Grade_11_School_Lbl,Growth_SAT_Math_Grade_11_School_Pct,Growth_SAT_Math_Grade_11_School_Lbl
268,610237,6540,25931,BEETHOVEN,Ludwig Van Beethoven Elementary School,ES,False,True,True,True,Beethoven Elementary School services the stude...,Principal,Mellodie L Brown,Assistant Principal,Mrs. LaVerne Wright,25 W 47TH ST,Chicago,Illinois,60609,7735351000.0,7735351000.0,http://cps.edu/Schools/Pages/school.aspx?Schoo...,http://beethovenschool.weebly.com/,,,,,True,"PK,K,1,2,3,4,5,6,7,8","PK,K-8",315,257,49,0,309,5,1,0,0,0,0,0,0,0,There are 315 students enrolled at BEETHOVEN. ...,The largest demographic at BEETHOVEN is Black....,True,Full Day,Full Day,8:45 AM - 3:45 PM,,3:45 PM -5:30 PM,8:35 AM,,True,,True,,Y,,,,47,Red,,41.809135,-87.627137,,,,68.2,,78.2,Level 2,INTENSIVE SUPPORT,"This school received a Level 2 rating, which i...",Schools that have an attendance boundary. Gene...,School Year 2018-2019,Counselor,Ms. Devona Hazelwood,,,Clerk,Mr. Gustavo Del Real,,,,,Network 9,True,False,True,False,09/01/2004 12:00:00 AM,,BEETHOVEN,Ludwig Van Beethoven Elementary School,Neighborhood,ES,25 W 47TH ST,Chicago,Illinois,60609,7735351000.0,7735351000.0,http://cps.edu/Schools/Pages/school.aspx?Schoo...,https://www.beethovenelementarycps.org/,2018,,2018.0,,,,AVERAGE,Student Growth measures the change in standard...,60.0,60th,37.0,37th,BELOW AVERAGE,Student Attainment measures how well the schoo...,22.0,22nd,8.0,8th,PARTIALLY ORGANIZED,Results are based on student and teacher respo...,63.9,81.4,77.8,79.9,38%,35.6,Not Achieved,Students learn better at healthy schools! This...,EXCELLING,This school is Excelling in the arts. 
It meets...,99.0,99th,57.0,57th,6.0,6th,76.0,76th,71.0,71st,62.0,62nd,43.0,43rd,22.0,22nd,7.0,7th,85.0,85th,41.0,41st,78.0,78th,7.0,7th,16.0,16th,22.0,22nd,30.0,30th,37.0,37th,17.0,17th,32.0,32nd,2.0,2nd,3.0,3rd,6.0,6th,5.0,5th,18.0,18th,3.0,3rd,21.0,21st,STRONG,WEAK,STRONG,WEAK,WEAK,WEAK,8.8,0.3,5.6,16.3,1.8,13.5,1.7 days,1.0 days,2.0 days,2018.0,2017.0,WEAK,STRONG,NEUTRAL,91.3,92.6,93.3,92.1,93.6,95.0,,,6.4,2017.0,2018.0,,89.4,,88.7,,75.6,,74.7,,78.2,,77.5,,68.2,,59.8,,72.3,,71.9,2017.0,2018.0,http://iirc.niu.edu/School.aspx?schoolid=15016...,27.0,61.9,,This school is “Partially Organized for Improv...,EMERGING,This school has developed an action plan to su...,2018.0,41.809135,-87.627137,,,,969.0,,,,,,,,,,,,,,,,,,,,


In [22]:
# The merged dataframe has 651 rows and 276 columns (95 + 182, minus the shared School_ID key):
sy_1819.merged_df.shape

(651, 276)

# Isolate Important Columns



The preprocessing function `isolate_important_columns` reduces the number of columns in each dataset from 92 to 20.

In [5]:
from src.preprocessing.preprocessing import isolate_important_columns

df_dict = {year: isolate_important_columns(df_dict[year]) for year in df_dict}
df_dict['2017-2018']

Unnamed: 0,School_ID,Short_Name,Graduation_Rate_School,Student_Count_Total,Student_Count_Low_Income,Student_Count_Special_Ed,Student_Count_English_Learners,Student_Count_Black,Student_Count_Hispanic,Student_Count_White,Student_Count_Asian,Student_Count_Native_American,Student_Count_Other_Ethnicity,Student_Count_Asian_Pacific_Islander,Student_Count_Multi,Student_Count_Hawaiian_Pacific_Islander,Student_Count_Ethnicity_Not_Available,Is_High_School,Dress_Code,Classroom_Languages,Transportation_El
0,610521,DAVIS M,,237,227,45,10,216,20,0,0,0,0,0,1,0,0,N,Y,,
1,609750,SIMPSON HS,23.1,34,25,3,3,25,9,0,0,0,0,0,0,0,0,Y,N,Spanish,Pink
2,610386,PEACE AND EDUCATION HS,,94,58,16,13,29,62,2,0,0,0,0,1,0,0,Y,Y,"French, Spanish",
3,400123,YCCS - SCHOLASTIC ACHIEVEMENT,,172,77,33,1,165,6,0,0,1,0,0,0,0,0,Y,Y,,Green
4,400116,MONTESSORI ENGLEWOOD,,333,191,48,2,323,9,1,0,0,0,0,0,0,0,N,N,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
656,610030,KOZMINSKI,,257,133,27,5,241,8,1,0,2,0,0,4,1,0,N,Y,,
657,610197,TALCOTT,,473,267,81,196,34,386,34,2,3,0,0,13,1,0,N,Y,Spanish,Orange
658,610084,KELLER,,227,56,2,5,81,25,52,60,0,0,0,6,2,1,N,N,Spanish,Orange
659,609711,HARPER HS,55.9,87,73,23,1,84,3,0,0,0,0,0,0,0,0,Y,Y,Spanish,Blue


After this reduction, the following columns are left:

  - School_ID
  - Graduation_Rate_School
  - Student_Count_Total
  - Student_Count_Low_Income
  - Student_Count_Special_Ed
  - Student_Count_English_Learners
  - 10 columns counting students by ethnicity
  - **Is_High_School**
  - **Dress_Code**
  - Classroom_Languages
  - Transportation_El
  
The bolded columns require preprocessing, which is shown below.
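
A column reduction like the one above can be sketched as plain column selection. This is an illustration under assumptions, not the source of `isolate_important_columns`: the helper name `select_columns` is hypothetical, the column list is abbreviated, and `Long_Name` and `Fax` simply stand in for the dropped columns.

```python
import pandas as pd

# Abbreviated, illustrative subset of the columns kept after the reduction
IMPORTANT_COLUMNS = [
    "School_ID",
    "Graduation_Rate_School",
    "Student_Count_Total",
    "Is_High_School",
    "Dress_Code",
]


def select_columns(df, columns):
    """Return a copy of `df` restricted to `columns`, failing on missing names."""
    missing = [c for c in columns if c not in df.columns]
    if missing:
        raise KeyError(f"columns not found: {missing}")
    return df[columns].copy()


# Toy frame with two extra columns that should be dropped
toy = pd.DataFrame({
    "School_ID": [609750, 609711],
    "Long_Name": ["(dropped)", "(dropped)"],
    "Graduation_Rate_School": [23.1, 55.9],
    "Student_Count_Total": [34, 87],
    "Is_High_School": ["Y", "Y"],
    "Dress_Code": ["N", "Y"],
    "Fax": [None, None],
})
reduced = select_columns(toy, IMPORTANT_COLUMNS)
print(reduced.shape)
```

Raising on missing columns is a deliberate choice in this sketch: schemas differ between school years, and a silent skip would hide that.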

# Is_High_School

The school profiles for 2016-2017 and 2017-2018 encode `Is_High_School` as 'Y/N', whereas 2018-2019 encodes it as 'True/False'.  

The function below converts Y/N to True/False to ensure consistency.

In [6]:
from src.preprocessing.preprocessing import convert_is_high_school_to_bool

df_dict = {year: convert_is_high_school_to_bool(df_dict[year]) for year in df_dict}
df_dict['2016-2017']['Is_High_School']

0      False
1       True
2       True
3       True
4      False
       ...  
656    False
657     True
658     True
659     True
660    False
Name: Is_High_School, Length: 661, dtype: bool
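
A conversion of this kind could be sketched as follows. This is not the source of `convert_is_high_school_to_bool`. The helper `to_bool_column` is a hypothetical name, and it assumes the column holds only 'Y'/'N' strings or values that are already boolean.

```python
import pandas as pd

# Accept both encodings so the same helper works for every school year
YN_TO_BOOL = {"Y": True, "N": False, True: True, False: False}


def to_bool_column(df, column):
    """Return a copy of `df` with `column` converted to bool dtype.

    Any value outside YN_TO_BOOL (including missing values) raises KeyError.
    """
    out = df.copy()
    out[column] = out[column].map(lambda value: YN_TO_BOOL[value]).astype(bool)
    return out


df_yn = pd.DataFrame({"Is_High_School": ["N", "Y", "Y"]})            # 'Y/N' encoding
df_bool = pd.DataFrame({"Is_High_School": [True, False, True]})      # boolean encoding

converted_yn = to_bool_column(df_yn, "Is_High_School")
converted_bool = to_bool_column(df_bool, "Is_High_School")
print(converted_yn["Is_High_School"].dtype)
```

Because the helper takes the column name as a parameter, the same sketch would also cover a 'Y/N' column such as `Dress_Code`.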

# Dress_Code

The same conversion is applied to the `Dress_Code` column.

In [7]:
from src.preprocessing.preprocessing import convert_dress_code_to_bool

df_dict = {year: convert_dress_code_to_bool(df_dict[year]) for year in df_dict}
df_dict['2016-2017']['Dress_Code']

0      False
1       True
2      False
3       True
4       True
       ...  
656     True
657    False
658     True
659     True
660     True
Name: Dress_Code, Length: 661, dtype: bool

In [8]:
# Add Year column to dataframes
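
The cell above contains only a comment. One way the step might be done is sketched below on toy data. The column name `Year` and the follow-up concatenation are assumptions, not something the notebook shows.

```python
import pandas as pd

# Toy per-year frames keyed the same way as df_dict in this notebook
df_dict = {
    "2017-2018": pd.DataFrame({"School_ID": [609750, 609711],
                               "Graduation_Rate_School": [23.1, 55.9]}),
    "2018-2019": pd.DataFrame({"School_ID": [400022, 610383],
                               "Graduation_Rate_School": [84.4, 81.0]}),
}

# Tag every row with the school year taken from its dictionary key
df_dict = {year: df.assign(Year=year) for year, df in df_dict.items()}

# With the year recorded per row, the frames can be stacked into one table
combined = pd.concat(list(df_dict.values()), ignore_index=True)
print(combined)
```

`DataFrame.assign` returns a new frame, so the original per-year dataframes are not modified in place.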

In [11]:
df_dict['2018-2019']

Unnamed: 0,School_ID,Short_Name,Graduation_Rate_School,Student_Count_Total,Student_Count_Low_Income,Student_Count_Special_Ed,Student_Count_English_Learners,Student_Count_Black,Student_Count_Hispanic,Student_Count_White,Student_Count_Asian,Student_Count_Native_American,Student_Count_Other_Ethnicity,Student_Count_Asian_Pacific_Islander,Student_Count_Multi,Student_Count_Hawaiian_Pacific_Islander,Student_Count_Ethnicity_Not_Available,Is_High_School,Dress_Code,Classroom_Languages,Transportation_El
0,400172,ASPIRA - BUSINESS & FINANCE HS,,633,414,130,195,17,597,10,4,1,0,0,4,0,0,True,True,"Spanish, Spanish for Heritage Speakers",Blue
1,609794,EDISON,,267,22,10,1,11,22,160,43,1,0,0,29,1,0,False,False,French,Brown
2,609780,MARINE LEADERSHIP AT AMES HS,,847,825,79,158,17,817,8,2,1,0,0,2,0,0,True,True,Spanish,
3,400039,ERIE,,415,325,69,197,56,342,7,1,2,0,0,6,1,0,False,True,Spanish,
4,610590,BRONZEVILLE CLASSICAL,,90,24,2,1,55,6,8,16,0,0,0,5,0,0,False,False,,"Green, Red"
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
655,400022,CHIARTS HS,84.4,606,220,38,5,205,214,120,11,2,0,0,15,1,38,True,False,"French, Spanish","Blue, Red"
656,610383,SOCIAL JUSTICE HS,81.0,304,290,59,74,37,264,2,0,1,0,0,0,0,0,True,True,"French, Spanish, Spanish for Heritage Speakers",
657,610589,SOR JUANA,,92,44,5,21,6,75,6,2,1,0,0,1,0,1,False,False,,
658,400130,YCCS - YOUTH DEVELOPMENT,,96,94,27,0,93,0,0,0,2,0,0,0,0,1,True,True,,


In [9]:
# Interesting: Primary_Category would be a good feature to convert into Primary_Is_High_School.
# This would signal whether a school is specifically a high school.
df_hs['2018-2019']['Primary_Category']

NameError: name 'df_hs' is not defined
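
The `NameError` shows that `df_hs` was never defined in this session. The idea in the comment can still be sketched on a toy frame. It assumes a dataframe that still contains `Primary_Category`, with codes such as 'HS' and 'ES' as seen in the sample rows earlier. The name `Primary_Is_High_School` comes from the comment, and the toy data is illustrative.

```python
import pandas as pd

# Toy frame; the real column appears in the unreduced school profile data
profiles = pd.DataFrame({
    "School_ID": [609759, 610237],
    "Short_Name": ["CLEMENTE HS", "BEETHOVEN"],
    "Primary_Category": ["HS", "ES"],
})

# True only when the school's primary category is high school
profiles["Primary_Is_High_School"] = profiles["Primary_Category"].eq("HS")
print(profiles[["Short_Name", "Primary_Is_High_School"]])
```

This differs from `Is_High_School`, which can be true for a school that serves high school grades without being primarily a high school.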