# 2016-2019 Public School Comparison by California's Smarter Balanced Assessment Results

## California Public Schools Information 2016-2019

Resource: California Department of Education

Link: https://www.cde.ca.gov/ds/si/ds/pubschls.asp

In [69]:
import pandas as pd

In [70]:
pubschls = "pubschls.csv"
public_schools = pd.read_csv(pubschls, low_memory=False)

In [71]:
public_schools.columns

Index(['CDSCode', 'NCESDist', 'NCESSchool', 'StatusType', 'County', 'District',
       'School', 'Street', 'StreetAbr', 'City', 'Zip', 'State', 'MailStreet',
       'MailStrAbr', 'MailCity', 'MailZip', 'MailState', 'Phone', 'Ext',
       'WebSite', 'OpenDate', 'ClosedDate', 'Charter', 'CharterNum',
       'FundingType', 'DOC', 'DOCType', 'SOC', 'SOCType', 'EdOpsCode',
       'EdOpsName', 'EILCode', 'EILName', 'GSoffered', 'GSserved', 'Virtual',
       'Magnet', 'YearRoundYN', 'FederalDFCDistrictID', 'Latitude',
       'Longitude', 'AdmFName1', 'AdmLName1', 'AdmEmail1', 'AdmFName2',
       'AdmLName2', 'AdmEmail2', 'AdmFName3', 'AdmLName3', 'AdmEmail3',
       'LastUpDate'],
      dtype='object')

In [72]:
# Number of schools in raw data
public_schools["School"].count()

18104

In [73]:
# Check year range of raw data
public_schools["LastUpDate"]

0        5/31/2019
1         9/1/2015
2         7/1/2019
3         7/1/2015
4        2/13/2019
           ...    
18100    6/24/1999
18101    6/24/1999
18102     7/2/2013
18103    2/13/2019
18104          NaN
Name: LastUpDate, Length: 18105, dtype: object

In [74]:
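# Drop 'No Data' and NaN values from the date column
public_schools = public_schools.loc[public_schools['LastUpDate'] != 'No Data']
# dropna on the frame removes the rows; assigning a dropna'd Series back to the column would leave the NaN in place
public_schools = public_schools.dropna(subset=['LastUpDate']).copy()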

In [75]:
# Convert datetime to get most current information
public_schools['time'] = pd.to_datetime(public_schools['LastUpDate'])

In [76]:
# Create a 'year' column to filter by years
public_schools['year'] = public_schools['time'].dt.year 
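
The cleaning, parsing, and year-extraction steps above can also be folded into one pass. The following is a minimal sketch on a made-up five-row stand-in for the `LastUpDate` column (the `sample` frame and its values are illustrative, not taken from `pubschls.csv`). It assumes the dates follow the month/day/year pattern seen in the output above and relies on `errors="coerce"` to turn both `'No Data'` and missing values into `NaT`.

```python
import pandas as pd

# Tiny stand-in for the LastUpDate column (illustrative values only)
sample = pd.DataFrame(
    {"LastUpDate": ["5/31/2019", "9/1/2015", "No Data", None, "6/24/1999"]}
)

# errors="coerce" converts anything unparseable ('No Data', missing) to NaT
sample["time"] = pd.to_datetime(
    sample["LastUpDate"], format="%m/%d/%Y", errors="coerce"
)

# Keep only rows with a usable date, then extract the year
sample = sample.dropna(subset=["time"]).copy()
sample["year"] = sample["time"].dt.year

print(sample["year"].tolist())
```

Dropping the `NaT` rows before taking `.dt.year` keeps the `year` column an integer type rather than a float padded with NaN.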

In [77]:
# Number of public school records last updated after 2016 (i.e., 2017 or later)
len(public_schools.loc[public_schools["year"] > 2016])

11881

In [78]:
# Create a new dataframe of public school information since 2016
ps_since2016 = public_schools.loc[public_schools["year"] > 2015]

In [79]:
# Check status types of schools
ps_since2016["StatusType"].unique()

array(['Active', 'Closed', 'Merged', 'Pending'], dtype=object)

In [80]:
# Closed and Pending rows are dropped; we will work with Active and Merged schools

In [81]:
active_ps = ps_since2016[ps_since2016["StatusType"].isin(["Active", "Merged"])]

In [82]:
# Number of Active and Merged public schools with records updated since 2016
active_ps["School"].count()

11600

### Magnet Schools

A magnet school is an entire school with a special focus on a special area of study, such as science, the performing arts, or career education.

Resource: https://www.cde.ca.gov/sp/eo/mt/index.asp

In [83]:
active_ps["Magnet"].unique()

array(['No Data', 'N', 'Y'], dtype=object)

In [84]:
active_ps = active_ps.loc[active_ps['Magnet']!= 'No Data']

In [85]:
# Number of public schools in the Magnet program
len(active_ps.loc[active_ps["Magnet"] == "Y"])

531

In [86]:
active_ps["Magnet"].unique()

array(['N', 'Y'], dtype=object)

### Charter Schools

A charter school is a public school governed by a contract (“charter”) between the school’s operators and a chartering authority. The chartering authority, also known as the authorizing local educational agency (LEA), can be a school district, county office of education, or the State Board of Education (SBE).

Resource: https://www.ed-data.org/article/Charter-Schools-in-California

In [87]:
active_ps["Charter"].unique()

array(['Y', 'N'], dtype=object)

In [88]:
# Filter the working dataframe (active_ps), consistent with the other columns
active_ps = active_ps.loc[active_ps['Charter']!= 'No Data']

In [89]:
# Number of public schools that are charter schools
len(active_ps.loc[active_ps["Charter"] == "Y"])


1305

###  Year Round Education Program

Year-round education (YRE) is not a typical alternative way to deliver the curriculum. It is, however, an alternative way to construct the school calendar. Both traditional and some year-round school calendars can have 180 days of instruction. The traditional calendar, of course, is divided into nine months of instruction and three months of vacation during the summer. Year-round calendars break these long instructional/vacation blocks into shorter units. The most typical instructional/vacation year-round pattern is called the 60/20 calendar (60 days of instruction followed by 20 days of vacation), and the second most popular is the 45/15. There are numerous other possible patterns, but they are not common.

Resource: https://www.cde.ca.gov/ls/fa/yr/guide.asp
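
To make the calendar arithmetic concrete, here is a small sketch that checks how many cycles of each pattern are needed to reach 180 days of instruction. It assumes the day counts in "60/20" and "45/15" refer to school days (weekdays), which the quoted passage does not state explicitly; the variable names are illustrative.

```python
# Each pattern is (instruction days, vacation days) per cycle
patterns = {"60/20": (60, 20), "45/15": (45, 15)}
TARGET_INSTRUCTION_DAYS = 180

summary = {}
for name, (on_days, off_days) in patterns.items():
    # Number of full cycles needed to reach the instruction target
    cycles = TARGET_INSTRUCTION_DAYS // on_days
    # Total days covered by those cycles, instruction plus vacation
    total_days = cycles * (on_days + off_days)
    summary[name] = (cycles, total_days)
    print(f"{name}: {cycles} cycles covering {total_days} days")
```

Under that assumption both patterns reach the same 180 instruction days, with three longer cycles for 60/20 and four shorter ones for 45/15.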

In [90]:
len(active_ps.loc[active_ps["YearRoundYN"] == "Y"])

709

In [91]:
active_ps = active_ps.loc[active_ps['YearRoundYN']!= 'No Data']

### Directly Funded vs Locally Funded Schools

Direct funding is funding that is provided to an organization directly by a governmental entity or intermediate organization that has the same duties as a government entity.

https://www.hhs.gov/answers/grants-and-contracts/what-is-the-difference-between-indirect-direct-funding/index.html

Locally funded: Locally funded schools receive funding through their authorizing district or county office. Districts sometimes refer to these schools as dependent charter schools.

Directly funded: Directly funded schools receive funding directly from the state. Districts sometimes refer to these schools as independent charter schools. Method Schools is a direct-funded network of charter schools.

https://www.methodschools.org/blog/how-charter-schools-are-funded-accomplishing-more-with-less

In [92]:
# FundingType lets us compare the results of directly funded and locally funded public schools.
# Rows marked 'No Data' are not in a CS funding model.

active_ps["FundingType"].unique()

array(['Directly funded', 'No Data', 'Locally funded'], dtype=object)

In [93]:
active_ps = active_ps.loc[active_ps['FundingType']!= 'No Data']

### The Educational Instruction Levels

A – Adult

ELEM – Elementary

ELEMHIGH – Elementary-High Combination

HS – High School

INTMIDJR – Intermediate/Middle/Junior High

PS – Preschool

UG – Ungraded

In [96]:
active_ps["EILCode"].unique()

array(['HS', 'ELEM', 'INTMIDJR', 'ELEMHIGH', 'UG'], dtype=object)

In [97]:
active_ps = active_ps.loc[active_ps['EILCode']!= 'No Data']

In [98]:
active_ps = active_ps.loc[active_ps['EILName']!= 'No Data']

In [99]:
active_ps["EILName"].unique()

array(['High School', 'Elementary', 'Intermediate/Middle/Junior High',
       'Elementary-High Combination', 'Ungraded'], dtype=object)
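
The notebook removes `'No Data'` rows one column at a time. As a sketch only, the same filtering could be written as one helper that takes a list of columns. The function name `drop_no_data` and the `demo` frame are hypothetical and not part of the original notebook.

```python
import pandas as pd


def drop_no_data(df, columns, placeholder="No Data"):
    """Return the rows where none of the given columns equals the placeholder."""
    # ne() is the element-wise "!="; all(axis=1) requires every column to pass
    mask = df[columns].ne(placeholder).all(axis=1)
    return df.loc[mask].copy()


# Illustrative data: only the first row has real values in every column
demo = pd.DataFrame({
    "Magnet": ["N", "No Data", "Y", "N"],
    "YearRoundYN": ["N", "N", "No Data", "Y"],
    "FundingType": ["Directly funded", "Locally funded",
                    "Directly funded", "No Data"],
})

cleaned = drop_no_data(demo, ["Magnet", "YearRoundYN", "FundingType"])
print(cleaned)
```

Like the `!=` comparisons in the cells above, this keeps rows whose value is NaN; only the literal placeholder string is filtered out.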

In [100]:
public_schools20162020 = active_ps[["CDSCode", "StatusType", "County", "District", "School","EILName", "OpenDate",
                                 "Charter","Magnet", "YearRoundYN", "FundingType"]]
public_schools20162020.head(20)

Unnamed: 0,CDSCode,StatusType,County,District,School,EILName,OpenDate,Charter,Magnet,YearRoundYN,FundingType
2,1100170112607,Active,Alameda,Alameda County Office of Education,Envision Academy for Arts & Technology,High School,8/28/2006,Y,N,N,Directly funded
4,1100170123968,Active,Alameda,Alameda County Office of Education,Community School for Creative Education,Elementary,8/22/2011,Y,N,N,Directly funded
5,1100170124172,Active,Alameda,Alameda County Office of Education,Yu Ming Charter,Elementary,8/9/2011,Y,N,N,Directly funded
6,1100170125567,Active,Alameda,Alameda County Office of Education,Urban Montessori Charter,Elementary,8/27/2012,Y,N,N,Directly funded
7,1100170129403,Active,Alameda,Alameda County Office of Education,Epic Charter,Intermediate/Middle/Junior High,8/25/2014,Y,N,N,Directly funded
13,1100170131581,Active,Alameda,Alameda County Office of Education,Oakland Unity Middle,Intermediate/Middle/Junior High,8/23/2015,Y,N,N,Directly funded
15,1100170136101,Active,Alameda,Alameda County Office of Education,Connecting Waters Charter - East Bay,Elementary-High Combination,8/16/2017,Y,N,N,Directly funded
16,1100170136226,Active,Alameda,Alameda County Office of Education,Opportunity Academy,High School,9/5/2017,Y,N,Y,Locally funded
18,1100170137448,Active,Alameda,Alameda County Office of Education,Aurum Preparatory Academy,Elementary,8/13/2018,Y,N,N,Directly funded
21,1100170138867,Active,Alameda,Alameda County Office of Education,Hayward Collegiate Charter,Elementary,8/19/2019,Y,N,N,Directly funded


In [90]:

public_schools20162020.to_csv(r'C:\Users\Rabia\Desktop\public_schools20162020.csv', index=False) 

## California Assessment of Student Performance and Progress (CAASPP) 2016-2019 Reports

In [153]:
data2016 = pd.read_csv("sb_ca2016_all_csv_v3.txt")

In [155]:
data2016.columns

Index(['County Code', 'District Code', 'School Code', 'Filler', 'Test Year',
       'Subgroup ID', 'Test Type', 'Total CAASPP Enrollment',
       'Total Tested At Entity Level', 'Total Tested with Scores', 'Grade',
       'Test Id', 'CAASPP Reported Enrollment', 'Students Tested',
       'Mean Scale Score', 'Percentage Standard Exceeded',
       'Percentage Standard Met', 'Percentage Standard Met and Above',
       'Percentage Standard Nearly Met', 'Percentage Standard Not Met',
       'Students with Scores', 'Area 1 Percentage Above Standard',
       'Area 1 Percentage Near Standard', 'Area 1 Percentage Below Standard',
       'Area 2 Percentage Above Standard', 'Area 2 Percentage Near Standard',
       'Area 2 Percentage Below Standard', 'Area 3 Percentage Above Standard',
       'Area 3 Percentage Near Standard', 'Area 3 Percentage Below Standard',
       'Area 4 Percentage Above Standard', 'Area 4 Percentage Near Standard',
       'Area 4 Percentage Below Standard'],
      dtype='object')

In [156]:
data2016["CDSCode"]=data2016["County Code"].astype(str)+data2016["District Code"].astype(str)+data2016["School Code"].astype(str)
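
One caveat about the concatenation above: a full CDS code has 14 digits (2 for the county, 5 for the district, 7 for the school), so converting integer columns with `astype(str)` drops any leading zeros and can produce codes of uneven length. The sketch below zero-pads each part first. The helper name `build_cds_code` and the `demo` rows are illustrative, and it assumes the three code columns were read as integers.

```python
import pandas as pd


def build_cds_code(df):
    """Concatenate county (2), district (5) and school (7) codes, zero-padded."""
    county = df["County Code"].astype(str).str.zfill(2)
    district = df["District Code"].astype(str).str.zfill(5)
    school = df["School Code"].astype(str).str.zfill(7)
    return county + district + school


# Illustrative rows: one school-level record and one statewide (all-zero) record
demo = pd.DataFrame({
    "County Code": [1, 0],
    "District Code": [10017, 0],
    "School Code": [112607, 0],
})
demo["CDSCode"] = build_cds_code(demo)
print(demo["CDSCode"].tolist())
```

The `CDSCode` values shown in the `pubschls.csv` output appear to have been read as integers (13 digits for Alameda County), so before any join the two columns would need a common type, either both zero-padded strings or both integers.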

In [167]:
data2016 = data2016[["CDSCode","Test Id","Grade","Students Tested","Mean Scale Score","Subgroup ID",
                     "Percentage Standard Exceeded","Percentage Standard Met","Percentage Standard Nearly Met","Percentage Standard Not Met"]]

### Achievement Level Descriptors
Achievement level descriptors are the specifications for what knowledge and skills students display at each level (i.e., Level 1, Level 2, Level 3, and Level 4).

Student test results are reported in the following overall achievement levels:
##### Level 4: Standard Exceeded
The student has surpassed the achievement standard and demonstrates advanced progress toward mastering the skills and knowledge necessary to succeed in future coursework. In grades 6-8 and 11, this achievement level also indicates that a student has demonstrated advanced progress toward college readiness after graduation.
##### Level 3: Standard Met
The student demonstrates skills and knowledge that are likely necessary to succeed in future coursework. In grades 6-8 and 11, this achievement level indicates that the student has demonstrated progress toward mastering skills and knowledge needed to be ready for college after graduation.
##### Level 2: Standard Nearly Met
The student is close to meeting the achievement standard and may need further development to demonstrate skills and knowledge required for future coursework. In grades 6-8 and 11, a score in this range indicates further development may be needed to succeed in entry-level college courses after graduation.
##### Level 1: Standard Not Met
The student must improve substantially to demonstrate the skills and knowledge needed to succeed in future coursework. In grades 6-8 and 11, a score in this range indicates the student needs to improve substantially to be ready for college after graduation.

https://edsource.org/2015/california-smarter-balanced-math-english-results-common-core-faq/86181

In [157]:
data2016[["Percentage Standard Exceeded","Percentage Standard Met","Percentage Standard Nearly Met","Percentage Standard Not Met"]]

Unnamed: 0,Percentage Standard Exceeded,Percentage Standard Met,Percentage Standard Nearly Met,Percentage Standard Not Met
0,18,28,26,29
1,22,21,25,32
2,23,21,20,36
3,15,23,33,28
4,17,16,28,39
...,...,...,...,...
3116790,7,11,32,51
3116791,*,*,*,*
3116792,*,*,*,*
3116793,*,*,*,*


In [158]:
data2016["Mean Scale Score"].head()
# A mean scale score is the average performance of a group of students on an assessment. 
# Specifically, a mean scale score is calculated by adding all individual student scores and dividing by the number of total scores.

0    2424.7
1    2414.2
2    2454.4
3    2460.5
4    2485.2
Name: Mean Scale Score, dtype: object
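
The `dtype: object` above means the scores are stored as strings, presumably because suppressed results are written as `*` (as in the percentage columns shown earlier). A minimal sketch of converting such a column to numbers follows; the `scores` series is a made-up example, not a slice of the real file.

```python
import pandas as pd

# Illustrative values: three scores and one suppressed entry
scores = pd.Series(["2424.7", "2414.2", "*", "2460.5"], name="Mean Scale Score")

# errors="coerce" turns non-numeric entries such as '*' into NaN
numeric_scores = pd.to_numeric(scores, errors="coerce")

print(numeric_scores.dtype)
print(numeric_scores.isna().sum())   # number of suppressed entries
print(numeric_scores.mean())         # NaN values are skipped by default
```

The same call could be applied to the percentage columns before computing any averages or comparisons.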

In [159]:
data2016["Students Tested"].head()

0    461013
1    458658
2    474588
3    476795
4    467426
Name: Students Tested, dtype: int64

In [160]:
data2016["Grade"].unique()

array([ 3,  4,  5,  6,  7,  8, 11, 13], dtype=int64)

In [161]:
data2016["Test Id"].unique() 
# 1 - English Language Arts/Literacy
# 2 - Mathematics

array([2, 1], dtype=int64)

In [162]:
data2016["Subgroup ID"].unique()

array([  1,   3,   4,   6,   7,   8,  28,  31,  74,  75,  76,  77,  78,
        79,  80,  90,  91,  92,  93,  94,  99, 111, 120, 121, 128, 142,
       144, 160, 180, 200, 201, 202, 203, 204, 205, 206, 207, 220, 221,
       222, 223, 224, 225, 226, 227], dtype=int64)

In [163]:
# "001", 1, "All Students", "All Students"
# "003", 3, "Male", "Gender"
# "004", 4, "Female", "Gender"
# "006", 6, "Fluent English proficient and English only", "English-Language Fluency"
# "007", 7, "Initial fluent English proficient (IFEP)", "English-Language Fluency"
# "008", 8, "Reclassified fluent English proficient (RFEP)", "English-Language Fluency"
# "028", 28, "Migrant education", "Migrant"
# "031", 31, "Economically disadvantaged", "Economic Status"
# "074", 74, "Black or African American", "Ethnicity"
# "075", 75, "American Indian or Alaska Native", "Ethnicity"
# "076", 76, "Asian", "Ethnicity"
# "077", 77, "Filipino", "Ethnicity"
# "078", 78, "Hispanic or Latino", "Ethnicity"
# "079", 79, "Native Hawaiian or Pacific Islander", "Ethnicity"
# "080", 80, "White", "Ethnicity"
# "090", 90, "Not a high school graduate", "Parent Education"
# "091", 91, "High school graduate", "Parent Education"
# "092", 92, "Some college (includes AA degree)", "Parent Education"
# "093", 93, "College graduate", "Parent Education"
# "094", 94, "Graduate school/Post graduate", "Parent Education"
# "099", 99, "Students with no reported disability", "Disability Status"
# "111", 111, "Not economically disadvantaged", "Economic Status"
# "120", 120, "English learners (ELs) enrolled in school in the U.S. fewer than 12 months", "English-Language Fluency"
# "121", 121, "Declined to state", "Parent Education"
# "128", 128, "Students with disability", "Disability Status"
# "142", 142, "English learners enrolled in school in the U.S. 12 months or more", "English-Language Fluency"
# "144", 144, "Two or more races", "Ethnicity"
# "160", 160, "English learner", "English-Language Fluency"
# "170", 170, "Ever-ELs", "English-Language Fluency"
# "180", 180, "English only", "English-Language Fluency"
# "190", 190, "To be determined (TBD)", "English-Language Fluency"
# "200", 200, "Black or African American", "Ethnicity for Economically Disadvantaged"
# "201", 201, "American Indian or Alaska Native", "Ethnicity for Economically Disadvantaged"
# "202", 202, "Asian", "Ethnicity for Economically Disadvantaged"
# "203", 203, "Filipino", "Ethnicity for Economically Disadvantaged"
# "204", 204, "Hispanic or Latino", "Ethnicity for Economically Disadvantaged"
# "205", 205, "Native Hawaiian or Pacific Islander", "Ethnicity for Economically Disadvantaged"
# "206", 206, "White", "Ethnicity for Economically Disadvantaged"
# "207", 207, "Two or more races", "Ethnicity for Economically Disadvantaged"
# "220", 220, "Black or African American", "Ethnicity for Not Economically Disadvantaged"
# "221", 221, "American Indian or Alaska Native", "Ethnicity for Not Economically Disadvantaged"
# "222", 222, "Asian", "Ethnicity for Not Economically Disadvantaged"
# "223", 223, "Filipino", "Ethnicity for Not Economically Disadvantaged"
# "224", 224, "Hispanic or Latino", "Ethnicity for Not Economically Disadvantaged"
# "225", 225, "Native Hawaiian or Pacific Islander", "Ethnicity for Not Economically Disadvantaged"
# "226", 226, "White", "Ethnicity for Not Economically Disadvantaged"
# "227", 227, "Two or more races", "Ethnicity for Not Economically Disadvantaged"
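
The lookup table in the comments above can be turned into a dictionary so that subgroup IDs are readable in later output. This is a sketch with only a handful of the listed IDs; `SUBGROUP_NAMES` and the `demo` frame (including its counts) are illustrative names and values.

```python
import pandas as pd

# A few entries copied from the subgroup list above
SUBGROUP_NAMES = {
    1: "All Students",
    3: "Male",
    4: "Female",
    31: "Economically disadvantaged",
    111: "Not economically disadvantaged",
}

# Made-up rows to show the mapping; 227 is deliberately left unmapped
demo = pd.DataFrame({
    "Subgroup ID": [1, 3, 4, 31, 227],
    "Students Tested": [100, 48, 52, 40, 5],
})

# IDs missing from the dictionary become NaN
demo["Subgroup"] = demo["Subgroup ID"].map(SUBGROUP_NAMES)

# Keep only the rows that describe all students
all_students = demo.loc[demo["Subgroup ID"] == 1]
print(demo)
```

Filtering on subgroup 1 is one way to avoid counting the same students several times when school totals are compared.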


In [None]:
# Apply the same steps to the 2017, 2018 and 2019 datasets.
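
Since each year goes through identical steps, they could be wrapped in one function. The sketch below mirrors the notebook's logic (same concatenation, same column selection). The function name `prepare_caaspp` and the one-row `demo` frame are hypothetical; the file names in the trailing comment are the ones used in this notebook.

```python
import pandas as pd

KEEP_COLUMNS = [
    "CDSCode", "Test Id", "Grade", "Students Tested", "Mean Scale Score",
    "Subgroup ID", "Percentage Standard Exceeded", "Percentage Standard Met",
    "Percentage Standard Nearly Met", "Percentage Standard Not Met",
]


def prepare_caaspp(df):
    """Add a CDSCode column and keep the columns used in this analysis."""
    out = df.copy()
    out["CDSCode"] = (
        out["County Code"].astype(str)
        + out["District Code"].astype(str)
        + out["School Code"].astype(str)
    )
    return out[KEEP_COLUMNS]


# One made-up row with the columns the function needs, plus one extra
demo = pd.DataFrame({
    "County Code": [1], "District Code": [10017], "School Code": [112607],
    "Test Id": [1], "Grade": [3], "Students Tested": [50],
    "Mean Scale Score": ["2424.7"], "Subgroup ID": [1],
    "Percentage Standard Exceeded": ["18"], "Percentage Standard Met": ["28"],
    "Percentage Standard Nearly Met": ["26"], "Percentage Standard Not Met": ["29"],
    "Filler": [None],
})
prepared = prepare_caaspp(demo)
print(prepared.columns.tolist())

# With the real files this could be applied per year, for example:
# data2017 = prepare_caaspp(pd.read_csv("sb_ca2017_all_csv_v2.txt"))
```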

In [168]:
data2017 = pd.read_csv("sb_ca2017_all_csv_v2.txt")
data2017["CDSCode"]=data2017["County Code"].astype(str)+data2017["District Code"].astype(str)+data2017["School Code"].astype(str)
data2017 = data2017[["CDSCode","Test Id","Grade","Students Tested","Mean Scale Score","Subgroup ID",
                     "Percentage Standard Exceeded","Percentage Standard Met","Percentage Standard Nearly Met","Percentage Standard Not Met"]]

In [165]:
data2018 = pd.read_csv("sb_ca2018_all_csv_v3.txt")
data2018["CDSCode"]=data2018["County Code"].astype(str)+data2018["District Code"].astype(str)+data2018["School Code"].astype(str)
data2018 = data2018[["CDSCode","Test Id","Grade","Students Tested","Mean Scale Score","Subgroup ID",
                     "Percentage Standard Exceeded","Percentage Standard Met","Percentage Standard Nearly Met","Percentage Standard Not Met"]]

In [166]:
data2019 = pd.read_csv("sb_ca2019_all_csv_v4.txt")
data2019["CDSCode"]=data2019["County Code"].astype(str)+data2019["District Code"].astype(str)+data2019["School Code"].astype(str)
data2019 = data2019[["CDSCode","Test Id","Grade","Students Tested","Mean Scale Score","Subgroup ID",
                     "Percentage Standard Exceeded","Percentage Standard Met","Percentage Standard Nearly Met","Percentage Standard Not Met"]]

In [94]:
path = "combined_data.csv"
combined_data = pd.read_csv(path)
combined_data.columns

Index(['CDSCode', 'CAASPP Reported Enrollment', 'Students Tested',
       'Percentage Standard Exceeded', 'Percentage Standard Met',
       'Percentage Standard Met and Above', 'Percentage Standard Nearly Met',
       'Percentage Standard Not Met', 'Students with Scores', 'School',
       'County', 'Charter', 'Magnet', 'EILCode'],
      dtype='object')