A written analysis is included at the bottom.

In [1]:
# Import pandas so that we can actually use dataframes.
import pandas as pd

In [2]:
# Save filepath to dataset.
school_data_path = "Resources/PyCitySchools/Resources/schools_complete.csv"
student_data_path = "Resources/PyCitySchools/Resources/students_complete.csv"

In [3]:
# Load the data from the CSV files into pandas DataFrames.
school_data = pd.read_csv(school_data_path)
student_data = pd.read_csv(student_data_path)

In [4]:
# See what they look like.
school_data.head()

Unnamed: 0,School ID,school_name,type,size,budget
0,0,Huang High School,District,2917,1910635
1,1,Figueroa High School,District,2949,1884411
2,2,Shelton High School,Charter,1761,1056600
3,3,Hernandez High School,District,4635,3022020
4,4,Griffin High School,Charter,1468,917500


In [5]:
student_data.head()

Unnamed: 0,Student ID,student_name,gender,grade,school_name,reading_score,math_score
0,0,Paul Bradley,M,9th,Huang High School,66,79
1,1,Victor Smith,M,12th,Huang High School,94,61
2,2,Kevin Rodriguez,M,12th,Huang High School,90,60
3,3,Dr. Richard Scott,M,12th,Huang High School,67,58
4,4,Bonnie Ray,F,9th,Huang High School,97,84


In [6]:
# Merge the 2 dataframes into 1. Let's remove any rows with null values.
school_data_df = pd.merge(student_data, school_data, how="left", on="school_name")
# dropna() returns a new dataframe, so assign the result back.
school_data_df = school_data_df.dropna()
school_data_df.head()

Unnamed: 0,Student ID,student_name,gender,grade,school_name,reading_score,math_score,School ID,type,size,budget
0,0,Paul Bradley,M,9th,Huang High School,66,79,0,District,2917,1910635
1,1,Victor Smith,M,12th,Huang High School,94,61,0,District,2917,1910635
2,2,Kevin Rodriguez,M,12th,Huang High School,90,60,0,District,2917,1910635
3,3,Dr. Richard Scott,M,12th,Huang High School,67,58,0,District,2917,1910635
4,4,Bonnie Ray,F,9th,Huang High School,97,84,0,District,2917,1910635
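A left merge like the one above silently keeps students whose school has no match and duplicates rows if a school name appears twice. The sketch below shows one way to guard against both using the `validate` and `indicator` options of `pd.merge`. The two small dataframes are hypothetical stand-ins, not the real CSV contents.

```python
import pandas as pd

# Toy stand-ins for the two CSVs (hypothetical rows, not the real data).
students = pd.DataFrame({
    "student_name": ["A", "B", "C"],
    "school_name": ["Huang High School", "Huang High School", "Shelton High School"],
})
schools = pd.DataFrame({
    "school_name": ["Huang High School", "Shelton High School"],
    "type": ["District", "Charter"],
})

# validate="many_to_one" raises an error if a school name is duplicated in `schools`;
# indicator=True adds a "_merge" column showing where each row came from.
merged = pd.merge(students, schools, how="left", on="school_name",
                  validate="many_to_one", indicator=True)

# Rows marked "left_only" would be students whose school is missing from `schools`.
unmatched = merged[merged["_merge"] == "left_only"]
print(len(merged), len(unmatched))
```

If `unmatched` is empty, every student row found its school and there is nothing for `dropna()` to remove on the school columns.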


District Summary

In [7]:
school_data_df.dtypes

Student ID        int64
student_name     object
gender           object
grade            object
school_name      object
reading_score     int64
math_score        int64
School ID         int64
type             object
size              int64
budget            int64
dtype: object

In [8]:
# Find the total number of unique schools.
num_schools = school_data_df["school_name"].nunique()
# Count students by school type.
school_data_df["type"].value_counts()

District    26976
Charter     12194
Name: type, dtype: int64

In [9]:
# Let's put each student into pass/fail bins for math and reading separately.
bins = [0,69.99,100]
bin_labels = ["Fail","Pass"]

school_data_df["Math Pass/Fail"] = pd.cut(school_data_df["math_score"],bins,labels=bin_labels, include_lowest=True)
school_data_df["Reading Pass/Fail"] = pd.cut(school_data_df["reading_score"],bins,labels=bin_labels, include_lowest=True)
school_data_df.head()

Unnamed: 0,Student ID,student_name,gender,grade,school_name,reading_score,math_score,School ID,type,size,budget,Math Pass/Fail,Reading Pass/Fail
0,0,Paul Bradley,M,9th,Huang High School,66,79,0,District,2917,1910635,Pass,Fail
1,1,Victor Smith,M,12th,Huang High School,94,61,0,District,2917,1910635,Fail,Pass
2,2,Kevin Rodriguez,M,12th,Huang High School,90,60,0,District,2917,1910635,Fail,Pass
3,3,Dr. Richard Scott,M,12th,Huang High School,67,58,0,District,2917,1910635,Fail,Fail
4,4,Bonnie Ray,F,9th,Huang High School,97,84,0,District,2917,1910635,Pass,Pass
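The pass/fail bins above put the cut-off at 70. A plain boolean comparison gives the same split and makes the passing percentage a one-liner. This is a minimal sketch on hypothetical scores, assuming (as the bins imply) that 70 and above counts as passing.

```python
import pandas as pd

# Hypothetical scores chosen to straddle the passing mark of 70.
scores = pd.DataFrame({"math_score": [58, 69, 70, 84]})

# Binned approach, as in the cell above.
binned = pd.cut(scores["math_score"], [0, 69.99, 100],
                labels=["Fail", "Pass"], include_lowest=True)

# Boolean approach: True means the student passed.
passed = scores["math_score"] >= 70

# The mean of a boolean column is the fraction of True values.
pct_passing = passed.mean() * 100
print(pct_passing)
```

Both approaches agree for integer scores. The boolean version also avoids relying on 69.99 as an edge, which would misclassify a fractional score such as 69.995.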


In [10]:
# The only school type options are District and Charter in this dataset.
district_school_data_df = school_data_df.loc[school_data_df["type"]=="District",:]
num_district_schools = len(district_school_data_df["school_name"].value_counts())
charter_school_data_df = school_data_df.loc[school_data_df["type"]=="Charter",:]
num_charter_schools = len(charter_school_data_df["school_name"].value_counts())

In [11]:
# Find the total number of students. Count rows rather than unique names, since different students can share a name.
num_students = len(school_data_df)
num_district_students = len(school_data_df.loc[school_data_df["type"]=="District",:])
num_charter_students = len(school_data_df.loc[school_data_df["type"]=="Charter",:])

In [12]:
# Find the total budget. Every student row repeats the school's budget, so take one value per school and sum those.
schools_grouped = school_data_df.groupby("school_name")
total_budget = schools_grouped["budget"].mean().sum()

In [13]:
# Find the total budget for District schools.
district_schools_grouped = district_school_data_df.groupby("school_name")
district_total_budget = district_schools_grouped["budget"].mean().sum()

In [14]:
# Find the total budget for Charter schools. It's the difference between the previous two values, which is easier.
charter_total_budget = total_budget - district_total_budget

In [15]:
# Find the average math score for all schools.
avg_math_score = school_data_df["math_score"].mean()

In [16]:
# Find the average math score for District schools.
avg_district_math_score = district_school_data_df["math_score"].mean()

In [17]:
# Find the average math score for Charter schools.
avg_charter_math_score = charter_school_data_df["math_score"].mean()

In [18]:
# Find the average reading score for all schools.
avg_reading_score = school_data_df["reading_score"].mean()

In [19]:
# Find the average reading score for District schools.
avg_district_reading_score = district_school_data_df["reading_score"].mean()

In [20]:
# Find the average reading score for Charter schools.
avg_charter_reading_score = charter_school_data_df["reading_score"].mean()

In [21]:
# Find the percentage of students who passed math, separately for District and Charter schools.
district_math_pass = (len(district_school_data_df.loc[district_school_data_df["Math Pass/Fail"]=="Pass",:])/len(district_school_data_df))*100
charter_math_pass = (len(charter_school_data_df.loc[charter_school_data_df["Math Pass/Fail"]=="Pass",:])/len(charter_school_data_df))*100

In [22]:
# Find the percentage of students who passed reading, separately for District and Charter schools.
district_reading_pass = (len(district_school_data_df.loc[district_school_data_df["Reading Pass/Fail"]=="Pass",:])/len(district_school_data_df))*100
charter_reading_pass = (len(charter_school_data_df.loc[charter_school_data_df["Reading Pass/Fail"]=="Pass",:])/len(charter_school_data_df))*100

In [23]:
# Find the percentage of students who passed both math and reading, separately for District and Charter schools.
district_overall_pass = (len(district_school_data_df.loc[(district_school_data_df["Math Pass/Fail"]=="Pass")&(district_school_data_df["Reading Pass/Fail"]=="Pass"),:])/len(district_school_data_df))*100
charter_overall_pass = (len(charter_school_data_df.loc[(charter_school_data_df["Math Pass/Fail"]=="Pass")&(charter_school_data_df["Reading Pass/Fail"]=="Pass"),:])/len(charter_school_data_df))*100

In [24]:
# Put all of the metrics in a new Dataframe.
# I broke out District and Charter schools separately because I was asked for a "district summary" and wasn't sure whether that meant excluding charter schools.
district_summary_dict = {
    "Total Schools":[num_district_schools,num_charter_schools],
    "Total Budget":[district_total_budget,charter_total_budget],
    "Total Students":[num_district_students,num_charter_students],
    "Average Math Score":[avg_district_math_score,avg_charter_math_score],
    "Average Reading Score":[avg_district_reading_score,avg_charter_reading_score],
    "% Passing Math":[district_math_pass,charter_math_pass],
    "% Passing Reading":[district_reading_pass,charter_reading_pass],
    "% Overall Passing":[district_overall_pass,charter_overall_pass]
}
district_summary_df=pd.DataFrame(district_summary_dict,index=["District","Charter"])
district_summary_df

Unnamed: 0,Total Schools,Total Budget,Total Students,Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing
District,7,17347923.0,26976,76.987026,80.962485,66.518387,80.905249,53.695878
Charter,8,7301505.0,12194,83.406183,83.902821,93.701821,96.645891,90.560932
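The by-type summary above is assembled from many separately computed variables. As an alternative, a single `groupby` with named aggregation can produce most of the same columns in one step. This is a sketch on hypothetical rows. The `passed_*` column names and the output column names are my own, and the budget is left out because it needs one value per school rather than per student.

```python
import pandas as pd

# Hypothetical merged rows: one row per student, with the school's type attached.
df = pd.DataFrame({
    "school_name": ["S1", "S1", "S2", "S2"],
    "type": ["District", "District", "Charter", "Charter"],
    "math_score": [60, 80, 90, 70],
    "reading_score": [65, 75, 95, 85],
})
df["passed_math"] = df["math_score"] >= 70
df["passed_reading"] = df["reading_score"] >= 70
df["passed_both"] = df["passed_math"] & df["passed_reading"]

summary = df.groupby("type").agg(
    total_schools=("school_name", "nunique"),
    total_students=("school_name", "size"),
    avg_math=("math_score", "mean"),
    pct_passing_math=("passed_math", "mean"),
    pct_overall_passing=("passed_both", "mean"),
)
# Convert fractions to percentages.
pct_cols = ["pct_passing_math", "pct_overall_passing"]
summary[pct_cols] = summary[pct_cols] * 100
print(summary)
```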


In [25]:
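# Let's also make a total summary not broken out by District/Charter type. Maybe I can do it with loops?
# Counts and budgets add up across the two types; averages and percentages must be weighted by student count instead.
student_counts = district_summary_df["Total Students"]
total_summary=pd.DataFrame()
for column_name in district_summary_df.columns:
    if column_name.startswith("Total"):
        total_summary[column_name]=[district_summary_df[column_name].sum()]
    else:
        total_summary[column_name]=[(district_summary_df[column_name]*student_counts).sum()/student_counts.sum()]
total_summary

Unnamed: 0,Total Schools,Total Budget,Total Students,Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing
0,15,24649428.0,39170,78.985371,81.87784,74.980853,85.805463,65.172326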
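For reference, a combined average across both school types is the student-weighted mean of the two per-type averages, not their sum. The short check below works this through for the math score, using only the student counts and averages from the District/Charter table above.

```python
# Student counts and average math scores copied from the District/Charter table above.
n_district, n_charter = 26976, 12194
avg_district, avg_charter = 76.987026, 83.406183

# Weight each average by the number of students it covers.
overall_avg_math = (n_district * avg_district + n_charter * avg_charter) / (n_district + n_charter)
print(round(overall_avg_math, 2))  # 78.99
```

The same weighting applies to the reading average and to each passing percentage.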


In [26]:
# Format some of the numerical entries in the dataframe.
district_summary_df["Total Students"] = district_summary_df["Total Students"].map("{:,}".format)
district_summary_df["Total Budget"] = district_summary_df["Total Budget"].map("${:,.2f}".format)
district_summary_df["Average Math Score"] = district_summary_df["Average Math Score"].map("{:,.2f}".format)
district_summary_df["Average Reading Score"] = district_summary_df["Average Reading Score"].map("{:,.2f}".format)
district_summary_df["% Passing Math"] = district_summary_df["% Passing Math"].map("{:,.2f}%".format)
district_summary_df["% Passing Reading"] = district_summary_df["% Passing Reading"].map("{:,.2f}%".format)
district_summary_df["% Overall Passing"] = district_summary_df["% Overall Passing"].map("{:,.2f}%".format)
district_summary_df

Unnamed: 0,Total Schools,Total Budget,Total Students,Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing
District,7,"$17,347,923.00",26976,76.99,80.96,66.52%,80.91%,53.70%
Charter,8,"$7,301,505.00",12194,83.41,83.9,93.70%,96.65%,90.56%


In [27]:
# Group dataframe entries by school and store the school names/sizes in a dictionary for later use.
grouped_by_school = school_data_df.groupby('school_name')
students_per_school_dict=dict(grouped_by_school.count()["student_name"])
students_per_school_dict

{'Bailey High School': 4976,
 'Cabrera High School': 1858,
 'Figueroa High School': 2949,
 'Ford High School': 2739,
 'Griffin High School': 1468,
 'Hernandez High School': 4635,
 'Holden High School': 427,
 'Huang High School': 2917,
 'Johnson High School': 4761,
 'Pena High School': 962,
 'Rodriguez High School': 3999,
 'Shelton High School': 1761,
 'Thomas High School': 1635,
 'Wilson High School': 2283,
 'Wright High School': 1800}

In [28]:
# Remind myself what the dataframe looks like.
school_data_df.head()

Unnamed: 0,Student ID,student_name,gender,grade,school_name,reading_score,math_score,School ID,type,size,budget,Math Pass/Fail,Reading Pass/Fail
0,0,Paul Bradley,M,9th,Huang High School,66,79,0,District,2917,1910635,Pass,Fail
1,1,Victor Smith,M,12th,Huang High School,94,61,0,District,2917,1910635,Fail,Pass
2,2,Kevin Rodriguez,M,12th,Huang High School,90,60,0,District,2917,1910635,Fail,Pass
3,3,Dr. Richard Scott,M,12th,Huang High School,67,58,0,District,2917,1910635,Fail,Fail
4,4,Bonnie Ray,F,9th,Huang High School,97,84,0,District,2917,1910635,Pass,Pass


In [29]:
# Use a list of lists to build a brand new dataframe. Each list will contain all of the relevant info for a school.
# Start with an empty list. We'll add to it through each iteration of a for loop through all of the school names.
list_of_df_rows=[]
for key in students_per_school_dict.keys():
    # The format of each school's row is like this:
    # [school name, type, budget, number of students, per student budget, average math score, average reading score, % passing math, % passing reading, % overall passing]
    new_row=[key]
    new_row.append(school_data_df.loc[school_data_df['school_name']==key,'type'].min())
    
    school_budget=school_data_df.loc[school_data_df['school_name']==key,'budget'].mean()
    new_row.append(school_budget)
    
    student_count = students_per_school_dict[key]
    new_row.append(student_count)
    
    new_row.append(school_budget/student_count)
    
    # Select the score column before averaging so mean() is only applied to numeric data.
    new_row.append(grouped_by_school["math_score"].mean().loc[key])
    new_row.append(grouped_by_school["reading_score"].mean().loc[key])
    new_row.append(len(school_data_df.loc[(school_data_df["school_name"]==key)&(school_data_df["Math Pass/Fail"]=="Pass")])/student_count*100)
    new_row.append(len(school_data_df.loc[(school_data_df["school_name"]==key)&(school_data_df["Reading Pass/Fail"]=="Pass")])/student_count*100)
    new_row.append(len(school_data_df.loc[(school_data_df["school_name"]==key)&(school_data_df["Reading Pass/Fail"]=="Pass")&(school_data_df["Math Pass/Fail"]=="Pass")])/student_count*100)
    list_of_df_rows.append(new_row)
  

In [30]:
# Build a pandas dataframe using the list of lists from the previous cell.
school_summary_df=pd.DataFrame(list_of_df_rows)
school_summary_df.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9
0,Bailey High School,District,3124928.0,4976,628.0,77.048432,81.033963,66.680064,81.93328,54.642283
1,Cabrera High School,Charter,1081356.0,1858,582.0,83.061895,83.97578,94.133477,97.039828,91.334769
2,Figueroa High School,District,1884411.0,2949,639.0,76.711767,81.15802,65.988471,80.739234,53.204476
3,Ford High School,District,1763916.0,2739,644.0,77.102592,80.746258,68.309602,79.299014,54.289887
4,Griffin High School,Charter,917500.0,1468,625.0,83.351499,83.816757,93.392371,97.138965,90.599455


In [31]:
# I don't like the look of these column names, so rename them to be more helpful.
school_summary_df=school_summary_df.rename(columns={0:"School Name",1:"Type",2:"Budget",3:"No. Students",4:"Budget per Student",5:"Avg Math Score",6:"Avg Reading Score",7:"% Passing Math",8:"% Passing Reading",9:"% Overall Passing"})
school_summary_df.head()

Unnamed: 0,School Name,Type,Budget,No. Students,Budget per Student,Avg Math Score,Avg Reading Score,% Passing Math,% Passing Reading,% Overall Passing
0,Bailey High School,District,3124928.0,4976,628.0,77.048432,81.033963,66.680064,81.93328,54.642283
1,Cabrera High School,Charter,1081356.0,1858,582.0,83.061895,83.97578,94.133477,97.039828,91.334769
2,Figueroa High School,District,1884411.0,2949,639.0,76.711767,81.15802,65.988471,80.739234,53.204476
3,Ford High School,District,1763916.0,2739,644.0,77.102592,80.746258,68.309602,79.299014,54.289887
4,Griffin High School,Charter,917500.0,1468,625.0,83.351499,83.816757,93.392371,97.138965,90.599455
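The loop above filters the full dataframe several times per school. The same per-school table could be sketched with one `groupby` and named aggregation, as below. The data here is hypothetical, and the `passed_*` helper columns and output column names are my own choices rather than anything from the original notebook.

```python
import pandas as pd

# Hypothetical merged rows (one per student); the budget repeats on every row of a school.
df = pd.DataFrame({
    "school_name": ["S1", "S1", "S2", "S2", "S2"],
    "type": ["District", "District", "Charter", "Charter", "Charter"],
    "budget": [1000, 1000, 1800, 1800, 1800],
    "math_score": [60, 80, 90, 70, 80],
    "reading_score": [65, 75, 95, 85, 60],
})
df["passed_math"] = df["math_score"] >= 70
df["passed_reading"] = df["reading_score"] >= 70
df["passed_both"] = df["passed_math"] & df["passed_reading"]

per_school = df.groupby("school_name").agg(
    school_type=("type", "first"),
    budget=("budget", "first"),
    students=("math_score", "size"),
    avg_math=("math_score", "mean"),
    avg_reading=("reading_score", "mean"),
    pct_math=("passed_math", "mean"),
    pct_reading=("passed_reading", "mean"),
    pct_overall=("passed_both", "mean"),
)
per_school["budget_per_student"] = per_school["budget"] / per_school["students"]
# Convert fractions to percentages.
pct_cols = ["pct_math", "pct_reading", "pct_overall"]
per_school[pct_cols] = per_school[pct_cols] * 100
print(per_school)
```

A side benefit is that the columns come out already named, so the separate rename step is not needed.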


In [32]:
# Display the top 5 schools by sorting by overall passing.
top_schools=school_summary_df.sort_values("% Overall Passing",ascending=False)
top_schools.head()

Unnamed: 0,School Name,Type,Budget,No. Students,Budget per Student,Avg Math Score,Avg Reading Score,% Passing Math,% Passing Reading,% Overall Passing
1,Cabrera High School,Charter,1081356.0,1858,582.0,83.061895,83.97578,94.133477,97.039828,91.334769
12,Thomas High School,Charter,1043130.0,1635,638.0,83.418349,83.84893,93.272171,97.308869,90.948012
4,Griffin High School,Charter,917500.0,1468,625.0,83.351499,83.816757,93.392371,97.138965,90.599455
13,Wilson High School,Charter,1319574.0,2283,578.0,83.274201,83.989488,93.867718,96.539641,90.582567
9,Pena High School,Charter,585858.0,962,609.0,83.839917,84.044699,94.594595,95.945946,90.540541


In [33]:
# Display the lowest 5 schools by sorting by overall passing (but leave ascending as True this time)
bottom_schools=school_summary_df.sort_values("% Overall Passing")
bottom_schools.head()

Unnamed: 0,School Name,Type,Budget,No. Students,Budget per Student,Avg Math Score,Avg Reading Score,% Passing Math,% Passing Reading,% Overall Passing
10,Rodriguez High School,District,2547363.0,3999,637.0,76.842711,80.744686,66.366592,80.220055,52.988247
2,Figueroa High School,District,1884411.0,2949,639.0,76.711767,81.15802,65.988471,80.739234,53.204476
7,Huang High School,District,1910635.0,2917,655.0,76.629414,81.182722,65.683922,81.316421,53.513884
5,Hernandez High School,District,3022020.0,4635,652.0,77.289752,80.934412,66.752967,80.862999,53.527508
8,Johnson High School,District,3094650.0,4761,650.0,77.072464,80.966394,66.057551,81.222432,53.539172
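Sorting and then calling `head()` works fine for the top and bottom schools. `nlargest` and `nsmallest` express the same intent directly. A small sketch, using four rows copied from the school summary above:

```python
import pandas as pd

# Four rows copied from the school summary above.
sample = pd.DataFrame({
    "School Name": ["Bailey High School", "Cabrera High School",
                    "Rodriguez High School", "Thomas High School"],
    "% Overall Passing": [54.642283, 91.334769, 52.988247, 90.948012],
})

top_two = sample.nlargest(2, "% Overall Passing")
bottom_two = sample.nsmallest(2, "% Overall Passing")
print(top_two["School Name"].tolist())
print(bottom_two["School Name"].tolist())
```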


In [34]:
# Calculate the average math score by grade across all schools.
avg_math_score_by_grade=pd.DataFrame(school_data_df.groupby('grade')['math_score'].mean())
avg_math_score_by_grade

grade,math_score
10th,78.941483
11th,79.083548
12th,78.993164
9th,78.935659


In [35]:
# Calculate the average reading score by grade across all schools.
avg_reading_score_by_grade=pd.DataFrame(school_data_df.groupby('grade')['reading_score'].mean())
avg_reading_score_by_grade

grade,reading_score
10th,81.87441
11th,81.885714
12th,81.819851
9th,81.914358
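In the two tables above, 9th grade sorts last because the grade labels are compared as text. One way to get the natural order is to turn the column into an ordered categorical before grouping. This is a minimal sketch on hypothetical scores; `grade_order` is a name I introduced for illustration.

```python
import pandas as pd

# Hypothetical scores, one row per student.
df = pd.DataFrame({
    "grade": ["10th", "9th", "12th", "11th", "9th"],
    "math_score": [70, 80, 90, 60, 100],
})

grade_order = ["9th", "10th", "11th", "12th"]
df["grade"] = pd.Categorical(df["grade"], categories=grade_order, ordered=True)

# Grouping on an ordered categorical sorts by category order, not alphabetically.
# observed=True keeps only grades that actually appear in the data.
by_grade = df.groupby("grade", observed=True)["math_score"].mean()
print(by_grade.index.tolist())
```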


In [36]:
# Categorize each school by budget per student using below bins.
spending_bins = [0, 585, 630, 645, 680]
spending_labels = ["<$585", "$585-630", "$630-645", "$645-680"]
school_summary_df["Spending Group"] = pd.cut(school_summary_df["Budget per Student"],spending_bins,labels=spending_labels, include_lowest=True)
school_summary_df.head()

Unnamed: 0,School Name,Type,Budget,No. Students,Budget per Student,Avg Math Score,Avg Reading Score,% Passing Math,% Passing Reading,% Overall Passing,Spending Group
0,Bailey High School,District,3124928.0,4976,628.0,77.048432,81.033963,66.680064,81.93328,54.642283,$585-630
1,Cabrera High School,Charter,1081356.0,1858,582.0,83.061895,83.97578,94.133477,97.039828,91.334769,<$585
2,Figueroa High School,District,1884411.0,2949,639.0,76.711767,81.15802,65.988471,80.739234,53.204476,$630-645
3,Ford High School,District,1763916.0,2739,644.0,77.102592,80.746258,68.309602,79.299014,54.289887,$630-645
4,Griffin High School,Charter,917500.0,1468,625.0,83.351499,83.816757,93.392371,97.138965,90.599455,$585-630
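One detail of `pd.cut` worth knowing here: by default each bin includes its right edge, so a school spending exactly $585 per student lands in the "<$585" group. The sketch below compares the default with `right=False` on hypothetical budgets placed on or next to the bin edges.

```python
import pandas as pd

spending_bins = [0, 585, 630, 645, 680]
spending_labels = ["<$585", "$585-630", "$630-645", "$645-680"]

# Hypothetical per-student budgets sitting on or next to the bin edges.
budgets = pd.Series([584, 585, 586, 630, 645])

# Default: bins are closed on the right, e.g. (585, 630].
right_closed = pd.cut(budgets, spending_bins, labels=spending_labels, include_lowest=True)
# right=False: bins are closed on the left, e.g. [585, 630).
left_closed = pd.cut(budgets, spending_bins, labels=spending_labels, right=False)

print(right_closed.tolist())
print(left_closed.tolist())
```

With `right=False` the labels match their literal meaning at the edges, but a value equal to the last edge (680) would then fall outside every bin, so the choice depends on the data.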


In [37]:
# Calculate the following average scores by spending group. Fun fact: this doesn't work if the numbers have already been formatted as strings to look nice to human eyes.
# Select each column before averaging so mean() is only applied to numeric data.
spending_math_scores = school_summary_df.groupby("Spending Group")["Avg Math Score"].mean()
spending_reading_scores = school_summary_df.groupby("Spending Group")["Avg Reading Score"].mean()
spending_passing_math = school_summary_df.groupby("Spending Group")["% Passing Math"].mean()
spending_passing_reading = school_summary_df.groupby("Spending Group")["% Passing Reading"].mean()
overall_passing_spending = school_summary_df.groupby("Spending Group")["% Overall Passing"].mean()

In [38]:
# Create a new dataframe to hold average scores by spending bin.
spending_summary_df=pd.DataFrame()
spending_summary_df["Average Math Score"]=spending_math_scores
spending_summary_df["Average Reading Score"]=spending_reading_scores
spending_summary_df["% Passing Math"]=spending_passing_math
spending_summary_df["% Passing Reading"]=spending_passing_reading
spending_summary_df["% Overall Passing"]=overall_passing_spending
spending_summary_df

Spending Group,Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing
<$585,83.455399,83.933814,93.460096,96.610877,90.369459
$585-630,81.899826,83.155286,87.133538,92.718205,81.418596
$630-645,78.518855,81.624473,73.484209,84.391793,62.857656
$645-680,76.99721,81.027843,66.164813,81.133951,53.526855


In [39]:
# Create some new bins to categorize the schools by number of students. 
size_bins = [0, 1000, 2000, 5000]
size_labels = ["Small (<1000)", "Medium (1000-2000)", "Large (2000-5000)"]

school_summary_df["School Size"]=pd.cut(school_summary_df["No. Students"],size_bins,labels=size_labels, include_lowest=True)
school_summary_df.head()

Unnamed: 0,School Name,Type,Budget,No. Students,Budget per Student,Avg Math Score,Avg Reading Score,% Passing Math,% Passing Reading,% Overall Passing,Spending Group,School Size
0,Bailey High School,District,3124928.0,4976,628.0,77.048432,81.033963,66.680064,81.93328,54.642283,$585-630,Large (2000-5000)
1,Cabrera High School,Charter,1081356.0,1858,582.0,83.061895,83.97578,94.133477,97.039828,91.334769,<$585,Medium (1000-2000)
2,Figueroa High School,District,1884411.0,2949,639.0,76.711767,81.15802,65.988471,80.739234,53.204476,$630-645,Large (2000-5000)
3,Ford High School,District,1763916.0,2739,644.0,77.102592,80.746258,68.309602,79.299014,54.289887,$630-645,Large (2000-5000)
4,Griffin High School,Charter,917500.0,1468,625.0,83.351499,83.816757,93.392371,97.138965,90.599455,$585-630,Medium (1000-2000)


In [40]:
# Calculate the following average scores by school size.
# Select each column before averaging so mean() is only applied to numeric data.
size_math_scores = school_summary_df.groupby("School Size")["Avg Math Score"].mean()
size_reading_scores = school_summary_df.groupby("School Size")["Avg Reading Score"].mean()
size_passing_math = school_summary_df.groupby("School Size")["% Passing Math"].mean()
size_passing_reading = school_summary_df.groupby("School Size")["% Passing Reading"].mean()
overall_passing_size = school_summary_df.groupby("School Size")["% Overall Passing"].mean()

# Same thing again, create a new dataframe to hold average scores by school size.
size_summary=pd.DataFrame()
size_summary["Average Math Score"]=size_math_scores
size_summary["Average Reading Score"]=size_reading_scores
size_summary["% Passing Math"]=size_passing_math
size_summary["% Passing Reading"]=size_passing_reading
size_summary["% Overall Passing"]=overall_passing_size
size_summary

School Size,Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing
Small (<1000),83.821598,83.929843,93.550225,96.099437,89.883853
Medium (1000-2000),83.374684,83.864438,93.599695,96.79068,90.621535
Large (2000-5000),77.746417,81.344493,69.963361,82.766634,58.286003
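The five separate groupby calls and the column-by-column dataframe build can also be collapsed into one grouped mean over a list of columns. A sketch using four rows copied from the school summary above, with only two of the metric columns kept for brevity:

```python
import pandas as pd

# Four rows copied from the school summary above (two metric columns kept).
summary = pd.DataFrame({
    "School Name": ["Bailey High School", "Cabrera High School",
                    "Figueroa High School", "Griffin High School"],
    "No. Students": [4976, 1858, 2949, 1468],
    "Avg Math Score": [77.048432, 83.061895, 76.711767, 83.351499],
    "% Overall Passing": [54.642283, 91.334769, 53.204476, 90.599455],
})

size_bins = [0, 1000, 2000, 5000]
size_labels = ["Small (<1000)", "Medium (1000-2000)", "Large (2000-5000)"]
summary["School Size"] = pd.cut(summary["No. Students"], size_bins, labels=size_labels)

metric_cols = ["Avg Math Score", "% Overall Passing"]
# observed=False keeps empty size categories in the result (as NaN rows).
by_size = summary.groupby("School Size", observed=False)[metric_cols].mean()
print(by_size)
```

Note that this, like the cell above, averages the per-school figures, so a school with 400 students counts as much as one with 4,000.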


In [41]:
# The next step is to do the exact same thing but by school type (district or charter).
# I actually did this in the first portion of this notebook, but let's call up that dataframe again. It is named differently than the challenge instructions specify.
district_summary_df

Unnamed: 0,Total Schools,Total Budget,Total Students,Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing
District,7,"$17,347,923.00",26976,76.99,80.96,66.52%,80.91%,53.70%
Charter,8,"$7,301,505.00",12194,83.41,83.9,93.70%,96.65%,90.56%


Analysis:

In terms of average scores and % passing students, small and medium schools tend to perform better than large schools: roughly 90% of their students pass both subjects, compared with about 58% at large schools. I attended both small and large schools, and that lines up with my personal experience. Larger schools may not have enough resources for the number of students they support.

In general, performance stays about the same across grades: average math scores range from 78.94 to 79.08 and average reading scores from 81.82 to 81.91, so no grade does noticeably better or worse than another.

Charter schools also tend to perform much better than district schools (90.56% overall passing versus 53.70%). Charter schools make up all five of the top-performing schools by overall passing rate, and district schools make up all five of the lowest-performing. I looked this up out of curiosity, and some often-stated reasons for charter schools' higher performance are smaller sizes (in line with paragraph 1) and flexibility/speed in adopting new educational processes that may be more effective. My thought is that such flexibility may also make charter schools a more attractive option for skilled teachers. I've definitely heard teachers complain about having their hands tied by all the (ineffective) hoops they're made to jump through by public schools, and about feeling less interested in teaching in such an environment.

Additionally, schools with a lower budget per student tend to perform better than schools with a higher budget per student: overall passing falls from about 90% in the <$585 group to about 54% in the $645-680 group. I might've initially assumed otherwise. I'm considering this along with the charter school points above. My quick online search tells me that charter schools often have smaller budgets than public schools, though this goes along with smaller student populations. As a result, charter schools sometimes end up having to do more fundraising. My thought is that this school system, with these requirements and student population caps, might bring in more parents who have the time/financial flexibility to work around the charter school requirements. Parents in these situations would probably also have more resources at home to devote to their children's education.

(This is tangential and not really supported by the data, which does not include the number of teachers per school, testing requirements, how the budget is allocated between all of the schools' expenses, parents' personal info, etc.)
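One thing the data can show is how entangled school type and school size are, which matters for reading the size and type results separately. The sketch below cross-tabulates the two for the twelve schools whose rows are displayed in the tables above. Three of the fifteen schools never appear in a displayed row with their type, so they are left out and this is only a partial picture.

```python
import pandas as pd

# Type and student count for the twelve schools whose rows appear in the tables above.
schools = pd.DataFrame({
    "School Name": ["Bailey", "Cabrera", "Figueroa", "Ford", "Griffin", "Thomas",
                    "Wilson", "Pena", "Rodriguez", "Huang", "Hernandez", "Johnson"],
    "Type": ["District", "Charter", "District", "District", "Charter", "Charter",
             "Charter", "Charter", "District", "District", "District", "District"],
    "No. Students": [4976, 1858, 2949, 2739, 1468, 1635,
                     2283, 962, 3999, 2917, 4635, 4761],
})

size_bins = [0, 1000, 2000, 5000]
size_labels = ["Small (<1000)", "Medium (1000-2000)", "Large (2000-5000)"]
schools["School Size"] = pd.cut(schools["No. Students"], size_bins, labels=size_labels)

# Count schools in each combination of type and size.
counts = pd.crosstab(schools["Type"], schools["School Size"])
print(counts)
```

In this subset every district school is large and most charter schools are not, so the size comparison and the type comparison are largely measuring the same split.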