# PyCity Schools Analysis

* As a whole, schools with higher budgets did not yield better test results. On the contrary, schools with higher spending per student (\$645-675) underperformed schools with lower spending (<\$585 per student).

* As a whole, small and medium-sized schools dramatically outperformed large schools on math pass rates (about 94% passing vs 70%).

* As a whole, charter schools outperformed the public district schools across all metrics. However, more analysis is required to determine whether the effect is due to school practices or to the fact that charter schools tend to serve smaller student populations per school.
---

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [281]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
school_data_to_load = "Desktop/schools_complete.csv"
student_data_to_load = "Desktop/students_complete.csv"

# Read School and Student Data File and store into Pandas Data Frames
school_df1 = pd.read_csv(school_data_to_load)
student_df = pd.read_csv(student_data_to_load)

# Combine the data into a single dataset
school_df = pd.merge(student_df, school_df1, how="left", on="school_name")
#display headers
school_df.head()


,Student ID,student_name,gender,grade,school_name,reading_score,math_score,School ID,type,size,budget
0,0,Paul Bradley,M,9th,Huang High School,66,79,0,District,2917,1910635
1,1,Victor Smith,M,12th,Huang High School,94,61,0,District,2917,1910635
2,2,Kevin Rodriguez,M,12th,Huang High School,90,60,0,District,2917,1910635
3,3,Dr. Richard Scott,M,12th,Huang High School,67,58,0,District,2917,1910635
4,4,Bonnie Ray,F,9th,Huang High School,97,84,0,District,2917,1910635
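
The left merge above relies on every student row matching exactly one school row. A minimal sketch of how that assumption could be checked, using small made-up frames in place of the real CSV files (the names `students`, `schools`, and `unmatched` are illustrative only):

```python
import pandas as pd

# Toy stand-ins for the student and school tables (values are illustrative).
students = pd.DataFrame({
    "student_name": ["A", "B", "C"],
    "school_name": ["Huang High School", "Huang High School", "Pena High School"],
    "math_score": [79, 61, 84],
})
schools = pd.DataFrame({
    "school_name": ["Huang High School", "Pena High School"],
    "type": ["District", "Charter"],
    "budget": [1910635, 585858],
})

# validate="many_to_one" raises an error if a school name is duplicated in
# `schools`; indicator=True adds a "_merge" column that flags students whose
# school was not found.
merged = pd.merge(students, schools, how="left", on="school_name",
                  validate="many_to_one", indicator=True)
unmatched = merged[merged["_merge"] == "left_only"]
print(len(merged), len(unmatched))
```

If `unmatched` is non-empty, some students would carry missing school attributes into every later summary.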


In [282]:
#created dataframes to hold counts of passing students
percent_passing_math_df = school_df.loc[school_df["math_score"] >= 70, :]
percent_passing_reading_df = school_df.loc[school_df["reading_score"] >= 70, :]

## District Summary

* Calculate the total number of schools

* Calculate the total number of students

* Calculate the total budget

* Calculate the average math score 

* Calculate the average reading score

* Calculate the overall passing rate (overall average score), i.e. (avg. math score + avg. reading score)/2

* Calculate the percentage of students with a passing math score (70 or greater)

* Calculate the percentage of students with a passing reading score (70 or greater)

* Create a dataframe to hold the above results

* Optional: give the displayed data cleaner formatting
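
The percentage steps above can be sketched with a boolean mask, since the mean of a True/False column is the share of True values. This is a sketch on made-up data, not the notebook's actual method. It also takes one budget value per school name before summing, which guards against two schools happening to share the same budget figure.

```python
import pandas as pd

# Illustrative merged student/school rows (values are made up).
scores = pd.DataFrame({
    "school_name": ["X", "X", "Y", "Y"],
    "math_score": [70, 69, 90, 50],
    "reading_score": [80, 95, 65, 72],
    "budget": [1000, 1000, 2000, 2000],
})

passing_mark = 70  # "70 or greater" counts as passing

# Mean of a boolean Series = fraction of rows that are True.
pct_passing_math = (scores["math_score"] >= passing_mark).mean() * 100
pct_passing_reading = (scores["reading_score"] >= passing_mark).mean() * 100

# The budget repeats on every student row, so keep one row per school first.
total_budget = scores.drop_duplicates("school_name")["budget"].sum()

print(pct_passing_math, pct_passing_reading, total_budget)
```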

In [283]:
# District-wide totals and averages; nunique() counts each school name once
total_schools = school_df["school_name"].nunique()
total_students = school_df["student_name"].count()
total_budget = school_df["budget"].unique().sum()
average_maths = round(school_df["math_score"].mean(), 1)
average_reading = round(school_df["reading_score"].mean(), 1)
overall_pass_rate = ((round(school_df["math_score"].mean(), 1)) + (round(school_df["reading_score"].mean(), 1)))/2
percent_passing_math = round(((percent_passing_math_df["math_score"].count())/(school_df["student_name"].count()))*100, 2)
percent_passing_reading = round(((percent_passing_reading_df["reading_score"].count())/(school_df["student_name"].count()))*100, 2)


district_summary_df = pd.DataFrame({"Total Schools": [total_schools], "Total Number of Students": [total_students], "Total Budget($)": [total_budget],
                                   "Average Math Score": [average_maths], "Average Reading Score": [average_reading],
                                   "Overall Average Score": [overall_pass_rate], "Passing: Math(%)": [percent_passing_math], 
                                   "Passing: Reading(%)": [percent_passing_reading]})


district_summary_df

,Total Schools,Total Number of Students,Total Budget($),Average Math Score,Average Reading Score,Overall Average Score,Passing: Math(%),Passing: Reading(%)
0,15,39170,24649428,79.0,81.9,80.45,74.98,85.81


In [284]:
temp_dict_list = []
for school in school_df["school_name"].unique().tolist():
    temp_dict = {"School Name": school_df.loc[school_df["school_name"]==school]["school_name"].max(), 
                 "School Type": school_df.loc[school_df["school_name"]==school]["type"].max(), 
                "Total Student": len(school_df.loc[school_df["school_name"]==school]), 
                "Total Budget": school_df.loc[school_df["school_name"]==school]["budget"].max(),
                "Per Stu Budget": (school_df.loc[school_df["school_name"]==school]["budget"].max())/(len(school_df.loc[school_df["school_name"]==school])),
                "Math Avg Score": school_df.loc[school_df["school_name"]==school]["math_score"].mean(), 
                "Reading Avg Score": school_df.loc[school_df["school_name"]==school]["reading_score"].mean(), 
                "Math Pct Pass": ((len(school_df.loc[(school_df["school_name"]==school) & (school_df["math_score"]>69)]))/(len(school_df.loc[school_df["school_name"]==school])))*100,
                "Reading Pct Pass": ((len(school_df.loc[(school_df["school_name"]==school)&(school_df["reading_score"]>69)]))/(len(school_df.loc[school_df["school_name"]==school])))*100, 
                "Overall Pass Rate": ((((len(school_df.loc[(school_df["school_name"]==school) & (school_df["math_score"]>69)]))/(len(school_df.loc[school_df["school_name"]==school])))*100 + ((len(school_df.loc[(school_df["school_name"]==school) & (school_df["reading_score"]>69)]))/(len(school_df.loc[school_df["school_name"]==school])))*100)/2)}
    
    temp_dict_list.append(temp_dict)
SchoolSummary = pd.DataFrame(temp_dict_list)
SchoolSummary 


,Math Avg Score,Math Pct Pass,Overall Pass Rate,Per Stu Budget,Reading Avg Score,Reading Pct Pass,School Name,School Type,Total Budget,Total Student
0,76.629414,65.683922,73.500171,655.0,81.182722,81.316421,Huang High School,District,1910635,2917
1,76.711767,65.988471,73.363852,639.0,81.15802,80.739234,Figueroa High School,District,1884411,2949
2,83.359455,93.867121,94.860875,600.0,83.725724,95.854628,Shelton High School,Charter,1056600,1761
3,77.289752,66.752967,73.807983,652.0,80.934412,80.862999,Hernandez High School,District,3022020,4635
4,83.351499,93.392371,95.265668,625.0,83.816757,97.138965,Griffin High School,Charter,917500,1468
5,83.274201,93.867718,95.203679,578.0,83.989488,96.539641,Wilson High School,Charter,1319574,2283
6,83.061895,94.133477,95.586652,582.0,83.97578,97.039828,Cabrera High School,Charter,1081356,1858
7,77.048432,66.680064,74.306672,628.0,81.033963,81.93328,Bailey High School,District,3124928,4976
8,83.803279,92.505855,94.379391,581.0,83.814988,96.252927,Holden High School,Charter,248087,427
9,83.839917,94.594595,95.27027,609.0,84.044699,95.945946,Pena High School,Charter,585858,962
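
The per-school loop above filters the full frame many times for each school. As an alternative sketch, the same metrics could be computed in one pass with `groupby` and named aggregation. The toy data and the snake_case column names below are assumptions made for illustration, not the notebook's own.

```python
import pandas as pd

# Illustrative merged rows (values are made up).
df = pd.DataFrame({
    "school_name": ["X", "X", "Y", "Y", "Y"],
    "type": ["District", "District", "Charter", "Charter", "Charter"],
    "budget": [1200, 1200, 1500, 1500, 1500],
    "math_score": [60, 80, 90, 70, 65],
    "reading_score": [75, 65, 85, 95, 72],
})

# Boolean pass flags; their per-school mean is the pass fraction.
df = df.assign(math_pass=df["math_score"] >= 70,
               reading_pass=df["reading_score"] >= 70)

summary = df.groupby("school_name").agg(
    school_type=("type", "first"),
    total_students=("math_score", "size"),
    total_budget=("budget", "first"),
    math_avg=("math_score", "mean"),
    reading_avg=("reading_score", "mean"),
    math_pct_pass=("math_pass", "mean"),
    reading_pct_pass=("reading_pass", "mean"),
)

# Convert fractions to percentages and derive the remaining columns.
summary["math_pct_pass"] = summary["math_pct_pass"] * 100
summary["reading_pct_pass"] = summary["reading_pct_pass"] * 100
summary["per_student_budget"] = summary["total_budget"] / summary["total_students"]
summary["overall_pass_rate"] = (summary["math_pct_pass"] + summary["reading_pct_pass"]) / 2
print(summary)
```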


In [285]:

#School summary is above

## School Summary

* Create an overview table that summarizes key metrics about each school, including:
  * School Name
  * School Type
  * Total Students
  * Total School Budget
  * Per Student Budget
  * Average Math Score
  * Average Reading Score
  * % Passing Math
  * % Passing Reading
  * Overall Passing Rate (Average of the above two)
  
* Create a dataframe to hold the above results

## Top Performing Schools (By Passing Rate)

* Sort and display the top five schools in overall passing rate

In [286]:
top_performing_schools_df = SchoolSummary.sort_values("Overall Pass Rate", ascending=False)
#top_performing_schools_df = top_performing_schools_df.reset_index(drop=True)
top_performing_schools_df = top_performing_schools_df.set_index('School Name')
top_performing_schools_df.head()

School Name,Math Avg Score,Math Pct Pass,Overall Pass Rate,Per Stu Budget,Reading Avg Score,Reading Pct Pass,School Type,Total Budget,Total Student
Cabrera High School,83.061895,94.133477,95.586652,582.0,83.97578,97.039828,Charter,1081356,1858
Thomas High School,83.418349,93.272171,95.29052,638.0,83.84893,97.308869,Charter,1043130,1635
Pena High School,83.839917,94.594595,95.27027,609.0,84.044699,95.945946,Charter,585858,962
Griffin High School,83.351499,93.392371,95.265668,625.0,83.816757,97.138965,Charter,917500,1468
Wilson High School,83.274201,93.867718,95.203679,578.0,83.989488,96.539641,Charter,1319574,2283


## Bottom Performing Schools (By Passing Rate)

* Sort and display the five worst-performing schools

In [287]:
bottom_performing_schools_df = SchoolSummary.sort_values("Overall Pass Rate")
bottom_performing_schools_df = bottom_performing_schools_df.reset_index(drop=True)
bottom_performing_schools_df = bottom_performing_schools_df.set_index('School Name')
bottom_performing_schools_df.head()

School Name,Math Avg Score,Math Pct Pass,Overall Pass Rate,Per Stu Budget,Reading Avg Score,Reading Pct Pass,School Type,Total Budget,Total Student
Rodriguez High School,76.842711,66.366592,73.293323,637.0,80.744686,80.220055,District,2547363,3999
Figueroa High School,76.711767,65.988471,73.363852,639.0,81.15802,80.739234,District,1884411,2949
Huang High School,76.629414,65.683922,73.500171,655.0,81.182722,81.316421,District,1910635,2917
Johnson High School,77.072464,66.057551,73.639992,650.0,80.966394,81.222432,District,3094650,4761
Ford High School,77.102592,68.309602,73.804308,644.0,80.746258,79.299014,District,1763916,2739
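
As a possible shortcut for the two rankings, `DataFrame.nlargest` and `DataFrame.nsmallest` return the top or bottom rows without sorting the whole frame first. The sketch below uses a made-up four-school frame.

```python
import pandas as pd

# Illustrative ranking data (values are made up).
ranking = pd.DataFrame({
    "School Name": ["A", "B", "C", "D"],
    "Overall Pass Rate": [95.6, 73.3, 94.4, 74.3],
}).set_index("School Name")

top_two = ranking.nlargest(2, "Overall Pass Rate")      # best first
bottom_two = ranking.nsmallest(2, "Overall Pass Rate")  # worst first
print(top_two)
print(bottom_two)
```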


In [288]:
# Average math score per school for each grade. Selecting the score column
# directly avoids dropping columns in place on a slice of school_df, which
# triggered a SettingWithCopyWarning.
ninth_grade_df = school_df.loc[school_df["grade"] == "9th"].groupby("school_name")[["math_score"]].mean()
tenth_grade_df = school_df.loc[school_df["grade"] == "10th"].groupby("school_name")[["math_score"]].mean()
eleventh_grade_df = school_df.loc[school_df["grade"] == "11th"].groupby("school_name")[["math_score"]].mean()
twelvth_grade_df = school_df.loc[school_df["grade"] == "12th"].groupby("school_name")[["math_score"]].mean()
twelvth_grade_df


school_name,math_score
Bailey High School,76.492218
Cabrera High School,83.277487
Figueroa High School,77.151369
Ford High School,76.179963
Griffin High School,83.356164
Hernandez High School,77.186567
Holden High School,82.855422
Huang High School,77.225641
Johnson High School,76.863248
Pena High School,84.121547


## Math Scores by Grade

* Create a table that lists the average Math Score for students of each grade level (9th, 10th, 11th, 12th) at each school.

  * Create a pandas series for each grade. Hint: use a conditional statement.
  
  * Group each series by school
  
  * Combine the series into a dataframe
  
  * Optional: give the displayed data cleaner formatting
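
The filter, group, and combine steps listed above could also be done in a single call with `pivot_table`. This is a sketch on made-up data. The explicit `grade_order` list is an assumption added here because sorting the grade labels as plain strings would put "10th" before "9th".

```python
import pandas as pd

# Illustrative student rows (values are made up).
df = pd.DataFrame({
    "school_name": ["X", "X", "X", "Y", "Y"],
    "grade": ["9th", "9th", "10th", "9th", "12th"],
    "math_score": [70, 80, 90, 60, 100],
})

grade_order = ["9th", "10th", "11th", "12th"]

# One row per school, one column per grade, cells hold the mean score.
by_grade = df.pivot_table(index="school_name", columns="grade",
                          values="math_score", aggfunc="mean")

# Put the grade columns in school order; grades with no data become NaN.
by_grade = by_grade.reindex(columns=grade_order)
print(by_grade)
```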

In [289]:
merge_math_score = pd.merge(ninth_grade_df, tenth_grade_df, on="school_name")
merge_math_score = merge_math_score.rename(columns={"math_score_x": "9th Grade", "math_score_y": "10th Grade"})

merge_math_score2 = pd.merge(merge_math_score, eleventh_grade_df, on="school_name")
merge_math_score2 = merge_math_score2.rename(columns={"math_score": "11th Grade"})

merge_math_score3 = pd.merge(merge_math_score2, twelvth_grade_df, on="school_name")
merge_math_score3 = merge_math_score3.rename(columns={"math_score": "12th Grade"})


merge_math_score3


school_name,9th Grade,10th Grade,11th Grade,12th Grade
Bailey High School,77.083676,76.996772,77.515588,76.492218
Cabrera High School,83.094697,83.154506,82.76556,83.277487
Figueroa High School,76.403037,76.539974,76.884344,77.151369
Ford High School,77.361345,77.672316,76.918058,76.179963
Griffin High School,82.04401,84.229064,83.842105,83.356164
Hernandez High School,77.438495,77.337408,77.136029,77.186567
Holden High School,83.787402,83.429825,85.0,82.855422
Huang High School,77.027251,75.908735,76.446602,77.225641
Johnson High School,77.187857,76.691117,77.491653,76.863248
Pena High School,83.625455,83.372,84.328125,84.121547


In [290]:
# Average reading score per school for each grade, selecting the score column
# directly instead of dropping the other columns in place on a slice.
Rninth_grade_df = school_df.loc[school_df["grade"] == "9th"].groupby("school_name")[["reading_score"]].mean()
Rtenth_grade_df = school_df.loc[school_df["grade"] == "10th"].groupby("school_name")[["reading_score"]].mean()
Releventh_grade_df = school_df.loc[school_df["grade"] == "11th"].groupby("school_name")[["reading_score"]].mean()
Rtwelvth_grade_df = school_df.loc[school_df["grade"] == "12th"].groupby("school_name")[["reading_score"]].mean()
Rtwelvth_grade_df

school_name,reading_score
Bailey High School,80.912451
Cabrera High School,84.287958
Figueroa High School,81.384863
Ford High School,80.662338
Griffin High School,84.013699
Hernandez High School,80.857143
Holden High School,84.698795
Huang High School,80.305983
Johnson High School,81.227564
Pena High School,84.59116


## Reading Score by Grade 

* Perform the same operations as above for reading scores

In [291]:
# Combine the per-grade reading averages into one table, one column per grade
merge_reading_score = pd.merge(Rninth_grade_df, Rtenth_grade_df, on="school_name")
merge_reading_score = merge_reading_score.rename(columns={"reading_score_x": "9th Grade", "reading_score_y": "10th Grade"})

merge_reading_score2 = pd.merge(merge_reading_score, Releventh_grade_df, on="school_name")
merge_reading_score2 = merge_reading_score2.rename(columns={"reading_score": "11th Grade"})

merge_reading_score3 = pd.merge(merge_reading_score2, Rtwelvth_grade_df, on="school_name")
Reading_score = merge_reading_score3.rename(columns={"reading_score": "12th Grade"})


Reading_score

school_name,9th Grade,10th Grade,11th Grade,12th Grade
Bailey High School,81.303155,80.907183,80.945643,80.912451
Cabrera High School,83.676136,84.253219,83.788382,84.287958
Figueroa High School,81.198598,81.408912,80.640339,81.384863
Ford High School,80.632653,81.262712,80.403642,80.662338
Griffin High School,83.369193,83.706897,84.288089,84.013699
Hernandez High School,80.86686,80.660147,81.39614,80.857143
Holden High School,83.677165,83.324561,83.815534,84.698795
Huang High School,81.290284,81.512386,81.417476,80.305983
Johnson High School,81.260714,80.773431,80.616027,81.227564
Pena High School,83.807273,83.612,84.335938,84.59116


## Scores by School Spending

* Create a table that breaks down school performances based on average Spending Ranges (Per Student). Use 4 reasonable bins to group school spending. Include in the table each of the following:
  * Average Math Score
  * Average Reading Score
  * % Passing Math
  * % Passing Reading
  * Overall Passing Rate (Average of the above two)
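
Binning of this kind is typically done with `pd.cut`. The sketch below, on made-up per-student budgets, shows one detail worth keeping in mind: by default each bin includes its right edge, so a value of exactly 585 lands in the bin labelled `<$585`.

```python
import pandas as pd

# Illustrative per-student budgets (values are made up).
per_student = pd.Series([578.0, 585.0, 600.0, 655.0])

spending_bins = [0, 585, 615, 645, 675]
labels = ["<$585", "$585-615", "$615-645", "$645-675"]

# Default right=True gives intervals (0, 585], (585, 615], (615, 645], (645, 675].
binned = pd.cut(per_student, spending_bins, labels=labels)
print(binned.tolist())
```

Passing `right=False` would instead make each bin include its left edge, which would match the `<$585` label exactly but would require an upper edge above the largest value.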

In [292]:
# Sample bins. Feel free to create your own bins.
spending_bins = [0, 585, 615, 645, 675]
group_names = ["<$585", "$585-615", "$615-645", "$645-675"]


In [293]:
SchoolSummary.drop(['Total Budget', 'School Name'], axis=1, inplace=True)

In [294]:
SchoolSummary.head()


,Math Avg Score,Math Pct Pass,Overall Pass Rate,Per Stu Budget,Reading Avg Score,Reading Pct Pass,School Type,Total Student
0,76.629414,65.683922,73.500171,655.0,81.182722,81.316421,District,2917
1,76.711767,65.988471,73.363852,639.0,81.15802,80.739234,District,2949
2,83.359455,93.867121,94.860875,600.0,83.725724,95.854628,Charter,1761
3,77.289752,66.752967,73.807983,652.0,80.934412,80.862999,District,4635
4,83.351499,93.392371,95.265668,625.0,83.816757,97.138965,Charter,1468


In [295]:
pd.cut(SchoolSummary["Per Stu Budget"], spending_bins, labels=group_names).head()



0    $645-675
1    $615-645
2    $585-615
3    $645-675
4    $615-645
Name: Per Stu Budget, dtype: category
Categories (4, object): [<$585 < $585-615 < $615-645 < $645-675]

In [296]:
SchoolSummary["Spending Ranges (Per Student)"] = pd.cut(SchoolSummary["Per Stu Budget"], spending_bins, labels=group_names)
SchoolSummary

#SchoolSummarySpending = SchoolSummary.groupby("Spending Ranges (Per Student)")
#SchoolSummarySpending = SchoolSummary.set_index('Spending Ranges (Per Student)')
#SchoolSummarySpending

,Math Avg Score,Math Pct Pass,Overall Pass Rate,Per Stu Budget,Reading Avg Score,Reading Pct Pass,School Type,Total Student,Spending Ranges (Per Student)
0,76.629414,65.683922,73.500171,655.0,81.182722,81.316421,District,2917,$645-675
1,76.711767,65.988471,73.363852,639.0,81.15802,80.739234,District,2949,$615-645
2,83.359455,93.867121,94.860875,600.0,83.725724,95.854628,Charter,1761,$585-615
3,77.289752,66.752967,73.807983,652.0,80.934412,80.862999,District,4635,$645-675
4,83.351499,93.392371,95.265668,625.0,83.816757,97.138965,Charter,1468,$615-645
5,83.274201,93.867718,95.203679,578.0,83.989488,96.539641,Charter,2283,<$585
6,83.061895,94.133477,95.586652,582.0,83.97578,97.039828,Charter,1858,<$585
7,77.048432,66.680064,74.306672,628.0,81.033963,81.93328,District,4976,$615-645
8,83.803279,92.505855,94.379391,581.0,83.814988,96.252927,Charter,427,<$585
9,83.839917,94.594595,95.27027,609.0,84.044699,95.945946,Charter,962,$585-615


In [297]:
# numeric_only=True skips the text column "School Type", which cannot be averaged
SchoolSummarySpending = SchoolSummary.groupby("Spending Ranges (Per Student)").mean(numeric_only=True)

SchoolSummarySpending

Spending Ranges (Per Student),Math Avg Score,Math Pct Pass,Overall Pass Rate,Per Stu Budget,Reading Avg Score,Reading Pct Pass,Total Student
<$585,83.455399,93.460096,95.035486,581.0,83.933814,96.610877,1592.0
$585-615,83.599686,94.230858,95.065572,604.5,83.885211,95.900287,1361.5
$615-645,79.079225,75.668212,80.887391,635.166667,81.891436,86.106569,2961.0
$645-675,76.99721,66.164813,73.649382,652.333333,81.027843,81.133951,4104.333333


In [298]:
#example box kept for looks

## Scores by School Size

* Perform the same operations as above, based on school size.

In [299]:
# Sample bins. Feel free to create your own bins.
size_bins = [0, 1000, 2000, 5000]
# Separate name so the spending labels in group_names are not overwritten
size_group_names = ["Small (<1000)", "Medium (1000-2000)", "Large (2000-5000)"]

In [300]:
pd.cut(SchoolSummary["Total Student"], size_bins, labels=size_group_names).head()

0     Large (2000-5000)
1     Large (2000-5000)
2    Medium (1000-2000)
3     Large (2000-5000)
4    Medium (1000-2000)
Name: Total Student, dtype: category
Categories (3, object): [Small (<1000) < Medium (1000-2000) < Large (2000-5000)]

In [302]:
SchoolSummary["School Size"] = pd.cut(SchoolSummary["Total Student"], size_bins, labels=size_group_names)
SchoolSummary.head()

,Math Avg Score,Math Pct Pass,Overall Pass Rate,Per Stu Budget,Reading Avg Score,Reading Pct Pass,School Type,Total Student,Spending Ranges (Per Student),School Size
0,76.629414,65.683922,73.500171,655.0,81.182722,81.316421,District,2917,$645-675,Large (2000-5000)
1,76.711767,65.988471,73.363852,639.0,81.15802,80.739234,District,2949,$615-645,Large (2000-5000)
2,83.359455,93.867121,94.860875,600.0,83.725724,95.854628,Charter,1761,$585-615,Medium (1000-2000)
3,77.289752,66.752967,73.807983,652.0,80.934412,80.862999,District,4635,$645-675,Large (2000-5000)
4,83.351499,93.392371,95.265668,625.0,83.816757,97.138965,Charter,1468,$615-645,Medium (1000-2000)
5,83.274201,93.867718,95.203679,578.0,83.989488,96.539641,Charter,2283,<$585,Large (2000-5000)
6,83.061895,94.133477,95.586652,582.0,83.97578,97.039828,Charter,1858,<$585,Medium (1000-2000)
7,77.048432,66.680064,74.306672,628.0,81.033963,81.93328,District,4976,$615-645,Large (2000-5000)
8,83.803279,92.505855,94.379391,581.0,83.814988,96.252927,Charter,427,<$585,Small (<1000)
9,83.839917,94.594595,95.27027,609.0,84.044699,95.945946,Charter,962,$585-615,Small (<1000)


In [303]:
# numeric_only=True skips the text and category columns, which cannot be averaged
SchoolSummarySchoolSize = SchoolSummary.groupby("School Size").mean(numeric_only=True)

SchoolSummarySchoolSize

School Size,Math Avg Score,Math Pct Pass,Overall Pass Rate,Per Stu Budget,Reading Avg Score,Reading Pct Pass,Total Student
Small (<1000),83.821598,93.550225,94.824831,595.0,83.929843,96.099437,694.5
Medium (1000-2000),83.374684,93.599695,95.195187,605.6,83.864438,96.79068,1704.4
Large (2000-5000),77.746417,69.963361,76.364998,635.375,81.344493,82.766634,3657.375


## Scores by School Type

* Perform the same operations as above, based on school type.

In [304]:
# numeric_only=True skips the category columns, which cannot be averaged
SchoolSummaryByType = SchoolSummary.groupby("School Type").mean(numeric_only=True)

SchoolSummaryByType

School Type,Math Avg Score,Math Pct Pass,Overall Pass Rate,Per Stu Budget,Reading Avg Score,Reading Pct Pass,Total Student
Charter,83.473852,93.62083,95.10366,599.5,83.896421,96.586489,1524.25
District,76.956733,66.548453,73.673757,643.571429,80.966636,80.799062,3853.714286


Markdown cell Analysis:
Variable to investigate further: This data does not let us separate the effect of school size from the effect of school type, because charter schools also tend to be smaller. Since school size is strongly associated with performance, this would be the most useful area for further investigation.
The breakdown of scores by grade shows that average scores do not differ meaningfully across grades within a school. This is unusual and could indicate a need for more stringent standards for promotion to the next grade. Typically you would expect a lower pass rate in 9th grade and the highest pass rate by 12th grade, so the flat pattern may mean that low-performing students are being rubber-stamped into the next grade level.
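
One way the size and type question could be explored is to group by both variables at once and compare charter and district schools within the same size band. The sketch below uses made-up values and is only an illustration of the approach. With a small number of schools, many cells will be empty or hold a single school, so any comparison would be tentative.

```python
import pandas as pd

# Illustrative per-school summary (values are made up).
summary = pd.DataFrame({
    "School Type": ["Charter", "Charter", "District", "District", "Charter"],
    "Total Student": [900, 1800, 2900, 4600, 2300],
    "Overall Pass Rate": [95.0, 94.0, 73.0, 74.0, 95.0],
})

size_bins = [0, 1000, 2000, 5000]
size_labels = ["Small (<1000)", "Medium (1000-2000)", "Large (2000-5000)"]
summary["School Size"] = pd.cut(summary["Total Student"], size_bins, labels=size_labels)

# Mean pass rate for each (type, size) pair; unstack turns size into columns.
# observed=True keeps only the size categories that actually occur.
by_type_and_size = (
    summary.groupby(["School Type", "School Size"], observed=True)["Overall Pass Rate"]
    .mean()
    .unstack()
)
print(by_type_and_size)
```

Reading across a single column, for example the large schools, compares the two school types at a similar size.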