# PyCitySchools Analysis

Given the two PyCitySchools data files (school and student), these data show:

1) A higher budget per student did not translate into higher overall student passing rates.
Schools in the two lowest spending ranges (under $620 per student) had overall passing rates of about 90%.
Schools in the two highest spending ranges ($620-680 per student) had overall passing rates of about 59% and 54%.

2) Small schools (fewer than 1,000 students) and medium-size schools (between 1,000 and 2,000 students) had significantly higher overall student pass rates (90.1% and 90.6% respectively) than large schools (over 2,000 students), whose overall student pass rate was 56.6%.

3) Charter schools had significantly higher math, reading, and overall pass rates than district schools (overall: 90.6% versus 53.7%).

4) Even though the 15 schools in the district were almost evenly split across school type (district = 7, charter = 8), 
the low overall pass rates of district schools had a significant impact on the overall passing percentage of the entire district (65.2%).
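
As a rough check of finding 4, the sketch below assumes the district-wide overall passing rate is a student-weighted average of the two school-type rates (reasonable if every student belongs to exactly one school type). The three rates are copied from the summary tables in this notebook; the variable names are illustrative only.

```python
# Overall passing rates (%) copied from the summary tables in this notebook.
district_wide = 65.172
charter = 90.561
district_type = 53.696

# If district_wide = w * district_type + (1 - w) * charter, solve for w,
# the implied share of all students who attend district-type schools.
district_share = (charter - district_wide) / (charter - district_type)
print(f"Implied share of students in district schools: {district_share:.1%}")
```

Under that assumption, well over half of the students attend district-type schools, which would explain why their lower rates pull the district-wide figure down so far.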

## District Summary 

In [25]:
# Dependencies and Setup
import pandas as pd
#import os

# Files to Load
school_data_file = "Resources/schools_complete.csv"
student_data_file = "Resources/students_complete.csv"


In [26]:
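# Read school and student data files and store into Pandas DataFrames
# Note: pd.read_csv treats the first row as the header by default.
# Note: This reassigns school_data_file from the file path to the DataFrame, so rerun the cell above before rerunning this one.
school_data_file = pd.read_csv(school_data_file)

# Uncomment to preview the first rows
#school_data_file.head()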

In [27]:
# Read the student data into a DataFrame (the first row is the header)
student_data_file = pd.read_csv(student_data_file)
#student_data_file.head()

In [28]:
# Combine the data into a single dataset
# Note: pd.merge joins the two DataFrames on the shared "school_name" column.
# Note: The merged result is stored in "all_data_complete"; most of the analysis below reads from it.
# Note: how="left" keeps every row of the left table (students) and attaches the matching school columns to each row.
# Note: on = column name(s) to join on. They must be found in both DataFrames.
all_data_complete = pd.merge(student_data_file, school_data_file, how="left", on="school_name")
#print (all_data_complete)
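
One way to sanity-check a merge like the one above is sketched here on two tiny, made-up tables (the rows and the names `students`, `schools`, and `merged` are hypothetical, not the real CSV contents). It relies on the `validate` and `indicator` parameters of `pd.merge`.

```python
import pandas as pd

# Hypothetical miniature versions of the two tables (not the real CSV contents).
students = pd.DataFrame({
    "Student ID": [0, 1, 2],
    "school_name": ["A High", "A High", "B High"],
})
schools = pd.DataFrame({
    "school_name": ["A High", "B High"],
    "budget": [1000, 2000],
})

# validate="many_to_one" raises an error if a school name repeats in `schools`.
# indicator=True adds a "_merge" column that flags students with no matching school.
merged = pd.merge(students, schools, how="left", on="school_name",
                  validate="many_to_one", indicator=True)
unmatched = (merged["_merge"] == "left_only").sum()
print(len(merged), unmatched)
```

A left join that is truly many-to-one should keep the student row count unchanged and leave no unmatched rows.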

In [29]:
# Calculates total number of schools
# Notes: No need to loop and keep a running total; count the unique School IDs instead.
total_schools = len(all_data_complete["School ID"].unique())
#print (total_schools)

In [30]:
# Calculates total number of students based on Student IDs
# Note: Counting unique student names gave a different number (32715), because names are not unique.
total_students = len(all_data_complete["Student ID"].unique())
#print (total_students)
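
The difference between counting IDs and counting names can be seen on a small made-up example. The column name `student_name` and the rows below are assumptions for illustration.

```python
import pandas as pd

# Hypothetical rows: two different students share a name.
students = pd.DataFrame({
    "Student ID": [0, 1, 2],
    "student_name": ["Pat Lee", "Pat Lee", "Sam Roy"],
})

count_by_id = students["Student ID"].nunique()      # one per student
count_by_name = students["student_name"].nunique()  # shared names collapse
print(count_by_id, count_by_name)
```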

In [31]:
# Calculates total budget for all schools
# Note: Sum over school_data_file, not all_data_complete. The merged table repeats each school's budget once per student, which inflates the total.
total_budget = (school_data_file["budget"].sum())
#print (total_budget)

In [32]:
# Calculates average math score for all students
avg_math = (student_data_file["math_score"].mean())
#print (avg_math)

In [33]:
# Calculates average reading score for all students
avg_reading = (student_data_file["reading_score"].mean())
#print (avg_reading)

In [34]:
# Math Percent: Calculate the percentage of students with a passing math score (70 or greater)
passing_math=all_data_complete[all_data_complete["math_score"] >= 70]["math_score"].count()
percent_pass_math = passing_math/total_students*100
#print (percent_pass_math)

# Reading Percent: Calculate the percentage of students with a passing reading score (70 or greater)
passing_reading=all_data_complete[all_data_complete["reading_score"] >= 70]["reading_score"].count()
percent_pass_reading = passing_reading/total_students *100
#print (percent_pass_reading)

# Overall Percent Passing: Calculate the percentage of students who passed both math and reading (% Overall Passing)
# Note: Key is math AND reading / total students
overall_pass = all_data_complete[(all_data_complete["math_score"] >= 70) & (all_data_complete["reading_score"] >= 70)]["Student ID"].count()/total_students*100
#print (overall_pass)
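
An equivalent way to get the same percentages, sketched on made-up scores, is to take the mean of a boolean mask. The data and variable names here are hypothetical; only the passing threshold of 70 comes from the notebook.

```python
import pandas as pd

# Hypothetical scores for four students.
scores = pd.DataFrame({
    "math_score": [65, 70, 90, 80],
    "reading_score": [75, 69, 95, 70],
})

passed_math = scores["math_score"] >= 70
passed_reading = scores["reading_score"] >= 70

# The mean of a boolean Series is the fraction of True values.
pct_math = passed_math.mean() * 100
pct_reading = passed_reading.mean() * 100
pct_overall = (passed_math & passed_reading).mean() * 100
print(pct_math, pct_reading, pct_overall)
```

Note that the overall rate counts students who passed both subjects; it is not the average of the two subject rates.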

In [35]:
# Create a dataframe to hold the results
# Note: I used this dataframe and table formatting structure throughout this notebook.
# Note: Map the variables above to table columns (in most cases, they share the same name) and build a pd.DataFrame called district_summary.
# Note: I originally had () around the variables but got a ValueError:
# Note: "If using all scalar values, you must pass an index". Wrapping each value in a list ([]) gives pandas a one-row column.

district_summary = pd.DataFrame({
    "Total Schools": [total_schools],
    "Total Students": [total_students],
    "Total Budget": [total_budget],
    "Average Math Score": [avg_math],
    "Average Reading Score": [avg_reading],
    "% Passing Math":[percent_pass_math],
    "% Passing Reading": [percent_pass_reading],
    "% Overall Passing": [overall_pass]
})
#print (district_summary)

# Table Formatting
# Note: Formatting the data under each column by referencing column name
# Note: I combined table creation and formatting into one cell. The formatting turns numbers into text, so if the cells are split,
# Note: the dataframe cell must be rerun before any rerun of the formatting code. Otherwise the formatting raises errors.
district_summary["Total Students"] = district_summary["Total Students"].map("{:,}".format)
district_summary["Total Budget"] = district_summary["Total Budget"].map("${:,.2f}".format)
district_summary["Average Math Score"] = district_summary["Average Math Score"].map("{:,.2f}".format)
district_summary["Average Reading Score"] = district_summary["Average Reading Score"].map("{:,.2f}".format)
district_summary["% Passing Math"] = district_summary["% Passing Math"].map("{:,.3f}%".format)
district_summary["% Passing Reading"] = district_summary["% Passing Reading"].map("{:,.3f}%".format)
district_summary["% Overall Passing"] = district_summary["% Overall Passing"].map("{:,.3f}%".format)
district_summary

,Total Schools,Total Students,Total Budget,Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing
0,15,39170,"$24,649,428.00",78.99,81.88,74.981%,85.805%,65.172%
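
The ValueError mentioned in the notes for this cell can be reproduced in isolation. This is a minimal sketch showing the error and two common ways around it; `from_lists` and `from_index` are illustrative names.

```python
import pandas as pd

total_schools = 15  # value taken from the District Summary output

# Passing bare scalars gives pandas no way to know how many rows to create.
try:
    pd.DataFrame({"Total Schools": total_schools})
    raised = False
except ValueError:
    raised = True

# Either wrap each value in a list, or pass an explicit index.
from_lists = pd.DataFrame({"Total Schools": [total_schools]})
from_index = pd.DataFrame({"Total Schools": total_schools}, index=[0])
print(raised, from_lists.shape, from_index.shape)
```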


## School Summary 

In [36]:
# Build the per-school metrics using .groupby
# Note: Merge again, this time with the school data as the left table.
# Note: .set_index makes a chosen column the index, so rows can be looked up by that value without
# Note: filtering with loc. Just provide the name of the column you want to use as the index.
all_schooldata_complete = pd.merge(school_data_file, student_data_file, how="left", on="school_name")

# School group types
# Note: Use type to pull district and charter
school_types = school_data_file.set_index("school_name")["type"]
#print (school_types)

#total students
students_per_school = all_schooldata_complete["school_name"].value_counts()
#print (students_per_school)

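# Total school budget
# Note: Thank you TA for helping to fix this code
# Note: Select "budget" before .mean(). Every row of a school carries the same budget, so the mean is that school's budget,
# Note: and averaging the whole grouped frame first fails on the text columns in recent pandas versions.
total_school_budget = all_schooldata_complete.groupby("school_name")["budget"].mean()
#print (total_school_budget)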

# Per student budget
budget_per_student = total_school_budget/students_per_school
#print (budget_per_student)

# Average math score per school
average_math_school = all_schooldata_complete.groupby(["school_name"])["math_score"].mean()
#print (average_math_school)

# Average reading score per school
average_reading_school = all_schooldata_complete.groupby(["school_name"])["reading_score"].mean()
#print (average_reading_school)

# Percent math pass per school
percent_math_school = all_schooldata_complete[all_schooldata_complete["math_score"] >= 70].groupby("school_name")["Student ID"].count()/students_per_school*100
#print (percent_math_school)

# Percent reading pass per school
percent_reading_school = all_schooldata_complete[all_schooldata_complete["reading_score"] >= 70].groupby("school_name")["Student ID"].count()/students_per_school*100
#print (percent_reading_school)

# Overall pass rate per school
# Note: Key is math AND reading / students per school | include .groupby | make sure formula provides the correct numeric output
overall_pass_rate = all_data_complete[(all_data_complete["reading_score"] >= 70) & (all_data_complete["math_score"] >= 70)].groupby("school_name")["Student ID"].count()/students_per_school*100
#print (overall_pass_rate)
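
The per-school metrics above can also be computed in a single grouped pass. The sketch below is one possible alternative, not the method used in this notebook: it runs on made-up rows and uses named aggregation with boolean helper columns, and names such as `flags` and `per_school` are hypothetical.

```python
import pandas as pd

# Hypothetical student-level rows standing in for the merged dataset.
df = pd.DataFrame({
    "school_name": ["A High", "A High", "B High", "B High"],
    "Student ID": [0, 1, 2, 3],
    "math_score": [60, 80, 90, 70],
    "reading_score": [75, 85, 65, 95],
})

# Boolean helper columns: their mean per group is the passing fraction.
flags = df.assign(
    pass_math=df["math_score"] >= 70,
    pass_reading=df["reading_score"] >= 70,
)
flags["pass_both"] = flags["pass_math"] & flags["pass_reading"]

per_school = flags.groupby("school_name").agg(
    total_students=("Student ID", "count"),
    avg_math=("math_score", "mean"),
    avg_reading=("reading_score", "mean"),
    pct_math=("pass_math", "mean"),
    pct_reading=("pass_reading", "mean"),
    pct_overall=("pass_both", "mean"),
)
# Convert the passing fractions to percentages.
pct_cols = ["pct_math", "pct_reading", "pct_overall"]
per_school[pct_cols] = per_school[pct_cols] * 100
print(per_school)
```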

In [37]:
total_school_summary = pd.DataFrame({
    "School Type": school_types,
    "Total Students": students_per_school,
    "Total School Budget": total_school_budget,
    "Per Student Budget": budget_per_student,
    "Average Math Score": average_math_school,
    "Average Reading Score": average_reading_school,
    "% Passing Math": percent_math_school,
    "% Passing Reading": percent_reading_school,
    "% Overall Passing Rate": overall_pass_rate
})
#print (total_school_summary)

# Formatting the data under each column by referencing column name
total_school_summary["Total Students"] = total_school_summary["Total Students"].map("{:,}".format)
total_school_summary["Total School Budget"] = total_school_summary["Total School Budget"].map("${:,.2f}".format)
total_school_summary["Per Student Budget"] = total_school_summary["Per Student Budget"].map("${:,.2f}".format)
total_school_summary["Average Math Score"] = total_school_summary["Average Math Score"].map("{:,.2f}".format)
total_school_summary["Average Reading Score"] = total_school_summary["Average Reading Score"].map("{:,.2f}".format)
total_school_summary["% Passing Math"] = total_school_summary["% Passing Math"].map("{:,.3f}%".format)
total_school_summary["% Passing Reading"] = total_school_summary["% Passing Reading"].map("{:,.3f}%".format)
total_school_summary["% Overall Passing Rate"] = total_school_summary["% Overall Passing Rate"].map("{:,.3f}%".format)

total_school_summary

,School Type,Total Students,Total School Budget,Per Student Budget,Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing Rate
Bailey High School,District,4976,"$3,124,928.00",$628.00,77.05,81.03,66.680%,81.933%,54.642%
Cabrera High School,Charter,1858,"$1,081,356.00",$582.00,83.06,83.98,94.133%,97.040%,91.335%
Figueroa High School,District,2949,"$1,884,411.00",$639.00,76.71,81.16,65.988%,80.739%,53.204%
Ford High School,District,2739,"$1,763,916.00",$644.00,77.1,80.75,68.310%,79.299%,54.290%
Griffin High School,Charter,1468,"$917,500.00",$625.00,83.35,83.82,93.392%,97.139%,90.599%
Hernandez High School,District,4635,"$3,022,020.00",$652.00,77.29,80.93,66.753%,80.863%,53.528%
Holden High School,Charter,427,"$248,087.00",$581.00,83.8,83.81,92.506%,96.253%,89.227%
Huang High School,District,2917,"$1,910,635.00",$655.00,76.63,81.18,65.684%,81.316%,53.514%
Johnson High School,District,4761,"$3,094,650.00",$650.00,77.07,80.97,66.058%,81.222%,53.539%
Pena High School,Charter,962,"$585,858.00",$609.00,83.84,84.04,94.595%,95.946%,90.541%


## Top Performing Schools (By % Overall Passing)

In [38]:
# Top performing schools by % Overall Passing
# Note: .head() defaults to 5 rows | Reference dataframe from overall school list (total_school_summary)
# Note: "% Overall Passing Rate" was formatted into text above, so this sort compares strings, not numbers.
# Note: If you shorten the number by too many decimals it will not sort correctly. Originally had .map("{:,.2f}".format)
# Note: in the all-schools formatting and it rounded too much, impacting the list of top 5 and bottom 5.
# Note: The current "{:,.3f}%" format keeps enough decimals to differentiate the schools and pull the top/bottom 5 correctly.
top_five_schools = total_school_summary.sort_values(["% Overall Passing Rate"],ascending=False)
top_five_schools.head()


,School Type,Total Students,Total School Budget,Per Student Budget,Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing Rate
Cabrera High School,Charter,1858,"$1,081,356.00",$582.00,83.06,83.98,94.133%,97.040%,91.335%
Thomas High School,Charter,1635,"$1,043,130.00",$638.00,83.42,83.85,93.272%,97.309%,90.948%
Griffin High School,Charter,1468,"$917,500.00",$625.00,83.35,83.82,93.392%,97.139%,90.599%
Wilson High School,Charter,2283,"$1,319,574.00",$578.00,83.27,83.99,93.868%,96.540%,90.583%
Pena High School,Charter,962,"$585,858.00",$609.00,83.84,84.04,94.595%,95.946%,90.541%
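
Because the summary columns are formatted as text before sorting, the ranking depends on string comparison. The sketch below uses made-up rates with different digit counts to show where text and numeric ordering diverge; ranking on the numeric values (for example with `nlargest`) before formatting avoids the issue. The school names and rates are hypothetical.

```python
import pandas as pd

# Hypothetical rates chosen so the values have different numbers of digits.
rates = pd.DataFrame(
    {"pct_overall": [9.5, 91.3, 100.0]},
    index=["School X", "School Y", "School Z"],
)

# Sorting the formatted text compares character by character.
as_text = rates["pct_overall"].map("{:,.3f}%".format)
text_order = as_text.sort_values(ascending=False).index.tolist()

# Ranking on the numeric column orders by value.
numeric_order = rates.nlargest(3, "pct_overall").index.tolist()
print(text_order)
print(numeric_order)
```

In this notebook every rate has two digits before the decimal point, so the text sort happens to agree with the numeric one.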


## Bottom Performing Schools (By % Overall Passing)

In [39]:
# Bottom performing schools by % Overall Passing
# Note: .head() defaults to 5 rows | Reference dataframe from overall school list (total_school_summary)
bottom_five_schools = total_school_summary.sort_values(["% Overall Passing Rate"],ascending=True)
bottom_five_schools.head()


,School Type,Total Students,Total School Budget,Per Student Budget,Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing Rate
Rodriguez High School,District,3999,"$2,547,363.00",$637.00,76.84,80.74,66.367%,80.220%,52.988%
Figueroa High School,District,2949,"$1,884,411.00",$639.00,76.71,81.16,65.988%,80.739%,53.204%
Huang High School,District,2917,"$1,910,635.00",$655.00,76.63,81.18,65.684%,81.316%,53.514%
Hernandez High School,District,4635,"$3,022,020.00",$652.00,77.29,80.93,66.753%,80.863%,53.528%
Johnson High School,District,4761,"$3,094,650.00",$650.00,77.07,80.97,66.058%,81.222%,53.539%


## Math Scores by Grade

In [40]:
# Math scores by grade, filtering rows with .loc
# Note: Filter the merged dataset (all_data_complete) by grade, then group by school_name and average math_score
# Note: == compares each value in the "grade" column with a scalar (one value) and returns a boolean mask
ninth_math = all_data_complete.loc[all_data_complete["grade"] == "9th"].groupby("school_name")["math_score"].mean()
tenth_math = all_data_complete.loc[all_data_complete["grade"] == "10th"].groupby("school_name")["math_score"].mean()
eleventh_math = all_data_complete.loc[all_data_complete["grade"] == "11th"].groupby("school_name")["math_score"].mean()
twelfth_math = all_data_complete.loc[all_data_complete["grade"] == "12th"].groupby("school_name")["math_score"].mean()


# Create dataframe | variables above are referenced below (ninth_math, tenth_math, etc.)
math_scores = pd.DataFrame({
        "9th": ninth_math,
        "10th": tenth_math,
        "11th": eleventh_math,
        "12th": twelfth_math
})
#print (math_scores)

# Use to hide unwanted header row text
# Note: Use "" as empty.
math_scores.index.name = ""

# Formatting the data under each column by referencing column name
math_scores["9th"] = math_scores["9th"].map("{:,.2f}".format)
math_scores["10th"] = math_scores["10th"].map("{:,.2f}".format)
math_scores["11th"] = math_scores["11th"].map("{:,.2f}".format)
math_scores["12th"] = math_scores["12th"].map("{:,.2f}".format)
math_scores


,9th,10th,11th,12th
Bailey High School,77.08,77.0,77.52,76.49
Cabrera High School,83.09,83.15,82.77,83.28
Figueroa High School,76.4,76.54,76.88,77.15
Ford High School,77.36,77.67,76.92,76.18
Griffin High School,82.04,84.23,83.84,83.36
Hernandez High School,77.44,77.34,77.14,77.19
Holden High School,83.79,83.43,85.0,82.86
Huang High School,77.03,75.91,76.45,77.23
Johnson High School,77.19,76.69,77.49,76.86
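
The four per-grade filters can be collapsed into one call. This is a sketch of an alternative using `pivot_table` on made-up rows, not the approach taken above; `grade_order` and `math_by_grade` are illustrative names.

```python
import pandas as pd

# Hypothetical student-level rows (not taken from the real CSV).
df = pd.DataFrame({
    "school_name": ["A High", "A High", "A High", "B High", "B High"],
    "grade": ["9th", "9th", "10th", "9th", "10th"],
    "math_score": [70, 80, 90, 60, 85],
})

grade_order = ["9th", "10th", "11th", "12th"]
math_by_grade = (
    df.pivot_table(index="school_name", columns="grade",
                   values="math_score", aggfunc="mean")
      # Columns come back sorted as text ("10th" before "9th"), so reorder them.
      # Grades missing from the data become all-NaN columns.
      .reindex(columns=grade_order)
)
print(math_by_grade)
```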


## Reading Scores by Grade 

In [41]:
# Reading scores by grade, filtering rows with .loc
# Note: Filter the merged dataset (all_data_complete) by grade, then group by school_name and average reading_score
# Note: == compares each value in the "grade" column with a scalar (one value) and returns a boolean mask
ninth_reading = all_data_complete.loc[all_data_complete["grade"] == "9th"].groupby("school_name")["reading_score"].mean()
tenth_reading = all_data_complete.loc[all_data_complete["grade"] == "10th"].groupby("school_name")["reading_score"].mean()
eleventh_reading = all_data_complete.loc[all_data_complete["grade"] == "11th"].groupby("school_name")["reading_score"].mean()
twelfth_reading = all_data_complete.loc[all_data_complete["grade"] == "12th"].groupby("school_name")["reading_score"].mean()

# Create dataframe | variables above are referenced below (ninth_reading, tenth_reading, etc.)
reading_scores = pd.DataFrame({
        "9th": ninth_reading,
        "10th": tenth_reading,
        "11th": eleventh_reading,
        "12th": twelfth_reading
})

#print (reading_scores)

# Use to hide unwanted header row text
# Note: Use "" as empty.
reading_scores.index.name = ""

# Formatting the data under each column by referencing column name
reading_scores["9th"] = reading_scores["9th"].map("{:,.2f}".format)
reading_scores["10th"] = reading_scores["10th"].map("{:,.2f}".format)
reading_scores["11th"] = reading_scores["11th"].map("{:,.2f}".format)
reading_scores["12th"] = reading_scores["12th"].map("{:,.2f}".format)
reading_scores


,9th,10th,11th,12th
Bailey High School,81.3,80.91,80.95,80.91
Cabrera High School,83.68,84.25,83.79,84.29
Figueroa High School,81.2,81.41,80.64,81.38
Ford High School,80.63,81.26,80.4,80.66
Griffin High School,83.37,83.71,84.29,84.01
Hernandez High School,80.87,80.66,81.4,80.86
Holden High School,83.68,83.32,83.82,84.7
Huang High School,81.29,81.51,81.42,80.31
Johnson High School,81.26,80.77,80.62,81.23


## Scores by School Spending

In [42]:
# Create four bins for per-student spending, computed as "budget" / "size"
# Note: There must be one more bin edge than labels, otherwise error ("Bin labels must be one fewer than the number of bin edges")
spending_bins = [0, 580, 620, 650, 680]
#print (spending_bins)

# Create labels for these bins
group_names = ["<$580", "$580-620", "$620-650", "$650-680"]
#print (group_names)

# pd.cut assigns each row to a spending bin
all_data_complete["spending_bins"] = pd.cut(all_data_complete["budget"]/all_data_complete["size"], spending_bins, labels = group_names)

#group by spending bins
by_spending = all_data_complete.groupby("spending_bins")

# Average math score
math_average = by_spending["math_score"].mean()
#print (math_average)

# Average reading score
reading_average = by_spending["reading_score"].mean()
#print (reading_average)

#% pass math
percent_pass_math = all_data_complete[all_data_complete["math_score"] >=70].groupby("spending_bins")["Student ID"].count()/by_spending["Student ID"].count()*100
#print (percent_pass_math)

# % pass reading
percent_pass_reading = all_data_complete[all_data_complete["reading_score"] >=70].groupby("spending_bins")["Student ID"].count()/by_spending["Student ID"].count()*100
#print (percent_pass_reading)

# % overall pass rate
# Note: Key is math AND reading / students per spending bin | include .groupby | make sure formula provides the correct numeric output
overall_pass_rate = all_data_complete[(all_data_complete["reading_score"] >= 70) & (all_data_complete["math_score"] >= 70)].groupby("spending_bins")["Student ID"].count()/by_spending["Student ID"].count()*100
#print(overall_pass_rate)

In [43]:
# Create Scores by School Spending table
score_school_spend = pd.DataFrame({
    "Average Math Score": math_average,
    "Average Reading Score": reading_average,
    "% Passing Math": percent_pass_math,
    "% Passing Reading": percent_pass_reading,
    "% Overall Passing Rate": overall_pass_rate
})

#print (score_school_spend)

# Label the index column in the table header
score_school_spend.index.name = "Spending Ranges (Per Student)"

# Formatting the data under each column by referencing column name
score_school_spend["Average Math Score"] = score_school_spend["Average Math Score"].map("{:,.2f}".format)
score_school_spend["Average Reading Score"] = score_school_spend["Average Reading Score"].map("{:,.2f}".format)
score_school_spend["% Passing Math"] = score_school_spend["% Passing Math"].map("{:,.3f}%".format)
score_school_spend["% Passing Reading"] = score_school_spend["% Passing Reading"].map("{:,.3f}%".format)
score_school_spend["% Overall Passing Rate"] = score_school_spend["% Overall Passing Rate"].map("{:,.3f}%".format)
score_school_spend


Spending Ranges (Per Student),Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing Rate
<$580,83.27,83.99,93.868%,96.540%,90.583%
$580-620,83.46,83.91,93.816%,96.416%,90.452%
$620-650,77.85,81.34,70.271%,83.109%,58.863%
$650-680,77.03,81.03,66.340%,81.038%,53.522%
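
One detail of `pd.cut` worth knowing when reading the bin labels: by default each bin includes its right edge. The sketch below reuses the bin edges and labels from this section on three made-up per-student budgets; `per_student` and the result names are hypothetical.

```python
import pandas as pd

spending_bins = [0, 580, 620, 650, 680]
group_names = ["<$580", "$580-620", "$620-650", "$650-680"]

# Hypothetical per-student budgets, including one exactly on a bin edge.
per_student = pd.Series([578, 580, 581])

# Default (right=True): intervals are (0, 580], (580, 620], ...
# so a value of exactly 580 lands in the first bin.
default_labels = pd.cut(per_student, spending_bins,
                        labels=group_names).astype(str).tolist()

# right=False: intervals are [0, 580), [580, 620), ...
# so a value of exactly 580 lands in the second bin.
left_closed_labels = pd.cut(per_student, spending_bins, labels=group_names,
                            right=False).astype(str).tolist()
print(default_labels)
print(left_closed_labels)
```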


## Scores by School Size

In [44]:
# Create three bins for scores by school size using "size"
# Note: There must be one more bin edge than labels, otherwise error ("Bin labels must be one fewer than the number of bin edges")
size_bins = [0, 1000, 2000, 5000]
#print (size_bins)

# Create labels for these bins
group_names = ["Small (<1000)", "Medium (1000-2000)", "Large (2000-5000)"]
#print (group_names)

# pd.cut assigns each row to a size bin
# Note: include ["size"] to reference the school size
all_data_complete["size_bins"] = pd.cut(all_data_complete["size"], size_bins, labels = group_names)

#group by school size bins
groupby_size = all_data_complete.groupby("size_bins")
#print(groupby_size)

# Average math score
math_average = groupby_size["math_score"].mean()
#print (math_average)

# Average reading score
reading_average = groupby_size["reading_score"].mean()
#print (reading_average)

#% pass math
percent_pass_math = all_data_complete[all_data_complete["math_score"] >=70].groupby("size_bins")["Student ID"].count()/groupby_size["Student ID"].count()*100
#print(percent_pass_math)

# % pass reading
percent_pass_reading = all_data_complete[all_data_complete["reading_score"] >=70].groupby("size_bins")["Student ID"].count()/groupby_size["Student ID"].count()*100
#print(percent_pass_reading)

# % overall pass rate
# Note: Key is math AND reading / students per size bin (not the average of % pass math and % pass reading)
overall_pass_rate_size = all_data_complete[(all_data_complete["reading_score"] >= 70) & (all_data_complete["math_score"] >= 70)].groupby("size_bins")["Student ID"].count()/groupby_size["Student ID"].count()*100
#print(overall_pass_rate_size)


In [45]:
# Create Scores by School Size table
score_school_size = pd.DataFrame({
    "Average Math Score": math_average,
    "Average Reading Score": reading_average,
    "% Passing Math": percent_pass_math,
    "% Passing Reading": percent_pass_reading,
    "% Overall Passing Rate": overall_pass_rate_size
})

#print (score_school_size)

# Label the index column in the table header
score_school_size.index.name = "School Size"

# Formatting the data under each column by referencing column name
score_school_size["Average Math Score"] = score_school_size["Average Math Score"].map("{:,.2f}".format)
score_school_size["Average Reading Score"] = score_school_size["Average Reading Score"].map("{:,.2f}".format)
score_school_size["% Passing Math"] = score_school_size["% Passing Math"].map("{:,.3f}%".format)
score_school_size["% Passing Reading"] = score_school_size["% Passing Reading"].map("{:,.3f}%".format)
score_school_size["% Overall Passing Rate"] = score_school_size["% Overall Passing Rate"].map("{:,.3f}%".format)
score_school_size


School Size,Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing Rate
Small (<1000),83.83,83.97,93.952%,96.040%,90.137%
Medium (1000-2000),83.37,83.87,93.617%,96.773%,90.624%
Large (2000-5000),77.48,81.2,68.652%,82.125%,56.574%
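
Because the bins are computed on the merged student-level data, each bin's rate pools all students in that bin, so larger schools carry more weight. The sketch below uses made-up data to contrast that with averaging per-school rates, which can give a different number. The school names and the `passed_both` column are hypothetical.

```python
import pandas as pd

# Hypothetical bin with two schools of very different size.
students = pd.DataFrame({
    "school_name": ["Small High"] * 2 + ["Big High"] * 6,
    "passed_both": [True, True] + [True, True, True, False, False, False],
})

# Student-weighted: pool every student in the bin (what grouping the merged data does).
student_weighted = students["passed_both"].mean() * 100

# School-averaged: compute each school's rate first, then average the rates.
school_averaged = students.groupby("school_name")["passed_both"].mean().mean() * 100
print(student_weighted, school_averaged)
```

Either definition can be reasonable; the point is to state which one a table reports.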


## Scores by School Type

In [46]:
# Group by school type using "type"
# Notes: Use "groupby" directly since "type" has only two values (charter and district), so no bins are needed
groupby_type = all_data_complete.groupby("type")
#print(groupby_type)

# Average math score
math_average = groupby_type["math_score"].mean()
#print (math_average)

# Average reading score
reading_average = groupby_type["reading_score"].mean()
#print (reading_average)

#% pass math
percent_pass_math = all_data_complete[all_data_complete["math_score"] >=70].groupby("type")["Student ID"].count()/groupby_type["Student ID"].count()*100
#print(percent_pass_math)

# % pass reading
percent_pass_reading = all_data_complete[all_data_complete["reading_score"] >=70].groupby("type")["Student ID"].count()/groupby_type["Student ID"].count()*100
#print(percent_pass_reading)

# % overall pass rate
# Note: Key is math AND reading / students per school type (not the average of % pass math and % pass reading)
overall_pass_rate_type = all_data_complete[(all_data_complete["reading_score"] >= 70) & (all_data_complete["math_score"] >= 70)].groupby("type")["Student ID"].count()/groupby_type["Student ID"].count()*100
#print(overall_pass_rate_type)


In [47]:
# Create Scores by School Type table

score_school_type = pd.DataFrame({
    "Average Math Score": math_average,
    "Average Reading Score": reading_average,
    "% Passing Math": percent_pass_math,
    "% Passing Reading": percent_pass_reading,
    "% Overall Passing Rate": overall_pass_rate_type
})

#print (score_school_type)

# Label the index column in the table header
score_school_type.index.name = "School Type"

# Formatting the data under each column by referencing column name
score_school_type["Average Math Score"] = score_school_type["Average Math Score"].map("{:,.2f}".format)
score_school_type["Average Reading Score"] = score_school_type["Average Reading Score"].map("{:,.2f}".format)
score_school_type["% Passing Math"] = score_school_type["% Passing Math"].map("{:,.3f}%".format)
score_school_type["% Passing Reading"] = score_school_type["% Passing Reading"].map("{:,.3f}%".format)
score_school_type["% Overall Passing Rate"] = score_school_type["% Overall Passing Rate"].map("{:,.3f}%".format)
score_school_type


School Type,Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing Rate
Charter,83.41,83.9,93.702%,96.646%,90.561%
District,76.99,80.96,66.518%,80.905%,53.696%


In [48]:
# End