Module 4 Challenge: According to the school board, the reading and math grades of Thomas High School ninth graders in the students_complete.csv file appear to have been altered. Although the school board does not know the full scope of the cheating, it wants help upholding state testing requirements. The math and reading scores of Thomas High School ninth graders should be replaced with NaN (Not a Number) values, and all other data should be left untouched. Once the scores have been replaced, the school board wants the earlier school district analysis repeated, along with a report on how the changes affected the overall results.
The analysis uses the Python Pandas library to review the data contained in the CSV file.
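The replacement step described above can be sketched roughly as follows. This is a minimal illustration on made-up rows, not the notebook's actual code; the column names (school_name, grade, reading_score, math_score) are assumptions about the layout of students_complete.csv.

```python
import numpy as np
import pandas as pd

# Made-up sample rows; the column names are assumed, not taken from the report.
students = pd.DataFrame({
    "student_name": ["A", "B", "C", "D"],
    "school_name": ["Thomas High School", "Thomas High School",
                    "Wilson High School", "Thomas High School"],
    "grade": ["9th", "10th", "9th", "9th"],
    "reading_score": [90, 85, 70, 60],
    "math_score": [88, 92, 65, 75],
})

score_cols = ["reading_score", "math_score"]

# NaN is a float value, so cast the score columns to float before assigning it.
students[score_cols] = students[score_cols].astype(float)

# Select only the 9th graders at Thomas High School.
mask = (students["school_name"] == "Thomas High School") & (students["grade"] == "9th")

# Blank out both scores for the selected rows; every other row is left untouched.
students.loc[mask, score_cols] = np.nan

print(students)
```

Using a boolean mask with .loc keeps the change limited to the targeted rows and columns, which matches the requirement that all other data stay as it was.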
The Jupyter notebook containing all the figures and datasets used for this report can be found at PyCitySchools_Challenge_Final.ipynb
As the images above show, after changing the math and reading scores of all the 9th grade students at Thomas High School to NaN, there is almost no difference between the two summaries at the district level. The Average Math Score changed by 0.1 points, while the Average Reading Score is the same in both cases. It is worth noting that, based on the data extracted from the CSV, the number of 9th grade students at Thomas High School is 461, which represents 461/39170 = 1.18% of the total. This percentage is too low to make a noticeable difference in the results for the school district as a whole.
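As a quick check of the share quoted above, the arithmetic can be reproduced in a couple of lines. The variable names are illustrative only; the two counts are the ones stated in the text.

```python
# Counts quoted in the report.
ths_ninth_graders = 461
district_students = 39170

# Share of the district's students whose scores were replaced with NaN.
share_pct = ths_ninth_graders / district_students * 100

print(f"{share_pct:.2f}%")
```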
Before and after changing the grades of 9th graders at Thomas High School (THS) to NaN, the results stayed the same for all schools except THS itself. For THS, the values for Average Reading Score and Average Math Score remained nearly unchanged, but the values for % Passing Reading, % Passing Math and % Overall Passing dropped significantly. The reason is that the .mean() function in Pandas ignores NaN values, whereas % Passing is a formula that uses the count() function and whose result depends on the number of students. The total number of students at THS is not affected, because it is based on the number of unique() names, but the number of students passing math or reading is reduced because the NaN values are ignored. As a result, % Passing is affected.
The formula for calculating % Passing is:
% Passing = Total Students Passing / Total Students in School * 100
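The effect described above can be reproduced on a toy example. This is a sketch under stated assumptions rather than the notebook's code: the four students are made up, the column names are hypothetical, and the passing mark of 70 is an assumption, since the report does not state the threshold.

```python
import numpy as np
import pandas as pd

# Hypothetical school with four students; two of them have had scores blanked.
scores = pd.DataFrame({
    "student_name": ["A", "B", "C", "D"],
    "math_score": [80.0, 90.0, np.nan, np.nan],
})

passing_mark = 70  # assumed threshold, not stated in the report

# .mean() skips NaN by default, so the average uses only the two real scores.
avg_math = scores["math_score"].mean()

# The denominator counts every student by name, including those with NaN scores.
total_students = scores["student_name"].count()

# A comparison against NaN evaluates to False, so blanked students never count as passing.
students_passing = (scores["math_score"] >= passing_mark).sum()

pct_passing = students_passing / total_students * 100

print(avg_math, total_students, students_passing, pct_passing)
```

Here the average stays at 85.0 because the missing scores are skipped, while % Passing falls to 50.0 because the two blanked students remain in the denominator but cannot appear in the numerator.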
How does replacing the ninth graders’ math and reading scores affect Thomas High School’s performance relative to the other schools?
Replacing the math and reading scores of the Thomas High School ninth graders dropped THS from 2nd place in the district to 13th.
Fig. 7: Reading and Math Scores per School and Grade Summary - Before the change to 9th Grade at THS
In the math and reading scores summary per school and grade, the change only affects the figures for Thomas High School; the numbers for the other schools remain unchanged.
Fig. 9: Reading and Math Scores per School Size and Spending Summary - Before the change to 9th Grade at THS
Fig. 10: Reading and Math Scores per School Size and Spending Summary - After the change to 9th Grade at THS
The position of Thomas High School relative to the other schools in the district was the same before and after the change to the 9th graders' scores; the school remained in 6th place in both cases. This result was to be expected, as the spending in each school is related to the number of students and not to the grades obtained. The reading and math average scores changed at THS, but that did not affect the total spending of the school.
The Average Math Score, Average Reading Score and % Overall Passing values are the same before and after the change to the grades of the 9th graders at Thomas High School.
There are no significant differences in math or reading scores between the Small (< 1000) and Medium (1000-2000) groups. However, the scores drop considerably for the Large (2000-5000) group. The % Overall Passing score stays between 90 and 91 for the Small and Medium groups, but drops to 58 for the Large group.
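One way to build size groups like the ones above is to bin the per-school student counts with pd.cut and then average a metric per bin. The sketch below uses made-up schools and values; the column names and the bin edges (0, 1000, 2000, 5000) are assumptions derived from the group labels, not taken from the notebook.

```python
import pandas as pd

# Made-up per-school summary; values are illustrative only.
per_school = pd.DataFrame({
    "school_name": ["School A", "School B", "School C", "School D"],
    "size": [800, 950, 1500, 3000],
    "pct_overall_passing": [90.0, 92.0, 91.0, 60.0],
})

# Assumed bin edges matching the group labels used in the report.
size_bins = [0, 1000, 2000, 5000]
size_labels = ["Small (< 1000)", "Medium (1000-2000)", "Large (2000-5000)"]

# Assign each school to a size group. With the default right=True, each bin
# includes its upper edge, so a school of exactly 1000 would fall in Small.
per_school["school_size"] = pd.cut(per_school["size"], bins=size_bins, labels=size_labels)

# Average the passing percentage within each size group.
by_size = per_school.groupby("school_size", observed=True)["pct_overall_passing"].mean()

print(by_size)
```

Note that this averages the per-school percentages, which weights every school equally regardless of its enrollment.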
When looking at the % Overall Passing score, it is evident that the Charter schools perform far better than the District schools. The first group scored 90%, while the second scored only 54%. The ratio is 1.67:1, which is quite substantial.
This large disparity appears to be driven primarily by math: Charter schools had an average score of 83.5 percent compared to 77 percent for District schools, a difference of 6.5 points. For reading scores, the gap between the two groups is less pronounced, at just 2.9 points.
Based on the data examined from the CSV file, we can arrive at the following conclusions:
- The inclusion or exclusion of math and reading scores for 9th graders at Thomas High School has no meaningful impact on the School District's performance.
- Based on the data studied, it is not possible to draw any conclusions about the possibility of dishonesty in the reading and math scores for ninth grade at Thomas High School. Perhaps if the analysis had focused on comparing THS's math and reading data with other schools using statistical analysis, it would have been possible to shed some light on this matter and confirm or deny the hypothesis, but the analysis was focused on reporting administrative metrics rather than on grading scores.
- Small and medium-sized schools appear to perform better in terms of Overall Passing Scores. Large schools with over 2000 students underperform when compared to the medium and small groups. There are exceptions to the norm, and the data demonstrates that a large institution, such as Wilson High School, can achieve high Overall Grades despite its size.
- From Figure 10 we can infer that the amount of money spent per student appears to have an inverse relationship with the grades earned. Schools in the highest budget category ($628 - $655) have very low Overall Scores of about 53%. The sole exception is Thomas High School, which spends $638 per pupil and has a 90.6 percent overall grade. This fact may support the notion of dishonesty, but it is not conclusive. The highest Overall Scores are seen near the bottom of the spending table, where the majority of schools have an Overall Score of 89 percent or above. Wilson High School, once again, is the school in the category that spends less per student while achieving an enviable Overall Score of 90.6 percent.
- In relation to the type of school, Figure 14 clearly shows that Charter schools fare far better than District schools.
From the data analyzed, it can be concluded that the schools that performed the best are Charter schools of Medium or Small size that spend less than $625 per student.