
Use Python to analyze data on student funding and students' standardized test scores, with the aim of aggregating the data and showcasing trends in performance


Geneille/School_District_Analysis


School_District_Analysis

Overview and Objectives

The school district board has presented both school and student data for analysis. Because the information came from a variety of sources and in a variety of formats, the task was to prepare all standardized test scores for analysis and reporting. The data were then analyzed and the results presented to provide insights into trends and patterns that support effective budget-allocation decisions.

The project challenge involved analyzing school and student statistics, with the main output being the percentage of students passing mathematics, reading, and both subjects. In addition, the analysis included summarized results for specific categories, such as top- and bottom-performing schools and outcomes by school budget and size.

However, there is evidence that reading and math grades were tampered with for one of the schools at a specific grade level. This dishonesty challenges the integrity of the state-testing standards and may have significant implications for the overall outcome of the analysis. Because the analysis had already been completed and summarized, the objectives of this project were to:

  • replace the grades for the school that had evidence of tampering
  • repeat the analysis to produce the following summarized outcomes: district summary, school summary, math and reading scores by grade, high and low performing schools, and scores by school budget, school size and school type.
  • compare the outcome of the analysis before and after the grades were replaced
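The first objective can be sketched as follows. This is a minimal illustration, not the notebook's exact code: the column names (school_name, grade, reading_score, math_score) and the toy rows are assumptions standing in for students_complete.csv, and NaN is used as the replacement value because the Results section reports NaN for the replaced scores.

```python
import numpy as np
import pandas as pd

# Toy stand-in for students_complete.csv (column names are assumed).
# Scores are floats so that NaN can be stored without changing the dtype.
student_data_df = pd.DataFrame({
    "school_name": ["Thomas High School", "Thomas High School",
                    "Thomas High School", "Bailey High School"],
    "grade": ["9th", "10th", "9th", "9th"],
    "reading_score": [90.0, 85.0, 78.0, 66.0],
    "math_score": [88.0, 79.0, 92.0, 70.0],
})

# Boolean mask selecting only ninth graders at Thomas High School.
ths_ninth = (
    (student_data_df["school_name"] == "Thomas High School")
    & (student_data_df["grade"] == "9th")
)

# Overwrite both score columns with NaN for the selected rows only.
student_data_df.loc[ths_ninth, ["reading_score", "math_score"]] = np.nan

print(student_data_df)
```

Using a single .loc assignment with a boolean mask keeps every other row, including the other grades at the same school, untouched.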

Resources

  • The data files used in the analysis are students_complete.csv and school_complete.csv
  • Languages & Tools: Python/Pandas, Jupyter notebook

Results

The initial results, from before the data were reanalyzed because of the suspected tampering, can be found in the PyCitySchools.ipynb file. The results of the project challenge, which reevaluates the data because of the tampering, can be found in PyCitySchools_Challenge.ipynb. Answers to specific questions are listed below.

  • How is the district summary affected?

    The results for the district summary before and after the data cleanup are presented in the figures below. As can be observed, replacing the 9th grade data for Thomas High School (THS) does not have a significant impact on the school district data.

    Figure showing District Summary before data clean

    DS before

    Figure showing District Summary after data clean

    DS after
  • How is the school summary affected?

    The table below summarizes the THS output before and after the math and reading scores were replaced. As can be observed, the school summary was significantly affected only in % passing math, % passing reading, and the overall percentage passing both subjects. The THS average scores for the two subjects changed little, if at all.

                      Average math score   Average reading score   % passing math   % passing reading   % overall passing
      Before Clean    83.4                 83.5                    93               97                  91
      After Clean     83.4                 83.9                    67               70                  65
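The README does not state how the percentages were computed after the replacement. One mechanism consistent with the pattern in the table (pass rates drop sharply while averages barely move) is that NaN scores are skipped when averaging but the students are still counted in the denominator of the pass rate. The sketch below illustrates this with made-up scores; the column names and values are assumptions, not data from the notebooks.

```python
import numpy as np
import pandas as pd

# Made-up scores for one school: three ninth graders whose scores were
# replaced with NaN, plus seven students with valid scores.
scores = pd.DataFrame({
    "grade": ["9th"] * 3 + ["10th"] * 7,
    "math_score": [np.nan] * 3 + [95.0, 88.0, 72.0, 70.0, 81.0, 77.0, 64.0],
})

# NaN >= 70 evaluates to False, so replaced scores never count as passing.
passing = (scores["math_score"] >= 70).sum()

# Denominator 1: every enrolled student, including those with NaN scores.
pct_of_all_students = passing / len(scores) * 100

# Denominator 2: only students who still have a score (count() skips NaN).
pct_of_scored_students = passing / scores["math_score"].count() * 100

# mean() skips NaN by default, so the average ignores the replaced rows.
average_math = scores["math_score"].mean()

print(f"passing students:          {passing}")
print(f"% passing (all students):  {pct_of_all_students:.1f}")
print(f"% passing (scored only):   {pct_of_scored_students:.1f}")
print(f"average math score:        {average_math:.1f}")
```

Which denominator is appropriate depends on the question being asked, so it is worth stating the choice explicitly alongside the reported percentages.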
    
  • How does replacing the ninth graders’ math and reading scores affect Thomas High School’s performance relative to the other schools?

    Using the ranking function in pandas [code: per_school_summary_df_rank=per_school_summary_df['% Overall Passing'].rank(ascending=False)], it was observed that the tampered data had a significant impact on THS's overall performance relative to the other schools. The overall passing percentage counts students whose math and reading scores are both greater than or equal to 70%.

    Before the data were cleaned, THS had an overall passing percentage of 91% and ranked 2nd among all the schools analyzed. After the 9th graders' scores were replaced, THS had an overall passing percentage of 65% and ranked 8th.
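The ranking step can be illustrated with a small, self-contained example. The school names and percentages below are hypothetical placeholders; only the '% Overall Passing' column name and the rank(ascending=False) call come from the text above. The method="min" argument and the integer cast are additions for readability.

```python
import pandas as pd

# Hypothetical summary table; the real one is built in the notebooks.
per_school_summary_df = pd.DataFrame(
    {"% Overall Passing": [91.0, 65.0, 54.0, 90.0]},
    index=["School A", "School B", "School C", "School D"],
)

# ascending=False gives rank 1 to the highest percentage.
# method="min" assigns tied schools the same (lowest) rank.
per_school_summary_df_rank = (
    per_school_summary_df["% Overall Passing"]
    .rank(ascending=False, method="min")
    .astype(int)
)

print(per_school_summary_df_rank.sort_values())
```

Running the same call on the summary tables from before and after the replacement gives the two ranks that can then be compared.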

  • How does replacing the ninth-grade scores affect the following: Math and reading scores by grade

    The following table compares the average math scores by grade for THS.

                      9th    10th   11th   12th
      Before Clean    83.6   83.1   83.5   83.5
      After Clean     NaN    83.1   83.5   83.5
    

    The following table compares the average reading scores by grade for THS.

                      9th    10th   11th   12th
      Before Clean    83.7   84.3   83.6   83.8
      After Clean     NaN    84.3   83.6   83.8
    

    The results are within expectations. The only difference is in the 9th grade column, which now shows "NaN" because those scores were replaced in the analysis.
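The NaN in the 9th grade column follows from how pandas averages a group in which every value is missing. A minimal sketch, assuming column names school_name, grade and math_score and using invented scores for two grades only:

```python
import numpy as np
import pandas as pd

# Invented scores; THS ninth graders already carry NaN.
student_data_df = pd.DataFrame({
    "school_name": ["Thomas High School"] * 4 + ["Other High School"] * 3,
    "grade": ["9th", "9th", "10th", "10th", "9th", "9th", "10th"],
    "math_score": [np.nan, np.nan, 80.0, 86.0, 70.0, 74.0, 90.0],
})

# Average per school and grade, then pivot grades into columns.
# A group whose scores are all NaN has a mean of NaN.
math_by_grade = (
    student_data_df.groupby(["school_name", "grade"])["math_score"]
    .mean()
    .unstack("grade")[["9th", "10th"]]  # restore the natural grade order
)

print(math_by_grade)
```

Only the cell for the school and grade whose scores were replaced becomes NaN; all other cells are computed from unchanged data.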

  • Scores by school spending

    Replacing the reading and math scores for THS with NaN did not affect the school spending per student (budget per student). This is within expectations, as neither the number of students nor the budget changed. Only the math and reading scores from the original data were altered, and therefore only analyses directly linked to those scores would change. The THS budget per student is approximately $638.00.

  • Scores by school size

    THS is in the "Medium (1000-2000)" size bucket. From the data, replacing the 9th grade math and reading scores did not affect the mean scores for this size bucket.

  • Scores by school type

    Scores by school type remain unaffected by the data change.
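The spending and size groupings above depend only on enrollment and budget, which is why they do not move when scores are replaced. The sketch below shows one way to derive them with pandas. The school names, enrollments and budgets are hypothetical, and the "Small" and "Large" labels and all bin edges are assumptions; only the "Medium (1000-2000)" label appears in this README.

```python
import pandas as pd

# Hypothetical schools; enrollment and budget values are invented.
school_df = pd.DataFrame(
    {
        "total_students": [500, 1500, 3000],
        "total_budget": [300_000, 900_000, 1_950_000],
    },
    index=["School A", "School B", "School C"],
)

# Spending per student uses only budget and enrollment, never scores.
school_df["Per Student Budget"] = (
    school_df["total_budget"] / school_df["total_students"]
)

# Assumed bin edges and labels; pd.cut includes the right edge by default.
size_bins = [0, 1000, 2000, 5000]
size_labels = ["Small (<1000)", "Medium (1000-2000)", "Large (2000-5000)"]
school_df["School Size"] = pd.cut(
    school_df["total_students"], bins=size_bins, labels=size_labels
)

print(school_df[["Per Student Budget", "School Size"]])
```

Mean scores per bucket could then be obtained by grouping the per-school summary on the "School Size" column.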

Summary

After the 9th grade scores for both math and reading were replaced, the outputs most significantly affected were % passing math, % passing reading, the overall percentage passing both subjects, and THS's overall ranking relative to the other schools.
