Skip to content

Pandas is a powerful data analysis Python library. I have used Pandas Framework to analyse a dataset on schools and students.

Notifications You must be signed in to change notification settings

subhasushi/Pandas-PySchoolAnalysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 

Repository files navigation

Python Pandas

Pandas is a powerful data analysis Python library. I have used Pandas Framework to analyse datasets on schools and students. In this assignment, I have used the Pandas library to make strategic decisions regarding future school budgets and priorities. I have analysed the district-wide standardized test results mainly the reading and math scores of students to uncover onbious trends in school performance.

  • Pandas and Jupyter notebook was used to write the code and create dataframes.
  • Single script which runs on both the data sets.

The final report includes the following:

District Summary A high level snapshot (in table form) of the district's key metrics, including: Total Schools Total Students Total Budget Average Math Score Average Reading Score % Passing Math % Passing Reading Overall Passing Rate (Average of the above two)

School Summary An overview table that summarizes key metrics about each school, including: School Name School Type Total Students Total School Budget Per School Budget Average Math Score Average Reading Score % Passing Math % Passing Reading Overall Passing Rate (Average of the above two)

Top Performing Schools (By Passing Rate) A table that highlights the top 5 performing schools based on Overall Passing Rate. Include: School Name School Type Total Students Total School Budget Per School Budget Average Math Score Average Reading Score % Passing Math % Passing Reading Overall Passing Rate (Average of the above two)

Low Performing Schools (By Passing Rate) A table that highlights the bottom 5 performing schools based on Overall Passing Rate. Include all of the same metrics as above.

Math Scores by Grade A table that lists the average Math Score for students of each grade level (9th, 10th, 11th, 12th) at each school.

Reading Scores by Grade A table that lists the average Reading Score for students of each grade level (9th, 10th, 11th, 12th) at each school.

Scores by School Spending A table that breaks down school performances based on average Spending Ranges (Per Student). 4 reasonable bins to group school spending. The table includes each of the following: Average Math Score Average Reading Score % Passing Math % Passing Reading Overall Passing Rate (Average of the above two)

Scores by School Size Breakdown by grouping schools based on a reasonable approximation of school size (Small, Medium, Large).

Scores by School Type Breakdown by grouping schools based on school type (Charter vs. District).

Analysis Observations From the above analysis, the following conclusions could be drawn

  • It is clear that the top 5 performing schools are:

    1. Wilson High School
    2. Pena High School
    3. Wright High School
    4. Cabrera High School
    5. Holden High School
  • Schools with higher spending averages, has lower overall passing rate.

  • Medium and small schools have higher overall passing rate compared to the larger schools.

  • Finally Charter schools have higher overall passing rate than district schools.

About

Pandas is a powerful data analysis Python library. I have used Pandas Framework to analyse a dataset on schools and students.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published