Made use of Python's Pandas library to get insights from raw data.

brizvi4/School_District_Analysis

School District Analysis

Overview of the school district analysis

For this analysis, Maria asked me to help her analyse school district data. My task was to extract information that would help the school board make informed decisions about allotting funds. The data covered 39,170 students and 15 schools. I used the Pandas library in Python for the analysis and calculations. I generated a school district summary, the high- and low-performing schools, average math and reading scores by grade, and scores grouped by school spending per student and by school size. Later, the school board informed Maria that there was evidence of academic dishonesty among the 9th grade students of THS (Thomas High School). I was therefore asked to replace all math and reading scores for those students with NaNs. After doing that, I repeated all of the calculations listed above. Finally, I was asked to write a report on how these changes affected the overall analysis.
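The replacement step can be sketched roughly as follows. This is a minimal illustration rather than the notebook's actual code: the column names (`school_name`, `grade`, `math_score`, `reading_score`), the grade label `"9th"`, and the second school are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the student data; column names and values are assumptions.
students = pd.DataFrame({
    "school_name": ["Thomas High School", "Thomas High School", "Other High School"],
    "grade": ["9th", "10th", "9th"],
    "math_score": [90.0, 80.0, 70.0],
    "reading_score": [85.0, 75.0, 65.0],
})

# Boolean mask that selects only the Thomas High School ninth graders.
ths_ninth = (students["school_name"] == "Thomas High School") & (students["grade"] == "9th")

# Overwrite both score columns with NaN for those rows; all other rows are untouched.
students.loc[ths_ninth, ["math_score", "reading_score"]] = np.nan
```

Using `.loc` with a boolean mask and a list of columns changes the data in place, so the same summary code can be rerun on the modified table.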

School District Analysis Results

  • The district summary tables below show that Average Math Score and the passing percentages decreased after removing the Thomas High School ninth graders. The decrease was very small, but it was there.

image

image
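A district summary that tolerates missing scores might look like the sketch below. It assumes a passing score of 70 and uses the number of students who still have a math score as the denominator; both are assumptions for illustration, as are the column names, and the original notebook may compute these values differently.

```python
import numpy as np
import pandas as pd

def district_summary(df, passing_score=70):
    """Return average scores and passing percentages for a student table."""
    # count() ignores NaN, so students whose scores were removed
    # drop out of the denominator (an assumption of this sketch).
    scored = df["math_score"].count()

    # Comparisons against NaN evaluate to False, so removed scores never count as passing.
    math_ok = df["math_score"] >= passing_score
    reading_ok = df["reading_score"] >= passing_score

    return pd.Series({
        "Average Math Score": df["math_score"].mean(),        # mean() skips NaN by default
        "Average Reading Score": df["reading_score"].mean(),
        "% Passing Math": math_ok.sum() * 100 / scored,
        "% Passing Reading": reading_ok.sum() * 100 / scored,
        "% Overall Passing": (math_ok & reading_ok).sum() * 100 / scored,
    })

# Made-up scores; the third student stands in for one whose scores were replaced with NaN.
toy = pd.DataFrame({
    "math_score": [80.0, 60.0, np.nan, 90.0],
    "reading_score": [75.0, 72.0, np.nan, 65.0],
})
summary = district_summary(toy)
```

Because `mean()` and `count()` both skip NaN, removing a small group of students shifts the district figures only slightly when the district is large.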

  • In the school summary tables, every value except Average Reading Score decreased. See the images below.

image

image

  • The images below show that THS is in position 2 in both cases, so its performance relative to other schools did not change after removing the ninth graders.

image

image
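Checking a school's position before and after the change can be sketched as below. The percentages are made up, the other school names are placeholders, and ranking by overall passing percentage is an assumption of this example.

```python
import pandas as pd

# Made-up overall passing percentages; only the THS value differs between the two series.
before = pd.Series({"School A": 91.3, "Thomas High School": 90.9, "School B": 90.5, "School C": 53.5})
after = pd.Series({"School A": 91.3, "Thomas High School": 90.6, "School B": 90.5, "School C": 53.5})

def position(scores, school):
    """Return the 1-based position of a school when sorted from highest to lowest."""
    ranked = scores.sort_values(ascending=False)
    return ranked.index.get_loc(school) + 1

pos_before = position(before, "Thomas High School")
pos_after = position(after, "Thomas High School")
```

A small drop in a school's percentage leaves its position unchanged as long as the value stays between its neighbours in the sorted list.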

  • The only change in the math and reading scores by grade is that the values for THS ninth graders are now NaN.

image

image

image

image
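The NaN entries in the by-grade tables follow from how pandas averages a group whose values are all missing. A minimal sketch, with assumed column names and made-up scores:

```python
import numpy as np
import pandas as pd

# Toy data; the THS ninth grade scores have already been replaced with NaN.
students = pd.DataFrame({
    "school_name": ["Thomas High School"] * 4 + ["Other High School"] * 3,
    "grade": ["9th", "9th", "10th", "10th", "9th", "9th", "10th"],
    "math_score": [np.nan, np.nan, 78.0, 82.0, 70.0, 74.0, 88.0],
})

# Average math score per school and grade, with one column per grade.
math_by_grade = (
    students.groupby(["school_name", "grade"])["math_score"]
    .mean()              # a group containing only NaN averages to NaN
    .unstack("grade")
)
```

The school and grade still appear in the result because the grouping keys are present; only the averaged value is missing.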

  • For the spending ranges, there is only a minute difference, and it is visible only when the columns are unformatted. After formatting the columns, both tables show the same values, as the two images below show:

image

image
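The effect of formatting on small differences can be illustrated with `pd.cut` and a string format. The bin edges, labels, column names and values here are all invented for the example and are not taken from the analysis.

```python
import pandas as pd

# Made-up per-school figures; only the third school changes between "before" and "after".
per_school = pd.DataFrame({
    "spending_per_student": [580.0, 610.0, 638.0, 655.0],
    "avg_math_before": [83.5, 83.4, 83.42, 77.0],
    "avg_math_after": [83.5, 83.4, 83.38, 77.0],
})

# Hypothetical spending bins; pd.cut assigns each school to one range.
bins = [0, 600, 650, 700]
labels = ["<$600", "$600-650", "$650-700"]
per_school["spending_range"] = pd.cut(per_school["spending_per_student"], bins=bins, labels=labels)

# Average the per-school scores within each spending range.
by_spending = per_school.groupby("spending_range", observed=True)[
    ["avg_math_before", "avg_math_after"]
].mean()

# Formatting to one decimal place turns the values into strings and rounds away the difference.
formatted = by_spending.apply(lambda col: col.map("{:.1f}".format))
```

In this toy data the unformatted averages for the middle range differ (about 83.41 versus 83.39), while both format to `83.4`.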

  • Scores by school size and by school type show a similar result, as seen below:

image

image

image

image

To differentiate between the two sets of results, I added 'new' to the names of the tables from which the THS ninth grade students had been removed.

Summary

The following four changes can be clearly seen from the analysis:

  • For district_summary_df, Average Math Score, % Passing Math, % Passing Reading and % Overall Passing decreased slightly, while Average Reading Score remained the same.
  • For the school summary tables, Average Math Score, % Passing Math, % Passing Reading and % Overall Passing decreased, while Average Reading Score increased slightly.
  • In the math and reading scores by grade, the ninth grade values for THS have been replaced by NaNs.
  • For the spending ranges, there is only a minute difference, and it is visible only when the columns are unformatted. After formatting the columns, both tables show the same values.
