This dataset can be downloaded here.
This project analyzes a dataset on students' graduation rates and their influencing factors such as ACT scores, SAT scores, parental education, parental income, high school GPA, and college GPA. The objective of this analysis is to identify patterns and trends in the data to understand the factors affecting graduation rates.
- Overview
- Data Description
- Tools Used
- Analysis and Insights
- How to Use This Repository
- Key Questions Answered
- Conclusion
- Future Improvements
The dataset used for analysis consists of various attributes related to students' academic performance and parental background. It comprises records across multiple columns, including ACT composite score, SAT total score, parental level of education, parental income, high school GPA, college GPA, and years to graduate. The dataset provides comprehensive insights into the factors influencing students' graduation rates.
The tools used for this analysis are:
- Python (including pandas, matplotlib, seaborn)
From the imported dataset, the dataframe consists of the following columns:
- ACT composite score
- SAT total score
- parental level of education
- parental income
- high school gpa
- college gpa
- years to graduate
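A minimal sketch of checking that a loaded file contains these columns. The CSV text below is made up for illustration, the helper `missing_columns` is hypothetical, and the exact header spellings are assumed to match the list above. In practice `pd.read_csv` would be pointed at the downloaded file instead of the in-memory sample.

```python
import io

import pandas as pd

# Column names as listed above; the exact spelling in the real file is an assumption.
EXPECTED_COLUMNS = [
    "ACT composite score",
    "SAT total score",
    "parental level of education",
    "parental income",
    "high school gpa",
    "college gpa",
    "years to graduate",
]


def missing_columns(frame):
    """Return the expected columns that are absent from the dataframe."""
    return [col for col in EXPECTED_COLUMNS if col not in frame.columns]


# Made-up rows standing in for the downloaded dataset.
sample_csv = io.StringIO(
    "ACT composite score,SAT total score,parental level of education,"
    "parental income,high school gpa,college gpa,years to graduate\n"
    "30,2206,bachelor's degree,94873,4.0,3.8,3\n"
    "26,1953,some college,42767,3.6,2.7,9\n"
)
df = pd.read_csv(sample_csv)

# An empty list means every expected column was found.
print(missing_columns(df))
```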
- Correlation Analysis: The correlation matrix helps identify the relationships between different numeric variables.
- Distribution Analysis: Histograms and box plots visualize the distribution of various numeric attributes.
- High School GPA vs. College GPA: A scatter plot highlights the relationship between high school GPA and college GPA.
- Impact of Parental Education: Box plots show how different parental education levels affect students' academic performance and graduation time.
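The numbers behind these analyses can be sketched with pandas alone. The example below computes a correlation matrix over the numeric columns and the per-group quartiles that the box of a box plot marks. The sample rows are made up, the function names are hypothetical, and the column names are assumed to match the list of dataframe columns; this is an illustration, not the repository's actual notebook code.

```python
import pandas as pd

# Made-up sample rows; values are illustrative, not taken from the dataset.
df = pd.DataFrame({
    "parental level of education": [
        "high school", "high school", "master's degree", "master's degree",
    ],
    "parental income": [40000, 50000, 80000, 90000],
    "high school gpa": [3.2, 3.4, 3.8, 4.0],
    "college gpa": [2.9, 3.1, 3.5, 3.7],
    "years to graduate": [6, 5, 4, 4],
})


def numeric_correlations(frame):
    """Pearson correlation matrix over the numeric columns only."""
    return frame.select_dtypes("number").corr()


def gpa_quartiles_by_education(frame, column="college gpa"):
    """Quartiles of `column` per parental education level (one row per level)."""
    return (
        frame.groupby("parental level of education")[column]
        .quantile([0.25, 0.5, 0.75])
        .unstack()
    )


corr = numeric_correlations(df)
quartiles = gpa_quartiles_by_education(df)
print(corr.round(2))
print(quartiles)
```

The resulting `corr` dataframe can be passed to `seaborn.heatmap` for the visual version of the correlation matrix.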
- Clone the repository:
git clone https://github.com/deAlgorithm/graduation_rate.git
- Install dependencies:
pip install -r requirements.txt
- What is the average ACT composite score and SAT total score of students?
Average ACT Composite Score: 28.607
Average SAT Total Score: 1999.906
- How does parental level of education impact students' high school and college GPA?
The analysis shows that parental level of education has a noticeable impact on students' GPAs. Specifically, the box plots illustrate that students whose parents have higher levels of education, such as associate's degrees or master's degrees, tend to have higher high school and college GPAs on average. This trend suggests that parental education level could significantly influence academic performance.
- Is there a correlation between parental income and students' academic performance?
The correlation between parental income and high school GPA is 0.27.
The correlation between parental income and college GPA is 0.46.
Both correlations are positive, but the one with college GPA (0.46) is stronger than the one with high school GPA (0.27). This implies that parental income might better predict students' academic performance in college than in high school.
- What is the average number of years taken to graduate?
Average Number of Years Taken to Graduate: 4.982
- How do high school GPA and college GPA correlate?
The correlation between high school GPA and college GPA is 0.52. This suggests a moderate positive relationship between the two: students who perform well in high school are likely to also perform well in college, though high school GPA is not a perfect predictor.
- How does the GPA ratio (college GPA / high school GPA) vary across different parental education levels?
A higher GPA ratio indicates that students performed relatively better in college compared to high school, while a lower ratio indicates the opposite.
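The GPA ratio comparison can be sketched as follows. The sample rows are made up, the function name is hypothetical, and averaging the ratio within each group is an assumption, since the text above does not state how the ratio was aggregated.

```python
import pandas as pd

# Made-up sample rows; values are illustrative, not taken from the dataset.
df = pd.DataFrame({
    "parental level of education": [
        "high school", "high school", "master's degree", "master's degree",
    ],
    "high school gpa": [4.0, 3.0, 4.0, 3.2],
    "college gpa": [3.0, 3.0, 4.0, 3.6],
})


def mean_gpa_ratio_by_education(frame):
    """Mean of (college GPA / high school GPA) per parental education level."""
    ratio = frame["college gpa"] / frame["high school gpa"]
    return ratio.groupby(frame["parental level of education"]).mean()


ratios = mean_gpa_ratio_by_education(df)
print(ratios)
```

A group mean above 1 means that, on average, students in that group earned a higher GPA in college than in high school; a mean below 1 means the opposite.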
- Correlation Analysis: Parental income and high school GPA show moderate positive correlations with college GPA (0.46 and 0.52, respectively) and are also related to years to graduate.
- Distribution Trends: The distributions of ACT scores, SAT scores, and GPAs provide insights into the academic performance of the student population.
- Parental Influence: Parental education and income are associated with students' academic success and graduation timelines.
- Educational Support: Insights can help educational institutions identify students who might need additional support based on their backgrounds.
- Policy Making: Policymakers can use this data to create programs aimed at improving graduation rates by addressing key influencing factors.
- Predictive Modeling: Develop predictive models to forecast students' graduation rates based on their academic and parental backgrounds.
- Detailed Segmentation: Perform deeper segmentation to identify specific student personas based on their academic performance and parental background.
- Longitudinal Study: Extend the analysis to include longitudinal data to understand how students' academic performance evolves over time.
- Intervention Strategies: Propose targeted intervention strategies to help students who are at risk of delayed graduation or poor academic performance.