This project analyzes the Student Study Performance Dataset to explore factors affecting student performance, focusing on gender, ethnicity, and lunch type. Using Python and statistical tests (Z-test, Q-test), it identifies patterns and insights that can inform educational strategies and interventions.

doodlemon/StudentStudyPerformanceDataset_StatisticsProject

Student Study Performance Analysis

Dataset: https://www.kaggle.com/datasets/impapan/student-performance-data-set

This project explores the factors affecting student performance based on the Student Study Performance Dataset. The analysis aims to investigate the following questions:

Who performs better: Males or Females?

What other factors (e.g., ethnicity, lunch type) influence student marks?

The project involves data cleaning, data visualization, and statistical hypothesis testing (Z-test and Q-test) to draw insights from the dataset.

Project Overview

The dataset contains information about students' demographic attributes, study time, and their performance on various assessments. By analyzing this data, we attempt to uncover patterns and answer questions about how various factors like gender, ethnicity, and lunch type impact students' academic performance.

Key Steps in the Analysis

Data Cleaning

Handling missing values

Correcting any inconsistencies in the dataset

Formatting columns for analysis
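The cleaning steps above could look roughly like the following pandas sketch. The column names (`gender`, `ethnicity`, `lunch`, `marks`), the sample rows, and the helper `clean_scores` are assumptions made for illustration. They are not taken from the project's notebook, and the real dataset may need different handling.

```python
import pandas as pd


def clean_scores(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of a student-scores table (hypothetical schema)."""
    out = df.copy()
    # Formatting columns: lowercase names, underscores instead of spaces.
    out.columns = [c.strip().lower().replace(" ", "_") for c in out.columns]
    # Correcting inconsistencies: trim whitespace and unify case in text columns.
    for col in ("gender", "ethnicity", "lunch"):
        out[col] = out[col].str.strip().str.lower()
    # Formatting: coerce marks to numbers; unparseable entries become NaN.
    out["marks"] = pd.to_numeric(out["marks"], errors="coerce")
    # Handling missing values: drop rows without a gender label or a mark.
    return out.dropna(subset=["gender", "marks"]).reset_index(drop=True)


# Made-up rows that exercise each cleaning step.
raw = pd.DataFrame({
    "Gender": [" Male", "female", "FEMALE", None],
    "Ethnicity": ["group A", "group B ", "group A", "group C"],
    "Lunch": ["standard", "free/reduced", "Standard", "standard"],
    "Marks": ["72", "88", "n/a", "65"],
})
clean = clean_scores(raw)
print(clean)
```

Dropping incomplete rows is only one option. Imputing missing marks may be preferable when many rows are affected.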

Data Exploration

Descriptive statistics (mean, median, standard deviation)

Visualizations (histograms, bar charts, scatter plots)
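As a minimal sketch of the exploration step, the descriptive statistics can be computed per group with a pandas `groupby`. The `gender` and `marks` column names, the sample values, and the helper `summarize_by` are illustrative assumptions, not the project's actual code.

```python
import pandas as pd


def summarize_by(df: pd.DataFrame, group_col: str, value_col: str = "marks") -> pd.DataFrame:
    """Count, mean, median, and sample standard deviation of value_col per group."""
    return df.groupby(group_col)[value_col].agg(["count", "mean", "median", "std"])


# Made-up scores, used only to show the shape of the output.
scores = pd.DataFrame({
    "gender": ["male", "male", "male", "female", "female", "female"],
    "marks": [60, 70, 80, 65, 75, 94],
})
summary = summarize_by(scores, "gender")
print(summary)
```

The same table can feed a chart. For example, `summary["mean"].plot(kind="bar")` draws a bar chart of group means when Matplotlib is installed.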

Statistical Testing

Z-Test: Comparing the means of two groups (e.g., male vs. female students) to see if there's a significant difference in performance.

Q-Test: Identifying any outliers in the dataset that could skew results.
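The two tests above can be sketched with the Python standard library alone. This is an illustration, not the project's SciPy-based code. It assumes the Z-test is a two-sided, two-sample test on group means and that "Q-test" refers to Dixon's Q test. The function names and sample marks are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def two_sample_z_test(a, b):
    """Two-sided z-test for a difference in means; returns (z, p_value).

    Sample standard deviations stand in for the population values, which is
    only reasonable for large samples.
    """
    se = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    z = (mean(a) - mean(b)) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def dixon_q(values):
    """Return (suspect, q): the extreme value farthest from its neighbor and its Q statistic."""
    xs = sorted(values)
    data_range = xs[-1] - xs[0]
    if len(xs) < 3 or data_range == 0:
        raise ValueError("need at least 3 values that are not all equal")
    low_gap = xs[1] - xs[0]
    high_gap = xs[-1] - xs[-2]
    # Q = gap between the suspect value and its nearest neighbor, divided by the range.
    if high_gap >= low_gap:
        return xs[-1], high_gap / data_range
    return xs[0], low_gap / data_range


# Tiny made-up samples; real use would pass each group's full column of marks.
z, p = two_sample_z_test([60, 70, 80], [65, 75, 94])
suspect, q = dixon_q([70, 72, 73, 75, 98])
print(f"z = {z:.3f}, p = {p:.3f}")
print(f"suspect = {suspect}, Q = {q:.3f}")
```

A value is flagged as an outlier when Q exceeds the tabulated critical value for the sample size and confidence level. Dixon's Q is intended for small samples, so it may suit subsets of the data better than the full dataset.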

Hypothesis Testing

Testing the impact of factors like ethnicity and lunch type on student performance.

Results and Insights

Conclusions on gender differences in performance

Insights into how other factors like ethnicity and lunch type influence student marks

Technologies Used

Python

Pandas for data manipulation

Matplotlib and Seaborn for data visualization

SciPy for statistical testing

Jupyter Notebook for documentation and analysis

Dataset

The dataset used in this analysis is the Student Study Performance Dataset. It contains the following features:

Gender: Male or Female

Ethnicity: Ethnic background of the student

Lunch: Type of lunch (e.g., standard or free/reduced)

Study time: Amount of study time per week

Marks: Final grades of the student

This project was created by me and @jouditafran.
