dazcona/edm-dcu

Analysis on Student Data

Leveraging data for 16K students, what can we say about the student experience at this university?

Can this data help us with student success?

Data

Number of students: 16,786
Number of fields: 137

List of fields

Notebooks

  1. EDA: exploratory data analysis of every field and its values
  2. Data Summarization
  3. Features: extract features from the fields
  4. Dimensionality Reduction: PCA, t-SNE, UMAP
  5. Split the data: split the dataset into train and test sets
  6. Decision Tree: fit a Decision Tree classifier
  7. Random Forest: fit a Random Forest classifier and analyse the importance of its features
  8. Linear Models & Ablation studies: fit a Linear Regression model and run ablation studies to measure the variance explained by the features
  9. Mutual Information Exploration: measure the mutual information (the mutual dependence between two variables) between the precision mark and each feature that the ablation study identified as important
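
The mutual information step in the last notebook can be illustrated with a small standard-library sketch. It is not the notebook's code: the function names, the equal-width binning, and the toy values are assumptions made for illustration only. It estimates mutual information (in bits) between two discretised variables from their joint and marginal counts.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n == 0:
        raise ValueError("sequences must not be empty")
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi


def equal_width_bins(values, n_bins):
    """Discretise a continuous variable into n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for constant input
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical toy values, not taken from the dataset.
entry_points = [300, 320, 350, 400, 420, 450, 500, 520, 550, 600]
final_mark = [40, 42, 45, 55, 58, 60, 70, 72, 75, 85]
mi_bits = mutual_information(
    equal_width_bins(entry_points, 3),
    equal_width_bins(final_mark, 3),
)
print(f"Estimated mutual information: {mi_bits:.3f} bits")
```

A value of 0 means the two binned variables look independent; larger values mean that knowing one reduces uncertainty about the other. The estimate depends on the number of bins, so it is best compared across features binned the same way.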

Tutorials

You can always view a notebook using https://nbviewer.jupyter.org/
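
As a small sketch of how an nbviewer link for a GitHub-hosted notebook can be built: the notebook path and the `master` branch below are hypothetical placeholders, not actual file names from this repository.

```python
from urllib.parse import quote

NBVIEWER_BASE = "https://nbviewer.jupyter.org"


def nbviewer_url(owner, repo, path, branch="master"):
    """Build an nbviewer link for a notebook hosted in a GitHub repository."""
    # quote() escapes spaces and similar characters but keeps the slashes.
    return f"{NBVIEWER_BASE}/github/{owner}/{repo}/blob/{branch}/{quote(path)}"


# "notebooks/example.ipynb" is a hypothetical path used only for illustration.
print(nbviewer_url("dazcona", "edm-dcu", "notebooks/example.ipynb"))
```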

Figures

EDA: Exploring CAO Points

Correlations:

Scatter plot: CAO Points vs Leaving Cert Math Points
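
The strength of the linear relationship shown in a scatter plot like this one can be summarised with the Pearson correlation coefficient. The sketch below uses only the standard library; the function name and the toy values are assumptions for illustration, not figures from the dataset.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centred products and squares (the 1/n factors cancel out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values standing in for the two plotted columns.
cao_points = [310, 355, 400, 420, 465, 500, 545]
math_points = [40, 45, 60, 55, 70, 85, 90]
print(f"Pearson r = {pearson_r(cao_points, math_points):.3f}")
```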

Decision Tree to predict the final RESULT:
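
To give a sense of what a decision tree does at each node, the sketch below searches for the threshold on one numeric feature that minimises the weighted Gini impurity of the two resulting groups. It illustrates a single split only, not the notebook's classifier; the function names, the feature values, and the "PASS"/"FAIL" labels are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best 'value <= threshold' split.

    The threshold is None when no split improves on the unsplit impurity.
    """
    n = len(values)
    best = (None, gini(labels))
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical toy data: an entry score and a pass/fail outcome.
points = [300, 350, 400, 450, 500, 550]
result = ["FAIL", "FAIL", "FAIL", "PASS", "PASS", "PASS"]
print(best_threshold(points, result))
```

A full tree repeats this search over every feature and then recurses into each of the two groups until a stopping rule is met.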

Random Forest: Top 10 Most Important Features

Linear Model: Ablation Study by Excluding Features
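
An ablation study of this kind can be sketched as follows: fit a linear regression on all features, refit with one feature left out, and record how much the explained variance (R²) drops. This is a minimal standard-library sketch on synthetic data, not the notebook's code; the feature names `strong`, `weak` and `noise`, the coefficients, and the helper functions are all assumptions.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r_squared(rows, y):
    """Fit ordinary least squares with an intercept and return R^2."""
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(design, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)  # normal equations
    preds = [sum(b * v for b, v in zip(beta, r)) for r in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, preds))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot


def ablation(features, y):
    """Return (full_r2, {name: drop in R^2 when that feature is excluded})."""
    names = list(features)

    def rows(use):
        return [tuple(features[name][i] for name in use) for i in range(len(y))]

    full = r_squared(rows(names), y)
    drops = {}
    for name in names:
        kept = [other for other in names if other != name]
        drops[name] = full - r_squared(rows(kept), y)
    return full, drops


# Synthetic data: the target depends strongly on one feature, weakly on
# another, and not at all on the third.
rng = random.Random(0)
n = 200
data = {name: [rng.gauss(0, 1) for _ in range(n)] for name in ("strong", "weak", "noise")}
target = [3.0 * s + 0.5 * w + rng.gauss(0, 0.1)
          for s, w in zip(data["strong"], data["weak"])]

full_r2, drops = ablation(data, target)
for name, drop in sorted(drops.items(), key=lambda item: -item[1]):
    print(f"excluding {name}: R^2 drops by {drop:.4f}")
```

Features whose exclusion causes the largest drop carry the most variance that the remaining features cannot explain. Correlated features can mask each other, so a small drop does not by itself mean that a feature is uninformative.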
