rcharan/mooc-data

How to Teach and Learn on the Internet

We examine a dataset from the Open University in the UK. The dataset has granular activity data for students in 6 classes over a period of 2 years (2013-2014).

  • We find statistically significant correlations between learning outcomes (class grade) and both specific types of activities and activity patterns.

Contributors

Background

This is our third Flatiron School project (NYC Data Science), for module 4.

See the presentation and conclusions on Google Slides or view the PDF in our repository.

How to use this Repo

There are two Jupyter notebooks, feature-engineering.ipynb and analysis.ipynb.

  • feature-engineering assembles the dataset and engineers a number of features related to activity and attention metrics. To use it, you will have to download the original dataset from the Open University (link above), which is about 500MB. Put the directory of CSVs at the top level of the directory structure as "anonymisedData".
  • analysis performs the analysis for the presentation. It works with the 5MB dataset in the repository.
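
Before running feature-engineering, it may help to confirm that the download landed in the expected place. The following is a minimal sketch, not part of the repository: the helper name list_csvs is hypothetical, and it assumes only that the CSVs sit directly inside "anonymisedData" at the top level.

```python
from pathlib import Path


def list_csvs(data_dir="anonymisedData"):
    """Return the sorted names of the CSV files found directly in data_dir.

    Hypothetical helper (not in utilities.py). Raises FileNotFoundError
    if the directory does not exist, which usually means the Open
    University download has not been unpacked at the top level yet.
    """
    root = Path(data_dir)
    if not root.is_dir():
        raise FileNotFoundError(
            f"Expected the Open University CSVs in {root.resolve()}"
        )
    # Only look at the top level of the directory, not subdirectories.
    return sorted(p.name for p in root.glob("*.csv"))
```

Calling list_csvs() from the top level of the repository should list the downloaded files. An empty list suggests the CSVs were unpacked into a nested folder.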

There is also a utilities.py file that provides a number of conveniences for use in the notebooks.

Software required: statsmodels, pandas, scipy, and seaborn, as well as their dependencies.

We ran this code with:

  • Python 3.6
  • statsmodels 0.10.1
  • pandas 0.25.1
  • scipy 1.3.1
  • seaborn 0.9.0
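
To compare a local environment against the versions listed above, something like the following sketch could be used. The function names installed_versions and version_mismatches are hypothetical and do not come from utilities.py. The sketch assumes each package exposes a `__version__` attribute, which is the case for the four libraries listed.

```python
import importlib

# Versions listed in this README; other versions are untested, not
# necessarily incompatible.
TESTED_VERSIONS = {
    "statsmodels": "0.10.1",
    "pandas": "0.25.1",
    "scipy": "1.3.1",
    "seaborn": "0.9.0",
}


def installed_versions(packages):
    """Map each package name to its installed version string.

    The value is None if the package cannot be imported and "unknown"
    if it imports but has no __version__ attribute.
    """
    found = {}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            found[name] = None
            continue
        found[name] = getattr(module, "__version__", "unknown")
    return found


def version_mismatches(found, expected=TESTED_VERSIONS):
    """Return {name: (installed, tested)} for packages that differ."""
    return {
        name: (found.get(name), want)
        for name, want in expected.items()
        if found.get(name) != want
    }
```

For example, `version_mismatches(installed_versions(TESTED_VERSIONS))` returns an empty dict when the environment matches the versions above exactly.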
