A description of what this GitHub organization is.



This is the GitHub organization page for Amherst College STAT/MATH 495 Advanced Data Analysis (Fall 2017). Whereas the course content and syllabus are on the course webpage, this organization centers around the problem sets.

Executive summary tl;dr

Students retrieve and submit the problem sets from this organization by using GitHub forks and pull requests.

Problem set submission process

The typical workflow for a problem set is below. All italics indicate GitHub terminology/lingo; these terms and more are explained in Prof. Jenny Bryan's Happy Git and GitHub for the useR.

  1. On GitHub: The instructor will post a skeleton outline of each problem set in its own repository AKA repo, for example PS01. This repository will contain the necessary data files and a template R Markdown file. Let's call this the master copy of the repo.
  2. On GitHub: Students will fork (i.e. make a copy of) the repo to their own GitHub account.
  3. GitHub -> locally: Students will clone (i.e. download) this forked repo locally as an RStudio project on their own machine.
  4. Locally: Students will complete the problem set on their own machine.
  5. Locally -> GitHub: Students will commit and push (i.e. upload) their work to the forked copy of the repo in their own GitHub account.
  6. On GitHub: Students will submit their work via pull request. This is a request to the owner of the master copy of the repo (in this case the instructor) to inspect and merge the proposed changes. When prompted to "Open a pull request", please use your name as its title!
  7. Feedback will be delivered.
  8. However, the instructor will not complete the final step of a typical pull request: they will not merge the proposed changes.
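
For students who prefer the command line to RStudio, steps 3 to 5 can be sketched roughly as follows. This is only an illustration under stated assumptions: a throwaway local bare repository stands in for your fork on GitHub so that the commands run offline, and the file name `PS01.Rmd`, the commit message, and the name/email values are hypothetical. Forking (step 2) and opening the pull request (step 6) happen on the GitHub website and are not shown.

```shell
#!/bin/sh
set -eu

# Assumption: a local bare repo stands in for your fork on GitHub.
# In practice FORK_URL would be the URL of the fork in your own account.
WORKDIR=$(mktemp -d)
FORK_URL="$WORKDIR/fork.git"
git init --quiet --bare "$FORK_URL"

# Step 3 (GitHub -> locally): clone the forked repo to your machine.
git clone --quiet "$FORK_URL" "$WORKDIR/PS01"
cd "$WORKDIR/PS01"

# Step 4 (locally): complete the problem set.
# PS01.Rmd is a hypothetical file name used for illustration.
echo "My answers" > PS01.Rmd

# Step 5 (locally -> GitHub): commit the work and push it to the fork.
git add PS01.Rmd
git -c user.name="Student Name" -c user.email="student@example.com" \
    commit --quiet -m "Complete PS01"
git push --quiet origin HEAD

# Check that the commit reached the fork.
git --git-dir="$FORK_URL" log --oneline --all
```

With a real fork, only the `git clone`, `git add`, `git commit`, and `git push` lines are needed, and your name and email would normally come from your Git configuration rather than from `-c` options.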

Why are we doing this?

Question: Why did you set up this complicated scheme? Why not just give all students write access to the repositories (by making them collaborators) and allow them to submit individual files?

Answer: Because much of the collaboration that occurs in the open-source world centers around pull requests to propose changes/improvements. For example, many of the crowd-sourced changes/improvements to the ggplot2 R package for data visualization were made via pull requests, 610 of them as of 2017-09-06. I would like to empower students to start taking their first steps of participation in this ecosystem.


Start small! Among my earliest pull requests was a very minor one. Open this link and click on the "Files Changed" tab.