
DSaPP Research Assistant and Volunteer Projects

This repository includes an exercise for aspiring DSaPP volunteers and research assistants to complete. It will help us understand what skills you have so we can find the best fit for you at DSaPP.

Please do not spend more than 2-4 hours on it. Once you're done, please fill out the application form and include a link to your repository there.

For Software Engineering (Back-End or Front-End) Roles:

Please send us a link to a GitHub repo and a working web app that you've built. The best examples are projects that either require user input (like an application system) or involve data visualization. In your email, please include a brief explanation of who the client/partner was, what their requirements were for the project, and how you interacted with them.

For Data Roles (data modelers, database people, data analysts, machine learning, etc):

Please follow these steps:

  1. Fork this repository. You can learn how to do that here
  2. Download the data for Kaggle's "Predicting Excitement at" (here).
  3. Create a Jupyter notebook (using a Python kernel) in your fork of this repository that includes the following three (or four) sections:
    • Exploratory analysis. Describe the dataset using counts, averages, frequency tables, and plots. Do you see anything interesting or potentially problematic?
    • A Data Story. Find a specific interesting case in the data and tell us what happens with it. Provide evidence from the data that supports your narrative.
    • Questions for the Project Partner. What questions would you ask the partner now that you have seen the data? What is missing that you might need to get from them or from other data sources to do something useful with the data?
    • Modeling (for those applying for machine learning positions). Build (and validate) a model with the data that predicts a quantity of interest (whether a project is fully funded, for example), identifies underlying structure in the data, or explores a potentially important relationship among fields in the dataset. What did you learn from it? Why is it potentially useful or relevant to someone's decision making?
  4. Save your notebook and commit it with an informative message, then push the commit to your repository on GitHub.
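As a rough illustration of what "counts, averages, and frequency tables" can look like in the exploratory analysis section, here is a minimal sketch. It uses only the Python standard library so it runs anywhere; in a notebook you would more likely load the real files with a data-frame library. The records and field names (`state`, `requested`, `fully_funded`) are hypothetical and are not the actual Kaggle schema.

```python
from collections import Counter
from statistics import mean

# Hypothetical toy records; the field names are illustrative only
# and do not reflect the real dataset's columns.
projects = [
    {"state": "IL", "requested": 450.0, "fully_funded": True},
    {"state": "IL", "requested": 1200.0, "fully_funded": False},
    {"state": "NY", "requested": 300.0, "fully_funded": True},
    {"state": "CA", "requested": None, "fully_funded": True},
]


def summarize(rows):
    """Return basic descriptive statistics for a list of project records."""
    # Count missing values per field (a common "potentially problematic" finding).
    missing = Counter(k for r in rows for k, v in r.items() if v is None)
    # Average only over the rows where the amount is present.
    amounts = [r["requested"] for r in rows if r["requested"] is not None]
    return {
        "n_rows": len(rows),
        "missing": dict(missing),
        "mean_requested": mean(amounts),
        # Frequency table of a categorical field.
        "state_counts": Counter(r["state"] for r in rows),
        # Share of projects that reached full funding.
        "funded_rate": mean(1 if r["fully_funded"] else 0 for r in rows),
    }


summary = summarize(projects)
print(summary)
```

Reporting missing values alongside the averages makes it clear how many rows each statistic is based on, which is often a useful question to bring to the project partner.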

For the Project Management Role: Please send us the following:

  • For any project you’ve done recently, send a deliverable that you produced (a report, memo, or something similar). For that deliverable, please explain what your role and contribution were versus what the rest of the team did.
  • Exercise:
    • Given the following scenario (a real-life project we're working on right now), please provide some thoughts on how you would scope the project:
      • What data would be needed to do this project?
      • What kind of analysis would need to be done?
      • Who are the critical stakeholders?
      • What questions would you ask in a scoping discussion with the partner?
      • What would an example scope look like?
      • What would the deliverable be for a project like this?
      • What do you think the high-level steps are to get from this discussion to the deliverable you specify?
    • Scenario: City governments want us to help them understand what policies they should design around augmenting public transit infrastructure with rideshare services (Uber, Lyft, etc.), specifically to increase job opportunities for low-income citizens. We can get data from cities and from rideshare services such as Uber. The goal is to understand whether and how governments can subsidize ridesharing to help people get access to better jobs.