This repository has been archived by the owner on Jan 7, 2022. It is now read-only.
Aaron M edited this page Sep 1, 2015 · 10 revisions

This project currently has two main sections, one functional and one planned:

  • Pair Data Science (functional)
  • Data Scientist Skills Analyzer (planned)

#Pair Data Science

This software helps a Meetup Organizer arrange an event called "Pair Data Science." Pair Data Science works as follows:

  • Data scientists (or aspiring data scientists) opt in on a weekly basis
  • Each opted-in user fills out a survey about their skills and preferences
  • Each opted-in user gets assigned a partner ("pair")
  • Pairs are told of their assignments by Meetup message
  • Pairs have 10 days to meet in person and work on a project

The software works by interacting with the Meetup API and the SurveyMonkey API. It pulls the list of opted-in users from Meetup and matches it against user data pulled from SurveyMonkey. Users who haven't taken the survey can be sent a reminder by Meetup message. The software then runs a matching algorithm to assign pairs as well as it can, and delivers the assignments by Meetup message.
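The step that matches opted-in users against survey data can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Member` class, the `split_by_survey` function, and the dictionary keyed by member ID are all assumptions made for the example, and no real Meetup or SurveyMonkey API calls are shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    """Hypothetical stand-in for an opted-in Meetup member."""
    member_id: int
    name: str


def split_by_survey(opted_in, survey_by_member_id):
    """Split opted-in members by whether a survey response exists.

    opted_in: iterable of Member objects (assumed to come from Meetup).
    survey_by_member_id: dict mapping member_id -> survey response
        (assumed to come from SurveyMonkey).
    Returns (ready, needs_reminder), where ready is a list of
    (member, response) tuples and needs_reminder is a list of members.
    """
    ready, needs_reminder = [], []
    for member in opted_in:
        response = survey_by_member_id.get(member.member_id)
        if response is None:
            # No survey on file: this member would get a reminder message.
            needs_reminder.append(member)
        else:
            ready.append((member, response))
    return ready, needs_reminder
```

Members in `needs_reminder` would be the ones to message about the survey; members in `ready` would go on to the matching step.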

There is still a lot of room for improvement:

  • Survey data isn't persisted anywhere outside of SurveyMonkey. Persisting it would make it possible to:
    • Keep a record of previous matches, promoting unique matches week over week
    • Reduce the number and size of API calls
  • The current matching algorithm (GeogMatcher) only considers geography.
  • Unmatched users should be assigned to a group message with all other unmatched users
  • If there is an odd number of users, the odd user out should be assigned to a group of three
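Several of the improvements above (geography-based matching, avoiding repeat matches, and placing the odd user out in a group of three) can be illustrated with a small greedy sketch. This is not how GeogMatcher is implemented. The function name `assign_groups`, the coordinate format, and the `previous_pairs` argument are assumptions for illustration, and straight-line distance on latitude and longitude is only a rough proxy for real travel distance.

```python
import math


def assign_groups(users, previous_pairs=frozenset()):
    """Greedily pair users by proximity (illustrative sketch only).

    users: dict mapping a user id -> (latitude, longitude).
    previous_pairs: set of frozenset({a, b}) pairs from earlier weeks;
        these are avoided whenever another partner is available.
    Returns a list of tuples. With an odd number of users, the last
    user joins the nearest existing pair, forming a group of three.
    A single user on their own stays unmatched and is not returned.
    """
    def distance(a, b):
        (lat1, lon1), (lat2, lon2) = users[a], users[b]
        return math.hypot(lat1 - lat2, lon1 - lon2)

    remaining = sorted(users)
    groups = []
    while len(remaining) >= 2:
        first = remaining.pop(0)
        # Sort key: new partners (False) come before repeats (True),
        # then nearer partners come before farther ones.
        partner = min(
            remaining,
            key=lambda other: (
                frozenset((first, other)) in previous_pairs,
                distance(first, other),
            ),
        )
        remaining.remove(partner)
        groups.append((first, partner))

    if remaining and groups:
        odd = remaining.pop()
        # Attach the odd user to the pair with the closest member.
        nearest = min(
            range(len(groups)),
            key=lambda i: min(distance(odd, m) for m in groups[i]),
        )
        groups[nearest] = groups[nearest] + (odd,)
    return groups
```

A greedy pass like this is easy to follow but does not guarantee the best overall assignment; a more careful matcher could weigh skills and preferences from the survey as well.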

#Data Scientist Skills Analyzer

This software helps a data scientist assess their skills relative to other data scientists. This information will be useful in making professional development decisions. The roadmap is as follows:

##Roadmap

###Collect:

  • Pull data from existing SurveyMonkey survey on skills
  • Future: Update existing SurveyMonkey survey on skills
  • Future: Add a new SurveyMonkey survey on feedback from pairings

###Store:

  • Data held in memory in User class for analysis
  • Future: Persist data in database
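As one possible shape for the planned persistence step, the sketch below stores user records in SQLite using Python's standard library. Everything here is hypothetical: the fields on `User`, the `users` table, and the `save_users` and `load_users` helpers are assumptions for illustration, and the project's real `User` class may look quite different.

```python
import json
import sqlite3
from dataclasses import dataclass, field


@dataclass
class User:
    """Hypothetical shape; the project's real User class may differ."""
    member_id: int
    name: str
    skills: dict = field(default_factory=dict)  # e.g. {"python": 4}


def save_users(conn, users):
    """Insert or update users, storing the skills dict as JSON text."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users ("
        "member_id INTEGER PRIMARY KEY, "
        "name TEXT NOT NULL, "
        "skills TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO users (member_id, name, skills) "
        "VALUES (?, ?, ?)",
        [(u.member_id, u.name, json.dumps(u.skills)) for u in users],
    )
    conn.commit()


def load_users(conn):
    """Read all stored users back into User objects, ordered by id."""
    rows = conn.execute(
        "SELECT member_id, name, skills FROM users ORDER BY member_id"
    )
    return [
        User(member_id, name, json.loads(skills))
        for member_id, name, skills in rows
    ]
```

For example, `conn = sqlite3.connect("pairs.db")` followed by `save_users(conn, users)` would keep survey data between weekly runs, so it would not have to be pulled from SurveyMonkey again each time. The file name here is only an example.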

###Extract-Transform-Load:

  • Future: Build pipeline

###Analyze:

  • Future: Exploratory data analysis to understand dimensions on which data scientists vary
  • Future: Discover useful insights to report

###Share:

  • Future: Build internal dashboard with results of exploratory data analysis
  • Future: Build user report that visualizes that user's place in the universe of data scientists
  • Future: Deliver 'insights' about how the user can advance in the field

###Expected Actions:

  • Future: Data Scientists can better allocate professional development time

##Implementation (Benefits):

Implementing this solution requires many Data Science skills:

###Engineering:

  • Requires building a data pipeline to ingest, store, and transform data
  • Requires building a reporting system
  • Requires contributing to an open source project

###Data Analysis:

  • Requires exploratory data analysis
  • Requires generating automated insights for reports

###Subject Matter Expertise:

  • Requires an understanding of what insights are important to the end user

If you'd like to work on the project, please reach out to Aaron - he might just mentor you :-)
