Home
This project currently has two main sections, one functional and one planned:
- Pair Data Science (functional)
- Data Scientist Skills Analyzer (planned)
#Pair Data Science
This software helps a Meetup Organizer arrange an event called "Pair Data Science." Pair Data Science works as follows:
- Data scientists (or aspiring data scientists) opt in on a weekly basis
- Each opted-in user fills out a survey about their skills and preferences
- Each opted-in user gets assigned a partner ("pair")
- Pairs are told of their assignments by Meetup message
- Pairs have 10 days to meet in person and work on a project
The software works by interacting with the Meetup API and the SurveyMonkey API. It pulls the list of opted-in users from Meetup and matches it against the survey data pulled from SurveyMonkey. Users who haven't taken the survey can be sent a reminder via Meetup message. The software then runs a matching algorithm (currently GeogMatcher, which considers only geography) to assign pairs as well as it can, and delivers the assignments by Meetup message.
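The join between the two data sources can be sketched roughly as follows. This is a minimal illustration in Python, not the project's actual code: the helper name `split_by_survey_status`, the member IDs, and the dictionary shapes are all assumptions, and no real Meetup or SurveyMonkey API calls are made.

```python
from typing import Dict, List, Tuple


def split_by_survey_status(
    opted_in_ids: List[str],
    survey_responses: Dict[str, dict],
) -> Tuple[Dict[str, dict], List[str]]:
    """Join opted-in members with survey data (hypothetical helper).

    Returns the members who are ready to be matched, keyed by member ID,
    and the list of members who still need a survey reminder.
    """
    ready: Dict[str, dict] = {}
    needs_reminder: List[str] = []
    for member_id in opted_in_ids:
        if member_id in survey_responses:
            ready[member_id] = survey_responses[member_id]
        else:
            needs_reminder.append(member_id)
    return ready, needs_reminder


# Illustrative data only; real IDs and survey fields will differ.
opted_in = ["m1", "m2", "m3"]
responses = {"m1": {"zip": "60601"}, "m3": {"zip": "60614"}}

ready, needs_reminder = split_by_survey_status(opted_in, responses)
print(sorted(ready))      # ['m1', 'm3']
print(needs_reminder)     # ['m2']
```

Members in `needs_reminder` would be the ones to nudge via Meetup message before matching runs.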
There is still a lot of room for improvement:
- Survey data isn't persisted anywhere outside of SurveyMonkey:
  - Persist record of previous matches, promoting unique matches week over week
  - Reduce number and size of API calls
- The current matching algorithm (GeogMatcher) only considers geography:
  - Unmatched users should be assigned to a group message with all other unmatched users
  - If there is an odd number of users, the odd user out should be assigned to a group of three
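One way the "group of three" idea might look is sketched below. It is not GeogMatcher itself: the function name `assign_groups` is hypothetical, and sorting by zip code string is assumed here only as a crude stand-in for geographic closeness.

```python
from typing import Dict, List


def assign_groups(user_zips: Dict[str, str]) -> List[List[str]]:
    """Pair users who sort next to each other by zip code (assumption).

    With an odd number of users, the last user joins the final pair,
    forming a group of three. A lone user is returned as a group of one.
    """
    # Sort by zip code, breaking ties by user ID for a stable result.
    ordered = sorted(user_zips, key=lambda uid: (user_zips[uid], uid))
    # Take users two at a time; a trailing odd user is handled below.
    groups = [ordered[i:i + 2] for i in range(0, len(ordered) - 1, 2)]
    if len(ordered) % 2 == 1:
        if groups:
            groups[-1].append(ordered[-1])
        else:
            groups.append([ordered[-1]])
    return groups


# Illustrative data only.
users = {"ana": "60601", "ben": "60614", "cy": "60602",
         "dee": "60615", "eli": "60616"}
print(assign_groups(users))  # [['ana', 'cy'], ['ben', 'dee', 'eli']]
```

A real matcher would likely use distances rather than string order, and could also consult a record of previous matches to avoid repeats.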
#Data Scientist Skills Analyzer
This software helps a data scientist assess their skills relative to other data scientists. This information will be useful in making professional development decisions. The roadmap is as follows:
##Roadmap
###Collect:
- Pull data from existing SurveyMonkey survey on skills
- Future: Update existing SurveyMonkey survey on skills
- Future: Add a new SurveyMonkey survey on feedback from pairings
###Store:
- Data held in memory in User class for analysis
- Future: Persist data in database
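For illustration, an in-memory user record could look something like the sketch below. The project's real `User` class may differ considerably: the fields `member_id`, `name`, and `skills`, the helper `average_skill`, and the 0-5 rating scale are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class User:
    """In-memory record for one survey respondent (hypothetical fields)."""
    member_id: str
    name: str
    # Maps a skill name to a self-rated level, assumed here to be 0-5.
    skills: Dict[str, int] = field(default_factory=dict)

    def skill_level(self, skill: str) -> int:
        """Return the self-rated level, or 0 if the skill was not rated."""
        return self.skills.get(skill, 0)


def average_skill(users: List[User], skill: str) -> float:
    """Average self-rated level for one skill across all users."""
    if not users:
        return 0.0
    return sum(u.skill_level(skill) for u in users) / len(users)


# Illustrative data only.
users = [
    User("m1", "Ana", {"python": 4, "sql": 3}),
    User("m2", "Ben", {"python": 2}),
]
print(average_skill(users, "python"))  # 3.0
```

Holding everything in memory keeps analysis simple, but nothing survives a restart, which is why persistence is listed as future work.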
###Extract-Transform-Load:
- Future: Build pipeline
###Analyze:
- Future: Exploratory data analysis to understand dimensions on which data scientists vary
- Future: Discover useful insights to report
###Share:
- Future: Build internal dashboard with results of exploratory data analysis
- Future: Build user report that visualizes that user's place in the universe of data scientists
- Future: Deliver 'insights' about how the user can advance in the field
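A user's "place in the universe" for a single skill could be summarized with a percentile rank. The sketch below shows one common definition (share of scores below, plus half of any ties); the function name and the choice of definition are assumptions, not a planned design.

```python
from typing import List


def percentile_rank(score: float, all_scores: List[float]) -> float:
    """Percent of scores strictly below `score`, plus half of the ties."""
    if not all_scores:
        raise ValueError("all_scores must not be empty")
    below = sum(1 for s in all_scores if s < score)
    ties = sum(1 for s in all_scores if s == score)
    return 100.0 * (below + 0.5 * ties) / len(all_scores)


# Illustrative data only: self-rated skill levels for five users.
scores = [1, 2, 3, 4, 5]
print(percentile_rank(4, scores))  # 70.0
```

A user report could show one such number per skill, which also makes it easy to flag the skills where a user sits lowest relative to peers.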
###Expected Actions:
- Future: Data Scientists can better allocate professional development time
##Implementation (Benefits):
Implementing this solution requires many Data Science skills:
###Engineering:
- Requires building a data pipeline to ingest, store, and transform data
- Requires building a reporting system
- Requires contributing to an open source project
###Data Analysis:
- Requires exploratory data analysis
- Requires generating automated insights for reports
###Subject Matter Expertise:
- Requires an understanding of what insights are important to the end user
If you'd like to work on the project, please reach out to Aaron - he might just mentor you :-)