September to December 2017
Metis is a 12-week, full-time data science bootcamp. Each project pairs a new mathematical or machine learning concept with a pragmatic programming skill. Below is a brief summary of my projects, in reverse chronological order.
Much work has been done using data science to quantify how gerrymandered a map is, but I had seen less work on using data science to draw an optimally fair map. For this project, I focused on North Carolina, a state that has been aggressively gerrymandered. I found an experimental algorithm developed at the University of Nebraska, coded it in Python from scratch, made a host of changes that I believe are improvements that help it generalize, and demonstrated how the algorithm should and shouldn't be used.
Natural Language Processing and NoSQL Databases
Twitter is filled with hateful people who will attack and harass you if they don't like what you have to say. These trolls are especially cruel to women. Using a collection of coded tweets from a study at Cornell University, I created a two-stage classification tool to flag potentially abusive tweets.
Classification and SQL Databases
Across the country, and at all levels of the criminal justice system, classification algorithms are used to predict offenders' risk of reoffending. These risk assessment scores influence, among other things, sentencing length and parole supervision. However, the algorithms have been shown to have poor predictive power and to exhibit strong racial bias. With data from a U.S. Dept. of Justice study, I developed my own classification algorithm and tested it for racial bias.
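One common way to test a risk classifier for group bias is to compare false positive rates across groups: among people who did not reoffend, how often was each group flagged as high risk? The snippet below is a minimal sketch of that check, not the project's actual code. The function name, the record layout, and the toy data are all made up for illustration.

```python
from collections import defaultdict


def false_positive_rate_by_group(records):
    """Return {group: false positive rate}.

    `records` is an iterable of (group, predicted_high_risk, reoffended)
    tuples. This layout is a hypothetical one chosen for the example.
    """
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for group, predicted_high_risk, reoffended in records:
        if not reoffended:
            # Only people who did not reoffend can be false positives.
            negatives[group] += 1
            if predicted_high_risk:
                false_positives[group] += 1
    return {g: false_positives[g] / n for g, n in negatives.items() if n > 0}


# Made-up toy records, not real data.
toy_records = [
    ("A", True, False), ("A", False, False), ("A", True, True), ("A", False, False),
    ("B", True, False), ("B", True, False), ("B", False, True), ("B", False, False),
]
rates = false_positive_rate_by_group(toy_records)
for group, rate in sorted(rates.items()):
    print(f"group {group}: false positive rate = {rate:.2f}")
```

A large gap between groups on this metric is one signal of disparate impact. It is only one of several fairness measures, and different measures can disagree with each other.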
Regression and Web Scraping
I was curious to see if the frequency of edits to a publicly traded company's Wikipedia page had any correlation to that company's stock price. Turns out, not in the slightest. But it was a really fun exploration.
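For a sense of what "correlation" means here, the Pearson coefficient between two equal-length series can be computed in a few lines. This is a minimal sketch with made-up numbers, not the data or the regression from the project. The names `pearson_r`, `weekly_edits`, and `weekly_price_change` are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up toy series, for illustration only.
weekly_edits = [3, 7, 1, 4, 9, 2]
weekly_price_change = [0.5, -0.2, 0.1, 0.4, -0.3, 0.0]
r = pearson_r(weekly_edits, weekly_price_change)
print(f"Pearson r = {r:.3f}")
```

A value near 0 means no linear relationship, while values near 1 or -1 mean a strong one.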