nbcarroll/Projects

Projects

Academic Projects

In this group project, we predicted whether a county had a high cancer rate using Toxics Release Inventory data from Washington State industrial facilities. We compared several models; Random Forest, Gradient Boosting, and a Neural Network tuned with Grid Search performed best. Lastly, we explored potential use cases for the EPA and for non-governmental organizations, and discussed the limitations of our findings.

Scalene Works: HR Analytics | Classification

Jupyter Notebook | Tableau Dashboard

Scalene Works People Solutions LLC is a talent acquisition agency. As a business-to-business (B2B) company in HR, its objective is to create value for clients in the hiring process. Hiring poses many challenges, but the one the company has identified as an outstanding, unaddressed problem is the number of candidates who receive and accept job offers but then never join the company (the 'renege rate'). In this assignment, I built a Random Forest Classifier that Scalene Works can use to identify candidates who are less likely to join after accepting an offer.

By leveraging existing customer demographic and purchase data, I identified two groups and the characteristics which define them so a hypothetical marketing team could better serve each customer base.

For my Data Visualization course assignment, we were given a hypothetical dataset of Amazon technology product sales over several months in 2019 in a select number of urban ZIP codes. The task was to explore the dataset and find an interesting story to tell. After analyzing the data, I wrote a report addressed to a hypothetical Amazon Business Intelligence team that identified the cities, product types, and times of year on which the company should focus its marketing efforts to maximize impact.

Airport / Airline Choice Analysis and Modeling

Presented written recommendations for supply chain optimization and other substantive business decisions, based on quantitative analysis, to airline clients operating out of two airports in Asia. Using Python, NumPy, pandas, and scikit-learn, we conducted exploratory data analysis, data preprocessing, and data mining, then developed logistic regression and decision tree models from passenger demographic and flight-specific data to understand which features most strongly drive a passenger’s choice of airport and airline.
Project data and code are not included for confidentiality reasons.

Reducing Regional Hits to the Supply Chain

An automotive manufacturer provided data from one of its manufacturing plants and asked our small group to build a machine learning model that would identify the factors most important in predicting when a hit to the supply chain was likely to occur. After reducing the size of the data set, we built a Random Forest Classifier that predicted when a future supply chain issue was likely to occur with 85.9% accuracy.
Project data and code are not included for confidentiality reasons.
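
For readers unfamiliar with the metric quoted above, here is a minimal sketch of how classification accuracy is computed. The labels are made up for illustration and are not project data; the function name `accuracy` is hypothetical.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labels: 1 = supply chain hit occurred, 0 = no hit.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
print(f"accuracy = {accuracy(actual, predicted):.3f}")
```

Accuracy alone can look high when one class is rare, so it is usually read alongside per-class measures such as precision and recall.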

Labor Market Analysis and Modeling

Built logarithmic models in R on individual-level California data from the 2019 American Community Survey to examine the effects of education and race on earnings.
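
As a rough illustration of how coefficients from such a model are read, the sketch below assumes the dependent variable is the natural log of earnings (an assumption; the project models themselves were built in R and are not shown here). Under that assumption, a coefficient `beta` on an indicator variable implies a percent change in earnings of `(exp(beta) - 1) * 100`. The coefficient value is hypothetical, not a project result.

```python
import math


def percent_effect(beta: float) -> float:
    """Percent change in earnings implied by a coefficient in a log-earnings model."""
    return (math.exp(beta) - 1.0) * 100.0


# Hypothetical coefficient on a "holds a bachelor's degree" indicator.
beta_degree = 0.40
print(f"Implied earnings difference: {percent_effect(beta_degree):.1f}%")
```

For small coefficients the result is close to `beta * 100`, which is why log-model coefficients are often read directly as approximate percentages.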

Mortgage Lending Decisions Analysis and Modeling

Conducted exploratory data analysis on 2017 Home Mortgage Disclosure Act data for the state of Delaware, provided by the Consumer Financial Protection Bureau, and built logit and probit models in R using maximum likelihood estimation to better understand how the probability of loan approval varies across racial and ethnic groups.
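
The difference between the two model families can be sketched in a few lines. Both map a linear index to a probability: logit uses the logistic function and probit uses the standard normal CDF. Maximum likelihood estimation then picks the coefficients that maximize the Bernoulli log-likelihood. This is a Python illustration only (the project used R), and the intercept, coefficients, and feature values below are hypothetical.

```python
import math
from typing import Sequence


def linear_index(intercept: float, coefs: Sequence[float], features: Sequence[float]) -> float:
    """Linear predictor: intercept plus the sum of coefficient * feature."""
    return intercept + sum(c * x for c, x in zip(coefs, features))


def logit_prob(z: float) -> float:
    """Logistic link: P(approved) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def probit_prob(z: float) -> float:
    """Probit link: P(approved) = standard normal CDF evaluated at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def log_likelihood(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Bernoulli log-likelihood, the quantity that MLE maximizes."""
    return sum(
        y * math.log(p) + (1 - y) * math.log(1.0 - p)
        for p, y in zip(probs, outcomes)
    )


# Hypothetical applicant: features are [log income, loan-to-income ratio].
z = linear_index(intercept=-1.0, coefs=[0.5, -0.3], features=[4.0, 2.0])
print(f"logit:  {logit_prob(z):.3f}")
print(f"probit: {probit_prob(z):.3f}")
```

The two links give similar fitted probabilities in practice, but their coefficients are on different scales, so they should not be compared directly across models.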

Non-Academic Projects

Analyzed over a thousand skincare products from lookfantastic.com to determine which ingredients were the most frequent, and to find connections between price, serving size, product types, and ingredients. Built a Tableau dashboard to present findings.

Rekordbox, a program widely used by DJs to organize their music libraries, contains a vast amount of metadata. However, the songs in DJs' libraries often come from diverse sources with inconsistent metadata tagging standards, resulting in an unclean dataset. This project focused on data cleaning and exploratory data analysis to address this challenge. Through this project, I delved into my music library to gain insights and improve data quality.

SoundCloud's mobile application has undergone many changes in the past year. This project helped me better understand users' complaints about these changes.

The payment feature offered by Linktree is user-friendly, but it poses challenges for recipients and lacks native ticketing functionality. In this project, I focused on enhancing data usability through thorough data cleaning.

Guided Projects
