Projects

Academic Projects

Predicting Elevated Cancer Rates Near Washington State Industrial Facilities (Seattle University Capstone)

Our group set out to predict whether a county had a high cancer rate, using data from the Toxics Release Inventory for Washington State industrial facilities. We compared several models; Random Forest, Gradient Boosting, and a Neural Network tuned with Grid Search performed best. Lastly, we explored potential use cases for the EPA and for non-governmental organizations, and discussed the limitations of our findings.

Scalene Works: HR Analytics | Classification

Jupyter Notebook | Tableau Dashboard

Scalene Works People Solutions LLC is a talent acquisition agency. As a business-to-business (B2B) company in HR, its objective is to create value for clients in the hiring process. Of the many challenges in hiring, the one it has identified as an outstanding, unaddressed problem is the number of candidates who receive and accept a job offer but then never join the company (the 'renege rate'). In this assignment, I built a Random Forest Classifier that Scalene Works can use to predict which candidates are less likely to join after accepting an offer.

Customer Segmentation | Clustering

Using existing customer demographic and purchase data, I identified two customer groups and the characteristics that define them, so that a hypothetical marketing team could better serve each group.
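As a rough illustration of the two-group result, here is a minimal sketch of clustering customers into two segments. It uses k-means only as one common choice; the README does not say which clustering algorithm the project used. The feature names (age, annual spend), the toy records, and the starting centroids are all hypothetical.

```python
import math

def kmeans(points, centroids, iterations=20):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids

# Hypothetical customers as (age, annual spend); not data from the project.
customers = [(25, 200), (30, 250), (28, 220), (55, 900), (60, 950), (58, 870)]
labels, centroids = kmeans(customers, centroids=[customers[0], customers[-1]])
print(labels)
print(centroids)
```

In practice the features would usually be scaled first, since otherwise the variable with the largest range (here, spend) dominates the distance.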

Amazon Business Intelligence | Data Translation Assignment

For my Data Visualization course assignment, we were given a hypothetical dataset of Amazon technology product sales over several months in 2019 in a select number of urban ZIP codes. The task was to explore the dataset and find an interesting story to tell. After analyzing the data, I identified the cities, product types, and times of year on which the company should focus its marketing efforts to maximize impact, and presented the findings in a written report addressed to a hypothetical Amazon Business Intelligence team.

Airport / Airline Choice Analysis and Modeling

Presented written recommendations for supply chain optimization and other substantive business decisions, based on quantitative analysis, to airline clients operating out of two airports in Asia. Using Python, NumPy, pandas, and scikit-learn, we conducted exploratory data analysis, data preprocessing, and data mining, then developed logistic regression and decision tree models from passenger demographic and flight-specific data to identify the features that most strongly drive a passenger’s choice of airport and airline.
Project data and code are not included for confidentiality reasons.

Reducing Regional Hits to the Supply Chain

An automotive manufacturer provided data from one of its manufacturing plants and asked our small group to build a machine learning model that would identify the most important factors in predicting when a hit to the supply chain was likely to occur. After reducing the size of the data set, we built a Random Forest Classifier that predicted future supply chain issues with 85.9% accuracy.
Project data and code are not included for confidentiality reasons.

Labor Market Analysis and Modeling

Built logarithmic models in R using individual-level California data from the 2019 American Community Survey to examine the effects of education and race on earnings.

Mortgage Lending Decisions Analysis and Modeling

Conducted exploratory data analysis on 2017 Home Mortgage Disclosure Act data for the state of Delaware, provided by the Consumer Financial Protection Bureau, and built logit and probit models estimated by maximum likelihood in R to better understand how the probability of loan approval varies across racial and ethnic groups.
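To make the logit/probit distinction concrete, here is a minimal Python sketch (the project itself used R). Both models pass a linear index through a cumulative distribution function to get an approval probability, and maximum likelihood estimation picks the coefficients that maximize the Bernoulli log-likelihood. The toy rows, outcomes, and coefficients below are hypothetical and are not estimates from the project.

```python
import math

def logit_prob(z):
    """Logistic CDF: approval probability under a logit model for linear index z."""
    return 1.0 / (1.0 + math.exp(-z))

def probit_prob(z):
    """Standard normal CDF: approval probability under a probit model for linear index z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def log_likelihood(prob_fn, coefs, rows, outcomes):
    """Bernoulli log-likelihood, the quantity maximum likelihood estimation maximizes."""
    total = 0.0
    for x, y in zip(rows, outcomes):
        z = sum(b * xi for b, xi in zip(coefs, x))  # linear index x'b
        p = prob_fn(z)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

# Hypothetical toy data: each row is (intercept, standardized income).
rows = [(1.0, -1.2), (1.0, -0.3), (1.0, 0.4), (1.0, 1.5)]
outcomes = [0, 0, 1, 1]   # 1 = approved
coefs = (0.0, 1.0)        # illustrative values, not fitted

print(log_likelihood(logit_prob, coefs, rows, outcomes))
print(log_likelihood(probit_prob, coefs, rows, outcomes))
```

A fitting routine would search over `coefs` to maximize these values; the two link functions generally give similar fitted probabilities but coefficients on different scales.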

Non-Academic Projects

Skincare Product Analysis

Analyzed over a thousand skincare products from lookfantastic.com to determine which ingredients were the most frequent, and to find connections between price, serving size, product types, and ingredients. Built a Tableau dashboard to present findings.

Rekordbox EDA

Rekordbox, a program widely used by DJs to organize their music libraries, contains a vast amount of metadata. However, the songs in DJs' libraries often come from diverse sources with inconsistent metadata tagging standards, resulting in an unclean dataset. This project focused on data cleaning and exploratory data analysis to address this challenge. Through this project, I delved into my music library to gain insights and improve data quality.
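The kind of cleaning described above can be sketched as follows. This is an assumption-laden illustration, not the project's code: the field names (`artist`, `title`, `genre`, `bpm`) and the sample record are hypothetical, and a real Rekordbox export may be structured differently.

```python
import re

def clean_tag(value):
    """Trim, collapse internal whitespace, and return None for empty tags."""
    if value is None:
        return None
    cleaned = re.sub(r"\s+", " ", str(value)).strip()
    return cleaned or None

def clean_track(track):
    """Return a cleaned copy of one track record (hypothetical field names)."""
    cleaned = {key: clean_tag(track.get(key)) for key in ("artist", "title", "genre")}
    if cleaned["genre"]:
        # Case-fold genres so "HOUSE" and "House" group together.
        cleaned["genre"] = cleaned["genre"].casefold()
    try:
        bpm = float(track.get("bpm"))
    except (TypeError, ValueError):
        bpm = None  # missing or non-numeric tempo
    cleaned["bpm"] = bpm if bpm and bpm > 0 else None
    return cleaned

# Hypothetical record with inconsistent tagging.
raw = {"artist": "  Daft   Punk ", "title": "One More Time", "genre": " HOUSE", "bpm": "123.00"}
print(clean_track(raw))
```

Normalizing tags first makes later steps of the exploratory analysis, such as counting tracks per genre, less sensitive to tagging differences between sources.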

Soundcloud NLP Analysis

Soundcloud's mobile application has undergone many changes in the past year. This project analyzed users' complaints about those changes to better understand them.

Linktree Payments as a Ticketing Service

The payment feature offered by Linktree is user-friendly, but it poses challenges for recipients and lacks native ticketing functionality. In this project, I focused on enhancing data usability through thorough data cleaning.

Guided Projects

The StartUp Dashboard

Coal Terminal Maintenance Analysis
