This is a project from my Applied Data Analytics course, a core course in my Master of Science in Business Analytics (MSBA) program at the University of Montana. The purpose of this project is to better understand the purchasing behavior of the Dram Shop's customers by applying principal component analysis (PCA) for dimensionality reduction.
The Dram Shop is a local craft brewing company in Missoula, MT, with two locations: the Dram Shop Downtown and the Dram Shop Central. It has more than three dozen regional craft beer and wine offerings and is the first of its kind in the state.
- Write a SQL query in Google BigQuery (GBQ) that returns the values of `beverage` with the top 1000 gross sales totals.
- Write a SQL query built off the previous one that returns three columns: `customer ID`, `beverage`, and `total sales`.
- Use the pandas function `pivot` to reshape the data into a 'wide' format with `customer ID` as the rows and `beverage` as the items.
- Use the `PCA` function from `sklearn.decomposition` to perform a PCA on this dataset.
- Plot the amount of explained variation in the first 20 components.
- For each of the first 6 components, print the items with the 15 largest loadings in absolute value.
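
The pandas and scikit-learn steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the data is synthetic (the real input would be the result of the GBQ query), the column names `customer_id`, `beverage`, and `total_sales` and the helper `top_loadings` are hypothetical, and the rows of `pca.components_` are treated as the loadings.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Synthetic stand-in for the query result (hypothetical names and values).
rng = np.random.default_rng(0)
n_customers, n_beverages = 200, 30
long_df = pd.DataFrame({
    "customer_id": np.repeat(np.arange(n_customers), n_beverages),
    "beverage": np.tile([f"beverage_{i}" for i in range(n_beverages)], n_customers),
    "total_sales": rng.gamma(shape=2.0, scale=5.0, size=n_customers * n_beverages),
})

# Long to wide: one row per customer, one column per beverage.
# Customers who never bought a beverage get 0 instead of NaN.
wide = long_df.pivot(
    index="customer_id", columns="beverage", values="total_sales"
).fillna(0)

# Fit a PCA with 20 components on the wide table.
pca = PCA(n_components=20)
pca.fit(wide)

# Share of total variance explained by each of the first 20 components.
explained = pca.explained_variance_ratio_

# Assumption: use the principal axes (rows of components_) as loadings,
# arranged as one column per component and one row per beverage.
loadings = pd.DataFrame(
    pca.components_.T,
    index=wide.columns,
    columns=[f"PC{i + 1}" for i in range(pca.n_components_)],
)

def top_loadings(component, k=15):
    """Return the k items with the largest loadings in absolute value."""
    col = loadings[component]
    order = col.abs().sort_values(ascending=False).index
    return col.reindex(order).head(k)

# Print the 15 largest loadings for each of the first 6 components.
for pc in loadings.columns[:6]:
    print(pc)
    print(top_loadings(pc))
```

For the plotting step, `explained` can be passed to any bar or line chart, for example one bar per component with the component number on the x-axis.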