Final Preso_ Whiskey Business.pdf

Whiskey Business

Data Studio Output

Data Studio image

Setup Instructions

Please run everything as the w205 user unless otherwise stated.

Hadoop and Hive should already be installed and running.

If you are booting a UCB instance, you can use the following commands:

As root (change the device path to match where your EBS volume is attached):

mount /dev/xvdf /data
su - w205

As w205 (Optional):


Env setup

If Anaconda is not already installed, please install it from:

Create a conda env called "w205-project":

conda env create -f environment.yml

Activate env:

source activate w205-project

If environment.yml changes, update the env while it is activated:

conda env update -f environment.yml

To remove the project's conda env:

conda remove --name w205-project --all

Run all

Activate environment:

source activate w205-project

Add Google Docs credentials to: export_data/client_secret.json

Run all scripts: ./
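
Both prerequisites above (an activated env and a credentials file) can be checked before running anything. The following is a minimal sketch, not part of the repo: the function name `preflight_problems` is hypothetical, and it assumes conda exposes the active env name in the `CONDA_DEFAULT_ENV` environment variable and that the script is run from the repo root.

```python
import os


def preflight_problems(env_name="w205-project",
                       secret_path="export_data/client_secret.json",
                       environ=None):
    """Return a list of human-readable problems; an empty list means ready."""
    # Allow a custom mapping to be passed in (useful for testing).
    environ = os.environ if environ is None else environ
    problems = []
    # Assumption: conda sets CONDA_DEFAULT_ENV to the active env's name.
    if environ.get("CONDA_DEFAULT_ENV") != env_name:
        problems.append("conda env %r is not active" % env_name)
    # The credentials path is relative, so the working directory matters.
    if not os.path.isfile(secret_path):
        problems.append("missing credentials file: %s" % secret_path)
    return problems


if __name__ == "__main__":
    for problem in preflight_problems():
        print(problem)
```

An empty result means both checks passed; anything else lists what to fix first.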

Manual Data setup commands

Download data to data source:

python data_get/

Transform data in data source:

python data_get/

Put data into HDFS:

cd loading_and_modelling


Transform data in hive:

cd ../transforming


Pull final table down as CSV with headers:

hive -e 'set hive.cli.print.header=true;select * from whiskey_business;' | sed 's/[\t]/,/g' | sed 's/whiskey_business\.//g' > export_data/data/whiskey_business.csv
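
The sed pipeline above turns every tab into a comma, so a value that itself contains a comma would shift columns in the CSV. If that matters for your data, the conversion can be done with Python's csv module instead, which quotes such values. This is a minimal sketch assuming Python 3 and tab-separated Hive CLI output with a header row; the function name `hive_tsv_to_csv` is hypothetical and not part of the repo. Unlike the sed command, it strips the `whiskey_business.` prefix from the header row only.

```python
import csv
import io


def hive_tsv_to_csv(tsv_text, table="whiskey_business"):
    """Convert tab-separated Hive CLI output (with a header row) to CSV text."""
    lines = tsv_text.splitlines()
    if not lines:
        return ""
    prefix = table + "."
    # Hive prints header names as "table.column"; keep only the column part.
    header = [name[len(prefix):] if name.startswith(prefix) else name
              for name in lines[0].split("\t")]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for line in lines[1:]:
        # csv.writer quotes any field containing a comma or a quote character.
        writer.writerow(line.split("\t"))
    return out.getvalue()


if __name__ == "__main__":
    import sys
    # Reads Hive output from stdin and writes CSV to stdout.
    sys.stdout.write(hive_tsv_to_csv(sys.stdin.read()))
```

Saved to a file, it could take the place of the two sed stages, with the hive output piped in and stdout redirected to the same CSV path.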

Export data from CSV to Google Sheets:

python export_data/
