Chocolate Bar Rating Analysis

Authors:

Rachel K. Riggs

Carrie Cheung

Project Overview

Have you ever wondered where the chocolate beans of your favourite chocolate bar came from, and whether that has an effect on how good it tastes?

To investigate this further, we needed some very delicious chocolate data. We used the chocolate bar ratings dataset from Kaggle, which contains expert ratings of over 1,795 individual chocolate bars. The dataset also includes information about each bar, such as where the beans were grown, the cocoa percentage, and the bean variety.

Here is a snapshot of the first few rows in the dataset:

(A CSV copy of the data from Kaggle can be found in the data folder of this repository.)

Since Venezuela is one of the largest producers of the Criollo bean, which is considered a delicacy, we set out to answer the following question using the chocolate bar ratings dataset:

Do chocolate bars made from beans grown in Venezuela have a different average rating compared to bars made from beans grown elsewhere?

Usage

You can reproduce our analysis in one of three ways with the following steps:

Option #1: With Docker

Clone/download this repository and, using the command line, navigate to the root of this project.
Run the following command in bash (replacing PATH_ON_YOUR_COMPUTER with the absolute path to the root of this project on your computer):

docker run --rm -v PATH_ON_YOUR_COMPUTER:/home/choc_analysis rachelkriggs/dsci_522-chocolate_ratings_analysis make -C '/home/choc_analysis' all

To clean up the analysis:

docker run --rm -v PATH_ON_YOUR_COMPUTER:/home/choc_analysis rachelkriggs/dsci_522-chocolate_ratings_analysis make -C '/home/choc_analysis' clean
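
If you are already at the project root, the shell can fill in the absolute path instead of you typing it by hand. The following is a minimal sketch, assuming a POSIX-style shell in which `$(pwd)` returns the absolute path of the current directory. The function name `run_analysis` and the variable names are hypothetical and not part of this repository.

```shell
# Assumption: the shell is currently at the root of this project.
PROJECT_ROOT="$(pwd)"
IMAGE="rachelkriggs/dsci_522-chocolate_ratings_analysis"

# Hypothetical helper: runs one Makefile target (all or clean) in the container.
# Quoting the volume argument keeps paths that contain spaces intact.
run_analysis() {
  docker run --rm -v "${PROJECT_ROOT}:/home/choc_analysis" "${IMAGE}" \
    make -C '/home/choc_analysis' "$1"
}
```

With this in place, `run_analysis all` runs the analysis and `run_analysis clean` cleans it up, mirroring the two commands above.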

Option #2: Without Docker Using Make

Note that using Make to run our analysis is more straightforward than option #3 below, which requires running multiple scripts, and is therefore the recommended approach without Docker.

Clone/download this repository and, using the command line, navigate to the root of this project.
Run the following command in bash:

make all

To clean up the analysis:

make clean

Option #3: Without Docker Without Make

Clone/download this repository and, using the command line, navigate to the root of this project.
Run the following commands in bash in the order listed:

Rscript src/01_load_choc_data.R data/flavors_of_cacao.csv data/cleaned_choc_data.csv
Rscript src/02_viz_choc_data.R data/cleaned_choc_data.csv results/choc_data_viz.png
Rscript src/03_analyze_choc_data.R data/cleaned_choc_data.csv results/summarized_choc_data.csv
Rscript src/04_analyze_result_choc_data.R data/cleaned_choc_data.csv results/choc_ratings_analysis_viz.png
Rscript -e "rmarkdown::render('doc/Report.Rmd')"
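
The later scripts read data/cleaned_choc_data.csv, which the first script writes, so it can be useful to stop as soon as one step fails. The following is a minimal sketch of a wrapper that does this. The function name `run_choc_pipeline` is hypothetical and not part of this repository; the commands inside it are the ones listed above.

```shell
# Hypothetical helper (not part of this repository): runs the five steps in the
# order listed above. The && chain stops at the first step that fails, so later
# steps never run against a missing or incomplete input file.
run_choc_pipeline() {
  Rscript src/01_load_choc_data.R data/flavors_of_cacao.csv data/cleaned_choc_data.csv &&
  Rscript src/02_viz_choc_data.R data/cleaned_choc_data.csv results/choc_data_viz.png &&
  Rscript src/03_analyze_choc_data.R data/cleaned_choc_data.csv results/summarized_choc_data.csv &&
  Rscript src/04_analyze_result_choc_data.R data/cleaned_choc_data.csv results/choc_ratings_analysis_viz.png &&
  Rscript -e "rmarkdown::render('doc/Report.Rmd')"
}
```

Call `run_choc_pipeline` from the root of this project. Its exit status is non-zero if any step failed.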

Usage Flow Chart

The flowchart below shows the order in which the scripts are run, as listed in Usage, along with the input file(s) needed and the output file(s) produced at each step.

Report

The report for this analysis can be viewed here.

Dependencies

R & R libraries (R version 3.5.1):
- tidyverse_1.2.1
- knitr_1.20
- here_0.1
- infer_0.3.1
- dplyr_0.7.7
- ggplot2_3.0.0
