# Final Project: Machine Learning

* Minoru Kitajima
* 0500-35-9081

## Purpose:

The purpose of this Final Project is to identify the factors that influence coffee sensory evaluation and overall consumer preference.
The key dependent variable (DV) is Liking, measured on a 1–5 scale, indicating how much the taster liked the coffee sample.
Key independent variables (IVs) include:
* Brew temperature condition (e.g., “87-1.0-16”, “87-1.25-20”, etc.)
* TDS (Total Dissolved Solids)
* PE (Percent Extraction)
* Dose (amount of coffee grounds)
* Position (order in which the sample was presented within a session)
* Sensory attributes (binary indicators, 0 or 1), such as:
  * Fruity
  * Floral
  * Earthy
  * Nutty
  * Chocolate
  * Dark chocolate
  * Caramel
  * Roasted
  * Bitter
  * Astringent
  * Sweet
  * Sour
  * Thick/viscous
  * Rubber

and several others.

This dataset contains 3186 observations, each representing one evaluated coffee sample.
Each case includes judge information, cluster assignment, session metadata, brewing parameters, sensory descriptors, and the DV (Liking).

### Dataset source:

The data come from the [Consumer preference data for black coffee](https://datadryad.org/dataset/doi:10.25338/B8993H) dataset on Dryad.

Analyses of this dataset have been published previously in Ristenpart et al. (2023).


#### References:

Cotter, R. (2023). Coffee sensory evaluation dataset (Version 1). Published January 16, 2023 on Dryad. [link](https://doi.org/10.25338/B8993H)

# Machine Learning
From the original dataset containing 47 explanatory variables, we focus on a subset of features with data types that are straightforward to handle and interpret in a machine-learning setting.
Specifically, we use the following variables:
Dose, Grind, Brew Mass, Percent Extraction, pH, Volume, Brew Temperature, Pour Temp, 90Sec Temp, Flavor.intensity, Acidity, Mouthfeel, and the binary flavor notes Fruit, Bitter, Astringent, Sour, and Sweet.

In addition, results from the FP4 analysis suggested that Flavor.intensity has a nonlinear (approximately quadratic) relationship with overall liking.
To account for this effect while keeping the model interpretable, we include a squared term for Flavor.intensity in the learning process, while all other features enter the model linearly.
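As a minimal sketch of how such a squared term can be added, the snippet below builds the extra column with pandas. The column name `Flavor.intensity_sq` matches the coefficient table in this report, but the toy values are made up and the actual code inside the `ml` module is not shown here, so this is an assumption about the approach rather than the project's implementation.

```python
import pandas as pd

# Hypothetical toy frame: the column name follows the report, the values are made up.
df = pd.DataFrame({"Flavor.intensity": [1.0, 2.0, 3.0, 4.0, 5.0]})

# Add the quadratic term as an extra column; all other features would stay linear.
df["Flavor.intensity_sq"] = df["Flavor.intensity"] ** 2

print(df)
```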

To enable meaningful comparisons across explanatory variables, all feature values were standardized before being used in the model.
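Standardization here means rescaling each feature to zero mean and unit variance (z-scores). The sketch below shows the computation with NumPy on a made-up two-column matrix; it is an illustration only, and whether the project used this exact formula or a library scaler is an assumption.

```python
import numpy as np

# Made-up feature matrix: two columns on different scales (NOT the coffee data).
X = np.array([[18.0, 87.0],
              [20.0, 90.0],
              [22.0, 93.0],
              [20.0, 90.0]])

mean = X.mean(axis=0)
std = X.std(axis=0)  # population standard deviation (ddof=0)

# z-score each column so that coefficients become comparable across features.
X_std = (X - mean) / std

print(X_std.mean(axis=0))  # approximately 0 for every column
print(X_std.std(axis=0))   # approximately 1 for every column
```

When a train/test split is involved, a common choice is to compute `mean` and `std` on the training rows only and reuse them for the test rows, so that no test information leaks into preprocessing.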

# Result 1
In this analysis, we applied both linear regression and ridge regression models.
The dataset was split into training and test sets using an 80:20 ratio, and model performance was evaluated using R² and RMSE as evaluation metrics.
The table below compares the results of linear regression and ridge regression, along with a baseline model that always predicts the mean value of the training data.

```python
from ml import Ml_beginner
Ml_beginner()
```

| Model | Test R² | Test RMSE |
| --- | --- | --- |
| Baseline (mean) | -0.01 | 1.768 |
| LinearRegression | 0.319 | 1.452 |
| Ridge(alpha=1) | 0.32 | 1.45 |



The baseline model, which always predicts the mean of the training data, yields an R² slightly below zero (-0.01) and the largest RMSE (1.768), confirming that it explains none of the variance in the test data.
In contrast, both linear regression and ridge regression outperform the baseline, achieving R² values of approximately 0.32 and RMSE values of about 1.45.
The nearly identical performance of the two models indicates that regularization with alpha = 1 has little effect on this dataset, which is consistent with the unregularized linear model not overfitting.
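The evaluation protocol described in this section (80:20 split, mean baseline, linear and ridge regression, R² and RMSE on the test set) can be sketched with scikit-learn as follows. The data are synthetic stand-ins, and the use of `DummyRegressor`, a `StandardScaler` pipeline, and `random_state=0` are assumptions for illustration; the internals of `Ml_beginner` are not shown in this report.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the coffee dataset): 5 features, noisy linear target.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = X @ np.array([0.5, -0.3, 0.0, 0.2, 0.1]) + rng.normal(scale=1.0, size=500)

# 80:20 train/test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "Baseline (mean)": DummyRegressor(strategy="mean"),
    "LinearRegression": make_pipeline(StandardScaler(), LinearRegression()),
    "Ridge(alpha=1)": make_pipeline(StandardScaler(), Ridge(alpha=1.0)),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    r2 = r2_score(y_test, pred)
    rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
    results[name] = (r2, rmse)
    print(f"{name}: R2={r2:.3f}, RMSE={rmse:.3f}")
```

A mean baseline fitted on the training set can never score above zero on test R², because the test-set mean is the constant that minimizes squared error on the test set. A slightly negative value such as the -0.01 reported above is therefore expected.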

# Result 2
The table below lists the ten most negative standardized regression coefficients of the explanatory variables used in the training process, sorted in ascending order.

```python
from ml import coef_show
coef_show()
```

| feature | coef |
| --- | --- |
| Flavor.intensity_sq | -3.263 |
| Sour | -0.272 |
| Dose | -0.175 |
| Astringent | -0.168 |
| Bitter | -0.156 |
| Brew Mass | -0.102 |
| Brew Temperature | -0.07 |
| Acidity | -0.038 |
| Percent Extraction | -0.026 |
| Pour Temp | -0.018 |


The standardized regression coefficients indicate that Flavor.intensity and its squared term have the largest contributions, suggesting that flavor intensity strongly influences overall liking.
The positive coefficient for the linear term and the negative coefficient for the quadratic term imply a non-monotonic, concave relationship, in which predicted liking rises with flavor intensity up to a peak and then levels off or declines.
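The location of the turning point follows from the quadratic form. As a short worked derivation, write $x$ for flavor intensity and hold all other features fixed; the symbols $\beta_0$, $\beta_1$, $\beta_2$ are notation introduced here, not values from the report:

$$
\hat{y}(x) = \beta_0 + \beta_1 x + \beta_2 x^2, \qquad \beta_1 > 0,\ \beta_2 < 0
$$

$$
\frac{d\hat{y}}{dx} = \beta_1 + 2\beta_2 x = 0 \quad\Longrightarrow\quad x^{*} = -\frac{\beta_1}{2\beta_2}
$$

Because $\beta_2 < 0$, $x^{*}$ is a maximum. This formula assumes that $x$ and $x^2$ enter the model on the same scale. If the two columns were standardized separately, the coefficients would first need to be converted back to the raw scale before applying it.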
In contrast, sour, bitter, and astringent notes tend to contribute negatively, while sweetness, fruitiness, and mouthfeel show positive associations with liking.
However, because Flavor.intensity and its squared term are highly correlated, the magnitudes of the individual coefficients should not be interpreted independently; instead, they should be understood jointly as representing a nonlinear effect.
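The strength of this correlation is easy to check numerically. The sketch below uses made-up ratings on a positive scale to show that a variable and its square are nearly collinear, and that centering the variable before squaring greatly reduces the correlation when the distribution is roughly symmetric. Centering is shown only as one common remedy; nothing in this report indicates that it was applied.

```python
import numpy as np

# Made-up intensity ratings on a positive scale (NOT the coffee data).
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 5.0, size=1000)

# Raw term vs. its square: nearly collinear because x is always positive.
corr_raw = np.corrcoef(x, x**2)[0, 1]

# Centering x before squaring reduces the correlation for symmetric data.
xc = x - x.mean()
corr_centered = np.corrcoef(xc, xc**2)[0, 1]

print(f"corr(x, x^2)               = {corr_raw:.3f}")
print(f"corr(x - mean, (x-mean)^2) = {corr_centered:.3f}")
```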