<a href="https://colab.research.google.com/github/francji1/01NAEX/blob/main/code/01NAEX_Exercise_10_solution_MBohaty.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 01NAEX Exercise 10


author: xxx

## Drying of beech wood planks

To investigate the effect of drying beech wood on the humidity percentage, the following experiment was conducted. Each of 20 planks was dried for a certain period of time. The humidity percentage was then measured at 5 depths (1, 3, 5, 7, 9) and 3 widths (1, 2, 3) for each plank.

**Source:** The Royal Veterinary and Agricultural University, Denmark.


**Variables:**
* plank 	 -   Numbered 1-20
* width      -   Numbered 1,2,3
* depth 	 -   Numbered 1,3,5,7,9
* humidity   -   Humidity percentage

**Number of observations:** 300 (20 planks)

**Description:**
* depth 1: 	close to the top
* depth 5: 	in the center
* depth 9: 	close to the bottom
* depth 3: 	between 1 and 5
* depth 7: 	between 5 and 9
* width 1: 	close to the side
* width 3: 	in the center
* width 2: 	between 1 and 3
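The level lists above imply a full grid of 20 × 3 × 5 = 300 plank/width/depth cells, which matches the stated number of observations if each cell was measured exactly once (an assumption, not something the source states explicitly). A minimal sketch of that bookkeeping follows; `missing_cells` is a hypothetical helper introduced only for illustration.

```python
from itertools import product

# Factor levels as listed in the exercise description.
plank_levels = range(1, 21)
width_levels = (1, 2, 3)
depth_levels = (1, 3, 5, 7, 9)

# Full grid of (plank, width, depth) cells.
design = list(product(plank_levels, width_levels, depth_levels))
print(len(design))  # 20 * 3 * 5 cells


def missing_cells(observed):
    """Return the design cells that do not occur in `observed` (hypothetical helper)."""
    seen = set(observed)
    return [cell for cell in design if cell not in seen]
```

Once the data are loaded, a call such as `missing_cells(zip(planks['plank'], planks['width'], planks['depth']))` would return an empty list if every cell is present.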


**Analyze data from the Drying of beech wood planks:**

* Plot four average humidity profiles: 2 interaction plots for width and 2 for depth (done).
* Carry out the fixed-effects model analysis.
* Carry out the mixed model analysis.
* Run the post hoc analysis.
* Compare the fixed parameters and use the p-value correction (TukeyHSD).
 Hint: Use the function `lsmeans` from the package `lsmeans` with `adjust="tukey"`.
* Summarize results.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
from itertools import product

import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm
from statsmodels.stats.multicomp import pairwise_tukeyhsd


In [None]:
planks = pd.read_csv("https://raw.githubusercontent.com/francji1/01NAEX/main/data/planks.txt",sep=",")
planks

In [None]:
planks['plank'] = planks['plank'].astype('category')
planks['width'] = planks['width'].astype('category')
planks['depth'] = planks['depth'].astype('category')


In [None]:
def interaction_plot(x, trace, response, data, ax):
    """Plot the mean of `response` against `x`, one line per level of `trace`."""
    categories_x = data[x].cat.categories
    categories_trace = data[trace].cat.categories

    for trace_level in categories_trace:
        subset = data[data[trace] == trace_level]
        # observed=False keeps every level of x, in category order
        means = subset.groupby(x, observed=False)[response].mean()
        ax.plot(categories_x, means, label=f"{trace}: {trace_level}")

    ax.set_xlabel(x)
    ax.set_ylabel(response)
    # Place the legend outside the axes so it does not cover the lines
    ax.legend(title=trace, loc='upper left', bbox_to_anchor=(1, 1), fontsize='small')
    ax.grid(True)


In [None]:

# Creating the 2x2 subplot layout
fig, axes = plt.subplots(2, 2, figsize=(12, 10))

# Plot 1: width vs plank
interaction_plot('width', 'plank', 'humidity', planks, axes[0, 0])
axes[0, 0].set_title("Width vs Plank")

# Plot 2: depth vs plank
interaction_plot('depth', 'plank', 'humidity', planks, axes[0, 1])
axes[0, 1].set_title("Depth vs Plank")

# Plot 3: width vs depth
interaction_plot('width', 'depth', 'humidity', planks, axes[1, 0])
axes[1, 0].set_title("Width vs Depth")

# Plot 4: depth vs width
interaction_plot('depth', 'width', 'humidity', planks, axes[1, 1])
axes[1, 1].set_title("Depth vs Width")

plt.tight_layout()
plt.show()


## Fixed model
First we fit a fixed-effects model in which the planks act as a blocking factor.
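For reference, the additive model with planks as blocks can be written out as below. The symbols are notational assumptions chosen here, not taken from the source: $\alpha_i$ is the effect of width $i$, $\beta_j$ the effect of depth $j$, and $\gamma_k$ the fixed block effect of plank $k$.

$$
y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_k + \varepsilon_{ijk}, \qquad \varepsilon_{ijk} \sim N(0, \sigma^2), \quad i = 1,\dots,3,\; j = 1,\dots,5,\; k = 1,\dots,20
$$

This corresponds to a formula of the form `humidity ~ C(width) + C(depth) + C(plank)`, with no interaction terms.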

In [None]:
#fixed model, using planks as blocking factor
fixed_model = smf.ols("humidity ~ C(width) + C(depth) + C(plank)", data=planks).fit()
print(fixed_model.summary())

In [None]:
anova_fixed = anova_lm(fixed_model)
print(anova_fixed)

In [None]:
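# Tukey HSD for Width
# Note: pairwise_tukeyhsd compares the width groups on the raw humidity values
# (a one-way comparison); plank and depth are not adjusted for.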
tukey_width = pairwise_tukeyhsd(endog=planks['humidity'],
                                groups=planks['width'],
                                alpha=0.05)
print(tukey_width)

In [None]:
# Tukey HSD for Depth
tukey_depth = pairwise_tukeyhsd(endog=planks['humidity'],
                                groups=planks['depth'],
                                alpha=0.05)
print(tukey_depth)

In [None]:
# Plot Tukey results for Width
tukey_width.plot_simultaneous()
plt.title("Tukey HSD for Width")
plt.show()

# Plot Tukey results for Depth
tukey_depth.plot_simultaneous()
plt.title("Tukey HSD for Depth")
plt.show()

## Mixed model
Next we fit a mixed model, first in Python and then also in R, where an ANOVA can be carried out for it.
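As a sketch of what a random-intercept model for this design looks like, the equation below treats the plank effect as random. The notation is an assumption made here for illustration: $\alpha_i$ and $\beta_j$ are the fixed width and depth effects, and $b_k$ is the random intercept of plank $k$.

$$
y_{ijk} = \mu + \alpha_i + \beta_j + b_k + \varepsilon_{ijk}, \qquad b_k \sim N(0, \sigma_b^2), \quad \varepsilon_{ijk} \sim N(0, \sigma^2)
$$

with all $b_k$ and $\varepsilon_{ijk}$ mutually independent. Under this model, two measurements on the same plank are correlated with intraclass correlation $\sigma_b^2 / (\sigma_b^2 + \sigma^2)$, while measurements on different planks are independent.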

In [None]:
mixed_model = smf.mixedlm("humidity ~ C(width) + C(depth)", planks, groups=planks['plank']).fit()
print(mixed_model.summary())


In [None]:
%load_ext rpy2.ipython

In [None]:
%%R
packages <- c("lme4", "car", "emmeans")
install.packages(packages)

In [None]:
%%R
library(lme4) # For mixed models
library(car)  # For ANOVA
library(emmeans) # For post hoc analysis

# Load data
url <- "https://raw.githubusercontent.com/francji1/01NAEX/main/data/planks.txt"
planks <- read.csv(url, sep = ",")
planks$plank <- as.factor(planks$plank)
planks$width <- as.factor(planks$width)
planks$depth <- as.factor(planks$depth)

In [None]:
%%R
mixed_model <- lmer(humidity ~ width + depth + (1 | plank), data = planks)
summary(mixed_model)

We see that the results from R and Python agree up to rounding.

In [None]:
%%R
anova <- Anova(mixed_model, type = "III")
print(anova)

In [None]:
%%R
emm_width <- emmeans(mixed_model, pairwise ~ width, adjust = "tukey")
print(emm_width)

In [None]:
%%R
emm_depth <- emmeans(mixed_model, pairwise ~ depth, adjust = "tukey")
print(emm_depth)

In [None]:
%%R
plot(emm_width$emmeans)

In [None]:
%%R
plot(emm_width$contrasts)

In [None]:
%%R
plot(emm_depth$emmeans)

With the mixed model, the pairwise comparisons were able to distinguish between the individual widths, which we could not do for the fixed model. The remaining results came out the same.
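One possible reason for this difference is that `pairwise_tukeyhsd` is given only the humidity values and one grouping column, so plank-to-plank and depth variation stay in its error term, whereas the `emmeans` contrasts come from a model that accounts for planks. The sketch below illustrates the size of that effect on synthetic data only: the effect sizes, noise level, and seed are invented for illustration and are not estimates from the planks data.

```python
import random
from itertools import product
from statistics import mean

random.seed(0)

plank_levels = range(1, 21)
width_levels = (1, 2, 3)
depth_levels = (1, 3, 5, 7, 9)

# Made-up effects (assumptions for illustration, not estimates from the data).
width_effect = {1: -0.2, 2: 0.0, 3: 0.2}
depth_effect = {1: -0.5, 3: 0.2, 5: 0.6, 7: 0.2, 9: -0.5}
plank_effect = {p: 0.2 * (p - 10.5) for p in plank_levels}

# One synthetic observation per (plank, width, depth) cell.
data = {
    (p, w, d): 7.0 + plank_effect[p] + width_effect[w] + depth_effect[d]
    + random.gauss(0.0, 0.2)
    for p, w, d in product(plank_levels, width_levels, depth_levels)
}


def level_means(position):
    """Mean response for each level of the factor stored at `position` in the key."""
    levels = sorted({key[position] for key in data})
    return {lv: mean(y for key, y in data.items() if key[position] == lv)
            for lv in levels}


n = len(data)
grand = mean(data.values())
plank_mean, width_mean, depth_mean = level_means(0), level_means(1), level_means(2)

# One-way view: only width is modelled, everything else ends up in the error.
sse_oneway = sum((y - width_mean[w]) ** 2 for (p, w, d), y in data.items())
mse_oneway = sse_oneway / (n - len(width_levels))

# Additive view (balanced design): residuals after removing plank, width and depth means.
sse_additive = sum(
    (y - plank_mean[p] - width_mean[w] - depth_mean[d] + 2 * grand) ** 2
    for (p, w, d), y in data.items()
)
df_additive = n - 1 - (20 - 1) - (3 - 1) - (5 - 1)
mse_additive = sse_additive / df_additive

# Standard error of a difference between two width means under each view.
n_per_width = n // len(width_levels)
se_oneway = (2 * mse_oneway / n_per_width) ** 0.5
se_additive = (2 * mse_additive / n_per_width) ** 0.5
print(f"one-way:  MSE={mse_oneway:.3f}, SE(diff)={se_oneway:.3f}")
print(f"additive: MSE={mse_additive:.3f}, SE(diff)={se_additive:.3f}")
```

In this synthetic setting the standard error of a width difference is much smaller once plank and depth are removed from the error term, which suggests that the contrast between the two post hoc analyses may stem at least partly from the comparison procedure rather than from treating planks as fixed or random.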