# 01NAEX Exercise 10


author: Sabina Nešněrová


## Drying of beech wood planks

To investigate the effect of drying on the humidity of beech wood, the following experiment was conducted. Each of 20 planks was dried for a certain period of time. The humidity percentage was then measured at 5 depths (1, 3, 5, 7, 9) and 3 widths (1, 2, 3) on each plank.

**Source:** The Royal Veterinary and Agricultural University, Denmark.


**Variables:**
* plank - Numbered 1-20
* width - Numbered 1, 2, 3
* depth - Numbered 1, 3, 5, 7, 9
* humidity - Humidity percentage

**Number of observations:** 300 (20 planks)

**Description:**
* depth 1: close to the top
* depth 3: between 1 and 5
* depth 5: in the center
* depth 7: between 5 and 9
* depth 9: close to the bottom
* width 1: close to the side
* width 2: between 1 and 3
* width 3: in the center
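
The layout described above implies a fully crossed design: 20 planks × 3 widths × 5 depths = 300 measurements, one per cell. A minimal sketch of that grid, built only from the level labels listed above (the name `design` is illustrative and not part of the dataset):

```python
from itertools import product

import pandas as pd

# Level labels taken from the variable description above.
plank_levels = range(1, 21)
width_levels = [1, 2, 3]
depth_levels = [1, 3, 5, 7, 9]

# One row per (plank, width, depth) cell of the fully crossed design.
design = pd.DataFrame(
    list(product(plank_levels, width_levels, depth_levels)),
    columns=["plank", "width", "depth"],
)

n_cells = len(design)
# Each width x depth combination should appear once per plank.
cell_counts = design.groupby(["width", "depth"]).size()
print(n_cells, cell_counts.unique())
```

Comparing `design` with the loaded data, for example through an outer merge, would reveal missing or duplicated cells. That check is a suggestion only and is not required by the exercise.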


**Analyze data from the Drying of beech wood planks:**

* Plot four average humidity profiles: 2 interaction plots for width and 2 for depth (done).
* Carry out the fixed effects model analysis.
* Carry out the mixed model analysis.
* Run the post hoc analysis.
* Compare the fixed parameters and use the p-value correction (TukeyHSD).
 (In R: use the function `lsmeans` from the package `lsmeans` with `adjust="tukey"`.)
* Summarize the results.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
from itertools import product
import statsmodels.formula.api as smf


In [None]:
planks = pd.read_csv("https://raw.githubusercontent.com/francji1/01NAEX/main/data/planks.txt",sep=",")
planks

In [None]:
planks['plank'] = planks['plank'].astype('category')
planks['width'] = planks['width'].astype('category')
planks['depth'] = planks['depth'].astype('category')


In [None]:
planks.describe()

In [None]:
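def interaction_plot(x, trace, response, data, ax):
    """Plot mean `response` against the levels of `x`, one line per level of `trace`."""
    categories_x = data[x].cat.categories
    categories_trace = data[trace].cat.categories

    for trace_level in categories_trace:
        subset = data[data[trace] == trace_level]
        # observed=False keeps every level of x, so means align with categories_x
        means = subset.groupby(x, observed=False)[response].mean()
        ax.plot(categories_x, means, label=f"{trace}: {trace_level}")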

    ax.set_xlabel(x)
    ax.set_ylabel(response)
    ax.legend(title=trace, loc='upper left', bbox_to_anchor=(1, 1), fontsize='small')  # Adjust legend position
    ax.grid(True)


In [None]:

# Creating the 2x2 subplot layout
fig, axes = plt.subplots(2, 2, figsize=(12, 10))

# Plot 1: width vs plank
interaction_plot('width', 'plank', 'humidity', planks, axes[0, 0])
axes[0, 0].set_title("Width vs Plank")

# Plot 2: depth vs plank
interaction_plot('depth', 'plank', 'humidity', planks, axes[0, 1])
axes[0, 1].set_title("Depth vs Plank")

# Plot 3: width vs depth
interaction_plot('width', 'depth', 'humidity', planks, axes[1, 0])
axes[1, 0].set_title("Width vs Depth")

# Plot 4: depth vs width
interaction_plot('depth', 'width', 'humidity', planks, axes[1, 1])
axes[1, 1].set_title("Depth vs Width")

plt.tight_layout()
plt.show()


In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

plt.figure(figsize=(12, 5))

# --- BOXPLOT for width ---
plt.subplot(1, 2, 1)
sns.boxplot(x="width", y="humidity", data=planks)
plt.title("Boxplot: Humidity vs Width")
plt.xlabel("Width")
plt.ylabel("Humidity")

# --- BOXPLOT for depth ---
plt.subplot(1, 2, 2)
sns.boxplot(x="depth", y="humidity", data=planks)
plt.title("Boxplot: Humidity vs Depth")
plt.xlabel("Depth")
plt.ylabel("Humidity")

plt.tight_layout()
plt.show()

The plots suggest that the humidity of the planks varies with both width and depth. The highest moisture content is observed in the central part of the planks, while the surface layers are noticeably drier.
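
To put numbers behind the visual impression, the marginal means per factor level can be tabulated. The sketch below uses a small made-up data frame `toy` in place of `planks` (its values are invented for illustration only), and the helper name `marginal_means` is hypothetical.

```python
import pandas as pd

def marginal_means(data, factor, response="humidity"):
    """Mean of `response` at each level of `factor`, ignoring the other factors."""
    return data.groupby(factor, observed=True)[response].mean()

# Invented toy values, NOT the beech wood measurements.
toy = pd.DataFrame({
    "depth":    [1, 1, 5, 5, 9, 9],
    "width":    [1, 3, 1, 3, 1, 3],
    "humidity": [4.0, 5.0, 7.0, 8.0, 4.5, 5.5],
})

depth_means = marginal_means(toy, "depth")
width_means = marginal_means(toy, "width")
print(depth_means)
print(width_means)
```

With the real data, `marginal_means(planks, "depth")` would give the average humidity profile through the thickness of the plank.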

In [None]:
import statsmodels.api as sm
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

# 1. Fixed-effects model (two-way ANOVA)
# Fixed effects: C(width), C(depth) and their interaction C(width):C(depth)
fixed_model_formula = 'humidity ~ C(width) * C(depth)'
fixed_model = smf.ols(formula=fixed_model_formula, data=planks).fit()
print(fixed_model.summary())
print("\n" + "="*50 + "\n")
print("## 1. Results for Fixed Effects Model (ANOVA)")
# Type III sums of squares (often used for models with interactions).
# Note: with the default treatment coding, Type III main-effect tests
# depend on the reference level of the other factor.
fixed_anova = anova_lm(fixed_model, typ=3)
print(fixed_anova)
print("\n" + "="*50 + "\n")

The fixed-effects model shows that depth is the only significant factor affecting humidity. The surface depths (1 and 9) have similarly low humidity, whereas the inner depths (3, 5, and 7) are significantly more humid, which is expected because the core dries more slowly.
Width has no significant effect, and the interaction between width and depth is not significant.
The model explains only about 28% of the variability in humidity. Because plank is not included in this model, differences between planks are a likely source of the unexplained variation.
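
In symbols, the fixed-effects model fitted above can be written as follows. The notation is chosen here for illustration and does not come from the exercise sheet:

$$
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk},
\qquad \varepsilon_{ijk} \sim N(0, \sigma^2),
$$

where $y_{ijk}$ is the humidity at width $i$ and depth $j$ on plank $k$, $\alpha_i$ is the width effect, $\beta_j$ the depth effect, and $(\alpha\beta)_{ij}$ their interaction. Plank does not appear as a term, so any plank-to-plank variation is absorbed by $\varepsilon_{ijk}$.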

In [None]:
# 2. Fit Mixed Effects Model
# Formula: humidity ~ C(width) * C(depth) - Fixed Effects
# groups='plank' - Random Intercept for each plank (Random Effect)
mixed_model_formula = 'humidity ~ C(width) * C(depth)'
mixed_model = smf.mixedlm(mixed_model_formula, data=planks, groups=planks["plank"]).fit()

print("\n--- Mixed Effects Model (LMM) Results ---")
print(mixed_model.summary())

In the mixed-effects model, plank is treated as a random effect (a random intercept), which captures differences between individual planks. Depth remains strongly significant, with higher humidity in the center and no difference between the two surface depths.
The interaction between width and depth is not significant, and width is significant only for level 2, with a small effect.
The estimated random-effect variance (0.98) indicates substantial variability between planks, which supports the use of a mixed model.
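
One common way to judge the size of the between-plank variance is the intraclass correlation, the share of the random variation attributable to planks: ICC = σ²_plank / (σ²_plank + σ²_residual). The sketch below is a plain helper with a hypothetical name, and the residual variance in the example call is a placeholder, not a value from the fitted model. With the fitted `mixed_model`, the two components should be available as `mixed_model.cov_re.iloc[0, 0]` and `mixed_model.scale`.

```python
def intraclass_correlation(var_group, var_resid):
    """Share of random variation attributable to the grouping factor."""
    if var_group < 0 or var_resid < 0:
        raise ValueError("variance components must be non-negative")
    total = var_group + var_resid
    if total == 0:
        raise ValueError("total variance is zero")
    return var_group / total

# 0.98 is the plank variance reported above; the residual variance of 1.0
# is a PLACEHOLDER for illustration, not an estimate from the data.
icc_example = intraclass_correlation(0.98, 1.0)
print(round(icc_example, 3))
```

An ICC close to 0 would mean that planks hardly differ from one another, while a value close to 1 would mean that most of the random variation lies between planks.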

In [None]:
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# 3. Post-hoc Analysis (Tukey's HSD) for the Width × Depth Interaction
# Create a new factor representing all combinations of width and depth
planks["width_depth"] = (
    planks["width"].astype(str) + "x" + planks["depth"].astype(str))

# Run Tukey's HSD on the combined factor.
# The design is balanced, so all groups have equal size. Note that this test
# works on the raw cell means and does not account for the plank random effect.
tukey_results = pairwise_tukeyhsd(
    endog=planks["humidity"],
    groups=planks["width_depth"],
    alpha=0.05)

print("\n--- Post-hoc Analysis (Tukey's HSD) for Width × Depth Interaction ---")
print(tukey_results)

In [None]:
# Use the public summary table (header row followed by one row per pair)
tukey_table = tukey_results.summary().data
tukey_df = pd.DataFrame(data=tukey_table[1:], columns=tukey_table[0])

# Filter only significant comparisons
significant_pairs = tukey_df[tukey_df["p-adj"].astype(float) < 0.05]

print("\n--- Significant Tukey HSD Comparisons (reject = True) ---")
print(significant_pairs)

This test compared all combinations of width and depth. The significant pairs consistently show that humidity is higher at the inner depths (3, 5, and 7) compared to the surface depths (1 and 9) across all widths.
Overall, the significant comparisons confirm the strong depth effect, with the center of the plank retaining much more moisture than the surface layers.
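
Because the combined factor has 3 × 5 = 15 levels, Tukey's procedure adjusts for every pair of cells. The short sketch below enumerates those pairs, with labels built the same way as the `width_depth` column, and counts how many of them contrast a surface depth with an inner depth. The grouping into "surface" and "inner" follows the description in the text, and the helper `depth_of` is illustrative.

```python
from itertools import combinations, product

widths = [1, 2, 3]
depths = [1, 3, 5, 7, 9]

# Same labelling scheme as the `width_depth` column: "<width>x<depth>".
cells = [f"{w}x{d}" for w, d in product(widths, depths)]
pairs = list(combinations(cells, 2))

surface = {1, 9}  # depths described as close to the top / bottom

def depth_of(label):
    """Extract the depth from a '<width>x<depth>' label."""
    return int(label.split("x")[1])

# Pairs in which exactly one of the two cells lies at a surface depth.
surface_vs_inner = [
    (a, b) for a, b in pairs
    if (depth_of(a) in surface) != (depth_of(b) in surface)
]

print(len(cells), len(pairs), len(surface_vs_inner))
```

Restricting attention to a pre-specified subset of these pairs would reduce the multiplicity burden, but that would be a different analysis from the all-pairs comparison used here.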

## Conclusions

The analysis of the wood plank data used a fixed-effects model, a mixed-effects model, and post-hoc comparisons. Width and depth were treated as fixed factors, while the individual planks were modeled as a random effect.
The results show that drying is symmetrical through the thickness of the plank: the surface layers are noticeably drier, while the center retains significantly more moisture. The effect of width on humidity is weak, with only small differences between its levels.
The highest humidity occurs consistently at depths 3, 5, and 7, especially in combinations with widths 1 and 2, confirming that moisture is concentrated toward the core of the plank.
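
The symmetry claim can be checked directly by comparing the mean humidity at mirrored depths (1 vs 9, 3 vs 7). The sketch below assumes a data frame with `depth` and `humidity` columns like `planks`; the helper name `mirror_gaps` and the toy values are made up for illustration.

```python
import pandas as pd

def mirror_gaps(data, pairs=((1, 9), (3, 7)), response="humidity"):
    """Difference in mean response between mirrored depths (upper minus lower)."""
    # Cast to int so the lookup also works when `depth` is categorical.
    means = data.groupby(data["depth"].astype(int))[response].mean()
    return {pair: means.loc[pair[0]] - means.loc[pair[1]] for pair in pairs}

# Invented toy values, NOT the beech wood measurements.
toy = pd.DataFrame({
    "depth":    [1, 3, 5, 7, 9, 1, 3, 5, 7, 9],
    "humidity": [4.0, 6.0, 7.0, 6.5, 4.0, 5.0, 7.0, 8.0, 7.5, 5.0],
})

gaps = mirror_gaps(toy)
print(gaps)
```

Gaps close to zero for both pairs would be consistent with symmetrical drying. With the real data the call would be `mirror_gaps(planks)`.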