Beta animation

DominiqueMakowski · Jul 12, 2024 · 704fd11 · 704fd11
1 parent bb11b17
commit 704fd11
Show file tree

Hide file tree

Showing 8 changed files with 356 additions and 86 deletions.
diff --git a/content/3_scales.qmd b/content/3_scales.qmd
@@ -4,9 +4,30 @@
 
 In many situations, the dependent variable is not continuous, nor is it even normally distributed.
 
-## Slider (aka "Analog") Scales 
+## Bounded Variables
 
-### Beta Models 
+While might be tempted to believe that most variables in psychological science are true **continuous** variables, this is often not the case. 
+In fact, many variables are **bounded**: there values are delimited by hard bounds. 
+This is typically the case for slider (aka "analog" scakes), dimensions scores (average or sum) from multiple Likert scales, percentages, proportions, etc.
+
+Most psychometric indices are bounded. 
+For instance, the minimum and maximum values for the IQ test (WAIS-IV) are 45-155. It is 0 and 63 for the depression scale BDI-II, 20 and 80 for the STAI.
+
+Despite this fact, we still most often use **linear models** to analyze these data, which is not ideal as it assumes that the dependent variable is continuous and normally distributed.
+
+### The Problem with Linear Models
+
+
+
+### Rescaling
+
+```{julia}
+#| code-fold: false
+
+
+```
+
+### Beta Models
 
 ```{julia}
 #| code-fold: false
@@ -15,19 +36,25 @@ using Distributions, Random
 using Turing
 
 # Reparameterized Beta distribution
-function MeanVarBeta(μ, σ²)
-    if σ² <= 0 || σ² >= μ * (1 - μ)
-        error("Variance σ² must be in the interval (0, μ*(1-μ)=$(μ*(1-μ))).")
+function BetaMean(μ, σ)
+    if σ <= 0 || σ >= sqrt(μ * (1 - μ))
+        error("Standard deviation σ must be in the interval (0, sqrt(μ*(1-μ))=$(sqrt(μ*(1-μ)))).")
     end
-
-    ν = μ * (1 - μ) / σ² - 1
+    ν = μ * (1 - μ) / σ^2 - 1
     α = μ * ν
     β = (1 - μ) * ν
 
     return Beta(α, β)
 end
 ```
 
+::: {.callout-caution}
+TODO: Replace if it is supported somewhere (e.g., [Distributions.jl](https://github.com/JuliaStats/Distributions.jl/issues/1877))
+:::
+
+
+![](media/scales_BetaMean.gif)
+
 ```{julia}
 #| eval: false
 
@@ -42,7 +69,6 @@ chains = sample(model_Beta(rand(MeanVarBeta(0.5, 0.2), 200)), NUTS(), 500;
    initial_params=[0.5, 0.1])
 ```
 
-- https://github.com/JuliaStats/Distributions.jl/issues/1877
 
 
 ###  OrdBeta Models

diff --git a/content/4a_rt_descriptive.qmd b/content/4a_rt_descriptive.qmd
@@ -572,7 +572,7 @@ The `loo_compare()` function orders models from best to worse based on their ELP
 As one can see, traditional linear models perform terribly.
 
 
-## Other Models
+<!-- ## Other Models
 
 Other models are available to fit RT data, that we will demonstrate below for reference purposes.
 However, we won't be explaining them here, as we will revisit them in the next chapter in the context of choice modeling.
@@ -587,4 +587,4 @@ TODO.
 
 ### Racing Diffusion Model (RDM)
 
-TODO.
+TODO. -->
diff --git a/content/4b_rt_generative.qmd b/content/4b_rt_generative.qmd
@@ -49,9 +49,12 @@ While such "generative" models offer potential insights into the cognitive proce
 Interestingly, the **Wald** model is actually a special case of a more general type called the **Drift Diffusion Model (DDM)** (named as such because the evidence accumulation is assumed to be a "diffusion" process, i.e., a random walk). 
 One of the main difference is that in the Wald model, the drift rate $\nu$ must be *positive*, as it tracks the time taken by the diffusion processes to reach only one "positive" threshold $\alpha$.
 
-But what happens if we relax this and allow the drift rate to be null or negative? Many traces might never reach the upper threshold, and might instead reach high "negative" values.
+But what happens if we relax this and allow the drift rate to be null or negative? Many traces might never reach the upper threshold, and might instead reach high "negative" values. 
+The idea is then to also consider a **lower threshold**, which would correspond to a different response.
+Finally, one could also modulate the starting point of the process relative to the thresholds (i.e., how close to the upper or lower threshold the accumulation process starts).
+This is, in essence, the core idea behind the **Drift Diffusion Model**.
 
-Drift Diffusion Models are useful to **jointly model RTs and a binary outcome**, such as 2 different choices or accuracy (i.e., "correct" vs. "error").
+Drift Diffusion Models are useful to **jointly model RTs and a binary outcome**, such as 2 different choices or accuracy (e.g., "correct" vs. "error"). They allow, in theory, to delineate various cognitive processes, such as the **speed** of evidence accumulation (drift rate), the **amount of evidence** required to make a decision (threshold), and the starting point of the accumulation process (bias).
 
 ![](media/rt_ddm.gif)
 
@@ -96,3 +99,4 @@ TODO.
 - [**Lindelov's overview of RT models**](https://lindeloev.github.io/shiny-rt/): An absolute must-read.
 - [**De Boeck & Jeon (2019)**](https://www.frontiersin.org/articles/10.3389/fpsyg.2019.00102/full): A paper providing an overview of RT models.
 - [https://github.com/vasishth/bayescogsci](https://github.com/vasishth/bayescogsci)
+- [An expert guide to planning experimental tasks for evidence accumula6on modelling (Boag, 2024)](https://osf.io/preprints/psyarxiv/snqgp)
diff --git a/content/media/animations_scales.jl b/content/media/animations_scales.jl
@@ -25,111 +25,93 @@ end
 # BetaMod =======================================================================================
 using Turing, Distributions, Random
 
-# Reparameterized Beta distribution
-function MeanVarBeta(μ, σ²)
-    if σ² <= 0 || σ² >= μ * (1 - μ)
-        error("Variance σ² must be in the interval (0, μ*(1-μ)=$(μ*(1-μ))).")
+# # Reparameterized Beta distribution
+# function MeanVarBeta(μ, σ²)
+#     if σ² <= 0 || σ² >= μ * (1 - μ)
+#         error("Variance σ² must be in the interval (0, μ*(1-μ)=$(μ*(1-μ))).")
+#     end
+
+#     ν = μ * (1 - μ) / σ² - 1
+#     α = μ * ν
+#     β = (1 - μ) * ν
+
+#     return Beta(α, β)
+# end
+# var(MeanVarBeta(0.3, 0.1))
+# mean(MeanVarBeta(0.3, 0.1))
+
+function BetaMean(μ, σ)
+    if σ <= 0 || σ >= sqrt(μ * (1 - μ))
+        error("Standard deviation σ must be in the interval (0, sqrt(μ*(1-μ))=$(sqrt(μ*(1-μ)))).")
     end
-
-    ν = μ * (1 - μ) / σ² - 1
+    ν = μ * (1 - μ) / σ^2 - 1
     α = μ * ν
     β = (1 - μ) * ν
 
     return Beta(α, β)
 end
-var(MeanVarBeta(0.3, 0.1))
-mean(MeanVarBeta(0.3, 0.1))
+std(BetaMean(0.3, 0.2))
+mean(BetaMean(0.3, 0.2))
+
 
 
 # Range of possible parameters
 fig = Figure()
-ax = Axis(fig[1, 1], xlabel="μ", ylabel="variance σ²")
-for μ in range(0.0, 1, length=200)
-    for σ in range(0, 1, length=200)
-        x = range(0, 1, length=100)
-        try
-            y = pdf.(MeanVarBeta(μ, σ), x)
-            scatter!(ax, μ, σ, color=:red)
-        catch
-            continue
-        end
-    end
-end
-ylims!(ax, 0, 0.5)
-# ablines!(ax, [0, 1], [1, -1]; color=:black)
-xaxis = range(0, 1, length=1000)
-lines!(ax, xaxis, xaxis .* (1 .- xaxis); color=:black)
-fig
-
-
-using Turing, Distributions, Random
-
-# Reparameterized Beta distribution
-function MeanVarBeta(μ, σ²)
-    if σ² <= 0 || σ² >= μ * (1 - μ)
-        error("Variance σ² must be in the interval (0, μ*(1-μ)=$(μ*(1-μ))).")
-    end
-
-    ν = μ * (1 - μ) / σ² - 1
-    α = μ * ν
-    β = (1 - μ) * ν
-
-    return Beta(α, β)
-end
-
-@model function model_Beta(x)
-    μ ~ Beta(1, 1)
-    σ ~ Uniform(eps(typeof(μ)), μ * (1 - μ) - eps(typeof(μ)))
-    for i in 1:length(x)
-        x[i] ~ MeanVarBeta(μ, σ)
-    end
-end
-chains = sample(model_Beta(rand(MeanVarBeta(0.5, 0.2), 200)), NUTS(), 500; initial_params=[0.5, 0.1])
-
 
+μ = Observable(0.5)
+σ = Observable(0.05)
 
+ax1 = Axis(fig[1, 1], title="Possible parameter range", xlabel="μ", ylabel="σ", yticks=range(0, 0.5, 6))
+ylims!(ax1, 0, 0.6)
+# ablines!(ax, [0, 1], [1, -1]; color=:black)
+xaxis = range(0, 1, length=1000)
+yaxis = sqrt.(xaxis .* (1 .- xaxis))
+band!(ax1, xaxis, 0, yaxis, color=:orange)
+lines!(ax1, xaxis, yaxis; color=:red)
+text!(ax1, 0, 0.55, text="max. σ = √(μ * (1 - μ))", align=(:left, :center), color=:red)
+scatter!(ax1, @lift([$μ]), @lift([$σ]); color="#2196F3", marker=:xcross, markersize=15)
 
+fig
 
-μ = Observable(0.5)
-σ = Observable(0.1)
 
-# Initialize the figure
-fig = Figure()
-ax = Axis(
-    fig[1, 1],
-    title=@lift("BetaMod(μ = $(round($μ, digits = 1)), σ =  $(round($σ, digits = 2)))"),
+ax2 = Axis(
+    fig[1, 2],
+    title=@lift("BetaMean(μ = $(round($μ, digits = 1)), σ =  $(round($σ, digits = 2)))"),
     xlabel="Score",
     ylabel="Distribution",
     yticksvisible=false,
     xticksvisible=false,
     yticklabelsvisible=false,
 )
-ylims!(ax, 0, 10)
+ylims!(ax2, 0, 10)
+xlims!(ax2, 0, 1)
 
-x = range(0, 1, length=100)
-y = @lift(pdf.(MeanVarBeta($μ, $σ), x))
+x = range(0, 1, length=1000)
+y = @lift(pdf.(BetaMean($μ, $σ), x))
 
-lines!(ax, x, y)
+band!(ax2, x, 0, y, color="#2196F3")
 fig
 
 function make_animation(frame)
-    if frame < 0.15
-        μ[] = change_param(frame; frame_range=(0.0, 0.15), param_range=(0.5, 0.9))
+    if frame < 0.20
+        σ[] = change_param(frame; frame_range=(0.0, 0.20), param_range=(0.05, 0.46))
     end
     if frame >= 0.25 && frame < 0.45
-        μ[] = change_param(frame; frame_range=(0.25, 0.45), param_range=(0.9, 0.1))
+        μ[] = change_param(frame; frame_range=(0.25, 0.45), param_range=(0.5, 0.2))
+        σ[] = change_param(frame; frame_range=(0.25, 0.45), param_range=(0.46, 0.2))
     end
-    if frame >= 0.55 && frame < 0.65
-        σ[] = change_param(frame; frame_range=(0.55, 0.65), param_range=(0.1, 0.3))
+    if frame >= 0.50 && frame < 0.65
+        μ[] = change_param(frame; frame_range=(0.50, 0.65), param_range=(0.2, 0.85))
+        σ[] = change_param(frame; frame_range=(0.50, 0.65), param_range=(0.2, 0.1))
     end
     # Return to normal
     if frame >= 0.7 && frame < 0.9
-        μ[] = change_param(frame; frame_range=(0.7, 0.9), param_range=(0.1, 0.5))
-        σ[] = change_param(frame; frame_range=(0.7, 0.9), param_range=(0.3, 0.1))
+        μ[] = change_param(frame; frame_range=(0.7, 0.9), param_range=(0.85, 0.5))
+        σ[] = change_param(frame; frame_range=(0.7, 0.9), param_range=(0.1, 0.05))
     end
 end
 
 # animation settings
 frames = range(0, 1, length=120)
-record(make_animation, fig, "scales_MeanVarBeta.gif", frames; framerate=20)
+record(make_animation, fig, "scales_BetaMean.gif", frames; framerate=10)
 
diff --git a/content/media/scales_BetaMean.gif b/content/media/scales_BetaMean.gif
diff --git a/content/media/scales_betamod.gif b/content/media/scales_betamod.gif
diff --git a/data/make_data.R b/data/make_data.R
@@ -47,10 +47,25 @@ df <- df |>
 
 df <- df[complete.cases(df),]
 
-# df |>
-#   group_by(Gender) |>
-#   summarize(Neuroticism = mean(Neuroticism),
-#             Agreeableness = mean(Agreeableness))
 
 write.csv(df, "harman1967.csv", row.names = FALSE)
 
+
+
+# Patho Personality -------------------------------------------------------------
+
+
+df <- read.csv("https://raw.githubusercontent.com/RealityBending/IllusionGameValidation/main/data/study3.csv") |>
+  select(starts_with("IPIP6_"), starts_with("PID5_"), -PID5_SD, -IPIP6_SD) |>
+  mutate(across(starts_with("IPIP6_"), \(x) datawizard::rescale(x, range=c(0, 100), to=c(1, 7))),
+         across(starts_with("PID5_"), \(x) datawizard::rescale(x, range=c(0, 100), to=c(0, 3))) )
+
+df <- df[complete.cases(df),]
+
+ggplot(df, aes(x=PID5_Disinhibition)) +
+  geom_histogram(bins = 60)
+
+write.csv(df, "makowski2023.csv", row.names = FALSE)
+
+
+