node specific max.parents not implemented for method = "mle" #55

Open
matteodelucchi opened this issue Feb 23, 2024 · 1 comment

Comments

@matteodelucchi
Contributor

Issue description

The maximum number of allowed parents can be set individually per node only with method = "bayes". With method = "mle", passing a per-node list to max.parents is not implemented.
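For illustration, here is a minimal sketch of how a scalar or per-node max.parents could be brought into one common form before the score cache is built. The helper normalize_max_parents() is hypothetical and not part of abn; it only shows the two input shapes used in the MRE.

```r
# Hypothetical helper (not part of abn): expand `max.parents` to one limit per node.
normalize_max_parents <- function(max.parents, nodes) {
  if (is.list(max.parents)) {
    # Per-node form: every node must be named exactly once.
    if (!setequal(names(max.parents), nodes) || anyDuplicated(names(max.parents))) {
      stop("max.parents must name every node exactly once")
    }
    limits <- unlist(max.parents[nodes])
  } else if (is.numeric(max.parents) && length(max.parents) == 1) {
    # Scalar form: the same limit applies to every node.
    limits <- setNames(rep(max.parents, length(nodes)), nodes)
  } else {
    stop("max.parents must be a single number or a named list")
  }
  # A node cannot have more parents than there are other nodes.
  pmin(limits, length(nodes) - 1)
}

nodes <- c("G1", "G2", "B1", "B2")
scalar_limits <- normalize_max_parents(2, nodes)
per_node_limits <- normalize_max_parents(list(G1 = 0, G2 = 2, B1 = 0, B2 = 3), nodes)
```

With a common per-node vector, both methods could in principle share the same downstream logic; whether that fits the current code layout is an open question.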

MRE

### Load the package that provides buildScoreCache()
library(abn)

### Generate data
# Set seed for reproducibility
set.seed(123)

# Number of groups
n_groups <- 5

# Number of observations per group
n_obs_per_group <- 100

# Total number of observations
n_obs <- n_groups * n_obs_per_group

# Simulate group effects
group <- factor(rep(1:n_groups, each = n_obs_per_group))
group_effects <- rnorm(n_groups)

# Simulate variables
G1 <- rnorm(n_obs) + group_effects[group]
B1 <- rbinom(n_obs, 1, plogis(group_effects[group]))
G2 <- 1.5 * B1 + 0.7 * G1 + rnorm(n_obs) + group_effects[group]
B2 <- rbinom(n_obs, 1, plogis(2 * G2 + group_effects[group]))

# Create data frame
data <- data.frame(group = group, G1 = G1, G2 = G2, B1 = factor(B1), B2 = factor(B2))

# Look at data
str(data)
summary(data)

######
# Reproduce issue
######
### method = "mle"
# OK: Build the score cache with 2 parents for each variable
score_cache <- buildScoreCache(data.df = data,
                               data.dists = list(G1 = "gaussian", 
                                                 G2 = "gaussian", 
                                                 B1 = "binomial", 
                                                 B2 = "binomial"),
                               group.var = "group",
                               max.parents = 2,
                               method = "mle")

# BUG: Build the score cache with different number of parents for each variable
score_cache <- buildScoreCache(data.df = data,
                               data.dists = list(G1 = "gaussian", 
                                                 G2 = "gaussian", 
                                                 B1 = "binomial", 
                                                 B2 = "binomial"),
                               group.var = "group",
                               max.parents = list(G1 = 0, G2 = 2, B1 = 0, B2 = 3),
                               method = "mle")

### method = "bayes"
# OK: Build the score cache with different number of parents for each variable
score_cache <- buildScoreCache(data.df = data,
                               data.dists = list(G1 = "gaussian", 
                                                 G2 = "gaussian", 
                                                 B1 = "binomial", 
                                                 B2 = "binomial"),
                               group.var = "group",
                               max.parents = list(G1 = 0, G2 = 2, B1 = 0, B2 = 3),
                               method = "bayes")


@matteodelucchi matteodelucchi added the bug Something isn't working label Feb 23, 2024
@matteodelucchi matteodelucchi self-assigned this Feb 23, 2024
@matteodelucchi
Contributor Author

I don't fully understand why the parent combination matrix defn.res is built differently in buildScoreCache.mle() and buildScoreCache.bayes().
The Bayes case uses the C code in buildscorecache.c. It looks reasonable, though not very efficient (it iterates twice over the same nested for loops), and it already handles a different max.parents per node.
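To make the idea concrete, here is a rough base R sketch of a parent combination matrix that respects a separate limit per node. It is not the code from buildscorecache.c or buildScoreCache.mle(); the function build_parent_combinations() and the returned layout (a children vector plus a 0/1 node.defn matrix) are assumptions for illustration only.

```r
# Hypothetical sketch (not abn's code): enumerate parent sets per child,
# honouring a separate parent limit for each node.
build_parent_combinations <- function(nodes, limits) {
  n <- length(nodes)
  children <- integer(0)
  rows <- list()
  for (child in seq_len(n)) {
    candidates <- setdiff(seq_len(n), child)
    # Never ask for more parents than there are candidate nodes.
    max_k <- min(limits[[child]], length(candidates))
    for (k in 0:max_k) {
      if (k == 0) {
        sets <- list(integer(0))  # the empty parent set
      } else {
        # Combine positions, then map back to node indices; this avoids
        # combn() treating a single number as seq_len(number).
        idx <- combn(length(candidates), k, simplify = FALSE)
        sets <- lapply(idx, function(i) candidates[i])
      }
      for (s in sets) {
        row <- integer(n)
        row[s] <- 1L
        rows[[length(rows) + 1]] <- row
        children <- c(children, child)
      }
    }
  }
  node.defn <- do.call(rbind, rows)
  colnames(node.defn) <- nodes
  list(children = children, node.defn = node.defn)
}

nodes <- c("G1", "G2", "B1", "B2")
limits <- c(G1 = 0, G2 = 2, B1 = 0, B2 = 3)
res <- build_parent_combinations(nodes, limits)
```

Each row of node.defn marks the parents of the node given in children; nodes with a limit of 0 contribute only the empty parent set.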

Thoughts:

  1. Is the Bayes variant with buildscorecache.c limited by not handling multinomial variables?
  2. If yes, can we easily extend it? E.g. by following the approach in buildScoreCache.mle(), which splits multinomial variables into their levels after the parent combination matrix has been created. This would be a first step towards extending the Bayes framework to multinomials -> make a separate issue/milestone.
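One possible reading of "splitting into levels after the matrix has been created", sketched in base R. The function expand_levels(), the levels_by_node argument and the column naming are all hypothetical; how buildScoreCache.mle() actually does this may differ.

```r
# Hypothetical sketch (not abn's code): after the parent combination matrix
# exists, copy the 0/1 indicator of each multinomial variable into one column
# per factor level. Variables without an entry in `levels_by_node` stay as is.
expand_levels <- function(node.defn, levels_by_node) {
  blocks <- lapply(colnames(node.defn), function(v) {
    lv <- levels_by_node[[v]]
    if (is.null(lv)) {
      return(node.defn[, v, drop = FALSE])
    }
    block <- node.defn[, rep(v, length(lv)), drop = FALSE]
    colnames(block) <- paste0(v, lv)
    block
  })
  do.call(cbind, blocks)
}

# Toy matrix: three parent combinations over variables A, B and multinomial M.
node.defn <- matrix(c(0, 1, 0,
                      0, 0, 1,
                      1, 0, 1),
                    nrow = 3, byrow = TRUE,
                    dimnames = list(NULL, c("A", "B", "M")))
expanded <- expand_levels(node.defn, list(M = c("a", "b", "c")))
```

The point of doing the expansion last is that the enumeration of parent sets stays at the variable level, so per-node limits count variables rather than dummy columns.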

@matteodelucchi matteodelucchi added the help wanted Extra attention is needed label Feb 28, 2024
@matteodelucchi matteodelucchi transferred this issue from another repository Apr 9, 2024
@matteodelucchi matteodelucchi transferred this issue from furrer-lab/shuttle-abn Apr 9, 2024