Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add hurdle_cumulative family to brms #1448

Merged
merged 7 commits into from Jan 16, 2023
Merged

Conversation

sjwild
Copy link
Contributor

@sjwild sjwild commented Jan 14, 2023

This pull request implements the hurdle_cumulative family discussed in issue 1429.

I mostly followed the code for the ordinal families. I tried to follow pull request 1310, as it seem fairly complete.

To demonstrate the family works, I produce a short Rmd document. You can see the html output here.

To get one of the supported link functions to work, it was necessary to edit fun_cauchit.stan. The function expects a vector, but the model block uses a for loop, so it is getting a real instead. This produces errors.

One other thing comes up for posterior_predict_hurdle_cumulative() To get the proper proportion of values for hu, it's necessary to have theta < hu, which seems to run counter to other hurdle and zero-inflated families.

Let me know if there are any changes I need to make or further tests/docs.

This has been a fun learning experience, and I want to keep contributing to brms, so any advice/etc is helpful!

Steve

@paul-buerkner
Copy link
Owner

Thanks! Upon a quick look, the PR already looks quite good. I will make some edits mostly to align the code structure with the rest of the brms code base.

Specific comments:

Changing the cauchit.stan functions that way will break other code that expects this to be a vector. I will check what can be done about it. In the worst case, I will just disallow cauchit link for hurdle_cumulative for now.

One other thing comes up for posterior_predict_hurdle_cumulative() To get the proper proportion of values for hu, it's necessary to have theta < hu, which seems to run counter to other hurdle and zero-inflated families.

I am afraid I don't understand. Can you clarify a bit please?

@paul-buerkner
Copy link
Owner

Ah, I think I understand the theta < hu now. You mean that the names are awkwardly chosen in previous hurdle models and it should actually be the other way round (changing theta and hu names), right? If so, I agree. Will change it accordingly in this PR:

@sjwild
Copy link
Contributor Author

sjwild commented Jan 16, 2023

That's correct. I should have been clearer that I found the naming convention a bit confusing. I'm happy to go and edit them if you want.

As for the cauchit link, it didn't work when I tried it with a cumulative family either, though I should have tested it with the other families before editing it. Because I did not test it with the others (beta, binomial, etc), I can try later today when I get back to my computer.

@paul-buerkner
Copy link
Owner

Thanks for offering to fix the naming conventions! I will let you know once I am done with my minor cleaning of the code.

As for the cauchit link, indeed this currently does not work as it should for some families apparently. But then, not many people seem to use it anyway :-D I will solve this separately but for now will revert the change.

@paul-buerkner
Copy link
Owner

Okay, I have done some minor cleaning to your code and also fixed the awkward naming conventions in posterior_predict functions. Tests etc. pass for me. Is there anything you want to change or add before I merge?

Thank you for providing such a high quality PR! It was a pleasure reviewing this one :-)

@sjwild
Copy link
Contributor Author

sjwild commented Jan 16, 2023

No problem!

There is nothing I want to add to this particular one right now. If there is demand I'd like to add the ability to specify a hurdle value other than 0, because for survey data such responses are often coded as 99 or something similar. As well if there's enough demand I'd like to add a zero (or other level) inflated ordinal family, but there's a few issues to think through, including how to properly specify the level so it works for both ordered factor and integer responses, properly accounting for the other link functions, etc.

@paul-buerkner
Copy link
Owner

Alright, merging this now and then looking into the cauchit issue separately. Thank you again for your work on brms! I have also added you as a contributor.

@paul-buerkner paul-buerkner merged commit 726fbd3 into paul-buerkner:master Jan 16, 2023
@paul-buerkner
Copy link
Owner

Update: The cauchit link issue is now also fixed.

@sjwild
Copy link
Contributor Author

sjwild commented Jan 17, 2023

Awesome! Now I shall never use the cauchit link again :-p

@paul-buerkner
Copy link
Owner

Agreed :-D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants