Skip to content

Add unnormalized PDF and log-likelihood functions #1978

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

sethaxen
Copy link
Contributor

@sethaxen sethaxen commented May 12, 2025

As suggested in #153 (comment), this PR adds the following API functions:

  • logupdf: contains all terms in logpdf that depend on the argument
  • updf: contains all terms in pdf that depend on the argument
  • logulikelihood: contains all terms in loglikelihood that depend on the parameters

Each of these has a fallback to the full logpdf, pdf, or loglikelihood, respectively. After surveying the repo, it seems the existing functions could in most cases be split into 1) calling one of these functions and 2) computing the normalization factor. This is left for future PRs.

A few questions

  • Should we implement _logupdf, _updf, logupdf!, updf! for Arrayvariate analogous to the existing non-u versions?
  • In the future, should unnormalized variants be added to StatsFuns before adding here? If so, perhaps it makes more sense to open a PR there to work out an API before adding a matching API here?

TO-DO

  • define, document, and export functions
  • add checks for self-consistency of these functions (e.g. logupdf(d, x1) - logupdf(d, x2) ≈ logpdf(d, x1) - logpdf(d, x2)) to the test suites for existing distributions
  • add these to the documentation (where?)

@codecov-commenter
Copy link

Codecov Report

Attention: Patch coverage is 0% with 3 lines in your changes missing coverage. Please review.

Project coverage is 86.25%. Comparing base (c2a7387) to head (f2abb17).

Files with missing lines Patch % Lines
src/common.jl 0.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1978      +/-   ##
==========================================
- Coverage   86.28%   86.25%   -0.03%     
==========================================
  Files         146      146              
  Lines        8787     8790       +3     
==========================================
  Hits         7582     7582              
- Misses       1205     1208       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@sethaxen
Copy link
Contributor Author

sethaxen commented May 12, 2025

I've intentionally left it so that updf(d, x) is not necessarily exp(logupdf(d, x)), because I suspect it's more likely that the exponential of logupdf will under-/overflow than for logpdf, so I wouldn't be surprised if some implementations need to divide by some normalization factor just to get good numerical behavior.

I also think updf will not be nearly as useful or widely used as logupdf (EDIT: maybe we shouldn't have updf after all?)

@sethaxen
Copy link
Contributor Author

I'd prefer to get a 👍 feedback on the general approach from the maintainers before messing around with all of the testing functions. Maybe @devmotion ?

@devmotion
Copy link
Member

Sorry, I've been swamped with tasks this week. I like the proposal a lot, I think this will resolve one of my main feature requests (apart from fixing rand/eltype and speeding up loglikelihood by factoring out constants and basing it on sufficient statistics).

I also think updf will not be nearly as useful or widely used as logupdf (EDIT: maybe we shouldn't have updf after all?)

I'd suggest not adding it for now.

Should we implement _logupdf, _updf, logupdf!, updf! for Arrayvariate analogous to the existing non-u versions?

I don't like the current interface for non-univariate distributions, making _... methods part of the (development) API seems wrong.

In the future, should unnormalized variants be added to StatsFuns before adding here? If so, perhaps it makes more sense to open a PR there to work out an API before adding a matching API here?

Yes, IMO that would be reasonable for distribution functions implemented in StatsFuns.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants