Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add convex decreasing density estimator for π0 #56

Merged
merged 3 commits into from Jan 31, 2018

Conversation

juliangehring
Copy link
Owner

@juliangehring juliangehring commented Dec 10, 2017

Implements the convex decreasing density estimator for π0 (#3 ), based on section 4.3 in

Langaas, M., Lindqvist, B.H., and Ferkingstad, E. (2005). Estimating the proportion of true null hypotheses, with application to DNA microarray data. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 67, 555–572.
http://cds.cern.ch/record/639626/files/sis-2003-306.pdf

end

function find_f_theta(x::Vector{Float64}, theta::Float64)
return @. 2 / theta^2 * (theta-x) * (x.<theta)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does the @. notation mean?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This applies the dot vectorisation to all functions on the right, and turns the line into 2 ./ theta^2 .* (theta.-x) .* (x.<theta). I find it a bit easier to read without all the extra dots.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool thanks, that seems very useful! (Then there's one . too much in the code though).

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for spotting this, fixed in 2966dba.

@. f = (1-ε)*f + ε*f_theta
@. f_p = @views gridsize*(f[idx_lower]-f[idx_upper])*px + f[idx_upper]
theta = dx * find_theta(t, p, f_p)
f_theta .= find_f_theta(x, theta)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the sake of code readability, instead of calling this find_f_theta, can we maybe use Distributions.jl (TriangularDist) and use pdf?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm struggling a bit with TriangularDist since the weights at x=0 always yield 0:
pdf.(TriangularDist(0.0, 1.0, 0.0), 0.0) (I would have expected it to yield the same as pdf.(TriangularDist(0.0, 1.0, 1.0), 1.0)). Is this a bug or a feature in Distributions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh interesting.. I'd say it is almost surely a bug with how they handle the boundary..

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh good, then it's not just me ;)

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have given the function a more meaningful name in 2966dba. Since the TriangularDist in Distributions seems buggy, I'd keep our function for now.

f_theta .= find_f_theta(x, theta)
pi0_new = f[end]
if abs(pi0_new - pi0_old) <= xtol
return pi0_new, thetas, true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the paper suggests a stopping rule based on sum((f_p.-f_theta_p)./f_p), do these criteria behave similarly?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They mention the stopping rule sum((f_p.-f_theta_p)./f_p) < -δ in the algorithm, but don't use it in their reference implementation - they don't use any stopping rule there and simply stop after 100 iterations. Since I'm not sure how to choose a good value for δ in this case, I have used a general stopping rule based on the change of the π0 estimate as in the CensoredBUM estimator (#4).

end
end
@. f = (1-ε)*f + ε*f_theta
@. f_p = @views gridsize*(f[idx_lower]-f[idx_upper])*px + f[idx_upper]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we get f_p exactly by updating the previous one instead of this linear interpolation?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you have in mind for getting f_p exactly?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I see it correctly, f is the same as f_p, except evaluated on the sorted p-values rather than the grid. So e.g. above could update it with something like f_p = (1-ε)*f_p+ ε*f_theta_p (assuming properly initialized in the beginning). But maybe it's not worth doing this exactly also from computational perspective?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have run some simulations with both update rules, and I could never see a difference in the final π0 estimate. Your solution looks nicer and cleaner to me, I have added it in 2966dba.

@nignatiadis
Copy link
Contributor

nignatiadis commented Dec 19, 2017

Except the 3 comments, looks very good to me; this is very cool! With this + Grenander + isotonic, I wonder if it might be worth to move them to a ShapeConstrainedStatistics package, that implements them with Distributions/Regression interfaces.

@juliangehring
Copy link
Owner Author

@nignatiadis Do you think this needs more work, or are you fine with merging this feature?

@nignatiadis
Copy link
Contributor

Hey Julian, sorry for delaying this. It looks good to me!

@juliangehring
Copy link
Owner Author

No problem - thanks for having a look at it.

@juliangehring juliangehring merged commit 65727ce into develop Jan 31, 2018
@juliangehring juliangehring deleted the convex-decreasing-pi0 branch January 31, 2018 20:46
@juliangehring juliangehring added this to the v0.3.0 milestone Mar 18, 2018
@juliangehring juliangehring mentioned this pull request May 3, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants