# zaxtax/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers forked from CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers

Split Jeffrey's prior bit into its own cells

@@ -1389,20 +1389,45 @@
     "\n",
     "2. There typically exist conjugate priors for simple, one-dimensional problems. For larger problems, involving more complicated structures, hope is lost to find a conjugate prior. For smaller models, Wikipedia has a nice [table of conjugate priors](http://en.wikipedia.org/wiki/Conjugate_prior#Table_of_conjugate_distributions).\n",
     "\n",
-    "Really, conjugate priors are only useful for their mathematical convenience: it is simple to go from prior to posterior. I personally see conjugate priors as only a neat mathematical trick, and offer little insight into the problem at hand. \n",
-    "\n",
+    "Really, conjugate priors are only useful for their mathematical convenience: it is simple to go from prior to posterior. I personally see conjugate priors as only a neat mathematical trick, one that offers little insight into the problem at hand. "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
     "## Jeffreys Priors\n",
     "\n",
     "Earlier, we talked about objective priors rarely being *objective*. Partly what we mean by this is that we want a prior that doesn't bias our posterior estimates. The flat prior seems like a reasonable choice, as it assigns equal probability to all values. \n",
     "\n",
-    "But the flat prior is not transformation invariant. What does this mean? Suppose we have a random variable $\bf X$ from Bernoulli($\theta$). We define the prior on $p(\theta) = 1$. \n",
-    "\n",
-    "PUT PLOT OF THETA HERE\n",
-    "\n",
-    "Now, let's transform $\theta$ with the function $\psi = log \frac{\theta}{1-\theta}$. This is just a function to stretch $\theta$ across the real line. Now how likely are different values of $\psi$ under our transformation.\n",
-    "\n",
-    "PUT PLOT OF PSI HERE\n",
-    "\n",
+    "But the flat prior is not transformation invariant. What does this mean? Suppose we have a random variable $\bf X$ from Bernoulli($\theta$). We define the prior $p(\theta) = 1$. "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "PUT PLOT OF THETA HERE"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now, let's transform $\theta$ with the function $\psi = \log \frac{\theta}{1-\theta}$. This is just a function to stretch $\theta$ across the real line. How likely are different values of $\psi$ under our transformation?"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "PUT PLOT OF PSI HERE"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
     "Oh no! Our function is no longer flat. It turns out flat priors do carry information after all. The point of Jeffreys priors is to create priors that don't accidentally become informative when you transform the variables you originally placed them on."
    ]
   },
@@ -1695,4 +1720,4 @@
    "metadata": {}
   }
  ]
-}
+}