Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added real2nat/real2int and stdlib expansion #117

Closed
wants to merge 24 commits into from

Conversation

mkhattab940
Copy link
Contributor

Added real2int and real2nat coercions to hakaru. Rounding errors need to be resolved

Added functions factorial() and choose() for discrete distributions
Adding a bunch of discrete distributions (binomial + transformations)
Adding beta distribution transformations (inverse Beta)
Adding gamma distribution transformations (inverse Gamma, erlang)
Adding some chi distribution transformations (F distribution, T distribution, Cauchy distribution)
Adding uniform distribution transformations (triangle, TSP (incomplete), pareto, standard Power, gompertz)

@@ -114,6 +114,8 @@ primTable =
,("nat2real", primCoerce cNat2Real)
,("nat2prob", primCoerce cNat2Prob)
,("nat2int", primCoerce cNat2Int)
,("real2int", primUnsafe cInt2Real)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These seem backwards -- why is it called real2int but then hooked to cInt2Real ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried cReal2Int at first and got an error saying it wasn't defined. Then I noticed line 110 has:

("int2nat", primUnsafe cNat2Int)

So I tried the same thing for real2int and it worked. Other than occasional rounding errors, these functions give the desired result

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be because they are actually identity functions in the underlying representation! But that line 110 feels like a bug too. I understand that you were misled by it!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, shall I remove this then? Removing it doesn't affect the functionality of the functions I was using it for. But my choose function does return real values with lots of trailing decimal places

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's ok to remove real2int, but it seems that

    ,("int2nat",     primUnsafe cNat2Int)
    ,("real2prob",   primUnsafe cProb2Real)
    ,("real2int",    primUnsafe cInt2Real)

are all correct given the way Language.Hakaru.Syntax.TypeCheck handles U.UnsafeTo_. If you take a look at the source there you'll see that dom and cod are used in the opposite way compared to how they are used for U.CoerceTo_.

X <~ beta(a,b)
return X / (1 - X)

def stdUniform():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that stdUniform is equivalent to a beta should not be part of the code of the standard library, but instead should be in a test.

return p


invBeta(2,2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The standard library should contain no raw calls to distributions - it should only contain definitions.

X <~ stdCauchy()
return a + alpha*X

T_distribution(1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above - no raw calls.

Also, why are all of these defined in this one file? Comments would be useful!!!

# Limitation: n <= 20
# If return w/ real2nat, there are rounding errors
def choose(n nat, k nat):
real2nat((product i from 1 to n+1: i)/((product i from 1 to int2nat(n-k+1): i)*(product i from 1 to k+1: i)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the 'wrong' way to define choose. What you wrote is the mathematical definition, which is a very rotten way of actually computing it. Please re-implement.

Also, you should never use real2nat. This is a sign of a bug.

@@ -0,0 +1,31 @@
# Histogram plotting tool to check the shape of distributions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is useful -- but does it belong here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can put it in the test folder

@@ -0,0 +1,18 @@
# Standard Normal Distribution
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this duplicating things in chiDistribution.hk?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will remove these from chiDistribution.hk since these are simpler implementations

@JacquesCarette
Copy link
Contributor

As I indicated on the details of PR itself, there are flaws with this. It would be best if you partitioned your PRs into smaller chunks, so that we can accept the good parts, and continue the discussion on the parts that need tweaking.


def cauchy(a real, alpha prob):
X <~ stdCauchy()
return a + alpha*X
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are all together because they are related. I can seperate them out (i.e. central chi from the non-central chi). For the Cauchy and T distributions, I'd like to put them in their own file (chiTransformations) but that would require having a way to import functions from other files, which if there is, I don't know how to do. Haven't seen anything in the documentation about it.

@@ -0,0 +1,31 @@
# Histogram plotting tool to check the shape of distributions
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can put it in the test folder

@@ -114,6 +114,8 @@ primTable =
,("nat2real", primCoerce cNat2Real)
,("nat2prob", primCoerce cNat2Prob)
,("nat2int", primCoerce cNat2Int)
,("real2int", primUnsafe cInt2Real)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, shall I remove this then? Removing it doesn't affect the functionality of the functions I was using it for. But my choose function does return real values with lots of trailing decimal places

@@ -0,0 +1,18 @@
# Standard Normal Distribution
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will remove these from chiDistribution.hk since these are simpler implementations


def choose(n nat, k nat):
# TODO error check k <= n
(product i from k+1 to n+1: i)/(product i from 1 to int2nat(n-k+1): i)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a better definition for choose?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is still computationally quite bad. Best is likely to assume it as a primitive (i.e. that binomial takes 2 nat and returns a nat). And implement that primitive in Haskell, by using the implementation from Math.Combinatorics.Exact.Binomial (from the exact-combinatorics package). See https://hackage.haskell.org/package/exact-combinatorics-0.2.0.8/docs/Math-Combinatorics-Exact-Binomial.html if you are curious.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First let me note that what is mathematically the same thing may be best defined differently depending on how the program is going to be interpreted -- by running and sampling, by simplification and human reading, or in some other way. For simplification and human reading, this definition of choose in terms of product is fine. For running and sampling, if it's ok to have some rounding errors for large inputs, it's better to use betaFunc. According to https://en.wikipedia.org/wiki/Beta_function,

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which makes me think that we might be better off implementing choose as a primitive!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's hard to tell without knowing how it would be used.

@JacquesCarette
Copy link
Contributor

Yes, please remove real2int.

@JacquesCarette
Copy link
Contributor

More generally, Hakaru really (really!) needs some kind of 'import' facility. I think this work may well push that need "over the edge", so that it gets implemented.

@JacquesCarette JacquesCarette mentioned this pull request Nov 23, 2017
@mkhattab940 mkhattab940 deleted the master branch November 28, 2017 16:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants