Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

numbers.random and testing.js #76

Merged
merged 5 commits into from
May 11, 2014
Merged

Conversation

milroc
Copy link
Contributor

@milroc milroc commented Dec 24, 2012

numbers.statistics

  • Moved statistics.randomSample to random.sample

numbers.random

  • random.sample moved from statistics.randomSample
  • random.boxMullerTransform
  • random.irwinHall
  • random.bates
  • random.distribution
    1. distribution.normal
    2. distribution.logNormal
    3. distribution.boxMuller
    4. distribution.irwinHall
    5. distribution.irwinHallNormal
    6. distribution.bates

testing.js

  • testing.approxEquals
  • testing.between
  • testing.likelyTrue

Possible changes:

no longer do .distribution, but rather, encompass that within random.*:

e.g: random.normal() will return a variable with a probability of distribution from [0, 1]. Limitation of this would be where to place the n (denoting size of array), if in front, random.normal(mu, sigma) not possible, if in back random.normal(0, 1, 100) required for each creation of a basic normal distribution.

see comments made in code for other possible changes

Need to work on this.
Added random.js and first set of methods.
var sample = [];

for (var i = 0; i < n; i++) {
sample.push(Math.random() * (upper - lower) + lower);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made a change here to improve performance and allow the lower bound to be a number > Math.random() limit.

@milroc
Copy link
Contributor Author

milroc commented Dec 24, 2012

I should note that the current state of testing done to random.js will greatly increase the test time (roughly 2 seconds per test run now).

return sum/n;
};

random.distribution = {};
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if .distribution namespace is kept, should it be created with way or in a separate file?

@KartikTalwar
Copy link
Member

I was/am working on a distribution class as well and this is what I have done:

A new file distributions.js

  • Each distribution is a 'class' by itself
  • Each distribution has its own sub/prototype functions wrt the distribution type.

So as an example this is how you'd use one of then

norm = new Normal(data);
console.log(norm.expectedValue());
console.log(norm.mean());
console.log(norm.variance());

Let me know if this makes sense because I have a bunch of distributions lined up.

@milroc
Copy link
Contributor Author

milroc commented Dec 24, 2012

Cool Kartik,

So what exactly are these objects, are these representations of the distributions (i.e: rather than what I have which creates an array of n that is a sample of a given distribution), or will they be precise values representing the expected distributions (norm.mean() returns 0)?

What is data and how does it affect the Normal class?

@KartikTalwar
Copy link
Member

Sorry, that fist line wast suppose to represent anything. It should have said args.

What I have is just representation of these distributions. The prototypes don't make much sense for normal because they already are your parameters (mean, variance) but for example on a NegativeBionomail(k, p), the mean is k(1-p)/p and variance is k(1-p)/p^2and initializing a NB() and calling .mean() would return that value.

Now I do see this as not the best way to represent things and I like your sample solution but I'm not sure what/how to do this. Should we allow that n parameter and plug in directly into the equation and the user be in charge of creating the array or create the sample and return the computed values.

Let me know what you think. Maybe I have it all wrong here..

@milroc
Copy link
Contributor Author

milroc commented Dec 25, 2012

Hey Kartik,

I think your idea might be pretty valuable for testing the random namespace. Perhaps NegativeBinomial(k, p, kEps, pEps) where the mean is (k +|- kEps)(1 - (p +|- pEps)) / (p +|- pEps).

If we built it perhaps people will also find non-test use cases for it.

@KartikTalwar
Copy link
Member

Hmm.
I have about 9 other distribution functions (plus normal) that compute the numerical value if you give in all the parameters.(Haven't gotten to their protoypes functions yet). So as an example:

Negative Binomial Distribution

So the user is now in charge of computing for(k<10) { data.push(NB(k, r, p)); }. If this approach was to be taken, would you mind elaborating why/where having that epilson inside the function comes into place?

I guess this makes more sense if we generalize all this in distributions.js instead of random.js correct?

@milroc
Copy link
Contributor Author

milroc commented Dec 25, 2012

So my recommendation was to use it as a method for testing real life data created similar to the method I described above.

That means not only could you test the limiting inputs of the generation methods (stdev or mean similar to the test cases outlined in the PR), but also more behavior (for example, finding a distribution that is normal with stdev, mean and mode, but very kurtotic in relationship to the expected behavior of a normal distribution, thus not a Normal distribution). Epsilon was recommended because each calculated function will also need to include the error you deem viable to consider the data to be normal.

distribution.tryNormal(data); gets mean and standard deviation of the data, throw it into your normal class and that allows you to determine if other properties are still respected. Not sure what to return (hence the try vs is). The more I talk about this, the more I feel like the utility is negligible and adds a great deal of confusion on both contributors and developers ends.

It should be noted that discrete probability distributions will require no sampling along a function assuming you know the given parameters to create the distribution. The above PR is sampling values along a continuous probability distribution. Thus creating a Negative Binomial Distribution of length n where you know p and r would be the matter of doing what you mentioned, this would be more inline with the way I did things in the PR (not sure which is best, but lets have everyone discuss it):

distribution.negativeBinomial = function(n, r, p) { 
  var data = []; 
  for (var k = 0; k < n; k++) {
    // not sure where to put the helper random? 
    data.push(basic.negativeBinomial(k, r, p)); //no randomness involved
  } 
  return data;
};

As far as where either of these go, I'd wait until Steve, Ethan, etc have time to look at this (it's the holiday season too so there's no rush).

Can you commit code to a branch on your fork and reference it here so I can see your idea more clearly?

@KartikTalwar
Copy link
Member

Sounds good. I'll try to push it by tomorrow.

@KartikTalwar
Copy link
Member

This is what I have:
https://github.com/KartikTalwar/numbers.js/blob/distributions/lib/numbers/distributions.js

From the comments above my goal was to implement whatever I could from the table found on each distribution's wikipedia entry as protoypes.

@milroc
Copy link
Contributor Author

milroc commented Dec 28, 2012

So I'm getting a better idea about what your goal was. However, I find probability distributions for picking random values from it. So this would work for any discrete probability functions, because you can pass in random numbers in certain types. However for continuous values you would need to build a method to give an approximation to that function (e.g: box muler).

I'm not sure how to combine these two pieces of code but I think at this point we should wait for some outside input on how to continue.

@KartikTalwar
Copy link
Member

Agreed.

@sjkaliski
Copy link
Member

@milroc @KartikTalwar hopping back into this. Any interest in getting back to this?

@KartikTalwar
Copy link
Member

I'm in.

@milroc
Copy link
Contributor Author

milroc commented Jun 25, 2013

I'm down. Terribly busy and head down on some new projects but if I can help I definitely will. Have we figured out which direction to go with numbers.js? I'm beginning to debate really if adding browser support is a necessity.

@sjkaliski
Copy link
Member

@KartikTalwar great. @milroc Look at the abstract-structures branch. Trying to take a structure first, application second approach. Browser support is secondary. I'd rather resolve that after the API is defined.

@milroc
Copy link
Contributor Author

milroc commented Jun 25, 2013

Alright, sounds good. Feel free to give me something to work on in the future and I'll see if I can work on it. Otherwise just @ me if you need an additional opinion on the API. I'll try to keep track of issues/PR's too.

@sjkaliski
Copy link
Member

@milroc bring this back to life?

@Dakkers
Copy link
Contributor

Dakkers commented May 2, 2014

@milroc I was hoping we could merge this because I wanted to add to statistics.js and put MonteCarlo into random.js BUT statistics.js has since changed, would probably be a bad time trying to merge. any chance on us fixing this? :D

@milroc
Copy link
Contributor Author

milroc commented May 2, 2014

Hey @StDako, can you be more specific on what I need to fix? It's been a while since this pull request and from my memory there wasn't anything blocking from getting this merged, aside waiting for some outside input on it.

@Dakkers
Copy link
Contributor

Dakkers commented May 2, 2014

@milroc the only thing that will be a problem is that statistic.randomSample has changed (I updated it recently), so that'll cause an issue, from my understanding. I'm pretty sure that's it though!

@milroc
Copy link
Contributor Author

milroc commented May 9, 2014

@StDako, @sjkaliski ran the tests, updated random.sample (did however not remove statistic.randomSample).

Dakkers added a commit that referenced this pull request May 11, 2014
numbers.random and testing.js
@Dakkers Dakkers merged commit 4edc3a6 into numbers:master May 11, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants