Add a UserDistribution type to the list of distributions #49

Closed
GoogleCodeExporter opened this issue Apr 6, 2015 · 4 comments

Comments

@GoogleCodeExporter

Add a UserDistribution type to the list of distributions. A UserDistribution is 
made up of empirical data constructed as a list of paired data items. 

Each data pair has a value property (integer or double) and a weight property 
(double). The sequence of pairs forms a probability distribution rather like a 
histogram.

A mechanism is also required to indicate whether the UserDistribution is
discrete or continuous (see also Issue 47).

Example:

<userDistribution distributionType="continuous">
   <entries>
        <entry value="4.5" weight="2.0"/>
        <entry value="5.0" weight="50.0"/>
        <entry value="7.0" weight="80.0"/>
        <entry value="8.9" weight="30.0"/>
   </entries>
</userDistribution>
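
As a rough illustration of how a tool might consume the example above, here is a minimal Python sketch. The helper names `parse_user_distribution` and `sample_discrete` are hypothetical and not part of any specification, and the sketch treats the entries as a discrete set of weighted points regardless of the `distributionType` attribute.

```python
import random
import xml.etree.ElementTree as ET

# The example from this issue, with straight quotes so it is well-formed XML.
XML = """
<userDistribution distributionType="continuous">
   <entries>
        <entry value="4.5" weight="2.0"/>
        <entry value="5.0" weight="50.0"/>
        <entry value="7.0" weight="80.0"/>
        <entry value="8.9" weight="30.0"/>
   </entries>
</userDistribution>
"""


def parse_user_distribution(xml_text):
    """Return (distribution_type, [(value, weight), ...]). Hypothetical helper."""
    root = ET.fromstring(xml_text.strip())
    pairs = [
        (float(entry.get("value")), float(entry.get("weight")))
        for entry in root.iter("entry")
    ]
    return root.get("distributionType"), pairs


def sample_discrete(pairs, rng=random):
    """Pick one of the listed values with probability proportional to its weight."""
    values = [value for value, _ in pairs]
    weights = [weight for _, weight in pairs]
    return rng.choices(values, weights=weights, k=1)[0]


if __name__ == "__main__":
    dist_type, pairs = parse_user_distribution(XML)
    print(dist_type, pairs)
    print(sample_discrete(pairs))
```

Note that `random.choices` accepts relative weights, so the raw weights can be used as they are.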

Original issue reported on code.google.com by jhor...@lanner.co.uk on 10 Feb 2012 at 2:36

@GoogleCodeExporter
Author

Elsewhere we used the term "non-parametric." I understand "non-parametric" to 
be a distribution provided by a collection of points. I think we have the same 
thing in mind, but are representing it differently. I think your weights are
based on the frequency of occurrence of each point? We should settle on a
terminology.

Another issue: although I think counting ("bins" in histograms) is fine in
discrete-valued spaces (integers), I'm not sure whether counting real-valued
data is a good idea unless you give ranges.
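
One possible reading of "giving ranges" for real-valued data is sketched below in Python. It assumes explicit bin edges (one more edge than weights) and a uniform spread within each range. Neither assumption comes from this thread, and the upper edge of 10.0 in the usage example is made up for illustration; `sample_binned` is a hypothetical name.

```python
import random


def sample_binned(edges, weights, rng=random):
    """Sample a real value from weighted ranges (hypothetical helper).

    Range i spans edges[i] to edges[i + 1] and is chosen with probability
    proportional to weights[i]; the value is then drawn uniformly within it.
    """
    if len(edges) != len(weights) + 1:
        raise ValueError("need exactly one more edge than weights")
    index = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    return rng.uniform(edges[index], edges[index + 1])


if __name__ == "__main__":
    # Values from the issue's example reused as lower edges; 10.0 is an assumed upper bound.
    edges = [4.5, 5.0, 7.0, 8.9, 10.0]
    weights = [2.0, 50.0, 80.0, 30.0]
    print(sample_binned(edges, weights))
```

With ranges, the weight describes how often a sample falls anywhere inside the range rather than on a single exact value.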

Original comment by pode...@gmail.com on 16 Feb 2012 at 3:28

@GoogleCodeExporter
Author

Done in version 0.2

The UserDistribution class provides a custom sampling of points, each with
the likelihood of its occurring.

The UserDistributionDataPoint class represents a data point in the
UserDistribution.

Original comment by dga...@trisotech.com on 27 Feb 2012 at 7:42

  • Changed state: Fixed

@GoogleCodeExporter
Author

At the session on the 23rd Feb there was discussion about the parameters for a
user distribution. The initial suggestion was that the weights could be
expressed as real data without normalisation (i.e. they don't have to sum to 1
or 100). In our experience this is user friendly, as it allows empirical data
to be entered directly. Tools would then handle the normalisation of these
weights.

Whilst we can demand that the weights sum to 1.0, in reality this isn't
absolutely necessary.
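
To illustrate how a tool might handle unnormalised weights, here is a minimal Python sketch that divides each weight by the total. The function name `normalise` and the validation rules (no negative weights, positive total) are assumptions made for this example, not something agreed in the thread.

```python
def normalise(weights):
    """Convert raw, non-negative weights into probabilities that sum to 1."""
    if any(weight < 0 for weight in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return [weight / total for weight in weights]


if __name__ == "__main__":
    # Weights from the example in this issue; they sum to 162, not 1 or 100.
    print(normalise([2.0, 50.0, 80.0, 30.0]))
```

Because only the ratios matter, multiplying every weight by the same positive factor gives the same probabilities.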

Original comment by jhor...@lanner.co.uk on 28 Feb 2012 at 9:21

@GoogleCodeExporter
Author

Original comment by sringue...@trisotech.com on 24 Oct 2012 at 6:40

  • Changed state: Applied
