Planeshifter
Contributor

  • updated package as outlined in TODO list
  • supports accessor function

@coveralls

Coverage Status

Coverage remained the same at 100.0% when pulling 8d89b0e on development into 71069ef on master.

@coveralls

Coverage Status

Coverage remained the same at 100.0% when pulling 481b2c8 on development into 71069ef on master.

@kgryte
Contributor

kgryte commented Mar 20, 2015

@Planeshifter See the TODO note about having an option for calculating the population variance. If we add that option, we should modify the API so that the second parameter is an options argument with accessor and bias properties. Your thoughts?
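
For illustration, the options-based signature proposed here could look roughly like the sketch below. It is written in Python purely as an example and is not the package's code: the function name, the keyword arguments standing in for the options object, and the error handling are all assumptions.

```python
from typing import Any, Callable, Optional, Sequence


def variance(arr: Sequence[Any],
             accessor: Optional[Callable[[Any], float]] = None,
             bias: bool = False) -> float:
    """Hypothetical sketch of a variance function with accessor and bias options."""
    # Extract numeric values, using the accessor when one is supplied.
    values = [accessor(x) for x in arr] if accessor is not None else list(arr)
    n = len(values)
    if n == 0:
        raise ValueError("variance requires at least one value")
    if n < 2 and not bias:
        raise ValueError("the unbiased estimate requires at least two values")

    # Two-pass computation: the mean first, then the squared deviations.
    mean = sum(values) / n
    sum_sq = sum((v - mean) ** 2 for v in values)

    # bias=True divides by n (population form); otherwise divide by n - 1.
    return sum_sq / (n if bias else n - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]
records = [{"x": v} for v in data]

print(variance(data, bias=True))                                  # 4.0
print(variance(records, accessor=lambda d: d["x"], bias=True))    # 4.0
```

With a single options-style parameter set, an accessor and the bias flag can be combined freely without changing the positional signature.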

@coveralls

Coverage Status

Coverage remained the same at 100.0% when pulling 2303662 on development into 71069ef on master.

@Planeshifter
Contributor Author

@kgryte Does that really matter? For most sample sizes, the variance estimates are going to be very similar. In R, there is no option either, but I have seen that Matlab allows choosing between the biased and unbiased versions, and the parameter also allows you to pass a vector of weights for the individual elements. I imagine that this is quite useful in certain sampling or survey data settings. Did you think about that option?
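
As a worked note on how close the two estimates are: writing $s_{n}^2$ for the biased (divide by $n$) version and $s_{n-1}^2$ for the unbiased (divide by $n-1$) version, both computed from the same sample of size $n$ with mean $\bar{x}$, the two differ only by a constant factor. The symbols are chosen here for illustration and do not come from the package.

$$
s_{n}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{n-1}{n}\, s_{n-1}^2,
\qquad
s_{n-1}^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 .
$$

For $n = 100$ the factor is $0.99$, a 1% difference; for $n = 5$ it is $0.8$, so the choice matters mainly for small samples.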

@kgryte
Contributor

kgryte commented Mar 21, 2015

@Planeshifter re: weights. Yes, but not this method. I find it interesting that MATLAB allows weights to be provided to the var function, but not the mean function. I think I prefer the convention we currently have which is to have a separate function for handling weights; e.g., wmean. Similarly, we would have wvariance.

Re: bias. Python provides a separate function for computing the population variance: pvariance, while having variance be the sample variance. Granted the latter is more common, but I feel that the semantics are reversed, at least from a math perspective: you start with variance and the sample statistic is the special case.
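
For reference, the Python convention mentioned here can be seen with the standard library's `statistics` module, where `variance` is the sample statistic and `pvariance` is the population one. The data values below are made up for the example.

```python
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)

# Sample variance: divides the sum of squared deviations by n - 1.
sample_var = statistics.variance(data)

# Population variance: divides the same sum by n.
population_var = statistics.pvariance(data)

print(population_var)  # 4.0
print(sample_var)      # 32/7, roughly 4.5714

# The two differ only by the factor (n - 1) / n.
print(abs(population_var - sample_var * (n - 1) / n) < 1e-12)  # True
```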

I will think a bit more on this.

@Planeshifter
Contributor Author

Makes sense to have different functions handling the weighted case, I agree. Concerning the variances, both pvariance and variance would be estimators of the true underlying variance (except if you observe the whole population, in which case pvariance would indeed be the true population variance). Most of the time, though, we will only have a sample from some population, and so it is implicit that functions like mean, variance, median, etc. all just calculate sample statistics and not the true parameters. So I don't think that there is a naming problem. Your bias option idea sounds better to me than an extra pvariance function, though.

@kgryte
Contributor

kgryte commented Mar 21, 2015

Agreed re: pvariance/variance. I feel like many things in Python have been added after the fact. Someone implemented something (the sample variance), and a little while later someone else wanted a change (the population variance). Since variance was already taken and needed to stay the same for backwards compatibility, another function had to be created. The Pandas df is another example of this, where several methods have different names but do the same thing. My guess is that this is attributable more to history than to design.

Re: bias option. I agree. This will make it consistent with covariance.

@coveralls

Coverage Status

Coverage remained the same at 100.0% when pulling ff569df on development into 71069ef on master.

@coveralls

Coverage Status

Coverage remained the same at 100.0% when pulling 77e5d26 on development into 71069ef on master.

kgryte added a commit that referenced this pull request Mar 21, 2015
Adds accessor and bias options. fmt. dotfiles.
@kgryte kgryte merged commit 72a7e90 into master Mar 21, 2015
@kgryte kgryte deleted the development branch March 21, 2015 09:22