Conversation

@frank113
Contributor

Resolves # .

This pull request will add ANOVA functionality.

Checklist

Please ensure the following tasks are completed before submitting this pull request.

  • [y] Read, understood, and followed the contributing guidelines, including the relevant style guides.
  • [y] Read and understood the Code of Conduct.
  • [y] Read and understood the licensing terms.
  • [y] Searched for existing issues and pull requests before submitting this pull request.
  • [y] Filed an issue (or an issue already existed) prior to submitting this pull request.
  • [y] Rebased onto latest develop.
  • [y] Submitted against develop branch.

Description

What is the purpose of this pull request?

This pull request:

  • Creates an anova function to perform one-way ANOVA given an array of continuous values and an array of factors. Support for other input methods will be added.
  • b
  • c

Related Issues

Does this pull request have any related issues?

Questions

Any questions for reviewers of this pull request?

No.

Other

Any other information relevant to this pull request? This may include screenshots, references, and/or implementation notes.

No.


@stdlib-js/reviewers

@kgryte kgryte added Feature Issue or pull request for adding a new feature. Math Issue or pull request specific to math functionality. labels Jan 30, 2018
Member

@Planeshifter Planeshifter left a comment

Awesome, Frank! Thanks for all your work. I left some comments on things we should resolve before merging in this pull request.

<!-- /.examples -->

<section class="links">

Member

Please add

[mdn-array]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array

[mdn-typed-array]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Typed_arrays

so that the Markdown reference links are not missing their definitions and are rendered correctly.

Contributor Author

Resolved.

@@ -0,0 +1,140 @@
# One Way ANOVA

> Perform One-Way ANOVA on given data

Member

Let's change the description to "Perform a one-way analysis of variance."

Contributor Author

Resolved.


### anova1( x, factor\[, ...params]\[, opts] )

For an [array][mdn-array] or [typed array][mdn-typed-array] of numeric values `x` and an [array][mdn-array] of classifications `factor`, one-way anova1 is performed. The hypotheses are given as follows:

Member

Let's revert this undesired change that was introduced when renaming the package: s/anova1/anova.

Contributor Author

I am unsure what you mean here. I am taking this to mean the removal of the optional parameters that we discussed after the 200 staff meeting on Tuesday. I will ask about this when we meet tomorrow.

Member

I believe @Planeshifter is referring to the description:

... classifications factor, one-way anova is performed.


For an [array][mdn-array] or [typed array][mdn-typed-array] of numeric values `x` and an [array][mdn-array] of classifications `factor`, one-way anova1 is performed. The hypotheses are given as follows:

$$H_{0}: \mu_{1} = \mu_{2} = \dots = \mu_{k}$$

Member

We have a special way of writing formulas so that we can later automatically render the equations using MathJax. For this one, we could do

<!-- <equation class="equation" label="eq:hypotheses" align="center" raw="\begin{align*} H_{0}:& \; \mu_{1} = \mu_{2} = \dots = \mu_{k} \\ H_{a}:& \; \text{at least one} \; \mu_{i} \; \text{not equal to the others} \end{align*}" alt="Hypotheses of ANOVA"> -->

<!-- </equation> -->

Contributor Author

Resolved. Out of curiosity, does the markdown format always revert to MathJax?

Member

Yes, as MathJax is what we use for server-side equation rendering. See `var mathjax = require( 'mathjax-node-sre' );`.

Contributor Author

Thanks!
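
For reference, the `raw` attribute suggested earlier in this thread corresponds to the following LaTeX, shown here outside the HTML comment. Compared with the single-line `$$...$$` form in the README diff, it only adds the alternative hypothesis; nothing else is assumed.

```latex
\begin{align*}
H_{0}:& \; \mu_{1} = \mu_{2} = \dots = \mu_{k} \\
H_{a}:& \; \text{at least one} \; \mu_{i} \; \text{not equal to the others}
\end{align*}
```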

@@ -0,0 +1,19 @@
'use strict';

var ANOVA = require('../lib/make_anova.js');

Member

Let's rename ANOVA to anova1 here.

Contributor Author

Resolved.

var newMean;


if (x.length !== factor.length) {

Member

@Planeshifter Planeshifter Feb 4, 2018

It's fine to remove this check, given that this function is only called inside of make_anova.js (where we have already established that the arrays have the same length) and won't be publicly available.

Contributor Author

OK. This check may be useful for future development, but for now I agree with the sentiment. Removed.

function meanTable(x, factor) {
var treatments;
var factorCount;
var meanTable;

Member

Can we rename the meanTable variable? As-is, it has the same name as the function, which can be confusing.

Contributor Author

tableOfMeans

Member

You probably don't have to be as descriptive. It would be enough to name the variable `out`. The meaning of the variable is readily available from the function docs, and, as it stands now, one has to read to the end of the function to know which value is returned.
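
As a rough illustration of the naming suggestion, a helper of this kind might look like the sketch below. This is not the code from the pull request: the body, the `counts` variable, and the running-mean update are assumptions made for the example. Only the function name `meanTable`, its arguments, and the variable name `out` come from the thread.

```javascript
'use strict';

/**
* Computes the mean of `x` for each level of `factor`.
*
* Hypothetical sketch: assumes `x` and `factor` have the same length
* (the caller is expected to have checked this already).
*/
function meanTable( x, factor ) {
	var counts;
	var out;
	var key;
	var i;

	counts = {};
	out = {}; // the value returned by the function
	for ( i = 0; i < x.length; i++ ) {
		key = factor[ i ];
		if ( !Object.prototype.hasOwnProperty.call( out, key ) ) {
			out[ key ] = 0.0;
			counts[ key ] = 0;
		}
		counts[ key ] += 1;

		// Running mean update: mean += ( value - mean ) / n
		out[ key ] += ( x[ i ] - out[ key ] ) / counts[ key ];
	}
	return out;
}

var means = meanTable( [ 1.0, 3.0, 5.0, 7.0 ], [ 'a', 'b', 'a', 'b' ] );
console.log( means );
```

Declaring `out` near the top makes the return value recognizable from the first lines of the function, which is the point raised in the comment above.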

'use strict';

/**
* One-Way Analysis of Variance.

Member

One-way analysis of variance.

Contributor Author

Resolved.

f = ['frank', 'philipp', 'nugent', 'frank', 'philipp', 'nugent', 'frank', 'philipp', 'nugent', 'frank'];
out = anova1(x, f);

t.equal(out.treatment_df, 2);

Member

Where do these values come from? We should add a comment, e.g. `// Tested against R:`. Better yet would be a script that generates the R fixtures automatically, but we can add that later.

Contributor Author

These values came from testing side by side with R; I forgot to add the R test fixtures. I will add them at a later date.


// Now to make the out object
out = {};
setReadOnly(out, 'treatment_df', treats.length - 1);

Member

Can we avoid underscores in property names? We could use names like `treatSumSq` and `errMeanSq`. I'm not so sure about the degrees of freedom. What do you think, @frank113 and @kgryte?

Member

I would take inspiration from MATLAB (https://www.mathworks.com/help/stats/anova1.html), including in terms of the output data structures.

Contributor Author

@kgryte The MATLAB output looks much cleaner and far more readable than what I have. I will update the naming to mirror it. For `Prob>F`, could we simply replace it with `p`?

Member

Instead of Prob>F, let's use pValue to be consistent with the other stats packages.

Member

Yes, or `pValue`, whichever is the convention currently used by @Planeshifter elsewhere.
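
To make the naming discussion concrete, here is a minimal sketch of a one-way ANOVA that returns read-only, underscore-free properties. It is an illustration under stated assumptions, not the implementation from this pull request: `treatSumSq` and `errMeanSq` come from the suggestion above, while `treatDf`, `errDf`, `errSumSq`, `treatMeanSq`, `statistic`, and the `readOnly` helper are hypothetical names chosen for the example. The `pValue` property is left out because it needs an F-distribution CDF, which this self-contained sketch does not include.

```javascript
'use strict';

// Hypothetical stand-in for a read-only property setter.
function readOnly( obj, name, value ) {
	Object.defineProperty( obj, name, {
		'value': value,
		'writable': false,
		'enumerable': true,
		'configurable': false
	});
}

// Sketch of a one-way ANOVA; assumes `x` and `factor` have equal length.
function anova1( x, factor ) {
	var treatSumSq = 0.0;
	var errSumSq = 0.0;
	var groups = {};
	var grand = 0.0;
	var levels;
	var mean;
	var out;
	var g;
	var d;
	var i;
	var j;

	// Collect observations by factor level and accumulate the grand total:
	for ( i = 0; i < x.length; i++ ) {
		if ( !Object.prototype.hasOwnProperty.call( groups, factor[ i ] ) ) {
			groups[ factor[ i ] ] = [];
		}
		groups[ factor[ i ] ].push( x[ i ] );
		grand += x[ i ];
	}
	grand /= x.length;
	levels = Object.keys( groups );

	// Between-group (treatment) and within-group (error) sums of squares:
	for ( i = 0; i < levels.length; i++ ) {
		g = groups[ levels[ i ] ];
		mean = 0.0;
		for ( j = 0; j < g.length; j++ ) {
			mean += g[ j ];
		}
		mean /= g.length;
		treatSumSq += g.length * ( mean - grand ) * ( mean - grand );
		for ( j = 0; j < g.length; j++ ) {
			d = g[ j ] - mean;
			errSumSq += d * d;
		}
	}

	out = {};
	readOnly( out, 'treatDf', levels.length - 1 );
	readOnly( out, 'errDf', x.length - levels.length );
	readOnly( out, 'treatSumSq', treatSumSq );
	readOnly( out, 'errSumSq', errSumSq );
	readOnly( out, 'treatMeanSq', treatSumSq / out.treatDf );
	readOnly( out, 'errMeanSq', errSumSq / out.errDf );
	readOnly( out, 'statistic', out.treatMeanSq / out.errMeanSq );
	return out;
}

var result = anova1( [ 1.0, 2.0, 3.0, 2.0, 3.0, 4.0 ], [ 'a', 'a', 'a', 'b', 'b', 'b' ] );
console.log( result.statistic ); // => 1.5
```

For this toy data the group means are 2 and 3 and the grand mean is 2.5, so the treatment sum of squares is 1.5 on 1 degree of freedom and the error sum of squares is 4 on 4 degrees of freedom, giving F = 1.5. These numbers were worked out by hand for illustration and are not R fixtures.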

@frank113 frank113 closed this Feb 19, 2018
3 participants