Add Box-Cox transform #218
Conversation
@justin1dennison This is awesome! Thanks for working on this!!! Should I go ahead and review?

@kgryte A review would be awesome, as I would like to work on the other Box-Cox related packages. Thanks. Are the CI systems failing because of a malformed report, or am I missing something in the logs?
kgryte left a comment
@justin1dennison Overall, great work! Most of the review comments are nitpicks and matters of growing accustomed to project conventions.
Re: CI. Both TravisCI and AppVeyor will fail at the moment due to failing tests on older platforms which have yet to be fixed. The main CI results to pay attention to for this PR are the results from CircleCI.
Thanks again for working on this!
    ex = data[i].expected;
    b = boxcox( x, y );
    if ( b === ex ) {
        t.strictEqual( b, ex, 'returns '+b+'when provided '+x+' and '+y+'.');

Suggested change:

    - t.strictEqual( b, ex, 'returns '+b+'when provided '+x+' and '+y+'.');
    + t.strictEqual( b, ex, 'returns '+b+'when provided '+x+' and '+y+'.' );
I have noticed there seems to be an issue with the tests as well. I will work on the above suggestions and resolve the testing issues. Thank you for the suggestions and review. Do I need to follow any procedure once the changes and fixes have been completed?
    t.strictEqual( b, ex, 'returns '+b+'when provided '+x+' and '+y+'.');
    } else {
        delta = abs( ex - b );
        tol = EPS * abs( ex );
In fixing the testing issues, I have a couple of tests that are failing because of tolerance issues. If I change to @stdlib/constants/math/float32-eps instead of @stdlib/constants/math/float64-eps, then the tests pass. What is the recommended course of action regarding these floating-point comparisons?
Sorry for the confusion here. We really need to document this, as it has been a recurring question. The rationale for L105 is that we want to get a handle on how different our implementations are from reference implementations in terms of ULPs (i.e., the number of trailing bits which differ in the significand). Our proxy for this (for now) is to scale double-precision floating-point epsilon. Currently, L105 suggests that the tolerance is approximately one ULP (or within floating-point rounding error). When tests fail, we scale EPS by some real-valued multiple (e.g., 2.0 * EPS * abs( ex )) until tests pass. This does not have to be exact, but you can get a rough idea of how large the tolerance needs to be by looking at the test message output, which should include both delta and tol. Based on those values, we can derive an approximate "minimum" tolerance required for tests to pass.
Obviously, this is a bit ad hoc, and we should invest some time into automating this.
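The scaled-epsilon pattern described above can be sketched as follows. This is a minimal illustration, not the project's actual test helper; the `within_tolerance` name and `multiple` parameter are hypothetical:

```python
# Sketch of the scaled-epsilon tolerance check described above.
# EPS is double-precision machine epsilon (2**-52); `multiple` is the
# real-valued factor by which the tolerance is scaled until tests pass.
EPS = 2.220446049250313e-16

def within_tolerance(actual, expected, multiple=1.0):
    delta = abs(expected - actual)
    tol = multiple * EPS * abs(expected)
    return delta <= tol
```

With `multiple=1.0`, a result that differs from the reference value by roughly one ULP passes; when a test fails, the reported delta and tol suggest how much larger `multiple` needs to be.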
@justin1dennison Thanks for resolving the PR suggestions! In terms of process, once the PR is ready, we'll do another review to ensure we didn't miss anything. If everything is good, we'll indicate our intent to merge; otherwise, we'll provide another round of feedback and suggestions. One thing I will do is update deps on
    def main():
        """Generate fixture data."""
        x = np.arange(0.25, 5.0, 0.25)
@justin1dennison One additional PR point: as a general principle, we typically avoid test values which follow patterns (e.g., in this case, values which increment by a fixed 0.25 increment). Why? While not necessarily applicable here, often tolerance issues (i.e., large deviations from reference implementations) can only be discovered by testing many different underlying bit patterns. Here, the underlying bit patterns have relatively low "entropy", meaning we are repeatedly testing similar underlying bit patterns.
Accordingly, we prefer, e.g., dividing input values into regimes (e.g., very small numbers, very large numbers, medium numbers, etc.) and generating 1000 random values per regime (enough to reasonably cover many different bit patterns, but not so many as to overwhelm test infrastructure). Dividing the input values into regimes ensures that we can probe different code paths and better understand how an implementation fares as we scale input values.
In this case, if I were writing this implementation, I would define two regimes: "small" values (e.g., < 1.0e-19) and everything else. If you wanted to be pedantic, because this is a multi-parameter function, you could test various regime combinations.
We don't have to mirror the regimes as defined by an implementation exactly, but we may want to follow the spirit. Even if we were to later change the underlying implementation, testing such regimes is useful as it allows us to gauge how well the new implementation handles different input value regimes.
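As a sketch of the regime-based approach described above (illustrative only: the lower bound of the "small" regime, the upper bound of the second regime, and the seed are assumptions, not the project's actual fixture script):

```python
import numpy as np

# Generate random test values per regime rather than a fixed-increment grid.
# The 1.0e-19 boundary follows the "small" regime mentioned above; the other
# bounds are arbitrary choices for illustration.
rng = np.random.default_rng(12345)
small = rng.uniform(1.0e-30, 1.0e-19, 1000)  # "small" values (< 1.0e-19)
other = rng.uniform(1.0e-19, 1.0e3, 1000)    # everything else
x = np.concatenate([small, other])
```

Each regime contributes 1000 random values, so the fixture probes many distinct underlying bit patterns in both the small-value code path and the general case.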
I can see that being an issue. Is there a place where I could look at an example of the testing approach you are describing?
@justin1dennison See this file. While it does not use random values, we try to sample different bit patterns by ensuring that increments are not "nice": in this case, instead of 1000 values, we generate 1007, thus making the increment (hopefully) an irrational number, or at least a number which does not have a "regular" underlying bit pattern.
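The "non-nice increment" idea can be sketched as follows. The endpoints reuse the fixture fragment earlier in this thread and the count 1007 follows the comment above, but this is an illustration, not the linked file:

```python
import numpy as np

# Values spaced by exactly 0.25 repeat similar significand patterns, since
# 0.25 is a simple binary fraction. Spreading 1007 values over the same
# range yields a step of 4.75/1006 (~0.00472), which is not a simple binary
# fraction, so successive values exercise more varied bit patterns.
grid = np.arange(0.25, 5.0, 0.25)      # "nice" 0.25 step
varied = np.linspace(0.25, 5.0, 1007)  # "non-nice" step
```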
@justin1dennison I updated
Force-pushed f23d392 to e496e08 (… boxcox to address new test failures.)
Force-pushed e496e08 to 4d705b9
@kgryte I think that I have resolved all of the suggestions you made. Again, thanks for taking the time. Additionally, I changed the tests to fall in line with the linked file you provided. I accidentally closed the pull request earlier because I fat-fingered it. If I need to fix anything else, please let me know.
@justin1dennison Awesome! Will take a final look shortly!
kgryte left a comment
@justin1dennison Great work! Thanks for working on this!
Resolves #117.
Checklist

Description

This pull request:

Related Issues

This pull request:

Questions

No.

Other

No.
@stdlib-js/reviewers