Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/2431 lang tuples step1 #2498

Merged
merged 145 commits into from Aug 3, 2018
Merged

Conversation

mitzimorris
Copy link
Member

Submission Checklist

  • Run unit tests: ./runTests.py src/test/unit
  • Run cpplint: make cpplint
  • Declare copyright holder and open-source license: see below

Summary

Refactor of stan/lang AST needed in order to extend Stan language to tuples. See issue #2431 for details.

Intended Effect

Everything should work exactly as before.

How to Verify

Run all unit tests.

Side Effects

Changes to the AST nodes required changes to the generator code. Previously the AST contained a set of variant types for variable declarations, with the result that the generator code required many vistor classes to generate the series of c++ statements which correspond to a single Stan language variable declaration or compound declaration-definition. In rewriting the generator methods, I changed the generated code itself in order to remove duplicate checks on container sizes and improve overall readability. Significant changes to the generated code:

  1. Previously first all variable dimensions were validated, then variables were initialized. Now validation is followed directly by initialization, which means that some failing programs may fail differently.

  2. Double literal variables now store the actual string value from the program. Previously, generated c++ code used the stored double value, e.g. double literal "5.1" would be initialized to not exactly 5.1, depending on the compiler. Now the string value "5.1" is generated.

  3. spaces after commas in args, to improve readibility.

Documentation

N/A.

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Columbia University

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

@syclik
Copy link
Member

syclik commented Jul 18, 2018 via email

@syclik
Copy link
Member

syclik commented Jul 18, 2018 via email

@mitzimorris
Copy link
Member Author

src/test/unit/services/util/generate_transitions_test.cpp:7:10: fatal error: 'test/test-models/good/optimization/rosenbrock.hpp' file not found

that could be a missing dependency in the make/tests makefile - I'll check.

@syclik
Copy link
Member

syclik commented Jul 18, 2018 via email

@mitzimorris
Copy link
Member Author

right - that is the error.

but regarding the other, it does seem that file make/tests needs a fix as well -

src/test/unit/services/util/generate_transitions_test.cpp: src/test/test-models/good/services/test_lp.hpp src/test/test-models/good/optimization/rosenbrock.hpp

@syclik
Copy link
Member

syclik commented Jul 18, 2018 via email

@mitzimorris
Copy link
Member Author

thanks @syclik - that's a legit error - investigating

@bob-carpenter bob-carpenter merged commit 1ce1d9d into develop Aug 3, 2018
@bob-carpenter bob-carpenter deleted the feature/2431-lang-tuples-step1 branch August 3, 2018 16:01
@seantalts
Copy link
Member

Hey, this broke existing models and literally doesn't return all of the parameters for the SIR model. I thought this was a Math issue first, see stan-dev/math#972

Should we revert while this undergoes more testing?

@mitzimorris
Copy link
Member Author

yes, absolutely.

@mitzimorris
Copy link
Member Author

@seantalts which models besides SIR model are problematic?

@bob-carpenter
Copy link
Contributor

bob-carpenter commented Aug 9, 2018 via email

@seantalts
Copy link
Member

I don't know the full extent of models that have issues because my test program is currently stopping at the first error like this it encounters (I'm trying to fix that but I'm tied up all day with Rok and Erik).

The tests run here: http://d1m1s1b1.stat.columbia.edu:8080/blue/organizations/jenkins/CmdStan%20Performance%20Tests/detail/master/214/pipeline/

The repo (with instructions) is here: https://github.com/stan-dev/performance-tests-cmdstan

@mitzimorris
Copy link
Member Author

there's an error with the indexing logic - for the SIR model, list of parameters looks like:

y.1.1,y.2.1,y.3.1,y.4.1

where it should be

y.1.1, y.1.2, y.1.3, y.1.4,

the underlying problem is that we need more tests on generator - the Stan test suite currently doesn't do much end-to-end testing, other than tests for advi.
the benchmark tests are useful as good tests on the generator - some subset of them should be added to the Stan test suite.

@mitzimorris mitzimorris restored the feature/2431-lang-tuples-step1 branch August 9, 2018 15:03
@seantalts
Copy link
Member

Nice find. Just to copy slightly relevant info from the other thread:
I don't know the correct name, but these tests also test that HMC is still recovering means within 30% of the old estimate. Similar to the stat comp benchmarks.

@mitzimorris
Copy link
Member Author

I've made a lot of mistakes in doing this PR. I should have had the discipline to leave the generated .hpp code exactly as is and not change the output to be more readable and save that for a second PR. mea culpa.

As a way forward, I'm writing a test plan. The specific bug that the benchmarks test uncovered is that the new generator code is getting mixed up on the order of array indexes. This isn't surprising - the whole point of this PR is to refactor the way that dimensions on container types are represented in the AST. Here is what I have so far w/r/t a test plan for current bug:

Multi-dimensional block-level variables in Stan programs.

Block variables can have constraints/specialized types;
constraints enforced in program block in which they occur.
Block variables are declared with sizes.

Example: matrix<lower=1,upper=1>[2,3] ar_mat[4,5];

Verify that refactor contains tests for all AST variable decl nodes and type:

  • variable ar_mat is block_var_decl

  • block_var_decl.type() = array_block_type

  • array_block_type.element_type() is array_block_type

  • array_dims = 2

  • array_lens = [4,5]

  • array_block_type contains matrix_type

  • matrix_type has bounds

  • matrix_type M_ (arg1) = 2

  • matrix_type N_ (arg2) = 3

Multi-dimensional local variables in Stan programs.

Local variables are declared with sizes.

functions {
 function void foo() {
  matrix[2,3] local_ar_mat[4,5];
 }
}

Verify that refactor contains tests for all AST variable decl nodes and type:

  • variable ar_mat is local_var_decl

  • local_var_decl.type() = array_local_type

  • array_local_type.element_type() is array_local_type

  • array_dims = 2

  • array_lens = [4,5]

  • array_local_type contains matrix_type

  • matrix_type M_ (arg1) = 2

  • matrix_type N_ (arg2) = 3

Multi-dimensional bare variables in Stan programs

Bare type array variables are declared with number of dimensions.
Bare type variables don't have constraints or sizes.
Used in function declarations

functions {
 function void foo(matrix ar_mat[,]) {
   int a = 0;
 }
}

Program 1: data block only, single variable decl:

data {
 matrix<lower=1,upper=1>[2,3] ar_mat[4,5];
 }

Parser tests: AST for entire program as expected

Generator tests:

check .hpp file for:

  • generated code for variable declation
  • generated code for variable initialization

program end-to-end test:

  • run program, good inputs
  • run program, inputs wrong shape
  • run program, inputs correct shape, violate bounds constraint

Program 2: transformed data block only, single variable decl:

transformed data {
 matrix<lower=1,upper=1>[2,3] ar_mat[4,5];
}

check .hpp file for:

  • generated code for variable declation
  • generated code for variable initialization

program end-to-end test:

  • run program - should run

Programs 3,4,5 : parameters / transformed parameters /generated quantities block only:

parameters {
 matrix<lower=1,upper=1>[2,3] ar_mat[4,5];
}

check .hpp file for:

  • generated code for variable declation
  • generated code for variable initialization

program end-to-end test:

  • run program - should run
  • output should list all parameters in order:
    ar_mat.1.1.1.1, ar_mat.1.1.1.2, ... ar_mat.4.5.2.2, ar_mat.4.5.2.3

is there a GitHub way to add test plans to PRs?
comments on this one?

@mitzimorris
Copy link
Member Author

mitzimorris commented Aug 12, 2018

filed issue #2604 - fixing on branch bugfix/2406-generator-multidim-params

mitzimorris added a commit that referenced this pull request Aug 16, 2018
…es-step1"

This reverts commit 1ce1d9d, reversing
changes made to 11d5b7b.
@mcol mcol deleted the feature/2431-lang-tuples-step1 branch January 18, 2020 12:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants