Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug in ordered_logistic_lpmf derivative #1248

Closed
t4c1 opened this issue May 21, 2019 · 19 comments

Comments

@t4c1
Copy link
Contributor

commented May 21, 2019

Description

I think there is a bug in ordered_logistic_lpmf() in calculation of derivative wrt c. On line 139 there is ops_partials.edge2_.partials_vec_[n](0) = d;. This overwrites any value that was calculated in previous iterations of the loop. I think there should be ops_partials.edge2_.partials_vec_[n](0) += d; (+= instead of =).

I found this while implementing a GLM for this distribution and derivatives did not match.

Can someone check, whether this line of thinking is correct?

Current Version:

v2.19.1

@andrjohns

This comment has been minimized.

Copy link
Collaborator

commented May 21, 2019

Hi @t4c1!

I wrote this lpmf and that is not a bug. There is no previous value to overwrite because ops_partials.edge2_.partials_vec_[n](0) is only visited once for each n

Can you link me to a branch or a test where the derivatives are coming out wrong?

@t4c1

This comment has been minimized.

Copy link
Contributor Author

commented May 21, 2019

I think this is happening for the case where c is vector (not a matrix). It seems to me that in that case that n gets ignored by the broadcast_array partials_vec_ so it is actually visited multiple times.

I did not write any test specifically for this. As I am developing a GLM for this distribution I test it by comparing the derivatives to this lpmf. I dont have it on git yet as it is work in progress.

@t4c1

This comment has been minimized.

Copy link
Contributor Author

commented May 21, 2019

If you want to see the test anyway I can push it on github.

@andrjohns

This comment has been minimized.

Copy link
Collaborator

commented May 21, 2019

Do you mean std::vector or Eigen::Vector?

When you say "n is ignored" do you mean that regardless of the value of n, ops_partials.edge2_.partials_vec_[n](0) always refers to the same value?

The derivatives of this are actually pretty thoroughly tested, if you have a look at the tests in mix/fwd/rev, there are multiple places that it would fail if it wasn't probably iterating through the derivatives for c

@andrjohns

This comment has been minimized.

Copy link
Collaborator

commented May 21, 2019

If you could push that test, that'd be great too!

@t4c1

This comment has been minimized.

Copy link
Contributor Author

commented May 21, 2019

I'm using Eigen::Vector, but I dont think it should matter.

Yes.

I checked out rev tests. In all cases at most one y has value 1 so this would be missed.

@andrjohns

This comment has been minimized.

Copy link
Collaborator

commented May 21, 2019

Right, but if n isn't properly iterating through ops_partials.edge2_.partials_vec_ , wouldn't all of the derivatives be wrong? Not just the cases when y = 1?

@t4c1

This comment has been minimized.

Copy link
Contributor Author

commented May 21, 2019

Other branches do have that += instead of =

@t4c1

This comment has been minimized.

Copy link
Contributor Author

commented May 21, 2019

@andrjohns

This comment has been minimized.

Copy link
Collaborator

commented May 21, 2019

You were right! I added this to the rev test:

TEST(ProbDistributionsOrdLog, vv_vec_y1) {
  using stan::math::ordered_logistic_lpmf;
  using stan::math::var;
  using stan::math::vector_d;
  using stan::math::vector_v;

  std::vector<int> y{1, 1, 1, 1};

  vector_v lam_v(4);
  lam_v << -2.95, -1.68, 0.96, 2.68;

  vector_v c_v(3);
  c_v << -2.68, -1.53, 0.46;

  vector_d c_adj(4);
  c_adj << 0.4329070950345456674650,
                 0.7310585786300048792511,
                 0.9744192116879923125450,
                 0.9953210890136492969744;


  AVAR out_v = ordered_logistic_lpmf(y, lam_v, c_v);
  out_v.grad();

  EXPECT_FLOAT_EQ(lam_v[0].adj(), -c_adj[0]);
  EXPECT_FLOAT_EQ(lam_v[1].adj(), -c_adj[1]);
  EXPECT_FLOAT_EQ(lam_v[2].adj(), -c_adj[2]);
  EXPECT_FLOAT_EQ(lam_v[3].adj(), -c_adj[3]);

  EXPECT_FLOAT_EQ(c_v[0].adj(), c_adj.sum());
  EXPECT_FLOAT_EQ(c_v[1].adj(), 0.0);
  EXPECT_FLOAT_EQ(c_v[2].adj(), 0.0);
}

And it fails with :

test/unit/math/rev/mat/prob/ordered_logistic_test.cpp:329: Failure
Expected equality of these values:
  c_v[0].adj()
    Which is: 0.99532109
  c_adj.sum()
    Which is: 3.1337061

So it is expecting the partial derivative of the last y, and not the sum of the partials.

After doing some digging, it looks like the problem is in the use of ops_partials.edge2_.partials_vec_ when edge2 is not a nested container. Because the partials_vec_ member is used for std::vector<vector> inputs, when it's used with a vector input it ignores the iterator because there's only a single vector to iterate through. This was entirely me misunderstanding how partials_vec_ worked.

Thanks very much for catching this one!

@andrjohns

This comment has been minimized.

Copy link
Collaborator

commented May 21, 2019

@bob-carpenter @syclik @seantalts

I'm very sorry, but I've introduced a bug with ordered_logistic_lpmf.

When calling ordered_logistic_lpmf(int[] y, vector lambda, vector cutpoints), the wrong derivatives for the cutpoints are returned if y contains multiple 1's.

The problem does not occur for calls of ordered_logistic_lpmf(int y, double lambda, vector cutpoints) or ordered_logistic_lpmf(int[] y, vector lambda, vector[] cutpoints)

I'll submit a PR straight away, sorry again!

@seantalts

This comment has been minimized.

Copy link
Member

commented May 21, 2019

Thank you both! Don't worry about bugs - they happen to everyone. These distributions are particularly hard to test with all of their combinatoric instantiations - we have to figure out how to make it easier to write the code as well as automate property-based testing for things like distributions and gradients that we should be able to verify automatically. In this particular case, is there a way to add ordered_logistic to the distribution tests? I noticed it just got manual tests.

@andrjohns

This comment has been minimized.

Copy link
Collaborator

commented May 21, 2019

Yep I can add distribution tests for it, that shouldn't be a problem

@syclik

This comment has been minimized.

Copy link
Member

commented May 21, 2019

Thanks, all, for taking care of it so quickly!

If you can, try the distribution test on develop just to make sure it catches it first. If you need help implementing, please reach out.

@betanalpha

This comment has been minimized.

Copy link
Contributor

commented May 21, 2019

@bob-carpenter

This comment has been minimized.

Copy link
Contributor

commented May 21, 2019

Well, that would explain why it's been reported to be so slow to fit!

@andrjohns

This comment has been minimized.

Copy link
Collaborator

commented May 21, 2019

@betanalpha The changes were introduced in the 2.19 release

@betanalpha

This comment has been minimized.

Copy link
Contributor

commented May 21, 2019

@drezap

This comment has been minimized.

Copy link
Collaborator

commented Jun 19, 2019

this issue has been resolved.

@drezap drezap closed this Jun 19, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants
You can’t perform that action at this time.