Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added marginal flag to segment.nce #227

Merged
merged 6 commits into from Mar 3, 2017

Conversation

bmcfee
Copy link
Collaborator

@bmcfee bmcfee commented Nov 1, 2016

Implements #226

@bmcfee
Copy link
Collaborator Author

bmcfee commented Feb 9, 2017

@craffel do you have any opinions on how tests for this one should look? Some options:

  1. Add a simple unit test that calls with a simple test case that you can check manually for marginal=True. Something like [A, B], [a, b, c, b].

  2. Add a wrapper function vmeasure (or whatever we decide to call it from RFC: fixing the segment.nce metrics? #226) that overrides marginal and calls the nce metrics. Then add full regression test outputs as before. I'm confident enough in the correctness of the implementation that correctness doesn't worry me.

If we do go the regression test route, I think we'd need to add the wrapper function so that the results are kept distinct from the existing NCE metrics.

What do you think?

@craffel
Copy link
Owner

craffel commented Feb 9, 2017

Regression would be good, both would be good too. Are you planning on making it so that the evaluate method calls the function both with marginal True and False? (Sorry, may have to jog my memory on this a bit)

@bmcfee
Copy link
Collaborator Author

bmcfee commented Feb 9, 2017

Are you planning on making it so that the evaluate method calls the function both with marginal True and False?

Your call. If we do, I'd advocate doing it via a wrapper/rename, instead of having two separate output dict entries. It will make it easier to keep things separate down the road.

@bmcfee
Copy link
Collaborator Author

bmcfee commented Feb 9, 2017

Added bonus of the wrapper solution: we can encourage people to deprecate nce in favor of v-measure, and all they have to do is ignore columns (rather than change defaults).

@craffel
Copy link
Owner

craffel commented Feb 9, 2017

If the goal is to eventually deprecate marginal=False, then I agree that it doesn't make sense to compute both in evaluate. But, then you need to decide whether evaluate will call it with marginal=True or marginal=False.

Regarding the test, whichever is the non-default behavior (i.e. whichever is NOT called in evaluate) is OK to just have a unit test. The other should be regression tested, of course.

Does that seem reasonable?

@bmcfee
Copy link
Collaborator Author

bmcfee commented Feb 9, 2017

If the goal is to eventually deprecate marginal=False, then I agree that it doesn't make sense to compute both in evaluate.

I guess it's a matter of whether we deprecate marginal=False and still call it NCE, or deprecate NCE entirely in favor of v-measure. I prefer the latter, since it would reduce possibility for mis-interpretation.

@craffel
Copy link
Owner

craffel commented Feb 9, 2017

I guess it's a matter of whether we deprecate marginal=False and still call it NCE, or deprecate NCE entirely in favor of v-measure. I prefer the latter, since it would reduce possibility for mis-interpretation.

When do you propose we do the deprecation? What was the community consensus?

@bmcfee
Copy link
Collaborator Author

bmcfee commented Feb 9, 2017

When do you propose we do the deprecation? What was the community concensus?

Predictably, there was no obvious consensus. I think the best thing to do is implement it now, then pitch it in the unconference at the next ismir so that people have to actually respond.

@craffel
Copy link
Owner

craffel commented Feb 9, 2017

SG. I would advocate for a unit test for v-measure and regression test for the current behavior, and then switching to regression for v-measure and unit for current behavior once it's deprecated.

@bmcfee
Copy link
Collaborator Author

bmcfee commented Feb 9, 2017

I would advocate for a unit test for v-measure and regression test for the current behavior, and then switching to regression for v-measure and unit for current behavior once it's deprecated.

Cool. OTOH, we could also just regression-test both if it's going to become another entry in the evaluate output. I only propose this because it will be the minimal modification to the existing test structure, and LIS, the change is simple enough that I feel confident in the correctness without a specific unit test.

@craffel
Copy link
Owner

craffel commented Feb 9, 2017

OTOH, we could also just regression-test both if it's going to become another entry in the evaluate output

I thought we just decided it wasn't going to become another output of evaluate.

@bmcfee
Copy link
Collaborator Author

bmcfee commented Feb 9, 2017

I thought we just decided it wasn't going to become another output of evaluate.

Hah, I thought we just decided the opposite? Let me be totally clear about what I think makes the most sense:

  1. Add a new function vmeasure that's equivalent to nce(..., margin=True). This way, we don't cause confusion between the new and old style.
  2. Add vmeasure to the evaluate outputs. Old-style NCE is still included.
  3. Add regression tests for the extended dictionary.
  4. Eventually, deprecate nce and keep vmeasure. (Or, if not deprecate in the library, encourage people not to use it.)

EDIT: forgot to address the confusing point. The above would be in contrast to doing something like:

{'NCE Over': ..., 'NCE Over (marginal)': ..., ...}

where the marginal=True case is added as an alternate version of NCE. I think that would only be confusing.

@craffel
Copy link
Owner

craffel commented Feb 9, 2017

Ok, sounds fine. For evaluate, I assume you will just make a new key in the output dict which is v_measure or something.

@bmcfee bmcfee force-pushed the segment-nce-marginal-entropy branch from d6ef7bd to 07646c1 Compare February 24, 2017 18:51
@craffel
Copy link
Owner

craffel commented Feb 27, 2017

What is left on this other than wait for the other tests to stop failing?

@bmcfee
Copy link
Collaborator Author

bmcfee commented Feb 27, 2017

What is left on this other than wait for the other tests to stop failing?

Well, I'd need to actually implement the separate wrapper function and top-level evaluate hooks. I've been holding off until separation rights itself.

@bmcfee
Copy link
Collaborator Author

bmcfee commented Feb 28, 2017

Okay, this one's up to speed. I added vmeasure as a function, hooked it into evaluate, and updated the tests and fixtures.

I noticed that the reference section is missing in the segmentation docstring. Do you want me to add one while we're at it?

@bmcfee bmcfee requested a review from craffel February 28, 2017 21:13
@bmcfee bmcfee added this to the 0.5 milestone Feb 28, 2017
@bmcfee
Copy link
Collaborator Author

bmcfee commented Feb 28, 2017

@craffel I had to move the scipy version requirement here because it turns out scipy.stats.entropy did not exist prior to 0.14. At this point, that release is almost 3 years old, so I doubt this is a huge deal.

Copy link
Owner

@craffel craffel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requesting some documentation fixes.

marginal : bool
If `False`, normalize conditional entropy by uniform entropy.
If `True`, normalize conditional entropy by the marginal entropy.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add (Default value = False) for consistency within this docstring.

true_given_est = p_est.dot(scipy.stats.entropy(contingency, base=2))
pred_given_ref = p_ref.dot(scipy.stats.entropy(contingency.T, base=2))

if marginal:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comment to the effect of # Normalize conditional entropy by the marginal entropy

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done below

marginal : bool
If `False`, normalize conditional entropy by uniform entropy.
If `True`, normalize conditional entropy by the marginal entropy.

Returns
-------
S_over
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we change the math in this part of the docstring to note that it can be different according to the value of marginal? For example denominator should be H(y_ref) right? Or at least, we should change them in the v_measure docstring right?

@craffel
Copy link
Owner

craffel commented Mar 1, 2017

I noticed that the reference section is missing in the segmentation docstring. Do you want me to add one while we're at it?

Sure!

@craffel
Copy link
Owner

craffel commented Mar 1, 2017

@craffel I had to move the scipy version requirement here because it turns out scipy.stats.entropy did not exist prior to 0.14. At this point, that release is almost 3 years old, so I doubt this is a huge deal.

No problem.

@bmcfee
Copy link
Collaborator Author

bmcfee commented Mar 1, 2017

I think I've hit all the review points. @craffel last pass?

@craffel
Copy link
Owner

craffel commented Mar 1, 2017

Last thing to check - did you rebuild the docs and make sure there are no new errors/glitches now that the refs are there?

@bmcfee
Copy link
Collaborator Author

bmcfee commented Mar 1, 2017

I have built the docs locally and didn't notice anything bad, but I wasn't looking too carefully.

@craffel
Copy link
Owner

craffel commented Mar 1, 2017

I have built the docs locally and didn't notice anything bad, but I wasn't looking too carefully.

Look carefully at the changes caused by this PR and I will be happy :)

@bmcfee
Copy link
Collaborator Author

bmcfee commented Mar 1, 2017

Look carefully at the changes caused by this PR and I will be happy :)

Done. Fixed a couple of minor formatting issues, but I think it's solid.

@craffel craffel merged commit a9c4d60 into craffel:master Mar 3, 2017
@craffel
Copy link
Owner

craffel commented Mar 3, 2017

Merged, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants