Skip to content

support unsorted segment intervals#203

Merged
bmcfee merged 1 commit intomir-evaluation:masterfrom
bmcfee:interval_sort
Jun 13, 2016
Merged

support unsorted segment intervals#203
bmcfee merged 1 commit intomir-evaluation:masterfrom
bmcfee:interval_sort

Conversation

@bmcfee
Copy link
Copy Markdown
Collaborator

@bmcfee bmcfee commented Jun 10, 2016

This is a bugfix for issue #202.

@bmcfee bmcfee added the bug label Jun 10, 2016
@bmcfee bmcfee added this to the 0.4 milestone Jun 10, 2016
@bmcfee
Copy link
Copy Markdown
Collaborator Author

bmcfee commented Jun 10, 2016

TODO:

  • Support unsorted intervals in util.adjust_intervals
  • In general, replace implicit start time intervals[0, 0] and end-time intervals[-1, 1] by calls to min() and max() unless we've explicitly sorted the data already.
  • Explicit test for util.sort_intervals
  • Rewrite util.interpolate_intervals to support discontinuities; this may be better done as a separate PR though.

@bmcfee bmcfee changed the title [wip] support unsorted segment intervals support unsorted segment intervals Jun 10, 2016
@bmcfee
Copy link
Copy Markdown
Collaborator Author

bmcfee commented Jun 10, 2016

I think we should punt on rewriting interpolate_intervals since it's non-essential to fixing the current issue with implicitly ordered timings. AFAIK, there's no need for label interpolation with incomplete interval coverage, so we can deal with that if/when it comes up.

Otherwise, I think this one's ready for CR from @craffel or @urinieto . (Should be an easy one.)

@bmcfee
Copy link
Copy Markdown
Collaborator Author

bmcfee commented Jun 10, 2016

(PS: speedy resolution of this would be appreciated, as it's stalling a project i'm working on.)

Comment thread mir_eval/segment.py
if intervals.size > 0:
# Make sure intervals start at 0
if not np.allclose(intervals[0, 0], 0.0):
if not np.allclose(intervals.min(), 0.0):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This, and the max call below, could be made 2x faster as intervals[:, 0].min() and intervals[:, 1].max() respectively.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope 😁 the slice construction dominates runtime for the range of values we care about.

N = 100
x = np.random.randn(N, 2)

%timeit x.min()
100000 loops, best of 3: 1.9 µs per loop

%timeit x[:, 0].min()
100000 loops, best of 3: 1.92 µs per loop

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, there you go.

@craffel
Copy link
Copy Markdown
Collaborator

craffel commented Jun 10, 2016

Looks reasonable to me. I am not an expert on interpolate_intervals, so it might be useful if another user chimed in to make sure the change makes sense, but if it's doing what you want it to do it seems ok to me.

@bmcfee
Copy link
Copy Markdown
Collaborator Author

bmcfee commented Jun 10, 2016

Cool, thanks.

I didn't make any substantive changes to interpolate_intervals (all previous tests still pass). The function itself is strange and a little broken IMO, but this isn't the right place to fix it.

The logic is: for each time point t, find the first interval k that starts after t, go back to the previous segment k-1, and apply that segment's label to t. This obviously requires ordered intervals to work, ie, interval[k-1] immediately precedes interval[k].

@craffel
Copy link
Copy Markdown
Collaborator

craffel commented Jun 10, 2016

Ok, cool.

@bmcfee
Copy link
Copy Markdown
Collaborator Author

bmcfee commented Jun 11, 2016

@craffel thanks for the CR. Unless you object, I think this is merge-ready.

@craffel
Copy link
Copy Markdown
Collaborator

craffel commented Jun 13, 2016

Feel free to squash and merge.

simplified interval sorting, adjust_intervals no longer needs ordered input

added tests for util.sort_labeled_intervals
@bmcfee bmcfee merged commit dc45756 into mir-evaluation:master Jun 13, 2016
@bmcfee bmcfee deleted the interval_sort branch June 13, 2016 18:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants