OPENNLP-1026: Replace references and usages of opennlp.tools.util.Heap with java.util.SortedSet #162
Conversation

Coverage decreased (-0.08%) to 56.158% when pulling 5f0be72 on smarthi:OPENNLP-1026 into 2721401 on apache:master.

+1

The Heap that was used before re-orders the elements inserted into it; the Deque does not do that.
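For illustration only (this code is not from the PR, and the class name is made up), a minimal sketch of the ordering difference being described: a PriorityQueue re-orders elements according to their comparator, while a Deque simply preserves insertion order:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.PriorityQueue;
import java.util.Queue;

public class OrderingDemo {
  public static void main(String[] args) {
    // A PriorityQueue re-orders on insertion: poll() always returns the smallest element.
    Queue<Integer> heapLike = new PriorityQueue<>();
    heapLike.add(3);
    heapLike.add(1);
    heapLike.add(2);
    System.out.println(heapLike.poll()); // prints 1

    // An ArrayDeque keeps plain insertion order: poll() returns the element added first.
    Deque<Integer> deque = new ArrayDeque<>();
    deque.add(3);
    deque.add(1);
    deque.add(2);
    System.out.println(deque.poll()); // prints 3
  }
}
```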

I also tried this once, and as far as I remember there is one place where the Heap is iterated. A heap doesn't have a well-defined iteration order, so the iteration order differs between PriorityQueue and the Heap, but I never managed to run benchmarks to see whether it makes a real difference in performance. Let's first find the performance issue that was introduced, and then we can merge this one afterwards.
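To make the iteration-order point concrete, here is a small sketch of my own (not code from the patch): iterating a java.util.PriorityQueue does not visit elements in priority order, so code that iterated the old Heap may observe a different ordering:

```java
import java.util.PriorityQueue;

public class IterationOrderDemo {
  public static void main(String[] args) {
    PriorityQueue<Integer> pq = new PriorityQueue<>();
    for (int i : new int[] {5, 1, 4, 2, 3}) {
      pq.add(i);
    }

    // The iterator reflects the internal binary-heap array, not sorted order;
    // for this insertion sequence it may print 1 2 4 5 3.
    for (int i : pq) {
      System.out.print(i + " ");
    }
    System.out.println();

    // Draining with poll() is what yields priority order: 1 2 3 4 5.
    while (!pq.isEmpty()) {
      System.out.print(pq.poll() + " ");
    }
  }
}
```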

PriorityQueue orders elements based on the specified comparator, so with natural ordering the smallest element is always the head of the queue and it behaves like a heap, but it doesn't guarantee that the largest element is the last element. Since we also need to retrieve the largest (or last) element, it seemed to make sense to use a LinkedList, but you are right that the ordering is then not available. Let's revisit this later, post 1.8.0.
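As an illustration of that point (my own example, not code from this PR): with natural ordering the head of a PriorityQueue is the minimum, but there is no cheap accessor for the maximum, so retrieving the largest element needs a scan or a sort:

```java
import java.util.Collections;
import java.util.PriorityQueue;

public class MinMaxDemo {
  public static void main(String[] args) {
    PriorityQueue<Integer> pq = new PriorityQueue<>();
    for (int i : new int[] {5, 1, 4, 2, 3}) {
      pq.add(i);
    }

    // peek() is guaranteed to return the smallest element in O(1).
    System.out.println("min: " + pq.peek()); // 1

    // There is no O(1) accessor for the largest element; a linear scan
    // (here via Collections.max) or a full sort is needed.
    System.out.println("max: " + Collections.max(pq)); // 5
  }
}
```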

In my test I used sorting to retrieve the last element, since that case is in a less frequent code path than retrieving the first element. I did some tests and didn't manage to see a runtime difference due to the increased cost of retrieving the last element. About the iteration problem: it might be a good idea to iterate the parses in sorted order anyway, rather than in the order the heap maintains internally. Let's run some tests and see how this influences the performance of the parser.
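A hedged sketch of the sorting approach described above (the class and helper names are illustrative, not the actual parser code): copy the queue's contents, sort the copy, and take the last element on the infrequent path:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

public class LastElementBySorting {
  // Illustrative helper, not part of OpenNLP: retrieves the largest element
  // of a priority queue by sorting a snapshot of its contents.
  static <T extends Comparable<T>> T last(PriorityQueue<T> pq) {
    List<T> snapshot = new ArrayList<>(pq);
    Collections.sort(snapshot);
    return snapshot.isEmpty() ? null : snapshot.get(snapshot.size() - 1);
  }

  public static void main(String[] args) {
    PriorityQueue<Integer> pq = new PriorityQueue<>();
    Collections.addAll(pq, 5, 1, 4, 2, 3);
    System.out.println(last(pq)); // 5, at O(n log n) cost on the less frequent path
  }
}
```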

Coverage decreased (-0.08%) to 57.298% when pulling 5b5e051 on smarthi:OPENNLP-1026 into bbbb431 on apache:master.

Coverage decreased (-0.08%) to 57.298% when pulling 63a7d5a on smarthi:OPENNLP-1026 into bbbb431 on apache:master.

Updated the PR to use a SortedSet now; minimal impact to the existing code.
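For context, a minimal sketch (assuming a TreeSet-backed SortedSet with natural ordering; this is not the exact code in the PR) of why a SortedSet gives cheap access to both ends and a deterministic iteration order:

```java
import java.util.SortedSet;
import java.util.TreeSet;

public class SortedSetDemo {
  public static void main(String[] args) {
    // TreeSet keeps elements sorted by natural ordering (or a supplied Comparator).
    SortedSet<Integer> parses = new TreeSet<>();
    for (int i : new int[] {5, 1, 4, 2, 3}) {
      parses.add(i);
    }

    System.out.println(parses.first()); // 1 - smallest element, like the old heap's head
    System.out.println(parses.last());  // 5 - largest element, without any extra sorting

    // Iteration is in sorted order, so it is deterministic,
    // unlike iterating a binary heap's internal array.
    System.out.println(parses);         // [1, 2, 3, 4, 5]
  }
}
```

One difference to keep in mind with a Set-based replacement: elements that compare as equal are collapsed into a single entry, which the old Heap did not do, so the comparator used for the stored parses matters.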

Coverage decreased (-0.08%) to 57.298% when pulling e5b4ab7 on smarthi:OPENNLP-1026 into bbbb431 on apache:master.

Coverage decreased (-0.08%) to 57.298% when pulling b86ecd8 on smarthi:OPENNLP-1026 into bbbb431 on apache:master.

Coverage decreased (-0.08%) to 57.298% when pulling 3b87797 on smarthi:OPENNLP-1026 into bbbb431 on apache:master.

Closing this PR; will make a fresh PR.
Thank you for contributing to Apache OpenNLP.
In order to streamline the review of the contribution we ask you to ensure the following steps have been taken:
For all changes:
- Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
- Does your PR title start with OPENNLP-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
- Has your PR been rebased against the latest commit within the target branch (typically master)?
- Is your initial contribution a single, squashed commit?
For code changes:
For documentation related changes:
Note:
Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.