OPENNLP-1026: Replace references and usages of opennlp.tools.util.Heap with java.util.SortedSet #162
Conversation

Coverage decreased (-0.08%) to 56.158% when pulling 5f0be72 on smarthi:OPENNLP-1026 into 2721401 on apache:master.

+1

The Heap that was used before re-orders the elements inserted into it; the Deque does not do that.
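For illustration only (this code is not from the PR, and the class name is made up), a minimal sketch of the ordering difference being described: a PriorityQueue re-orders elements according to their comparator, while a Deque simply preserves insertion order:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.PriorityQueue;
import java.util.Queue;

public class OrderingDemo {
  public static void main(String[] args) {
    // A PriorityQueue re-orders on insertion: poll() always returns the smallest element.
    Queue<Integer> heapLike = new PriorityQueue<>();
    heapLike.add(3);
    heapLike.add(1);
    heapLike.add(2);
    System.out.println(heapLike.poll()); // prints 1

    // An ArrayDeque keeps plain insertion order: poll() returns the element added first.
    Deque<Integer> deque = new ArrayDeque<>();
    deque.add(3);
    deque.add(1);
    deque.add(2);
    System.out.println(deque.poll()); // prints 3
  }
}
```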

I also tried this once, and as far as I remember there is one place where the Heap is iterated. A heap doesn't have a well-defined iteration order, so the iteration order differs between PriorityQueue and the Heap, but I never managed to run benchmarks to see whether it makes a real difference in performance. Let's first find the performance issue that was introduced, and then we can merge this one afterwards.
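To make the iteration-order point concrete, here is a small sketch of my own (not code from the patch): iterating a java.util.PriorityQueue does not visit elements in priority order, so code that iterated the old Heap may observe a different ordering:

```java
import java.util.PriorityQueue;

public class IterationOrderDemo {
  public static void main(String[] args) {
    PriorityQueue<Integer> pq = new PriorityQueue<>();
    for (int i : new int[] {5, 1, 4, 2, 3}) {
      pq.add(i);
    }

    // The iterator reflects the internal binary-heap array, not sorted order;
    // for this insertion sequence it may print 1 2 4 5 3.
    for (int i : pq) {
      System.out.print(i + " ");
    }
    System.out.println();

    // Draining with poll() is what yields priority order: 1 2 3 4 5.
    while (!pq.isEmpty()) {
      System.out.print(pq.poll() + " ");
    }
  }
}
```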

PriorityQueue orders elements based on the specified comparator, so with natural ordering the smallest element is always the head of the queue and it behaves like a heap, but it doesn't guarantee that the largest element is the last element. Since we also need to retrieve the largest (or last) element, it seemed to make sense to use a LinkedList, but you are right that the ordering is then not available. Let's revisit this later, post 1.8.0.
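As an illustration of that point (my own example, not code from this PR): with natural ordering the head of a PriorityQueue is the minimum, but there is no cheap accessor for the maximum, so retrieving the largest element needs a scan or a sort:

```java
import java.util.Collections;
import java.util.PriorityQueue;

public class MinMaxDemo {
  public static void main(String[] args) {
    PriorityQueue<Integer> pq = new PriorityQueue<>();
    for (int i : new int[] {5, 1, 4, 2, 3}) {
      pq.add(i);
    }

    // peek() is guaranteed to return the smallest element in O(1).
    System.out.println("min: " + pq.peek()); // 1

    // There is no O(1) accessor for the largest element; a linear scan
    // (here via Collections.max) or a full sort is needed.
    System.out.println("max: " + Collections.max(pq)); // 5
  }
}
```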

In my test I used sorting to retrieve the last element, since that case is in a less frequent code path than retrieving the first element. I did some tests and didn't manage to see a runtime difference due to the increased cost of retrieving the last element. About the iteration problem: it might be a good idea to iterate the parses in sorted order anyway, rather than in the order the heap maintains internally. Let's run some tests and see how this influences the performance of the parser.
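A hedged sketch of the sorting approach described above (the class and helper names are illustrative, not the actual parser code): copy the queue's contents, sort the copy, and take the last element on the infrequent path:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

public class LastElementBySorting {
  // Illustrative helper, not part of OpenNLP: retrieves the largest element
  // of a priority queue by sorting a snapshot of its contents.
  static <T extends Comparable<T>> T last(PriorityQueue<T> pq) {
    List<T> snapshot = new ArrayList<>(pq);
    Collections.sort(snapshot);
    return snapshot.isEmpty() ? null : snapshot.get(snapshot.size() - 1);
  }

  public static void main(String[] args) {
    PriorityQueue<Integer> pq = new PriorityQueue<>();
    Collections.addAll(pq, 5, 1, 4, 2, 3);
    System.out.println(last(pq)); // 5, at O(n log n) cost on the less frequent path
  }
}
```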

Coverage decreased (-0.08%) to 57.298% when pulling 5b5e051 on smarthi:OPENNLP-1026 into bbbb431 on apache:master.

Coverage decreased (-0.08%) to 57.298% when pulling 63a7d5a on smarthi:OPENNLP-1026 into bbbb431 on apache:master.

Updated the PR to use a SortedSet now; minimal impact to the existing code.
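For context, a minimal sketch (assuming a TreeSet-backed SortedSet with natural ordering; this is not the exact code in the PR) of why a SortedSet gives cheap access to both ends and a deterministic iteration order:

```java
import java.util.SortedSet;
import java.util.TreeSet;

public class SortedSetDemo {
  public static void main(String[] args) {
    // TreeSet keeps elements sorted by natural ordering (or a supplied Comparator).
    SortedSet<Integer> parses = new TreeSet<>();
    for (int i : new int[] {5, 1, 4, 2, 3}) {
      parses.add(i);
    }

    System.out.println(parses.first()); // 1 - smallest element, like the old heap's head
    System.out.println(parses.last());  // 5 - largest element, without any extra sorting

    // Iteration is in sorted order, so it is deterministic,
    // unlike iterating a binary heap's internal array.
    System.out.println(parses);         // [1, 2, 3, 4, 5]
  }
}
```

One difference to keep in mind with a Set-based replacement: elements that compare as equal are collapsed into a single entry, which the old Heap did not do, so the comparator used for the stored parses matters.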

Coverage decreased (-0.08%) to 57.298% when pulling e5b4ab7 on smarthi:OPENNLP-1026 into bbbb431 on apache:master.

Coverage decreased (-0.08%) to 57.298% when pulling b86ecd8 on smarthi:OPENNLP-1026 into bbbb431 on apache:master.

Coverage decreased (-0.08%) to 57.298% when pulling 3b87797 on smarthi:OPENNLP-1026 into bbbb431 on apache:master.

Closing this PR; will make a fresh PR.
Thank you for contributing to Apache OpenNLP.
In order to streamline the review of the contribution we ask you to ensure the following steps have been taken:
For all changes:
- Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
- Does your PR title start with OPENNLP-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
- Has your PR been rebased against the latest commit within the target branch (typically master)?
- Is your initial contribution a single, squashed commit?
For code changes:
For documentation related changes:
Note:
Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.