Conversation

@smarthi
Member

@smarthi smarthi commented Apr 17, 2017

Thank you for contributing to Apache OpenNLP.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

  • Is there a JIRA ticket associated with this PR? Is it referenced
    in the commit message?

  • Does your PR title start with OPENNLP-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.

  • Has your PR been rebased against the latest commit within the target branch (typically master)?

  • Is your initial contribution a single, squashed commit?

For code changes:

  • Have you ensured that the full suite of tests is executed via mvn clean install in the root opennlp folder?
  • Have you written or updated unit tests to verify your changes?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE file, including the main LICENSE file in opennlp folder?
  • If applicable, have you updated the NOTICE file, including the main NOTICE file found in opennlp folder?

For documentation related changes:

  • Have you ensured that the format looks appropriate for the output in which it is rendered?

Note:

Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.

@coveralls

Coverage Status

Coverage decreased (-0.08%) to 56.158% when pulling 5f0be72 on smarthi:OPENNLP-1026 into 2721401 on apache:master.

@kojisekig
Member

+1

@kottmann
Member

The Heap that was used before re-orders the elements inserted into it; the Deque does not do that.
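
For illustration, a minimal, self-contained sketch of the difference being described (the class name and integer values are made up, not taken from the OpenNLP code): a PriorityQueue re-orders on insertion much like the old Heap, while an ArrayDeque preserves insertion order.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.PriorityQueue;
import java.util.Queue;

public class OrderingDemo {
    public static void main(String[] args) {
        // A PriorityQueue (heap-like) re-orders on insert: poll() always
        // returns the smallest element regardless of insertion order.
        Queue<Integer> heapLike = new PriorityQueue<>();
        // An ArrayDeque preserves insertion order: poll() returns elements FIFO.
        Deque<Integer> deque = new ArrayDeque<>();

        for (int value : new int[] {5, 1, 4, 2}) {
            heapLike.add(value);
            deque.add(value);
        }

        while (!heapLike.isEmpty()) {
            System.out.print(heapLike.poll() + " "); // prints: 1 2 4 5
        }
        System.out.println();
        while (!deque.isEmpty()) {
            System.out.print(deque.poll() + " ");    // prints: 5 1 4 2
        }
    }
}
```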

@kottmann
Member

I also tried this once, and as far as I remember there is one place where the Heap is iterated. A Heap doesn't have a well-defined iteration order, so the iteration order differs between PriorityQueue and the Heap, but I never managed to run benchmarks to see whether this makes a real difference in performance.

Let's first find the performance issue that was introduced, and then we can merge this one afterwards.
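
As a small illustration of the iteration-order point (hypothetical class and values, not the actual parser code): a PriorityQueue's iterator walks the internal binary-heap array, which is not sorted order, so code that iterates the structure directly can see a different ordering than the old Heap produced. Sorting explicitly is one way to get a well-defined order.

```java
import java.util.List;
import java.util.PriorityQueue;
import java.util.stream.Collectors;

public class IterationDemo {
    public static void main(String[] args) {
        PriorityQueue<Integer> pq = new PriorityQueue<>(List.of(5, 1, 4, 2, 3));

        // The iterator reflects the internal heap layout, not sorted order,
        // and that layout need not match what another heap implementation produces.
        System.out.println("iterator order: " + List.copyOf(pq));

        // If the consumer needs a defined order, sort explicitly before iterating.
        List<Integer> sorted = pq.stream().sorted().collect(Collectors.toList());
        System.out.println("sorted order:   " + sorted);
    }
}
```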

@smarthi
Member Author

smarthi commented Apr 17, 2017

A PriorityQueue orders its elements according to the specified comparator, so the head is always the smallest element when natural ordering is used and it behaves like a Heap; however, it does not guarantee that the largest element is the last one.

Since we also need to retrieve the largest (or last) element, it seemed to make sense to use a LinkedList, but you are right that the heap ordering is then lost. Let's revisit this after the 1.8.0 release.
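
A small sketch of the trade-off mentioned here (hypothetical example, not OpenNLP code): a PriorityQueue gives cheap access to the smallest element only, whereas a SortedSet such as TreeSet exposes both the first and the last element directly.

```java
import java.util.PriorityQueue;
import java.util.TreeSet;

public class FirstAndLastDemo {
    public static void main(String[] args) {
        PriorityQueue<Integer> pq = new PriorityQueue<>();
        TreeSet<Integer> sortedSet = new TreeSet<>();
        for (int value : new int[] {5, 1, 4, 2}) {
            pq.add(value);
            sortedSet.add(value);
        }

        // PriorityQueue only guarantees cheap access to the smallest element;
        // there is no direct accessor for the largest element.
        System.out.println(pq.peek());           // 1

        // A SortedSet (here a TreeSet) exposes both ends directly.
        System.out.println(sortedSet.first());   // 1
        System.out.println(sortedSet.last());    // 5
    }
}
```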

@kottmann
Member

In my test I used sorting to retrieve the last element, since that case is on a less frequent code path than retrieving the first element. I ran some tests and did not manage to see a runtime difference from the increased cost of retrieving the last element.

About the iteration problem: it might be a good idea to iterate the parses in a sorted order anyway, rather than in the order the heap keeps internally. Let's run some tests and see how this influences the performance of the parser.
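
A minimal sketch of this approach, assuming the infrequent path can afford a linear scan (the class name and values are illustrative only): keep the PriorityQueue for the common extract-min path and compute the maximum on demand.

```java
import java.util.Collections;
import java.util.PriorityQueue;

public class LastElementDemo {
    public static void main(String[] args) {
        PriorityQueue<Integer> pq = new PriorityQueue<>();
        Collections.addAll(pq, 5, 1, 4, 2);

        // Frequent path: smallest element in O(1).
        Integer first = pq.peek();

        // Infrequent path: largest element via a linear scan (or a full sort).
        // This trades O(n) here for keeping the common operations cheap.
        Integer last = Collections.max(pq);

        System.out.println(first + " " + last); // 1 5
    }
}
```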

@coveralls

Coverage Status

Coverage decreased (-0.08%) to 57.298% when pulling 5b5e051 on smarthi:OPENNLP-1026 into bbbb431 on apache:master.

@smarthi smarthi changed the title OPENNLP-1026: Replace references and usages of opennlp.tools.util.Heap with java.util.PriorityQueue OPENNLP-1026: Replace references and usages of opennlp.tools.util.Heap with java.util.SortedSet Apr 26, 2017
@coveralls

Coverage Status

Coverage decreased (-0.08%) to 57.298% when pulling 63a7d5a on smarthi:OPENNLP-1026 into bbbb431 on apache:master.

@smarthi
Member Author

smarthi commented Apr 26, 2017

Updated the PR to use a SortedSet now; minimal impact on the existing code.
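
For context, a minimal sketch of what a SortedSet-based replacement can look like (the comparator and strings are illustrative, not the actual parse comparator): a TreeSet gives first(), last(), and iteration in comparator order.

```java
import java.util.Comparator;
import java.util.SortedSet;
import java.util.TreeSet;

public class SortedSetDemo {
    public static void main(String[] args) {
        // A TreeSet ordered by a comparator gives heap-like access to the minimum,
        // direct access to the maximum, and a well-defined iteration order.
        SortedSet<String> parses = new TreeSet<>(Comparator.comparingInt(String::length));

        parses.add("a longer parse");
        parses.add("short");
        parses.add("mid parse");

        System.out.println(parses.first()); // shortest, analogous to heap extract-min
        System.out.println(parses.last());  // longest
        for (String p : parses) {           // iterates in comparator order
            System.out.println(p);
        }
    }
}
```

One caveat with any heap-to-set change: a set drops elements that compare as equal under the comparator, which a heap would keep, so the comparator must break ties if duplicates matter.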

@coveralls

Coverage Status

Coverage decreased (-0.08%) to 57.298% when pulling e5b4ab7 on smarthi:OPENNLP-1026 into bbbb431 on apache:master.

@coveralls

Coverage Status

Coverage decreased (-0.08%) to 57.298% when pulling b86ecd8 on smarthi:OPENNLP-1026 into bbbb431 on apache:master.

@coveralls

Coverage Status

Coverage decreased (-0.08%) to 57.298% when pulling 3b87797 on smarthi:OPENNLP-1026 into bbbb431 on apache:master.

@smarthi
Member Author

smarthi commented Apr 27, 2017

Closing this PR; will open a fresh PR.

@smarthi smarthi closed this Apr 27, 2017
@smarthi smarthi deleted the OPENNLP-1026 branch April 27, 2017 21:46
@smarthi smarthi restored the OPENNLP-1026 branch April 27, 2017 21:46
@smarthi smarthi deleted the OPENNLP-1026 branch April 27, 2017 21:47