Skip to content

Conversation

@gi114
Copy link

@gi114 gi114 commented Mar 14, 2019

…ue if last element is dequeued

This is response to the request of optimization found here: scala/bug#11396

"immutable.Queue's methods should be optimised so that when dequeueing elements, it does not leave out empty (unless in is also empty), and when enqueueing elements, it puts at least one element in out if both in and out are empty (although once you're checking, might as well put everything in out)."

@scala-jenkins scala-jenkins added this to the 2.13.1 milestone Mar 14, 2019
@diesalbla diesalbla added the library:collections PRs involving changes to the standard collection library label Mar 17, 2019
@gi114
Copy link
Author

gi114 commented Mar 19, 2019

Thanks, I signed the CLA !

@SethTisue SethTisue requested a review from NthPortal March 19, 2019 18:55
Copy link
Contributor

@NthPortal NthPortal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tail (the method/part of the usage pattern listed in the linked ticket) also needs to be changed for this, as well as a bunch of other methods.

*/
def enqueue[B >: A](elem: B): Queue[B] = new Queue(elem :: in, out)
def enqueue[B >: A](elem: B): Queue[B] =
if (isEmpty) new Queue(Nil, out :+ elem)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally don't want to append things (:+) to Lists - it is very slow. In this case however, we actually know the whole Queue is empty, so it can be elem :: Nil.

def dequeue: (A, Queue[A]) = out match {
case Nil if !in.isEmpty => val rev = in.reverse ; (rev.head, new Queue(Nil, rev.tail))
case Nil if !in.isEmpty => in match {
case x :: Nil => (x, new Queue(Nil, Nil))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how much this line adds. The cost of reversing a List with a single element is pretty low, and ideally doesn't happen very often (whereas this check has to be done for all reversals). Without benchmarks showing it's an improvement for near-empty Queues and not a significant cost for larger ones, I'm not convinced this check improves things.

Additionally, in order to fulfill the optimisation requested by the linked ticket, there should be a check for out having a single element, in which case in is eagerly reversed (slightly before it's needed, but presumably it's going to be removed eventually anyway - it's a queue after all).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(@gi114 is new to the repo, so might need pointers on running benchmarks if we decide that's needed here)

Copy link
Author

@gi114 gi114 Mar 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your comments, I shall try to adjust the PR. And yes, this is my first ticket so let me know if I there is anything relevant I should do. Also, do you usually run tests suites or do you test your changes separately on a .gitignore file?

@SethTisue SethTisue added the welcome hello new contributor! label Mar 20, 2019
@gi114
Copy link
Author

gi114 commented Mar 22, 2019

Hi, I updated my PR with some other changes as it is shown above

Rationale behind changes is the following: The fist element added to an empty queue is added to 'out', as specified by the ticket, and 'out' should never be forced empty. Because of that, dequeuing elements involves 'in' eagerly reversing before the 'out' list is ever empty ( one step before, so when 'out' has only one element ).
Thus, the .tail method matches the following conditions: if the 'in' list is non empty for the first case, for the second case it matches single or multiple elements in 'out' (which is never empty), and the third case is the exception of calling tail on an empty queue. In this way we never encounter an empty 'out' unless the Queue is emptied. We should make sure all functions under immutable Queue operate on this constrain, so let me know if you find that it is not the case. Very importantly, with this constrain we would not need to worry to call .head on some empty 'out' list and and traverse 'in' with O(n) to find the last element, so the return value is always out.head, unless the queue is empty

…ase of empty queue, the first iterable input is added to out. This is in line with previous commits on enqueue and dequeue methods
@gi114
Copy link
Author

gi114 commented Apr 8, 2019

Fixed a test which was failing. Everything seems fine, the PR follows a specific logic. Let me know if agree or see anything wrong.

Also, as am new, I wanted to ask you if you could point at how to run some code performance analysis. Thanks again

@NthPortal NthPortal self-assigned this Apr 21, 2019
Copy link
Contributor

@NthPortal NthPortal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great and takes care of all the obvious methods.

However, I am concerned that there may be some other method that doesn't comply with the new invariant "if out is empty then in must also be empty." These implementations currently assume/require the invariant to hold, and if it does not, they will do the wrong thing.

It would certainly be a good idea to have someone run collection-laws on this; additionally, I will attempt to comb through all the other methods to check that they maintain the invariant.

I think it would also be good to add a comment somewhere (I'm not sure exactly where, tbh) stating the invariant that all methods must maintain.

if it's empty, in and out are both Nil

Co-Authored-By: Nth <7505383+NthPortal@users.noreply.github.com>
@gi114
Copy link
Author

gi114 commented May 31, 2019

Hi,

Would you think this pull request is ready to be merged ?

Thanks!

@NthPortal
Copy link
Contributor

Unfortunately, this cannot be merged until 2.13.0 is released, as it is not a blocker for the release. Once it is and we move on to 2.13.1 however, we will be able to look at this PR again

@gi114
Copy link
Author

gi114 commented May 31, 2019 via email

@NthPortal
Copy link
Contributor

@gi114 sorry for the wait, but we'll get there :)

@szeiger
Copy link
Contributor

szeiger commented Aug 20, 2019

Any progress on this? Feel free to reschedule to 2.13.2 if it will take longer.

@NthPortal
Copy link
Contributor

I'll try to put it on my weekend TODO list

@gi114
Copy link
Author

gi114 commented Aug 22, 2019

Hi, let me know if there is anything I can add/change from a code perspective on this one. Would you be able to rerun the tests to see if anything is failing? Thanks and let me know!

@szeiger szeiger modified the milestones: 2.13.1, 2.13.2 Aug 30, 2019
@NthPortal
Copy link
Contributor

I'm so sorry that I can't get to this right now. if someone else would like to take over getting this over the finish line, that would be great

@NthPortal NthPortal removed their assignment Sep 28, 2019
@gi114
Copy link
Author

gi114 commented Sep 29, 2019 via email

def enqueue[B >: A](elem: B): Queue[B] = new Queue(elem :: in, out)
def enqueue[B >: A](elem: B): Queue[B] =
if (isEmpty) new Queue(Nil, elem :: Nil)
else new Queue(elem :: in, out)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like an easy win. I wonder if there is a good way to test it. Maybe a benchmark is the only test that really matters. If a test that just continually enqueues and dequeues a single element at a time doesn't get a bit faster, would anything?

override def head: A =
if (out.nonEmpty) out.head
else if (in.nonEmpty) in.last
if (nonEmpty) out.head
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm dubious, that is, I think this is dubious.

*/
def dequeue: (A, Queue[A]) = out match {
case Nil if !in.isEmpty => val rev = in.reverse ; (rev.head, new Queue(Nil, rev.tail))
case x :: Nil if in.nonEmpty => (x, new Queue(Nil, in.reverse))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure incurring the reverse eagerly is worth the goal of supporting dubious head/tail pattern described in the ticket. As an aside, there is ongoing conversation about whether to always prefer !isEmpty to nonEmpty for performance reasons (because cost is not zero).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Queue is a LinearSeq so there is an expectation that head/tail should have decent performance.

I didn't look at the generated code but going by the List implementation I would definitely prefer to call isEmpty when the type is known (like it is in this case). The Scala optimizer should be able to inline the method so you don't even have to rely on HotSpot.

@som-snytt
Copy link
Contributor

som-snytt commented Oct 1, 2019

I don't know if encouraging the head/tail anti-pattern is desirable, but I'll comment about that on the ticket.

I did spot a potential improvement at #8449 which I guess is desirable, but it is late here. I touched the junit test for this class there.

Do you know the local customs to squash commits and push -f to this branch and run sbt junit/test?

I see the previous question whether it's easy to show that out.nonEmpty always holds. (I prefer nonEmpty, of course.) I would suggest that if it's not easy to show that much, then it's not worth any complexity.

However, always putting the first element in out seems good to me.

@gi114
Copy link
Author

gi114 commented Oct 1, 2019 via email

@NthPortal
Copy link
Contributor

anyone else have thoughts on @som-snytt's comment about whether or not this is a desirable change?

@NthPortal NthPortal closed this Dec 1, 2019
@NthPortal NthPortal reopened this Dec 1, 2019
@szeiger
Copy link
Contributor

szeiger commented Dec 9, 2019

@gi114 See https://github.com/scala/scala/blob/2.13.x/test/benchmarks/README.md for benchmarking. There's already a number of benchmarks for other collections that would make a good starting point. One thing the readme doesn't mention (but really should) is that you need to turn on optimization for the standard library (e.g. setupPublishCore) to get useful results. The JMH documentation has more information about how to write good benchmarks.

@SethTisue SethTisue modified the milestones: 2.13.2, 2.13.3 Feb 6, 2020
@szeiger
Copy link
Contributor

szeiger commented Apr 1, 2020

This still looks like a worthwhile improvement to me. There were some changes in the 2.13.x implementation in the meantime so this would need to be rebased. In particular, the code for enqueueAll is now in appendedAll.

As long as enqueue, dequeue, tail and appendedAll are modified to keep the new invariants, everything else should work fine. I think head could be simplified further to only check out, which is cheaper than both the current implementation and the one proposed here.

There should still be some basic benchmarks to ensure we're not regressing in more normal use cases though.

@SethTisue SethTisue modified the milestones: 2.13.3, 2.13.4 May 12, 2020
@SethTisue SethTisue marked this pull request as draft May 19, 2020 01:05
@SethTisue
Copy link
Member

is anyone interested in picking this one up?

@NthPortal
Copy link
Contributor

NthPortal commented Oct 1, 2020

I would like to, but I don't have the capacity right now. if no one picks it up in a month, I'll see if I have the capacity for it

@dwijnand dwijnand modified the milestones: 2.13.4, 2.13.5 Oct 16, 2020
@dwijnand
Copy link
Member

dwijnand commented Dec 4, 2020

Closing for inactivity. Happy to reopen any time, by Nth or anyone else.

@dwijnand dwijnand closed this Dec 4, 2020
@SethTisue SethTisue changed the title add first element of empty queue to out list / return explicit emty que… add first element of empty queue to out list / return explicit empty queue Feb 9, 2021
@SethTisue SethTisue removed this from the 2.13.5 milestone Feb 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

library:collections PRs involving changes to the standard collection library welcome hello new contributor!

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants