std/lists: support destructive add/prepend for `SinglyLinkedList`, `DoublyLinkedList` + other `lists` improvements #303

timotheecour · 2020-12-17T00:48:22Z

current implementation in nim-lang/Nim#16362 is problematic because:

I've just realized that this operation is not really meaningful for doubly linked lists, as it will always leave one in an inconsistent state.

But actually it turns out there's a similar problem with SinglyLinkedList: b is left in inconsistent state after being added twice, and other problems below:

when defined case1:
  import lists, sequtils
  var a = [0,1].toSinglyLinkedList
  a.add a
  echo a # hangs, since you have a cycle

when defined case2:
  import lists, sequtils
  var a = [0,1].toSinglyLinkedList
  var b = [2,3].toSinglyLinkedList
  a.add b
  a.add b
  echo b # would hang: b is left in inconsistent state after being added twice
  # echo a # would hang

proposal

This proposal involves adding a flag to indicate that a list is cycle-free assuming user sticks add, prepend + other APIS in lists.nim; overhead is negligible.

The proposal is to support destructive add and prepend for both SinglyLinkedList and DoublyLinkedList as follows:

add a field owned: bool in SinglyLinkedList
implement add as follows:

proc add*(a, b: var SinglyLinkedList) =
  ## b degrades to be "owned"
  addImpl(a, b) # implementation from https://github.com/nim-lang/Nim/pull/16362
  if b.owned: a.owned = true else: b.owned = true

implement prepend in analogous way
note that prepend is still needed even if we have add because prepend(a,b) will set b.owned but add(a,b) will set b.owned), so you can't simply implement prepend(a,b) in terms of add as that would "own" a
implement $ and items as follows:

proc items*(a: SinglyLinkedList): auto =
  if a.owned: # use a HashSet to avoid looping in potential cycles
  else: oldItems() # no cycles here at least not as created by correct usage of lists.nim apis

then this will make $, and toSeq correct instead of infinite loop on cycles

Note: the owned flag is added to SinglyLinkedList, not SinglyLinkedNodeObj, so it doesn't asymptotically increase memory for normal use (few lists, lots of nodes)

and everything above done similarly with DoublyLinkedList

The text was updated successfully, but these errors were encountered:

Araq · 2020-12-17T07:52:12Z

While I can imagine your solution to work, I think it's too complex and I am not aware of other list implementations that use your solution.

See also: https://doc.rust-lang.org/std/collections/struct.LinkedList.html#method.append

This reuses all the nodes from other and moves them into self. After this operation, other becomes empty.
This operation should compute in O(1) time and O(1) memory.

timotheecour · 2020-12-17T08:13:36Z

that's almost the same approach, in fact that was my original idea but then i figured keeping b (but flagging it as "owned" (or "consumed")) is a bit more flexible. In either case you have a destructive add/prepend, but with "owned" you can still peek at the data that was appended.

Either way, either using an "owned" flag or destroying b would be a net improvement over nim-lang/Nim#16362

(thanks for the link btw)

salvipeter · 2020-12-17T08:15:10Z

This won't solve the problem of DoublyLinkedLists, because if you a.add(b) and then c.add(b), the first node of b cannot refer to the last node of a and c at the same time.

In my opinion, if someone uses list concatenation reasonably, these cases do not occur. If you create a cycle, you get an infinite loop, I think of that as a "fact of life" :) Adding a dirty bit, like the one proposed here, will result in many extra computations, and won't save the user if he does something like a.tail.next = a.head.
But this is a good reason to add an outplace operator, as well.

Also, e.g. Common Lisp has a special variable that determines whether cycles are detected or not when printing. We could at least give a predicate to decide if a given list is circular.

salvipeter · 2020-12-17T08:18:26Z

I like the Rust solution, because it's so simple, although takes a bit of freedom away (sometimes it is good to have shared structures).

timotheecour · 2020-12-17T08:47:44Z

This won't solve the problem of DoublyLinkedLists, because if you a.add(b) and then c.add(b), the first node of b cannot refer to the last node of a and c at the same time.

after a.add b, b is marked as owned, then when you call c.add(b) c will also be marked as owned according to what i wrote above: if b.owned: a.owned = true else: b.owned = true, to indicate c is in an invalid state (ie that may contain cycles); this may or may not be what you want, but at least it's self consistent.

If destroying b instead of using an owned flag (ie the rust approach), the semantics are instead making c.add(b) a noop since b was set to empty after a.add(b); it's different, but also self consistent.

note: special case of `a.add a` or similar

in rust:

fn main() {
  use std::collections::LinkedList;
  let mut list1 = LinkedList::new();
  list1.push_back('a');
  list1.push_back('b');
  println!("{:?}", list1);
  list1.append(&mut list1); // error
}

you (correctly) get an error:

error[E0499]: cannot borrow `list1` as mutable more than once at a time
 --> /Users/timothee/git_clone//nim//timn//tests/nim/all/t11524b.rs:8:3
  |
8 |   list1.append(&mut list1);
  |   ^^^^^^------^----------^
  |   |     |      |
  |   |     |      first mutable borrow occurs here
  |   |     first borrow later used by call
  |   second mutable borrow occurs here

The question is, how do we do the same for nim (in a robust way, not just this special case), with same O(1) complexity? nim-lang/Nim#14976 would work but is there something simpler?

var a = [0,1].toSinglyLinkedList
var b = [2,3].toSinglyLinkedList
a.add b
assert a.toSeq == [0,1,2,3]
assert b.toSeq == [] # with the rust behavior, ie no owned flag, just destroy b
a.add a  # this should give an error

less obvious example:

when true:
  import lists, sequtils
  var a = initDoublyLinkedList[int]()
  var n = newDoublyLinkedNode[int](0)
  a.append n
  a.append 1
  a.append 2
  var b = a
  b.remove(n)
  echo a
  echo b
  a.add b # would create a cycle

(note that with the "owned" flag, this would work (after a.add a, a would automatically be marked as owned))

remove can also create cycles:

this example uses a DoublyLinkedRing instead

  var a = initDoublyLinkedRing[int]()
  var n = newDoublyLinkedNode[int](0)
  a.append n
  a.append 1
  a.append 2
  var b = a
  b.remove(n)
  echo b
  echo a # hangs

salvipeter · 2020-12-17T09:58:04Z

I am not convinced about the owned solution for doubly linked lists. In the above example, the owned flag will be consistent, but the whole list will not be, because the invariant b.head.prev == nil is not satisfied.

As for handling a.add(a), I think that is also a case when the user should know better...

If someone would like to uses lists in a FP-like manner, he would need something like first and rest:

func first*[T](L: SinglyLinkedList[T]): T = L.head.value
func rest*[T](L: SinglyLinkedList[T]): SinglyLinkedList[T] =
  result = initSinglyLinkedList[T]()
  result.head = L.head.next
  result.tail = L.head.next

But then a.rest would also share structure with a, leading to similar cases. I think the solution is to don't treat these as problems.

Araq · 2020-12-17T10:10:10Z

We should follow Rust and move the elements.

timotheecour · 2020-12-17T10:24:49Z

We should follow Rust and move the elements.

indeed.

We also need a copy proc for a shallow copy of the list, for non-destructive append/prepend, eg:

var a = [0,1].toSinglyLinkedList
var b = [2,3].toSinglyLinkedList
# a.add b # ok but destructive for b
a.add b.copy # not destructive for b
assert a.toSeq == [0,1,2,3]
assert b.toSeq == [2,3]
a.add a.copy

note: copy is intentionally not deepCopy here. It's O(n) where n = num elements of the list

example:

type Foo = ref object
  x: int
var f = Foo(x: 1)
var a = [f, f].toSinglyLinkedList
var b = [f, f].toSinglyLinkedList
a.add b.copy
assert b.head.value == f

salvipeter · 2020-12-17T10:52:50Z

Implemented the move semantics, and added a shallow copy as well (both for both singly- and doubly linked lists).
One thing that bothers me is that with the current implementation I can't write a.add(b.copy), because b.copy is immutable, I need to add another variable explicitly:

var
  a = [0, 1].toSinglyLinkedList
  b = [2, 3].toSinglyLinkedList
  c = b.copy
a.add(c)

Is there a workaround for this?

timotheecour · 2020-12-17T11:08:32Z

how about:

proc addMove[T: SinglyLinkedList | DoublyLinkedList](a: var T, b: var T) = ...
proc add[T: SinglyLinkedList | DoublyLinkedList](a: var T, b: T) =
  var b2 = b.copy
  addMove(a, b2)

ditto with prepend, prependMove

it might actually be better, because in rest of stdlib, add is not destructive.

bikeshedding: addMove or addFrom

Araq · 2020-12-17T14:19:23Z

Is there a workaround for this?

Use a sink parameter. And maybe overwrite the = operator for linked lists.

salvipeter · 2020-12-17T21:21:35Z

Use a sink parameter.

That's nice! Haven't heard of these before. So if the second list is a sink parameter, then even though add is implemented as a move, it is actually a copy when the list passed as second parameter is used afterwards. (Or, to be exact, when it cannot be proved that it is not used afterwards, right?)

I like this, but I don't really like the idea that this operation is O(1) or O(n) based on what the compiler thinks, unknown to the user.

Araq · 2020-12-17T22:25:59Z

You can make the = operator for lists an .error and then the copies cannot be implicit.

salvipeter · 2020-12-17T22:49:23Z

That sounds like a good solution, but when I actually tried it out at this simple example:

import lists
var a = [1, 2, 3].toSinglyLinkedList
let b = [4, 5].toSinglyLinkedList
a.add b.copy
assert $a == "[1, 2, 3, 4, 5]"
assert $b == "[4, 5]"
a.add b
assert $a == "[1, 2, 3, 4, 5, 4, 5]" # b is inaccessable

... I got an error for the a.add b line, saying that it's not the last read of b... (this was my whole file)

timotheecour · 2020-12-17T22:49:42Z

You can make the = operator for lists an .error and then the copies cannot be implicit.

but that would prevent valid use cases, and be a breaking change (not necessarily in the right direction).

import std/lists
type Foo = object
  list: DoublyLinkedList[int]
proc main()=
  block:
    var a = initDoublyLinkedList[int]()
    let b = a # was legal
  block:
    let a = Foo(list: initDoublyLinkedList[int]())
    let b = a.list # was legal
main()

(and {.byaddr.} still runs into the famous template bug nim-lang/Nim#15920 so isn't always a workaround)

I do prefer the terminology addMove for the overload without copy because rest of stdlib assumes an add doesn't consume the RHS

I don't really like the idea that this operation is O(1) or O(n) based on what the compiler thinks, unknown to the user.

me neither, it's error prone

Araq · 2020-12-17T22:59:06Z

... I got an error for the a.add b line, saying that it's not the last read of b... (this was my whole file)

Because b is a global variable, the compiler doesn't try too hard to reason about global variables.

Araq · 2020-12-17T23:00:42Z

I do prefer the terminology addMove for the overload without copy because rest of stdlib assumes an add doesn't consume the RHS

That's true. But addMove isn't the best name. Maybe, addConsumed?

salvipeter · 2020-12-17T23:28:29Z

I have put it in as addMove for the time being... maybe addMoved is the best of both worlds?

timotheecour · 2020-12-17T23:33:43Z

addMoved is a bit shorter; any name that works well with both add and prepend is fine.
so this would be:

addMoved
add (calls copy + addMoved)
prependMoved
prepend (calls copy + prependMoved)

salvipeter · 2020-12-17T23:45:36Z

Do we need prepend[Moved]? Its naming suggests that the alternative is append[Moved].
And a.prependMove b is essentially just

b.addMove a
swap a, b

(or a = b.addMove a, if you don't care about b being empty afterwards), while a.prepend b is

a = b.copy.addMove a

timotheecour · 2020-12-18T00:18:19Z

add is better than append, even if we later add prepend:

reasons already given in std/lists: O(1) concatenation of singly- and doubly linked lists. Nim#16362 (comment),
better compatiblity with stdlib, eg:

with add: this works out of the box with procs assuming some add, eg collect:

import lists, sugar
# it works, I tried it after s/append/add/ in your PR
let b = collect(initSinglyLinkedList): for i in 0..<3: i

the classical "works better in generic code" is a real argument ... and it doesn't mean that "everything compiles", just the stuff that makes sense ;-).

my suggestion is:

don't modify pre-existing append in this PR but use add for your new procs in your PR
in future PR, change pre-existing append to add in lists.nim and add:

template append*[T: DoublyLinkedList or SinglyLinkedList](a: T, b: T) {.deprecated: "use add".} = add(a,b)
template append*[T](a: DoublyLinkedList[T] or SinglyLinkedList[T], b: T) {.deprecated: "use add".} = add(a,b)

(I also tried, seems to work; whereas {.deprecated: [append: add].} hits another bug that really should be fixed)

add an entry to https://nim-lang.github.io/Nim/nep1.html specifying this preferred terminology (add + prepend instead of append + prepend)

salvipeter · 2020-12-18T00:24:18Z

I was not really reasoning for append, more like reasoning against prepend... given that we have add, prepending is a trivial issue, as shown above. I don't consider prepending a basic operation.

timotheecour · 2020-12-18T00:34:13Z

arguments for prepend(a: list,b: list):

we already have prepend(a: list: b: elem) in lists.nim (https://nim-lang.github.io/Nim/lists.html#prepend%2CSinglyLinkedList%5BT%5D%2CSinglyLinkedNode%5BT%5D)
symmetry (since we'll have add(a:list, b:elem) and add(a: list, b: list))
(minor argument): rust has both append, prepend: https://doc.rust-lang.org/std/collections/struct.LinkedList.html#method.prepend
forcing users to use this isnt' very user friendly; append/prepend is exactly the type of thing i'd expect in a lists module

b.addMove a
swap a, b

in any case this can be settled after your PR since it's orthogonal

Araq · 2020-12-18T07:17:10Z

the classical "works better in generic code" is a real argument ... and it doesn't mean that "everything compiles", just the stuff that makes sense ;-).

It is a real argument in this case because you gave a real example with collect that is convincing.

timotheecour · 2020-12-19T19:36:34Z

@salvipeter after nim-lang/Nim#16362 (which I just LGTM'd) is merged, do you have interest in writing another PR for the following:

add remove for SinglyLinkedList (analog to the one for DoublyLinkedList)
add prepend for SinglyLinkedList and DoublyLinkedList
rename append to add (rationale was discussed in std/lists: support destructive add/prepend for SinglyLinkedList, DoublyLinkedList + other lists improvements #303 (comment)) and add a deprecated append for backward compatibility

salvipeter · 2020-12-31T21:55:42Z

I've created a new PR.

timotheecour · 2021-01-01T22:28:49Z

this gives SIGSEGV:

when true:
  import lists
  import std/enumerate
  var a = [10,11,12,13].toDoublyLinkedList
  echo a
  a.remove(nil)
  echo a

instead, we should use assert n!=nil (and not doAssert)

timotheecour · 2021-01-02T00:48:08Z

remove(a; DoublyLinkedList, n) behavior when n notin a is error-prone; I don't think it's possible to fix (because O(1) behavior should be kept) but it should be documented (it's not a "noop")

when true:
  import lists
  block:
    var a = [10,11,12,13].toDoublyLinkedList
    var b = [20,21,22,23].toDoublyLinkedList
    a.remove(b.head)
    echo a # [10, 11, 12, 13]
    echo b # [20, 21, 22, 23]
  block:
    var a = [10,11,12,13].toDoublyLinkedList
    var b = [20,21,22,23].toDoublyLinkedList
    a.remove(b.head.next)
    echo a # [10, 11, 12, 13]
    echo b # [20, 22, 23]

timotheecour · 2021-01-02T02:08:28Z

add(a: SinglyLinkedList, n: Node) behavior when n in a is bad, it should be as follows:

when true:
  import lists
  var a = [10,11,12].toSinglyLinkedList
  a.add a.head.next
  echo take(a, 6)

current output: @[10, 11]
expected output: @[10, 11, 12, 11, 12, 11]  # segment followed by cycle

this would be consistent with behavior for a.add a which creates a cycle

(take is defined here: timotheecour/Nim#504)

timotheecour · 2021-01-02T02:23:44Z

add(a: DoublyLinkedList, n: Node) behavior when n in a is bad, it should be as follows:

when true:
  import lists
  var a = [10,11,12].toDoublyLinkedList
  a.add a.head.next
  echo take(a, 6)

current output: @[10, 11]
expected output: raise `ValueError`

unlike with SinglyLinkedList (#303 (comment)), there's no way to have segment followed by cycle because pred(11) would need to have 2 values (12 and 10). So the simplest is to raise an error when n is such that n.prev != nil

If user wants to add a n with n.prev!=nil, we can discuss that; one possibility is to call split (refs nim-lang/Nim#16536 (comment)) or setting n.prev = nil first.

(take is defined here: timotheecour/Nim#504)

timotheecour · 2021-01-21T00:26:41Z

for reference, see also std/lists: Various changes to lists (RFC #303) Nim#16536 (comment)

timotheecour mentioned this issue Dec 17, 2020

std/lists: O(1) concatenation of singly- and doubly linked lists. nim-lang/Nim#16362

Merged

timotheecour mentioned this issue Dec 17, 2020

misc lists timotheecour/Nim#464

Open

timotheecour changed the title ~~support destructive add/prepend for SinglyLinkedList, DoublyLinkedList~~ support destructive add/prepend for SinglyLinkedList, DoublyLinkedList + other lists improvement Dec 19, 2020

timotheecour changed the title ~~support destructive add/prepend for SinglyLinkedList, DoublyLinkedList + other lists improvement~~ support destructive add/prepend for SinglyLinkedList, DoublyLinkedList + other lists improvements Dec 19, 2020

timotheecour mentioned this issue Dec 21, 2020

Add jsheaders nim-lang/fusion#56

Merged

salvipeter mentioned this issue Dec 31, 2020

std/lists: Various changes to lists (RFC #303) nim-lang/Nim#16536

Merged

timotheecour changed the title ~~support destructive add/prepend for SinglyLinkedList, DoublyLinkedList + other lists improvements~~ std/lists: support destructive add/prepend for SinglyLinkedList, DoublyLinkedList + other lists improvements Jan 6, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

std/lists: support destructive add/prepend for `SinglyLinkedList`, `DoublyLinkedList` + other `lists` improvements #303

std/lists: support destructive add/prepend for `SinglyLinkedList`, `DoublyLinkedList` + other `lists` improvements #303

timotheecour commented Dec 17, 2020 •

edited

Loading

Araq commented Dec 17, 2020 •

edited

Loading

timotheecour commented Dec 17, 2020 •

edited

Loading

salvipeter commented Dec 17, 2020

salvipeter commented Dec 17, 2020

timotheecour commented Dec 17, 2020 •

edited

Loading

salvipeter commented Dec 17, 2020

Araq commented Dec 17, 2020

timotheecour commented Dec 17, 2020

salvipeter commented Dec 17, 2020

timotheecour commented Dec 17, 2020 •

edited

Loading

Araq commented Dec 17, 2020

salvipeter commented Dec 17, 2020

Araq commented Dec 17, 2020

salvipeter commented Dec 17, 2020

timotheecour commented Dec 17, 2020 •

edited

Loading

Araq commented Dec 17, 2020

Araq commented Dec 17, 2020

salvipeter commented Dec 17, 2020

timotheecour commented Dec 17, 2020 •

edited

Loading

salvipeter commented Dec 17, 2020 •

edited

Loading

timotheecour commented Dec 18, 2020 •

edited

Loading

salvipeter commented Dec 18, 2020

timotheecour commented Dec 18, 2020

Araq commented Dec 18, 2020

timotheecour commented Dec 19, 2020

salvipeter commented Dec 31, 2020

timotheecour commented Jan 1, 2021 •

edited

Loading

timotheecour commented Jan 2, 2021

timotheecour commented Jan 2, 2021

timotheecour commented Jan 2, 2021

timotheecour commented Jan 21, 2021

std/lists: support destructive add/prepend for SinglyLinkedList, DoublyLinkedList + other lists improvements #303

std/lists: support destructive add/prepend for SinglyLinkedList, DoublyLinkedList + other lists improvements #303

Comments

timotheecour commented Dec 17, 2020 • edited Loading

proposal

Araq commented Dec 17, 2020 • edited Loading

timotheecour commented Dec 17, 2020 • edited Loading

salvipeter commented Dec 17, 2020

salvipeter commented Dec 17, 2020

timotheecour commented Dec 17, 2020 • edited Loading

note: special case of a.add a or similar

remove can also create cycles:

salvipeter commented Dec 17, 2020

Araq commented Dec 17, 2020

timotheecour commented Dec 17, 2020

salvipeter commented Dec 17, 2020

timotheecour commented Dec 17, 2020 • edited Loading

Araq commented Dec 17, 2020

salvipeter commented Dec 17, 2020

Araq commented Dec 17, 2020

salvipeter commented Dec 17, 2020

timotheecour commented Dec 17, 2020 • edited Loading

Araq commented Dec 17, 2020

Araq commented Dec 17, 2020

salvipeter commented Dec 17, 2020

timotheecour commented Dec 17, 2020 • edited Loading

salvipeter commented Dec 17, 2020 • edited Loading

timotheecour commented Dec 18, 2020 • edited Loading

salvipeter commented Dec 18, 2020

timotheecour commented Dec 18, 2020

Araq commented Dec 18, 2020

timotheecour commented Dec 19, 2020

salvipeter commented Dec 31, 2020

timotheecour commented Jan 1, 2021 • edited Loading

timotheecour commented Jan 2, 2021

timotheecour commented Jan 2, 2021

timotheecour commented Jan 2, 2021

timotheecour commented Jan 21, 2021

std/lists: support destructive add/prepend for `SinglyLinkedList`, `DoublyLinkedList` + other `lists` improvements #303

std/lists: support destructive add/prepend for `SinglyLinkedList`, `DoublyLinkedList` + other `lists` improvements #303

timotheecour commented Dec 17, 2020 •

edited

Loading

Araq commented Dec 17, 2020 •

edited

Loading

timotheecour commented Dec 17, 2020 •

edited

Loading

timotheecour commented Dec 17, 2020 •

edited

Loading

note: special case of `a.add a` or similar

timotheecour commented Dec 17, 2020 •

edited

Loading

timotheecour commented Dec 17, 2020 •

edited

Loading

timotheecour commented Dec 17, 2020 •

edited

Loading

salvipeter commented Dec 17, 2020 •

edited

Loading

timotheecour commented Dec 18, 2020 •

edited

Loading

timotheecour commented Jan 1, 2021 •

edited

Loading