Originally on 2011-05-03
As mentioned in #621, we may want to improve the description and/or
behaviour of intbitset's pop(). Notably, people may be using the
search engine's API functions, which return intbitsets that look like
lists. For lists, pop() has an ordered meaning (it removes the last
element), while for sets it removes an arbitrary element:
In : from invenio.intbitset import intbitset
In : l = [1, 10, 2, 20]
In : s = set(l)
In : i = intbitset(l)
In : l, s, i
Out: ([1, 10, 2, 20], set([1, 10, 20, 2]), intbitset([1, 2, 10, 20]))
In : xl, xs, xi = l.pop(), s.pop(), i.pop()
In : l, s, i
Out: ([1, 10, 2], set([10, 20, 2]), intbitset([2, 10, 20]))
In : xl, xs, xi
Out: (20, 1, 1)
Strictly speaking, the difference between lists and intbitsets is OK,
because intbitsets emulate the API of sets, so pop() is allowed to
remove an arbitrary set element. However, behind the scenes
intbitset's pop() calls intBitSetGetNext(), which does an ordered
removal, not an "arbitrary" one; so we can document this better for
end users.
More to the point, intbitset has a native notion of element order,
being a set of increasing integers; in this respect it does resemble
an ordered list of integers. An intbitset can be considered a kind of
ordered set of increasing integers that emulates the set API, so it
has some facets of lists and some facets of sets, as it were.
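To make the list/set contrast concrete without depending on intbitset
itself, here is a small sketch using only Python built-ins. The variable
names are illustrative; nothing here is intbitset's API. It shows that
list.pop() is positional, that set.pop() gives no ordering guarantee, and
that min()/max() are the deterministic way to take a specific end of a
plain set.

```python
# Plain built-ins only; intbitset is deliberately not used here.
items = [1, 10, 2, 20]

as_list = list(items)
as_set = set(items)

last = as_list.pop()      # list.pop() always removes the last element
anyone = as_set.pop()     # set.pop() removes an unspecified element

# When a specific end of a set is wanted, ask for it explicitly.
remaining = set(items)
smallest = min(remaining)
remaining.remove(smallest)
largest = max(remaining)
remaining.remove(largest)
```

After this runs, last is 20 and remaining holds 2 and 10, whereas the
value of anyone depends on the set implementation and should not be
relied upon.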
Therefore we may want to improve the docstring of intbitset's pop()
to reflect this mixed nature of intbitsets: (i) at least by
documenting the non-arbitrary parts, but (ii) more to the point, we
may want to change what pop() returns, so that intbitset resembles
lists more closely (i.e. returning the last, not the first, element).
It would still be an "arbitrary" removal from the set API point of
view, but it would be closer to what people may be used to from the
list API point of view, if they think of intbitsets as lists of
increasing integers.
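The two candidate behaviours can be compared with a toy bit-backed set.
This is only a sketch: SketchBitSet, pop_smallest() and pop_largest() are
hypothetical names invented for illustration, and the class stores its
bits in a single Python int rather than using intbitset's C
implementation.

```python
class SketchBitSet:
    """Hypothetical stand-in: a set of non-negative ints stored as bits."""

    def __init__(self, values=()):
        self._bits = 0
        for value in values:
            self._bits |= 1 << value

    def __iter__(self):
        # Yield members in increasing order, like an ordered integer set.
        bits, position = self._bits, 0
        while bits:
            if bits & 1:
                yield position
            bits >>= 1
            position += 1

    def pop_smallest(self):
        """Remove and return the smallest member (first-element behaviour)."""
        if not self._bits:
            raise KeyError("pop from an empty set")
        lowest = self._bits & -self._bits  # isolates the lowest set bit
        self._bits ^= lowest
        return lowest.bit_length() - 1

    def pop_largest(self):
        """Remove and return the largest member (list-like behaviour)."""
        if not self._bits:
            raise KeyError("pop from an empty set")
        value = self._bits.bit_length() - 1  # index of the highest set bit
        self._bits ^= 1 << value
        return value


sketch = SketchBitSet([1, 10, 2, 20])
first = sketch.pop_smallest()
after_first = list(sketch)
last = sketch.pop_largest()
after_last = list(sketch)
```

Here first is 1 and last is 20, so pop_largest() matches what list.pop()
would return for the sorted list [1, 2, 10, 20], while both variants
remain valid "arbitrary" removals from the set API point of view.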
P.S. Not thinking here about list-specific calls like pop(n).
Originally on 2012-07-05
I agree with you about the non-arbitrary implementation of pop() in intbitset. However, for performance reasons, it might be nicer to keep returning the smallest element, because in order to find the last element to return, the implementation would have to scan the whole bitset.
So what if we fully document this behaviour? Alternatively, one could imagine adding a flag to the .pop() function saying:
This way the faster implementation is used by default, but one can still use the slower and more stack-friendly one.
Originally by Samuele Kaplun firstname.lastname@example.org on 2012-07-06
#CommitTicketReference repository="" revision="6ad995694e33b9d5d83fe5b09bf988f85050e2b3"
intbitset: pop last element
- When pop is called on a list, the last element is returned. When
it is called on a set, an arbitrary element is returned. So far
intbitset was returning the smallest element. In order to expose
a behaviour more similar to the expectation of an ordered list,
intbitset now returns the largest element.
Originally by Nikola Yolov email@example.com on 2012-08-03
#CommitTicketReference repository="" revision="6511a023ef60bf53a289e13257e8053e939604c6"
intbitset: fixes pop() function
- Fixes pop() function. (addresses #626)
Originally by Samuele Kaplun firstname.lastname@example.org on 2012-08-09
Originally by Nikola Yolov email@example.com on 2012-08-09