Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

External Iterators #2991

Open
kodebach opened this issue Sep 20, 2019 · 8 comments

Comments

@kodebach
Copy link
Contributor

commented Sep 20, 2019

The internal iterator inside KeySets seems to cause more problems than it solves. IMO we should move to an external structure e.g.

typedef struct _KeySetIterator {
    size_t index;
    Key * current; // in case the underlying KeySet is modified
} KeySetIterator;

To ensure backwards compatibility, we could embed one such iterator into KeySets, but recommend that people use an external instance in new code.

Originally posted by @kodebach in #2979 (comment)

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Sep 20, 2019

Thank you for creating the proposal!

We already have external iterators, ksAtCursor. They also work when the keyset is modified, as the keyset guarantees that the array is always fully filled.

What exactly would your external iterators improve?

@kodebach

This comment has been minimized.

Copy link
Contributor Author

commented Sep 20, 2019

The documentation for the cursor system is not the best.

  1. * @brief return key at given cursor position
    *
    * @param ks the keyset to pop key from
    * @param pos where to get
    * @return the key at the cursor position on success
    * @retval NULL on NULL pointer, negative cursor position
    * or a position that does not lie within the keyset
    */
    Key * ksAtCursor (KeySet * ks, cursor_t pos)
    {
    if (!ks) return 0;
    if (pos < 0) return 0;
    if (ks->size < (size_t) pos) return 0;
    return ks->array[pos];
    }

    The key is not "popped" as far as I can tell.
  2. The only function that returns a cursor_t is ksGetCursor. Its documentation is completely broken:
    cursor_t jump;
    ksRewind (ks);
    while ((key = keyNextMeta (ks))!=0)
    {
    // now mark this key
    jump = ksGetCursor(ks);
    //code..
    keyNextMeta (ks); // now browse on
    // use ksCurrent(ks) to check the keys
    //code..
    // jump back to the position marked before
    ksSetCursor(ks, jump);
    }

    ksRewind and keyNextMeta cannot be used on the same variable (ks).
    The second example is the only way I have ever seen ksGetCursor used:
    int f (KeySet *ks)
    {
    cursor_t state = ksGetCursor(ks);
    // work with keyset
    // now bring the keyset to the state before
    ksSetCursor (ks, state);
    }
  3. ksSetCursor uses the same broken example.
  4. The documentation (not just for the above function, but all over the ks* functions) also contains lots of warnings (use only cursor from same keyset, may be invalid, may become invalid, etc.) that make it seem like there is something special about these cursors, when in fact they are simply indicies. There is no information (at least that I could find), how cursors can be modified. Most of the time it even seems like modifying cursors would be a bad idea.

What exactly would your external iterators improve?

I guess you can do the same things with the cursor system. It is just not very well documented and without reading the source of libelektra-core I wouldn't know how the cursors can and cannot be used. Even now I am not sure, I understand everything correctly.

for (cursor_t cursor = 0; cursor < ksGetSize(ks); ++cursor)
{
    Key * cur = ksAtCursor (ks);
    // do whatever
    // don't add/remove keys from ks
    // using ksLookup is fine
}

Would that be correct? I know that adding/removing keys during iteration can never really work (because of the sorting), but ksLookup should work with ksAtCursor shouldn't it?

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Sep 22, 2019

I fully agree that the docu is not good. ksAtCursor was only a prototype API to implement iterators for C++ and never got the attention it should have had.

Maybe we should rename ksAtCursor to ksAt (KeySet * ks, size_t index)? This would make more clear that it is simply an array access and has nothing to do with the cursors?

The internal iterator+cursor API ideally should be removed but this is another issue.

What I am missing are benchmarks between the different iterator APIs. At the moment it is very unclear how ksNext/ksAt/ksFilter compare to each other. Do you use such loops in LCDproc?

Would that be correct?

There are arguments missing in ksAtCursor, but yes, besides the small things it looks right.

markus2330 added a commit that referenced this issue Sep 22, 2019
see #2991
@kodebach

This comment has been minimized.

Copy link
Contributor Author

commented Sep 22, 2019

Maybe we should rename ksAtCursor to ksAt (KeySet * ks, size_t index)? This would make more clear that it is simply an array access and has nothing to do with the cursors?

Yes, that would make more sense. Maybe we should even just add const Key ** ksArray (KeySet * ks);. Then the iteration could be fully external and we would avoid the overhead of a function call and the ifs to avoid out of bounds issues.

The internal iterator+cursor API ideally should be removed but this is another issue.

That would break a lot of code. IMO we should do it before 1.0, even if it is a big change.

What I am missing are benchmarks between the different iterator APIs. At the moment it is very unclear how ksNext/ksAt/ksFilter compare to each other.

Benchmarks should be easy enough to create.

Do you use such loops in LCDproc?

The low-level version uses ksNext to iterate over (some) arrays. However, it is not used directly and I don't think LCDproc is a good benchmark for this. ksNext is also used in almost every plugin.

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Sep 22, 2019

Then the iteration could be fully external and we would avoid the overhead of a function call and the ifs to avoid out of bounds issues.

Yes but it also would be less safe. It would be interesting what the overhead is.

That would break a lot of code. IMO we should do it before 1.0, even if it is a big change.

Yes, if we somehow manage to do it, it would be great to have it fixed for 1.0.

Benchmarks should be easy enough to create.

If you need a few more pages in your thesis, it would be great to have it ❇️

The low-level version uses ksNext to iterate over (some) arrays. However, it is not used directly and I don't think LCDproc is a good benchmark for this. ksNext is also used in almost every plugin.

Micro-benchmarks should also do.

@kodebach kodebach referenced this issue Sep 23, 2019
@PhilippGackstatter PhilippGackstatter referenced this issue Oct 7, 2019
0 of 11 tasks complete
@markus2330

This comment has been minimized.

Copy link
Contributor

commented Oct 10, 2019

I propose to do following changes:

  • make ksNext ksRewind private (not remove it because it would be too much effort to change all the loops)
  • better document and advocate the external iterators (docu mostly done, maybe we also should have the examples how to write loops in the release notes)
  • start using external iterators in new code
  • remove ksPopAtCursor
  • remove docu about internal cursor from all functions. Changing the internal cursor will be an implementation detail. (Of course we need to somehow keep it consistent to the current state as otherwise we will break our own code.)

What do you think?

@kodebach

This comment has been minimized.

Copy link
Contributor Author

commented Oct 10, 2019

Sounds good, but:

remove ksPopAtCursor

You mean ksPop? ksPopAtCursor doesn't exist.

The function elektraKsPopAtCursor uses an external cursor_t and therefore should stay.

not remove it because it would be too much effort to change all the loops

We should also aim to remove it over time, i.e. every time we make bigger changes to a function that uses the internal cursor, try to remove as much of it as feasible. In many plugins the change should be fairly easy, only doing all of them at once would be a lot of work.

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Oct 10, 2019

ksPopAtCursor

Is a wrapper in src/libs/proposal/proposal.c lines 220-223.

The function elektraKsPopAtCursor uses an external cursor_t and therefore should stay.

Ok.

We should also aim to remove it over time, i.e. every time we make bigger changes to a function that uses the internal cursor, try to remove as much of it as feasible. In many plugins the change should be fairly easy, only doing all of them at once would be a lot of work.

Yes, this would be the long term goal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.