Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Permutations unnecessarily complex? #747

Closed
phimuemue opened this issue Sep 6, 2023 · 8 comments · Fixed by #790
Closed

Permutations unnecessarily complex? #747

phimuemue opened this issue Sep 6, 2023 · 8 comments · Fixed by #790

Comments

@phimuemue
Copy link
Member

During my review of #739, I found Permutations quite complex.

As of now, it has four different states: StartUnknownLen, OngoingUnknownLen, Complete and Empty. Its cousin Combinations, on the other hand, only has fields first, pool, indices.

My gut feeling is that Permutations could be massaged into a simpler form that more resembles the one of Combinations.

@Philippe-Cholet Since you studied that code, can I have your two cents on this?

@Philippe-Cholet
Copy link
Member

Philippe-Cholet commented Sep 6, 2023

I agree it's quite complex (but the states are clear thanks to the type system). It's probably because of this complexity that the size hint was incomplete until I fixed it.
I have a similar gut feeling, it's probably simplifiable into a single case since UnknownLen states are only transition until we have CompleteState::Ongoing (and Empty might be easily ditched?).
The major difference though is the additional field cycles, which is currently defined only once we know n for sure.
Instead of first: bool, maybe a more complex type to represent the transition (until we know n). Well, I'm just thinking..

Do you want me to investigate changes further?
EDIT: Well, I'm investigating.

@Philippe-Cholet
Copy link
Member

I just wanted to take time to appreciate what's currently done. Each Iterator::next step is advance+item in the below state diagram (I wanted to do some nice mermaid diagram for quite some time now):

stateDiagram-v2
    classDef rust color:orange

    state k_eq_0 <<choice>>
    k_eq_0: k == 0 ?
    state enough <<choice>>
    enough: Enough values ?
    state full <<choice>>
    full: pool.get_next() ?
    state end <<choice>>
    end: The end ?

    [*]
    [*] --> k_eq_0
    k_eq_0 --> enough: No
    k_eq_0 --> CompleteStart: Yes
    enough --> [*]: No
    enough --> StartUnknownLen: Yes
    CompleteStart --> [*]: NO item
    StartUnknownLen --> OngoingUnknownLen: advance
    full --> OngoingUnknownLen: No
    full --> CompleteStart: Yes, then advance X times...
    OngoingUnknownLen --> full: advance
    OngoingUnknownLen --> OngoingUnknownLen: item
    CompleteStart --> CompleteOngoing: advance
    CompleteOngoing --> CompleteOngoing: item
    CompleteOngoing --> end: advance
    end --> CompleteOngoing: No, mostly
    end --> CompleteStart: Yes, a single time

    class CompleteStart, StartUnknownLen, OngoingUnknownLen, CompleteOngoing rust

Once loaded, we are in CompleteOngoing state until the end.

@Philippe-Cholet
Copy link
Member

Philippe-Cholet commented Sep 7, 2023

Now in master...Philippe-Cholet:itertools:permutations-states , I have

pub struct Permutations<I: Iterator> {
    vals: LazyBuffer<I>,
    state: PermutationState,
}
enum PermutationState {
    Start { k: usize },
    Ongoing { k: usize, min_n: usize },
    Complete { indices: Vec<usize>, cycles: Vec<usize> }, // CompleteState have been merged
    End,
}

Maybe we can simplify this state further. Option<PermutationState> to get rid of End? Maybe merge Ongoing with another variant (but it would probably be less readable)?

The main commit has 80 additions and 143 deletions: nice simplification IMO.

Updated mermaid diagram
stateDiagram-v2
    classDef rust color:orange

    state enough <<choice>>
    enough: Enough values ?
    state k0 <<choice>>
    k0: k == 0 ?
    state full <<choice>>
    full: pool.get_next() ?
    state end <<choice>>
    end: The end ?

    [*] --> enough
    enough --> [*]: No
    enough --> Start: Yes
    Start --> k0: next
    k0 --> [*]: Yes
    k0 --> Ongoing: No
    full --> Ongoing: No
    full --> Complete: Yes, then advance X times...
    Ongoing --> full: next
    Complete --> end: next
    end --> Complete: No
    end --> [*]: Yes

    class Start, Ongoing, Complete rust

@Philippe-Cholet
Copy link
Member

Philippe-Cholet commented Sep 7, 2023

master...Philippe-Cholet:itertools:permutations-states
I flattened PermutationState variant by variant until it became a struct, which I merged into Permutations leading to

pub struct Permutations<I: Iterator> {
    indices: Option<Vec<usize>>, // is None when there is no permutation left
    first: bool,
    cycles: Option<Vec<usize>>, // is None while n is still unknown
    vals: LazyBuffer<I>,
}

indices starts with length k ; and once n is known, cycles is of length k while indices become of length n.

I'm not sure it's better than a more verbose structure, but it's definitely possible.

EDIT: Previous c17761a and 486b348 about advance function might be cherry-picked OUT because it is a bit slower (+10% on some huge test, but only in debug mode?!) with those changes. Apart from that, my changes do not speed things up or down.

@Philippe-Cholet
Copy link
Member

@phimuemue Since you are around, I put a little remainder here.

@phimuemue
Copy link
Member Author

Hi @Philippe-Cholet, I appreciate that you want to tackle the problem. I tried to review your commits, and I think most of it is really good.

Unfortunately I cannot point my finger at the exact places, but I think I've lost track of the possible states. It surely does not help that I'm not intricately familiar with the cycles logic in there and that I do not have enough time to do a proper review of this.

As much as I'd love to answer with "yeah, I saw and understood this", I can't offer a more thorough review at the moment.

@Philippe-Cholet
Copy link
Member

@phimuemue
It's a huge rewrite with quite a few steps (as atomic as reasonable but each change is not straightforward).

Destroy the enum PermutationState variant by variant is possible but it leads to having impossible but representable states (basically ignored with match patterns order and .. ignoring stuff) and therefore lead to most of your (legitimate) questions. And after a big month, I have to understand some parts of the code again.
Having first: bool was easy enough to keep in mind but indices and cycles possibly being none is complex to easily remember (and indices length is k or n depending on having the lazy buffer is loading or filled).

I really prefer enum PermutationState after the second commit: the Start { k }, Ongoing { k, min_n }, Complete { indices, cycles }, End variants are way more explicit.
Below I quote one of my commit messages about None uses and comment it as I think it was better represented with the enum:

indices being None means there is no permutation left. Even if first is true. // It was previously better represented by the End variant.
cycles being None means the buffer is not full yet and n is still unknown. // It was previously better represented by the Ongoing {k,min_n} variant.


I think we should only keep the first 3 commits and simply discard the rest. The first and third are basic. The second that merges CompleteState into PermutationsState and hugely simplify next after #748 as it merges the two steps there was before and discard panic!("unexpected iterator state").
Maybe I can simplify/split this second commit though.

@phimuemue
Copy link
Member Author

Hi @Philippe-Cholet, sounds very reasonable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants