Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a StreamWithState streamer #61

Closed
wants to merge 1 commit into from
Closed

Add a StreamWithState streamer #61

wants to merge 1 commit into from

Conversation

Kerollmops
Copy link
Contributor

@Kerollmops Kerollmops commented Apr 28, 2018

This is an attempt to be able to retrieve the Automaton State from a streamer, this streamer will be called StreamWithState.

Currently the map::Stream struct implement the Streamer trait with an associated Streamer::Item that is (&[u8], u64) in other term the matching string associated with the value.

The problem is that it is not possible to retrieve the state of the automaton with the current map::Stream type. It could be useful when the map::Stream is used with a levenshtein automaton, the state will permit to know the distance of the matching string returned.

So I propose to add a new struct named StreamWithState that implement the Streamer trait and its associated type Streamer::Item is (&[u8], u64, A::State) where A: Automaton. This way the user will be able to retrieve the state of the Automaton and therefore the levenshtein distance of the matching string returned (by answering the automaton).

The StreamWithState is accessible via the with_state self consuming method on map::StreamBuilder.

All of this resolves #60.

What I have done here is probably a little bit confused due to the way github shows the diffs.
For me, map::StreamWithState is a superset of raw::Stream It should be possible to implement map::Stream using the other and just ignoring the Automaton::State, so I created a special StreamWithState::next method that accept a function which will "transform" the A::State:

src/raw/mod.rs Outdated
@@ -880,6 +957,7 @@ impl<'f, 'a, A: Automaton> Streamer<'a> for Stream<'f, A> {
let trans = state.node.transition(state.trans);
let out = state.out.cat(trans.out);
let next_state = self.aut.accept(&state.aut_state, trans.inp);
let aut_state = self.aut.accept(&state.aut_state, trans.inp);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am asking for the state a second time because the Automaton State needs to be returned from this function and saved into the stack, so the other solution would be to ask a Cloneable Automaton State but it adds to many restriction to the entire code base (Clone and Clone constraints everywhere).

The solution would be:

  • ask for a Cloneable State only for the StreamWithState and in that case not be able to reuse the StreamWithState next method in the Stream next method,
  • add Clone constraints everywhere
  • the current one: ask two times for the state.

The other solution would be to return a &A::State, but sometimes the state is not stored into the Stream struct.

src/raw/mod.rs Outdated
return Some((&self.0.inp, out));
}
}
None
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like code duplication, the only thing different with the StreamWithState is the clone of the aut_state.

@BurntSushi
Copy link
Owner

There is a lot of new code here. Could you please describe what problem you're trying to solve here?

@Kerollmops
Copy link
Contributor Author

Kerollmops commented Aug 25, 2018

Here I updated the original PR description. I removed many useless methods and types implementations. All these commits needs to be squashed.

EDIT: Ok so the tests are broken, I will find a way to keep one implementation of map::Streamer without requiring Clone on Automaton::State everywhere.

@Kerollmops
Copy link
Contributor Author

I found a great solution 🎉

I updated the original PR description to reflect the last changes. Functions documentations and other methods are missing (e.g. StreamWithStateBuilder::ge/gt/le/lt) but I wait for your validation to add them and not pollute the diff.

@Kerollmops
Copy link
Contributor Author

This issue will never be merged as it adds to many complexity to the library and this is not useful for most of the users.

@Kerollmops Kerollmops closed this Feb 15, 2019
hntd187 added a commit to hntd187/fst that referenced this pull request Aug 2, 2019
@fulmicoton fulmicoton mentioned this pull request Aug 5, 2019
Closed
halvorboe pushed a commit to quickwit-inc/fst that referenced this pull request Jan 6, 2020
vs.push((k.to_vec(), v.value()));
}
vs
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why were all of these methods removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those weren't removed, it is because git doesn't understand that it is a new struct named StreamWithState that I created and I didn't implement those methods on it to keep the code easier to follow.

@BurntSushi
Copy link
Owner

I am going to look into merging this PR, but will probably re-implement it myself. There are some changes here that I don't quite understand yet.

@Kerollmops
Copy link
Contributor Author

Notice that even if you would have wanted to merge it, it wouldn't have been possible as I deleted the repo and branch. A great idea of mine, again...

@BurntSushi
Copy link
Owner

@Kerollmops No worries! :) And thanks for responding.

BurntSushi added a commit that referenced this pull request Mar 6, 2020
This is apparently useful in some cases when access to the underlying
automaton's state can produce useful information about a match.
Regretably, the implementation requires the states to satisfy `Clone`,
or else we would otherwise need to execute the state transition twice.
In practice, states are usually `Copy` and quite small.

Closes #60, Closes #61
@BurntSushi
Copy link
Owner

An implementation of this is now in my rollup branch: b1e7441

Feedback is welcome, but I think I pretty much followed the ideas in this PR.

BurntSushi added a commit that referenced this pull request Mar 6, 2020
This is apparently useful in some cases when access to the underlying
automaton's state can produce useful information about a match.
Regretably, the implementation requires the states to satisfy `Clone`,
or else we would otherwise need to execute the state transition twice.
In practice, states are usually `Copy` and quite small.

Closes #60, Closes #61
BurntSushi added a commit that referenced this pull request Mar 7, 2020
This is apparently useful in some cases when access to the underlying
automaton's state can produce useful information about a match.
Regretably, the implementation requires the states to satisfy `Clone`,
or else we would otherwise need to execute the state transition twice.
In practice, states are usually `Copy` and quite small.

Closes #60, Closes #61
BurntSushi added a commit that referenced this pull request Mar 7, 2020
This is apparently useful in some cases when access to the underlying
automaton's state can produce useful information about a match.
Regretably, the implementation requires the states to satisfy `Clone`,
or else we would otherwise need to execute the state transition twice.
In practice, states are usually `Copy` and quite small.

Closes #60, Closes #61
BurntSushi added a commit that referenced this pull request Mar 7, 2020
This is apparently useful in some cases when access to the underlying
automaton's state can produce useful information about a match.
Regretably, the implementation requires the states to satisfy `Clone`,
or else we would otherwise need to execute the state transition twice.
In practice, states are usually `Copy` and quite small.

Closes #60, Closes #61
BurntSushi added a commit that referenced this pull request Mar 8, 2020
This is apparently useful in some cases when access to the underlying
automaton's state can produce useful information about a match.
Regretably, the implementation requires the states to satisfy `Clone`,
or else we would otherwise need to execute the state transition twice.
In practice, states are usually `Copy` and quite small.

Closes #60, Closes #61
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Return the automaton state with the (key, value)
2 participants