
make_sentence_that_contains #52

Open
ekt1701 opened this issue Jan 11, 2017 · 8 comments

ekt1701 commented Jan 11, 2017

Is it possible to create a Markov chain that contains a key word?

I'm thinking of a chat bot situation. For example, the word "computer" is used in the utterance, then this command would be called: text_model.make_sentence_that_contains("computer")

If so, could you please either add this to markovify, or show me how I could write the code for my personal use.

Many thanks.

jsvine (Owner) commented Jan 12, 2017

There's no built-in method to do this in markovify. The easiest way for you to do it would probably be to have a while loop generate sentences until it generates one that contains the word you want. E.g.,

while True:
    sentence = text_model.make_sentence()
    # make_sentence() can return None if no valid sentence is found, so guard against that
    if sentence and "computer" in sentence:
        break
print(sentence)

jsvine closed this as completed Jan 12, 2017
ekt1701 (Author) commented Jan 12, 2017

Thanks, I will give that a try.

@voltaxvoltax

I'm sorry to re-open this feature request, but I've been trying the while-loop approach, and with a 6 MB corpus it takes almost 30 seconds (or more) to generate a sentence containing the specific word.
Would it be possible for you to implement this feature in the library so that generating such sentences is faster?

Thank you very much, and keep up the good work!

jsvine (Owner) commented Jan 27, 2020

Hi @voltaxvoltax, and thanks for reminding me about this thread. Implementing a more efficient version of make_sentence_that_contains would require a nontrivial amount of experimentation and testing. That said, I’m open to including such a feature in markovify. If you (or anyone else reading this thread) would like to try coding a proposal, drop a note here and we can discuss how the feature might work.

Assuming the word is fish, to construct a sentence containing that word, we’d need to generate the following:

  1. The words that come after fish. This is easy, since we can just use .make_sentence_with_start('fish', strict=False).

  2. The words that come before fish. This is more complicated; we would first need to calculate the reverse Markov probabilities for the corpus. That logic isn’t currently built into markovify.
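
A minimal sketch of those two steps, assuming a corpus file with one sentence per line; the helper name make_sentence_containing, the file corpus.txt, and the naive stitching of the two halves are illustrative assumptions rather than markovify API, and punctuation/capitalization cleanup is glossed over:

import markovify

# Hypothetical corpus, assumed to contain one sentence per line.
with open("corpus.txt") as f:
    text = f.read()

forward_model = markovify.NewlineText(text)

# Step 2: train a second model on word-reversed sentences so it can
# generate the words that come *before* the keyword.
reversed_text = "\n".join(
    " ".join(reversed(line.split()))
    for line in text.splitlines()
    if line.strip()
)
reversed_model = markovify.NewlineText(reversed_text)

def make_sentence_containing(word, tries=100):
    for _ in range(tries):
        try:
            # Step 1: the words after the keyword, from the forward model.
            tail = forward_model.make_sentence_with_start(word, strict=False)
            # Step 2: the words before the keyword, from the reversed model.
            head = reversed_model.make_sentence_with_start(word, strict=False)
        except Exception:
            # make_sentence_with_start can fail if the word never begins a state.
            continue
        if head and tail:
            # Flip the reversed half back into normal order; both halves start
            # with the keyword, so drop one copy when joining them.
            head = " ".join(reversed(head.split()))
            return head + tail[len(word):]
    return None

print(make_sentence_containing("fish"))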

jsvine reopened this Jan 27, 2020
@caerulius

Just to +1 interest in this: I have a project right now with this exact use case. I agree with your analysis that the only way to achieve this is with reverse Markov probabilities, but it would be highly useful. To describe my goal quickly, it is to 'half-simulate' a conversation between different models: the first one generates a sentence, the second one generates a sentence containing some word from the first, and so on.

Thanks for the consideration. I'd be very tolerant of longer model processing time if a second, reverse model were built at the same time.

@JGCoelho

You could generate a sentence that begins with the keyword, then generate another random sentence, cut it at a verb (using nltk), and append the cut-off portion to the beginning of the first sentence.
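
A hedged sketch of that idea, assuming text_model is an existing markovify model, the keyword is hypothetical, and NLTK's punkt tokenizer and averaged_perceptron_tagger data have already been downloaded:

import nltk

keyword = "computer"  # hypothetical keyword

# A sentence that begins with the keyword; this can fail (or return None)
# if the keyword never begins a state in the model.
tail = text_model.make_sentence_with_start(keyword, strict=False)

# A second, unrelated sentence to borrow a prefix from.
other = text_model.make_sentence()

if tail and other:
    tagged = nltk.pos_tag(nltk.word_tokenize(other))
    # Cut the second sentence at its first verb, keeping the verb itself.
    cut = next((i for i, (_, tag) in enumerate(tagged) if tag.startswith("VB")), None)
    if cut is not None:
        prefix = " ".join(word for word, _ in tagged[:cut + 1])
        print(prefix + " " + tail)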

brienna (Contributor) commented May 16, 2020

This works fine for me on a corpus of 14 MB.

# `word` holds the target keyword; collect 25 generated headlines that contain it
generated_headlines = []
while len(generated_headlines) < 25:
    headline = model.make_sentence(tries=100)
    if headline is not None:
        # case-insensitive whole-word match against the headline's tokens
        if word.lower() in headline.lower().split(' '):
            generated_headlines.append(headline)

It helps to also check the Markov chain to see how often that word appears. If it appears at a very low frequency, that might be why it takes a long time for the model to generate a sentence with the word in it. There's nothing we can do about it in that case, other than to choose a different word or to substantially increase the size of the corpus.
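
For example, a quick check sketched under the assumption that model is a markovify.Text; it peeks at the chain's internal state dictionary (model.chain.model), which maps state tuples to follow-word counts:

word = "computer"  # hypothetical keyword
states = model.chain.model  # dict mapping state tuples to {next_word: count}
hits = sum(1 for state in states if word.lower() in (w.lower() for w in state))
print(f"'{word}' appears in {hits} of {len(states)} states")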

If make_sentence_that_contains becomes a feature, a word-frequency check might help.

@Sylv-Lej (Contributor)

I'm working on a naive approach in my fork.

To build the reversed matrix, I reverse the sentences and then process them; that's probably not the best idea, but it seems to work. We will probably need another method in order to create that matrix.

To make it work, I combine make_sentence_that_finish with .make_sentence_with_start('fish', strict=False), as @jsvine explained.

Support for sentences containing multiple keywords is still in progress; it's experimental right now.

Any suggestions are welcome.
