Question about combining models #37

Closed
anjabeth opened this issue Oct 31, 2016 · 4 comments

Comments

@anjabeth

If I do markovify.combine() with no weighting, is it effectively the same as training one model on the texts of all the combined models? Asking because I'd like to train a model on a lot of text files, and it works out easier to create a bunch of different ones and then combine them, as long as that works the way I'm expecting it to.

@anjabeth
Author

Clarifying: I've been playing around with "combine" and have gotten it to work, but I'm curious - when it "combines" the models with weighting, is it just using those weights to choose which corpus the words come from, or do the corpuses actually mix? (For example, if I trained a model on the KJV and Moby Dick, could I get sentences that combine both texts? Or would I just get the right fraction of sentences that come from each text?)

@jsvine
Owner

jsvine commented Nov 1, 2016

> If I do markovify.combine() with no weighting, is it effectively the same as training one model on the texts of all the combined models?

Yep!

> I'm curious - when it "combines" the models with weighting, is it just using those weights to choose which corpus the words come from, or do the corpuses actually mix?

The latter. The corpuses are, effectively, mixed.
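
To make "mixed" concrete, here is a minimal sketch of the idea in plain Python. It is a toy word-level chain, not markovify's own code: `build_counts` and `combine_counts` are hypothetical helpers, and the assumption is that combining amounts to summing each state's transition counts, scaled by the model's weight.

```python
from collections import Counter, defaultdict

BEGIN, END = "___BEGIN__", "___END__"  # sentinel tokens for this toy chain


def build_counts(sentences, state_size=1):
    """Count state -> next-word transitions over a list of sentences."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = [BEGIN] * state_size + sentence.split() + [END]
        for i in range(len(words) - state_size):
            state = tuple(words[i:i + state_size])
            counts[state][words[i + state_size]] += 1
    return counts


def combine_counts(tables, weights=None):
    """Sum transition counts per state, scaling each table by its weight."""
    weights = weights or [1] * len(tables)
    combined = defaultdict(Counter)
    for table, weight in zip(tables, weights):
        for state, followers in table.items():
            for word, n in followers.items():
                combined[state][word] += n * weight
    return combined


# Tiny made-up corpora, one sentence per list item.
corpus_a = ["the whale swam away", "the sea was calm"]
corpus_b = ["the ball was crowded", "the letter was long"]

separate = combine_counts([build_counts(corpus_a), build_counts(corpus_b)])
together = build_counts(corpus_a + corpus_b)

# With equal weights, combining matches training on all the text at once.
same = separate == together

# A state seen in both corpora pools followers from both, which is how
# a single generated sentence can cross from one text into the other.
after_was = dict(separate[("was",)])
print(same, after_was)
```

In this toy version the weights only rescale counts before they are summed, so they shift the odds at each shared state rather than selecting a source text per sentence.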

> For example, if I trained a model on the KJV and Moby Dick, could I get sentences that combine both texts?

Yep! That's what should happen. (Would be curious to see the output.)

> Or would I just get the right fraction of sentences that come from each text?

Nope! There's currently no way to do that with markovify.
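
If unmixed sentences in a fixed ratio are what you want, one possible workaround lives outside the library: keep the models separate and pick one at random for each sentence. The sketch below is an assumption-laden illustration, not a markovify feature. `make_unmixed_sentence` is a hypothetical helper and `StubModel` is a stand-in for any object with a `make_sentence()` method, such as a `markovify.Text` model.

```python
import random


class StubModel:
    """Stand-in for a trained model; always returns the same sentence."""

    def __init__(self, sentence):
        self.sentence = sentence

    def make_sentence(self):
        return self.sentence


def make_unmixed_sentence(models, weights, rng=random):
    """Pick one model with probability proportional to its weight,
    then let that model alone generate the sentence."""
    model = rng.choices(models, weights=weights, k=1)[0]
    return model.make_sentence()


kjv = StubModel("a sentence from the KJV model")
moby = StubModel("a sentence from the Moby Dick model")

# Roughly three KJV sentences for every Moby Dick sentence.
rng = random.Random(0)
sample = [make_unmixed_sentence([kjv, moby], [3, 1], rng) for _ in range(2000)]
kjv_share = sample.count(kjv.sentence) / len(sample)
```

With real markovify models, `make_sentence()` can return `None` when no acceptable sentence is found, so a caller would likely want to retry in that case.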

@anjabeth
Author

anjabeth commented Nov 2, 2016

Thanks so much! That's what I was guessing - I think the length difference between the texts was just giving me lots more Bible words, but I wanted to make sure that it wasn't a weighting mistake.
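
A rough way to offset that kind of length imbalance is to weight each model inversely to its corpus size, so the longer text does not dominate the combined counts. This is a sketch under assumptions: `balancing_weights` is a hypothetical helper, the word counts are made-up placeholders rather than measurements, and word count is only a proxy for how many transitions each model holds.

```python
def balancing_weights(sizes):
    """Return weights inversely proportional to corpus size,
    scaled so the largest corpus gets weight 1."""
    largest = max(sizes)
    return [largest / size for size in sizes]


# Placeholder word counts (illustrative, not the real lengths of either book).
sizes = [800_000, 200_000]
weights = balancing_weights(sizes)

# After weighting, each corpus contributes the same total mass.
contributions = [size * weight for size, weight in zip(sizes, weights)]

# The weights could then be passed as the second argument, e.g.
# markovify.combine([kjv_model, moby_model], weights)
```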

KJV/Moby Dick didn't produce anything terribly interesting on the couple of test runs I did (I'm currently just setting up the skeleton of my project), but I got some pretty fun results with Moby Dick + Pride and Prejudice:

"I was sure you could not be married all day"
"The envelope contained a sheet of blubber."
"Hold the steak in one hand, and a still slighter shuffling of women's shoes, and all was soon right again."

@jsvine
Owner

jsvine commented Nov 2, 2016

Love those examples. Thanks for sharing!

@jsvine jsvine closed this as completed Nov 2, 2016