I've written a script to recursively find all files with a given extension, generate a chain for each, and, once every file has an associated chain, combine them into one mega-chain and store it.
I'm running this on a very large directory (~1.4 GB), and while coding my script I was aware that holding all of that in RAM (as markovify.Text only accepts strings) would probably be an issue.
I was correct: the process was killed within two seconds of starting.
Is there a way to modify .Text and .NewlineText so they can accept (and properly process, of course) a generator or file-like object to iterate over?
I have no problem implementing this myself and filing a pull request; I'm just unsure how to deal with sentence splitting across chunks.
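For context, the file-discovery and file-reading half of this can already be streamed with the standard library, so only the sentence splitting is the open question. A minimal sketch (function names are mine, not from the script above):

```python
from pathlib import Path


def iter_files(root, extension):
    """Recursively yield paths to files ending with the given extension."""
    yield from Path(root).rglob(f"*{extension}")


def iter_lines(paths):
    """Lazily yield lines from each file in turn.

    At no point is more than one line held in memory, so this scales
    to directories far larger than available RAM.
    """
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            yield from handle
```

A hypothetical streaming constructor could then consume `iter_lines(iter_files(root, ".txt"))` instead of one giant string.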
> I'm just unsure how to deal with sentence splitting along chunks.
I'd recommend using a generator internally for this: it runs over an iterable (a generator, list, or something else) and only yields a new sentence when it encounters a !, ?, or . character.
That way you're relying on Python for maintaining state for you, rather than maintaining state with local variables.
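A minimal sketch of that idea (the function name and chunk handling are my own, and it naively treats every !, ?, and . as a sentence boundary, so abbreviations like "e.g." would be split incorrectly):

```python
def split_sentences(chunks):
    """Yield complete sentences from an iterable of text chunks.

    Sentence state lives in the generator's local `buffer`, so Python's
    generator machinery preserves it between chunks and the caller never
    holds more than one in-progress sentence in memory.
    """
    terminators = {"!", "?", "."}
    buffer = []
    for chunk in chunks:
        for char in chunk:
            buffer.append(char)
            if char in terminators:
                sentence = "".join(buffer).strip()
                if sentence:
                    yield sentence
                buffer = []
    # Flush any trailing text that lacks a terminator.
    tail = "".join(buffer).strip()
    if tail:
        yield tail
```

Note that a sentence boundary can fall anywhere, including mid-chunk or exactly on a chunk edge; because the buffer carries over between chunks, both cases come out the same.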