dcgan - for text generation #3

Closed
johndpope opened this issue Oct 27, 2016 · 6 comments

@johndpope

johndpope commented Oct 27, 2016

I've seen a lot of work around DCGAN for images.
I wondered how this approach could apply to text generation.
I stumbled upon https://github.com/sherjilozair/char-rnn-tensorflow, which will spit out a body of work, but I wondered if you had any thoughts on how the discriminator vs. generator setup could be used to forge text that simulates a specific author.

I found this CNN for classification.
https://github.com/dennybritz/cnn-text-classification-tf

Theoretically, could your code be plugged into these ConvNets?
Let me know if there's interest. I'm building a Parsey McParseface docker api
https://github.com/dmansfield/parsey-mcparseface-api
and I'm looking to explore style transfer for text.

@chrisnovello

+1
I've been wondering about this and just found this post via a github search!

@mqtlam
Owner

mqtlam commented Oct 31, 2016

Thanks for the interest in my code! Theoretically, yes, you can change the discriminator and generator networks in my code to whatever you want and run my code for adversarial training. I think the trick would be designing the appropriate generator and discriminator networks for your task, then playing around with training hyperparameters. I think training might be a challenge since you have to "balance" the generator and discriminator during training (so one doesn't take over the other at first); I'm not quite sure how this would work in the text domain. Your idea sounds fascinating but at the moment I'm busy working on other projects. If you get something to work, please keep me posted!
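To make the "balancing" point a bit more concrete, here is a minimal sketch of one heuristic sometimes used in adversarial training: skip a network's update on a step where its loss suggests it is already winning. This is not taken from the code in this repo. The function name `choose_updates` and the `margin` value are assumptions for illustration only, and whether this helps in the text domain is an open question.

```python
def choose_updates(d_loss, g_loss, margin=0.3):
    """Decide which networks to update on this training step.

    Hypothetical heuristic (not from this repo): a network whose loss
    has dropped below `margin` is treated as "winning", so its update
    is skipped to let the other network catch up. If both losses are
    below the margin, both networks are updated as usual.
    """
    update_d = d_loss >= margin
    update_g = g_loss >= margin
    if not update_d and not update_g:
        update_d = update_g = True
    return update_d, update_g


# Example: the discriminator is far ahead, so only the generator trains.
update_d, update_g = choose_updates(d_loss=0.1, g_loss=2.0)
```

In a real training loop you would call this once per step with the latest loss values and only run the optimizer for the networks flagged `True`.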

@johndpope
Author

johndpope commented Nov 4, 2016

So, I did some reading and watched some YouTube videos about this.
There's an entire domain, Natural Language Processing, worth checking out on YouTube.
https://www.youtube.com/watch?v=AqEF2HIMjYA
Bigrams / n-grams / skip-thought vectors.

From digging through white papers: one approach specific to CNNs is to limit the vocabulary to the subset the author actually uses in the corpus. (Naturally, characters in a book will skew the vernacular and the results, so you'd need to exclude out-of-character text. You'd also want the voice biased to male or female for it to elicit the corresponding tone.) You'd then run the sentence through multiple passes, swapping out words for synonyms used by the author.
(You'd also want to use word2vec vectors, as well as break the sentence apart into tokens with Parsey McParseface: https://github.com/johndpope/DockerParseyMcParsefaceAPI )
Although this software looks more powerful overall:
https://github.com/nltk/nltk
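
A rough sketch of the vocabulary-restriction and synonym-swap idea described above, in plain Python. This is only an illustration under assumptions: the helper names `author_vocab` and `restyle` are made up, the synonym table is a hand-written dict (in practice it might come from a thesaurus or word2vec neighbours), and tokenization is a naive whitespace split rather than a real parser.

```python
from collections import Counter


def author_vocab(corpus_text, min_count=1):
    """Collect the lowercase words the author uses at least `min_count` times."""
    counts = Counter(corpus_text.lower().split())
    return {word for word, count in counts.items() if count >= min_count}


def restyle(sentence, vocab, synonyms):
    """Swap out-of-vocabulary words for a synonym the author actually uses.

    `synonyms` is a hypothetical lookup table: word -> list of candidates.
    Words with no in-vocabulary synonym are left unchanged.
    """
    result = []
    for word in sentence.split():
        key = word.lower()
        if key in vocab:
            result.append(word)
            continue
        candidates = synonyms.get(key, [])
        replacement = next((c for c in candidates if c in vocab), word)
        result.append(replacement)
    return " ".join(result)


corpus = "the weary traveller began his long journey"
vocab = author_vocab(corpus)
synonyms = {"started": ["commenced", "began"], "trip": ["journey"]}
print(restyle("the traveller started his trip", vocab, synonyms))
```

Running the pass several times, as suggested above, would only change the result if the synonym table maps replacements onward to other in-vocabulary words.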

@llSourcell built this repo
https://github.com/llSourcell/AI_Writer

This is worth checking out: a song generator based on Taylor Swift's song corpus.
https://github.com/RajahBimmy/Ghostwriter
(Spoiler: results were disappointing using Markov chains)
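
For anyone unfamiliar with the technique, here is a minimal word-level Markov chain generator. It is not Ghostwriter's code, just a sketch of the general idea: `build_chain` and `generate` are names made up for this example, and a first-order (bigram) chain is assumed. It also hints at why results tend to disappoint: each word depends only on the one before it.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words that follow it in the corpus."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate(chain, start, length, seed=None):
    """Walk the chain from `start`, picking a random follower at each step.

    Stops early if the current word never had a follower in the corpus.
    """
    rng = random.Random(seed)
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)


lyrics = "we are never ever getting back together we are done"
chain = build_chain(lyrics)
print(generate(chain, start="we", length=8, seed=1))
```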

Noteworthy -
https://github.com/paarthneekhara/byteNet-tensorflow

A supervised char-RNN is probably going to get the furthest the fastest, but it's not going to win any Pulitzer Prizes any time soon.

@rsingh43

rsingh43 commented Jun 4, 2018

Did you get DCGAN to work for text generation?

@johndpope
Author

johndpope commented Jun 4, 2018

There's some work in this area for text:
https://github.com/lancopku/DPGAN

Also check out:
https://github.com/hindupuravinash/the-gan-zoo
