jmagdum7/speech-to-image-chatbot

speech-to-image-chatbot

A web application that generates an image from a text description. The user enters a text description of a scene, and the application generates the image that best matches it. The application uses Generative Adversarial Networks (GANs) trained on a large image dataset covering multiple everyday-object categories.

Explanation:

  1. The input is a text description of a scene or object.
  2. GANs take their input as vector representations.
  3. The text description therefore has to be converted to word embeddings, which are vector representations of the text.
  4. A Char CNN-RNN model performs this conversion.
  5. The resulting vectors are passed to the model through AJAX calls.
  6. The model is a stacked architecture of Generative Adversarial Networks with two stages.
  7. The Stage-I GAN sketches the primitive shapes and colors of the scene as a low-resolution image.
  8. The Stage-II GAN adds finer details to the low-resolution image from Stage-I.
  9. The final image generated by the model is passed back to the chatbot interface through AJAX calls.
  10. The image corresponding to the text description is then rendered in the chatbot interface itself.
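The data flow in steps 1 to 8 can be sketched roughly as follows. This is only an illustration of how the stages hand data to each other, not the trained models from this repository: `embed_text`, `stage1_generate`, `stage2_refine` and `text_to_image` are hypothetical names, the hash-based encoder stands in for the Char CNN-RNN, and the two "generators" are simple deterministic functions. Feeding the embedding into the second stage as well as the first is an assumption.

```python
import hashlib


def embed_text(description: str, dim: int = 16) -> list[float]:
    """Toy stand-in for the Char CNN-RNN encoder: text -> fixed-length vector."""
    # Hashing keeps the mapping deterministic; a real encoder is learned.
    digest = hashlib.sha256(description.lower().encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dim]]


def stage1_generate(embedding: list[float], size: int = 4) -> list[list[float]]:
    """Toy stand-in for Stage-I: a coarse, low-resolution grayscale grid."""
    return [
        [embedding[(r * size + c) % len(embedding)] for c in range(size)]
        for r in range(size)
    ]


def stage2_refine(
    coarse: list[list[float]], embedding: list[float], scale: int = 4
) -> list[list[float]]:
    """Toy stand-in for Stage-II: upsample the coarse grid and add detail."""
    size = len(coarse) * scale
    refined = []
    for r in range(size):
        row = []
        for c in range(size):
            base = coarse[r // scale][c // scale]  # nearest-neighbour upsampling
            detail = embedding[(r + c) % len(embedding)]  # assumed text conditioning
            row.append(0.8 * base + 0.2 * detail)
        refined.append(row)
    return refined


def text_to_image(description: str) -> list[list[float]]:
    """Chain the three steps: encode, sketch, refine."""
    embedding = embed_text(description)
    coarse = stage1_generate(embedding)
    return stage2_refine(coarse, embedding)


if __name__ == "__main__":
    image = text_to_image("a red bus parked on a street")
    print(len(image), "x", len(image[0]), "pixels")
```

The point of the split is that Stage-II never starts from scratch: it only has to improve an image that already has roughly the right layout.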

For more detailed explanations of the concepts involved, please read Project Report.pdf.
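The AJAX round trip from steps 5 and 9 could look roughly like this on the server side. This is a minimal sketch under assumptions, not the code in `new_main.py`: the JSON field names (`description`, `image_base64`), the `handle_generate_request` function and the `fake_generate_image` placeholder are all hypothetical, and the sketch assumes the browser sends the raw description and receives the image as a base64 string.

```python
import base64
import json


def fake_generate_image(description: str) -> bytes:
    """Hypothetical placeholder for the GAN pipeline; returns raw bytes."""
    # A real model would return encoded image bytes (for example PNG data).
    return description.encode("utf-8")


def handle_generate_request(body: str) -> str:
    """Turn a JSON request body into a JSON response body."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return json.dumps({"ok": False, "error": "invalid JSON"})

    description = payload.get("description", "") if isinstance(payload, dict) else ""
    if not isinstance(description, str) or not description.strip():
        return json.dumps({"ok": False, "error": "description is required"})

    image_bytes = fake_generate_image(description.strip())
    # Base64 lets binary image data travel inside a JSON string.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"ok": True, "image_base64": encoded})


if __name__ == "__main__":
    request_body = json.dumps({"description": "a bowl of fruit on a table"})
    print(handle_generate_request(request_body))
```

With a response of this shape, the chatbot page could place the decoded image directly into the conversation without reloading.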

How to Run:

  1. Clone or download the repository.
  2. Open the repository folder in a terminal.
  3. Run the command: python new_main.py
  4. Open the link printed in the terminal in a web browser.
  5. Use the application.