This repository has been archived by the owner on Dec 29, 2022. It is now read-only.
In theory it should be easy to support Image Captioning by just swapping out the encoder with something like ResNet/Inception (e.g. tensorflow.contrib.slim.python.slim.nets.inception_v3). However, there are a few things that need to happen to support problems other than text-to-text.
Currently, the parameters to the train/inference scripts are specific to text Sequence-To-Sequence, e.g. source_vocabulary, source_delimiter, etc. We probably need another abstraction layer that defines what kind of task the user is solving and adjusts flags/parameters based on it. For example, I could imagine having a Task class with TextToText, ImageToText, ..., subclasses. The user then passes the type of task as part of the config, and the task class is responsible for setting the appropriate parameters and creating the model.
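To make the Task idea concrete, here is a rough sketch in plain Python. Everything in it is hypothetical: the `required_params` attribute, the `TASKS` registry, the `task_from_config` helper, and the `"task"` config key are names made up for illustration, and `create_model` returns a placeholder dict instead of building a real model.

```python
from abc import ABC, abstractmethod


class Task(ABC):
    """Hypothetical base class; each subclass describes one kind of problem."""

    # Config keys the task requires; subclasses override this.
    required_params = ()

    def __init__(self, config):
        missing = [key for key in self.required_params if key not in config]
        if missing:
            raise ValueError(
                "%s is missing parameters: %s"
                % (type(self).__name__, ", ".join(missing)))
        self.config = config

    @abstractmethod
    def create_model(self):
        """Return the model for this task (here only a placeholder dict)."""


class TextToText(Task):
    required_params = ("source_vocabulary", "target_vocabulary", "source_delimiter")

    def create_model(self):
        return {"encoder": "text", "decoder": "text"}


class ImageToText(Task):
    # No source vocabulary or delimiter: the source side is an image.
    required_params = ("target_vocabulary",)

    def create_model(self):
        return {"encoder": "image", "decoder": "text"}


# Hypothetical registry mapping the task name in the config to its class.
TASKS = {"text_to_text": TextToText, "image_to_text": ImageToText}


def task_from_config(config):
    """Look up the task class named in the config and instantiate it."""
    try:
        task_cls = TASKS[config["task"]]
    except KeyError:
        raise ValueError("unknown or missing task: %r" % config.get("task"))
    return task_cls(config)
```

With something like this, the train/inference scripts would only need to read the task name and hand the rest of the config to the task, e.g. `task_from_config({"task": "image_to_text", "target_vocabulary": "vocab.txt"})`.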
Support for pre-trained networks. For example, when training image captioning models one typically initializes the encoder network with pre-trained image classification network weights. This can probably be done through some kind of SessionRunHook that loads a subset of the variables. In other words, the hooks used in the training script must be configurable.
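As a rough illustration of the "subset of the variables" part, the selection step could look like the sketch below. It is plain Python and does not touch TensorFlow: `select_pretrained_variables` is a hypothetical helper, and it assumes the model and the checkpoint use the same variable names under the encoder scope. The hook itself would still have to do the actual restore.

```python
def select_pretrained_variables(model_var_names, checkpoint_var_names, scope):
    """Return the model variable names under `scope` that the checkpoint also has.

    Assumption: names match one-to-one between model and checkpoint.
    """
    prefix = scope.rstrip("/") + "/"
    available = set(checkpoint_var_names)
    return [name for name in model_var_names
            if name.startswith(prefix) and name in available]


# Example variable names are made up for illustration.
model_vars = ["encoder/conv1/weights", "encoder/conv1/biases", "decoder/rnn/kernel"]
ckpt_vars = ["encoder/conv1/weights", "encoder/conv1/biases", "encoder/logits/weights"]

# Only the encoder variables present in both lists are selected;
# the decoder is left to its normal initialization.
to_restore = select_pretrained_variables(model_vars, ckpt_vars, "encoder")
```

The scope name and the checkpoint location would presumably come from the same config that selects the hooks, which is another reason the hooks need to be configurable.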