Constraints for image captioning module #472

lorisbaz · 2018-07-10T13:50:37Z

Adding

~~Constrains input from plain file.~~
Image Captioning now supports constrained decoding.
Image Captioning: zero padding of features now allows input features of different shape for each image.

Pull Request Checklist

Changes are complete (if posting work-in-progress code, prefix your pull request title with '[WIP]' until you can check this box).
Unit tests pass (pytest)
System tests pass (pytest test/system)
Passed code style checking (./style-check.sh)
You have considered writing a test
Updated major/minor version in sockeye/__init__.py. Major version bump if this is a backwards incompatible change.
Updated CHANGELOG.md

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.


fhieber

nice feature for image captioning!

fhieber · 2018-07-10T14:37:06Z

CHANGELOG.md

@@ -10,6 +10,12 @@ Note that Sockeye has checks in place to not translate with an old model that wa

 Each version section may have have subsections for: _Added_, _Changed_, _Removed_, _Deprecated_, and _Fixed_.

+## [1.18.34]
+### Added
+- Constrains input from plain file. 


This description isn't clear to me: what constrains the input?

I can update that description. Previously, the only way to supply constraints was via a JSON file.

Now you can use a plain file. Each line holds a list of space-separated strings (constraints) that correspond to one image (or source sentence). The file is parallel to the one specified in --input.

so are these then only single-word constraints? How about several multi-word constraints (which is supported by JSON)?

Oh I see. How about having them separated by some special character (e.g., comma or tab)?

E.g.: blue sky, cow, grass, ...

Then we would need to rewrite the entire parsing logic for separating tokens. You can see how complicated this already is with the factored strings. In my opinion, inputs with constraints and/or factors are inherently structured data and not just plain strings. JSON provides a good format for such use cases, whereas plain file parsing is quite error-prone.

I don't want to enforce my opinion on this though. Maybe others can chime in on this discussion? We could also separate the constrained decoding for image captioning feature from this API change to unblock this PR.

I understand your point. However, having the target sentence and constraints coupled (as in the JSON) is not flexible, because we might want to run experiments with different constraints. In image captioning we can have one target file and multiple constraint files (one for each experiment we want to run).

The parsing logic is simple: if json is not provided, we search for the constraint file; otherwise we use the json file as before. Is this too complex?

You mean "source sentence", right?

Creating different constraints for an input sentence is already easy to do. You can use the sockeye.lexical_constraints CLI to produce inputs with arbitrary constraints and pass/pipe them directly to sockeye.translate / sockeye.captioner with the --json-input flag.

I think we should hold off on changing the overall API / behavior of the API/translate CLI with this PR. This PR should concentrate on enabling constraints for image captioning.

Yes. I meant source sentence.

lexical_constraints expects a sentence as input + constraints. Would it work with the image relative path?

I wanted to avoid using it, because in my case I first have to generate the constraints, then the file with source sentence (image path) + constraints, then apply lexical_constraints, and only then can I perform inference. It is a somewhat tricky sequence of steps.

I can revert the changes and remove the option --input-constraints, even though I think it might be useful.

I don't know how your captioner currently handles the content of the 'text' field in the JSON object. If it uses it as a relative image path this should already work.

In the end it's up to you how you define the API for the image captioning use case, but I vote for keeping the general translation API (sockeye.translate) as is by reverting the changes in the files outside of sockeye/image_captioning/.

fhieber · 2018-07-10T14:37:58Z

CHANGELOG.md

+### Added
+- Constrains input from plain file. 
+- Contraints can be used for image captioning module. 
+- Zero padding of features enables to have input features of different shape for each image.


Could you rephrase to make clear that this refers to image captioning? Something like: "Image Captioning: zero padding of features now allows input features of different shape for each image."

fhieber · 2018-07-10T14:38:14Z

CHANGELOG.md

+## [1.18.34]
+### Added
+- Constrains input from plain file. 
+- Contraints can be used for image captioning module. 


maybe: Image Captioning now supports constrained decoding?

fhieber · 2018-07-10T14:39:55Z

sockeye/arguments.py

@@ -1079,6 +1079,12 @@ def add_inference_args(params):
                                    "Optionally, a list of factors can be provided: "
                                    "{'text': 'some input string', 'factors': ['C C C', 'X X X']}.")

+    decode_params.add_argument('--input-constraints',


afaik, this introduces an inconsistency with input from a file or stdin. I think we intentionally restricted constrained decoding to json input to avoid parsing multiple files and having to add tons of checks whether they have the same number of lines as the input file / stdin.

To me it makes sense to be flexible here. You can still use the old way with json.

fhieber · 2018-07-10T14:40:03Z

sockeye/image_captioning/captioner.py

@@ -109,6 +109,10 @@ def main():
    params = arguments.ConfigArgumentParser(description='Image Captioning CLI')
    arguments_image.add_image_caption_cli_args(params)
    args = params.parse_args()
+    run_captining(args)


typo: captioning

Oops, I'll correct it.

fhieber · 2018-07-10T14:41:48Z

sockeye/image_captioning/utils.py

+            raise ValueError("Provided target shape must be bigger then the original "
+                             "shape. (provided: {}, original {})".format(len(tshape), len(fshape)))
+        diff_shape = np.subtract(tshape, fshape)
+        if np.any(diff_shape<0):


pep8: spacing

I know. For some reason, my brain wants compact lines 😄

fhieber · 2018-07-10T14:43:52Z

sockeye/image_captioning/utils.py

+        diff_shape = [[0, d] for d in diff_shape]  # pad format: ((before_1, after_1), ... (before_N, after_N))
+        p = np.pad(f, diff_shape, 'constant', constant_values=0)
+        pad_feat.append(p)
+    return pad_feat


missing newline
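
For readers following the padding hunks above, here is a minimal sketch of the zero-padding idea, assuming every feature already has the same number of dimensions as the target shape. The helper name `zero_pad_features` and its error messages are hypothetical and are not the code merged in this PR; only `np.subtract`, `np.any`, and `np.pad` mirror the calls quoted in the diff.

```python
import numpy as np


def zero_pad_features(features, target_shape):
    """Pad each array with trailing zeros up to target_shape (hypothetical helper)."""
    padded = []
    for feature in features:
        if feature.ndim != len(target_shape):
            raise ValueError("Feature and target shape must have the same number of "
                             "dimensions (target: {}, feature: {})".format(len(target_shape),
                                                                           feature.ndim))
        diff_shape = np.subtract(target_shape, feature.shape)
        if np.any(diff_shape < 0):
            raise ValueError("Target shape must be at least as large as the feature shape "
                             "in every dimension.")
        # np.pad expects ((before_1, after_1), ..., (before_N, after_N));
        # padding only "after" keeps the original values at the start of each axis.
        pad_width = [(0, int(d)) for d in diff_shape]
        padded.append(np.pad(feature, pad_width, mode="constant", constant_values=0))
    return padded
```

Padding only at the end of each axis means the original feature values keep their indices, so features of different sizes can be stacked into one batch after padding.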

fhieber · 2018-07-10T14:44:08Z

sockeye/inference.py

-def make_input_from_multiple_strings(sentence_id: int, strings: List[str]) -> TranslatorInput:
+def make_input_from_multiple_strings(sentence_id: int,
+                                     strings: List[str],
+                                     constraints: str = None) -> TranslatorInput:


type: Optional[str] if it can be None

fhieber · 2018-07-10T14:44:20Z

sockeye/inference.py

    """
    Returns a TranslatorInput object from multiple strings, where the first element corresponds to the surface tokens
    and the remaining elements to additional factors. All strings must parse into token sequences of the same length.

    :param sentence_id: An integer id.
    :param strings: A list of strings representing a factored input sequence.
+    :param constraints: A string with constraints.


"Optional string with constraints" ?

Sounds good.

fhieber · 2018-07-10T14:44:56Z

sockeye/translate.py

@@ -166,6 +174,7 @@ def read_and_translate(translator: inference.Translator,
    :param input_file: Optional path to file which will be translated line-by-line if included, if none use stdin.
    :param input_factors: Optional list of paths to files that contain source factors.
    :param input_is_json: Whether the input is in json format.
+    :param input_constraints: Optional path to file which will contain constrains for each source sentence.


typo: constraints

lorisbaz · 2018-07-11T13:43:13Z

@fhieber I reverted the --input-constraints option and covered the other small issues listed above.

fhieber

Looks good to me! Thanks for iterating and for separating the two changes from each other!

Unless it's urgent to get merged, I'd suggest waiting for approval from @mjpost, who knows the whole constraints business best.

fhieber · 2018-07-11T14:01:42Z

sockeye/inference.py

@@ -754,7 +754,8 @@ def make_input_from_factored_string(sentence_id: int,
    return TranslatorInput(sentence_id=sentence_id, tokens=tokens, factors=factors)


-def make_input_from_multiple_strings(sentence_id: int, strings: List[str]) -> TranslatorInput:
+def make_input_from_multiple_strings(sentence_id: int,


unnecessary change

fhieber · 2018-07-11T14:02:57Z

test/unit/test_inference.py

@@ -311,7 +311,7 @@ def test_failed_make_input_from_valid_json_string(text, text_key, factors, facto
 @pytest.mark.parametrize("strings",
                         [
                             ["a b c"],
-                             ["a b c", "f1 f2 f3", "f3 f3 f3"]
+                             ["a b c", "f1 f2 f3", "f3 f3 f3"],


probably not an issue, but adding the comma could be reverted.

It was just because I removed the constraints manually. Fixing it.

fhieber · 2018-07-11T14:03:04Z

sockeye/image_captioning/utils.py

+        feature_shape = feature.shape
+        if len(feature_shape) < len(target_shape):  # add extra dimensions
+            for i in range(len(target_shape)-len(feature_shape)):
+                feature = np.expand_dims(feature, axis=len(feature.shape)+1)


tiny nit: spacing :D

Arggggg ☠️ 😄
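
A side note on the hunk above: the "add extra dimensions" step can be sketched with `axis=-1`, which appends a trailing axis of size 1. The quoted line uses `axis=len(feature.shape)+1`, which lies beyond the range NumPy documents for `expand_dims`, and as far as I know recent NumPy releases reject it. The helper name `add_trailing_dims` is hypothetical and this is not the code merged in this PR.

```python
import numpy as np


def add_trailing_dims(feature, target_ndim):
    """Append size-1 axes until the array has target_ndim dimensions (hypothetical helper)."""
    while feature.ndim < target_ndim:
        # axis=-1 inserts the new axis after the current last one.
        feature = np.expand_dims(feature, axis=-1)
    return feature
```

For example, a feature of shape `(5,)` expanded to three dimensions ends up with shape `(5, 1, 1)` and can then be zero-padded like any other feature.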

lorisbaz · 2018-07-11T14:27:48Z

Not urgent. But let's not forget about it 😸

fhieber · 2018-07-11T14:30:36Z

I won't forget :)

mjpost · 2018-07-11T22:08:35Z

Looks good to me. Two thoughts:

1. We could consider using a keyword other than "tokens" in TranslatorInput for the path to the image file. There's no need to overload on this and it could be clearer otherwise.
2. Can you add notes to the tutorials to list this feature, for both lexical constraints and image captioning?

mjpost · 2018-07-11T22:13:44Z

As an additional thought, note that the lexical constraints documentation provides a command-line module for generating the JSON object from tab-delimited data on STDIN of the form:

input sentence TAB constraint1 TAB another constraint TAB yet another constraint

If you ignored my thought #1, this would work transparently with image captioning (with field 1 as the path).
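
To illustrate the tab-delimited convention described above, here is a minimal sketch that turns one such line into a JSON object. It is not the code of sockeye.lexical_constraints: the helper name `tsv_line_to_json` and the `constraints` key are assumptions, while the `text` key follows the JSON example quoted in the arguments.py hunk earlier in this thread.

```python
import json


def tsv_line_to_json(line: str) -> str:
    """Convert 'input TAB constraint1 TAB constraint2 ...' into a JSON string (hypothetical helper)."""
    fields = line.rstrip("\n").split("\t")
    text = fields[0]
    # Everything after the first field is a constraint; multi-word constraints
    # stay intact because only tabs separate the fields.
    constraints = [field for field in fields[1:] if field]
    obj = {"text": text}
    if constraints:
        obj["constraints"] = constraints  # key name is an assumption
    return json.dumps(obj, ensure_ascii=False)
```

With field 1 holding an image path, a line such as `images/cow.jpg<TAB>blue sky<TAB>cow` would carry two constraints, one of them multi-word, which is the case the plain space-separated file could not express.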

lorisbaz · 2018-07-17T12:49:23Z

I removed the option to have a plain file. We will use JSON files as you and Felix suggested.

I will add some notes about lexical constraints in the image captioning tutorial.

lorisbaz · 2018-07-26T08:18:14Z

Hey @mjpost, I added the instructions. Can you have a look and approve if it is ok?
Thanks.

mjpost · 2018-07-26T10:12:26Z

LGTM. You'll have to remerge master once more since we just fixed the pylint issues that caused Travis to fail.

lorisbaz · 2018-07-26T12:59:55Z

Done

lorisbaz and others added 4 commits May 11, 2018 10:08

Merge pull request #1 from awslabs/master

faa6704

Merge from upstream (original) repository

Merge pull request #2 from awslabs/master

11d6066

Merge from awslabs/sockeye

Merge pull request #3 from awslabs/master

efcd39f

Merge from awslabs/sockeye

Adding constrains input from plain file. Adding contraints in image c…

8bd2b0d

…aptioning module. Zero padding of features enables to have input features of different shape for each image.

lorisbaz requested review from davvil, fhieber, mjdenkowski and tdomhan as code owners July 10, 2018 13:50

Merge branch 'master' into master

41a38c9

lorisbaz changed the title ~~[WIP] Constraints for image captioning module~~ Constraints for image captioning module Jul 10, 2018

fhieber requested changes Jul 10, 2018


fhieber added the feature label Jul 10, 2018

Reverting --input-constraints option. Small fixes from code review.

6f67015

fhieber approved these changes Jul 11, 2018


Fix

8977c9e

Instructions constrains for image captioning.

0152932

Merge branch 'master' into master

b885f58

mjpost approved these changes Jul 26, 2018


lorisbaz and others added 2 commits July 26, 2018 14:57

Merge branch 'master' into master

66ff5ac

Version init

8c3e0f5

Merge branch 'master' into master

a8829aa

tdomhan merged commit e6cd190 into awslabs:master Jul 26, 2018
