Confusion about the code #7

Closed
songpipi opened this issue Jul 9, 2019 · 4 comments

@songpipi

songpipi commented Jul 9, 2019

In process_descriptions.py, the 'key' is the intersection between the sentence and category_name. In input_pipeline.py, however, 'key' and 'sentence' do not have this relationship.
In process_descriptions.py:

>sentence
[1, 0, 58, 595, 10, 349, 12, 782, 0, 579, 3, 2]
>key
[595]

In input_pipeline.py:

>sentence[0]
[   0,   65,   19, 1130,   37,  882,   10,  124,    5,   48,    5,
        345,    1,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0]
>key[0]
[8265, 2390,  878, 4930,   10,  436,    5,    7,  118, 2433,    8,
        388,  558,    5,  139,    6,    1,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0]

It seems that the 'sentence' is completely unrelated to the 'key'. Is that expected? And why is the 'key' different in the two files?

@fengyang0317
Owner

fengyang0317 commented Jul 9, 2019

Sorry about the naming. The key in input_pipeline.py is a noisy version of the sentence. You may refer to
https://github.com/fengyang0317/unsupervised_captioning/blob/master/input_pipeline.py#L82

In that file, the objects are named classes:
https://github.com/fengyang0317/unsupervised_captioning/blob/master/input_pipeline.py#L60

@songpipi
Author

I'm still confused. For example, here is a sentence:

ipdb> sess.run(sentence)
array([  1,   0,  29, 108,  42,  95,   4,  35,   5,   0,  97,  96,   3,
         2,   0,   0,   0,   0,   0], dtype=int32)

After the parse_sentence function, its noisy version is:

ipdb> sess.run(key)
array([  0, 108,  29,  95,  35,   4,  97,   0,  96,   2,   3,   0,   0,
         0,   0,   2], dtype=int32)

But why is the 'sentence' completely unrelated to the 'key' in the corresponding batch in input_pipeline.py?

@fengyang0317
Owner

They are related: the key is the sentence with noise added. To see how, read the paper's description of the noise, or read the controlled_shuffle and random_drop functions.
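
For readers following along, here is a minimal plain-Python sketch of the kind of noise those two function names suggest. It is an assumption about the general technique, not a copy of the code in input_pipeline.py. The parameters `d` and `p`, their defaults, and the helper `add_noise` are hypothetical. The idea is that some tokens are dropped at random and the rest are shuffled only locally, so the key keeps most of the sentence's tokens in roughly the same order.

```python
import random


def controlled_shuffle(tokens, d=3.0, rng=random):
    """Locally shuffle tokens: add uniform noise in [0, d) to each index
    and reorder the tokens by the noisy index (hypothetical parameter d)."""
    noisy_pos = [i + rng.uniform(0, d) for i in range(len(tokens))]
    order = sorted(range(len(tokens)), key=lambda i: noisy_pos[i])
    return [tokens[i] for i in order]


def random_drop(tokens, p=0.1, rng=random):
    """Drop each token independently with probability p (hypothetical p)."""
    return [t for t in tokens if rng.random() >= p]


def add_noise(tokens, d=3.0, p=0.1, seed=None):
    """Hypothetical helper: drop tokens first, then shuffle what is left."""
    rng = random.Random(seed)
    return controlled_shuffle(random_drop(tokens, p, rng), d, rng)


if __name__ == "__main__":
    # Token ids taken from the sentence posted above, without padding.
    sentence = [1, 0, 29, 108, 42, 95, 4, 35, 5, 0, 97, 96, 3, 2]
    print(add_noise(sentence, seed=0))
```

Under these assumptions, every token in the noisy output also appears in the input, and the shuffle alone moves a token fewer than `d` positions. That matches the second example above, where the key reads like the sentence with a few tokens missing and neighbours swapped.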

@songpipi
Author

OK, I'll take a closer look. Thank you very much for your friendly help!
