Personality-Captions Dataset #3738

Neha10252018 · 2021-06-22T09:25:55Z

I have downloaded the dataset. Going through it, the training set has one caption per image, whereas the test set has 5 different columns. In the final output shown in the paper, one image has 5 different personality-trait outputs, but in the dataset I can see only one comment and one personality trait per image. How is it possible to get 5 different outputs for a single image? Can someone please explain?

Neha10252018 · 2021-06-22T11:10:32Z

Could the author (maybe @klshuster) please respond?

klshuster · 2021-06-22T14:00:18Z

Hi there, you are correct that the training and validation splits only have 1 caption per image, whereas the test set has 5 captions per image; the test set was collected this way so that reference BLEU scores could be computed.

If you're referring to Table 6 in the paper, those are generated outputs, where the model outputs a response conditioned on the listed personality. This table demonstrates the flexibility of the model (and the efficacy of personality conditioning).

Neha10252018 · 2021-06-22T14:09:04Z

Hi, thanks for your response. So it is not compulsory to get 5 outputs for a single image?

In general, the output should be 1 image with 1 personality trait and its caption. Is that what you mean?

klshuster · 2021-06-22T16:37:02Z

I am not sure I understand the question, could you please elaborate?

Neha10252018 · 2021-06-22T17:39:48Z

My doubt was: is the dataset designed in such a way that we must get 5 outputs for each image?

Or is one trait and caption per image also okay?

Neha10252018 · 2021-06-22T17:41:55Z

So if I am implementing my model and I get the output below, is that correct?

Neha10252018 · 2021-06-22T17:43:58Z

Also, from the training set, can I ignore the candidates and 500_candidates columns and carry out my work? Will that affect anything?

klshuster · 2021-06-22T20:13:49Z

You only need to output one caption per image. The 5 captions on the test set are used for computing automated metrics. The additional captions on images in the paper are examples from the model with different personalities at inference time. They are merely showing that you can control the model.

The candidates and 500_candidates fields are used for response selection (retrieval-based models). They are used to determine retrieval-based metrics as specified in the paper (R@1). If you are training a generative model, you do not need to worry about them.

Hope that answers your questions

Neha10252018 · 2021-06-22T20:42:52Z

So is this output correct as per the dataset?

klshuster · 2021-06-23T14:53:04Z

yes!

Neha10252018 · 2021-06-24T09:24:24Z

Is there a way I can get your complete code to check how it works? I mean, is there a link to the complete code?

klshuster · 2021-06-24T14:54:02Z

this project page details how to use the dataset within ParlAI: https://parl.ai/projects/personality_captions/

PineappleWill · 2021-06-28T03:30:12Z

Hi, could you please tell me how to download the personality-caption dataset? I can't find any clue in ParlAI

klshuster · 2021-06-28T13:46:15Z

Hi @PineappleWill, as mentioned in the linked project page, the dataset can be accessed via -t personality_captions using ParlAI. As is also mentioned in the project page, you can take a look at the ParlAI Quick-Start page to understand how to set up and use ParlAI. Specifically, once you've installed ParlAI, simply run parlai display_data -t personality_captions; this will download the dataset for you.

Neha10252018 · 2021-06-29T10:28:44Z

Should we also divide the image dataset into train, test and validation? Because we only have the caption files as test, train and validation in .json format.

klshuster · 2021-06-29T13:55:31Z

the data entries in the json files include fields for the image id corresponding to the relevant image. The images are indeed unique by split

Neha10252018 · 2021-06-29T14:32:23Z

So you mean there is no need to split the images again? When I downloaded the dataset I got 2 folders: one is personality_captions and the other is yfcc_images.

Neha10252018 · 2021-06-29T14:45:19Z

Or do you mean I need to create separate test, train and val folders for the images, taking their IDs from the json files?

Or should I just train on the train folder? Could you please clear up this doubt?

klshuster · 2021-06-29T15:07:16Z

the yfcc images folder has all of the images. the splits of the images are within the json files. You will need to look at the json files to determine which images correspond to which split.
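
For anyone working outside ParlAI, the idea above (one shared image folder, with split membership read from the json files) can be sketched roughly as follows. This is only a sketch under assumptions: the file names `train.json`, `val.json`, `test.json`, the `image_hash` field name, and the `.jpg` extension are not confirmed in this thread and may need adjusting to match the downloaded files.

```python
import json
from pathlib import Path

# Assumed names (not confirmed in this thread): the split files and the
# per-entry field that holds the image id.
SPLIT_FILES = {"train": "train.json", "valid": "val.json", "test": "test.json"}
IMAGE_ID_FIELD = "image_hash"


def load_split_image_ids(captions_dir, split_files=SPLIT_FILES, id_field=IMAGE_ID_FIELD):
    """Return {split_name: set of image ids} read from the caption json files."""
    ids_by_split = {}
    for split, filename in split_files.items():
        with open(Path(captions_dir) / filename, encoding="utf-8") as f:
            entries = json.load(f)  # assumed layout: a JSON list of dicts
        ids_by_split[split] = {entry[id_field] for entry in entries}
    return ids_by_split


def image_paths_for_split(ids_by_split, split, images_dir, extension=".jpg"):
    """Build the image paths for one split without moving or copying any files."""
    return sorted(Path(images_dir) / f"{image_id}{extension}" for image_id in ids_by_split[split])
```

With a mapping like this, no separate train/val/test image folders are needed: a data loader can take the list of paths for the split it is iterating over and read the images from the single shared folder.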

Neha10252018 · 2021-06-29T16:08:04Z

So do you mean I need to create different folders again?

klshuster · 2021-06-29T17:15:03Z

how are you interacting with the dataset? if you are using parlai you don't need to create folders, it's all handled within the code.

If you are using the dataset outside of ParlAI, I can't really help you as I do not know what system you are using

Neha10252018 · 2021-06-29T19:09:09Z

I am working on Google Colab

Neha10252018 · 2021-06-29T19:09:30Z

outside ParlAI

Neha10252018 · 2021-07-01T09:05:23Z

I am getting the error below when I run the code

Neha10252018 · 2021-07-01T14:00:50Z

@klshuster could you please let me know

klshuster · 2021-07-01T14:06:19Z

we don't support windows so I am unable to help you out with this

Neha10252018 · 2021-07-01T14:11:03Z

If you have a look at that image, I am inside the ParlAI environment, so is that anything you can help with?

mojtaba-komeili · 2021-07-01T22:21:38Z

The error seems to be coming from a urllib3 version mismatch. Have you tried using pip to install the particular version that it needs directly?

Neha10252018 · 2021-07-02T10:04:06Z

@mojtaba-komeili thanks for responding, but which version do I need to check, and how?

mojtaba-komeili · 2021-07-06T14:50:48Z

Based on this I say anything 1.* that is higher than 1.25.9 should work (maybe 1.26.6).
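
As a rough way to check the installed version against the range suggested above (a 1.x release higher than 1.25.9), something like the following could be used. It is a sketch only: the helper names are made up for illustration, and the version comparison looks at the numeric release segments and ignores any pre-release suffix.

```python
import re
from importlib import metadata


def parse_version(version):
    """Turn '1.26.6' into (1, 26, 6), using only the leading numeric segments."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def in_suggested_range(version, minimum="1.25.9"):
    """True for 1.x releases strictly higher than `minimum`."""
    parsed = parse_version(version)
    return parsed[0] == 1 and parsed > parse_version(minimum)


def installed_urllib3_version():
    """Return the installed urllib3 version string, or None if it is not installed."""
    try:
        return metadata.version("urllib3")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_urllib3_version()
    if found is None:
        print("urllib3 is not installed")
    else:
        print(f"urllib3 {found}: in suggested range = {in_suggested_range(found)}")
```

If the installed version falls outside the range, pinning one with pip (for example `pip install "urllib3==1.26.6"`, the version tentatively mentioned above) is the direct way to try the suggestion.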

Neha10252018 · 2021-07-16T15:38:06Z

@klshuster As you mentioned in the first point, the additional captions in the paper were an example to show how the model can be controlled. May I know where those 5 different captions for the same image were taken from? As per the dataset, one image has one caption, so I am a bit confused. Could you please explain in detail?

Neha10252018 · 2021-07-17T11:43:51Z

@klshuster Could you please respond to this last question? It would help me carry on with my project.

Neha10252018 · 2021-07-19T12:23:12Z

Can someone please respond? It would be of great help @stephenroller

klshuster · 2021-07-19T14:06:13Z

these are captions from the retrieval model during inference. the candidate set is all utterances from the training set. the model is given a test image and the shown personality, and then is asked to retrieve a relevant response (from the training utterances).

Neha10252018 · 2021-07-19T22:25:04Z

@klshuster I am really sorry, could you please elaborate a bit, as it's a bit confusing and tricky?

klshuster · 2021-07-20T14:50:45Z

Are you familiar with how dialogue retrieval models work? A retrieval model is given a personality and an image and is asked to produce a response. It is a retrieval-based model, not a generative model: the model scores a set of candidate sentences and returns the highest-scoring sentence as its response.

The model needs a set of utterances from which to choose for its response. We take all human utterances from the training set of Personality-Captions and allow the model to select a response from this set. Because the model was trained on several images with several personalities, the model can select several different top responses for the same image, if the personality input is varied (Happy, Sad, Angry, etc.)
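
To make the retrieval idea above concrete, here is a toy sketch. Everything in it is an illustrative assumption: the hand-written 4-dimensional vectors, the additive combination of image and personality, and the dot-product score are stand-ins for learned encoders, not the architecture from the paper.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def retrieve(image_vec, personality_vec, candidates):
    """Score every candidate against the (image, personality) context and
    return the text of the highest-scoring one.

    candidates: list of (text, vector) pairs, e.g. training-set utterances.
    """
    # Toy context encoding: element-wise sum of the two input vectors.
    context = [a + b for a, b in zip(image_vec, personality_vec)]
    best_text, _ = max(candidates, key=lambda cand: dot(context, cand[1]))
    return best_text


# Toy feature layout: [beach content, dog content, happy tone, sad tone]
CANDIDATES = [
    ("What a gorgeous day at the beach!", [1, 0, 1, 0]),
    ("The empty beach makes me feel lonely.", [1, 0, 0, 1]),
    ("Such a cheerful little dog!", [0, 1, 1, 0]),
]
BEACH_IMAGE = [1, 0, 0, 0]
HAPPY = [0, 0, 1, 0]
SAD = [0, 0, 0, 1]

if __name__ == "__main__":
    print(retrieve(BEACH_IMAGE, HAPPY, CANDIDATES))
    print(retrieve(BEACH_IMAGE, SAD, CANDIDATES))
```

The image is the same in both calls; only the personality input changes, and that alone is enough to change which candidate ranks first. This is how one image can end up with several different captions even though each dataset entry pairs an image with a single caption.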

Neha10252018 · 2021-07-21T08:59:03Z

@klshuster Yes, I understand that, thank you. But I am really confused about the candidates column in the validation set and the candidates, additional_comments and 500_candidates columns in the test set. Could you please let me know where I need to use them? The training set doesn't have any additional columns, and the model is trained on that set only.

klshuster · 2021-07-21T21:41:14Z

I described those additional columns here: #3738 (comment)

If you're training a generative model you don't need to worry about them.

Neha10252018 · 2021-07-22T14:27:44Z

One last question: for a generative model, should I use the additional_comments column?

Because for training I am giving only 3 columns.

Also, can I concatenate the comment and additional comment columns into a single column and use it for testing?

Similarly, for a retrieval model, can I concatenate all the columns (candidates, 500_candidates, comments, additional comments) into a single comment column?

klshuster · 2021-07-22T17:00:24Z

what do you mean by concatenating the columns?

Neha10252018 · 2021-07-22T21:44:15Z

merging the columns and making it as one comment column

klshuster · 2021-07-22T23:00:51Z

they are separate comments so i am not sure how that would work

additional_comments: For the test set, we collected 5 captions (with 5 styles) per image. This is for measuring reference BLEU scores.
candidates: For the valid and test set, we evaluate retrieval models by having them rank the 100 captions in this field (1 of the 100 is the gold caption)
500_candidates: For the test set, we have an additional ranking measure where we place the 5 gold captions in a set of 500 (to mimic the ratio of the 1 in 100 for candidates)

I'm not sure how else I can describe this
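
For readers who find code easier to follow, the ranking evaluation over the candidates fields can be sketched as below. This is a simplified illustration, not ParlAI's metric code: the function names are hypothetical, and counting a hit whenever any of the 5 gold captions is ranked first (for 500_candidates) is an assumption about how the multi-gold case is scored.

```python
def hits_at_k(ranked_candidates, gold_captions, k=1):
    """Return 1.0 if any gold caption appears in the top-k ranked candidates, else 0.0."""
    gold = set(gold_captions)
    return 1.0 if any(cand in gold for cand in ranked_candidates[:k]) else 0.0


def mean_hits_at_k(examples, k=1):
    """Average hits@k over (ranked_candidates, gold_captions) pairs."""
    scores = [hits_at_k(ranked, gold, k) for ranked, gold in examples]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Each example: the model's ranking of its candidate set (best first)
    # and the gold caption(s) hidden somewhere in that set.
    examples = [
        (["gold a", "distractor 1", "distractor 2"], ["gold a"]),
        (["distractor 3", "gold b", "distractor 4"], ["gold b"]),
    ]
    print(mean_hits_at_k(examples, k=1))
```

The candidate lists are only an evaluation device for retrieval models: each one is a pool that the model ranks, which is why merging them with the comment column into a single text field would not be meaningful.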

Neha10252018 · 2021-07-28T11:46:14Z

@klshuster May I know which OS you ran the code on, please?

klshuster · 2021-07-28T16:39:57Z

ubuntu

Neha10252018 · 2021-07-29T13:17:36Z

When I run the code you provided to evaluate the model inside the ParlAI env, I get the following:
Creating or loading model
13:59:31 | Opt:
13:59:31 | activation: relu
13:59:31 | adam_alpha: 0.0005
13:59:31 | additional_layer_dropout: 0.2
13:59:31 | additional_layer_text: 1
13:59:31 | aggregate_micro: False
13:59:31 | allow_missing_init_opts: False
13:59:31 | area_under_curve_class: None
13:59:31 | area_under_curve_digits: -1
13:59:31 | attention_dropout: 0.2
13:59:31 | batch_length_range: 5
13:59:31 | batch_sort: False
13:59:31 | batch_sort_cache: none
13:59:31 | batchsize: 500
13:59:31 | bpe_add_prefix_space: None
13:59:31 | bpe_debug: False
13:59:31 | bpe_dropout: None
13:59:31 | bpe_merge: None
13:59:31 | bpe_num_symbols: 30000
13:59:31 | bpe_vocab: None
13:59:31 | context_length: -1
13:59:31 | datapath: C:\Users\nehaj\anaconda3\Lib\site-packages\data
13:59:31 | datasplit: 200k
13:59:31 | datatype: valid
13:59:31 | dict_build_first: True
13:59:31 | dict_class: None
13:59:31 | dict_endtoken: END
13:59:31 | dict_file: C:\Users\nehaj\anaconda3\Lib\site-packages\data\models\personality_captions/transresnet/model.dict
13:59:31 | dict_include_test: False
13:59:31 | dict_include_valid: False
13:59:31 | dict_initpath: None
13:59:31 | dict_language: english
13:59:31 | dict_loaded: True
13:59:31 | dict_lower: False
13:59:31 | dict_max_ngram_size: -1
13:59:31 | dict_maxexs: -1
13:59:31 | dict_maxtokens: -1
13:59:31 | dict_minfreq: 0
13:59:31 | dict_nulltoken: NULL
13:59:31 | dict_starttoken: START
13:59:31 | dict_textfields: text,labels
13:59:31 | dict_tokenizer: re
13:59:31 | dict_unktoken: UNK
13:59:31 | display_examples: False
13:59:31 | download_path: None
13:59:31 | dropout: 0.4
13:59:31 | dynamic_batching: None
13:59:31 | embedding_size: 300
13:59:31 | embedding_type: None
13:59:31 | embeddings_scale: True
13:59:31 | encoder_type: transformer
13:59:31 | eval_batchsize: 18
13:59:31 | evaltask: None
13:59:31 | ffn_size: 1200
13:59:31 | fixed_cands_path: None
13:59:31 | freeze_patience: 2
13:59:31 | hidden_dim: 300
13:59:31 | hide_labels: False
13:59:31 | image_cropsize: 224
13:59:31 | image_features: resnet
13:59:31 | image_features_dim: 2048
13:59:31 | image_mode: resnet152
13:59:31 | image_size: 256
13:59:31 | include_image: True
13:59:31 | include_labels: True
13:59:31 | include_persona: True
13:59:31 | include_personality: True
13:59:31 | include_resnet_features: False
13:59:31 | include_uru_features: False
13:59:31 | init_model: none
13:59:31 | init_opt: None
13:59:31 | is_debug: False
13:59:31 | learn_positional_embeddings: False
13:59:31 | learningrate: 0.0005
13:59:31 | load_embeddings_from: /private/home/samuelhumeau/data/crawl-300d-2M.vec
13:59:31 | load_encoder_from: None
13:59:31 | load_transformer_from: /private/home/samuelhumeau/pretrained/encoder_reddit/redditbest.mdl
13:59:31 | log_every_n_secs: 5.0
13:59:31 | log_keep_fields: all
13:59:31 | loglevel: info
13:59:31 | max_length_sentence: 32
13:59:31 | max_train_time: 17280.0
13:59:31 | metrics: default
13:59:31 | model: projects.personality_captions.transresnet.transresnet:TransresnetAgent
13:59:31 | model_file: C:\Users\nehaj\anaconda3\Lib\site-packages\data\models\personality_captions/transresnet/model
13:59:31 | model_parallel: False
13:59:31 | multitask_weights: [1]
13:59:31 | mutators: None
13:59:31 | n_decoder_layers: -1
13:59:31 | n_encoder_layers: -1
13:59:31 | n_heads: 6
13:59:31 | n_layers: 4
13:59:31 | n_positions: 1000
13:59:31 | n_segments: 0
13:59:31 | no_cuda: False
13:59:31 | num_cands: 100
13:59:31 | num_epochs: -1
13:59:31 | num_examples: -1
13:59:31 | num_layers_all: 2
13:59:31 | num_layers_image_encoder: 1
13:59:31 | num_layers_text_encoder: 1
13:59:31 | num_test_labels: 5
13:59:31 | numthreads: 1
13:59:31 | numworkers: 4
13:59:31 | one_cand_set: False
13:59:31 | output_scaling: 1.0
13:59:31 | override: "{'datatype': 'valid', 'ffn_size': 1200, 'attention_dropout': 0.2, 'relu_dropout': 0.2, 'n_positions': 1000}"
13:59:31 | parlai_home: /private/home/kshuster/ParlAI
13:59:31 | pretrained: True
13:59:31 | pytorch_context_length: -1
13:59:31 | pytorch_datafile:
13:59:31 | pytorch_datapath: None
13:59:31 | pytorch_include_labels: True
13:59:31 | pytorch_preprocess: False
13:59:31 | pytorch_teacher_batch_sort: False
13:59:31 | pytorch_teacher_dataset: None
13:59:31 | pytorch_teacher_task: None
13:59:31 | relu_dropout: 0.2
13:59:31 | report_filename:
13:59:31 | save_after_valid: True
13:59:31 | save_every_n_secs: -1
13:59:31 | save_format: conversations
13:59:31 | share_word_embeddings: True
13:59:31 | short_final_eval: False
13:59:31 | show_advanced_args: False
13:59:31 | shuffle: False
13:59:31 | starttime: May02_13-51
13:59:31 | task: personality_captions
13:59:31 | tensorboard_comment:
13:59:31 | tensorboard_log: False
13:59:31 | tensorboard_logdir: None
13:59:31 | tensorboard_metrics: None
13:59:31 | tensorboard_tag: None
13:59:31 | truncate: 32
13:59:31 | use_provided_candidates: True
13:59:31 | validation_cutoff: 1.0
13:59:31 | validation_every_n_epochs: 1
13:59:31 | validation_every_n_secs: -1
13:59:31 | validation_max_exs: -1
13:59:31 | validation_metric: accuracy
13:59:31 | validation_metric_mode: max
13:59:31 | validation_patience: 5
13:59:31 | validation_share_agent: False
13:59:31 | variant: aiayn
13:59:31 | verbose: False
13:59:31 | world_logs:
13:59:31 | Evaluating task personality_captions using datatype valid.
13:59:31 | creating task(s): personality_captions
Please confirm that you have obtained permission to work with the YFCC100m dataset, as outlined by the steps listed at https://multimediacommons.wordpress.com/yfcc100m-core-dataset/ [Y/y]: y
NOTE: This script will download each image individually from the s3 server on which the images are hosted. This will take a very long time. Are you sure you would like to continue? [Y/y]: y
[downloading images to C:\Users\nehaj\anaconda3\Lib\site-packages\data\yfcc_images]

And once this is done, it just keeps asking me the following:

Please confirm that you have obtained permission to work with the YFCC100m dataset, as outlined by the steps listed at https://multimediacommons.wordpress.com/yfcc100m-core-dataset/ [Y/y]: y
NOTE: This script will download each image individually from the s3 server on which the images are hosted. This will take a very long time. Are you sure you would like to continue? [Y/y]: y
[downloading images to C:\Users\nehaj\anaconda3\Lib\site-packages\data\yfcc_images]
0%| | 0/201858 [00:00<?, ?it/s]
14:14:32 | Of 201858 items, 201803 already existed; only going to download 55 items.
100%|███████████████████████████████████████████████████████████████████████▉| 201803/201858 [00:21<00:00, 9417.55it/s]
14:14:32 | Going to download 1 chunks with 100 images per chunk using 32 processes.
Downloading: 100%|██████████████████████████████████████████████████████████▉| 201803/201858 [00:40<00:00, 9417.55it/s]
14:15:19 | Bad download - chunk: 0, dest_file: ac80c5633d76c27b352ee6352ddbb3.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac88d66ad654f2739bbfdfbe55c2bdb.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac8b1beeb050fa26b970dc2fee5ef539.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac81b9680ba5cab4436ad2528555da.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac8788acffc6226816827a943e69bbf.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac8d98a8fab765d92a678295fe9b2d.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac866ff82c6319994c8568a91f8aa2a.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac845e9d3081d9415d8a4b49c7dca7.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac8aaf341c293ed43ce33358c4801c.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac8b31a49a2ed5375724ad7e8fd80ff.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac827b33adaaefb430ce867c4fec7732.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac84b4dc481dd318ae3977a755bd3742.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac83a308c357da220a394e7164da956.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac8cfa458c978d903e87459ffd66c2a.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac84b63a2b27ddc08dd7e769593b25c2.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac83a907bc318b96b466179876ad093.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac854f4a9d99e34ec5223b991aa2c887.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac80835da2c2e5f021dd63ed56d0be93.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac8a4530c32027b32d52bc899697d8.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac8ea6f73a10a31c3d4920c174fe96.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac882dd35d8bf3cee5efeddbe1f399.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac8d22b10c9dee47015761898ae75.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac8ba82a7ae6d761e1d3582dddbdecdf.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac87681e52e3a709e48cb40ce18bedf.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac827d5424278a62aa6aad18d439a.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac83f9f225f695c2a633d44dcbbce55.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac833f723b92e13b6e314d644f76837.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac8a241d2041cb26eba548ec1e7d128.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac849ffe6d25ee9bb21787a39c1926.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac8287a425804c7c8cc56ca590de1435.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac8567fbe08f7d825bf47ea6846693dc.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac85d4417242b9a36c3f45a6a32a138.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac8bcdc098f3698665b91ab9146cc3.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac8e6eca5713ee25f631e79d9a3355ac.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac817cd3ccfec3358265dee15ec616af.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac84ecd0fc1f3e17772d8f561a11add.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac822b755268b2a6ce231cc0e1ad588.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac83cdb55a9e6a79ecc879451dacb3.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac8d7d1ce0f32d1e7ee4d838c9b1b94.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac80a9f51a66169dda8eec89cda2a289.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac8ddea4829c7eba37ac0c81a7ef634.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac83184fd40c475501ef12160eefa1c.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac85a84da55bfb3497f038822344596d.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac86f15b386ea87d3d240fac81f166c.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac8dd7f5795743061f480a56aec7c97.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac841340786841ee7e665051fe58d9.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac8176a2fb143c79c22488d104ece72.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac89925318fa67f3e018da2547bbe2.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac87838e7846735b27d54a7d5dbc4ee.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac8628ffeed36884336ceab1586cb1.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac81db90ac691dfdd275b2e6ec299ca4.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac8f9ff369308ac4d3643d3114c6718b.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac8ee3225ea20433642b347f8fa8d81.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac83d37bf87d21f31bcbc3c3f7714f99.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:15:19 | Bad download - chunk: 0, dest_file: ac85ef996efd9aed1f91293c1552e9e5.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
Downloading: 100%|███████████████████████████████████████████████████████████| 201858/201858 [01:08<00:00, 2933.63it/s]
14:15:19 | Of 55 items attempted downloading, 55 had errors.
Please confirm that you have obtained permission to work with the YFCC100m dataset, as outlined by the steps listed at https://multimediacommons.wordpress.com/yfcc100m-core-dataset/ [Y/y]: y
NOTE: This script will download each image individually from the s3 server on which the images are hosted. This will take a very long time. Are you sure you would like to continue? [Y/y]: y
[downloading images to C:\Users\nehaj\anaconda3\Lib\site-packages\data\yfcc_images]
0%| | 0/201858 [00:00<?, ?it/s]
14:15:56 | Of 201858 items, 201803 already existed; only going to download 55 items.
100%|███████████████████████████████████████████████████████████████████████▉| 201803/201858 [00:22<00:00, 9072.63it/s]
14:15:56 | Going to download 1 chunks with 100 images per chunk using 32 processes.
Downloading: 100%|██████████████████████████████████████████████████████████▉| 201803/201858 [00:39<00:00, 9072.63it/s]
14:16:44 | Bad download - chunk: 0, dest_file: ac80c5633d76c27b352ee6352ddbb3.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac88d66ad654f2739bbfdfbe55c2bdb.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac8b1beeb050fa26b970dc2fee5ef539.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac81b9680ba5cab4436ad2528555da.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac8788acffc6226816827a943e69bbf.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac8d98a8fab765d92a678295fe9b2d.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac866ff82c6319994c8568a91f8aa2a.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac845e9d3081d9415d8a4b49c7dca7.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac8aaf341c293ed43ce33358c4801c.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac8b31a49a2ed5375724ad7e8fd80ff.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac827b33adaaefb430ce867c4fec7732.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac84b4dc481dd318ae3977a755bd3742.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac83a308c357da220a394e7164da956.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac8cfa458c978d903e87459ffd66c2a.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac84b63a2b27ddc08dd7e769593b25c2.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac83a907bc318b96b466179876ad093.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac854f4a9d99e34ec5223b991aa2c887.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac80835da2c2e5f021dd63ed56d0be93.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac8a4530c32027b32d52bc899697d8.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac8ea6f73a10a31c3d4920c174fe96.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac882dd35d8bf3cee5efeddbe1f399.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac8d22b10c9dee47015761898ae75.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac8ba82a7ae6d761e1d3582dddbdecdf.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac87681e52e3a709e48cb40ce18bedf.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac827d5424278a62aa6aad18d439a.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac83f9f225f695c2a633d44dcbbce55.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac833f723b92e13b6e314d644f76837.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac8a241d2041cb26eba548ec1e7d128.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac849ffe6d25ee9bb21787a39c1926.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac8287a425804c7c8cc56ca590de1435.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac8567fbe08f7d825bf47ea6846693dc.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac85d4417242b9a36c3f45a6a32a138.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | Bad download - chunk: 0, dest_file: ac8bcdc098f3698665b91ab9146cc3.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>
14:16:44 | ←[31mBad download - chunk: 0, dest_file: ac8e6eca5713ee25f631e79d9a3355ac.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>←[0m
14:16:44 | ←[31mBad download - chunk: 0, dest_file: ac817cd3ccfec3358265dee15ec616af.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>←[0m
14:16:44 | ←[31mBad download - chunk: 0, dest_file: ac84ecd0fc1f3e17772d8f561a11add.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>←[0m
14:16:44 | ←[31mBad download - chunk: 0, dest_file: ac822b755268b2a6ce231cc0e1ad588.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>←[0m
14:16:44 | ←[31mBad download - chunk: 0, dest_file: ac83cdb55a9e6a79ecc879451dacb3.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>←[0m
14:16:44 | ←[31mBad download - chunk: 0, dest_file: ac8d7d1ce0f32d1e7ee4d838c9b1b94.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>←[0m
14:16:44 | ←[31mBad download - chunk: 0, dest_file: ac80a9f51a66169dda8eec89cda2a289.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>←[0m
14:16:44 | ←[31mBad download - chunk: 0, dest_file: ac8ddea4829c7eba37ac0c81a7ef634.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>←[0m
14:16:44 | ←[31mBad download - chunk: 0, dest_file: ac83184fd40c475501ef12160eefa1c.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>←[0m
14:16:44 | ←[31mBad download - chunk: 0, dest_file: ac85a84da55bfb3497f038822344596d.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>←[0m
14:16:44 | ←[31mBad download - chunk: 0, dest_file: ac86f15b386ea87d3d240fac81f166c.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>←[0m
14:16:44 | ←[31mBad download - chunk: 0, dest_file: ac8dd7f5795743061f480a56aec7c97.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>←[0m
14:16:44 | ←[31mBad download - chunk: 0, dest_file: ac841340786841ee7e665051fe58d9.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>←[0m
14:16:44 | ←[31mBad download - chunk: 0, dest_file: ac8176a2fb143c79c22488d104ece72.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>←[0m
14:16:44 | ←[31mBad download - chunk: 0, dest_file: ac89925318fa67f3e018da2547bbe2.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>←[0m
14:16:44 | ←[31mBad download - chunk: 0, dest_file: ac87838e7846735b27d54a7d5dbc4ee.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>←[0m
14:16:44 | ←[31mBad download - chunk: 0, dest_file: ac8628ffeed36884336ceab1586cb1.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>←[0m
14:16:44 | ←[31mBad download - chunk: 0, dest_file: ac81db90ac691dfdd275b2e6ec299ca4.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>←[0m
14:16:44 | ←[31mBad download - chunk: 0, dest_file: ac8f9ff369308ac4d3643d3114c6718b.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>←[0m
14:16:44 | ←[31mBad download - chunk: 0, dest_file: ac8ee3225ea20433642b347f8fa8d81.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>←[0m
14:16:44 | ←[31mBad download - chunk: 0, dest_file: ac83d37bf87d21f31bcbc3c3f7714f99.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>←[0m
14:16:44 | ←[31mBad download - chunk: 0, dest_file: ac85ef996efd9aed1f91293c1552e9e5.jpg, http status code: 404, error_msg: [Response not OK] Response: <Response [404]>←[0m
Downloading: 100%|███████████████████████████████████████████████████████████| 201858/201858 [01:10<00:00, 2851.83it/s]
14:16:44 | Of 55 items attempted downloading, 55 had errors.
Please confirm that you have obtained permission to work with the YFCC100m dataset, as outlined by the steps listed at https://multimediacommons.wordpress.com/yfcc100m-core-dataset/ [Y/y]:

Neha10252018 · 2021-07-29T13:18:04Z

@klshuster could you please help me with this?

klshuster · 2021-07-29T20:04:26Z

What happens if you enter y at the prompt? A few images have broken download links (we don't host the YFCC images), so this is not unexpected.
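Since some images cannot be downloaded, one way to keep working is to skip the examples whose image file is absent. The following is a minimal sketch, not ParlAI code: it assumes images are saved as `<hash>.jpg` (as the `dest_file` names in the log suggest), and the function name `split_by_image_presence` and the `image_dir` argument are hypothetical.

```python
from pathlib import Path


def split_by_image_presence(image_hashes, image_dir):
    """Partition image hashes into (present, missing).

    Assumes each image is stored as <hash>.jpg inside image_dir;
    both the layout and this helper are illustrative assumptions.
    """
    image_dir = Path(image_dir)
    present, missing = [], []
    for image_hash in image_hashes:
        if (image_dir / f"{image_hash}.jpg").is_file():
            present.append(image_hash)
        else:
            missing.append(image_hash)
    return present, missing
```

The `missing` list can then be used to drop the corresponding rows from the dataset before training or evaluation.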

Neha10252018 · 2021-07-30T10:23:51Z

The same thing happens, i.e. the download fails every time, I have to enter y again, and this repeats.

Neha10252018 · 2021-07-30T10:24:49Z

When I run the interactive session, I get the error below:

(base) C:\Users\nehaj\anaconda3\Parlai_Project\ParlAI-ce02a0eb9e4d8bf38377d0908ed7bd3b47d7ab2a\projects\personality_captions>python interactive.py -mf models:personality_captions/transresnet/model
C:\Users\nehaj\anaconda3\lib\site-packages\torchvision\transforms\transforms.py:310: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead.
warnings.warn("The use of the transforms.Scale transform is deprecated, " +
11:21:28 | ←[33mOverriding opt["n_positions"] to 1000 (previously: None)←[0m
11:21:28 | loading dictionary from C:\Users\nehaj\anaconda3\Lib\site-packages\data\models\personality_captions/transresnet/model.dict
11:21:28 | num words = 250006
Creating or loading model
Traceback (most recent call last):
File "interactive.py", line 287, in <module>
setup_interactive()
File "interactive.py", line 282, in setup_interactive
SHARED['agent'] = create_agent(opt, requireModelExists=True)
File "C:\Users\nehaj\anaconda3\lib\site-packages\parlai\core\agents.py", line 402, in create_agent
model = create_agent_from_opt_file(opt)
File "C:\Users\nehaj\anaconda3\lib\site-packages\parlai\core\agents.py", line 355, in create_agent_from_opt_file
return model_class(opt_from_file)
File "C:\Users\nehaj\anaconda3\lib\site-packages\projects\personality_captions\transresnet\transresnet.py", line 105, in __init__
self._setup_cands()
File "C:\Users\nehaj\anaconda3\lib\site-packages\projects\personality_captions\transresnet\transresnet.py", line 154, in _setup_cands
self.fixed_cands = [c.replace('\n', '') for c in f.readlines()]
File "C:\Users\nehaj\anaconda3\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 6742: character maps to <undefined>

klshuster · 2021-07-30T16:07:46Z

it seems the candidates file is corrupted

Neha10252018 · 2021-07-31T09:11:34Z

Sorry, then how can I proceed?

klshuster · 2021-08-02T16:36:07Z

What is the value of the --fixed-cands-path parameter? That is the file that is corrupted.

Neha10252018 · 2021-08-04T11:11:18Z

Sorry, but I am not able to find that.
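A possible alternative explanation, not confirmed in this thread: the traceback shows the file being decoded with `cp1252`, the default text encoding on many Windows setups. Byte `0x9d` is undefined in cp1252 but occurs in the UTF-8 encoding of characters such as the right curly quote (`E2 80 9D`), so an intact UTF-8 candidates file could raise this same error. The sketch below reads a candidates file with an explicit encoding; `read_candidates` is a hypothetical helper, and its `path` argument stands for whatever file `--fixed-cands-path` points to.

```python
def read_candidates(path, encoding="utf-8"):
    """Read one candidate per line using an explicit text encoding.

    Hypothetical helper mirroring the list comprehension in the
    traceback, but without relying on the platform default encoding.
    """
    with open(path, encoding=encoding) as f:
        return [line.replace("\n", "") for line in f.readlines()]


def decodes_as(path, encoding):
    """Return True if the whole file decodes cleanly with the given encoding."""
    try:
        read_candidates(path, encoding=encoding)
        return True
    except UnicodeDecodeError:
        return False
```

If `decodes_as(path, "utf-8")` returns True while `decodes_as(path, "cp1252")` returns False, the file is likely fine and only the default encoding is the problem.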

Neha10252018 · 2021-08-17T10:35:52Z

@klshuster one last question: even in generative mode, at test time we need to input an image and a personality trait, and the model should generate the caption, right?

klshuster · 2021-08-17T15:34:07Z

that is correct

github-actions · 2021-09-17T00:04:35Z

This issue has not had activity in 30 days. Please feel free to reopen if you have more issues. You may apply the "never-stale" tag to prevent this from happening.

github-actions bot added the stale label Sep 17, 2021

github-actions bot closed this as completed Sep 24, 2021
