
Unable to override the options with single hyphen options in hallucination projects #3737

Closed
acha21 opened this issue Jun 22, 2021 · 3 comments · Fixed by #3841
acha21 commented Jun 22, 2021

Bug description
When I run the command provided in projects/hallucination/README.md (reproduced below), I cannot train retrieval-based models.
When I use the option --m instead of -m, it works.
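
For context, here is a minimal argparse sketch (a hypothetical parser, not ParlAI's actual option setup) of the two spellings involved. With argparse's default allow_abbrev=True, --m parses as an unambiguous abbreviation of --model, and -m is a separately registered short alias, so both should normally land on the same destination:

import argparse

# Hypothetical stand-in for ParlAI's option setup: "-m" is a short alias
# registered alongside the long option "--model".
parser = argparse.ArgumentParser()
parser.add_argument('-m', '--model', dest='model')

print(parser.parse_args(['-m', 'rag']).model)       # rag (short alias)
print(parser.parse_args(['--m', 'rag']).model)      # rag (abbreviation of --model)
print(parser.parse_args(['--model', 'rag']).model)  # rag (full name)

Since plain argparse accepts all three spellings, the symptom here appears to be at the override-tracking layer rather than in parsing: note that the 'override' dict in the Opt dump below contains only the double-hyphen options, while -lr, -vmm, -veps, -vme, -vmt, and -vp are absent from it.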

Reproduction steps

For example, run the following:

parlai train_model -m rag -t wizard_of_wikipedia \
--rag-model-type token --rag-retriever-type dpr --dpr-model-file zoo:hallucination/multiset_dpr/hf_bert_base.cp \
--generation-model bart -o arch/bart_large \
--retriever-debug-index compressed \
--batchsize 2 --fp16 True --gradient-clip 0.1 --label-truncate 128 \
--log-every-n-secs 30 --lr-scheduler reduceonplateau --lr-scheduler-patience 1 \
--model-parallel True --optimizer adam --text-truncate 512 --truncate 512 \
-lr 1e-05 -vmm min -veps 0.25 -vme 1000 -vmt ppl -vp 5

Expected behavior
The options should be overridden by the arguments provided on the command line.

Logs

17:29:48 | building dictionary first...
17:29:48 | your model is being loaded with opts that do not exist in the model you are initializing the weights with: download_path: None,verbose: False,datapath: /home/acha21/codes/ParlAI/data,evaltask: None,eval_batchsize: None,eval_dynamic_batching: None,num_workers: 0,display_examples: False,num_epochs: -1,max_train_time: -1,max_train_steps: -1,log_every_n_steps: 50,validation_every_n_secs: -1,validation_every_n_steps: -1,save_every_n_secs: -1,save_after_valid: False,validation_every_n_epochs: 0.25,validation_max_exs: 1000,short_final_eval: False,validation_patience: 5,validation_metric: ppl,validation_metric_mode: min,validation_cutoff: 1.0,load_from_checkpoint: True,validation_share_agent: False,metrics: default,aggregate_micro: False,tensorboard_log: False,tensorboard_logdir: None,wandb_log: False,wandb_name: None,wandb_project: None,wandb_entity: None,dict_maxexs: -1,dict_include_valid: False,dict_include_test: False,log_every_n_secs: 30.0,mutators: None,label_type: response,include_knowledge: True,include_checked_sentence: True,include_knowledge_separator: False,chosen_topic_delimiter: 
,num_topics: 5,add_missing_turns: none,candidates: inline,eval_candidates: inline,interactive_candidates: fixed,repeat_blocking_heuristic: True,fixed_candidates_path: None,fixed_candidate_vecs: reuse,encode_candidate_vecs: True,encode_candidate_vecs_batchsize: 256,train_predict: False,cap_num_predictions: 100,ignore_bad_candidates: False,rank_top_k: -1,return_cand_scores: False,use_memories: False,wrap_memory_encoder: False,memory_attention: sqrt,normalize_sent_emb: False,share_encoders: True,learn_embeddings: True,data_parallel: False,reduction_type: mean,polyencoder_type: codes,poly_n_codes: 64,poly_attention_type: basic,poly_attention_num_heads: 4,codes_attention_type: basic,codes_attention_num_heads: 4,generation_model: bart,query_model: bert,rag_model_type: token,thorough: False,n_extra_positions: 0,gold_knowledge_passage_key: checked_sentence,gold_knowledge_title_key: title,rag_retriever_query: full_history,rag_retriever_type: dpr,retriever_debug_index: compressed,n_docs: 5,min_doc_token_length: 64,max_doc_token_length: 256,rag_query_truncate: 512,print_docs: False,path_to_index: zoo:hallucination/wiki_index_compressed/compressed_pq,path_to_dense_embeddings: None,dpr_model_file: zoo:hallucination/multiset_dpr/hf_bert_base.cp,path_to_dpr_passages: zoo:hallucination/wiki_passages/psgs_w100.tsv,retriever_embedding_size: 768,tfidf_max_doc_paragraphs: -1,tfidf_model_path: zoo:wikipedia_full/tfidf_retriever/model,dpr_num_docs: 25,poly_score_initial_lambda: 0.5,polyencoder_init_model: wikito,poly_faiss_model_file: None,regret: False,regret_intermediate_maxlen: 32,regret_model_file: None,indexer_type: compressed,indexer_buffer_size: 65536,compressed_indexer_factory: IVF4096_HNSW128,PQ128,compressed_indexer_gpu_train: False,compressed_indexer_nprobe: 64,hnsw_indexer_store_n: 128,hnsw_ef_search: 128,hnsw_ef_construction: 200,rag_turn_n_turns: 2,rag_turn_marginalize: doc_then_turn,rag_turn_discount_factor: 1.0,interactive_mode: False,t5_model_arch: t5-base,t5_model_parallel: False,t5_dropout: 0.0,t5_generation_config: None
17:29:48 | your model is being loaded with opts that differ from the model you are initializing the weights with. Add the following args to your run command to change this: 
--init-opt None --task None --batchsize 1 --attention-dropout 0.1 --model-parallel False --optimizer sgd --learningrate 1 --truncate -1 --text-truncate None --label-truncate None --lr-scheduler-patience 3 --dict-loaded False
17:29:48 | your model is being loaded with opts that do not exist in the model you are initializing the weights with: download_path: None,verbose: False,datapath: /home/acha21/codes/ParlAI/data,evaltask: None,eval_batchsize: None,eval_dynamic_batching: None,num_workers: 0,display_examples: False,num_epochs: -1,max_train_time: -1,max_train_steps: -1,log_every_n_steps: 50,validation_every_n_secs: -1,validation_every_n_steps: -1,save_every_n_secs: -1,save_after_valid: False,validation_every_n_epochs: 0.25,validation_max_exs: 1000,short_final_eval: False,validation_patience: 5,validation_metric: ppl,validation_metric_mode: min,validation_cutoff: 1.0,load_from_checkpoint: True,validation_share_agent: False,metrics: default,aggregate_micro: False,tensorboard_log: False,tensorboard_logdir: None,wandb_log: False,wandb_name: None,wandb_project: None,wandb_entity: None,dict_maxexs: -1,dict_include_valid: False,dict_include_test: False,log_every_n_secs: 30.0,mutators: None,label_type: response,include_knowledge: True,include_checked_sentence: True,include_knowledge_separator: False,chosen_topic_delimiter: 
,num_topics: 5,add_missing_turns: none,candidates: inline,eval_candidates: inline,interactive_candidates: fixed,repeat_blocking_heuristic: True,fixed_candidates_path: None,fixed_candidate_vecs: reuse,encode_candidate_vecs: True,encode_candidate_vecs_batchsize: 256,train_predict: False,cap_num_predictions: 100,ignore_bad_candidates: False,rank_top_k: -1,return_cand_scores: False,use_memories: False,wrap_memory_encoder: False,memory_attention: sqrt,normalize_sent_emb: False,share_encoders: True,learn_embeddings: True,data_parallel: False,reduction_type: mean,polyencoder_type: codes,poly_n_codes: 64,poly_attention_type: basic,poly_attention_num_heads: 4,codes_attention_type: basic,codes_attention_num_heads: 4,generation_model: bart,query_model: bert,rag_model_type: token,thorough: False,n_extra_positions: 0,gold_knowledge_passage_key: checked_sentence,gold_knowledge_title_key: title,rag_retriever_query: full_history,rag_retriever_type: dpr,retriever_debug_index: compressed,n_docs: 5,min_doc_token_length: 64,max_doc_token_length: 256,rag_query_truncate: 512,print_docs: False,path_to_index: zoo:hallucination/wiki_index_compressed/compressed_pq,path_to_dense_embeddings: None,dpr_model_file: zoo:hallucination/multiset_dpr/hf_bert_base.cp,path_to_dpr_passages: zoo:hallucination/wiki_passages/psgs_w100.tsv,retriever_embedding_size: 768,tfidf_max_doc_paragraphs: -1,tfidf_model_path: zoo:wikipedia_full/tfidf_retriever/model,dpr_num_docs: 25,poly_score_initial_lambda: 0.5,polyencoder_init_model: wikito,poly_faiss_model_file: None,regret: False,regret_intermediate_maxlen: 32,regret_model_file: None,indexer_type: compressed,indexer_buffer_size: 65536,compressed_indexer_factory: IVF4096_HNSW128,PQ128,compressed_indexer_gpu_train: False,compressed_indexer_nprobe: 64,hnsw_indexer_store_n: 128,hnsw_ef_search: 128,hnsw_ef_construction: 200,rag_turn_n_turns: 2,rag_turn_marginalize: doc_then_turn,rag_turn_discount_factor: 1.0,interactive_mode: False,t5_model_arch: t5-base,t5_model_parallel: False,t5_dropout: 0.0,t5_generation_config: None
17:29:48 | your model is being loaded with opts that differ from the model you are initializing the weights with. Add the following args to your run command to change this: 
--init-opt None --task None --batchsize 1 --attention-dropout 0.1 --model-parallel False --optimizer sgd --learningrate 1 --truncate -1 --text-truncate None --label-truncate None --lr-scheduler-patience 3 --dict-loaded False
17:29:49 | Using CUDA
17:29:49 | loading dictionary from /home/acha21/codes/ParlAI/data/models/bart/bart_large/model.dict
17:29:49 | num words = 50264
17:30:08 | Total parameters: 406,286,336 (406,286,336 trainable)
17:30:08 | Loading existing model params from /home/acha21/codes/ParlAI/data/models/bart/bart_large/model
17:30:08 | Detected a fine-tune run. Resetting the optimizer.
17:30:08 | Optimizer was reset. Also resetting LR scheduler.
17:30:08 | Opt:
17:30:08 |     activation: gelu
17:30:08 |     adafactor_eps: '(1e-30, 0.001)'
17:30:08 |     adam_eps: 1e-08
17:30:08 |     add_missing_turns: none
17:30:08 |     add_p1_after_newln: False
17:30:08 |     aggregate_micro: False
17:30:08 |     allow_missing_init_opts: False
17:30:08 |     attention_dropout: 0.0
17:30:08 |     batchsize: 2
17:30:08 |     beam_block_full_context: True
17:30:08 |     beam_block_list_filename: None
17:30:08 |     beam_block_ngram: -1
17:30:08 |     beam_context_block_ngram: -1
17:30:08 |     beam_delay: 30
17:30:08 |     beam_length_penalty: 0.65
17:30:08 |     beam_min_length: 1
17:30:08 |     beam_size: 1
17:30:08 |     betas: '(0.9, 0.999)'
17:30:08 |     bpe_add_prefix_space: None
17:30:08 |     bpe_debug: False
17:30:08 |     bpe_dropout: None
17:30:08 |     bpe_merge: None
17:30:08 |     bpe_vocab: None
17:30:08 |     candidates: inline
17:30:08 |     cap_num_predictions: 100
17:30:08 |     chosen_topic_delimiter: '\n'
17:30:08 |     codes_attention_num_heads: 4
17:30:08 |     codes_attention_type: basic
17:30:08 |     compressed_indexer_factory: IVF4096_HNSW128,PQ128
17:30:08 |     compressed_indexer_gpu_train: False
17:30:08 |     compressed_indexer_nprobe: 64
17:30:08 |     compute_tokenized_bleu: False
17:30:08 |     data_parallel: False
17:30:08 |     datapath: /home/acha21/codes/ParlAI/data
17:30:08 |     datatype: train
17:30:08 |     delimiter: '\n'
17:30:08 |     dict_class: parlai.core.dict:DictionaryAgent
17:30:08 |     dict_endtoken: __end__
17:30:08 |     dict_file: /home/acha21/codes/ParlAI/data/models/bart/bart_large/model.dict
17:30:08 |     dict_include_test: False
17:30:08 |     dict_include_valid: False
17:30:08 |     dict_initpath: None
17:30:08 |     dict_language: english
17:30:08 |     dict_loaded: True
17:30:08 |     dict_lower: False
17:30:08 |     dict_max_ngram_size: -1
17:30:08 |     dict_maxexs: -1
17:30:08 |     dict_maxtokens: -1
17:30:08 |     dict_minfreq: 0
17:30:08 |     dict_nulltoken: __null__
17:30:08 |     dict_starttoken: __start__
17:30:08 |     dict_textfields: text,labels
17:30:08 |     dict_tokenizer: gpt2
17:30:08 |     dict_unktoken: __unk__
17:30:08 |     display_examples: False
17:30:08 |     download_path: None
17:30:08 |     dpr_model_file: zoo:hallucination/multiset_dpr/hf_bert_base.cp
17:30:08 |     dpr_num_docs: 25
17:30:08 |     dropout: 0.1
17:30:08 |     dynamic_batching: None
17:30:08 |     embedding_projection: random
17:30:08 |     embedding_size: 1024
17:30:08 |     embedding_type: random
17:30:08 |     embeddings_scale: False
17:30:08 |     encode_candidate_vecs: True
17:30:08 |     encode_candidate_vecs_batchsize: 256
17:30:08 |     eval_batchsize: None
17:30:08 |     eval_candidates: inline
17:30:08 |     eval_dynamic_batching: None
17:30:08 |     evaltask: None
17:30:08 |     ffn_size: 4096
17:30:08 |     fixed_candidate_vecs: reuse
17:30:08 |     fixed_candidates_path: None
17:30:08 |     force_fp16_tokens: True
17:30:08 |     fp16: True
17:30:08 |     fp16_impl: safe
17:30:08 |     generation_model: bart
17:30:08 |     gold_knowledge_passage_key: checked_sentence
17:30:08 |     gold_knowledge_title_key: title
17:30:08 |     gpu: -1
17:30:08 |     gradient_clip: 0.1
17:30:08 |     hide_labels: False
17:30:08 |     history_add_global_end_token: None
17:30:08 |     history_reversed: False
17:30:08 |     history_size: -1
17:30:08 |     hnsw_ef_construction: 200
17:30:08 |     hnsw_ef_search: 128
17:30:08 |     hnsw_indexer_store_n: 128
17:30:08 |     ignore_bad_candidates: False
17:30:08 |     image_cropsize: 224
17:30:08 |     image_mode: raw
17:30:08 |     image_size: 256
17:30:08 |     include_checked_sentence: True
17:30:08 |     include_knowledge: True
17:30:08 |     include_knowledge_separator: False
17:30:08 |     indexer_buffer_size: 65536
17:30:08 |     indexer_type: compressed
17:30:08 |     inference: greedy
17:30:08 |     init_model: /home/acha21/codes/ParlAI/data/models/bart/bart_large/model
17:30:08 |     init_opt: arch/bart_large
17:30:08 |     interactive_candidates: fixed
17:30:08 |     interactive_mode: False
17:30:08 |     invsqrt_lr_decay_gamma: -1
17:30:08 |     is_debug: False
17:30:08 |     label_truncate: 128
17:30:08 |     label_type: response
17:30:08 |     learn_embeddings: True
17:30:08 |     learn_positional_embeddings: True
17:30:08 |     learningrate: 1e-05
17:30:08 |     load_from_checkpoint: True
17:30:08 |     log_every_n_secs: 30.0
17:30:08 |     log_every_n_steps: 50
17:30:08 |     loglevel: info
17:30:08 |     lr_scheduler: reduceonplateau
17:30:08 |     lr_scheduler_decay: 0.5
17:30:08 |     lr_scheduler_patience: 1
17:30:08 |     max_doc_token_length: 256
17:30:08 |     max_train_steps: -1
17:30:08 |     max_train_time: -1
17:30:08 |     memory_attention: sqrt
17:30:08 |     metrics: default
17:30:08 |     min_doc_token_length: 64
17:30:08 |     model: bart
17:30:08 |     model_file: None
17:30:08 |     model_parallel: True
17:30:08 |     momentum: 0
17:30:08 |     multitask_weights: [1]
17:30:08 |     mutators: None
17:30:08 |     n_decoder_layers: 12
17:30:08 |     n_docs: 5
17:30:08 |     n_encoder_layers: 12
17:30:08 |     n_extra_positions: 0
17:30:08 |     n_heads: 16
17:30:08 |     n_layers: 2
17:30:08 |     n_positions: 1024
17:30:08 |     n_segments: 0
17:30:08 |     nesterov: True
17:30:08 |     no_cuda: False
17:30:08 |     normalize_sent_emb: False
17:30:08 |     num_epochs: -1
17:30:08 |     num_topics: 5
17:30:08 |     num_workers: 0
17:30:08 |     nus: (0.7,)
17:30:08 |     optimizer: adam
17:30:08 |     output_scaling: 1.0
17:30:08 |     override: "{'rag_model_type': 'token', 'rag_retriever_type': 'dpr', 'dpr_model_file': 'zoo:hallucination/multiset_dpr/hf_bert_base.cp', 'generation_model': 'bart', 'retriever_debug_index': 'compressed', 'batchsize': 2, 'fp16': True, 'gradient_clip': 0.1, 'label_truncate': 128, 'log_every_n_secs': 30.0, 'lr_scheduler': 'reduceonplateau', 'lr_scheduler_patience': 1, 'model_parallel': True, 'optimizer': 'adam', 'text_truncate': 512, 'truncate': 512, 'activation': 'gelu', 'attention_dropout': 0.0, 'dict_file': '/home/acha21/codes/ParlAI/data/models/bart/bart_large/model.dict', 'dict_tokenizer': 'gpt2', 'dropout': 0.1, 'embedding_size': 1024, 'embeddings_scale': False, 'ffn_size': 4096, 'force_fp16_tokens': True, 'init_model': 'zoo:bart/bart_large/model', 'learn_positional_embeddings': True, 'model': 'bart', 'n_decoder_layers': 12, 'n_encoder_layers': 12, 'n_heads': 16, 'n_positions': 1024, 'variant': 'bart'}"
17:30:08 |     parlai_home: /home/acha21/codes/ParlAI
17:30:08 |     path_to_dense_embeddings: None
17:30:08 |     path_to_dpr_passages: zoo:hallucination/wiki_passages/psgs_w100.tsv
17:30:08 |     path_to_index: zoo:hallucination/wiki_index_compressed/compressed_pq
17:30:08 |     person_tokens: False
17:30:08 |     poly_attention_num_heads: 4
17:30:08 |     poly_attention_type: basic
17:30:08 |     poly_faiss_model_file: None
17:30:08 |     poly_n_codes: 64
17:30:08 |     poly_score_initial_lambda: 0.5
17:30:08 |     polyencoder_init_model: wikito
17:30:08 |     polyencoder_type: codes
17:30:08 |     print_docs: False
17:30:08 |     query_model: bert
17:30:08 |     rag_model_type: token
17:30:08 |     rag_query_truncate: 512
17:30:08 |     rag_retriever_query: full_history
17:30:08 |     rag_retriever_type: dpr
17:30:08 |     rag_turn_discount_factor: 1.0
17:30:08 |     rag_turn_marginalize: doc_then_turn
17:30:08 |     rag_turn_n_turns: 2
17:30:08 |     rank_candidates: False
17:30:08 |     rank_top_k: -1
17:30:08 |     reduction_type: mean
17:30:08 |     regret: False
17:30:08 |     regret_intermediate_maxlen: 32
17:30:08 |     regret_model_file: None
17:30:08 |     relu_dropout: 0.0
17:30:08 |     repeat_blocking_heuristic: True
17:30:08 |     retriever_debug_index: compressed
17:30:08 |     retriever_embedding_size: 768
17:30:08 |     return_cand_scores: False
17:30:08 |     save_after_valid: False
17:30:08 |     save_every_n_secs: -1
17:30:08 |     share_encoders: True
17:30:08 |     share_word_embeddings: True
17:30:08 |     short_final_eval: False
17:30:08 |     skip_generation: False
17:30:08 |     special_tok_lst: None
17:30:08 |     split_lines: False
17:30:08 |     starttime: Jun22_17-29
17:30:08 |     t5_dropout: 0.0
17:30:08 |     t5_generation_config: None
17:30:08 |     t5_model_arch: t5-base
17:30:08 |     t5_model_parallel: False
17:30:08 |     task: wizard_of_wikipedia
17:30:08 |     temperature: 1.0
17:30:08 |     tensorboard_log: False
17:30:08 |     tensorboard_logdir: None
17:30:08 |     text_truncate: 512
17:30:08 |     tfidf_max_doc_paragraphs: -1
17:30:08 |     tfidf_model_path: zoo:wikipedia_full/tfidf_retriever/model
17:30:08 |     thorough: False
17:30:08 |     topk: 10
17:30:08 |     topp: 0.9
17:30:08 |     train_predict: False
17:30:08 |     truncate: 512
17:30:08 |     update_freq: 1
17:30:08 |     use_memories: False
17:30:08 |     use_reply: label
17:30:08 |     validation_cutoff: 1.0
17:30:08 |     validation_every_n_epochs: 0.25
17:30:08 |     validation_every_n_secs: -1
17:30:08 |     validation_every_n_steps: -1
17:30:08 |     validation_max_exs: 1000
17:30:08 |     validation_metric: ppl
17:30:08 |     validation_metric_mode: min
17:30:08 |     validation_patience: 5
17:30:08 |     validation_share_agent: False
17:30:08 |     variant: bart
17:30:08 |     verbose: False
17:30:08 |     wandb_entity: None
17:30:08 |     wandb_log: False
17:30:08 |     wandb_name: None
17:30:08 |     wandb_project: None
17:30:08 |     warmup_rate: 0.0001
17:30:08 |     warmup_updates: -1
17:30:08 |     weight_decay: None
17:30:08 |     wrap_memory_encoder: False


@stephenroller (Contributor) commented:

Can you tell me what version of Python you're on?

acha21 commented Jun 22, 2021

I use Python 3.8.10.

@klshuster self-assigned this Jul 22, 2021
@klshuster (Contributor) commented:

I'll change every command in the README to use its double-hyphenated options so that the commands comply with newer Python versions.

In the meantime, you should try using --model instead of -m (rather than --m).
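
For reference, here is a sketch of the same command with every single-hyphen option spelled out in full. The long names are inferred from the Opt dump above (e.g. -vmm → --validation-metric-mode), so verify them against parlai train_model --help before relying on this:

parlai train_model --model rag --task wizard_of_wikipedia \
--rag-model-type token --rag-retriever-type dpr --dpr-model-file zoo:hallucination/multiset_dpr/hf_bert_base.cp \
--generation-model bart --init-opt arch/bart_large \
--retriever-debug-index compressed \
--batchsize 2 --fp16 True --gradient-clip 0.1 --label-truncate 128 \
--log-every-n-secs 30 --lr-scheduler reduceonplateau --lr-scheduler-patience 1 \
--model-parallel True --optimizer adam --text-truncate 512 --truncate 512 \
--learningrate 1e-05 --validation-metric-mode min --validation-every-n-epochs 0.25 \
--validation-max-exs 1000 --validation-metric ppl --validation-patience 5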
