
How to train the captioning module on ground truth proposals #47

Closed
adeljalalyousif opened this issue Mar 10, 2023 · 3 comments

@adeljalalyousif

Hi Iashin, I need to train the captioning module on ground truth proposals. What should I do?

@v-iashin (Owner)

Hi adeljalalyousif

To train the captioning module on ground truth proposals, run the following:

# conda activate bmt
python main.py \
    --procedure train_cap \
    --B 32

@adeljalalyousif (Author) commented Mar 10, 2023

Thanks for your response, but I got this error: FileNotFoundError: [Errno 2] No such file or directory: './best_prop_model.pt'

{'B': 32,
'H': 4,
'N': 2,
'anchors_num_audio': 48,
'anchors_num_video': 128,
'audio_feature_name': 'vggish',
'audio_feature_timespan': 0.96,
'audio_features_path': './data/vggish_npy/',
'avail_mp4_path': './data/available_mp4.txt',
'betas': [0.9, 0.999],
'conv_layers_audio': [512, 512],
'conv_layers_video': [512, 512],
'd_aud': 128,
'd_ff_audio': None,
'd_ff_caps': None,
'd_ff_video': None,
'd_model': 1024,
'd_model_audio': None,
'd_model_caps': 300,
'd_model_video': None,
'd_vid': 1024,
'debug': False,
'device_ids': [0],
'dout_p': 0.1,
'early_stop_after': 30,
'end_token': '',
'epoch_num': 4,
'eps': 1e-08,
'feature_timespan_in_fps': 64,
'finetune_cap_encoder': False,
'finetune_prop_encoder': False,
'fps_at_extraction': 25,
'grad_clip': None,
'inf_B_coeff': 2,
'kernel_sizes_audio': [5, 13, 23, 35, 51, 69, 91, 121, 161, 211],
'kernel_sizes_video': [1, 5, 9, 13, 19, 25, 35, 45, 61, 79],
'layer_norm': False,
'log_dir': './log/',
'lr': 5e-05,
'lr_patience': None,
'lr_reduce_factor': None,
'max_len': 30,
'max_prop_per_vid': 100,
'min_freq_caps': 1,
'modality': 'audio_video',
'model': 'av_transformer',
'momentum': 0.0,
'nms_tiou_thresh': None,
'noobj_coeff': 100,
'obj_coeff': 1,
'one_by_one_starts_at': 1,
'optimizer': 'adam',
'pad_audio_feats_up_to': 800,
'pad_token': '',
'pad_video_feats_up_to': 300,
'pretrained_cap_model_path': './log/best_cap_model.pt',
'pretrained_prop_model_path': None,
'procedure': 'train_cap',
'prop_pred_path': './log/prop_results_val_1_e0_maxprop100.json',
'reference_paths': ['./data/val_1_no_missings.json', './data/val_2_no_missings.json'],
'scheduler': 'constant',
'smoothing': 0.7,
'start_token': '',
'tIoUs': [0.3, 0.5, 0.7, 0.9],
'to_log': True,
'train_json_path': './data/train.json',
'train_meta_path': './data/train.csv',
'unfreeze_word_emb': False,
'use_linear_embedder': False,
'val_1_meta_path': './data/val_1.csv',
'val_2_meta_path': './data/val_2.csv',
'val_prop_meta_path': None,
'video_feature_name': 'i3d',
'video_features_path': './data/i3d_25fps_stack64step64_2stream_npy/',
'weight_decay': 0,
'word_emb_caps': 'glove.840B.300d'}
Contructing caption_iterator for "train" phase
Contructing caption_iterator for "val_1" phase
Contructing caption_iterator for "val_2" phase
Using vanilla Generator
initialization: xavier
Glove emb of the same size as d_model_caps
Pretrained prop path:
./best_prop_model.pt
Traceback (most recent call last):
  File "main.py", line 200, in <module>
    main(cfg)
  File "main.py", line 11, in main
    train_cap(cfg)
  File "/media/adel/Data3/BMT_original/scripts/train_captioning_module.py", line 40, in train_cap
    model = BiModalTransformer(cfg, train_dataset)
  File "/media/adel/Data3/BMT_original/model/captioning_module.py", line 151, in __init__
    cap_model_cpt = torch.load(cfg.pretrained_prop_model_path, map_location='cpu')
  File "/home/adel/miniconda3/envs/tr_17/lib/python3.8/site-packages/torch/serialization.py", line 581, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/home/adel/miniconda3/envs/tr_17/lib/python3.8/site-packages/torch/serialization.py", line 230, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/home/adel/miniconda3/envs/tr_17/lib/python3.8/site-packages/torch/serialization.py", line 211, in __init__
    super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: './best_prop_model.pt'
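
For context, the crash boils down to the torch.load call shown in the traceback. Here is a minimal sketch of that call with an explicit existence check added just for illustration (the check and the variable names are mine, not part of the repo's code); the path is the one the script printed as "Pretrained prop path:":

import os
import torch

# Path the script printed right before crashing ("Pretrained prop path:").
prop_ckpt_path = './best_prop_model.pt'

# torch.load() opens the file directly, so a missing checkpoint surfaces as a
# FileNotFoundError deep inside torch.serialization; checking up front makes
# the cause obvious.
if not os.path.exists(prop_ckpt_path):
    raise FileNotFoundError(f"Proposal checkpoint not found at '{prop_ckpt_path}'")

# Same call as in captioning_module.py (per the traceback above).
prop_model_cpt = torch.load(prop_ckpt_path, map_location='cpu')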

###########################################################################

I need to train the captioning module on ground truth proposals, without using learned proposals.

@adeljalalyousif (Author) commented Mar 10, 2023

After downloading 'best_prop_model.pt', training works, but only on the CPU. How can I make training run on the GPU? I have an RTX 3060 (6 GB), so I think my GPU RAM may be insufficient.
So, how can I train the captioning module on ground truth proposals without using learned proposals?
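
For what it's worth, this is how I checked whether PyTorch can see the GPU at all; these are plain PyTorch calls, nothing specific to this repo:

import torch

# Sanity-check that the CUDA build of PyTorch detects the RTX 3060.
print('CUDA available:', torch.cuda.is_available())
if torch.cuda.is_available():
    print('Device count :', torch.cuda.device_count())
    print('Device name  :', torch.cuda.get_device_name(0))
    # Total memory in GiB, to compare against what training needs.
    total_gib = torch.cuda.get_device_properties(0).total_memory / 1024 ** 3
    print('Total memory : %.1f GiB' % total_gib)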
