
KeyError 'source' when finetuning #6

Closed
PhMeier opened this issue Apr 28, 2022 · 2 comments
PhMeier commented Apr 28, 2022

Hello,
while testing fine-tuning in a conda environment on the example data, I encountered the following exception:

Traceback (most recent call last):
  File "/home/students/meier/AMRBART/fine-tune/run_amrparsing.py", line 154, in <module>
    main(args)
  File "/home/students/meier/AMRBART/fine-tune/run_amrparsing.py", line 129, in main
    trainer.fit(model, datamodule=data_module)
  File "/home/students/meier/anaconda3/envs/my_AMRBART_env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 735, in fit
    self._call_and_handle_interrupt(
  File "/home/students/meier/anaconda3/envs/my_AMRBART_env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 682, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/students/meier/anaconda3/envs/my_AMRBART_env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 770, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/home/students/meier/anaconda3/envs/my_AMRBART_env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1193, in _run
    self._dispatch()
  File "/home/students/meier/anaconda3/envs/my_AMRBART_env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1272, in _dispatch
    self.training_type_plugin.start_training(self)
  File "/home/students/meier/anaconda3/envs/my_AMRBART_env/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 202, in start_training
    self._results = trainer.run_stage()
  File "/home/students/meier/anaconda3/envs/my_AMRBART_env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1282, in run_stage
    return self._run_train()
  File "/home/students/meier/anaconda3/envs/my_AMRBART_env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1304, in _run_train
    self._run_sanity_check(self.lightning_module)
  File "/home/students/meier/anaconda3/envs/my_AMRBART_env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1368, in _run_sanity_check
    self._evaluation_loop.run()
  File "/home/students/meier/anaconda3/envs/my_AMRBART_env/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 151, in run
    output = self.on_run_end()
  File "/home/students/meier/anaconda3/envs/my_AMRBART_env/lib/python3.8/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 130, in on_run_end
    self._evaluation_epoch_end(outputs)
  File "/home/students/meier/anaconda3/envs/my_AMRBART_env/lib/python3.8/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 235, in _evaluation_epoch_end
    model.validation_epoch_end(outputs)
  File "/home/students/meier/AMRBART/fine-tune/model_interface/model_amrparsing.py", line 320, in validation_epoch_end
    source = flatten_list(x["source"] for x in ori_outputs)
  File "/home/students/meier/AMRBART/fine-tune/common/utils.py", line 109, in flatten_list
    return [x for x in itertools.chain.from_iterable(summary_ids)]
  File "/home/students/meier/AMRBART/fine-tune/common/utils.py", line 109, in <listcomp>
    return [x for x in itertools.chain.from_iterable(summary_ids)]
  File "/home/students/meier/AMRBART/fine-tune/model_interface/model_amrparsing.py", line 320, in <genexpr>
    source = flatten_list(x["source"] for x in ori_outputs)
KeyError: 'source'

Printing "ori_outputs" shows this:
ori outputs [{'loss': tensor(0.8626, device='cuda:0'), 'gen_time': 8.689491331577301, 'gen_len': 1024.0, 'preds': [[53842, 36, 53069, 51012, 52944, 36, 53070, 171, 4839, 52945, 36, 53071, 14195, 4839, 4839, 53843, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
The key 'source' is missing.
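For anyone hitting this before pulling the fix: the failing line aggregates x["source"] across all validation-step outputs, so a single output dict missing that key raises KeyError. A minimal defensive sketch (hypothetical data, not the repository's actual fix) would skip outputs that lack the key:

```python
import itertools

def flatten_list(summary_ids):
    # Flatten a list of lists into a single list (mirrors common/utils.py).
    return [x for x in itertools.chain.from_iterable(summary_ids)]

# Hypothetical validation outputs: the second dict has no "source" key,
# which is what triggers the KeyError in validation_epoch_end.
ori_outputs = [
    {"loss": 0.86, "preds": [[53842, 36]], "source": [["hello"]]},
    {"loss": 0.91, "preds": [[53843, 37]]},  # no "source" key
]

# Guarded aggregation: only take "source" from outputs that carry it.
source = flatten_list(x["source"] for x in ori_outputs if "source" in x)
print(source)  # [['hello']]
```

This only papers over the symptom, though; the real fix is making every validation step emit the key, which is what the updated upstream code does.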

My shell script looks like this:

#!/bin/bash

ROOT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

GPUID=$2
MODEL=$1
eval_beam=5
modelcate=base
modelcate=large


lr=8e-6

datacate=/home/students/meier/AMRBART/examples/ #/home/students/meier/MA/data/ #AMR2.0
# datacate=AMR3.0


Tokenizer=facebook/bart-$modelcate  #../../../data/pretrained-model/bart-$modelcate
export OUTPUT_DIR_NAME=outputs/fine_tune_amrparse #${datacate}-AMRBart-${modelcate}-amrparsing-6taskPLM-5e-5-finetune-lr${lr}

export CURRENT_DIR=${ROOT_DIR}
export OUTPUT_DIR=${CURRENT_DIR}/${OUTPUT_DIR_NAME}
cache=~/.cache  #../../../data/.cache/

if [ ! -d $OUTPUT_DIR ];then
  mkdir -p $OUTPUT_DIR
else
  echo "${OUTPUT_DIR} already exists, choose a new one or delete the original one"
  exit 0
fi

export OMP_NUM_THREADS=10
export CUDA_VISIBLE_DEVICES=${GPUID}
python -u ${ROOT_DIR}/run_amrparsing.py \
    --data_dir=$datacate \
    --train_data_file=$datacate/train.jsonl \
    --eval_data_file=$datacate/val.jsonl \
    --test_data_file=$datacate/test.jsonl \
    --model_type ${MODEL} \
    --model_name_or_path=${MODEL} \
    --tokenizer_name_or_path=${Tokenizer} \
    --val_metric "smatch" \
    --learning_rate=${lr} \
    --max_epochs 20 \
    --max_steps -1 \
    --per_gpu_train_batch_size=4 \
    --per_gpu_eval_batch_size=4 \
    --unified_input \
    --accumulate_grad_batches 2 \
    --early_stopping_patience 10 \
    --gpus 1 \
    --output_dir=${OUTPUT_DIR} \
    --cache_dir ${cache} \
    --num_sanity_val_steps 4 \
    --src_block_size=512 \
    --tgt_block_size=1024 \
    --eval_max_length=1024 \
    --train_num_workers 8 \
    --eval_num_workers 4 \
    --process_num_workers 8 \
    --do_train --do_predict \
    --seed 42 \
    --fp16 \
    --eval_beam ${eval_beam} 2>&1 | tee $OUTPUT_DIR/run.log

I call the script in the following way:
srun ~/AMRBART/fine-tune/finetune_AMRbart_amrparsing_large.sh /workspace/students/meier/AMR_Bart_models/AMR-BART-LARGE 0

What can I do to solve the problem?
Thanks for reading!

goodbai-nlp (Owner) commented:
Hi, please clone the latest code and try again. If you still get errors, please post here.

PhMeier (Author) commented Apr 28, 2022

Thank you very much, this solved the issue!

PhMeier closed this as completed Apr 28, 2022