Cannot run inference on PubMedQA-Large #23

Open
VisionaryMind opened this issue Jan 31, 2023 · 13 comments

@VisionaryMind

VisionaryMind commented Jan 31, 2023

Using your pre-trained model, the infer_large.sh script is failing as follows:

KeyError: "'_name'"
sed: can't read ../../checkpoints/QA-PubMedQA-BioGPT-Large/generate_checkpoint_avg.pt: No such file or directory
infer_large.sh: line 31: ../../checkpoints/QA-PubMedQA-BioGPT-Large/generate_checkpoint_avg.pt: No such file or directory
Traceback (most recent call last):
  File "/mnt/d/ml/biogpt/examples/QA-PubMedQA/postprocess.py", line 37, in <module>
    with open(out_file, "r", encoding="utf8") as fr:
FileNotFoundError: [Errno 2] No such file or directory: '../../checkpoints/QA-PubMedQA-BioGPT-Large/generate_checkpoint_avg.pt.detok'
Traceback (most recent call last):
  File "/mnt/d/ml/biogpt/examples/QA-PubMedQA/hard_match_evaluation.py", line 37, in <module>
    main()
  File "/mnt/d/ml/biogpt/examples/QA-PubMedQA/hard_match_evaluation.py", line 19, in main
    with open(pred_file) as reader:
FileNotFoundError: [Errno 2] No such file or directory: '../../checkpoints/QA-PubMedQA-BioGPT-Large/generate_checkpoint_avg.pt.detok.extracted.txt'

Please let me know if you have any suggestions to get it working. There seems to be a problem generating the output file.

@AdirthaBorgohain

You will probably need to run the preprocess_large.sh script before running infer_large.sh.

@bgriffen

bgriffen commented Feb 3, 2023

I also have the same error, even after running preprocess_large.sh. That preprocessing script doesn't seem to create the first missing file, generate_checkpoint_avg.pt.

@AdirthaBorgohain

AdirthaBorgohain commented Feb 3, 2023

Did you download and extract the trained checkpoint tgz file in the required directory?

If not, you need to do these steps:

mkdir checkpoints  
cd checkpoints  
wget https://msramllasc.blob.core.windows.net/modelrelease/BioGPT/checkpoints/QA-PubMedQA-BioGPT-Large.tgz  
tar -zxvf QA-PubMedQA-BioGPT-Large.tgz

@VisionaryMind
Author

I both downloaded and extracted the QA-PubMedQA-BioGPT-Large.tgz into the checkpoints directory and ran preprocess_large.sh. This error still occurs.

@AdirthaBorgohain

Did you put the checkpoints directory inside the BioGPT directory? The paths it uses are relative, and all the necessary directories have to be inside the BioGPT folder. From your error, it seems that it is not able to find the generate_checkpoint_avg.pt checkpoint file under the correct path.
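To illustrate the layout that relative path assumes, here is a minimal sketch that recreates the expected directory structure and checks that the path resolves (directory and file names are taken from the scripts quoted in this thread; the touched file is an empty placeholder, not a real checkpoint):

```shell
# Recreate the layout infer_large.sh assumes: the checkpoints directory
# must sit next to examples/ inside the BioGPT repository root.
mkdir -p BioGPT/checkpoints/QA-PubMedQA-BioGPT-Large
mkdir -p BioGPT/examples/QA-PubMedQA
touch BioGPT/checkpoints/QA-PubMedQA-BioGPT-Large/checkpoint_avg.pt

# The script runs from examples/QA-PubMedQA, so ../../checkpoints/...
# only resolves from there.
cd BioGPT/examples/QA-PubMedQA
ls ../../checkpoints/QA-PubMedQA-BioGPT-Large/checkpoint_avg.pt
```

If that `ls` fails in your setup, the checkpoint is not where the script's relative path expects it.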

@VisionaryMind
Author

VisionaryMind commented Feb 3, 2023

> Did you put the checkpoints directory inside the BioGPT directory? The paths it uses are relative, and all the necessary directories have to be inside the BioGPT folder. From your error, it seems that it is not able to find the generate_checkpoint_avg.pt checkpoint file under the correct path.

I am not sure I understand your response. Isn't generate_checkpoint_avg.pt created by the infer_large.sh script? These variables define it:

MODEL_DIR=../../checkpoints/QA-PubMedQA-BioGPT-Large
MODEL=checkpoint_avg.pt
OUTPUT_FILE=generate_${MODEL}
OUTPUT_FILE=${MODEL_DIR}/${OUTPUT_FILE}

In my case, MODEL_DIR is present under checkpoints and it only contains one file (checkpoint_avg.pt). Further, in the inference section of the script, inference.py is called to create OUTPUT_FILE if it does not exist:

# inference
if [ ! -f "${OUTPUT_FILE}" ]; then
    echo "Begin inferencing ${INPUT_FILE} using ${MODEL_DIR}/${MODEL}"
    python ../../inference.py --data_dir=${DATA_DIR} --model_dir=${MODEL_DIR} --model_file=${MODEL} --src_file=${INPUT_FILE} --output_file=${OUTPUT_FILE}
fi

In fact, inference will not run if the output file is already there. inference.py creates the file at the bottom of main:

def main(args):
    src_inputs = []
    with open(args.src_file) as reader:
        for line in reader:
            src_inputs.append(line.strip())
    
    m = TransformerLanguageModelPrompt.from_pretrained(
        args.model_dir, 
        args.model_file, 
        args.data_dir,
        max_len_b=args.decoding_length,
        max_tokens=12000,)

    print(m.cfg)

    if m.cfg.common.fp16:
        print('Converting to float 16')
        m.half()
    m.cuda()

    outputs = m.sample(src_inputs, beam=args.beam)

    with open(f"{args.output_file}", "w", encoding='utf8') as fw:
        for i in range(len(outputs)):
            fw.write(outputs[i] + '\n')

The code appears to be failing inside TransformerLanguageModelPrompt.from_pretrained, so it never creates the output file.

@VisionaryMind
Author

VisionaryMind commented Feb 3, 2023

The issue is not that the output file cannot be found: that error happens because the script runs the debpe step before the file has been created. Here is the beginning of the error message (which I omitted above):

Begin inferencing ../../data/PubMedQA/raw/biogpt-large-ansis_test.tok.bpe.x using ../../checkpoints/QA-PubMedQA-BioGPT-Large/checkpoint_avg.pt
2023-02-02 23:41:54 | INFO | fairseq.tasks.text_to_speech | Please install tensorboardX: pip install tensorboardX
2023-02-02 23:41:54 | INFO | fairseq.file_utils | loading archive file ../../checkpoints/QA-PubMedQA-BioGPT-Large
2023-02-02 23:42:26 | INFO | src.language_modeling_prompt | dictionary: 57717 types
Traceback (most recent call last):
  File "/mnt/d/ML/biogpt/examples/QA-PubMedQA/../../inference.py", line 47, in <module>
    main(args)
  File "/mnt/d/ML/biogpt/examples/QA-PubMedQA/../../inference.py", line 25, in main
    m = TransformerLanguageModelPrompt.from_pretrained(
  File "/home/biogptuser/anaconda3/envs/biogpt/lib/python3.10/site-packages/fairseq/models/fairseq_model.py", line 267, in from_pretrained
    x = hub_utils.from_pretrained(
  File "/home/biogptuser/anaconda3/envs/biogpt/lib/python3.10/site-packages/fairseq/hub_utils.py", line 73, in from_pretrained
    models, args, task = checkpoint_utils.load_model_ensemble_and_task(
  File "/home/biogptuser/anaconda3/envs/biogpt/lib/python3.10/site-packages/fairseq/checkpoint_utils.py", line 469, in load_model_ensemble_and_task
    model = task.build_model(cfg.model, from_checkpoint=True)
  File "/home/biogptuser/anaconda3/envs/biogpt/lib/python3.10/site-packages/fairseq/tasks/language_modeling.py", line 191, in build_model
    model = super().build_model(args, from_checkpoint)
  File "/home/biogptuser/anaconda3/envs/biogpt/lib/python3.10/site-packages/fairseq/tasks/fairseq_task.py", line 671, in build_model
    model = models.build_model(args, self, from_checkpoint)
  File "/home/biogptuser/anaconda3/envs/biogpt/lib/python3.10/site-packages/fairseq/models/__init__.py", line 102, in build_model
    "Available models: {}".format(MODEL_DATACLASS_REGISTRY.keys())
KeyError: "'_name'"

It looks like FairSeq cannot find the "_name" key in MODEL_DATACLASS_REGISTRY. My FairSeq version is 0.12.0, per your recommendation. The problem appears to be inside FairSeq, though I don't yet see where it is coming from. It fails at this assertion:

    assert model is not None, (
        f"Could not infer model type from {cfg}. "
        "Available models: {}".format(MODEL_DATACLASS_REGISTRY.keys())
        + f" Requested model type: {model_type}"
    )
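The quoting in that KeyError is telling: str() of a KeyError shows the repr of the missing key, so `KeyError: "'_name'"` means the key being looked up was literally `'_name'` with the quotes baked in, which fits a rename bug in the saved checkpoint config rather than a missing fairseq feature. A minimal stand-alone illustration (the registry contents below are made up; this only reproduces the shape of the error, not fairseq's actual lookup code):

```python
# Stand-in for fairseq's MODEL_DATACLASS_REGISTRY, which maps model
# names to config dataclasses (contents here are hypothetical).
MODEL_DATACLASS_REGISTRY = {"transformer_lm": "TransformerLanguageModelConfig"}

# A rename bug can leave the lookup key with quote characters embedded:
buggy_key = "'_name'"

try:
    MODEL_DATACLASS_REGISTRY[buggy_key]
except KeyError as err:
    # str() of a KeyError is the repr of the missing key, so the
    # embedded quotes survive, matching the traceback above.
    print(f"KeyError: {err}")  # prints: KeyError: "'_name'"
```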

@renqianluo
Collaborator

Hi @VisionaryMind, this is due to a rename bug. We have fixed it now. Please pull the latest code and re-download the QA-PubMedQA-BioGPT-Large.tgz checkpoint.

@rpolicastro

rpolicastro commented Feb 3, 2023

I pulled the latest version from GitHub and re-downloaded the checkpoint file. I ended up getting the same error as before, but the temporary fix in #17 (comment) still resolved the issue.

import torch
from src.transformer_lm_prompt import TransformerLanguageModelPrompt

m = TransformerLanguageModelPrompt.from_pretrained(
        "checkpoints/QA-PubMedQA-BioGPT-Large",
        "checkpoint_avg.pt",
        "data/PubMedQA/biogpt-large-ansis-bin",
        tokenizer='moses',
        bpe='fastbpe',
        bpe_codes="data/biogpt_large_bpecodes",
        min_len=100,
        max_len_b=1024)

@VisionaryMind
Author

VisionaryMind commented Feb 3, 2023

@renqianluo I pulled down the latest repository, re-downloaded the QA-PubMedQA-BioGPT-Large.tgz checkpoint, and implemented the fix listed above by @rpolicastro, and I still encounter the exact same error message. Neither solution works for me.

@bgriffen

bgriffen commented Feb 7, 2023

Ditto, did the exact steps (including running preprocess_large.sh and infer_large.sh + re-download checkpoint) and still get the error:

  File "/home/bgriffen/Desktop/biogpt/BioGPT/examples/QA-PubMedQA/hard_match_evaluation.py", line 19, in main
    with open(pred_file) as reader:
FileNotFoundError: [Errno 2] No such file or directory: '../../checkpoints/QA-PubMedQA-BioGPT-Large/generate_checkpoint_avg.pt.detok.extracted.txt'

@shashank140195

> Ditto, did the exact steps (including running preprocess_large.sh and infer_large.sh + re-download checkpoint) and still get the error:
>
>       File "/home/bgriffen/Desktop/biogpt/BioGPT/examples/QA-PubMedQA/hard_match_evaluation.py", line 19, in main
>         with open(pred_file) as reader:
>     FileNotFoundError: [Errno 2] No such file or directory: '../../checkpoints/QA-PubMedQA-BioGPT-Large/generate_checkpoint_avg.pt.detok.extracted.txt'

Probably because your script didn't generate an averaged checkpoint. Use the best checkpoint instead.
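One way to try that is to repoint the MODEL variable in infer_large.sh at the best checkpoint rather than the averaged one. This is a sketch: the variable names come from the script as quoted earlier in this thread, and whether checkpoint_best.pt actually exists depends on how your fine-tuning run saved checkpoints.

```shell
# In infer_large.sh, swap which checkpoint file the script looks for:
MODEL_DIR=../../checkpoints/QA-PubMedQA-BioGPT-Large
MODEL=checkpoint_best.pt        # was: checkpoint_avg.pt
OUTPUT_FILE=generate_${MODEL}
OUTPUT_FILE=${MODEL_DIR}/${OUTPUT_FILE}
echo "inference will write: ${OUTPUT_FILE}"
```

The downstream debpe, postprocess.py, and hard_match_evaluation.py steps derive their input names from OUTPUT_FILE, so they follow the new name automatically.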

@shashank140195

shashank140195 commented Mar 24, 2023

I do have a question: how did you download BioGPT-Large? Using the URL gives me an error that it is unable to load the parameters from the checkpoint. Did you use something else to download it?
