Cannot run inference on PubMedQA-Large #23

Open
VisionaryMind opened this issue Jan 31, 2023 · 13 comments

@VisionaryMind

VisionaryMind commented Jan 31, 2023

Using your pre-trained model, the infer_large.sh script is failing as follows:

KeyError: "'_name'"
sed: can't read ../../checkpoints/QA-PubMedQA-BioGPT-Large/generate_checkpoint_avg.pt: No such file or directory
infer_large.sh: line 31: ../../checkpoints/QA-PubMedQA-BioGPT-Large/generate_checkpoint_avg.pt: No such file or directory
Traceback (most recent call last):
  File "/mnt/d/ml/biogpt/examples/QA-PubMedQA/postprocess.py", line 37, in <module>
    with open(out_file, "r", encoding="utf8") as fr:
FileNotFoundError: [Errno 2] No such file or directory: '../../checkpoints/QA-PubMedQA-BioGPT-Large/generate_checkpoint_avg.pt.detok'
Traceback (most recent call last):
  File "/mnt/d/ml/biogpt/examples/QA-PubMedQA/hard_match_evaluation.py", line 37, in <module>
    main()
  File "/mnt/d/ml/biogpt/examples/QA-PubMedQA/hard_match_evaluation.py", line 19, in main
    with open(pred_file) as reader:
FileNotFoundError: [Errno 2] No such file or directory: '../../checkpoints/QA-PubMedQA-BioGPT-Large/generate_checkpoint_avg.pt.detok.extracted.txt'

Please let me know if you have any suggestions to get it working. There seems to be a problem generating the output file.

@AdirthaBorgohain

You will probably need to run the preprocess_large.sh script before running infer_large.sh.

@bgriffen

bgriffen commented Feb 3, 2023

I also have the same error, even after running preprocess_large.sh. That preprocessing script doesn't seem to create the first missing file, generate_checkpoint_avg.pt.

@AdirthaBorgohain

AdirthaBorgohain commented Feb 3, 2023

Did you download and extract the trained checkpoint tgz file in the required directory?

If not, you need to do these steps:

mkdir checkpoints  
cd checkpoints  
wget https://msramllasc.blob.core.windows.net/modelrelease/BioGPT/checkpoints/QA-PubMedQA-BioGPT-Large.tgz  
tar -zxvf QA-PubMedQA-BioGPT-Large.tgz

@VisionaryMind
Author

I both downloaded and extracted the QA-PubMedQA-BioGPT-Large.tgz into the checkpoints directory and ran preprocess_large.sh. This error still occurs.

@AdirthaBorgohain

Did you put the checkpoints directory inside the BioGPT directory? The paths it uses are relative, and all the necessary directories have to be inside the BioGPT folder. From your error, it seems that it is not able to find the generate_checkpoint_avg.pt checkpoint file under the correct path.
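To illustrate the layout that relative path assumes, here is a minimal sketch that recreates the expected directory structure and checks that the path resolves (directory and file names are taken from the scripts quoted in this thread; the touched file is an empty placeholder, not a real checkpoint):

```shell
# Recreate the layout infer_large.sh assumes: the checkpoints directory
# must sit next to examples/ inside the BioGPT repository root.
mkdir -p BioGPT/checkpoints/QA-PubMedQA-BioGPT-Large
mkdir -p BioGPT/examples/QA-PubMedQA
touch BioGPT/checkpoints/QA-PubMedQA-BioGPT-Large/checkpoint_avg.pt

# The script runs from examples/QA-PubMedQA, so ../../checkpoints/...
# only resolves from there.
cd BioGPT/examples/QA-PubMedQA
ls ../../checkpoints/QA-PubMedQA-BioGPT-Large/checkpoint_avg.pt
```

If that `ls` fails in your setup, the checkpoint is not where the script's relative path expects it.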

@VisionaryMind
Author

VisionaryMind commented Feb 3, 2023

> Did you put the checkpoints directory inside the BioGPT directory? The paths it uses are relative, and all the necessary directories have to be inside the BioGPT folder. From your error, it seems that it is not able to find the generate_checkpoint_avg.pt checkpoint file under the correct path.

I am not sure I understand your response. Isn't generate_checkpoint_avg.pt created by the infer_large.sh script? These variables define it:

MODEL_DIR=../../checkpoints/QA-PubMedQA-BioGPT-Large
MODEL=checkpoint_avg.pt
OUTPUT_FILE=generate_${MODEL}
OUTPUT_FILE=${MODEL_DIR}/${OUTPUT_FILE}

In my case, MODEL_DIR is present under checkpoints and it only contains one file (checkpoint_avg.pt). Further, in the inference section of the script, inference.py is called to create OUTPUT_FILE if it does not exist:

# inference
if [ ! -f "${OUTPUT_FILE}" ]; then
    echo "Begin inferencing ${INPUT_FILE} using ${MODEL_DIR}/${MODEL}"
    python ../../inference.py --data_dir=${DATA_DIR} --model_dir=${MODEL_DIR} --model_file=${MODEL} --src_file=${INPUT_FILE} --output_file=${OUTPUT_FILE}
fi

In fact, inference will not run if the output file is already there. inference.py creates the file at the bottom of main:

def main(args):
    src_inputs = []
    with open(args.src_file) as reader:
        for line in reader:
            src_inputs.append(line.strip())
    
    m = TransformerLanguageModelPrompt.from_pretrained(
        args.model_dir, 
        args.model_file, 
        args.data_dir,
        max_len_b=args.decoding_length,
        max_tokens=12000,)

    print(m.cfg)

    if m.cfg.common.fp16:
        print('Converting to float 16')
        m.half()
    m.cuda()

    outputs = m.sample(src_inputs, beam=args.beam)

    with open(f"{args.output_file}", "w", encoding='utf8') as fw:
        for i in range(len(outputs)):
            fw.write(outputs[i] + '\n')

The code appears to be failing inside TransformerLanguageModelPrompt.from_pretrained, so it never creates the output file.

@VisionaryMind
Author

VisionaryMind commented Feb 3, 2023

The issue is not that the output file cannot be found: that error happens because the script runs the debpe step before the file has been created. Here is the beginning of the error message (which I omitted above):

Begin inferencing ../../data/PubMedQA/raw/biogpt-large-ansis_test.tok.bpe.x using ../../checkpoints/QA-PubMedQA-BioGPT-Large/checkpoint_avg.pt
2023-02-02 23:41:54 | INFO | fairseq.tasks.text_to_speech | Please install tensorboardX: pip install tensorboardX
2023-02-02 23:41:54 | INFO | fairseq.file_utils | loading archive file ../../checkpoints/QA-PubMedQA-BioGPT-Large
2023-02-02 23:42:26 | INFO | src.language_modeling_prompt | dictionary: 57717 types
Traceback (most recent call last):
  File "/mnt/d/ML/biogpt/examples/QA-PubMedQA/../../inference.py", line 47, in <module>
    main(args)
  File "/mnt/d/ML/biogpt/examples/QA-PubMedQA/../../inference.py", line 25, in main
    m = TransformerLanguageModelPrompt.from_pretrained(
  File "/home/biogptuser/anaconda3/envs/biogpt/lib/python3.10/site-packages/fairseq/models/fairseq_model.py", line 267, in from_pretrained
    x = hub_utils.from_pretrained(
  File "/home/biogptuser/anaconda3/envs/biogpt/lib/python3.10/site-packages/fairseq/hub_utils.py", line 73, in from_pretrained
    models, args, task = checkpoint_utils.load_model_ensemble_and_task(
  File "/home/biogptuser/anaconda3/envs/biogpt/lib/python3.10/site-packages/fairseq/checkpoint_utils.py", line 469, in load_model_ensemble_and_task
    model = task.build_model(cfg.model, from_checkpoint=True)
  File "/home/biogptuser/anaconda3/envs/biogpt/lib/python3.10/site-packages/fairseq/tasks/language_modeling.py", line 191, in build_model
    model = super().build_model(args, from_checkpoint)
  File "/home/biogptuser/anaconda3/envs/biogpt/lib/python3.10/site-packages/fairseq/tasks/fairseq_task.py", line 671, in build_model
    model = models.build_model(args, self, from_checkpoint)
  File "/home/biogptuser/anaconda3/envs/biogpt/lib/python3.10/site-packages/fairseq/models/__init__.py", line 102, in build_model
    "Available models: {}".format(MODEL_DATACLASS_REGISTRY.keys())
KeyError: "'_name'"

It looks like FairSeq cannot find the "_name" key in MODEL_DATACLASS_REGISTRY. My FairSeq version is 0.12.0, per your recommendation. The problem appears to be inside FairSeq, though I don't yet see where it is coming from. It fails at this assertion:

    assert model is not None, (
        f"Could not infer model type from {cfg}. "
        "Available models: {}".format(MODEL_DATACLASS_REGISTRY.keys())
        + f" Requested model type: {model_type}"
    )
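The quoting in that KeyError is telling: str() of a KeyError shows the repr of the missing key, so `KeyError: "'_name'"` means the key being looked up was literally `'_name'` with the quotes baked in, which fits a rename bug in the saved checkpoint config rather than a missing fairseq feature. A minimal stand-alone illustration (the registry contents below are made up; this only reproduces the shape of the error, not fairseq's actual lookup code):

```python
# Stand-in for fairseq's MODEL_DATACLASS_REGISTRY, which maps model
# names to config dataclasses (contents here are hypothetical).
MODEL_DATACLASS_REGISTRY = {"transformer_lm": "TransformerLanguageModelConfig"}

# A rename bug can leave the lookup key with quote characters embedded:
buggy_key = "'_name'"

try:
    MODEL_DATACLASS_REGISTRY[buggy_key]
except KeyError as err:
    # str() of a KeyError is the repr of the missing key, so the
    # embedded quotes survive, matching the traceback above.
    print(f"KeyError: {err}")  # prints: KeyError: "'_name'"
```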

@renqianluo
Collaborator

Hi @VisionaryMind, this is due to a rename bug. We have fixed it now. Please pull the latest code and re-download the QA-PubMedQA-BioGPT-Large.tgz checkpoint.

@rpolicastro

rpolicastro commented Feb 3, 2023

I pulled the latest version from GitHub and re-downloaded the checkpoint file. I ended up getting the same error as before, but the temporary fix in #17 (comment) still resolved the issue.

import torch
from src.transformer_lm_prompt import TransformerLanguageModelPrompt

m = TransformerLanguageModelPrompt.from_pretrained(
        "checkpoints/QA-PubMedQA-BioGPT-Large",
        "checkpoint_avg.pt",
        "data/PubMedQA/biogpt-large-ansis-bin",
        tokenizer='moses',
        bpe='fastbpe',
        bpe_codes="data/biogpt_large_bpecodes",
        min_len=100,
        max_len_b=1024)

@VisionaryMind
Author

VisionaryMind commented Feb 3, 2023

@renqianluo I pulled down the latest repository, re-downloaded the QA-PubMedQA-BioGPT-Large.tgz checkpoint, and implemented the fix listed above by @rpolicastro, and I still encounter the exact same error message. Neither solution works for me.

@bgriffen

bgriffen commented Feb 7, 2023

Ditto, did the exact steps (including running preprocess_large.sh and infer_large.sh + re-download checkpoint) and still get the error:

  File "/home/bgriffen/Desktop/biogpt/BioGPT/examples/QA-PubMedQA/hard_match_evaluation.py", line 19, in main
    with open(pred_file) as reader:
FileNotFoundError: [Errno 2] No such file or directory: '../../checkpoints/QA-PubMedQA-BioGPT-Large/generate_checkpoint_avg.pt.detok.extracted.txt'

@shashank140195

> Ditto, did the exact steps (including running preprocess_large.sh and infer_large.sh + re-download checkpoint) and still get the error:
>
>       File "/home/bgriffen/Desktop/biogpt/BioGPT/examples/QA-PubMedQA/hard_match_evaluation.py", line 19, in main
>         with open(pred_file) as reader:
>     FileNotFoundError: [Errno 2] No such file or directory: '../../checkpoints/QA-PubMedQA-BioGPT-Large/generate_checkpoint_avg.pt.detok.extracted.txt'

Probably because your script didn't generate an averaged checkpoint. Use the best checkpoint instead.
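One way to try that is to repoint the MODEL variable in infer_large.sh at the best checkpoint rather than the averaged one. This is a sketch: the variable names come from the script as quoted earlier in this thread, and whether checkpoint_best.pt actually exists depends on how your fine-tuning run saved checkpoints.

```shell
# In infer_large.sh, swap which checkpoint file the script looks for:
MODEL_DIR=../../checkpoints/QA-PubMedQA-BioGPT-Large
MODEL=checkpoint_best.pt        # was: checkpoint_avg.pt
OUTPUT_FILE=generate_${MODEL}
OUTPUT_FILE=${MODEL_DIR}/${OUTPUT_FILE}
echo "inference will write: ${OUTPUT_FILE}"
```

The downstream debpe, postprocess.py, and hard_match_evaluation.py steps derive their input names from OUTPUT_FILE, so they follow the new name automatically.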

@shashank140195

shashank140195 commented Mar 24, 2023

I do have a question: how did you download BioGPT-Large? Using the URL gives me an error that it is unable to load the parameters from the checkpoint. Did you use something else to download it?
