Running run_phoBert.sh and run_predict.sh run into missing file ? #1

DuyquanDuc · 2021-12-27T09:01:26Z

Hello!

I cloned this repository and followed your instructions exactly. After changing the file paths, I expected it to work, but I hit two errors in two separate runs. The first one is prediction with run_predict.sh:

!bash /content/ZaloAI2021_LTR/run_predict.sh

and here is the traceback:

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Traceback (most recent call last):
  File "/content/ZaloAI2021_LTR/predict.py", line 29, in load_model
    model = MODEL_CLASSES[args.model_type][1].from_pretrained(args.model_dir, args=args)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/api/models/checkpoint

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/content/ZaloAI2021_LTR/predict.py", line 170, in <module>
    predict(pred_config)
  File "/content/ZaloAI2021_LTR/predict.py", line 117, in predict
    model = load_model(pred_config, args, device)
  File "/content/ZaloAI2021_LTR/predict.py", line 34, in load_model
    raise Exception("Some model files might be missing...")
Exception: Some model files might be missing...

The second one is training from scratch with run_phobert.sh:

!bash /content/ZaloAI2021_LTR/run_phobert.sh

and here is the traceback:

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
12/27/2021 08:44:27 - INFO - data_loader - Creating features from dataset file at train_tokenize_clean.json
Traceback (most recent call last):
  File "/content/ZaloAI2021_LTR/main.py", line 68, in <module>
    main(args_parse)
  File "/content/ZaloAI2021_LTR/main.py", line 12, in main
    train_dataset = load_and_cache_examples(args, tokenizer)
  File "/content/ZaloAI2021_LTR/data_loader.py", line 136, in load_and_cache_examples
    examples = create_examples(input_file)
  File "/content/ZaloAI2021_LTR/data_loader.py", line 36, in create_examples
    with open(input_file, "r", encoding='utf-8') as reader:
FileNotFoundError: [Errno 2] No such file or directory: 'train_tokenize_clean.json'

As far as I know, VinAI's PhoBERT does not ship any file or directory named train_tokenize_clean.json, and there is nothing at the link https://huggingface.co/api/models/checkpoint. My guess is that I forgot to install a library or missed some other step, but I still can't tell what it is even after reading the code, so I'm asking for help here.

*Note: I'm running this on Google Colab Pro with a GPU runtime.


hieudx149 · 2022-01-03T01:01:14Z

Did you create the checkpoint folder and put your checkpoint into it before running?
I have updated the run_predict.sh and run_phobert.sh files.
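
Both tracebacks above point at paths that could not be found. `from_pretrained` treats a name that is not an existing local directory as a Hugging Face Hub model ID, which would explain the 404 on a URL ending in `/checkpoint`, and `train_tokenize_clean.json` is a relative path, so it is resolved against the directory the script is launched from. A minimal pre-flight check along these lines may help, as a sketch only: the helper `find_missing_inputs` is hypothetical and not part of this repository, and the two names passed to it are taken from the error messages, not from the actual contents of the scripts.

```python
from pathlib import Path


def find_missing_inputs(model_dir, train_file, base_dir="."):
    """Return a list of problems; an empty list means both inputs were found.

    Hypothetical helper, not part of ZaloAI2021_LTR. Relative paths are
    resolved against base_dir (by default the current working directory).
    """
    base = Path(base_dir)
    problems = []

    # The prediction run needs a local folder that holds the saved model files.
    ckpt = base / model_dir
    if not ckpt.is_dir():
        problems.append(f"checkpoint folder not found: {ckpt}")
    elif not any(ckpt.iterdir()):
        problems.append(f"checkpoint folder is empty: {ckpt}")

    # The training run needs the tokenized training data file.
    data = base / train_file
    if not data.is_file():
        problems.append(f"training file not found: {data}")

    return problems


if __name__ == "__main__":
    # Names assumed from the tracebacks above; adjust to your own paths.
    print("working directory:", Path.cwd())
    for problem in find_missing_inputs("checkpoint", "train_tokenize_clean.json"):
        print(problem)
```

If this reports missing paths, either launch the scripts from the folder that contains the data and the checkpoint, or pass absolute paths in the scripts.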

DuyquanDuc changed the title ~~Running run_phoBert.sh and run_predict.sh keep missing file ?~~ Running run_phoBert.sh and run_predict.sh run into missing file ? Dec 27, 2021

hieudx149 closed this as completed Jun 29, 2023
